How cloud providers are tackling GPU shortages with custom chips


GPUs are the backbone of AI computing, but as demand exceeds supply, cloud providers are getting creative.

As Network World reported, instead of waiting for more GPUs, they’re building custom chips tailored to specific workloads, aiming for faster, more efficient computing while keeping costs under control.

The competition is heating up. At Microsoft’s Ignite conference last week, the company unveiled two new chips designed to boost performance for its Azure platform. All eyes are now on AWS as it gears up for its own custom silicon portfolio.

Why custom chips matter

GPUs have revolutionised tasks like training AI models, but they’re not always the best tool for the job. They come with significant drawbacks: high power consumption, intensive cooling needs, and, right now, a global shortage. Nvidia’s inventory of its latest GPUs is spoken for over the next 12 months.

Custom accelerators are stepping in to fill the gap. Mario Morales, vice president analyst at IDC, highlights the growing importance of alternatives to GPUs: “These accelerators are becoming increasingly important in cloud infrastructure due to their superior price-performance and price-efficiency ratios, which lead to better return on investments.”

AWS and Google have been rolling out custom chips for years: AWS with Trainium and Inferentia, and Google with Tensor Processing Units (TPUs). Microsoft, however, was late to join the custom silicon trend. It wasn’t until last year that the company introduced its first custom chips, Maia and Cobalt, aimed at improving energy efficiency and handling AI workloads.

This year, Microsoft has stepped up its game, introducing two new chips:

  • Azure Boost DPU: Designed to optimise data processing by running a custom operating system.
  • Azure Integrated HSM: Focused on security, it keeps encryption and signing keys securely in hardware.

Microsoft’s Azure Boost DPU is a step forward, but it still lags behind competitors in the DPU space. Forrester senior analyst Alvin Nguyen notes that Google’s E2000 IPU, co-developed with Intel, and AWS’s Nitro system are both already well-established. Chipmakers, including Nvidia with its BlueField chips and AMD with Pensando, are also jockeying for position.

That said, Microsoft is making notable advancements in infrastructure. The company announced new liquid-cooling solutions for AI servers and a power-efficient rack design co-developed with Meta, which can pack 35% more AI accelerators into each rack.

Security gets a custom boost

Security is another area where custom silicon is making progress. Microsoft’s new HSM chip is a dedicated solution for encryption tasks that would traditionally require a mix of hardware and software. Nguyen notes that handling these tasks in dedicated hardware reduces latency and improves scalability.

AWS and Google are also using custom chips for security. AWS Nitro prevents main system CPUs from modifying firmware, and Google’s Titan establishes ‘a secure root of trust’ for validating system health.

Each provider has its own approach, Nguyen explains. “While Nitro provides the critical security function of ensuring that the main system CPUs cannot update firmware in bare metal mode, Titan provides a hardware-based root of trust that establishes the strong identity of a machine, with which we can make important security decisions and validate the health of the system.”

The future of custom chips in the cloud

The push for custom silicon isn’t slowing. According to Alexander Harrowell, principal analyst at Omdia, it’s a logical move for hyperscalers to invest in these chips to reduce costs and improve efficiency.

As the demand for faster, more specialised computing grows, custom chips are a strategy for cloud providers to stay competitive. With innovation in overdrive, the race to redefine cloud performance is just starting.
