Nvidia is introducing a new top-of-the-line chip for AI work, the HGX H200. The new GPU upgrades the wildly in-demand H100 with 1.4x more memory bandwidth and 1.8x more memory capacity, improving its ability to handle intensive generative AI work.
The big question is whether companies will be able to get their hands on the new chips or whether they'll be as supply-constrained as the H100, and Nvidia doesn't quite have an answer for that. The first H200 chips will be released in the second quarter of 2024, and Nvidia says it's working with "global system manufacturers and cloud service providers" to make them available. Nvidia spokesperson Kristin Uchiyama declined to comment on production numbers.
The H200 appears to be largely the same as the H100 outside of its memory. But the changes to its memory make for a meaningful upgrade. The new GPU is the first to use a new, faster memory spec called HBM3e. That brings the GPU's memory bandwidth to 4.8 terabytes per second, up from 3.35 terabytes per second on the H100, and its total memory capacity to 141GB, up from 80GB on its predecessor.
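The headline multipliers follow directly from these specs; a quick sketch (the 1.4x and 1.8x figures are simply the rounded ratios of the numbers above):

```python
# H100 vs. H200 memory specs, as reported by Nvidia
h100_bandwidth_tbps, h200_bandwidth_tbps = 3.35, 4.8
h100_capacity_gb, h200_capacity_gb = 80, 141

bandwidth_gain = h200_bandwidth_tbps / h100_bandwidth_tbps  # ~1.43x
capacity_gain = h200_capacity_gb / h100_capacity_gb         # ~1.76x

print(f"{bandwidth_gain:.1f}x bandwidth, {capacity_gain:.1f}x capacity")
# → 1.4x bandwidth, 1.8x capacity
```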
"The integration of faster and more extensive HBM memory serves to accelerate performance across computationally demanding tasks including generative AI models and [high-performance computing] applications while optimizing GPU utilization and efficiency," Ian Buck, Nvidia's VP of high-performance computing products, said in a video presentation this morning.
The H200 is also built to be compatible with the same systems that already support H100s. Nvidia says cloud providers won't need to make any changes as they add H200s into the mix. The cloud arms of Amazon, Google, Microsoft, and Oracle will be among the first to offer the new GPUs next year.
Once they launch, the new chips are sure to be expensive. Nvidia doesn't list how much they cost, but CNBC reports that the prior-generation H100s are estimated to sell for anywhere between $25,000 and $40,000 each, with thousands of them needed to operate at the highest levels. Uchiyama said pricing is set by Nvidia's partners.
Nvidia's announcement comes as AI companies remain desperately on the hunt for its H100 chips. Nvidia's chips are seen as the best option for efficiently processing the huge quantities of data needed to train and operate generative image tools and large language models. The chips are valuable enough that companies are using them as collateral for loans. Who has H100s is the subject of Silicon Valley gossip, and startups have been working together just to share any access to them at all.
Uchiyama said that the H200's debut won't affect production of the H100. "You'll see us add overall supply throughout the year, and we're continuing to purchase supply for the long term," Uchiyama wrote in an email to The Verge.
Next year is shaping up to be a more auspicious time for GPU buyers. In August, the Financial Times reported that Nvidia was planning to triple its production of the H100 in 2024. The goal was to produce up to 2 million of them next year, up from around 500,000 in 2023. But with generative AI just as explosive today as it was at the start of the year, demand may only be greater, and that's before Nvidia threw an even hotter new chip into the mix.
Update, November 13th, 4:35PM ET: Added more information from Nvidia spokesperson Kristin Uchiyama.