NVIDIA, the industry leader in AI chips, has announced the H200, a GPU designed to significantly expand artificial intelligence capabilities.
The new chip, scheduled for release next year, promises improved performance and efficiency, and major cloud providers including Amazon, Google, and Oracle will be among the first to deploy it.
An IDC report projects that the worldwide artificial intelligence (AI) software market will grow from $64 billion in 2022 to nearly $251 billion in 2027, a compound annual growth rate (CAGR) of 31.4 percent.
The H200 chip represents a significant advancement over its predecessor, the H100, primarily due to increased high-bandwidth memory (HBM). High-bandwidth memory is a critical component influencing how rapidly a chip can process data, and the H200 boasts an impressive 141 gigabytes of HBM, a notable upgrade from the H100’s 80 gigabytes.
This improvement is particularly significant for AI services like OpenAI’s ChatGPT, which relies on NVIDIA chips to power its generative AI capabilities. The increased high-bandwidth memory and faster connection to processing elements mean that AI services can deliver quicker responses, enhancing user experience.
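To see why larger, faster high-bandwidth memory translates into quicker responses, consider that single-stream LLM text generation is typically memory-bound: producing each token requires streaming roughly the entire set of model weights through the memory system. The sketch below is a rough, illustrative estimate only; the bandwidth figures, the FP16 weight format, and the simple memory-bound model are assumptions for illustration, not official NVIDIA benchmarks.

```python
# Back-of-envelope: for memory-bound, batch-1 LLM decoding, token throughput
# is roughly bounded by (memory bandwidth) / (bytes of weights read per token).
# All figures below are illustrative assumptions, not official specs.

def decode_tokens_per_sec(bandwidth_gb_s: float, model_params_b: float,
                          bytes_per_param: float = 2.0) -> float:
    """Upper-bound tokens/sec for memory-bound, batch-1 decoding.

    bandwidth_gb_s  -- memory bandwidth in GB/s
    model_params_b  -- model size in billions of parameters
    bytes_per_param -- 2.0 assumes FP16/BF16 weights
    """
    model_bytes = model_params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Hypothetical comparison for a 13B-parameter model in FP16,
# using commonly reported bandwidth figures (~3.35 TB/s vs ~4.8 TB/s):
prev_gen = decode_tokens_per_sec(3350, 13)
new_gen = decode_tokens_per_sec(4800, 13)
print(f"prev-gen bound: ~{prev_gen:.0f} tok/s, new-gen bound: ~{new_gen:.0f} tok/s")
```

Under these assumptions the higher-bandwidth part has roughly 40 percent more headroom per token, which is one simplified way to see how memory upgrades, independent of compute, speed up generative AI serving; real throughput also depends on batching, KV-cache traffic, and kernel efficiency.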
While NVIDIA did not disclose its memory suppliers for the H200, Micron Technology has expressed its intention to become an NVIDIA supplier, according to a Reuters report. NVIDIA also sources memory from SK Hynix, which reported a boost in sales last month and attributed it to increased demand for AI chips.
NVIDIA has secured partnerships with major cloud service providers, including Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure, to offer access to the H200 chips. Specialty AI cloud providers CoreWeave, Lambda, and Vultr are also part of the initial deployment.
Ian Buck, Vice President of NVIDIA’s high-performance computing and hyperscale data center business, highlighted the transformative impact of NVIDIA’s hardware and software innovations, creating a new class of AI supercomputers. The H200 is designed to run a full stack of accelerated software for generative AI, high-performance computing (HPC), and hybrid quantum computing.
The H200 features up to 141GB of HBM3e, making it the first AI accelerator to utilize this ultrafast technology. Notably, it provides an 18x performance increase over prior-generation accelerators when running models like GPT-3, with the capability to process 12,000 tokens per second on a Llama2-13B large language model.
Additionally, NVIDIA revealed a server platform that incorporates four GH200 Grace Hopper Superchips, delivering a combined 288 Arm Neoverse cores and 16 petaflops of AI performance. The chip's efficiency is also notable: a single GH200 Superchip using the NVIDIA TensorRT-LLM open-source library is reported to be 100x faster than a dual-socket x86 CPU system and nearly 2x more energy-efficient than an x86 + H100 GPU server.
The latest TOP500 list of the world’s fastest supercomputers reflects NVIDIA’s dominance, with 38 out of 49 new systems powered by NVIDIA H100 Tensor Core GPUs. The shift toward accelerated, energy-efficient supercomputing is evident, with NVIDIA delivering over 2.5 exaflops of HPC performance across leading systems.
Microsoft Azure’s Eagle system, utilizing H100 GPUs in NDv5 instances, secured the No. 3 spot on the TOP500 list with 561 petaflops. MareNostrum 5 in Barcelona ranked No. 8, and NVIDIA Eos, a record-setting AI training system, claimed the No. 9 position.
NVIDIA’s GPUs also demonstrated energy efficiency, powering 23 of the top 30 systems on the Green500. The H100 GPU-based Henri system retained the No. 1 spot, delivering 65.09 gigaflops per watt for the Flatiron Institute in New York.
The introduction of the H200 chip marks a significant leap forward in AI capabilities, solidifying NVIDIA’s position as a key player in the ever-evolving landscape of artificial intelligence and high-performance computing.