Main Facts about Intel Gaudi 3 AI Accelerator for GenAI Revolution

At the Intel Vision event, Intel took the stage to introduce the groundbreaking Intel Gaudi 3 AI accelerator, marking a significant leap forward in AI computing capabilities.
Benefits of Intel Gaudi 3 AI

Intel said this accelerator promises a four-fold increase in BF16 AI compute, a 1.5x boost in memory bandwidth, and a two-fold increase in networking bandwidth compared with its predecessor. These advances represent a significant step up in performance and productivity for AI training and inference, particularly for large language models (LLMs) and multimodal models.

Enterprises across various sectors, including finance, manufacturing, and healthcare, are transitioning their AI projects from experimental phases to full-scale implementation. The Intel Gaudi 3 accelerator addresses the pressing need for open, cost-effective, and energy-efficient solutions, catering to the increasing demand for AI accessibility and innovation.

The Intel Gaudi 3 accelerator is designed to deliver high performance and efficiency for large-scale AI computation. Built on a 5nm process, its architecture runs all compute engines — the Matrix Multiplication Engines (MMEs), Tensor Processor Cores (TPCs), and Networking Interface Cards (NICs) — in parallel. This design enables fast, efficient deep learning computation, ensuring accelerated performance and scalability for GenAI applications.

Notable Features Include:

AI-Dedicated Compute Engine: With 64 AI-custom and programmable TPCs and eight MMEs, the Intel Gaudi 3 accelerator is tailored for high-performance GenAI compute, enabling efficient handling of complex matrix operations fundamental to deep learning algorithms.
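To make the role of these matrix engines concrete, here is a minimal, hardware-agnostic sketch of the dense matrix multiplication (GEMM) at the heart of a neural network layer — the operation that engines like the MME are built to accelerate. The shapes and NumPy usage are purely illustrative, not Gaudi-specific:

```python
import numpy as np

# Toy forward pass of one dense layer: activations x weights.
# Shapes are hypothetical; real LLM GEMMs are orders of magnitude larger.
batch, d_in, d_out = 4, 8, 16
x = np.random.rand(batch, d_in).astype(np.float32)   # input activations
w = np.random.rand(d_in, d_out).astype(np.float32)   # layer weights
y = x @ w                                            # the core GEMM
print(y.shape)  # (4, 16)
```

Deep learning workloads spend most of their cycles in exactly this kind of operation, which is why dedicated matrix engines dominate accelerator designs.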

Memory Boost: Equipped with 128GB of HBM2e memory and 3.7 TB/s of memory bandwidth, the Intel Gaudi 3 accelerator efficiently processes large GenAI datasets, improving both workload performance and data center cost efficiency.

Efficient System Scaling: Integrated with twenty-four 200-gigabit Ethernet ports, the accelerator provides flexible, open-standard networking, enabling seamless scaling to large compute clusters without vendor lock-in.
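The aggregate scale-out bandwidth implied by those figures is straightforward to work out (the port count and per-port speed come from the article; the arithmetic is illustrative):

```python
# Figures from the article: 24 integrated Ethernet ports at 200 Gb/s each
ports = 24
port_speed_gbps = 200

# Aggregate scale-out bandwidth per accelerator
total_gbps = ports * port_speed_gbps
print(f"{total_gbps} Gb/s total, i.e. {total_gbps / 1000} Tb/s")  # 4800 Gb/s, i.e. 4.8 Tb/s
```

That 4.8 Tb/s of standard Ethernet connectivity per accelerator is what allows clusters to scale without proprietary interconnects.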

Open Industry Software: The Intel Gaudi software integrates the PyTorch framework and optimized Hugging Face community-based models, simplifying GenAI development and ensuring ease of model porting across hardware types.
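As a rough illustration of how that porting story looks in practice — assuming Habana's publicly documented PyTorch bridge (`habana_frameworks.torch`), which registers an `hpu` device — standard PyTorch code needs little more than a device switch. The fallback below keeps this sketch runnable on machines without Gaudi hardware:

```python
import torch

# On Gaudi systems, importing Habana's bridge registers the "hpu" device;
# elsewhere the import fails and we fall back to CPU so the sketch still runs.
try:
    import habana_frameworks.torch.core  # noqa: F401  (Gaudi-only)
    device = torch.device("hpu")
except ImportError:
    device = torch.device("cpu")

# Any ordinary PyTorch model then targets the chosen device unchanged.
model = torch.nn.Linear(8, 2).to(device)
x = torch.randn(4, 8, device=device)
print(model(x).shape)  # torch.Size([4, 2])
```

This device-agnostic pattern is what makes model porting across hardware types largely a configuration change rather than a rewrite.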

Performance Projections: Intel projects significant performance improvements for training and inference tasks on leading GenAI models compared to Nvidia H100, including:

50% faster time-to-train across the Llama 7B and 13B parameter models and the GPT-3 175B parameter model.

50% faster inference throughput and 40% greater inference power efficiency across various models.

30% faster inferencing than the Nvidia H200 on select models.

The Intel Gaudi 3 accelerator will be available to OEMs in the second quarter of 2024, with general availability expected in the third quarter. Notable adopters include Dell Technologies, HPE, Lenovo, and Supermicro. Additionally, Intel Gaudi 3 accelerators will power cost-effective cloud LLM infrastructures, offering organizations like NAVER price-performance advantages.

The momentum of Intel Gaudi 3 accelerators sets the stage for Falcon Shores, Intel’s next-generation GPU for AI and high-performance computing. Integrated with Intel Gaudi and Intel Xe IP, Falcon Shores aims to revolutionize GPU programming with a single interface built on the Intel oneAPI specification.
