Meta Platforms, the parent company of Facebook, has unveiled plans to roll out a new iteration of its own AI chip in its data centers later this year.
According to a confidential internal document obtained by Reuters on Thursday, the chip — the second generation of the in-house silicon line Meta announced last year — is designed to support the company’s aggressive AI initiatives.
This development signals Meta’s effort to reduce its reliance on Nvidia chips, which dominate the AI chip market. By deploying its proprietary chip, Meta aims to rein in the escalating costs of running AI workloads, particularly as it pushes to launch new AI products across its platforms, including Facebook, Instagram, WhatsApp, and hardware devices such as its Ray-Ban smart glasses.
The company has invested billions of dollars in amassing specialized chips and reconfiguring data centers to accommodate the burgeoning demand for power-hungry generative AI products.
At Meta’s scale of operations, a successful implementation of its custom chip could potentially result in substantial cost savings. Dylan Patel, founder of the silicon research group SemiAnalysis, estimates that Meta’s chip deployment could trim hundreds of millions of dollars in annual energy costs and billions in chip purchasing costs.
The necessity of chips, infrastructure, and energy to run AI applications has become a significant financial commitment for tech companies, somewhat offsetting the gains made during the initial excitement around AI technology.
A spokesperson from Meta confirmed the plan to commence production of the updated chip in 2024. The spokesperson emphasized the synergy between Meta’s internally developed accelerators and the off-the-shelf graphics processing units (GPUs) – the go-to chips for AI – that the company continues to acquire in large quantities.
“We see our internally developed accelerators to be highly complementary to commercially available GPUs in delivering the optimal mix of performance and efficiency on Meta-specific workloads,” stated the spokesperson.
Meta CEO Mark Zuckerberg revealed last month that the company aims to have approximately 350,000 flagship “H100” processors from Nvidia in operation by the end of the year. Combined with chips from other suppliers, that would give Meta total compute capacity equivalent to 600,000 H100s.
The decision to deploy its own chip marks a positive turn for Meta’s in-house AI silicon project. It comes after executives decided in 2022 to shelve the chip’s first iteration and instead purchase billions of dollars’ worth of Nvidia’s GPUs, which dominate the AI training process.
The new chip, internally referred to as “Artemis,” resembles its predecessor in that it is designed for inference — the process in which models apply their algorithms to make ranking judgments and generate responses to user prompts.
Last year, Reuters reported that Meta was also developing a more ambitious chip capable of handling both training and inference, mirroring the capabilities of GPUs.
Despite the project’s early setbacks, the inference chip is expected to process Meta’s recommendation models considerably more efficiently than the energy-intensive Nvidia processors, according to Patel. “There is a lot of money and power being spent that could be saved,” he remarked.