Nvidia’s Vera Rubin AI System Promises a Leap in Performance and Efficiency
Nvidia is poised to launch its next-generation AI system, Vera Rubin, later this year, following strong sales of its current rack-scale offerings. This new platform is engineered to deliver a tenfold increase in performance per watt compared to its predecessor, Grace Blackwell, a critical advancement as energy consumption becomes a paramount concern in the expanding landscape of artificial intelligence.
CNBC has obtained an exclusive preview of the Vera Rubin system at Nvidia’s headquarters in Santa Clara, California. The system, comprising roughly 1.3 million components, represents a significant step forward in AI infrastructure. At its heart are 72 Rubin graphics processing units (GPUs) and 36 Vera central processing units (CPUs), primarily manufactured by Taiwan Semiconductor Manufacturing Co. The complex supply chain for Vera Rubin extends globally, incorporating over 80 suppliers across at least 20 countries, including China, Vietnam, Thailand, Mexico, Israel, and the U.S. These suppliers contribute a wide array of components, from advanced liquid cooling solutions to robust power systems and specialized compute trays.
A notable challenge in the current AI hardware market is the escalating cost of high-bandwidth memory (HBM), driven by unprecedented demand. Nvidia’s head of AI infrastructure, Dion Harris, addressed this, stating the company is actively collaborating with its suppliers, providing “very detailed forecasts” to ensure supply chain alignment. “We’re in good shape,” Harris affirmed, indicating confidence in their ability to meet production targets despite market pressures.
This development arrives at a pivotal moment for Nvidia, a dominant force in the AI processor market. However, the company faces mounting competition. Advanced Micro Devices (AMD) is making significant strides, while tech giants like Broadcom and Google are developing their own custom silicon solutions, such as Google’s Tensor Processing Units (TPUs). Nvidia has outlined ambitious plans to invest up to $500 billion in AI infrastructure manufacturing within the U.S. through 2029, including the production of Blackwell GPUs at TSMC’s new facilities in Arizona.
The Grace Blackwell system, which entered production in 2024, already revolutionized compute capabilities within a single system. Vera Rubin, slated for shipment in the latter half of 2026, is set to elevate this further. Nvidia CEO Jensen Huang announced in January that the system is now in full production, signaling its readiness for widespread deployment.
“These are massive systems, integrating all the necessary compute, networking, cabling, and cooling,” noted Daniel Newman of the Futurum Group. “They are designed for peak efficiency and performance, a departure from traditional server architectures.”
Meta has publicly announced its intention to deploy Vera Rubin in its data centers by 2027, joining a growing roster of anticipated customers that includes OpenAI, Anthropic, Amazon, Google, and Microsoft. The Vera Rubin racks, assembled in the U.S. and other global locations like Taiwan and a new Foxconn facility in Mexico, weigh close to two tons and house approximately 1,300 microchips, a substantial increase from Grace Blackwell’s 864.
Designed for enhanced usability, Vera Rubin features a simpler, modular architecture that facilitates easier installation and maintenance. Each superchip can be removed from one of the rack’s 18 compute trays in seconds, a marked improvement over the soldered components in the Blackwell system. Nvidia asserts that while Vera Rubin will consume roughly twice the power of its predecessor, its overall efficiency will be dramatically higher thanks to the tenfold improvement in performance per watt.
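The two figures above combine multiplicatively: if the system draws about twice the power while delivering ten times the performance per watt, its raw throughput works out to roughly twenty times that of its predecessor. A minimal sketch of that arithmetic (the function name and ratios-as-inputs framing are illustrative, not Nvidia-published formulas):

```python
def relative_performance(power_ratio: float, perf_per_watt_ratio: float) -> float:
    """Overall throughput gain = (power draw ratio) * (performance-per-watt ratio).

    Both arguments are ratios of the new system to the old one.
    """
    return power_ratio * perf_per_watt_ratio

# Per the article: Vera Rubin draws ~2x Grace Blackwell's power at ~10x
# the performance per watt, implying ~20x raw performance.
print(relative_performance(2.0, 10.0))  # 20.0
```

The same relation explains why per-watt efficiency, not raw power draw, is the headline metric for these racks.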
Jordan Klein, an analyst at Mizuho Securities, emphasized the critical metric for AI systems: “how many tokens per power consumed can you get.” He added, “The higher the return on your investment, the more advantageous it is.”
Vera Rubin is also Nvidia’s first system to feature 100% liquid cooling, a design choice Harris highlighted as contributing to significantly reduced water consumption in data centers compared to traditional cooling methods.
While Nvidia does not publicly disclose rack pricing, industry estimates from the Futurum Group suggest a price increase of approximately 25% over Grace Blackwell, placing the system in the range of $3.5 million to $4 million.
As major enterprises diversify their technology sourcing, many are integrating their own custom-designed silicon into AI servers. Amazon Web Services, for instance, utilizes its Trainium 2 chips in its “ultra-servers,” while Google employs its TPUs.
Looking ahead, Nvidia will encounter direct competition later this year when AMD launches its own rack-scale system, Helios. AMD has already secured a significant commitment from Meta for up to 6 gigawatts of capacity. “Customers are seeking greater capacity and a reliable alternative to ensure competitive pricing and innovation,” Klein commented.
Responding to the competitive landscape, Harris stated, “Hats off to anyone who’s going to try. But this is certainly not a simple endeavor.”
Original article, Author: Tobias. If you wish to reprint this article, please indicate the source: https://aicnbc.com/19345.html