CNBC AI News, July 26th – The 2025 World Artificial Intelligence Conference (WAIC) kicked off with a bang today at the Shanghai World Expo Exhibition & Convention Center. A major highlight: Huawei is showcasing its Ascend 384 Super Node, officially named the Atlas 900 A3 SuperPoD, in person for the first time.
Built on a super-node architecture, the Atlas 900 A3 SuperPoD leverages advanced bus technology to achieve high-bandwidth, low-latency interconnection between 384 NPUs. This effectively addresses the communication bottlenecks between compute and storage resources that often plague large-scale clusters.
Furthermore, system-level optimizations ensure efficient resource scheduling, enabling the super-node to operate with the stability and predictability of a single, unified computing entity.
Huawei previously unveiled its Ascend super-node concept at the Kunpeng Ascend Developer Conference in May, showcasing its ability to achieve high-speed bus interconnection across a record-breaking 384 cards.
The Ascend super-node boasts ultra-high bandwidth, ultra-low latency, and strong overall performance across a wide range of training and inference workloads. Its architecture is designed specifically for the demands of large-model training and inference, delivering the low latency, high bandwidth, and long-term reliability that cutting-edge AI applications require.
According to official statements, Huawei’s CloudMatrix 384 (CM384) AI cluster solution is built around 384 Ascend chips, leveraging a fully interconnected topology for highly efficient inter-chip collaboration.
This solution delivers a staggering 300 PFLOPs of dense BF16 compute power, reportedly approaching double the performance of NVIDIA’s GB200 NVL72 system.
The CM384 also boasts significantly more memory and bandwidth, with a total memory capacity 3.6 times greater and memory bandwidth 2.1 times greater than NVIDIA’s comparable solutions, providing more efficient hardware support for large-scale AI training and inference tasks.
While the performance of a single Ascend chip is approximately one-third that of an NVIDIA Blackwell architecture GPU, Huawei’s scaled-out system design has enabled a significant leap in overall compute capabilities, offering greater competitiveness in ultra-large-scale model training and real-time inference scenarios.
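These figures can be sanity-checked with some quick arithmetic. The sketch below uses the article's own numbers (300 dense-BF16 PFLOPs across 384 chips) together with the widely reported ~180 dense-BF16 PFLOPs and 72 GPUs for NVIDIA's GB200 NVL72; the NVIDIA figures are assumptions not stated in the article.

```python
# Back-of-envelope check of the compute comparison in this article.
# CM384 numbers come from the article; the GB200 NVL72 numbers are the
# widely reported specs and are assumptions, not article figures.
cm384_bf16_pflops = 300.0   # CloudMatrix 384 dense BF16 compute (article)
cm384_chips = 384           # Ascend chips in a CM384 cluster (article)

nvl72_bf16_pflops = 180.0   # assumed: commonly cited dense BF16 for GB200 NVL72
nvl72_gpus = 72             # assumed: Blackwell GPUs per NVL72 rack

# System-level ratio: ~1.67x, consistent with "approaching double".
system_ratio = cm384_bf16_pflops / nvl72_bf16_pflops

# Per-chip ratio: ~0.78 vs 2.5 PFLOPs, i.e. roughly one-third per chip,
# consistent with the claim about a single Ascend vs a Blackwell GPU.
per_ascend = cm384_bf16_pflops / cm384_chips
per_blackwell = nvl72_bf16_pflops / nvl72_gpus
chip_ratio = per_ascend / per_blackwell

print(f"system: {system_ratio:.2f}x, per chip: {chip_ratio:.2f}x")
```

Under these assumptions the numbers hang together: Huawei trades a weaker chip for a much larger, fully interconnected scale-up domain, which is exactly the scaled-out strategy the article describes.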
Some international investment banks have concluded that Huawei's scaled solution "leads NVIDIA and AMD's current market offerings by a generation," and believe China's breakthrough in AI infrastructure will have a profound impact on the global AI landscape.
Original article, Author: Tobias. If you wish to reprint this article, please indicate the source: https://aicnbc.com/5706.html