Huawei Ascend Chips Drive World’s Most Powerful Cluster

Chinese tech heavyweight Huawei unveiled its ambitious roadmap for the next generation of its Ascend chip series this week at Huawei Connect 2025 in Shanghai, signaling a continued push into the high-performance computing arena amid ongoing geopolitical pressures.

During his keynote address, Eric Xu, Deputy Chairman of the Huawei Board, characterized 2025 as a “memorable year,” highlighting the January debut of DeepSeek-R1 as a critical milestone for the company. Xu also candidly acknowledged the challenging reality of China’s semiconductor manufacturing capabilities, stating that the nation is likely to trail behind in advanced process node technology “for a relatively long time.” This honest assessment underscores the strategic complexities Huawei faces as it seeks to compete with global rivals.

Huawei’s response to tariffs and trade restrictions has been multi-pronged, focusing on advancing domestic infrastructure design, developing proprietary technology, and embracing open-source strategies. The company has opted to open-source significant portions of its software ecosystem, including the openPangu foundation AI models and the Mind series Software Development Kits (SDKs). This move aims to foster collaboration, accelerate innovation within China’s tech ecosystem, and potentially attract international developers.

The Ascend Chip Offensive

Huawei’s plans involve the introduction of three new Ascend chip series: the 950, 960, and 970. These chips are specifically designed for AI and high-performance computing workloads, directly challenging the dominance of NVIDIA in these critical markets.

The Ascend 950PR and 950DT variants, built on the same silicon die, will offer enhanced support for low-precision data formats, including FP8 (Floating Point 8-bit). The Ascend 950 is projected to deliver one PFLOP (Peta Floating Point Operations Per Second) of performance and two PFLOPs using the MXFP8 data format. The architecture will also benefit from improved vector processing capabilities and finer memory access granularity, reduced from 512 bytes to 128 bytes. A smaller access unit means less unneeded data is moved on each read, which matters for AI workloads with diverse data types and sizes.
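To illustrate why access granularity matters, the sketch below estimates how much of each memory transfer is useful data when a read is rounded up to whole chunks. Only the 512-byte and 128-byte chunk sizes come from the announcement. The function names, the request sizes, and the assumption that every request starts on a chunk boundary are illustrative and do not describe Huawei's actual memory system.

```python
import math

def bytes_transferred(request_bytes: int, chunk_bytes: int) -> int:
    """Bytes moved when a request is rounded up to whole chunks.

    Simplifying assumption: the request starts on a chunk boundary.
    """
    return math.ceil(request_bytes / chunk_bytes) * chunk_bytes

def efficiency(request_bytes: int, chunk_bytes: int) -> float:
    """Fraction of the transferred bytes that the caller actually asked for."""
    return request_bytes / bytes_transferred(request_bytes, chunk_bytes)

# Hypothetical request sizes in bytes, chosen only for illustration.
for request in (64, 200):
    old = efficiency(request, 512)  # previous granularity
    new = efficiency(request, 128)  # Ascend 950 granularity
    print(f"{request}-byte read: {old:.1%} useful at 512 B, {new:.1%} useful at 128 B")
```

Under these assumptions, small or irregular reads waste far less bandwidth at 128 bytes, while large sequential reads see little difference.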

The Ascend 950 chips will boast a 2 TB/s interconnect bandwidth, representing a 2.5x increase compared to the existing Ascend 910C. This boost in interconnect speed is crucial for scaling AI systems, enabling faster communication between processors and memory, thereby reducing bottlenecks. The Ascend 950PR is slated for release in Q1 2026, followed by the Ascend 950DT in Q4 2026.
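The bandwidth claim can be unpacked with simple arithmetic. The sketch below derives the Ascend 910C figure implied by the quoted 2 TB/s and 2.5x numbers, then compares idealized transfer times. The 1 TB payload is a hypothetical example, and the calculation ignores latency and protocol overhead, so the times are lower bounds rather than measurements.

```python
ASCEND_950_TBPS = 2.0   # interconnect bandwidth quoted for the Ascend 950
SPEEDUP_VS_910C = 2.5   # quoted improvement over the Ascend 910C

# Baseline implied by the two quoted figures (not separately stated in the article).
implied_910c_tbps = ASCEND_950_TBPS / SPEEDUP_VS_910C

def ideal_transfer_seconds(payload_tb: float, bandwidth_tbps: float) -> float:
    """Idealized transfer time: payload divided by peak bandwidth."""
    return payload_tb / bandwidth_tbps

payload_tb = 1.0  # hypothetical payload, for illustration only
print(f"Implied 910C bandwidth: {implied_910c_tbps:.2f} TB/s")
print(f"910C: {ideal_transfer_seconds(payload_tb, implied_910c_tbps):.2f} s")
print(f"950:  {ideal_transfer_seconds(payload_tb, ASCEND_950_TBPS):.2f} s")
```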

The Ascend 960, scheduled for Q4 2027, promises a significant leap in performance, with double the computing power, memory access bandwidth, memory capacity, and interconnect port count compared to its predecessor, the 950. This chip will support Huawei’s proprietary HiF4 data format, which the company claims delivers superior precision compared to other FP4 (Floating Point 4-bit) technologies. While the specifics of HiF4 are still under wraps, its potential to improve accuracy at low precision levels could be a significant advantage for AI training and inference tasks.

The most advanced chip in the lineup is the Ascend 970, targeting a Q4 2028 release. While specific details are still being finalized, Xu indicated a commitment to pushing all performance metrics significantly higher. The Ascend 970 series is projected to offer an interconnect bandwidth of 4 TB/s, deliver 8 PFLOPs of FP4 performance, and feature expanded memory capacity. These specifications suggest a focus on addressing the computationally intensive demands of future AI models and applications.

SuperPoDs: Huawei’s Scale-Out Strategy

Huawei’s strategy extends beyond individual chips: it is offering hyperscale data centers clusters of raw compute, packaged as SuperPoDs. These integrated solutions are designed to accelerate AI development and deployment, providing a complete hardware and software platform. The Atlas 950 SuperPoD, equipped with the new Ascend 950DT chips, is expected to debut in Q4 2026.

Huawei is directly comparing its offerings to NVIDIA’s NVL144 system, a SuperPoD analog. Huawei claims that its first SuperPoD will offer significantly more NPUs (Neural Processing Units) and deliver nearly seven times the processing power of the NVL144. The comparison extends to NVIDIA’s planned NVL576 release in 2027, with Huawei arguing that the Atlas 950 SuperPoD will still maintain a performance advantage. These claims reflect Huawei’s aggressive positioning in the AI infrastructure market.

Expanding into General-Purpose Computing

Beyond specialized AI chips, Huawei is also targeting the general-purpose computing market with its Kunpeng 950 processors. Two models are planned for release in Q1 2026, featuring 96 cores and 192 threads, and 192 cores and 384 threads, respectively. The larger model’s core count indicates a focus on handling demanding multi-threaded workloads.

Huawei is also developing what it calls “the world’s first general-purpose computing SuperPoD,” the Kunpeng 950-based TaiShan 950 SuperPoD, also slated for availability in the first quarter of 2026. This move signals Huawei’s intention to compete more broadly in the data center market, offering solutions for a wider range of applications beyond AI.

UnifiedBus 2.0: Open-Source Connectivity

Underpinning Huawei’s SuperPoD and SuperCluster strategy is UnifiedBus 2.0, the next iteration of its existing high-speed interconnect technology. UnifiedBus 1.0 is currently utilized in the Atlas 900 A3 SuperPoD, which has seen over 300 installations since its introduction in March of this year.

Notably, UnifiedBus 2.0 will be an open protocol, with the technical specifications released to the developer community. This open-source approach aims to encourage adoption and further development of the technology, potentially creating a wider ecosystem around Huawei’s SuperPoD architecture. UnifiedBus 2.0 will be used internally to connect individual processors within the new SuperPoDs and to link clusters of SuperPoDs, forming even larger SuperClusters.

The first cluster product is the Atlas 950 SuperCluster, which Huawei claims will offer significantly more NPUs and higher computing power than xAI’s Colossus, currently considered the world’s most powerful computing cluster. In the last quarter of 2027, Huawei intends to launch the Atlas 960 SuperCluster, integrating over a million NPUs and delivering 4 ZFLOPS (Zetta Floating Point Operations Per Second) in FP4. This massive scale highlights Huawei’s aspirations to become a leading provider of advanced computing infrastructure.
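As a rough sanity check on the Atlas 960 SuperCluster figures, the sketch below divides the quoted cluster throughput by the NPU count. Treating "over a million" as exactly one million is an assumption, so the result is an upper bound on the average per-NPU figure, not a chip specification.

```python
ZETTA = 10**21
PETA = 10**15

cluster_flops = 4 * ZETTA  # 4 ZFLOPS in FP4, as quoted
npu_count = 1_000_000      # assumption: "over a million" taken as exactly 1e6

# Average throughput each NPU would need to contribute, in PFLOPS.
per_npu_pflops = cluster_flops / npu_count / PETA
print(f"Implied average per NPU: at most {per_npu_pflops:g} PFLOPS (FP4)")
```

The result, about 4 PFLOPS per NPU, is twice the two PFLOPs quoted for the Ascend 950 and so fits the stated doubling of computing power in the Ascend 960. The article does not confirm that the two figures use the same data format, so the match is indicative only.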

“SuperPoDs and SuperClusters powered by UnifiedBus are our answer to surging demand for computing, both today and tomorrow,” Xu stated. This pronouncement encapsulates Huawei’s vision of providing scalable, high-performance computing solutions to meet the growing needs of AI, scientific research, and other demanding applications.

Original article, Author: Samuel Thompson. If you wish to reprint this article, please indicate the source: https://aicnbc.com/9562.html
