AI inference

  • First AI GPU Chipset Comparison Report: NVIDIA Dominates, Huawei Surpasses AMD

    A Morgan Stanley report finds AI inference to be highly profitable, with average profit margins exceeding 50% for “AI inference factories.” NVIDIA’s GB200 NVL72 leads at a nearly 78% profit margin, followed by Google’s TPU v6e pod (74.9%) and AWS’s Trn2 UltraServer (62.5%). Huawei’s Ascend CloudMatrix 384 achieves 47.9%. AMD’s MI300X and MI355X, by contrast, show significantly negative profit margins, attributed to insufficient token-generation efficiency.

    August 16, 2025
  • Huawei’s Upcoming AI Breakthrough: Potential Reduction in HBM Memory Dependency

    Huawei is expected to unveil a significant AI inference technology at an upcoming forum, potentially reducing China’s reliance on High Bandwidth Memory (HBM). HBM is crucial for AI inference due to its high bandwidth and capacity, enabling faster access to large language model parameters. However, HBM supply constraints and export restrictions are pushing Chinese companies to seek alternatives. This innovation could improve the performance of Chinese AI models and strengthen the Chinese AI ecosystem.

    August 9, 2025
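    The bandwidth argument above can be sketched numerically. In the memory-bound decode phase of LLM inference, every generated token must stream the full set of model weights from memory once, so peak HBM bandwidth caps the achievable tokens per second. The function and figures below are illustrative assumptions for a back-of-the-envelope bound, not numbers from the article:

    ```python
    # Rough sketch: HBM bandwidth as an upper bound on LLM decode throughput.
    # All numbers are illustrative assumptions, not figures from the report.

    def max_decode_tokens_per_s(bandwidth_gb_s: float,
                                n_params_b: float,
                                bytes_per_param: int = 2) -> float:
        """Upper bound on tokens/s for memory-bandwidth-bound decoding:
        each generated token must read every weight from memory once."""
        model_bytes = n_params_b * 1e9 * bytes_per_param
        return bandwidth_gb_s * 1e9 / model_bytes

    # Example: ~3.3 TB/s of HBM bandwidth serving a hypothetical
    # 70B-parameter model stored in FP16 (2 bytes per parameter).
    print(round(max_decode_tokens_per_s(3300, 70), 1))  # → 23.6
    ```

    Halving the available bandwidth roughly halves this bound, which is why a technology that reduces dependence on HBM (or compensates for lower-bandwidth memory) matters so much for inference economics.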