AI inference
-
Analyst: Nvidia-Groq Deal Fuels Fiction of Competition
Nvidia is reportedly licensing AI inference technology from Groq for $20 billion, in a deal that also brings Groq’s founder and leadership team to Nvidia. The move strengthens Nvidia’s position in AI inference and potentially blocks competitors from accessing Groq’s technology. By licensing rather than acquiring, the strategy avoids the antitrust scrutiny associated with acquisitions while bolstering Nvidia’s comprehensive AI offerings.
-
Europe’s Cautious AI Strategy May Be Its Competitive Edge
Europe’s data‑center market, long seen as lagging, can turn its regulatory and power‑supply constraints into strengths as AI drives a global capacity boom. While the US will lead spending, Europe aims to double its capacity, focusing on regions with abundant renewable power (e.g., the Nordics and Spain) and faster grid connections. Policies are shifting to prioritize construction‑ready projects and to support AI inference, a high‑density, low‑latency niche. Tight environmental and community rules may raise costs, but they also create higher‑quality, more resilient assets, offering investors durable returns amid the AI surge.
-
Qualcomm Enters AI Chip Market, Challenging AMD and Nvidia
Qualcomm is entering the data-center AI accelerator market with its AI200 and AI250 chips, planned for 2026 and 2027 respectively, challenging Nvidia’s dominance. Leveraging its expertise in mobile NPUs, Qualcomm aims to capitalize on the booming AI server market, emphasizing total cost of ownership and high memory capacity (768 GB per accelerator card). The company is initially focusing on AI inference and offers flexible system configurations. A partnership with Saudi Arabia’s Humain underscores its commitment to the sector.
-
Moving AI Workloads from Cloud to Device: A Strategy for Reducing Power Consumption
Arm CEO Rene Haas advocates distributing AI workloads from cloud-based infrastructure to local devices to reduce energy consumption and improve sustainability. He highlights a shift toward hybrid computing, with AI training remaining in the cloud and inference occurring on devices such as smartphones and AR glasses. Arm’s expanded partnership with Meta aims to optimize AI efficiency across the entire compute stack, exemplified by localized speech recognition in Meta’s Ray-Ban Wayfarer glasses. This on-device processing improves responsiveness and reduces reliance on cloud servers.
-
First AI GPU Chipset Comparison Report: NVIDIA Dominates, Huawei Surpasses AMD
A Morgan Stanley report reveals high profitability in AI inference, with average profit margins exceeding 50% for “AI inference factories.” NVIDIA’s GB200 NVL72 leads with a profit margin of nearly 78%, followed by Google’s TPU v6e pod (74.9%) and AWS’s Trn2 UltraServer (62.5%). Huawei’s Ascend CloudMatrix 384 achieves 47.9%. AMD’s MI300X and MI355X, however, show significantly negative profit margins due to insufficient token-generation efficiency.
-
Huawei’s Upcoming AI Breakthrough: Potential Reduction in HBM Memory Dependency
Huawei is expected to unveil a significant AI inference technology at an upcoming forum, potentially reducing China’s reliance on High Bandwidth Memory (HBM). HBM is crucial for AI inference due to its high bandwidth and capacity, enabling faster access to large language model parameters. However, HBM supply constraints and export restrictions are pushing Chinese companies to seek alternatives. This innovation could improve the performance of Chinese AI models and strengthen the Chinese AI ecosystem.