AI inference
-
Analyst: Nvidia-Groq Deal Fuels Fiction of Competition
Nvidia is reportedly licensing AI inference technology from Groq for $20 billion, in a deal that also brings Groq’s founder and leadership team to Nvidia. The move strengthens Nvidia’s position in AI inference and potentially blocks competitors from accessing Groq’s technology. By licensing rather than acquiring, the strategy avoids the antitrust scrutiny associated with acquisitions while bolstering Nvidia’s comprehensive AI offerings.
-
Europe’s Cautious AI Strategy May Be Its Competitive Edge
Europe’s data‑center market, long seen as lagging, can turn its regulatory and power‑supply constraints into strengths as AI drives a global capacity boom. While the US will lead spending, Europe aims to double its capacity, focusing on regions with abundant renewable power (e.g., the Nordics and Spain) and faster grid connections. Policies are shifting to prioritize construction‑ready projects and to support AI inference, a high‑density, low‑latency niche. Tight environmental and community rules may raise costs, but they also create higher‑quality, more resilient assets, offering investors durable returns amid the AI surge.
-
Qualcomm Enters AI Chip Market, Challenging AMD and Nvidia
Qualcomm is entering the data-center AI accelerator market with its AI200 and AI250 chips, planned for 2026 and 2027 respectively, challenging Nvidia’s dominance. Leveraging its expertise in mobile NPUs, Qualcomm aims to capitalize on the booming AI server market, emphasizing total cost of ownership and high memory capacity (768 GB per accelerator card). The company is initially focusing on AI inference and offers flexible system configurations. A partnership with Saudi Arabia’s Humain underscores its commitment to the sector.
-
Moving AI Workloads from Cloud to Device: A Strategy for Reducing Power Consumption
Arm CEO Rene Haas advocates distributing AI workloads from cloud-based infrastructure to local devices to reduce energy consumption and improve sustainability. He highlights a shift toward hybrid computing, with AI training remaining in the cloud and inference occurring on devices such as smartphones and AR glasses. Arm’s expanded partnership with Meta aims to optimize AI efficiency across the entire compute stack, exemplified by localized speech recognition in Meta’s Ray-Ban Wayfarer glasses. This on-device processing improves responsiveness and reduces reliance on cloud servers.
-
First AI GPU Chipset Comparison Report: NVIDIA Dominates, Huawei Surpasses AMD
A Morgan Stanley report reveals high profitability in AI inference, with average profit margins exceeding 50% for “AI inference factories.” NVIDIA’s GB200 NVL72 leads with a profit margin of nearly 78%, followed by Google’s TPU v6e pod (74.9%) and AWS’s Trn2 UltraServer (62.5%). Huawei’s Ascend CloudMatrix 384 achieves 47.9%. AMD’s MI300X and MI355X, however, show significantly negative profit margins due to insufficient token-generation efficiency.
-
Huawei’s Upcoming AI Breakthrough: Potential Reduction in HBM Memory Dependency
Huawei is expected to unveil a significant AI inference technology at an upcoming forum, potentially reducing China’s reliance on High Bandwidth Memory (HBM). HBM is crucial for AI inference due to its high bandwidth and capacity, enabling faster access to large language model parameters. However, HBM supply constraints and export restrictions are pushing Chinese companies to seek alternatives. This innovation could improve the performance of Chinese AI models and strengthen the Chinese AI ecosystem.