AI inference

  • Rebellions Secures $400M Ahead of Samsung-Backed IPO

    South Korean AI chip startup Rebellions has secured $400 million in funding, valuing the company at $2.34 billion. The company aims to challenge Nvidia in the AI inference chip market with its Rebel-Quad technology, focusing on energy efficiency and performance. The funding will support expansion into the U.S. market, targeting research labs like Meta and xAI, and prepare for a potential IPO. Strategic investors, including Samsung and SK Hynix, are helping Rebellions navigate memory chip supply challenges.

    March 30, 2026
  • Arm Stock Surges 20% on Anticipated Revenue Boom from New Chip

    Arm Holdings’ stock surged 20% after announcing its new AGI CPU, designed for AI inference in data centers. The company projects the chip will generate $15 billion in revenue by 2031, a significant shift from its traditional IP licensing model. This move aims to capture a larger share of the growing AI market, with early adopters including Meta and OpenAI.

    March 25, 2026
  • Analyst: Nvidia-Groq Deal Fuels Fiction of Competition

    Nvidia is reportedly licensing AI inference technology from Groq for $20 billion, a deal that also brings Groq’s founder and leadership to Nvidia. This move strengthens Nvidia’s position in AI inference and potentially blocks competitors from accessing Groq’s technology. The strategy avoids antitrust scrutiny associated with acquisitions and bolsters Nvidia’s comprehensive AI offerings.

    February 13, 2026
  • Europe’s Cautious AI Strategy May Be Its Competitive Edge

    Europe’s data‑center market, long seen as lagging, can turn its regulatory and power‑supply constraints into strengths as AI drives a global capacity boom. While the U.S. will lead spending, Europe aims to double its capacity, focusing on regions with abundant renewable power (e.g., the Nordics, Spain) and faster grid connections. Policies are shifting to prioritize shovel‑ready projects and support AI inference, a high‑density, low‑latency niche. Tight environmental and community rules may raise costs but create higher‑quality, more resilient assets, offering investors durable returns amid the AI surge.

    January 18, 2026
  • Qualcomm Enters AI Chip Market, Challenging AMD and Nvidia

    Qualcomm is entering the data center AI accelerator market, challenging Nvidia’s dominance with its AI200 and AI250 chips, planned for 2026 and 2027. Leveraging its expertise in mobile NPUs, the company aims to capitalize on the booming AI server market, emphasizing total‑cost‑of‑ownership advantages and higher memory capacity (768GB per AI card). It will initially focus on AI inference and offer flexible system configurations. A partnership with Saudi Arabia’s Humain underscores its commitment to the sector.

    November 7, 2025
  • Moving AI Workloads from Cloud to On-Premise: A Strategy for Reducing Power Consumption

    Arm CEO Rene Haas advocates shifting AI workloads from cloud infrastructure to local devices to reduce energy consumption and improve sustainability. He highlights a move toward hybrid computing, with AI training in the cloud and inference occurring on devices such as smartphones and AR glasses. Arm’s expanded partnership with Meta aims to optimize AI efficiency across the entire compute stack, exemplified by localized speech recognition in Meta’s Ray-Ban Wayfarer glasses. This on-device processing improves responsiveness and reduces reliance on cloud servers.

    October 19, 2025
  • First AI GPU Chipset Comparison Report: NVIDIA Dominates, Huawei Surpasses AMD

    A Morgan Stanley report reveals high profitability in AI inference, with average profit margins exceeding 50% for “AI inference factories.” NVIDIA’s GB200 NVL72 leads with a near 78% profit margin, followed by Google’s TPU v6e pod (74.9%) and AWS’s Trn2 UltraServer (62.5%). Huawei’s Ascend CloudMatrix 384 achieves 47.9%. AMD’s MI300X and MI355X, however, show significant negative profit margins due to insufficient token generation efficiency.

    August 16, 2025
  • Huawei’s Upcoming AI Breakthrough: Potential Reduction in HBM Memory Dependency

    Huawei is expected to unveil a significant AI inference technology at an upcoming forum, potentially reducing China’s reliance on High Bandwidth Memory (HBM). HBM is crucial for AI inference due to its high bandwidth and capacity, enabling faster access to large language model parameters. However, HBM supply constraints and export restrictions are pushing Chinese companies to seek alternatives. This innovation could improve the performance of Chinese AI models and strengthen the Chinese AI ecosystem.

    August 9, 2025