Inference

Tech

Rebellions Targets South Korean IPO Next Year, CEO Tells CNBC

South Korean AI chip startup Rebellions plans an IPO in the first or second quarter of next year, likely on the KOSPI, to capitalize on investor interest in AI hardware. Now generating revenue and backed by major Korean tech firms and government initiatives, Rebellions aims to compete in the growing inference chip market and may also explore U.S. listings.

July 8, 2026

Tech

Microsoft-Backed D-Matrix Begins Nvidia Chip Production

D-Matrix, a new AI chip startup, challenges Nvidia’s dominance with its “Corsair” inference chip. Offering tenfold speed and fivefold energy savings for smaller AI tasks, Corsair uses a novel memory architecture that integrates SRAM. The chip excels at low-latency inference, but its limited SRAM capacity constrains its ability to handle massive models. D-Matrix targets interactive AI applications and partners with industry leaders on data center solutions, positioning itself as a significant player in the trillion-dollar AI inference market.

June 9, 2026

Tech

Nvidia’s Earnings Slip: What the Sellers Miss

Nvidia’s post-earnings stock dip defies stellar performance and surging AI chip demand. This disconnect signals investor disbelief, not business weakness. While hyperscalers drove past growth, the future lies in the “AI Clouds, Industrial and Enterprise” (ACIE) segment, encompassing diverse AI players. Nvidia’s vertically integrated platform offers unique advantages, especially in the inference market, where it holds a near-monopoly. Despite strong fundamentals and expanding opportunities, the market reaction remains perplexing, suggesting patient investors will ultimately be rewarded.

May 21, 2026

AGI

Nvidia Vera Chip Aims for $200 Billion Market in Huang’s Second Offensive

Nvidia reported strong first-quarter earnings and unveiled its Vera CPU, signaling a strategic pivot into the burgeoning AI inference market. Targeting a distinct $200 billion segment, Vera aims to complement Nvidia’s GPU dominance, projected to generate $20 billion in revenue this fiscal year. This move addresses cloud providers’ increasing demand for custom silicon to optimize inference workloads, where Nvidia faces growing competition. Despite supply chain constraints, Nvidia is investing heavily to secure production, underscoring the chip’s critical role in its future growth.

May 21, 2026

AGI

NVIDIA and Google Slash AI Inference Costs

Google and NVIDIA are partnering to significantly reduce AI inference costs with new A5X bare-metal instances powered by NVIDIA Vera Rubin NVL72 systems. This collaboration focuses on hardware-software co-design for a tenfold cost reduction per token and increased throughput. The initiative also enhances data governance with Gemini models on Google Distributed Cloud and introduces Confidential Computing for secure AI deployments. Managed Training Clusters and integrated platforms aim to streamline agentic AI development and physical simulations, benefiting diverse industries and fostering a growing developer community.

April 23, 2026

Tech

Record Funding Fuels Nvidia AI Chip Rivalry Amidst Intensifying Competition

Competition in the AI chip market is intensifying as startups challenge Nvidia’s dominance with innovative solutions focused on AI inference efficiency. These challengers are attracting significant investor capital, signaling a strategic shift in hardware design. Despite Nvidia’s substantial investments and acquisitions, new companies are securing funding, highlighting a growing belief in specialized AI hardware. Key developments also include expansion by AI leaders and strong performance from TSMC, driven by AI chip demand.

April 17, 2026

Tech

Nvidia’s $20 Billion Gamble on Next-Gen AI Chips

Nvidia reportedly secured a $20 billion deal with AI startup Groq, licensing its inference technology and key personnel, including CEO Jonathan Ross. This strategic move, ahead of GTC, aims to bolster Nvidia’s position in the cost-sensitive inference market, complementing its training dominance. Ross’s expertise in specialized LPUs and SRAM optimization is expected to enhance Nvidia’s AI ecosystem, potentially mirroring the success of its Mellanox acquisition.

March 15, 2026

AGI

Enterprises Rethink AI Infrastructure Amid Rising Inference Costs

AI spending in Asia Pacific is struggling to deliver ROI because infrastructure limitations hinder speed and scale. Akamai, partnering with NVIDIA, addresses this with “Inference Cloud,” which decentralizes AI decision-making to reduce latency and costs. Enterprises struggle to scale AI projects, with inference now the primary bottleneck. Edge infrastructure improves performance and cost-efficiency, especially for latency-sensitive applications. Key sectors adopting edge-based AI include retail and finance. Cloud and GPU partnerships are crucial for meeting expanding AI workload demands, with security as a vital component. Future AI infrastructure will require distributed management and robust security.

January 2, 2026