Microsoft-Backed D-Matrix Begins Production of Chip to Challenge Nvidia

D-Matrix, an AI chip startup founded in 2019, is challenging Nvidia’s dominance with its “Corsair” inference chip. The company claims up to ten times the speed and a fivefold reduction in energy use on smaller AI tasks, thanks to a memory architecture built around on-chip SRAM. The design excels at low-latency inference, but its SRAM capacity limits its ability to handle massive models. D-Matrix is targeting interactive AI applications and has partnered with Arista, Broadcom, and Super Micro on data center systems, in a market its CEO expects to reach a trillion dollars.

A new contender has emerged to challenge Nvidia, the dominant player in AI chips. D-Matrix, based just miles from Nvidia’s Silicon Valley headquarters, says its “Corsair” chip can run AI inference workloads ten times faster and with a fivefold reduction in energy consumption compared with Nvidia’s standalone graphics processing units (GPUs), particularly for smaller-scale tasks.

The inference chip uses an SRAM-based memory architecture, an approach also taken by Cerebras and Groq. With major tech corporations pursuing every available computing resource, the market is showing a growing appetite for specialized chips, giving startups room to carve out niches.

Other AI chip startups are already seeing payoffs. Cerebras, founded in 2015, recently completed an initial public offering that raised over $5.5 billion and valued the company at more than $50 billion. Nvidia acquired Groq’s assets in December for $20 billion, its largest acquisition to date, and in March unveiled a new Groq-designed chip, dubbed a language processing unit, at its GTC conference.

“This is a market poised to reach a trillion dollars,” said Sid Sheth, co-founder and CEO of D-Matrix, in a recent interview. He said the company has no immediate plans to be acquired, adding, “Can the market support yet another public company? Absolutely.”

Established in 2019, D-Matrix has raised approximately $500 million and is valued at around $2 billion. Microsoft, through its venture arm M12, is among its investors. The investment is notable given Microsoft’s own chip development initiatives, including its “Maia 200” chip for AI inference, new PC processors co-developed with Nvidia, and an in-house quantum computing chip announced just last week.

While Sheth remains tight-lipped about specific customer names, he revealed commitments from prominent hyperscalers, neocloud providers, and frontier AI research labs, all eager to secure substantial compute power. D-Matrix is set to commence shipments to these clients this month, with approximately 90% based in the U.S. and the remainder in the Middle East and Southeast Asia.

Semiconductor analyst Stacy Rasgon of Bernstein Research notes that these specialized chips often complement, rather than replace, Nvidia’s offerings, because they excel at different tasks. “Sounds like he’s got a fair number of actual, real customer engagements,” Rasgon said.

Corsair achieves low-latency inference at low power by tightly integrating memory and compute units on a single piece of silicon. Like Groq and Cerebras, D-Matrix uses static random-access memory (SRAM). This type of memory can be manufactured at conventional logic fabrication facilities, such as those operated by Taiwan Semiconductor Manufacturing Company (TSMC), and integrated directly onto the same chip as the compute logic. GPUs, in contrast, typically rely on large amounts of dynamic random-access memory (DRAM), often packaged in high-bandwidth memory stacks outside the primary logic chip. That reliance on DRAM leaves GPU-based systems exposed to supply constraints at memory giants like Micron, Samsung, and SK Hynix.

“We’re not running into a chokepoint around DRAM with our product because our product doesn’t really rely on DRAM to be successful,” Sheth emphasized.

The main drawback of this tightly integrated SRAM approach, according to Rick Bahr, an adjunct professor of electrical engineering at Stanford University, is limited capacity: it cannot easily hold the very large reasoning models behind leading AI platforms from companies like OpenAI and Anthropic. On-chip SRAM enables “remarkable inference speeds” because data travels only a short distance, but it struggles to accommodate the trillions of parameters in these models. “That number of parameters just simply can’t be put onto an SRAM-based design. That’s the big challenge,” Bahr said.

Sheth says Corsair is designed for AI inference tasks where “you’re optimizing for interactivity or speed” rather than sheer model size. That includes applications such as chatbots, voice agents, and agentic tools like Claude Code and OpenClaw. Citing research from Gimlet Labs, D-Matrix claims that Corsair paired with an Nvidia Blackwell GPU can run inference ten times faster, up to three times more cost-effectively, and up to five times more energy-efficiently than a standalone GPU.

Nvidia CEO Jensen Huang, speaking at Computex in Taiwan, reiterated his company’s claim to leadership in cost-effective inference with its Vera Rubin system, arguing that performance depends not only on raw speed but on how well the whole system is integrated. “The reason for that is we integrate everything, we design everything from the ground up, we simulate the entire system and we use extreme co-design,” Huang said.

D-Matrix’s Corsair product consists of four chips packaged together on a card that slides into server rack slots in data centers, with pricing in the tens of thousands of dollars. Sheth described this “plug-and-play” approach as a key differentiator from Cerebras and Groq, calling Corsair “the densest SRAM solution in the market today,” with up to 128 gigabytes of SRAM in a single server.
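To put the 128-gigabyte figure next to Bahr’s point about parameter counts, the arithmetic can be sketched in a few lines of Python. This is a rough illustration, not D-Matrix’s sizing method: the model sizes and weight precisions below are hypothetical, the function names are invented for this example, a gigabyte is assumed to be 10^9 bytes, and the estimate counts only model weights, ignoring activations, caches, and any overhead from splitting a model across servers.

```python
import math

# Per-server SRAM capacity quoted in the article (assumed here to mean 1e9-byte gigabytes).
SRAM_PER_SERVER_GB = 128


def weight_footprint_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the size of the model weights alone, in gigabytes."""
    return num_params * bytes_per_param / 1e9


def servers_needed(num_params: float, bytes_per_param: float,
                   sram_per_server_gb: float = SRAM_PER_SERVER_GB) -> int:
    """Return how many servers it would take to hold the weights entirely in SRAM."""
    footprint = weight_footprint_gb(num_params, bytes_per_param)
    return math.ceil(footprint / sram_per_server_gb)


if __name__ == "__main__":
    # Hypothetical model sizes and precisions, chosen only for illustration.
    for label, params in [("8B", 8e9), ("70B", 70e9), ("1T", 1e12)]:
        for bytes_per_param in (2.0, 1.0, 0.5):  # 16-bit, 8-bit, 4-bit weights
            gb = weight_footprint_gb(params, bytes_per_param)
            count = servers_needed(params, bytes_per_param)
            print(f"{label} params at {bytes_per_param} B/param: "
                  f"{gb:,.0f} GB -> {count} server(s)")
```

Under these assumptions, a one-trillion-parameter model stored at one byte per parameter needs about 1,000 gigabytes for its weights alone, or roughly eight such servers, while an eight-billion-parameter model at 16-bit precision fits in one with room to spare.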

D-Matrix has also partnered with Arista, Broadcom, and Super Micro on a rack-scale system called SquadRack, designed for deploying its chips in AI data centers. The Corsair chips are manufactured in Taiwan on TSMC’s 6-nanometer process node. D-Matrix’s next-generation chip, codenamed Raptor, is slated to launch next year on TSMC’s 4-nanometer node, and Sheth suggested it could be manufactured at TSMC’s Arizona facility.

“Building a computing solution for AI inference is going to be the grand prize,” Sheth said.

Original article, Author: Tobias. If you wish to reprint this article, please indicate the source: https://aicnbc.com/22631.html
