Nvidia (NVDA) has exceeded expectations, reporting substantial profits driven by its graphics processing units (GPUs), which excel in AI workloads. However, other categories of AI chips are rapidly gaining traction, reshaping the competitive landscape.
Custom Application-Specific Integrated Circuits (ASICs) are increasingly being designed by major hyperscalers, ranging from Google (GOOGL)’s Tensor Processing Units (TPUs) to Amazon (AMZN)’s Trainium and OpenAI’s planned collaboration with Broadcom (AVGO). These custom chips offer advantages in terms of size, cost, and accessibility, potentially reducing these companies’ reliance on Nvidia GPUs. Daniel Newman of the Futurum Group predicts that custom ASICs will experience “even faster growth than the GPU market over the next few years.” This shift highlights a strategic move towards tailored silicon solutions optimized for specific AI tasks.
Beyond GPUs and ASICs, Field-Programmable Gate Arrays (FPGAs) offer a reconfigurable solution for various applications, including signal processing, networking, and AI. Another category comprises AI chips powering on-device AI, eliminating the need for cloud connectivity. Qualcomm (QCOM), Apple (AAPL), and others are leading the charge in on-device AI chip development, enabling real-time AI processing and enhanced user experiences directly on devices.
CNBC spoke with industry experts and insiders at Big Tech companies to dissect the diverse landscape of AI chips and their respective applications.
GPUs for General Compute
Originally designed for gaming, GPUs propelled Nvidia to become the world’s most valuable public company due to their suitability for AI workloads. In the past year, Nvidia shipped approximately 6 million of its latest-generation Blackwell GPUs, demonstrating the robust demand for GPUs in the AI sector.
Nvidia senior director of AI infrastructure Dion Harris demonstrates how 72 Blackwell GPUs work together as a single GB200 NVL72 rack-scale server system for AI at Nvidia headquarters in Santa Clara, California, on November 12, 2025. (Photo: Marc Ganley)
The transition from gaming to AI began around 2012, when Nvidia’s GPUs were utilized by researchers to develop AlexNet, a pivotal advancement in modern AI. AlexNet, an image recognition tool, leveraged GPUs’ parallel processing capabilities to achieve remarkable accuracy, surpassing competitors employing central processing units (CPUs).
The creators of AlexNet recognized that the same parallel processing responsible for rendering lifelike graphics was exceedingly effective at training neural networks, enabling computers to learn from data instead of relying solely on programmed code.
Presently, GPUs are frequently paired with CPUs in server rack systems within data centers, facilitating AI workloads in the cloud. CPUs are characterized by a limited number of powerful cores executing sequential general-purpose tasks, whereas GPUs feature numerous smaller cores specialized for parallel math operations, such as matrix multiplication.
GPUs’ ability to perform numerous operations concurrently makes them ideal for both the training and inference phases of AI computation. Training involves teaching the AI model to discern patterns within extensive data sets, while inference entails leveraging the AI to make decisions based on new information.
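To make that contrast concrete, here is a minimal sketch, assuming a Python environment with PyTorch installed, that times the same large matrix multiplication on the CPU and, if one is available, on a GPU.

```python
# Minimal sketch: the parallel math (matrix multiplication) behind AI workloads.
# Assumes PyTorch is installed; uses a GPU only if one is available.
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

start = time.time()
a @ b  # runs on a handful of powerful, largely sequential CPU cores
cpu_seconds = time.time() - start

a_dev, b_dev = a.to(device), b.to(device)
start = time.time()
a_dev @ b_dev  # spread across thousands of small cores when a GPU is present
if device == "cuda":
    torch.cuda.synchronize()  # wait for the asynchronous GPU kernel to finish
gpu_seconds = time.time() - start

print(f"CPU: {cpu_seconds:.3f}s | {device}: {gpu_seconds:.3f}s (rough, unwarmed timing)")
```

The same parallel multiplications serve both phases: during training they run over and over while the model’s weights are adjusted, and during inference they run once per query over fixed weights.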
GPUs serve as the general-purpose workhorses for Nvidia and its primary rival, Advanced Micro Devices (AMD). Software constitutes a significant differentiator between the two leading GPU manufacturers. Nvidia GPUs are tightly optimized around CUDA, Nvidia’s proprietary software platform. Meanwhile, AMD GPUs utilize a largely open-source software ecosystem, attracting developers who prefer greater flexibility and interoperability.
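In practice, most developers touch these software stacks only through frameworks. As a rough illustration, again assuming a Python environment with PyTorch, the same model code can target either vendor: PyTorch’s ROCm build reuses the torch.cuda device API, so the CUDA-versus-ROCm difference sits below the framework rather than in the model code.

```python
# Sketch: vendor-neutral model code; the framework dispatches to CUDA (Nvidia)
# or ROCm (AMD) underneath, depending on which PyTorch build is installed.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"  # True on CUDA and ROCm builds
model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
print(model(x).shape, "on", device)
```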
AMD and Nvidia offer their GPUs to cloud providers such as Amazon, Microsoft (MSFT), Google, Oracle (ORCL), and CoreWeave (CRWV), who then rent out the GPUs to AI companies on an hourly or per-minute basis. For instance, Anthropic’s $30 billion deal with Nvidia and Microsoft incorporates 1 gigawatt of compute capacity based on Nvidia GPUs. AMD has also recently secured substantial commitments from OpenAI and Oracle.
Nvidia also sells GPUs directly to AI companies, as evidenced by a recent agreement to supply at least 4 million GPUs to OpenAI, as well as to foreign governments, including South Korea, Saudi Arabia, and the United Kingdom. The company reported to CNBC that it charges roughly $3 million for a server rack equipped with 72 Blackwell GPUs operating as a single unit, shipping approximately 1,000 racks each week. This reflects not only the scale of current AI compute demand but also the premium attached to advanced GPU-based solutions.
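A quick back-of-envelope calculation puts those figures in perspective; this is a sketch using the approximate numbers above, not Nvidia’s accounting.

```python
# Rough arithmetic from the figures quoted above; list-price approximation only.
rack_price = 3_000_000   # ~$3 million per 72-GPU Blackwell rack
gpus_per_rack = 72
racks_per_week = 1_000

print(f"per GPU:  ~${rack_price / gpus_per_rack:,.0f}")               # ~$41,667
print(f"per week: ~${rack_price * racks_per_week / 1e9:.0f} billion")  # ~$3 billion
```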
Dion Harris, Nvidia’s senior director of AI infrastructure, expressed astonishment at the overwhelming demand, saying, “When we were talking to people about building a system that had eight GPUs, they thought that was overkill.”
ASICs for Custom Cloud AI
While GPU-based training has been essential during the initial boom of large language models, inference is increasingly important as these models evolve. Inference can be executed on less powerful chips tailored for specific tasks, which is where ASICs become valuable.
In contrast to a GPU, which can perform numerous parallel math functions for various AI workloads, an ASIC is analogous to a specialized tool, highly efficient and designed for a specific task.
Google released its 7th generation TPU, Ironwood, in November 2025, a decade after creating its first custom ASIC for AI in 2015.
Chris Miller, author of “Chip War,” notes, “You can’t change them once they’re already carved into silicon, so there’s a trade-off in terms of flexibility.”
Nvidia’s GPUs offer enough flexibility to be adopted by numerous AI companies, but they can cost upwards of $40,000 and may be challenging to acquire. Startups nevertheless tend to rely on GPUs because designing a custom ASIC carries a far higher upfront cost, starting at tens of millions of dollars, according to Miller. For large cloud providers, however, the efficiency gains and long-term cost savings can be substantial.
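An illustrative break-even sketch shows the shape of that trade-off; the per-chip ASIC cost and the one-for-one replacement assumption below are hypothetical, chosen only to make the arithmetic concrete, not figures from any vendor.

```python
# Hypothetical break-even sketch: when does a custom ASIC program beat buying GPUs?
gpu_unit_cost = 40_000         # "upwards of $40,000" per GPU (figure cited above)
asic_design_cost = 50_000_000  # "tens of millions" of dollars up front (cited above)
asic_unit_cost = 15_000        # hypothetical per-chip manufacturing cost

# Assume, for simplicity, one ASIC replaces one GPU for the target workload.
break_even_chips = asic_design_cost / (gpu_unit_cost - asic_unit_cost)
print(f"break-even at ~{break_even_chips:,.0f} chips")  # ~2,000 chips at these numbers
```

At the deployment scales cited later in this article, hundreds of thousands to a million chips, an upfront design cost of this size amortizes quickly.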
As Newman puts it, “They want to have a little bit more control over the workloads that they build. At the same time, they’re going to continue to work very closely with Nvidia, with AMD, because they also need the capacity. The demand is so insatiable.” This dual-sourcing strategy allows cloud providers to balance performance, cost, and supply chain security.
Google pioneered the development of custom ASICs for AI acceleration, coining the term Tensor Processing Unit (TPU). The company contemplated building a TPU as early as 2006, but the effort became “urgent” in 2013, when it realized AI workloads were on track to double the number of data centers it would need. In 2017, the TPU helped Google invent the Transformer, the architecture powering the vast majority of modern AI.
A decade after its initial TPU, Google unveiled its seventh-generation TPU, Ironwood, in November. Anthropic announced plans to train its LLM Claude using up to 1 million TPUs, underscoring the growing demand for specialized AI hardware. According to Miller, some consider TPUs technically on par with or potentially superior to Nvidia’s GPUs.
According to Miller, “Traditionally, Google has only used them for in-house purposes. There’s a lot of speculation that in the longer run, Google might open up access to TPUs more broadly.” This potential shift could significantly alter the competitive dynamics of the AI compute market.
After acquiring Israeli chip startup Annapurna Labs in 2015, Amazon Web Services (AWS) became the next cloud provider to design its own AI chips. AWS announced Inferentia in 2018 and launched Trainium in 2022. AWS is anticipated to announce the third generation of Trainium in December.
Ron Diamant, Trainium’s head architect, told CNBC that Amazon’s ASIC offers 30% to 40% better price-performance than the other hardware available in AWS.
“Over time, we’ve seen that Trainium chips can serve both inference and training workloads quite well,” Diamant stated.
CNBC’s Katie Tarasov holds a Trainium2 AI chip at Amazon Web Services’ new AI data center in New Carlisle, Indiana, on October 8, 2025. (Photo: Erin Black)
In October, CNBC visited Indiana for a tour of Amazon’s largest AI data center, where Anthropic is training its models using half a million Trainium2 chips. AWS also fills its other data centers with Nvidia GPUs to meet the demand from AI customers like OpenAI.
Due to the challenges associated with building ASICs, companies often turn to chip designers Broadcom and Marvell. Miller explains that they “provide the IP and the know-how and the networking” to assist their clients in building their ASICs. These companies offer critical design expertise and intellectual property that streamlines the development of custom AI chips.
“So you’ve seen Broadcom in particular be one of the biggest beneficiaries of the AI boom,” Miller said.
Broadcom assisted in building Google’s TPUs and Meta (META)’s Training and Inference Accelerator, which launched in 2023. Broadcom also has a new agreement to assist OpenAI in creating its own custom ASICs beginning in 2026. This partnership suggests a long-term strategy to control critical compute resources.
Microsoft is also entering the ASIC market, telling CNBC that its in-house Maia 100 chips are currently deployed in its data centers in the eastern U.S. Other companies pursuing custom ASICs include Qualcomm with its AI200 and AI250, Intel (INTC) with its Gaudi AI accelerators, and Tesla (TSLA) with its AI5 chip. Numerous startups are also betting entirely on custom AI chips, such as Cerebras, which builds massive full-wafer AI chips, and Groq, which designs inference-focused language processing units. These startups aim to differentiate themselves through novel architectures and specialized capabilities.
In China, Huawei, ByteDance, and Alibaba are developing custom ASICs, though export restrictions on advanced equipment and AI chips pose a hurdle. This signifies a broader trend of localization and self-sufficiency in AI chip development.
Edge AI with NPUs and FPGAs
The final major category of AI chips includes those designed to run on devices rather than in the cloud. These chips are generally integrated into a device’s main System-on-a-Chip (SoC). Edge AI chips allow devices to possess AI capabilities while minimizing battery consumption and component space.
Saif Khan, former White House AI and semiconductor policy advisor, suggests, “You’ll be able to do that right on your phone with very low latency, so you don’t have to have communication all the way back to a data center. And you can preserve privacy of your data on your phone.”
Neural Processing Units (NPUs) represent a key type of edge AI chip. Qualcomm, Intel, and AMD are producing NPUs that facilitate AI capabilities in personal computers.
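NPUs are generally optimized for small, low-precision models, so a typical step before on-device deployment is shrinking a trained network. The sketch below, assuming a Python environment with PyTorch (real NPU toolchains are vendor-specific and vary by device), shows one common form of that step: int8 dynamic quantization.

```python
# Sketch: shrinking a model with int8 dynamic quantization, the kind of
# low-precision preparation edge accelerators favor. PyTorch API shown;
# actual on-device deployment goes through vendor-specific NPU toolchains.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(256, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8  # int8 weights: smaller, cheaper math
)
print(quantized)
```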
Although Apple does not explicitly use the term NPU, the in-house M-series chips within its MacBooks incorporate a dedicated neural engine. Additionally, Apple has integrated neural accelerators into the most recent iPhone A-series chips.
Tim Millet, Apple’s vice president of platform architecture, stated to CNBC in September, “It is efficient for us. It is responsive. We know that we are much more in control over the experience.” This approach offers better performance and privacy control.
The newest Android phones come equipped with NPUs integrated into their primary Qualcomm Snapdragon chips, and Samsung (OTC:SSNLF) has its own NPU on its Galaxy phones. NPUs manufactured by companies like NXP Semiconductors (NXPI) and Nvidia power AI embedded in cars, robots, cameras, and smart home devices.
According to Miller, “Most of the dollars are going towards the data center, but over time that’s going to change because we’ll have AI deployed in our phones and our cars and wearables, all sorts of other applications to a much greater degree than today.”
Then there are Field-Programmable Gate Arrays, or FPGAs, which allow reconfiguration through software after production. While offering greater flexibility than NPUs or ASICs, FPGAs generally provide poorer raw performance and energy efficiency for AI workloads.
Following its $49 billion acquisition of Xilinx in 2022, AMD became the largest FPGA manufacturer. Intel is second thanks to its $16.7 billion purchase of Altera in 2015.
The companies designing these AI chips rely on Taiwan Semiconductor Manufacturing Company (TSMC) to manufacture them, and TSMC’s chip fabrication plant in Arizona now offers a domestic option.
Nvidia CEO Jensen Huang announced in October that Blackwell GPUs were in “full production” in Arizona, a step toward securing the supply chain and meeting demand closer to home.
Although the AI chip space is crowded, dethroning Nvidia will require considerable effort and innovation.
As Newman asserts, “They have that position because they’ve earned it and they’ve spent the years building it. They’ve won that developer ecosystem.” Nvidia’s established ecosystem presents a significant competitive advantage for the company.