Two $20 billion battles: OpenAI and NVIDIA are fighting the “Inference War”
NVIDIA and OpenAI have each committed $20 billion to stake out the AI inference chip market, while Cerebras files for an IPO at a $35 billion valuation. This quiet war for control of the future of AI computing power is reshaping a tech market worth hundreds of billions of dollars. This article is sourced from Wall Street Insights and compiled and reported by PANews.
In December 2025, NVIDIA quietly spent $20 billion to buy an AI chip company called Groq.
On April 17, 2026, OpenAI announced it will purchase more than $20 billion worth of chips from another AI chip company, Cerebras. On the same day, Cerebras officially submitted its IPO filing to Nasdaq, targeting a valuation of $35 billion.
The two sums are almost identical. One is an acquisition, the other a purchase. One comes from the world’s largest AI chip seller, the other from the world’s largest AI buyer.
These are not two independent events; they are two symmetrical moves in the same war. The battlefield is AI inference.
Most people haven’t noticed this war, because it has no explosions, only one financial announcement after another and technical debates circulating among Silicon Valley engineers. But its impact could be more far-reaching than any AI product launch of the past two years, because it is redistributing control over what is almost certain to become the largest technology market in history.
What inference is, and why “training” is no longer the keyword for 2026
Before discussing the two $20 billion moves, we need one piece of background: the battleground for AI chips is shifting its center of gravity.
Training and inference are the two stages of AI compute consumption. Training builds the model: massive amounts of data are fed into a neural network so it learns certain capabilities, a process that typically happens once and is then updated periodically. Inference uses the model: every time a user asks a question and ChatGPT produces an answer, an inference request runs behind the scenes.
In 2023, the biggest share of global AI computing spend went to training, with inference playing a supporting role.
But this ratio is rapidly flipping.
According to market research data from Deloitte and CES 2026, inference already accounted for 50% of all AI compute spending in 2025; in 2026, that proportion will jump to two-thirds. Lenovo CEO Yang Yuanqing put it even more plainly at CES: the structure of AI spending will flip completely, from “80% training + 20% inference” to “20% training + 80% inference.”
The logic isn’t complicated: training is a one-time cost, while inference is an ongoing cost. GPT-4 was trained once, but it answers questions from hundreds of millions of users every day, and every conversation is an inference request. Once a model is deployed at scale, cumulative inference consumption far exceeds training.
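To make that concrete, here is a back-of-envelope sketch in Python with purely hypothetical round numbers; the training cost, per-request cost, and request volume below are illustrative assumptions, not figures from this article:

```python
# Back-of-envelope sketch with hypothetical round numbers (not real figures):
# a one-time training cost versus an ongoing per-request inference cost.

training_cost_usd = 100e6        # assumed one-time training cost: $100M
cost_per_request_usd = 0.002     # assumed compute cost per inference request
requests_per_day = 500e6         # assumed daily request volume: 500M

daily_inference_usd = cost_per_request_usd * requests_per_day
days_to_match_training = training_cost_usd / daily_inference_usd

print(f"daily inference spend:  ${daily_inference_usd:,.0f}")   # $1,000,000
print(f"days to equal training: {days_to_match_training:.0f}")  # 100
```

Under these assumed numbers, serving alone matches the entire training bill in about a hundred days, and everything after that is pure inference spend.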
What does that mean? It means the most profitable slice of the AI industry’s pie is shifting from training chips to inference chips. And the two kinds of chips demand drastically different architectures.
NVIDIA’s problem: chips designed for training are inherently ill-suited to inference
NVIDIA’s H100 and H200 are monsters built for training. Their core advantage is extremely high computational throughput: training requires enormous numbers of multiply operations over massive matrices, and GPUs excel at exactly this kind of many-core parallel computation.
But the bottleneck in inference isn’t computation; it’s memory bandwidth.
When a user asks a question, the chip must move the model’s weights from memory into the compute units before it can generate the answer, and in autoregressive generation this happens again for every token produced. That moving is the true source of inference latency. NVIDIA’s GPUs use external high-bandwidth memory (HBM), and the transfer inevitably introduces delay. For ChatGPT, which handles thousands of requests per second, multiply that delay by scale and it becomes a real performance bottleneck.
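A minimal sketch of that arithmetic, assuming a hypothetical 70-billion-parameter model stored in FP16 and the H100’s roughly 3.35 TB/s of HBM bandwidth: since the weights must be streamed for every generated token, bandwidth alone caps single-stream decoding speed, no matter how many FLOPs the chip has.

```python
# Roofline-style sketch: single-stream autoregressive decoding is memory-bound,
# because every generated token requires streaming the model weights from
# memory into the compute units. All figures here are assumptions.

params = 70e9               # assumed model size: 70B parameters
bytes_per_param = 2         # FP16/BF16 weights
hbm_bandwidth = 3.35e12     # H100 SXM HBM3 bandwidth, roughly 3.35 TB/s

weight_bytes = params * bytes_per_param               # ~140 GB per token pass
min_seconds_per_token = weight_bytes / hbm_bandwidth
max_tokens_per_second = 1 / min_seconds_per_token

print(f"weights streamed per token: {weight_bytes / 1e9:.0f} GB")
print(f"bandwidth-bound ceiling:   ~{max_tokens_per_second:.0f} tokens/s")
```

Under these assumptions the ceiling is around 24 tokens per second for a single stream, regardless of raw compute; batching and parallelism raise aggregate throughput, but the per-stream latency floor is set by the memory pipe.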
OpenAI engineers ran into this problem while optimizing Codex (the code generation tool): no matter how they tuned parameters, response speed remained capped by the architectural limits of NVIDIA GPUs.
In other words, NVIDIA’s disadvantage on the inference side is not a matter of effort—it’s an architecture issue.
Cerebras’s WSE-3 chip takes an entirely different approach. The chip is so large it is built at wafer scale: 46,225 square millimeters, larger than a human palm, integrating 900,000 AI cores and 44GB of ultra-high-speed SRAM on a single die. Memory sits directly next to the compute cores, shrinking the moving distance from centimeters to micrometers. The result: inference speeds 15 to 20 times faster than an NVIDIA H100.
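Plugging an on-chip SRAM figure into the same estimate shows why the architecture matters. The WSE-3 bandwidth below is Cerebras’ advertised aggregate number and should be read as an assumption for illustration:

```python
# Same bandwidth-bound estimate, substituting an on-chip SRAM figure.
# Cerebras advertises on the order of 21 PB/s of aggregate SRAM bandwidth
# for the WSE-3 (treated here as an assumption). Note that a 140 GB model
# exceeds one wafer's 44 GB of SRAM, so real deployments shard weights
# across systems; this only illustrates the raw bandwidth ratio.

weight_bytes = 140e9        # same hypothetical 70B FP16 model as above
hbm_bandwidth = 3.35e12     # H100 HBM, ~3.35 TB/s
sram_bandwidth = 21e15      # WSE-3 aggregate on-chip SRAM, ~21 PB/s (assumed)

print(f"raw bandwidth ratio: ~{sram_bandwidth / hbm_bandwidth:,.0f}x")
# Realized end-to-end speedups are far smaller (the article cites 15 to 20x),
# but the ratio shows why moving memory on-die lifts the inference ceiling.
```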
To be clear, NVIDIA isn’t standing still. Its latest Blackwell (B200) architecture improves inference performance by roughly 4 times over the H100 and is being deployed at scale. But Blackwell is chasing a moving target: Cerebras is iterating at the same time, and it is no longer the only inference-focused challenger in the chip market.
NVIDIA’s $20 billion: the admission behind its largest acquisition ever
On December 24, 2025, NVIDIA announced what would be its largest acquisition ever.
The target is Groq.
Groq is a competitor in the same category as Cerebras, also building SRAM-based chips optimized for inference, which it calls the LPU (Language Processing Unit). At the time, Groq ran what public benchmarks showed to be the fastest inference service in the world. NVIDIA spent $20 billion to buy all of Groq’s core technology and its founding team, including founder Jonathan Ross and several top chip engineers who came out of Google’s TPU team.
It is NVIDIA’s largest acquisition ever, nearly three times the size of its 2019 purchase of Mellanox for roughly $7 billion.
In the view of many analysts, the message conveyed behind this money is far more important than the amount itself: NVIDIA believes it has a structural gap on the inference side, and that this gap is so large that it’s worth spending $20 billion to plug it.
If NVIDIA truly believed its GPUs were invincible at inference, it would not need to acquire Groq at all. In essence, this is a $20 billion technology procurement order: an acknowledgment that the on-die SRAM architecture has real advantages in inference scenarios, and that NVIDIA’s existing product line cannot naturally cover that ground. It paid top dollar for a technology gap it could not close on its own.
Of course, NVIDIA’s official narrative after the acquisition tells a different story: “deep integration with Groq to provide a more complete inference solution.” Translated from corporate language: we realized our own technology wasn’t enough, so we bought someone else’s.
OpenAI’s $20 billion: buying chips is surface-level—the equity stake is the key
Now back to OpenAI.
In January 2026, OpenAI and Cerebras signed a three-year compute procurement agreement worth $10 billion. Media coverage at the time framed it lightly, as “OpenAI diversifying its chip suppliers.”
But the details disclosed on April 17 fundamentally change the nature of the deal:
First, the procurement amount doubled, from $10 billion to $20 billion.
Second, OpenAI will receive warrants for Cerebras shares; as the procurement scales up, the stake can reach up to 10% of Cerebras’ total shares.
Third, OpenAI will provide Cerebras with $1 billion in data center construction funding. In effect, OpenAI is helping its own supplier build capacity.
Put these three details together and the picture that emerges is entirely different: OpenAI isn’t just buying chips—OpenAI is incubating a supplier.
This logic has a clear precedent in the history of technology. Apple started with Samsung-manufactured chips in the early iPhones, introduced its own A-series designs in 2010, and eventually replaced Intel processors in Macs with its M-series silicon, pulling control of the supply chain entirely in-house. What OpenAI is doing is somewhat similar, but with one important boundary: Apple held the chip design rights from early on, while OpenAI is still the purchaser. After Cerebras goes public, it will develop independently and serve more customers. The end point may not be OpenAI fully controlling Cerebras; more likely, it is a jointly built ecosystem with deep interdependence.
On one side, $20 billion and an equity stake tie up Cerebras, ensuring a continuing supply of non-NVIDIA inference compute. On the other, OpenAI is collaborating with Broadcom on its own ASIC chips, expected to enter mass production by the end of 2026. Walking on two legs at once, with compute autonomy as the destination.
Cerebras’ IPO: what are you actually buying?
On April 17, Cerebras officially submitted its IPO application to Nasdaq, targeting a valuation of $35 billion, with plans to raise $3 billion.
This valuation is more than four times its $8.1 billion valuation in September 2025. It just completed a new funding round in February, when the valuation had already risen to $23 billion; the IPO target of $35 billion adds a further 52% premium on that.
Those familiar with Cerebras’ history know this is its second attempt to list. The first, in 2024, was forced to withdraw because its core customer G42 (the Abu Dhabi-based AI group) accounted for between 83% and 97% of revenue in the periods disclosed, and CFIUS stepped in to review the deal on national security grounds.
This time, G42 has disappeared from the top of the customer list, replaced by OpenAI.
In other words, the structural issue of Cerebras’ customer concentration has not been fundamentally solved; only the name of the big customer has changed, while the dependency pattern remains. The judgment investors must make is whether this big customer is better or worse. From a credit perspective, OpenAI is clearly better than G42. From a strategic perspective, OpenAI is also incubating a competitor to Cerebras: once OpenAI’s in-house ASIC matures, it poses a real replacement threat.
To be fair, Cerebras is actively courting other customers, and the prospectus is expected to show more diversified revenue sources, which should ease the concentration problem. But until OpenAI’s in-house chips reach mass production, the question remains open.
Buying Cerebras stock is, in effect, a bet on two things: that OpenAI will keep choosing Cerebras, and that OpenAI’s in-house ASIC will not arrive early. Neither is certain.
Of course, the bullish case is also real: if the inference market grows along its projected trajectory, even a small share of it would be substantial in absolute terms. The question isn’t whether Cerebras has a chance; it’s whether the $35 billion valuation already prices in the most optimistic scenario.
Two $20 billion figures appear symmetrically between the end of 2025 and April 2026.
One comes from the world’s largest AI chip seller, buying the technology of a competitor in the inference market.
One comes from the world’s largest AI buyer, incubating a company that challenges NVIDIA in the inference market.
NVIDIA’s $20 billion is defense: paying a record price to plug a technology gap it cannot close on its own.
OpenAI’s $20 billion is offense—it is burning money to build an inference superhighway that does not rely on NVIDIA, while also securing equity warrants for a toll station on that road.
This war has no gunfire, but the flow of capital never lies. These two transactions say more clearly than any AI product launch: control over AI inference infrastructure is up for grabs, in a market that will account for two-thirds of the industry’s total compute spending in 2026.
Cerebras’ IPO is the clarion call of this war.