The relationship between OpenAI and Nvidia may not be breaking — but it’s clearly being tested.
According to a Reuters report, OpenAI has been quietly exploring alternatives to Nvidia’s AI chips since last year, driven by dissatisfaction with how some of Nvidia’s latest hardware performs during AI inference — the stage where models like ChatGPT respond to real-time user queries.
The shift adds a new layer of complexity to what was expected to be one of the biggest partnerships in the AI boom.
Why Inference Is Suddenly a Big Deal
While Nvidia still dominates AI training chips, the industry’s focus is rapidly moving toward inference and reasoning — the moment users actually experience speed, responsiveness, and usefulness.
For OpenAI, inference performance is becoming critical, especially as products like ChatGPT and Codex scale to millions of simultaneous users. Several sources told Reuters that OpenAI is dissatisfied with the speed at which Nvidia’s GPUs generate responses for certain tasks, particularly software development and AI-to-software interactions.
One insider said OpenAI is now seeking hardware that could eventually handle around 10% of its inference workload — a small slice on paper, but strategically significant.
The $100 Billion Deal That Slowed Down
This change in priorities may help explain why Nvidia’s proposed $100 billion investment in OpenAI, first revealed in September, has yet to close.
The deal was expected to wrap up within weeks, giving Nvidia a stake in OpenAI and OpenAI the cash needed to secure massive chip supply. Instead, negotiations have dragged on for months, reportedly complicated by OpenAI’s changing compute roadmap.
Despite the friction, both sides are publicly downplaying tensions. Nvidia CEO Jensen Huang recently dismissed reports of strain as “nonsense,” while OpenAI CEO Sam Altman said Nvidia still makes “the best AI chips in the world” and remains a key partner.
Who OpenAI Is Talking To Instead
Behind the scenes, however, OpenAI has explored inference-focused deals with:
- AMD
- Cerebras
- Groq
The appeal lies in SRAM-heavy chip designs, which embed large amounts of memory directly on the chip. This architecture can sharply reduce the time it takes to fetch data, a crucial advantage for chatbots responding in real time.
Inference workloads tend to be limited more by memory access than by raw arithmetic throughput, which makes traditional GPU designs that rely on external memory less efficient at this stage.
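To make the memory argument concrete, here is a rough back-of-envelope sketch in Python. It assumes that generating each token requires reading every model weight once and that memory bandwidth is the only bottleneck. The model size and both bandwidth figures are illustrative placeholders, not measurements or specifications of any Nvidia, Cerebras, or Groq product.

```python
# Back-of-envelope estimate of memory-bound token generation speed.
# ASSUMPTION: each generated token streams all model weights from memory once,
# and nothing else (compute, networking, batching) limits throughput.
# All constants below are hypothetical values chosen only for illustration.

WEIGHT_BYTES = 140e9   # hypothetical model: 70 billion parameters at 2 bytes each
EXTERNAL_BW = 3e12     # hypothetical external-memory bandwidth, bytes per second
ON_CHIP_BW = 100e12    # hypothetical aggregate on-chip SRAM bandwidth, bytes per second


def max_tokens_per_second(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode speed when memory bandwidth is the limit."""
    if weight_bytes <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("weight_bytes and bandwidth_bytes_per_s must be positive")
    return bandwidth_bytes_per_s / weight_bytes


for label, bandwidth in [("external memory", EXTERNAL_BW), ("on-chip SRAM", ON_CHIP_BW)]:
    tokens = max_tokens_per_second(WEIGHT_BYTES, bandwidth)
    print(f"{label}: about {tokens:.0f} tokens per second at most")
```

Under these assumed numbers, the on-chip case comes out roughly 33 times faster, purely because the weights can be read more quickly. Real systems batch requests and may need many chips to hold a model in SRAM, so this is only a sketch of why memory bandwidth, rather than math throughput, can set the ceiling on response speed.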
Codex Exposed the Problem
Inside OpenAI, the issue reportedly became most visible with Codex, the company’s AI coding product. Staff internally linked some of Codex’s speed limitations to Nvidia’s GPU-based infrastructure.
Altman himself acknowledged in late January that customers using OpenAI’s coding tools place a “big premium on speed,” a sign that inference performance is now a top priority.
OpenAI has already announced a deal with Cerebras to address these needs, with Altman noting that speed matters less for casual ChatGPT users — but is critical for professional and developer-facing tools.
Nvidia Moves to Protect Its Turf
As OpenAI began testing alternatives, Nvidia reportedly moved quickly to shore up its position, approaching several SRAM-focused chipmakers about potential acquisitions.
Cerebras declined and instead signed a commercial agreement with OpenAI. Groq, meanwhile, became a more complicated case. Sources say Nvidia struck a $20 billion licensing deal with Groq — effectively freezing OpenAI’s discussions with the startup — while also hiring away Groq’s chip designers.
Nvidia says Groq’s technology complements its roadmap, but industry insiders see the move as a defensive play in a fast-shifting market.
Big Picture: A Turning Point for AI Hardware
The developments highlight a deeper shift in AI economics. Training massive models got the industry here — but inference is where the next phase of competition will be won.
Rivals like Google already rely heavily on their own TPUs, which can give products like Gemini an advantage in inference-heavy tasks. OpenAI’s willingness to look beyond Nvidia suggests no single chipmaker can dominate forever.
Final Words
Nvidia remains the backbone of OpenAI’s infrastructure — for now. But OpenAI’s exploration of alternative inference chips signals a future where speed, memory architecture, and task-specific hardware matter more than brute-force training power.
The AI boom isn’t slowing down — it’s evolving. And as inference becomes the real battlefield, even the strongest alliances may need to adapt.
