TPUs Advance on Nvidia
Metadata
- Description: What happens when AI accelerator demand is no longer synonymous with "Nvidia GPUs"?
- Publication: Inference Draft 2026-19
- Published:
- Last Modified:
- Type: newsletter
- Tags: ai
- POSSE: Substack
What happens when AI accelerator demand is no longer synonymous with “Nvidia GPUs”? Google (Alphabet) announced it is now delivering its TPUs to select customers’ own data centers. While Nvidia stock took a same-day leg down, likely on the China export restriction revenue data the company shared, I think there’s also a medium-term story of the accelerator mix shifting away from Nvidia GPUs.
Why would Google’s TPUs be a credible substitute for Nvidia GPUs?
Google’s blog post on its eighth-generation TPU architecture offers some insight into where the chips might be best used (emphasis mine):
- massive-scale pre-training
- high-concurrency reasoning
Their wording obviously points towards hyperscalers, frontier labs, and massive organizations that have large, repetitive, well-optimized workloads. These TPUs aren’t designed to be universal substitutes for GPUs across all workloads, but with scale and engineering talent they can improve cost efficiency versus GPU stacks.
Why would customers like Anthropic choose TPUs over (or alongside) Nvidia?
The obvious answer is cost and power efficiency. The more interesting angle to consider is bargaining power: TPUs (and other AI accelerators) give customers optionality when selecting infrastructure for their workloads. The question for Nvidia is then whether it can maintain pricing power as competitors enter the chip mix.
Why might Nvidia’s moat remain intact despite AI accelerator advances?
It’s important to note that Nvidia’s moat extends beyond its chips: developer familiarity with CUDA and its related libraries, data centers designed specifically with Nvidia’s rack-scale systems in mind, NVLink networking, and the list goes on. Even Sundar Pichai mentioned on Alphabet’s earnings call that Nvidia GPUs are a core part of Google Cloud’s AI accelerator portfolio.
It isn’t as simple as buying a new set of chips and swapping them into a rack, because it’s often necessary to both retool the data center and redevelop the software.
Anyone else?
- Amazon’s Trainium chips, which Anthropic has also signed commitments for, though Amazon has yet to announce whether it will sell them to third parties.
- Cerebras’ Wafer Scale Engine chips, from a company that counts a long list of big-name investors ahead of its upcoming IPO.
- Huawei’s Ascend chips, which are filling the void in China left by the export restrictions placed on Nvidia.
Mine Print Hash
On last week’s podcast, Matt and I go in depth on the power struggle underway at the Fed and on the UAE withdrawing from OPEC and its implications for the global dollar system, then round out the episode with a discussion of recent Chinese and American sovereign power plays within the AI space.
Open Threads
On privacy, or should I say, surveillance:
- Instagram shuts down end-to-end encrypted messaging. Link
- German Bundestag admin recommends using Wire over Signal, because it “has certification from the Federal Office for Information Security (BSI)”. Because the government-sponsored messaging app certainly has no backdoors. Link
- Russia continues creating its own “sovereign national internet,” similar to China’s Great Firewall, EU’s Digital Sovereignty initiative, and… Utah’s VPN law? Link
Agentic AI:
- Microsoft launches Agent 365, which gives agents similar roles, permissions, and governance to a human employee. An important step in unlocking agentic AI for corporates. Link
- I’m still unconvinced any of these Anthropic-released agents have actually moved the needle much, but if you view them as sales demos they make more sense. Link