GPU and Compute Economics
Definition
The economic dynamics of AI compute — how GPUs are priced, rented, depreciated, and monetized — and why supply constraints are inverting traditional hardware depreciation assumptions.
Key Points
- GPUs are appreciating rather than depreciating: better models extract more valuable tokens per GPU, so older hardware such as the H100 is worth more today than at launch (dwarkesh dylan patel interview)
- H100 one-year-contract rental prices up ~40% in six months ($1.70 → $2.35/hr) despite newer hardware shipping (great gpu shortage rental capacity)
- The GPU rental market is shifting from hourly pricing (anchored to the cost of silicon) to token-based pricing (anchored to the value of output), a 2-4x revenue uplift (clouded judgement per token pricing)
- Anthropic ARR nearly tripled in one quarter ($9B → $25B+), driving demand that overwhelms supply (great gpu shortage rental capacity)
- All on-demand GPU capacity sold out; Blackwell booked through Q3 2026 (great gpu shortage rental capacity)
- 5-10x ROI on AI tools leaves substantial pricing headroom before demand destruction (great gpu shortage rental capacity)
- Alchian-Allen effect: a fixed compute cost layered onto every model lowers the relative price of premium models, pushing users toward the best (most expensive) ones (dwarkesh dylan patel interview)
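The per-token repricing and the Alchian-Allen effect above come down to back-of-envelope arithmetic, sketched below. The throughput figure, token prices, and fixed-cost figure are illustrative assumptions, not numbers from the sources; only the $2.35/hr rental rate comes from the note.

```python
# Hourly vs token-based GPU monetization, back of the envelope.
# All figures except hourly_rate are illustrative assumptions.

hourly_rate = 2.35            # $/GPU-hour, H100 one-year rate cited in the note
tokens_per_hour = 2_000_000   # assumed sustained output tokens per GPU-hour
price_per_mtok = 3.00         # assumed $ per million output tokens

token_revenue = tokens_per_hour / 1e6 * price_per_mtok
uplift = token_revenue / hourly_rate
print(f"token-based revenue: ${token_revenue:.2f}/GPU-hour, {uplift:.1f}x vs hourly")

# Alchian-Allen effect: adding the same fixed cost to a cheap and a premium
# model narrows their relative price gap, making the premium one relatively cheaper.
cheap, premium = 0.50, 3.00   # assumed $ per million tokens
fixed = 2.00                  # assumed fixed compute/overhead cost per million tokens
print(f"price ratio before: {premium / cheap:.1f}x, after: {(premium + fixed) / (cheap + fixed):.1f}x")
```

Under these assumed numbers the uplift lands inside the 2-4x range the note cites, and the premium/cheap price ratio drops from 6x to 2x once the fixed cost is added, which is the Alchian-Allen mechanism in miniature.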
Open Questions
- When will GPU supply catch up with demand? Dylan Patel suggests not this decade.
- Will credit-based pricing replace per-token and per-seat models across the industry?
- How will Nvidia's Groq LPU acquisition [UNVERIFIED] change the inference cost curve?
- At what point does demand destruction from high prices actually kick in?
- If "the age of scaling is over" [UNVERIFIED] (dwarkesh ilya sutskever 2), does GPU demand growth shift from training to inference and agent workloads, and what does that mean for capital allocation between training clusters and inference fleets?
Related Concepts
- semiconductor supply chain bottlenecks
- inference architecture and scaling
- token economics and pricing
- ai agent ecosystem
- ai scaling limits and research paradigm — scaling limits thesis challenges assumptions about indefinitely growing training compute demand