Ilya Sutskever — We're Moving from the Age of Scaling to the Age of Research

last_validated: 2026-04-06 decay_rate: slow

Summary

In his second appearance on the Dwarkesh Podcast, Ilya Sutskever — now leading Safe Superintelligence Inc. (SSI) — declares that "the age of scaling is over" and that AI progress will hinge on fundamental research breakthroughs rather than brute-force compute scaling. He argues that pre-training on internet data has hit a ceiling ("there is only one internet") and that current models suffer from "jagged generalization" — acing complex benchmarks while failing simple tasks. SSI's approach treats alignment as a core design constraint, not an afterthought, with the goal of building AI that "cares for sentient life." Sutskever contends that RL is consuming increasing compute for only modest gains compared to pre-training, and that a currently unknown machine learning principle is needed to unlock human-like generalization efficiency. The interview frames the competitive frontier as a research race rather than a resource race.

Key Claims

  • "The age of scaling is over" — pre-training on static internet data has reached diminishing returns; "there is only one internet"
  • Progress now requires the "age of research": new algorithmic breakthroughs, synthetic data, and models that learn from deployment and interaction, not just static pre-training
  • Models exhibit "jagged generalization" — solving graduate-level problems while failing basic reasoning or getting stuck in debugging loops; over-optimized RL risks benchmark overfitting
  • RL is consuming increasing compute relative to pre-training but yields only modest learning gains — raising questions about RL as a shortcut to AGI
  • SSI is structured research-first: alignment and safety are core design constraints, not add-ons; goal is AI that inherently values "caring for sentient life"
  • A "specific, currently unknown machine learning principle" is needed for efficient, robust generalization akin to human cognition
  • Another 100× compute scaling might move the needle but would not fundamentally transform capabilities — algorithmic innovation is essential
  • Superintelligence deployment should be gradual, with systems continually learning from real-world use

Tags

#scaling #research #agi #safety #alignment #ssi #generalization #pre-training #reinforcement-learning #superintelligence

Related

  • dwarkesh dylan patel interview — same podcast; Dylan Patel provides the supply-side compute perspective that Sutskever's "scaling is over" thesis challenges
  • dwarkesh thoughts on ai progress dec 2025 — Dwarkesh's own Dec 2025 essay extends Sutskever's thesis with the mid-training supply chain contradiction and Toby Ord's 1,000,000x RL estimate
  • gpu and compute economics — if scaling plateaus, compute demand dynamics shift from training to inference and agent workloads
  • ai agent ecosystem — "jagged generalization" explains why harness engineering matters: brittle model reasoning requires orchestration scaffolding
  • inference architecture and scaling — if pre-training gains plateau, inference optimization becomes the primary monetization lever