AGI: Francois Chollet + Sam Altman
TL;DR
Chollet and Altman split on what AI means for kids, but both think adaptability matters more than raw IQ — Chollet worries about “human agency” and outsourcing cognition in a Star Trek-like future, while Altman argues kids will grow up with more ambition and agency than we did because asking computers for anything will feel normal.
ARC-AGI-3 is the benchmark both men take seriously, but Chollet says 85% is likely at least a year away even if labs target it aggressively — He contrasts ARC-AGI-3’s current 37 score with ARC-AGI-1’s rapid jump to 88% and ARC-AGI-2’s 84.6 from Gemini Deep Thinking, arguing ARC-AGI-3 was deliberately made more out-of-distribution and harder to brute-force.
Altman’s main benchmark obsession is no longer test scores but scientific and economic output — He says the important question is whether models can discover new science and accelerate the economy, not just whether they can sustain METR-style long-horizon tasks, which both he and Chollet call interesting but incomplete.
Chollet laid out his alternative to mainstream deep learning: symbolic learning via minimal programs instead of parameterized curves — His bet is that replacing gradient-trained parametric models with the shortest symbolic program that explains the data could yield stronger generalization, better compute efficiency, and far lower data requirements than today’s trillion-token regime.
Altman says OpenAI is narrowing its focus because the next models look unusually consequential for science and the economy — He frames the tradeoff bluntly: if OpenAI can spend a million GPUs on “curing all cancers” or on better video generation like Sora, he’d choose cancer research every time.
On AGI timing, Chollet says we’re at a ‘5’ out of 10 while Altman says it already feels very close except for memory and continuous learning — Chollet stresses it’s still easy to find tasks regular humans can do and frontier models can’t, whereas Altman says the remaining gap is narrower and more infrastructural.
The Breakdown
Parenting in the Star Trek world
The conversation opens on a very human note: both founders as dads thinking about what AI means for their kids. Chollet is openly uneasy about what happens to human agency when children grow up able to “just talk to the computer,” while Altman sounds almost exhilarated, saying his kids will likely look back at our lives the way he looked at his parents’ pre-computer world — as unnecessarily hard.
Adaptability beats old-school intelligence
When asked whether intelligence still matters, Chollet reframes it as adaptability. His point is simple and sharp: the faster the world changes, the more valuable it is to adjust quickly, and kids may be better built for that than adults are. He grounds it with a sweet anecdote about building a Minecraft clone with his nearly five-year-old son via Codex — the kid imagines features, Codex implements them, and Chollet says he hasn’t written a single line of code.
Altman says the social contract conversation is finally real
Altman’s “behind the scenes” story isn’t about product development but politics. He says that in the last few weeks he’s felt, for the first time, that governments and people across ideological lines are seriously ready to discuss how the social contract must evolve as AI transforms the economy. For him, the exciting part is not just that people acknowledge change is coming, but that the Overton window has shifted toward actually working on solutions.
ARC-AGI-3, benchmark saturation, and the moving goalposts of AGI
The benchmark section gets concrete fast: ARC-AGI-1 was crushed five years after launch, ARC-AGI-2 now sits around 84.6 for Gemini Deep Thinking, and ARC-AGI-3 is currently at 37. Chollet predicts it would take “several years” to saturate if labs ignored it, but because labs inevitably optimize for scoreboards, he thinks 85% is still at least a year away; Altman agrees the timing mainly depends on how hard people target it and uses the moment to note that AGI was never supposed to come in numbered versions at all.
Why neither man thinks the ‘12-hour agent’ story settles it
On the METR time-horizon benchmark — where Anthropic’s Claude Opus reportedly handles 12-hour tasks at over 50% success — both push back, though in different ways. Chollet says it’s useful but noisy, weak on human baselines, and too tied to a software-engineering-style harness, whereas ARC is trying to measure the “G” in general intelligence. Altman adds that time-on-task misses the deeper question: whether models can create new knowledge, solve genuinely valuable problems, and accelerate science.
Chollet’s symbolic learning pitch vs. OpenAI’s scaling bet
Chollet’s most technical segment is also the cleanest articulation of his thesis: instead of fitting parameterized curves with gradient descent, learn the shortest symbolic program that explains the data. He argues that this “symbolic learning” could get much closer to optimal learning, with better generalization, lower compute needs, and far fewer data points. Altman’s response is generous and strategic — OpenAI is betting hard on its current paradigm, but he explicitly says many approaches should be tried because any dominant method could hit an unexpected wall.
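Chollet’s “shortest symbolic program that explains the data” idea can be made concrete with a toy minimum-description-length search. Everything below (the three-primitive DSL, the function names, the brute-force enumeration) is an illustrative assumption for the sketch, not his actual formalism:

```python
from itertools import product

# Toy DSL: a program is a short sequence of primitive int -> int functions.
# These three primitives are hypothetical, chosen only to keep the search tiny.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dbl": lambda x: x * 2,
    "neg": lambda x: -x,
}

def run(program, x):
    """Apply a sequence of primitive names to an input value."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x

def shortest_program(examples, max_len=4):
    """Enumerate programs in order of increasing length and return the
    first (hence shortest) one consistent with every (input, output) pair."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None  # no program within the length budget explains the data

# Two examples suffice to pin down "double, then increment".
print(shortest_program([(1, 3), (3, 7)]))  # -> ('dbl', 'inc')
```

The contrast with gradient descent is the point: nothing here is a parameterized curve, the hypothesis space is discrete, and two data points can fully determine the answer — which is the data-efficiency claim Chollet is making, here shrunk to a toy scale.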
Underexplored research and the case for accelerating everything else
Asked what fields deserve more attention, Chollet points to evolutionary algorithms and program synthesis as areas where simply throwing more compute at them could produce surprising results. Altman recalls a post-GPT-4 dinner where a top researcher lamented that academia just wanted to reproduce GPT-4 instead of exploring the wider space of ideas. But then he widens the aperture: the really underexplored opportunity, he says, is using frontier AI to speed up physics, math, biology, and other sciences that haven’t had enough leverage.
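For a sense of what “throwing more compute at evolutionary algorithms” means mechanically, here is a minimal (1+1) evolutionary loop; the bitstring target, mutation rate, and generation count are hypothetical stand-ins for a real fitness problem, not anything Chollet proposed:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical goal pattern; in a real problem, fitness would score behavior.
TARGET = [1, 0, 1, 1, 0, 1, 0, 1]

def fitness(bits):
    """Count positions that match the target (higher is better)."""
    return sum(b == t for b, t in zip(bits, TARGET))

def evolve(generations=1000, mutation_rate=0.2):
    """Minimal (1+1) evolutionary loop: mutate a single parent each
    generation and keep the child whenever it is at least as fit,
    so fitness never decreases."""
    parent = [random.randint(0, 1) for _ in TARGET]
    for _ in range(generations):
        child = [1 - b if random.random() < mutation_rate else b
                 for b in parent]
        if fitness(child) >= fitness(parent):
            parent = child
    return parent

print(evolve())
```

Scaling this up is exactly where Chollet’s “more compute” observation bites: the loop is embarrassingly parallel (many parents, many mutations per generation), so larger budgets translate directly into broader search.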
OpenAI’s narrowing focus, AGI distance, and the rapid-fire tells
The final stretch is revealing. Altman says OpenAI is concentrating compute because the next models could have “absolutely massive” impact on science and the economy, and he uses a memorable comparison — a million GPUs for curing cancer versus making videos — to justify deprioritizing things like Sora. In rapid fire, Chollet puts us at 5 out of 10 on the road to AGI because it’s still easy to find ordinary tasks humans can do and AI can’t, while Altman says it already feels close aside from long-term memory and continuous learning; their closing personal picks — Asimov, hardware and drones, Pier Vier, and Altman simply naming his mom — bring the discussion back from abstraction to personality.