The Fall of the
Theorem Economy
The product of mathematics is clarity and understanding. Not theorems, by themselves. Bill Thurston, 1982 Fields medallist
David Bessis's central claim is audacious and, if taken seriously, disorienting: for two millennia, mathematics has measured itself by the wrong unit. The honor code of mathematicians — prove theorems and shut up — made sense when theorem-proving was an excellent proxy for the deeper thing mathematics is actually about: the neuroplastic expansion of human understanding. AI is about to break that proxy.
This piece walks through the essay's key moves — the personal anecdote of a theorem never written, the diagnosis of an honor code, the comparison with closed games like chess, the three caveats that every outside observer misses, and the Overhang that LLMs are uniquely positioned to harvest — making each conceptual structure tangible, manipulable, explorable.
The prose is mine, the ideas are Bessis's, and the original essay repays reading in full. What follows is not a summary but an instrument — a device for thinking alongside the argument.
The Theorem Never Written
Bessis opens with a confession. His best theorem crystallized one morning in Lausanne, clear enough that he jotted it as an informal remark at the bottom of a slide. He never wrote it up. His second-best — Theorem 0.5 in his preprint on Garside categories — was, in his own words, shockingly easy to prove. A few pages of pretty terrestrial group theory.
The hard part, he insists, was not proving it. It was not even conjecturing it. The hard part was intuiting that there should be something like Theorem 0.5, and building the conceptual framework in which it became trivial to state. Once the definitions were right, everything fell into place. The visible theorem is the tip of an iceberg — beneath it, a vast submerged structure of definitions, intuitions, refactors, dead ends, and the slow cognitive process of finding the right frame.
Drag the waterline to see what the honor code rewards, and what it cannot see:
Bessis's provocation: the David who solved the K(π,1) conjecture is a social parasite of the much better mathematician, the David who crafted Definitions 2.4 and 9.3. The system rewarded him for the theorem. But the theorem was just the visible reward ticket for a piece of conceptual work that was, in fact, the real contribution.
Official Math, Secret Math
In his book Mathematica: a Secret World of Intuition and Curiosity, Bessis frames the public misunderstanding as a tension between two versions of mathematics.
Official math is the formal deduction system — a nerd's paradise where truth takes binary values, reasoning is valid or invalid, and there is technically no room for bullshit. It lives in textbooks, peer-reviewed journals, and Lean source files.
Secret math is the human part. Why official math was invented. How mathematicians successfully interact with it. Its effects on their brains. The bizarre mental techniques through which they continuously expand its territory. Secret math never made it to the curriculum, because it lacks the defining qualities of official math — it is soft, fuzzy, subjective — and by contrast looks like cheap pedagogical backstory.
Click on an activity to see how each version of mathematics weights it. The gap between the two columns is where the honor code does its damage.
Official Math
Secret Math
The punchline: every activity a mathematician actually spends her day on — conjecturing, renaming, reframing, chasing intuitions, failed drafts, seminar arguments, half-lit hunches — scores high in Secret Math and near zero in Official Math. The honor code has institutionalized the denial of the former.
Hardy's Curse
G. H. Hardy, in his celebrated and insufferable Mathematician's Apology, wrote:
There is no scorn more profound, or on the whole more justifiable, than that of the men who make for the men who explain. Exposition, criticism, appreciation, is work for second-rate minds. G. H. Hardy, A Mathematician's Apology
Behind closed doors, mathematicians complain. They insist on the importance of teaching — even for their own comprehension. They lament the pathological obsession with theorem-proving priority. Yet in public, they are bound by the honor code. Prove theorems and shut up.
The honor code persisted for millennia because it worked: theorem-proving was a cryptographic proof of conceptual innovation. The two facets were in symbiosis. You could measure one and trust the other would follow. Bessis argues that AI has just broken that cryptographic link.
Adjust each activity's social reward. Try Hardy's distribution. Try Thurston's. Watch what kind of mathematician each ledger selects for:
A Closed System?
The trigger for Bessis's essay was a remark by Geoff Hinton — Turing awardee, Nobel laureate — that mathematics is "particularly easy" terrain for AI because it is "a closed system with rules," "much like Go or Chess."
The comparison is seductive and dangerously wrong. Chess and Go aren't ordinary closed systems; they are finite two-person games with perfect information, and by Zermelo's 1913 theorem they admit optimal strategies. Mathematics has none of these properties.
Click through to compare:
The comparison Hinton made does not survive a single column. But the comparison is doing enormous political work. If mathematics is just another Go — a finite closed system awaiting AlphaZero — then the pitch writes itself: "DeepMind solved Go and chess, we're going to solve mathematics!" At a moment when frontier labs are betting trillions that humans are about to become obsolete, solving mathematics — the crown jewel, the pride of the human race — is simply irresistible.
But if mathematics isn't a game, there is no solution to converge on. There is only an ongoing, collective, cognitive project. The question becomes: what does AI actually do to that project?
Oceans of Significance
On February 5th, an eleven-mathematician team (including 2014 Fields medallist Martin Hairer) released the First Proof project — ten research-level math questions to benchmark AI. Google, OpenAI and others scrambled. By Daniel Litt's count, "somewhere between 6 and 8 of the 10 problems were solved correctly if one combines all attempts."
Headlines wrote themselves. But Litt added a caveat the headlines missed: the problems were closer to technical lemmas — well-calibrated intermediate sub-problems, the kind that occur in a few paragraphs of a real paper. There is an ocean between a technical lemma and a serious paper. Another ocean between a serious paper and a breakthrough. Another between a breakthrough and a Fields-medal-level contribution. And several oceans above that.
Drag the slider to see where different mathematical achievements sit on this logarithmic scale. First Proof is pinned where it actually lived:
The First Proof team picked the right difficulty for a first benchmark — above zero, below ten — and did it scientifically. Bessis's critique is not of the science but of the optics. The general public doesn't read the fine print, and may not even know what a lemma is. If Google had scored a perfect 10, the headline would have read: GAME OVER.
Mathlib vs Mathslop
A few weeks before Bessis wrote, Math Inc produced a Lean formalization of Maryna Viazovska's sphere-packing theorems — Fields-medal-level results from 2022. A major technical achievement. Yet the formal-mathematics community pushed back hard. Why?
Alex Kontorovich has a name for what's missing — canonization:
By canonization, I mean the process of taking a local, one-off formalization and turning it into library mathematics: general, reusable, coherent, efficient, and compatible with the rest… Canonization often changes the picture itself: the definitions, the abstractions, the API, and sometimes even the statement… This is extremely difficult and time-consuming. Alex Kontorovich
Mathlib is a human-curated library with clean APIs, shared abstractions, and coherent definitions. Adding to Mathlib means making your proof play well with everyone else's. Math Inc's Viazovska formalization is a 200,000-line orphan — a vibe-coded blob that sits in a repository exposing no intelligible interface. It is correct, but it is not accretive. It does not grow the commons; it clogs it.
Toggle between the two paradigms. The counter that matters — human-understandable theorems — only increments on one side:
Mathlib — Canonized
Mathslop — Autoformalized Blob
Bessis's metaphysical point cuts deep: the problem with unintelligible mathematics isn't that it might be false. It is that it is literally meaningless — it doesn't compile on the only hardware currently able to make sense of it and appreciate its value. The human brain.
And because of the honor code, Patrick Massot's warning bites: there is no social reward left for cleaning up behind Math Inc. The prize was captured. Mathlib contributors are advised to work on less shiny projects less likely to be destroyed.
The Overhang
Bessis reports a mystifying experience from his own career. Working on the K(π,1) conjecture, he came up with what felt like a stroke of genius — divided Garside categories. As he was canonizing the idea, he made a shocking discovery:
My proud invention… was essentially equivalent to a seemingly esoteric construction in an entirely different branch of mathematics, namely the Bökstedt–Hsiang–Madsen subdivision of Connes's cyclic category in algebraic K-theory. David Bessis
His "one genius idea" was, in a sense, déjà vu. A profound correspondence between two seemingly unrelated fields. Connecting those dots is one of the most celebrated aspects of mathematics — Descartes connecting algebra and geometry, John McKay noticing that 196,883 + 1 = 196,884 (seeding Monstrous Moonshine).
But modern mathematics is so baroque that most correspondences go unnoticed. The Overhang is Bessis's name for the latent value — unrealized capital gains of past creativity — that sits waiting in the existing corpus for someone to connect the dots. A good LLM, with its phenomenal memorization and pattern-matching, can sweep through it at a rate no human can match. Professional mathematicians read at most a few hundred papers in their careers; LLMs are trained on the entire corpus.
The interactive below shows a conceptual space of mathematical ideas. Solid lines are known connections. Dashed lines are latent — correspondences that exist in the mathematical reality but have never been noticed. Reveal them. Then sweep with an LLM:
The Overhang is Bessis's explanation for a puzzle Litt raised: a human with the LLMs' capabilities would, almost certainly, be proving amazing theorems constantly. Why haven't we seen this from the models yet? What are they missing? They are missing — still — meaning-making. But they can already harvest meaning made by humans, and that alone is enough to flood the system. Research sits atop unrealized capital gains; AI can strip-mine the commons.
The Broken Symbiosis
Peter Scholze, 2018 Fields medallist, put it cleanly:
What I care most about are definitions… Unfortunately, it is impossible to find the right definitions by pure thought; one needs to detect the correct problems where progress will require the isolation of a new key concept. Peter Scholze
Concept-building and problem-solving are symbiotic. New concepts enable new problems; new problems force new concepts. Theorem-proving was the externally-visible proxy. But AI does not need to pass through the concept-building station to reach problem-solving. It can decouple them — winning the proxy while leaving the underlying activity stranded.
Toggle between the two eras:
The point is not that AI is "cheating." The point is that the measurement — theorem-proving — was always a proxy for something else. As long as the only agents in the game were humans, the proxy was reliable. Now that another kind of agent can crack it independently, the proxy reveals what it always was: a proxy, not the thing itself.
A Mathematical Intelligence Scale
Bessis proposes a Mathematical Intelligence Scale, modeled on the SAE levels of self-driving autonomy. When AI crushes the next iteration of First Proof, the press release should read not "AI solves mathematics" but "AI achieves Level 4 (out of 10) on the Mathematical Intelligence Scale." The scale should be open-ended, perhaps with seemingly unreachable upper rungs, because mathematics is inexhaustible. The point is to prevent premature victory-declarations that will "demonetize" working mathematicians before anyone has thought carefully.
Below is one plausible draft. Drag the marker to explore each level. Current frontier AI, per Bessis's assessment, sits somewhere around 3–4 — able to solve technical lemmas autonomously, but unable to propose, define, canonize, or restructure:
The key design choice: levels 7 through 10 involve activities — proposing conjectures, crafting definitions, building theory, establishing entirely new branches — that are not benchmarkable in the way problem-solving is. There is no objectively verifiable outcome. Which was precisely the problem the honor code was trying to solve. Now, Bessis argues, we may have to embrace the subjectivity we once tried to abolish.
Mathematicians shouldn't be afraid of subjectivity. It is not about cheating. It is about being fair and not committing collective seppuku. David Bessis
Coming Clean
Bessis's recommendation to the mathematical community is not technical. It is rhetorical, almost political. Finally come clean about the nature of mathematics. All the active mathematicians he cites — Tao, Avigad, Litt, Kontorovich — are careful to note that there is something in mathematics beyond theorem-proving. But they frame it as a passing remark. It needs to become the headline.
We are not trying to meet some abstract production quota of definitions, theorems and proofs. The measure of our success is whether what we do enables people to understand and think more clearly and effectively about mathematics. Bill Thurston
Long term, this forces a rewriting of the ethos and evaluation criteria of the profession. The second rule of the Intuition Club — if you really want to talk about intuition, make it sound casual and accessory, because we ain't the psychology department — must be revoked. If Thurston is right, Bessis notes, then mathematicians are the psychology department.
The essay's three speculative forecasts deserve repetition:
1 · A clearer cut between pure and applied
Not abstract vs concrete, but different sweet spots on the intelligibility-vs-applicability trade-off. Rational to spend ten billion dollars of compute on plasma containment equations even if intelligibility waits a century. Less rational for the Riemann Hypothesis, which is a meaning-making project, not a utility one.
2 · The rise of intuition-maxxers
With AI handling the technical barriers, a new generation of mathematicians will deploy unorthodox tactics to survey entire continents of mathematics at a pace that will mystify their elders. They will see further than anyone before, standing on the shoulders of giant machines.
3 · A renewed interest in the philosophy and neuroscience of mathematics
Because if mathematics is just problem-solving, and machines solve problems, then there is no reason to teach mathematics at all — and everyone vaguely senses that would lead to idiocracy but can't articulate why. The articulation is due.
A note on the present reading
This interactive essay is a companion to, not a substitute for, Bessis's original piece. The visualizations above are conceptual schematics — metaphorical renderings of arguments, not formal simulations of mathematical activity. The Overhang graph, in particular, is an illustration of the structural claim; the real graph is incomprehensibly larger, and the real ratio of latent to known connections is almost certainly more extreme than anything a 2D canvas can convey.
Bessis ends by citing Jacobi. We ended by citing Jacobi. It is, in truth, a sentence worth citing twice.