The Bitter Lesson’s Bitter Lessons (opinion piece)
Date read: 20th October 2025
Blog link
Key Points
- Addresses Richard Sutton’s claims on the Dwarkesh Podcast, where he argued (roughly) that babies don’t learn through imitation, so LLMs shouldn’t either.
- Not mimicking humans and instead learning from scratch gives up huge potential compute savings… it gives up all the knowledge humans have strived so hard to accumulate.
- Estimates that reaching human intelligence took ~10^50 operations (10^30 organisms alive in parallel for 4.9×10^9 years)
- LLMs tend to train on ~10^26 FLOPs… so we’re a long way off this, roughly a 10^24× gap (see the back-of-envelope sketch after this list)
- Humans utilised information technology to overcome this compute deficit:
- Broadcasting: spread info to others
- Broad listening: distill info from multiple sources into a single world model
- LLMs excel at broad listening… but still only act on a tiny proportion of human data, as most of it is locked up in private databases
- This private data could enable huge performance gains in LLMs, as much of it is also very high quality (business records, health records etc.)
- Challenge: allow knowledge transfer whilst maintaining copyright, security, privacy etc. (a toy sketch of one such approach follows below)
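
To make the scale of that gap concrete, here is a back-of-envelope check of the quoted figures in Python. The rate of ~10^3 operations per organism-second is my own assumption, chosen so the total lands near the post’s 10^50; the post may derive its estimate differently.

```python
# Back-of-envelope check of the evolution-vs-LLM compute gap quoted above.
SECONDS_PER_YEAR = 3.15e7

organisms = 1e30            # parallel organisms alive at any time (from the post)
years = 4.9e9               # duration of evolution (from the post)
ops_per_organism_sec = 1e3  # my assumption, picked so the total lands near 10^50

evolution_ops = organisms * years * SECONDS_PER_YEAR * ops_per_organism_sec
llm_flops = 1e26            # rough scale of a large LLM training run (from the post)

print(f"evolution: ~{evolution_ops:.1e} operations")    # ~1.5e50
print(f"LLM run:   ~{llm_flops:.0e} FLOPs")
print(f"gap:       ~{evolution_ops / llm_flops:.0e}x")  # ~1.5e24, prints 2e+24
```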
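
On the knowledge-transfer challenge, one family of techniques that fits the constraints is federated learning, where data holders train locally and share only model updates, never raw records. The post doesn’t prescribe a method; this FedAvg-style toy (with made-up linear-regression data standing in for private records) is purely illustrative.

```python
# Toy federated averaging (FedAvg): learn from private data silos without
# centralising the raw data. Illustrative only; not the post's proposal.
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, steps=10):
    """A few steps of linear-regression gradient descent on one holder's data."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # MSE gradient
        w -= lr * grad
    return w

# Three "data holders" (e.g. businesses, hospitals), each with a private
# dataset drawn from the same underlying relationship y = X @ [2, -1].
true_w = np.array([2.0, -1.0])
holders = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    holders.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))

global_w = np.zeros(2)
for _ in range(20):
    # Each holder trains locally; only the updated weights leave the silo.
    local_ws = [local_update(global_w, X, y) for X, y in holders]
    global_w = np.mean(local_ws, axis=0)  # server averages the updates

print("recovered weights:", global_w)  # close to [2, -1]
```

Real deployments layer secure aggregation and differential privacy on top; the averaging step alone does not by itself guarantee privacy.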