How much data does it take to pretrain a (human) brain? I conducted a (fairer) Fermi estimate.
The post goes through the following questions:
- How long does it take to grow a human brain?
- How many waking seconds do we have in our life?
- How many “tokens” or “data points” does a human brain process in a second?
- Can we simply count the spikes?
- How many bits (spikes and non-spikes) does it take for the brain to process 1 sensory “piece of information”?
- How do those numbers stack up against LLMs?
To get to this conclusion table:
Yeah, I don't think it makes sense to add sleep if you are estimating "data points", since sleep is rehearsing remixes of the data from waking hours.
On the other hand, if you are estimating "training steps", then it does make sense to count sleep, just as you'd count additional passes over the same data.
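To make the distinction concrete, here is a minimal Python sketch of the two ways of counting. Everything in it is an assumption for illustration: the function names are hypothetical, the input values are placeholders rather than the post's numbers, and it assumes sleep "replays" at the same per-second rate as waking experience, which is a simplification.

```python
SECONDS_PER_HOUR = 60 * 60
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def data_points(years, sleep_hours_per_day, points_per_second):
    """Unique data points: only waking time contributes new data."""
    waking_hours_per_day = HOURS_PER_DAY - sleep_hours_per_day
    waking_seconds = years * DAYS_PER_YEAR * waking_hours_per_day * SECONDS_PER_HOUR
    return waking_seconds * points_per_second


def training_steps(years, points_per_second):
    """Training steps: sleep counts too, like extra passes over the same data.

    Assumption: replay during sleep runs at the same rate as waking input.
    """
    total_seconds = years * DAYS_PER_YEAR * HOURS_PER_DAY * SECONDS_PER_HOUR
    return total_seconds * points_per_second


if __name__ == "__main__":
    # Placeholder inputs, not figures from the post.
    years, sleep_hours, rate = 1, 8, 1
    points = data_points(years, sleep_hours, rate)
    steps = training_steps(years, rate)
    print(f"data points:    {points:,}")
    print(f"training steps: {steps:,}")
    print(f"steps per data point: {steps / points:.2f}")
```

With 8 hours of sleep per day, the step count comes out 1.5 times the data-point count regardless of the rate chosen, so the choice between the two framings shifts the estimate by a constant factor rather than by an order of magnitude.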