All of Alice Wanderland's Comments + Replies

Perhaps? I'm not fully understanding your point; could you explain a bit more about what I'm missing? How does accounting for sleep and memory replay change the comparison of pretraining dataset sizes between human brains and LLMs? At first glance, my understanding of your point is that adding in sleep seconds would increase the training set size for humans by a third or more. I wanted to make my estimate conservative, so I didn't add in sleep seconds, but I'm sure a case could be made for adding them in.

Nathan Helm-Burger
Yeah, I don't think it makes sense to add sleep if you are estimating "data points", since it's rehearsing remixes of the data from awake times. On the other hand, if you are estimating "training steps", then it does make sense to count sleep. Just as you'd count additional passes over the same data.
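To make the data-points vs. training-steps distinction concrete, here's a rough back-of-envelope sketch. All the lifetime numbers below are illustrative assumptions, not figures from either comment:

```python
# Back-of-envelope: "data points" vs "training steps" for a human,
# with and without counting sleep. All numbers are illustrative.

SECONDS_PER_HOUR = 3600
years = 30                   # assumed age at the point of comparison
waking_hours_per_day = 16    # assumes ~8 hours of sleep per night
days = years * 365

waking_seconds = days * waking_hours_per_day * SECONDS_PER_HOUR
sleeping_seconds = days * (24 - waking_hours_per_day) * SECONDS_PER_HOUR

# "Data points": only waking experience contributes novel data.
data_points = waking_seconds

# "Training steps": sleep rehearses remixes of waking data, so it adds
# extra passes over the same data rather than new data.
training_steps = waking_seconds + sleeping_seconds

print(f"data points    ~ {data_points:.2e} waking seconds")
print(f"training steps ~ {training_steps:.2e} seconds incl. sleep replay")
print(f"sleep adds {sleeping_seconds / waking_seconds:.0%} more passes")
```

Under the 8-hours-a-night assumption, sleep is a third of all seconds lived, which works out to roughly 50% more training steps on top of the waking-seconds baseline; the data-point count is unchanged either way.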

Fair! Though when the alternative is my own fiction-writing skills... let's just say I appreciated Claude's version the most among the options available ^^;

Thanks for investigating this! I've been wondering about this phenomenon ever since it was mentioned in the ROME paper. This "reversal curse" fits well with my working hypothesis that we should expect the basic associative network of LLMs to be most similar to System 1 in humans (without additional plugins or symbolic processing capabilities added on afterwards, which would be more similar to System 2), and the auto-regressive nature of the masking for GPT-style models makes it more similar to the human sense of sound (because humans don't have a direct "sen...
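For anyone unfamiliar with the masking being referred to, here is a minimal sketch (in PyTorch; not from the comment above) of the causal mask used in GPT-style decoders. It is what makes training strictly left-to-right, which is plausibly connected to why learning "A is B" doesn't automatically teach "B is A":

```python
import torch

# Causal (auto-regressive) mask for a GPT-style decoder:
# position i may only attend to positions 0..i, never ahead.
seq_len = 5
mask = torch.tril(torch.ones(seq_len, seq_len))  # 1 where attention is allowed
print(mask)

# In attention, disallowed positions are set to -inf before the softmax,
# so they receive exactly zero attention weight.
scores = torch.randn(seq_len, seq_len)  # dummy attention scores
masked = scores.masked_fill(mask == 0, float("-inf"))
weights = torch.softmax(masked, dim=-1)
print(weights)  # row i has nonzero weight only on columns 0..i
```

The upshot is that the model only ever learns the conditional p(next token | preceding tokens) in one direction, so a fact stated as "A is B" during training need not strengthen the reverse association.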