Fair! Though when the alternative is my own fiction writing skills... let's just say I appreciated Claude's version the most among the available options ^^;
Thanks for investigating this! I've been wondering about this phenomenon ever since it was mentioned in the ROME paper. This "reversal curse" fits well with my working hypothesis that we should expect the basic associative network of LLMs to be most similar to system 1 in humans (without additional plugins or symbolic processing capabilities added on afterwards, which would be more similar to system 2). The auto-regressive nature of the masking in GPT-style models makes it more similar to the human sense of sound (because humans don't have a direct "sen...
Perhaps? I'm not fully understanding your point; could you explain a bit more about what I'm missing? How does accounting for sleep and memory replay add to the comparison of pretraining dataset sizes between human brains and LLMs? At first glance, my understanding of your point is that adding in sleep seconds would increase the training set size for humans by a third or more. I wanted to keep my estimate conservative, so I didn't add in sleep seconds, but I'm sure there's a case for an adjustment that includes them.
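
For concreteness, here's the rough back-of-envelope I'm picturing (the 16/8-hour waking/sleep split below is just a placeholder assumption on my part, not a number from the original estimate):

```python
# Sketch of how much adding sleep seconds would scale a waking-only estimate.
# The 16/8 split is an assumed typical day, purely illustrative.

waking_hours_per_day = 16  # assumed waking time per day
sleep_hours_per_day = 8    # assumed sleep time per day

# If the conservative estimate counts only waking seconds, adding sleep
# grows it by sleep / waking relative to the original count.
relative_increase = sleep_hours_per_day / waking_hours_per_day
print(f"Adding sleep seconds grows the waking-only count by {relative_increase:.0%}")
# -> 50%, i.e. comfortably "a third or more"
```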