
When I talk to an LLM for an extended period on the same topic, I notice performance degradation and increased hallucination. I understand this to be a known phenomenon that worsens as the context window fills: as the conversation grows, each new user input is interpreted in the context of everything that came before. Here’s what I can’t stop thinking about: I, too, lack an infinite context window. I need a reset (in the form of sleep), or else I hallucinate, decohere, and generally get weird.
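
For concreteness, here's a minimal sketch of the mechanical side of that (plain Python with a hypothetical call_llm stand-in, not any real API): a chat loop re-sends the whole history every turn, so the model's effective input grows with the conversation, and a "reset" is just discarding that accumulated history.

```python
# Minimal sketch, not any particular vendor's API: the full transcript is
# re-sent on every turn, so each new input is interpreted in the context of
# everything that came before, and a reset simply throws that context away.

def call_llm(messages: list[dict]) -> str:
    """Hypothetical stand-in for a real chat-completion call."""
    return f"(reply conditioned on {len(messages)} prior messages)"

history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat_turn(user_input: str) -> str:
    # Each turn appends to and re-sends the full transcript, so the
    # effective input grows with the length of the conversation.
    history.append({"role": "user", "content": user_input})
    reply = call_llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply

def reset() -> None:
    # The "sleep" analogue: drop the accumulated context,
    # keeping only the system prompt.
    del history[1:]
```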

I know there are biological and computational limits at play, but beneath that, the connection feels thermodynamic-y. In both settings (resetting an LLM convo, me sleeping), is the fundamental principle at play a reduction in local informational entropy?

If the above is true, might the need for a mechanism to reduce informational entropy within a functioning neural net hold at increasing scales? Put more whimsically, might artificial intelligences need to sleep?

Next time someone clearly needs a nap, I'm going to ask them if they've tried turning themselves off and then on again.

I think this seems to be a very accurate abstraction of what is happening. During sleep, the brain consolidates (compresses and throws away) information. This would be equivalent to summarising the context window + discussion so far and adding it to a running 'knowledge graph'. I would be surprised if someone somewhere has not tried this already on LLMs: summarising the existing context + discussion, formalising it in an external knowledge graph, and allowing the LLM to do RAG over this during inference in future.
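
A rough, self-contained sketch of what that consolidation loop could look like (every function here is a toy stand-in; no real summariser, embedder, or graph store is assumed):

```python
# Toy sketch of "sleep" for an LLM conversation: compress the live context
# into an external store, then retrieve from that store on later turns
# instead of replaying the whole transcript.

knowledge_store: list[str] = []   # stands in for the external "knowledge graph"
history: list[dict] = []          # the live context window

def call_llm(messages: list[dict]) -> str:
    """Hypothetical stand-in for a real chat-completion call."""
    return f"(reply conditioned on {len(messages)} messages)"

def summarise(messages: list[dict]) -> list[str]:
    # Toy consolidation: in practice, ask the LLM to distil the conversation
    # into standalone facts or graph triples.
    return [m["content"][:120] for m in messages if m["role"] == "user"]

def retrieve(query: str, k: int = 3) -> list[str]:
    # Toy retrieval by word overlap; in practice, embedding similarity or a
    # graph query over the stored facts.
    words = set(query.lower().split())
    ranked = sorted(knowledge_store,
                    key=lambda fact: len(words & set(fact.lower().split())),
                    reverse=True)
    return ranked[:k]

def consolidate() -> None:
    # The "sleep" step: compress the context window into the store, then
    # start the live context fresh.
    knowledge_store.extend(summarise(history))
    history.clear()

def answer(user_input: str) -> str:
    # RAG over the consolidated memory instead of the raw transcript.
    notes = "\n".join(retrieve(user_input))
    history.append({"role": "user",
                    "content": f"Relevant notes:\n{notes}\n\nUser: {user_input}"})
    reply = call_llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply
```

The consolidate() call is the "sleep" analogue: it trades a faithful transcript for compressed, queryable memory, which is the same trade the brain seems to make during consolidation.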

That said, I do think LLM hallucinations and brain hallucinations arise via separate mechanisms. In particular, there is evidence that human hallucinations (sensory processing errors) occur when the brain's top-down inference (the Bayesian 'what I expect to see based on priors') fails to work correctly, with an increased reliance on bottom-up processing instead (https://www.neuwritewest.org/blog/why-do-humans-hallucinate-on-little-sleep).

Thanks for your comment! On further reflection, I think you're right about the difference between LLM hallucinations and what's commonly meant when humans refer to "hallucination." I think maybe the better comparison is between LLM hallucination and human confabulation, as seen in something like Korsakoff syndrome, where anterograde and retrograde amnesia produce a tendency to invent memories with no basis in reality to fill the gaps.

I guess to progress from here I'll need to take a dive into neural entropy.