Your post led me down an interesting path. Thank you. I would love to know your thoughts on the congressional hearing.
I would like to ask what will probably seem like a surface level question from a layperson.
It is because I am—but I appreciate reading as much as I can on LW.
The end-of-text prompt causes the model to “hallucinate”? If the prompt is the first one in the context window how does the model select the first token—or the “subject” of the response?
The reason I ask is that the responses have ranged from a Dark Series synopsis, to an “answer” on fish tongues, to “here’s a simple code that calculates the average of a list of numbers” (along with the code).
I’ve searched online and have not found an answer. Is this because endoftext is well known, not a “glitch” and just how GPT works? I apologize for asking here but if someone can point to a post with the answer (“endoftext causes the model to…”) it would be greatly appreciated.
Note: I found the quote below, but how does it select the “uncorrelated text”? How does it “choose” the first token that begins the uncorrelated text?
“You will see that it starts to answer like "The <|endoftext|>" and after that it simply answers with an uncorrelated text. That is because it learned to not attend to tokens that are before the [EOS] token.”
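One way to picture the question above is a minimal sketch, not a description of how any particular GPT model is implemented. It assumes that, with nothing informative before it, the model still outputs a score for every token in its vocabulary, and that the first token is drawn at random from the resulting probabilities. The tiny vocabulary, the scores in `TOY_LOGITS`, and the function names are all hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical scores for a tiny made-up vocabulary.
# A real model has tens of thousands of tokens and learned scores.
TOY_LOGITS = {"The": 2.0, "In": 1.8, "Here": 1.5, "def": 1.0, "Fish": 0.5}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits.values())
    exps = {tok: math.exp(score - top) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def sample_first_token(logits, rng):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(logits)
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random()
    # Each draw can start a different "document", which would be
    # consistent with seeing unrelated subjects on repeated tries.
    print([sample_first_token(TOY_LOGITS, rng) for _ in range(5)])
```

Under these assumptions, the “subject” is not chosen up front: the first token is a weighted random draw, and each later token is conditioned on the ones already produced, so different first draws can lead to very different texts.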
this falls perfectly into a thought/feeling “shape” in my mind. i know simple thanks are useless. but thank you.
i will now absorb your words and forget you wrote them