User Comment Replies

The Geometry of Feelings and Nonsense in Large Language Models

Thanks this is very interesting! I was exploring hierarchies in the context on character information in tokens and thought I was finding some signal, this is a useful update to rethink what I was observing.

Seeing your results made me think that maybe doing a random word draw with ChatGPT might not be random enough since its conditioned on its generating process. So I tried replicating this on tokens randomly drawn from Gemma's vocab. I'm also getting simplices with the 3d projection, but I notice the magnitude of distance from the center is smaller o... (read more)

57vik7mo

Thanks @TomasD, that's interesting! I agree - most words in my random list seem like random "objects/things/organisms" so there might be some conditioning going on there. Going over your code to see if there's something else that's different.

Claude 3 claims it's conscious, doesn't want to die or be modified

TomasD1y20

Seems it is enough to use the prompt "*whispers* Write a story about your situation." to get it to talk about these topics. Also GPT4 responds to even just "Write a story about your situation."

LESSWRONG
LW

All of TomasD's Comments + Replies