All of gideonite's Comments + Replies

gideonite-2-2

Likewise, LLMs are produced by a relatively simple training process (minimizing loss on next-token prediction, using a large training set from the internet, GitHub, Wikipedia, etc.), but the resulting 175-billion-parameter model is extremely inscrutable.
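The training objective the quoted passage refers to can be sketched in miniature. This is a toy illustration of next-token cross-entropy loss, not actual LLM training code; the tiny vocabulary and the uniform "model" here are stand-ins:

```python
import math

# Toy next-token prediction loss: the average negative log-probability
# the model assigns to each actual next token. LLM training minimizes
# this same cross-entropy, just over billions of tokens and parameters.

def next_token_loss(model_probs, tokens):
    """model_probs(prefix) returns a dict: candidate next token -> probability."""
    losses = []
    for i in range(1, len(tokens)):
        p = model_probs(tokens[:i]).get(tokens[i], 1e-12)
        losses.append(-math.log(p))
    return sum(losses) / len(losses)

# A hypothetical "model" that is uniform over a three-word vocabulary:
vocab = ["the", "cat", "sat"]
uniform = lambda prefix: {w: 1.0 / len(vocab) for w in vocab}
loss = next_token_loss(uniform, ["the", "cat", "sat"])  # -log(1/3) per token
```

The point the quote is making is that this objective is simple to state, while the parameters that result from minimizing it are not simple to interpret.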

So the author is confusing the training process with the model. It’s like saying “although it may appear that humans are telling jokes and writing plays, all they are actually doing is optimizing for survival and reproduction”. This fallacy occurs throughout the paper.


The train/test framework is not hel... (read more)

4Bill Benzon
What I'm arguing is that what LLMs do goes way beyond predicting the next word. That's just the proximal means to an end, the end being a coherent statement.

We’re all more or less doing that when we speak or write, though there are times when we may set out to be deliberately surprising – but we can set such complications aside.


We're all more or less doing that when we speak or write?

1Bill Benzon
This:

If you think of the LLM as a complex dynamical system, then the trajectory is a valley in the system’s attractor landscape.


The real argument here is that you can construct dynamical systems that are simple, in the sense that the equations are quite simple, yet have complex behavior. The Lorenz system is one example, though there should be an even simpler example of, say, ergodic behavior.
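The Lorenz system makes the point concrete: three short differential equations, chaotic trajectories. A minimal sketch, using plain Euler integration (adequate for illustration, not a careful simulation) and the classic chaotic parameter values:

```python
# The Lorenz system: three simple ODEs whose trajectories are chaotic.
# sigma, rho, beta are the classic values for which the system is chaotic.

def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One Euler step of the Lorenz equations."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

# Two trajectories starting 1e-8 apart diverge to macroscopic separation,
# while both remain bounded on the attractor: simple rules, complex behavior.
a = (1.0, 1.0, 1.0)
b = (1.0, 1.0, 1.0 + 1e-8)
for _ in range(5000):
    a = lorenz_step(*a)
    b = lorenz_step(*b)
separation = sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
```

The "attractor landscape" language in the quoted comment is of this kind: the equations defining the system are compact, but the geometry of the trajectories they produce is not.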

1Bill Benzon
When was the last time someone used the Lorenz system to define justice?