
There’s a frequent misconception that a large language model can never achieve superhuman text-creation ability because such models try to produce text that is maximally unsurprising. This article explains why that assumption is wrong.
In 1906, Sir Francis Galton ran a weight-judging competition at a country fair, asking fair-goers to guess the weight of an ox. The median of the 787 guesses was 1,207 pounds, while the actual weight of the ox was 1,198 pounds. Any individual guessing error is a combination of systematic bias and random noise. Because the fair-goers were familiar with oxen, their guesses carried essentially no systematic bias, so the error was almost entirely random noise. By pooling the 787 guesses, Galton averaged out the random noise of the individual guesses.
This phenomenon came to be called the wisdom of the crowd. In domains where reasoning errors are mostly random noise, the crowd is smarter than its individual members. By training on large data sets, a large language model taps into the wisdom of the crowd: its ceiling is the wisdom of the crowd, not the wisdom of the crowd's individual members.
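To illustrate why pooling guesses works, here is a minimal simulation in Python. Only the number of guessers and the ox's weight come from Galton's account; the spread of the random noise is an assumption made purely for illustration.

```python
# Minimal simulation of the wisdom-of-the-crowd effect: unbiased guesses
# with random noise, where the crowd median lands far closer to the truth
# than a typical individual guess. The noise spread (75 lbs) is an
# illustrative assumption, not a figure from Galton.
import random

random.seed(0)
true_weight = 1198
guesses = [true_weight + random.gauss(0, 75) for _ in range(787)]

median = sorted(guesses)[len(guesses) // 2]
typical_error = sum(abs(g - true_weight) for g in guesses) / len(guesses)

print(f"crowd median: {median:.0f} lbs (true weight {true_weight} lbs)")
print(f"average individual error: {typical_error:.0f} lbs")
```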
The fact that each word of a text is highly unsurprising given the preceding words does not imply that the text as a whole is unsurprising. For any text, you can compute for each word the likelihood L_text that this word follows the preceding words, and the likelihood L_ideal of the single most likely word that could have followed the preceding words.
The difference L_ideal − L_text is the noise of that word, and averaging it over all words gives the noise of the text. A well-trained large language model can produce texts with far less noise than the average text in its training corpus.
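Here is a minimal sketch of that calculation, assuming the Hugging Face transformers library with GPT-2 standing in for an arbitrary autoregressive model; the function name average_noise and the sample sentence are just illustrative choices.

```python
# Sketch: average per-token "noise" of a text under a language model,
# following the L_text / L_ideal definitions above. Assumes torch and
# transformers are installed; GPT-2 is only a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def average_noise(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                     # (1, seq_len, vocab)
    probs = torch.softmax(logits[0, :-1], dim=-1)      # predictions for tokens 1..n-1
    actual = ids[0, 1:]                                # the tokens that actually follow
    l_text = probs[torch.arange(actual.size(0)), actual]  # likelihood of the actual token
    l_ideal = probs.max(dim=-1).values                     # likelihood of the most likely token
    return (l_ideal - l_text).mean().item()

print(average_noise("The ox weighed about twelve hundred pounds."))
```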
For further reading, Noise: A Flaw in Human Judgment by Daniel Kahneman, Olivier Sibony, and Cass Sunstein goes into more detail on how a machine learning model can eliminate noise and thereby make better decisions than the average of its training data.
You don't need to change anything in the underlying machine learning algorithms for a model like ChatGPT to generate new training data that could be used for recursive self-improvement.
In particular, if you give it access to a console so that it can reliably run code, it can create its own training data and enter a loop of recursive self-improvement.
If, for example, you want it to learn to reliably multiply two 4-digit numbers, you can randomly generate pairs of 4-digit numbers. You then let the model generate a text answer that shows the individual steps, and let a second model create Python code that validates every calculation in those steps. If the Python code confirms that all the calculations are correct, you have a new piece of training data on how to multiply two 4-digit numbers.
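A minimal sketch of such a generate-and-verify loop is below. The function llm_multiplication_steps is a hypothetical placeholder for the model call, stubbed so the script runs end to end, and the verifier is hand-written here, whereas the setup above proposes letting a second model write it.

```python
# Sketch of the generate-and-verify loop for 4-digit multiplication.
# llm_multiplication_steps is a hypothetical stand-in for a real model call.
import random
import re

def llm_multiplication_steps(a: int, b: int) -> str:
    """Placeholder for a model-generated step-by-step answer.

    A real system would prompt the model here; this stub writes out partial
    products and running sums in the 'x * y = z' / 'x + y = z' format that
    the verifier below expects.
    """
    lines, partials = [], []
    for i, d in enumerate(reversed(str(b))):
        p = a * int(d) * 10 ** i
        lines.append(f"{a} * {int(d) * 10 ** i} = {p}")
        partials.append(p)
    total = partials[0]
    for p in partials[1:]:
        lines.append(f"{total} + {p} = {total + p}")
        total += p
    return "\n".join(lines)

STEP = re.compile(r"^(\d+)\s*([*+])\s*(\d+)\s*=\s*(\d+)$")

def verify(answer: str, a: int, b: int) -> bool:
    """Check every arithmetic step, and that the last line gives the true product."""
    last = None
    for line in answer.strip().splitlines():
        m = STEP.match(line.strip())
        if not m:
            return False                      # unparseable step: reject the example
        x, op, y, z = int(m[1]), m[2], int(m[3]), int(m[4])
        if (x * y if op == "*" else x + y) != z:
            return False                      # arithmetic error in a step
        last = z
    return last == a * b

# Keep only verified answers as new training examples.
dataset = []
for _ in range(100):
    a, b = random.randint(1000, 9999), random.randint(1000, 9999)
    answer = llm_multiplication_steps(a, b)
    if verify(answer, a, b):
        dataset.append({"problem": f"What is {a} * {b}?", "solution": answer})

print(f"kept {len(dataset)} verified training examples")
```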
Based on ChatGPT user data, it might be possible to build an automated system that finds classes of problems where ChatGPT currently gives a wrong answer most of the time, and that then figures out how to create code that checks newly generated examples for correctness.