mfb comments on Assessing Kurzweil: the results - Less Wrong
There are a variety of reasons interpreters might think that a prediction didn't come true, while Kurzweil boldly claims that it did:
1. Kurzweil didn't express himself clearly, so interpreters misunderstood what the prediction really was. Miscommunication adds random noise, and most randomly generated predictions will turn out false, so this will skew the results against Kurzweil.
2. Kurzweil's prediction was vague, so charitable interpreters will think it's basically true, while less charitable interpreters will think it's basically false. And we can expect random LessWrongers to be less charitable toward Kurzweil than Kurzweil is toward Kurzweil.
3. Interpreters tend to be factually mistaken about current events, in a specific direction: They are ignorant of the nature, existence, or prevalence of the latest innovations in technology and culture.
4. Kurzweil tends to be factually mistaken about current events, in a specific direction: He thinks a variety of technologies are more advanced, and more widespread, than they really are.
5. There are systemic differences in the evaluation scales used by Kurzweil and by others. For instance, Kurzweil and Armstrong individuate 'predictions' differently, lumping and splitting at different points in the source text. There may also be systemic disagreements about how (temporally and technologically) precise an interpretation must be to count as 'correct,' and about whether grammatical forms like 'X is Y' most closely mean 'X is always Y', 'X is usually Y', 'X is commonly Y', 'X is sometimes (occasionally) Y', or 'X is Y at least once'. This ties into vagueness, but may bias the results due to linguistic variation rather than just a generic degree of interpretive charity.
I'm particularly curious about testing 3, since the strongest criticism Kurzweil could make of our methodology for assessing his accuracy is that our reviewers simply got the facts wrong. We can calibrate our assumptions about the accuracy and up-to-dateness of LessWrongers regarding technology generally. Or more specifically we can expose them to Kurzweil's arguments and see how much their assessment of his predictive success changes after hearing why he thinks he got a certain prediction 'correct'.
I think (5.) can make a significant difference (together with 1 and 2 - I would not expect so much trouble from 3 and 4). Imagine a series of 4 statements, where the last three basically require the first one. If all 4 are correct, it is easy to check every single statement, giving 4 correct predictions. But if the first one is wrong - and the others have to be wrong as a consequence - Kurzweil might count the whole block as one wrong prediction.
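To put rough numbers on the lumping effect, here is a minimal sketch. Everything in it is hypothetical: the two blocks of four dependent statements are made up, and `asymmetric_score` is only my guess at how such counting might work, not Kurzweil's actual procedure.

```python
# Hypothetical data: each block is a list of dependent statements,
# True = judged correct, False = judged wrong.
blocks = [
    [True, True, True, True],      # first statement holds, so do the rest
    [False, False, False, False],  # first statement fails, the rest fail with it
]

def split_score(blocks):
    """Count every statement as its own prediction."""
    correct = sum(sum(block) for block in blocks)
    total = sum(len(block) for block in blocks)
    return correct, total

def asymmetric_score(blocks):
    """Split blocks whose first statement is right, but lump a block
    whose first statement is wrong into a single wrong prediction."""
    correct = 0
    total = 0
    for block in blocks:
        if block[0]:
            correct += sum(block)
            total += len(block)
        else:
            total += 1  # the whole block counts as one miss
    return correct, total

print(split_score(blocks))       # (correct, total) when everything is split
print(asymmetric_score(blocks))  # (correct, total) with failed blocks lumped
```

With these made-up blocks, splitting everything gives 4 of 8 correct, while the asymmetric count gives 4 of 5, so the same underlying judgments yield 50% or 80% depending only on how predictions are individuated.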
For predictions judged by multiple volunteers, it might be interesting to check how much their ratings deviate from each other. This gives some insight into how important (1.) to (3.) are. satt looked at that, but I don't know what conclusion we can draw from it.
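One simple way to check the deviation between volunteers is the spread of ratings per prediction. The sketch below assumes a numeric rating scale and uses invented ratings and prediction labels; it is not based on the actual data or on what satt computed.

```python
from statistics import pstdev

# Hypothetical ratings: prediction label -> ratings from different volunteers,
# on an assumed numeric scale where a lower number means "more true".
ratings = {
    "prediction_a": [1, 1, 2],  # volunteers mostly agree
    "prediction_b": [1, 5, 3],  # volunteers disagree strongly
}

def rating_spread(ratings):
    """Population standard deviation of the ratings for each prediction."""
    return {label: pstdev(values) for label, values in ratings.items()}

spread = rating_spread(ratings)
# List predictions from most to least contested.
for label, sd in sorted(spread.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: spread {sd:.2f}")
```

A large spread on a prediction would point toward unclear wording, vagueness, or differing factual knowledge among the raters, though the spread alone cannot say which of those is responsible.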