they put substantial probability on the trend being superexponential
I think that's too speculative.
I also think that around 25-50% of the questions are impossible or mislabeled.
I wouldn't be surprised if 3-5% of questions were mislabeled or impossible to answer, but 25-50%? You're basically saying that HLE is worthless. I'm curious why. I mean, I don't know much about the people who had to sift through all of the submissions, but I'd be surprised if they failed that badly. Plus, there was a "bug bounty" aimed at improving the quality of the dataset.
TBC, my median to superhuman coder is more like 2031.
Guess I'm a pessimist then, mine is more like 2034.
I don't have one knock-down counterargument why the timelines will be much longer, so here's a whole lot of convincing-but-not-super-convincing counterarguments:
Not a criticism, but I think you overlooked a very interesting possibility: developing a near-perfect speech-to-text transcription AI and transcribing all of YouTube. The biggest issue with training multi-modal models is acquiring the right ("paired") training data. If YouTube had 99.9% accurate subtitles for every video, this would no longer be a problem.
You might be interested in reading about aspiration adaptation theory: https://www.sciencedirect.com/science/article/abs/pii/S0022249697912050
To me the most appealing part of it is that goals are incomparable and multiple goals can be pursued at the same time without the need for a function that aggregates them and assigns a single value to a combination of goals.
I'm quite late (the post was made 4 years ago), and I'm also new to LessWrong, so it's entirely possible that other, more experienced members will find flaws in my argument.
That being said, I have a very simple, short and straightforward explanation of why rationalists aren't winning.
Domain-specific knowledge is king.
That's it.
If you are a programmer and your code keeps throwing errors at you, then no matter how many logical fallacies and cognitive biases you can identify and name, posting your code on Stack Overflow is going to provide orders of magnitude more benefit.
If you are an entrepreneur and you're trying to start your new business, then no matter how many hours you spend assessing your priors and calibrating your beliefs, it's not going to help you nearly as much as being able to tell a good manager apart from a bad manager.
I'm not saying that learning rationality isn't going to help at all; rather, I'm saying that the impact of learning rationality on your chances of success will be many times smaller than the impact of learning domain-specific knowledge.
Ok, thank you for the clarification!
I'm very new to LessWrong in general, and to Eliezer's writing in particular, so I have a newbie question.
any more than you've ever argued that "we have to take AGI risk seriously even if there's only a tiny chance of it" or similar crazy things that other people hallucinate you arguing.
just like how people who helpfully try to defend MIRI by saying "Well, but even if there's a tiny chance..." are not thereby making their epistemic sins into mine.
I've read AGI Ruin: A List of Lethalities, and I legitimately have no idea what is wrong with "we have to take AGI risk seriously even if there's only a tiny chance of it". What is wrong with it? If anything, this seems like something I would say if I had to explain the gist of AGI Ruin: A List of Lethalities to someone else very briefly and using very few words.
The fact that I have absolutely no clue what is wrong with it probably means that I'm still very far from understanding anything about AGI and Eliezer's position.
My point was that it's surprising that AI is so bad at generalizing to tasks that it hasn't been trained on. I would've predicted that generalization would be much better (I also added a link to a post with more examples). This is also why I think creating AGI will be very hard, unless there is a massive paradigm shift (some new NN architecture or a new way to train NNs).
EDIT: It's not "Gemini can't count how many words it has in its output" that surprises me, it's "Gemini can't count how many words it has in its output, given that it can code in Python and in a dozen other languages and can also do calculus".
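To make the contrast concrete, here is roughly how small the task is when written as ordinary code. This is only a sketch: `count_words` is a name I made up for illustration, and I'm assuming a "word" means a whitespace-separated token, which may not match how anyone else would define it.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    # str.split() with no arguments splits on any run of whitespace
    # and ignores leading/trailing whitespace.
    return len(text.split())


if __name__ == "__main__":
    sample_output = "This sentence has exactly six words."
    print(count_words(sample_output))
```

The point is not that the model should run this internally, only that the task is trivial next to the coding and calculus it already handles.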