Comparing IQ and Codeforces ratings doesn't make much sense. Please stop doing this.
Attaching IQs to LLMs makes even less sense, except as a very loose metaphor. Please also stop doing this.
That's not right. You could easily spend a billion dollars just on better evals and better interpretability.
For the real alignment problem, the fact that $0.1 billion a year hasn't yielded returns doesn't mean $100 billion won't. It's one problem. No one has gotten much traction on it. You'd expect progress to look like a step function, not a smooth curve.
I don't really understand. Why wouldn't you just get tested to see whether you're deficient in anything?
I did that, and I wasn't deficient in anything.
I've also (somewhat involuntarily) done the thing you suggest, and I unsurprisingly didn't notice any difference. If anything, I feel a lot better on a vegan diet.
If you want to do the thing he's suggesting here, I'd recommend eating bivalves, like blue mussels or oysters. They are very unlikely to be sentient, they are usually quite cheap, and they contain the nutrients you'd be at risk of becoming deficient in as a vegan, along with other beneficial things like DHA.
I think for the fundraiser, Lightcone should sell (overpriced) LW hoodies. LessWrong has a very nice aesthetic now, and while this is probably a byproduct of a piece of my mind I shouldn't encourage, I find it quite appealing to buy a $450 LW hoodie, even though I don't have that much money. I'd probably not donate to the fundraiser otherwise. And if I did, I'd donate less than the margins on such a hoodie would be.
People seem to disagree with this comment. There are two statements and one argument in it.
What are people disagreeing with? Is it mostly the former? I think the latter is rather clear, and I'm very confident it is true, both the argument and the conclusion. The former I'm quite confident is true as well (~90%?), but only for my set of values.
https://bsky.app/profile/hmys.bsky.social/post/3lbd7wacakn25
I made one. A lot of people are not here, but many people are.
Seems unlikely to me. I think that, in large part due to factory farming, the current immediate existence of humanity, and also its history, are net negatives. The reason I'm not a full-blown antinatalist is that these issues are likely to be remedied in the future, and the goodness of the future will astronomically dwarf the negativity humanity has brought about and is bringing about (assuming we survive and realize a non-negligible fraction of our cosmic endowment).
The reason I think this is, well, the way I view it, it's an immediate corollary of the standard Yudkowsky/Bostrom AI arguments. Animals existing and suffering is an extremely specific state of affairs, just like humans existing and being happy is an extremely specific state of affairs. This means that if you optimize hard enough for anything that's not exactly that (happy humans or suffering animals), you're not gonna get it.
And, maybe this is me being too optimistic (but I really hope not, and I really don't think so), but I don't think many humans want animals to suffer for its own sake. They'd eat lab-grown meat if it were cheaper and better tasting than animal-grown meat. Lab-grown meat is a good example of the general principle I'm talking about. Suffering of sentient minds is a complex thing. If you have a powerful optimizer going about its way optimizing the universe, you're virtually never gonna get suffering sentient minds unless that is what the optimizer is deliberately aiming for.
I agree with this analysis. I mean, I'm not certain further optimization will erode the interpretability of the generated CoT; it's possible that being pretrained to use human natural language pushes it into a stable equilibrium. But I don't think so, since there are ways the CoT can become less interpretable in a step-wise fashion.
But this is the way it's going, and it seems inevitable to me. Just scaling up models and training them on English-language internet text is clearly less efficient (from a "build AGI" perspective, and from a profit perspective) than training them to do the specific tasks that the users of the technology want. So that's the way it's going.
And once you're training the models this way, the tether between human-understandable concepts and the CoT will be completely destroyed. If they stay together, it will just be because it's kind of a stable initial condition.
I just meant not primarily motivated by truth.
I specifically disagree with the IQ part and the Codeforces part. Meaning, I think they're misleading.
IQ and coding ability are useful measures of intelligence in humans because they correlate with a bunch of other things we care about. Not to say it's useless to measure "IQ" or coding ability in LLMs, but presenting them as if they mean anything like what they mean in humans is wrong, or at least will give many people reading it the wrong impression.
As for the overall point of this post: I roughly agree? I mean, I think the timelines are not too unreasonable, and the trilemma/quadrilemma you put up can be a useful framing. I mostly disagree with using the metrics you lead with to quantify any of this. I think we should look at the specific abilities current models have or lack that are necessary for the scenarios you outlined, and how soon we're likely to get them. But you do go through that somewhat in the post.