Artaxerxes - LessWrong

But ultimately, for the parts that really matter here, this is a matter of explaining, not of defeating

Of course, defeating people who are mistakenly doing the wrong thing could also work, no? Even if we assume that people doing the wrong thing are merely making a mistake by their own lights, it might be practically much more feasible to divert them away from it, or otherwise prevent them from doing it, than to rely on successfully convincing them not to do it.

Not all people are going to be equally amenable to explanation. It's not obvious to me at least that we should limit ourselves to that tool in the toolbox as a rule, even under an assumption where everyone chasing bad outcomes is simply mistaken/confused.

But I'm pretty sure nobody in charge is on purpose trying to kill everyone; they're just on accident functionally trying to kill everyone.

I'm less sure about this. I've met plenty of human extinctionists. You could argue that they're just making a mistake, that it's just an accident. But I do think it is meaningful that there are people who are willing to profess that they want humanity to go extinct and who take actions in the world that they think nudge us in that direction, and other people who don't do those things. The distinction is a meaningful one, even under a model where you claim that such people are fundamentally confused and that if they were somehow less confused they would pursue better things.

When do "brains beat brawn" in Chess? An experiment

Artaxerxes2y51

On the other hand, the potential resource imbalance could be ridiculously high, particularly if a rogue AI is caught early on in its plot, with all the world’s militaries combined against them while they still have to rely on humans for electricity and physical computing servers. It’s somewhat hard to outthink a missile headed for your server farm at 800 km/h. ... I hope this little experiment at least explains why I don’t think the victory of brain over brawn is “obvious”. Intelligence counts for a lot, but it ain’t everything.

While this is a true and important thing to realise, I don't think of it as the kind of information that does much to comfort me with regard to AI risk. Yes, if we catch a misaligned AI sufficiently early, such that it is below whatever threshold of combined intelligence and resources is needed to kill us, then there is a good chance we will choose to prevent it from doing so. But this is something that could happen thousands of times and it would still feel rather beside the point, because it only takes one situation where an AI isn't below that threshold and therefore does still kill us all.

If we can identify even roughly where various thresholds are, and find some equivalent of leaving the AI with a king and three pawns where we have a ~100% chance of stopping it, then sure, that information could be useful, and perhaps we could coordinate around ensuring that no AI that would kill us all, should it get more material, ever actually gets more than that. But even after clearing the technical challenge of finding such thresholds with any real certainty in such a complex world, the coordination challenge would still remain: actually getting everyone to stick to them despite incentives to make more useful AI by giving it more capability and resources.

Still worthwhile research to do, of course, even if it ends up being the kind of thing that only buys some time.

Musk on AGI Timeframes

Artaxerxes3y20

The "10 years at most" part of the prediction is still open, to be fair.

Five Areas I Wish EAs Gave More Focus

Artaxerxes3y20

You also appeal to just open-ended uncertainty

I think it would be more accurate to say that I'm simply acknowledging the sheer complexity of the world and the massive ramifications that such a large change would have. Hypothesizing about a few possible downstream effects of something like life extension on something as far away from it causally as AI risk is all well and good, but I think you would need to put a lot of time and effort into it in order to be very confident at all about things like directionality of net effects overall.

I would go as far as to say the implementation details of how we get life extension could themselves change the sign of the impact with regard to AI risk. There are enough different possible scenarios for how it could go, each amplifying different components of its impact on AI risk, that the overall net effect could differ from one scenario to the next.

What are some additional concrete scenarios where longevity research makes things better or worse?

So first, you didn't respond to the example I gave with regard to preventing human capital waste (preventing people with experience/education/knowledge/expertise from dying of aging-related disease), and the additional slack from the additional general productive capacity in the economy more broadly that is then able to go into AI capabilities research.

Here's another one. Let's say medicine and healthcare become a much smaller field after the advent of popularly available regenerative therapies that prevent diseases of old age. In this world, people only need to go see a medical professional when they face injury or the increasingly rare infection by a communicable disease. The demand for medical professionals disappears to a massive extent, and the best and brightest (medical programs often have the highest/most competitive entry requirements) who would have gone into medicine are routed elsewhere, including into AI, accelerating capabilities and causing faster overall timelines.

An assumption that much might hinge on is that I expect differential technological development with regard to capability versus safety to pretty heavily favour accelerating capabilities over safety in circumstances where additional resources are made available for both. This isn't necessarily going to be the case, of course. For example, the resources could in theory be routed exclusively towards safety. But I just don't expect most worlds to go that way, or even for the share of resources allocated towards safety to be high enough that you get positive expected value from the additional resources very often. But even something as basic as this is subject to a lot of uncertainty.
