
Comment author: Yosarian2 03 June 2017 06:42:12PM *  3 points [-]

Interesting idea.

So, let's think about this. Say we're talking about the Xrisk of nuclear war. Which of these would count as "near misses" in your mind?

Cuban missile crisis (probably the clearest case)

End of the Korean war when General Douglas MacArthur was pushing for nuclear weapons to be used against China

Berlin Airlift (IMHO this came very close to a WWIII scenario between Russia and the US, but the USSR didn't test their first nuclear weapon until 1949, so while a WWIII that started in 1948 could have gone nuclear, it probably wouldn't have been enough nuclear weapons to be a true x-risk?)

The incident in 1983 when Lieutenant Colonel Stanislav Petrov got the false report of a US launch and decided to not pass it on to his superiors

A bomber carrying two nuclear weapons crashed in North Carolina, and apparently one of the bombs came very close to detonating: https://en.wikipedia.org/wiki/1961_Goldsboro_B-52_crash

And Wikipedia lists another eight nuclear close calls I hadn't even heard of before searching for them.


It seems like it might be hard to define exactly what counts as a "close call," but out of those, which ones would you count?

Edit: And nuclear war is probably the easiest one to measure. If there was a near miss that almost resulted in, say, a genetically engineered bioweapon being released, odds are we never would have heard about it. And I doubt anyone would even recognize something that was a near miss in terms of UFAI or similar technologies.

Comment author: AlexMennen 26 May 2017 05:43:10PM 4 points [-]

Good point. This seems like an important oversight on my part, so I added a note about it.

Comment author: Yosarian2 26 May 2017 09:26:22PM 3 points [-]


One more point you might want to mention: in a world with AI but no intelligence explosion, where AIs are not able to rapidly develop better AIs, augmented human intelligence (through various transhuman technologies and various forms of brain/computer interfaces) could be a much more important factor. That kind of technology could allow humans to "keep up with" AIs, at least for a time, and it's possible that humans and AIs working together on tasks could remain competitive with pure AIs for a significant period.

Comment author: Yosarian2 26 May 2017 10:01:10AM 3 points [-]

Another big difference: if there's no intelligence explosion, we're probably not talking about a singleton. If someone manages to create an AI that's, say, roughly human-level intelligence (probably stronger in some areas and weaker in others, but human-ish on average) and progress slows or stalls after that, then the most likely scenario is that a lot of those human-level AIs would be created and sold for different purposes all over the world. We would probably be dealing with a complex world that has a lot of different AIs and humans interacting with each other. That could create its own risks, but they would probably have to be handled in a different way.

In response to Cheating Omega
Comment author: Yosarian2 14 April 2017 02:59:20AM 0 points [-]

Yeah, creating a situation where your choice is effectively random can be a way to win some game-theory situations.

Disturbingly, this is a "solution" to real-world MAD nuclear war game theory. You probably can't credibly threaten to INTENTIONALLY start a nuclear war, because that would kill you and nearly all the people in your country as well, but you CAN create situations of uncertainty where there is (or appears to be) a real risk that a nuclear war may randomly happen without you actually wanting it to. If the other side has a lower tolerance for existential risk than you do, it may back down and try to negotiate at that point.

There are a lot of variations on this: saber rattling and deliberate near-misses, stationing troops in places they can't possibly defend (like Berlin in the Cold War) just as a "trip wire" to increase the odds of WWIII happening if the Soviet Union invaded, Nixon's "madman" theory, probably most of North Korea's recent moves, etc. It all comes down to the same thing: not actual randomness, which humans can't really do, but uncertainty, which is almost as effective.
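The logic behind these "threats that leave something to chance" can be sketched numerically. The payoffs below are invented purely for illustration, not taken from any real scenario:

```python
# Toy brinkmanship model. The rival compares the cost of conceding
# against the expected cost of holding firm while there is a chance p
# of war breaking out by accident.
cost_of_conceding = 1.0
cost_of_war = 1000.0

def rival_concedes(p_accidental_war):
    # The rival backs down when the expected cost of accidental war
    # exceeds the cost of simply conceding.
    return p_accidental_war * cost_of_war > cost_of_conceding

print(rival_concedes(0.0005))  # False - the risk is too small to be coercive
print(rival_concedes(0.01))    # True - a 1% accident risk outweighs conceding
```

The coercion works precisely because the threatener never has to promise a deliberate attack, only manufacture enough apparent risk of an accidental one.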

Comment author: gilch 13 April 2017 04:36:04PM 1 point [-]

This is probably more true of some animals than others. From what I've read, most baboons and hyenas (for example) are pretty miserable because of their social structures. I remember reading about a case where the dominant members of a baboon troop died of disease and their culture shifted because of it. The surviving baboons were much happier.

Nature (evolution) literally invented pain in the first place, and it's under no obligation to turn it off when it doesn't impact genetic fitness. Elephants pass the mirror test. That's very strong evidence that they're conscious and self-aware. Yet they slowly starve to death once they've run out of teeth.

Comment author: Yosarian2 14 April 2017 01:30:43AM 0 points [-]

Oh, there is a lot of suffering in nature, no question. The world, as it evolved, isn't anywhere close to optimal, for anything.

I do think it's highly unlikely that net utility for your average animal over the course of its lifetime is going to be negative, though. The "default state" of an animal when it is not under stress does not seem to be an unhappy one, in general.

Comment author: Stuart_Armstrong 12 April 2017 07:22:44AM 2 points [-]

It can't bias its information search (looking for evidence for X rather than evidence against it), but it can play on the variance.

Suppose you want to have a belief in X in the 0.4 to 0.6 range, and there's a video tape that would clear the matter up completely. Then not watching the video is a good move! If you currently have a belief of 0.3, then you can't bias your video watching, but you could get an idiot to watch the video and recount it vaguely to you; then you might end up with a higher chance (say 20%) of being in the 0.4 to 0.6 range.
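This can be checked with Bayes' rule. The 70% reliability figure for the secondhand retelling is my own assumption (the comment just says "say 20%" for the resulting chance), but the qualitative effect is the same:

```python
def posterior(prior, accuracy, report_yes):
    # Bayes' rule, where the secondhand recounter reports the truth
    # with probability `accuracy`.
    if report_yes:
        num, other = prior * accuracy, (1 - prior) * (1 - accuracy)
    else:
        num, other = prior * (1 - accuracy), (1 - prior) * accuracy
    return num / (num + other)

prior = 0.3
acc = 0.7  # assumed reliability of the vague retelling

p_yes = prior * acc + (1 - prior) * (1 - acc)  # P(recounter says "yes") = 0.42
post_yes = posterior(prior, acc, True)          # belief after "yes": 0.5
post_no = posterior(prior, acc, False)          # belief after "no": ~0.16

in_range = lambda b: 0.4 <= b <= 0.6
p_in_range = p_yes * in_range(post_yes) + (1 - p_yes) * in_range(post_no)
print(p_in_range)  # ~0.42, versus exactly 0 if you watch the video yourself
```

Watching the video directly sends the belief to 0 or 1, so the chance of landing in [0.4, 0.6] is zero; the noisy channel only moves the belief partway, so a "yes" report lands it right in the target range.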

Comment author: Yosarian2 12 April 2017 02:24:11PM *  0 points [-]

If it's capable of self-modifying, then it could do weirder things.

For example, let's say the AI knows that news source X will almost always push stories in favor of action Y. (Fox News will almost always push information that supports the argument that we should bomb the Middle East, The Guardian will almost always push information that supports us becoming more socialist, whatever.) If the AI wants to bias itself in favor of thinking that action Y will create more utility, what if it self-modifies to first convince itself that news source X is a much more reliable source of information than it actually is, and to weigh that information more heavily in its future analysis?

If it can't self-modify directly, it could maybe do tricky things like only observing the desired information source at key moments, with the goal of increasing its own confidence in that source; then, once it has raised its confidence sufficiently, it looks at that information source to find the information it is looking for.

(Again, this sounds crazy, but keep in mind humans do this stuff to themselves all the time.)

Etc. Basically, what this all boils down to is that the AI doesn't really care about what happens in the real world; it's not trying to actually accomplish a goal. Instead, its primary objective is to make itself think that it has an 80% chance of accomplishing the goal (or whatever), and once it does that, it doesn't really matter whether the goal happens or not. It has a built-in motivation to try to trick itself.

Comment author: Stuart_Armstrong 11 April 2017 08:53:41AM 3 points [-]

I'd call that a mix of extreme optimised policy with inefficiency (not in the exact technical sense, but informally).

There's nothing to stop the agent from doing that, but it's also not required. This is expected utility we're talking about, so "expected utility in the range 0.8-1" is achieved - with certainty - by a policy that has a 90% probability of achieving 1 utility (and 10% of achieving 0). You may say there's also a tiny chance of the AI's estimates being wrong, its sensors, its probability calculation... but all that would just be absorbed into, say, a 89% chance of success.

In a sense, this was the hope for the satisficer: that it would make a half-assed effort. But it can choose an optimal maximising policy instead. This type of agent can also choose a maximising-style policy, but mix it with deliberate inefficiency, i.e. it isn't really any better.
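The point about expected utility being hit with certainty can be written out in two lines (the 90/10 split is the example from the comment above):

```python
import random

def mixed_policy():
    # "Maximise, then deliberately sabotage": run the optimal plan,
    # but fail on purpose 10% of the time.
    return 1.0 if random.random() < 0.9 else 0.0

# The *expected* utility of this policy is known with certainty,
# even though any single run yields 0 or 1:
expected_utility = 0.9 * 1.0 + 0.1 * 0.0
print(expected_utility)  # 0.9 - inside the target range [0.8, 1] every time
```

Since the constraint is on expected utility rather than realized utility, the stochastic policy satisfies it deterministically.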

Comment author: Yosarian2 11 April 2017 03:06:09PM 0 points [-]

Ah, interesting, I understand better now what you're saying. That makes more sense, thank you.

Here's another possible failure mode, then: if the AI's goal is just to manipulate its own expected utility, and it calculates expected utility using some Bayesian method of updating priors with new information, could it selectively seek out new information to convince itself that what it was already going to do will have an expected utility in the 0.8 to 1 range, and game the system that way? I know that sounds strange, but humans do stuff like that all the time.
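This kind of evidence-gaming is real: even an honestly updating Bayesian can inflate the probability that its *posterior* ends up above a threshold just by choosing when to stop collecting evidence (optional stopping). A toy simulation, with all numbers my own assumptions:

```python
import random

def run_trial(rng, prior=0.5, threshold=0.8, max_obs=50):
    # World: hypothesis X is true with probability `prior`.
    # Evidence: each observation comes up "heads" with probability 0.7
    # if X is true, 0.3 if X is false. The agent updates honestly by
    # Bayes' rule, but STOPS as soon as its posterior hits `threshold`.
    x_true = rng.random() < prior
    posterior = prior
    for _ in range(max_obs):
        if posterior >= threshold:
            return True  # stopped while "convinced"
        heads = rng.random() < (0.7 if x_true else 0.3)
        like_true = 0.7 if heads else 0.3
        like_false = 0.3 if heads else 0.7
        posterior = (posterior * like_true) / (
            posterior * like_true + (1 - posterior) * like_false)
    return posterior >= threshold

rng = random.Random(0)
trials = 20000
convinced = sum(run_trial(rng) for _ in range(trials)) / trials
print(convinced)  # ~0.59, versus ~0.5 for an agent that fixes its sample size in advance
```

Each individual update is perfectly honest; the bias comes entirely from the stopping rule, which is exactly the loophole available to an agent that only cares about its own confidence number.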

Comment author: Yosarian2 10 April 2017 03:21:42PM *  2 points [-]

I used the app "HabitRPG" on my phone. It's an app where you set a series of tasks you want to do every day or every week, and it "gamifies" them: a little RPG character earns experience and gold when you do a task and check it off, and loses hit points if you don't.


Overall I'd give it about a +5. I used it for about two months when I was starting a new job, and I think it did help me create good habits for my new situation, although I can't say for sure since I don't have a control or a baseline to compare it to. After about two months I felt like I didn't need it anymore and stopped, and the good habits more or less persisted. The downside is that it's a little time-consuming and can itself be a distraction sometimes, and you still need some degree of self-motivation for it to work.

Comment author: entirelyuseless 10 April 2017 02:15:00PM 0 points [-]

I think the problem here is that you are trying to argue for a side: the progressives are trying to "to let things develop naturally," and so are behaving in a better way than the conservatives who are trying to "destroy" things.

The truth is this: technology does create new options, as well as taking away some options for some [e.g. many people no longer have a realistic option to "never use a computer"]. This means that technological change drives moral and cultural change.

And yes there tend to be two attitudes both to the technologies and to the behaviors that they allow or prevent. But it is just false that one of those attitudes is right and the other wrong. Rather, in some cases those behaviors are beneficial, and in others they are not. Very often, it will not be clear at first whether the results of the new behavior will be good overall, and only later people figure out that they need still another technology, or they still need to fix remaining problems, or whatever.

Comment author: Yosarian2 10 April 2017 02:34:12PM 0 points [-]

I think the problem here is that you are trying to argue for a side: the progressives are trying to "to let things develop naturally," and so are behaving in a better way than the conservatives who are trying to "destroy" things.

Not exactly. I am giving as a counterexample a class of situations where conservatives try to shape society while progressives try to let "nature take its course."

I also didn't put any value judgement in, at least not intentionally. I described it as "destroying weeds," which is not a negative thing. I would say that both sides are trying to "destroy weeds" and shape society; they just have different ideas about what those weeds are.

But it is just false that one of those attitudes is right and the other wrong.

I did not say there was.

I do think that in the majority of cases new technology makes our lives better overall, and it's usually better to embrace the new possibilities first and then error-correct later, eliminating uses where it turns out the new technology was not as helpful as it appeared; usually the only way to find that out is to try it, and attempts to restrict it beforehand usually target the wrong problems anyway. But that's an object-level question that depends on the technology in question, not a universal truth.

Comment author: Yosarian2 10 April 2017 01:45:05PM 0 points [-]

Its failure mode, though, is that it don't preclude, for instance, a probabilistic mix of extreme optimised policy with a random inefficient one.

I think there is a more serious failure mode here.

If an AI wants to keep a utility function within a certain range, what's to stop it from dramatically increasing its own intelligence, access to resources, etc. without limit, just to increase the probability of staying within that range in the future from 99.9% up to 99.999%? You still might run into the same "instrumental goals" problem.
