
Comment author: AABoyles 05 September 2017 07:17:19PM 1 point [-]

If there were randomness such that you had some probability of a strongly positive event, would this incline you towards life?

Comment author: Brian_Tomasik 03 October 2017 03:51:17AM 1 point [-]

Currently I don't care much about strongly positive events, so at this point I'd say no. In the throes of such a positive event I might change my mind. :)

Comment author: AABoyles 05 September 2017 07:14:02PM 1 point [-]

Even if the probability were trivial?

Comment author: Brian_Tomasik 03 October 2017 03:49:47AM 1 point [-]

Yes, because I don't see any significant selfish upside to life, only possible downside in cases of torture/etc. Life is often fun, but I don't strongly care about experiencing it.

Comment author: Lumifer 05 September 2017 07:21:16PM 2 points [-]

However, if there were randomness such that I had some probability of, e.g., being tortured by a serial killer, then I would certainly choose not to repeat life.

Your future life as of this moment certainly has a large amount of randomness.

Comment author: Brian_Tomasik 03 October 2017 03:46:47AM 0 points [-]

Yeah, but it would be very bad relative to my altruistic goals if I died any time soon. The thought experiment in the OP ignores altruistic considerations.

Comment author: Brian_Tomasik 23 September 2017 05:11:16AM 1 point [-]

However, if you believe that the agent in world 2 is not an instantiation of you, then naturalized induction concludes that world 2 isn't actual and so pressing the button is safe.

By "isn't actual" do you just mean that the agent isn't in world 2? World 2 might still exist, though?

Comment author: Brian_Tomasik 03 September 2017 06:35:05AM 1 point [-]

I assume the thought experiment ignores instrumental considerations like altruistic impact.

For re-living my actual life, I wouldn't care that much either way, because most of my experiences haven't been extremely good or extremely bad. However, if there were randomness such that I had some probability of, e.g., being tortured by a serial killer, then I would certainly choose not to repeat life.

Comment author: Lumifer 20 June 2017 02:51:27PM 2 points [-]

Direct quote: "So, s-risks are roughly as severe as factory farming"

/facepalm

Comment author: Brian_Tomasik 20 June 2017 10:06:55PM *  3 points [-]

Is it still a facepalm given the rest of the sentence? "So, s-risks are roughly as severe as factory farming, but with an even larger scope." The word "severe" is being used in a technical sense (discussed a few paragraphs earlier) to mean something like "per-individual badness" without considering scope.

Comment author: username2 20 June 2017 06:59:28PM 2 points [-]

Feedback: I had to scroll a very long way until I found out what "s-risk" even was. By then I had lost interest, mainly because generalizing from fiction is not useful.

Comment author: Brian_Tomasik 20 June 2017 10:04:12PM 1 point [-]

Thanks for the feedback! The first sentence below the title slide says: "I’ll talk about risks of severe suffering in the far future, or s-risks." Was this an insufficient definition for you? Would you recommend a different definition?

Comment author: Stuart_Armstrong 13 August 2015 10:18:16AM 2 points [-]

One naive and useful security precaution is to make the AI care only about worlds where the high explosives inside it won't actually ever detonate... (and place someone ready to blow them up if the AI misbehaves).

There are other, more general versions of that idea, and other uses to which this can be put.

Comment author: Brian_Tomasik 14 August 2015 08:28:49AM 1 point [-]

I guess you mean that the AGI would care about worlds where the explosives won't detonate even if the AGI does nothing to stop the person from pressing the detonation button. If the AGI only cared about worlds where the bomb didn't detonate for any reason, it would try hard to stop the button from being pushed.
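
Here's a crude toy model of the difference (entirely my own illustrative construction, with made-up action names, costs, and probabilities). The first reading conditions the utility on an environment event the agent can't influence (the person never presses the button, which in this toy setup is the same as "the bomb wouldn't detonate if the AGI did nothing"); the second conditions it on the bomb actually not detonating, however that comes about:

    # Toy comparison of the two readings (hypothetical names and numbers).
    P_PRESS = 0.1      # chance the person presses the detonation button
    BLOCK_COST = 0.01  # small effort cost of disabling the button first

    ACTIONS = ["just_do_task", "disable_button_then_do_task"]

    def run(action, person_presses):
        """Return (base_utility, bomb_detonates) for one possible world."""
        blocked = (action == "disable_button_then_do_task")
        bomb_detonates = person_presses and not blocked
        base_utility = 1.0 - (BLOCK_COST if blocked else 0.0)  # reward for doing the task
        return base_utility, bomb_detonates

    def eu_reading_1(action):
        # Care only about worlds where the person never presses the button,
        # an event the agent's own action cannot influence.
        utility, _ = run(action, person_presses=False)
        return utility

    def eu_reading_2(action):
        # Care only about outcomes where the bomb does not in fact detonate,
        # however that comes about.
        total = 0.0
        for presses, prob in [(True, P_PRESS), (False, 1.0 - P_PRESS)]:
            utility, detonates = run(action, presses)
            if not detonates:
                total += prob * utility
        return total

    for label, eu in [("reading 1", eu_reading_1), ("reading 2", eu_reading_2)]:
        scores = {action: round(eu(action), 3) for action in ACTIONS}
        print(label, scores, "-> prefers", max(scores, key=scores.get))

Under the first reading the agent gains nothing from touching the button, so it just does its task; under the second, disabling the button increases the total probability mass of worlds it cares about, so it prefers to interfere.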

But to make the AGI care only about worlds where the bomb doesn't go off even if it does nothing to avert the explosion, we have to define what it means for the AGI to "try to avert the explosion" vs. just doing ordinary actions. That gets tricky pretty quickly.

Anyway, you've convinced me that these scenarios are at least interesting. I just want to point out that they may not be as straightforward as they seem once it comes time to implement them.

Comment author: Stuart_Armstrong 12 August 2015 10:11:20AM 1 point [-]

The AGI does not need to be tricked - it knows everything about the setup; it just doesn't care. The point of this is that it allows a lot of extra control methods to be considered, if friendliness turns out to be as hard as we think.

Comment author: Brian_Tomasik 12 August 2015 11:07:33PM 2 points [-]

Fair enough. I just meant that this setup requires building an AGI with a particular utility function that behaves as expected and building extra machinery around it, which could be more complicated than just building an AGI with the utility function you wanted. On the other hand, maybe it's easier to build an AGI that only cares about worlds where one particular bitstring shows up than to build a friendly AGI in general.
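
For concreteness, the kind of modified utility function I mean would look something like the following sketch (hypothetical names and an arbitrary placeholder base utility; the real proposal would be more careful): mask whatever base utility you managed to specify so that it's zero in every history that never contains the secret bitstring.

    SECRET_BITSTRING = "101101110001"  # hypothetical string that only the overseers can reveal

    def base_utility(history: str) -> float:
        """Stand-in for whatever utility the AGI would otherwise pursue."""
        return float(history.count("1"))  # arbitrary placeholder

    def masked_utility(history: str) -> float:
        """The modified utility: worlds that never contain the bitstring count for nothing."""
        return base_utility(history) if SECRET_BITSTRING in history else 0.0

The mask itself is trivial to write down; whether the whole setup (the modified utility plus the surrounding machinery) ends up simpler than directly specifying the utility function you wanted is exactly the question above.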

Comment author: Brian_Tomasik 12 August 2015 12:43:50AM *  0 points [-]

I'm nervous about designing elaborate mechanisms to trick an AGI, since if we can't even correctly implement an ordinary friendly AGI without bugs and mistakes, it seems even less likely we'd implement the weird/clever AGI setups without them. I would tend to focus on just getting the AGI to behave properly from the start, without the need for clever tricks, though I suppose that limited exploration into more fanciful scenarios might yield insight.
