How much friendliness is enough?

cousin_it

According to Eliezer, making AI safe requires solving two problems:

1) Formalize a utility function whose fulfillment would constitute "good" to us. CEV is intended as a step toward that.

2) Invent a way to code an AI so that it's mathematically guaranteed not to change its goals after many cycles of self-improvement, negotiations, etc. TDT is intended as a step toward that.

It is obvious to me that (2) must be solved, but I'm not sure about (1). The problem in (1) is that we're asked to formalize a whole lot of things that don't look like they should be necessary. If the AI is tasked with building a faster and more efficient airplane, does it really need to understand that humans don't like to be bored?

To put the question sharply, which of the following looks easier to formalize:

a) Please output a proof of the Riemann hypothesis, and please don't get out of your box along the way.

b) Please do whatever the CEV of humanity wants.

Note that I'm not asking if (a) is easy in absolute terms, only if it's easier than (b). If you disagree that (a) looks easier than (b), why?

I am familiar with the libertarian argument that if everyone has more destructive power, society is safer. The analogous position would be that if everyone pursues (Friendly) AGI vigorously, existential risk would be reduced. That might well be reasonable, but as far as I can tell, that's NOT what is advocated.

Rather, we are all asked to avoid AGI research (and go into software development, make money, and donate? How much safer is general software development for a corporation than careful AGI research?) and instead to sponsor SIAI/EY's (Friendly) AGI research, even though SIAI/EY is fairly closed-mouthed about it.

It just seems to me like it would take a terribly delicate balance of probabilities to make this the safest course forward.
