DanielLC comments on Steelmanning MIRI critics - Less Wrong

6 Post author: fowlertm 19 August 2014 03:14AM


Comment author: DanielLC 20 August 2014 03:29:25AM 1 point

For one thing, you'd have to explicitly specify the utility function before you could prove that the AI follows it.

You can either make an AI that will proveably do what you mean, or make one that will hopefully figure out what you meant when you said "do what I mean," and do that.

Comment author: VAuroch 20 August 2014 07:10:54AM 0 points

When I picture what a proven-Friendly AI looks like, I think of something whose goals are: 1) using a sample of simulated humans, generalize to unpack 'do what I mean', followed by 2) make satisfying that your utility function.

Rigorously proving each of those two steps would produce a proven-Friendly AI without an explicit utility function. Proving step 1 safe would obviously be very difficult; proving step 2 safe would probably be comparatively easy. Both, however, are plausibly rigorously provable.

Comment author: DanielLC 20 August 2014 04:27:02PM 1 point

2) make satisfying that your utility function

This is what I mean by an explicit utility function. An implicit one is where the agent never actually calculates utility, the way humans work.
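The distinction can be sketched in code. Below is a toy illustration (all names and preference orderings here are hypothetical, invented for this example, and not anyone's actual proposal): an "explicit" agent computes a numeric utility for each outcome and maximizes it, while an "implicit" agent only makes pairwise preference comparisons and never calculates a utility number at all.

```python
def explicit_utility(outcome: str) -> float:
    """An explicit utility function: assigns a real number to each outcome.
    (The outcomes and scores are made up for illustration.)"""
    scores = {"nothing": 0.0, "paperclips": 1.0, "human_flourishing": 100.0}
    return scores.get(outcome, 0.0)

def explicit_agent(options):
    # Explicitly calculates utility for each option and maximizes it.
    return max(options, key=explicit_utility)

def prefers(a: str, b: str) -> bool:
    """An implicit preference: a pairwise comparison with no number ever
    computed. Human-like choice can be modeled this way, and such
    comparisons need not even be consistent with any utility function."""
    order = ["nothing", "paperclips", "human_flourishing"]
    return order.index(a) > order.index(b)

def implicit_agent(options):
    # Picks a most-preferred option by pairwise comparison only;
    # at no point does it assign a utility value to anything.
    best = options[0]
    for option in options[1:]:
        if prefers(option, best):
            best = option
    return best

options = ["paperclips", "nothing", "human_flourishing"]
print(explicit_agent(options))  # human_flourishing
print(implicit_agent(options))  # human_flourishing
```

Both toy agents pick the same option here, but only the first one ever produces a utility value you could state a theorem about; the second behaves purposefully without any such quantity existing inside it.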