loqi comments on Open Thread: June 2009 - Less Wrong

Post author: Cyan 01 June 2009 06:46PM




Comment author: scav 02 June 2009 02:03:19PM 0 points

I haven't thought this through very deeply, but couldn't the machine's operation be bounded by hard safety constraints that the AI is not allowed to change, rather than trying to build safety into the overall utility function?

e.g. the AI is not allowed to construct more resources for itself. No matter how inefficient it may be, the AI has to ask a human for more hardware and wait for the hardware to be installed by humans.

What I would want of a super-intelligent AI is more or less what I would want of a human who has power over me: don't do things to me or my stuff without asking, don't coerce me, don't lie to me.

I don't know how you would code all that, but if we can't design simple constraints, we can't correctly design more complex ones. I'm thinking layers of simple constraints would be safer than one unprovably-friendly utility function.
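The "layers of simple constraints" idea above can be illustrated with a toy sketch. To be clear, this is purely illustrative and glosses over the real difficulty (loqi's point below): every name here (`Action` dictionaries, the constraint predicates, `ConstraintViolation`) is invented for this sketch, and the hard part, formalizing predicates like "don't coerce me" over real-world actions, is exactly what the sketch assumes away. It only shows the gatekeeper architecture: constraints checked outside the agent's planning loop, which the agent cannot modify.

```python
class ConstraintViolation(Exception):
    """Raised when a proposed action fails a hard constraint."""
    pass

# Each constraint is a simple predicate over a proposed action
# (represented here, unrealistically, as a small dict).
def no_self_expansion(action):
    # The AI must ask humans for hardware; it may not acquire it itself.
    return action.get("type") != "acquire_hardware"

def no_unconsented_modification(action):
    # Don't touch the user's stuff without asking.
    if action.get("type") == "modify_user_property":
        return action.get("consent", False)
    return True

# The layering: a flat list of independent checks, all of which must pass.
HARD_CONSTRAINTS = [no_self_expansion, no_unconsented_modification]

def execute(action):
    """Gatekeeper: every action must pass every constraint before running.
    This function sits outside anything the planner is allowed to rewrite."""
    for constraint in HARD_CONSTRAINTS:
        if not constraint(action):
            raise ConstraintViolation(constraint.__name__)
    return action  # stand-in for actually performing the action

# A permitted action passes; a forbidden one is blocked.
execute({"type": "answer_question"})
try:
    execute({"type": "acquire_hardware"})
    blocked = None
except ConstraintViolation as e:
    blocked = str(e)
```

The design choice this sketches is that safety lives in a separate, auditable layer rather than being folded into the objective the planner optimizes. Whether the predicates themselves can be written correctly is the open question.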

Comment author: loqi 03 June 2009 06:51:52PM 0 points

> I don't know how you would code all that, but if we can't design simple constraints, we can't correctly design more complex ones.

What makes you think these constraints are at all simple?