turchin comments on Open thread, Jul. 11 - Jul. 17, 2016 - Less Wrong

4 Post author: MrMind 11 July 2016 07:09AM



Comment author: turchin 13 July 2016 09:31:31PM 0 points

I think it is difficult, but may be possible, to create a superintelligent program that will provably do some formally specified thing.

But the main problem is that we can't formally specify what "harming a human" means. Or perhaps we can, but we can't be sure the definition is safe.

So we end up with a kind of circularity: we could prove that the machine will do X, but we can't prove that X is actually good and safe.

We may try to shift the burden of proof to the machine: we would then have to prove that it will prove that X is really good and safe. I have bad feelings about the computability of this task.

That is why I am generally skeptical of the idea of a mathematical proof of AI safety. It doesn't provide 100 percent safety, because the proof can have holes in it, and the task is too complex to be solved in time.

Comment author: gjm 13 July 2016 11:06:15PM -1 points

This is a real and important difficulty, but it isn't what the paper is about -- they assume one can always readily tell whether people are being harmed.