ZoltanBerrigomo comments on What can go wrong with the following protocol for AI containment? - Less Wrong

0 Post author: ZoltanBerrigomo 11 January 2016 11:03PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (29)

You are viewing a single comment's thread. Show more comments above.

Comment author: HungryHobo 13 January 2016 12:22:56PM 0 points [-]

I'm not sure it's just a matter of what we know right now, it's mathematically provable that you can't create a program which can find all security flaws or prove any provable code so bugs are pretty much inevitable no matter how advanced we become.

Comment author: ZoltanBerrigomo 14 January 2016 03:37:00AM *  0 points [-]

The theorem you cite (provided I understood you correctly) does not preclude the possibility of checking whether a program written in a certain pre-specified format will have bugs. Bugs here are defined to be certain undesirable properties (e.g., looping forever, entering certain enumerated states, etc).

Baby versions of such tools (which automatically check whether your program will have certain properties from inspecting the code) already exist.

Comment author: HungryHobo 14 January 2016 12:56:12PM *  1 point [-]

If the language and format you're using is Turing complete then you can't write a program which can guarantee to find all bugs.

If you limit yourself to a subset of features such that you are no longer writing in a format which is turing complete then you may be able to have a program capable of automatically proving that code reliably.

Static analysis code does exist but still doesn't guarantee 100% accuracy and is generally limited to the first couple of levels of abstraction.

Keep in mind that if you want to be 100% certain of no bugs you also have to prove the compiler, the checker, any code your program interacts with and the hardware on which the code runs.

Comment author: ZoltanBerrigomo 15 January 2016 01:11:59AM 0 points [-]

If you limit yourself to a subset of features such that you are no longer writing in a format which is turing complete then you may be able to have a program capable of automatically proving that code reliably.

Right, that is what i meant.