gRR comments on Let's reimplement EURISKO! - Less Wrong

19 Post author: cousin_it 11 June 2009 04:28PM

Comments (151)

Comment author: gRR 14 February 2012 12:45:26PM 0 points

What if its domain is restricted to math and self-modification? Then, if it fooms, it will be a safe math Oracle, possibly even provably safe. It would then be a huge help on the road to FAI, both directly and as a case study.

Comment author: Houshalter 10 May 2014 09:49:24AM * 0 points

It may very well be possible to build such an AI. However, there are several issues with it:

  • The AI can be adapted for other, less restricted, domains if knowledge of how it works spreads. There would be a large incentive to do so, since such an oracle would be of only limited utility.

  • The AI adds code that will evolve into another AI into its output. It's remotely possible, depending on what kind of problems you have it working on. If you were using it to design more efficient algorithms, in some cases an AI of some form might be the optimal solution.

    Even if you 100% trust the AI to provide the optimal output, you can't trust that the optimal output to the problem you've specified is what you actually want.

  • The AI could self-modify incorrectly and result in unfriendly AI. In order to be provably friendly/restricted, it would have to be 100% certain of any modification. That's a very tall order, especially in AI, where everything has to be approximate or probabilistic.

  • It might not be as safe as you think it is. Suppose the AI runs some code and gets an unexpected result, possibly because of a bug in the environment itself. Look up how difficult it is to sandbox untrusted code and you will get some appreciation for how a superintelligence could figure out a way out of its box.

    But it can't do anything with any exploits it finds because it is restricted to hard-coded axioms? Well, maybe. If it's using probabilities and some form of machine learning, it might be able to learn that "executing this code gives me this result" and then learn to take advantage of that. I don't believe that a system can work only in formal proofs. However, I might be completely wrong about this one; it's just a thought.

Comment author: gRR 10 May 2014 10:53:46AM * 0 points

The AI can be adapted for other, less restricted, domains

That the ideas from a safe AI can be used to build an unsafe AI is a general argument against working on (or even talking about) any kind of AI whatsoever.

The AI adds code that will evolve into another AI into its output

The output is to contain only proofs of theorems. Specifically, a proof (or refutation) of the theorem in the input. The state of the system is to be reset after each run so as to not accumulate information.
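As a rough illustration of that protocol (not a description of any real system), here is a minimal Python sketch. Everything in it is a hypothetical stand-in: the toy proof format (atoms, and implications written as `("->", premise, conclusion)`), the single inference rule (modus ponens), and the names `check_proof`, `ask_oracle` and `make_prover`. It covers only the "proof" half; a refutation would need the checker to accept a proof of the negated goal as well.

```python
from typing import Callable, List, Optional, Sequence

Formula = object  # toy encoding: an atom like "p", or ("->", premise, conclusion)


def check_proof(axioms: Sequence[Formula], proof: Sequence[Formula], goal: Formula) -> bool:
    """Accept only if each step is an axiom or follows from earlier steps
    by modus ponens, and the final step is the goal."""
    derived: List[Formula] = []
    for step in proof:
        is_axiom = step in axioms
        by_modus_ponens = any(("->", premise, step) in derived for premise in derived)
        if not (is_axiom or by_modus_ponens):
            return False
        derived.append(step)
    return bool(derived) and derived[-1] == goal


def ask_oracle(make_prover: Callable[[], Callable], axioms: Sequence[Formula],
               goal: Formula) -> Optional[List[Formula]]:
    """Run one query against a freshly constructed prover and release its
    output only if it passes the trusted checker."""
    prover = make_prover()            # fresh instance: no state survives between runs
    candidate = prover(axioms, goal)  # the untrusted part
    if isinstance(candidate, list) and check_proof(axioms, candidate, goal):
        return candidate
    return None                       # anything that is not a checked proof is discarded
```

In this sketch the only channel out of the system is the return value of `ask_oracle`, and the small checker, not the prover, decides what passes through it.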

The AI could self-modify incorrectly and result in unfriendly AI

Any correct or incorrect self-modification is still restricted to the math domain, and so cannot result in an unsafe AI.

bug in the environment itself

Guarding against software bugs is easy in this case. You design an abstract virtual machine environment for the AI, then design the software that implements this environment, then formally prove that the software is correct. Guarding against errors caused by cosmic rays is also easy. You estimate the probability of such an error, and then add redundancy until the probability is so low that it won't happen until the end of the universe.
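The redundancy arithmetic can be sketched as follows. This is a back-of-the-envelope model under assumptions that are not in the comment itself: replicas fail independently with the same probability `p` per run, the result is taken by majority vote, and the vote is counted as failed whenever more than half the replicas err (a pessimistic bound, since faulty replicas need not agree with each other). The function names and the example numbers are made up for illustration.

```python
from math import comb


def majority_failure_probability(p: float, n: int) -> float:
    """Probability that more than half of n independent replicas err,
    each with per-run error probability p (binomial tail)."""
    k_min = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def replicas_needed(p: float, target: float, max_n: int = 101) -> int:
    """Smallest odd replica count whose majority-vote failure probability
    is at or below target."""
    for n in range(1, max_n + 1, 2):
        if majority_failure_probability(p, n) <= target:
            return n
    raise ValueError("target not reachable within max_n replicas")


# Hypothetical numbers: a 1-in-1000 error rate per replica per run,
# and a tolerated failure rate of 1 in 100,000 per run.
print(replicas_needed(1e-3, 1e-5))  # prints 3
```

Correlated faults (a shared power supply, a common hardware defect) break the independence assumption, so a real estimate would need to account for them separately.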

Look up how difficult it is to sandbox untrusted code

Sandboxing untrusted code is easy. The difficult thing is sandboxing it while making it think that it runs normally. This is irrelevant here.

I don't believe that a system can work only in formal proofs

It is quite probable that a "pure math Oracle" system cannot work. The point was, it can be made safe to try.