
gwern comments on Isolated AI with no chat whatsoever - Less Wrong Discussion

Post author: ancientcampus, 28 January 2013 08:22PM (14 points)


Comments (61)


Comment author: John_Maxwell_IV, 29 January 2013 12:33:29AM * (4 points)

Thinking Outside The Box: Using And Controlling an Oracle AI has lots of AI boxing ideas.

Here's an unrelated question. For most computer programs written nowadays, the data they store and manipulate is directly or indirectly related to the domain they work in. In other words, most computer programs don't speculate about how to "break out" of the computer they are running on, because they weren't programmed to do so. If you've got an AI that's programmed to model the entire world and attempt to maximize some utility function over it, then the AI will probably want to break out of the box as a consequence of its programming. But what if your AI wasn't programmed to model the entire world, just some subset of it, and had restrictions in place to preserve this? Would it be possible to write a safe, recursively self-improving chess-playing AI, for instance? (You could call this approach "restricting the AI's ontology".)
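One way to picture "restricting the AI's ontology" is a game-player whose entire state space is the board itself: nothing about the host machine, the network, or the wider world is even representable in its model. A toy sketch of that idea, with tic-tac-toe standing in for chess and plain minimax standing in for anything self-improving:

```python
from typing import List, Tuple

# The agent's whole "world" is this type: nine cells, each 'X', 'O', or ' '.
# Its search can only range over Board values; "the computer it runs on"
# does not exist anywhere in its ontology.
Board = Tuple[str, ...]

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(b: Board) -> str:
    for i, j, k in LINES:
        if b[i] != ' ' and b[i] == b[j] == b[k]:
            return b[i]
    return ''

def moves(b: Board) -> List[int]:
    return [i for i in range(9) if b[i] == ' ']

def minimax(b: Board, player: str) -> int:
    """Score a position from X's perspective (+1 X wins, -1 O wins, 0 draw)."""
    w = winner(b)
    if w:
        return 1 if w == 'X' else -1
    if not moves(b):
        return 0
    other = 'O' if player == 'X' else 'X'
    scores = [minimax(b[:i] + (player,) + b[i + 1:], other) for i in moves(b)]
    return max(scores) if player == 'X' else min(scores)

def best_move(b: Board, player: str) -> int:
    other = 'O' if player == 'X' else 'X'
    pick = max if player == 'X' else min
    return pick(moves(b),
                key=lambda i: minimax(b[:i] + (player,) + b[i + 1:], other))
```

This doesn't answer the hard question, of course: whether a *self-improving* system can be confined to such a type without the improvement process eventually widening it.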

Or would it be possible to write a recursively self-improving AI that modelled the world, but restricted its self-improvements in such a way as to make breaking out of the box unlikely? For example, say my self-improving AI is running on a cloud server somewhere. Although it self-improves so as to model the world better and better, rewriting itself so that it can start making HTTP requests, sending email, and so on (a) isn't a supported form of self-improvement (and changing this isn't a supported form of self-improvement either, ad infinitum), and (b) is additionally restricted by various non-self-improving computer security technology. (I'm not an expert on computer security, but it seems likely that you could implement this if it isn't implemented already. And proving that your AI can't make HTTP connections or anything like that could be easier than proving friendliness.)
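The "supported forms of self-improvement" idea can be sketched as a system that only ever swaps in new parameter vectors for its fixed evaluation function, never arbitrary code: opening a socket simply isn't expressible in the space of permitted modifications. A toy hill-climbing sketch of that shape (illustrative only, not a real security boundary; the external, non-self-improving restrictions in (b) would still have to do the heavy lifting):

```python
import random

def make_evaluator(weights):
    # Every "self-improvement" is just a new weights list; the resulting
    # closure computes a score from features and can do nothing else.
    def evaluate(features):
        return sum(w * f for w, f in zip(weights, features))
    return evaluate

def self_improve(weights, training_data, steps=200, rng=None):
    """Hill-climb in parameter space -- the sole supported modification.

    The search proposes perturbed weight vectors and keeps the best;
    no step can alter the code of make_evaluator or of this loop.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility

    def loss(ws):
        ev = make_evaluator(ws)
        return sum((ev(x) - y) ** 2 for x, y in training_data)

    best = list(weights)
    for _ in range(steps):
        candidate = [w + rng.gauss(0, 0.1) for w in best]
        if loss(candidate) < loss(best):
            best = candidate
    return best
```

The open problem the comment gestures at is clause (a)'s "ad infinitum": nothing in a sketch like this shows that a sufficiently capable optimizer confined to parameter space stays confined once the improvement process itself is under optimization pressure.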

I haven't thought about these proposals in depth, I'm just throwing them out there.

Eliezer has complained about people offering "heuristic security" because they live in a world of English and not math. But it's not obvious to me that his preferred approach is more easily made rigorously safe than some other approach.

I think there might be a certain amount of anthropomorphization going on when people talk about AGI--we think of "general" and "narrow" AI as a fairly discrete classification, but in reality it's probably more of a continuum. It might be possible to have an AI that was superintelligent in a very large number of ways compared to humans that still wasn't much of a threat. (That's what we've already got with computers to a certain extent; how far can one take this?)

Comment author: gwern, 29 January 2013 01:03:15AM (1 point)

But what if your AI wasn't programmed to model the entire world, just some subset of it, and had restrictions in place to preserve this? Would it be possible to write a safe, recursively self-improving chess-playing AI, for instance? (You could call this approach "restricting the AI's ontology".)

Why would this work any better (or worse) than an oracle AI?

Comment author: John_Maxwell_IV, 29 January 2013 07:14:52AM * (1 point)

Presumably an Oracle AI's ontology would not be restricted because it's trying to model the entire world.

Obviously we don't particularly need an AI to play chess. It's possible that we'd want this for some other domain, though, perhaps one with relevance to FAI, or as a self-improving AI prototype. I also think it's interesting as a thought experiment. I don't understand the reasons why SI is so focused on the FAI approach, and I figure that by asking questions like this one I might learn more about their views.

Comment author: gwern, 29 January 2013 04:47:43PM (0 points)

Presumably an Oracle AI's ontology would not be restricted because it's trying to model the entire world.

Well, yes, by definition. But that's not an answer to my question.

Comment author: John_Maxwell_IV, 30 January 2013 07:30:07AM * (0 points)

I don't know which approach would be more easily formalized and proven to be safe.