Kaj_Sotala comments on Evaluating the feasibility of SI's plan - Less Wrong

Post author: JoshuaFox, 10 January 2013 08:17AM (25 points)

Comment author: Kaj_Sotala 10 January 2013 04:42:51PM 11 points

I don't know how to take a self-modifying heuristic soup in the process of going FOOM and make it Friendly. You don't know either, but the problem is, you don't know that you don't know. Or to be more precise, you don't share my epistemic reasons to expect that to be really difficult.

But the article didn't claim otherwise: it explicitly granted that if we presume a FOOM, then yes, trying to do anything with heuristic soups seems useless and just something that will end up killing us all. The disagreement is not on whether it's possible to make a heuristic AGI that FOOMs while remaining Friendly; the disagreement is on whether there will inevitably be a FOOM soon after the creation of the first AGI, and whether there could instead be a soft takeoff during which some people prevented those powerful-but-not-yet-superintelligent heuristic soups from killing everyone while others put the finishing touches on the AGI that could actually be trusted to remain Friendly when it finally did FOOM.

Comment author: torekp 21 January 2013 12:11:23AM 2 points

The disagreement is not on whether it's possible to make a heuristic AGI that FOOMs while remaining Friendly; the disagreement is on whether there will inevitably be a FOOM soon after the creation of the first AGI

Moreover, the very fact that an AGI is "heuristic soup" removes some of the key assumptions in the FOOM arguments that have been popular around here (Omohundro 2008). In particular, I doubt that a heuristic AGI is likely to be a "goal-seeking agent" in the rather precise sense of maximizing a utility function. It may not even approximate such behavior as closely as humans do. On the other hand, if a whole lot of radically different heuristic-based approaches are tried, the odds that at least one of them will be "motivated" to seek resources increase dramatically.

Comment author: Kaj_Sotala 21 January 2013 09:41:19AM 3 points

Note that Omohundro doesn't assume that the AGI would actually have a utility function: he only assumes that the AGI is capable of understanding the microeconomic argument for why it would be useful for it to act as if it did have one. His earlier 2007 paper is clearer on this point.
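Concretely, the microeconomic argument is the classic money pump. A quick toy sketch in Python (my own gloss, not code from Omohundro's papers): an agent with cyclic preferences will pay a small fee for every "upgrade" and can be drained indefinitely, whereas preferences representable by a utility function cannot be cycled.

```python
# Toy money pump (my gloss on the microeconomic argument, not code from
# Omohundro's papers). The agent prefers B to A, C to B, and A to C, and
# accepts any trade up its preference ordering for a one-cent fee.
prefers = {("A", "B"), ("B", "C"), ("C", "A")}  # (worse, better) pairs

def accepts_trade(have, offered):
    """The agent takes any trade to something it strictly prefers."""
    return (have, offered) in prefers

money, have = 100, "A"
cycle = {"A": "B", "B": "C", "C": "A"}
while money > 0 and accepts_trade(have, cycle[have]):
    have = cycle[have]
    money -= 1  # one-cent fee per trade; the cycle never terminates on its own

print(money)  # 0 -- the cyclic agent has been pumped dry
```

An agent smart enough to notice this vulnerability has an incentive to repair its preferences into something representable by a utility function, which is all the argument needs.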

Comment author: torekp 22 January 2013 01:26:20AM 0 points

Excellent point. But I think the assumptions about goal-directedness are still too strong. Omohundro writes:

Self-improving systems do not yet exist but we can predict how they might play chess. Initially, the rules of chess and the goal of becoming a good player would be supplied to the system in a formal language such as first order predicate logic. Using simple theorem proving, the system would try to achieve the specified goal by simulating games and studying them for regularities. [...] As its knowledge grew, it would begin doing “meta-search”, looking for theorems to prove about the game and discovering useful concepts such as “forking”. Using this new knowledge it would redesign its position representation and its strategy for learning from the game simulations.
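To make the quoted loop concrete, here is a minimal toy sketch (my own illustration, not code from the paper) of "simulating games and studying them for regularities": play many games, tally outcomes per opening move, then redesign the strategy around whatever the statistics recommend. The openings and win probabilities are invented stand-ins.

```python
# Toy sketch of the simulate-and-study loop (my illustration, not
# Omohundro's code). All names and numbers are invented stand-ins.
import random
from collections import defaultdict

OPENINGS = ["e4", "d4", "c4", "Nf3"]  # stand-in for a real move repertoire

def simulate_game(opening):
    """Placeholder for a full game simulation; returns True on a win.
    The win probability secretly depends on the opening -- this is the
    "regularity" the loop is supposed to discover."""
    bias = {"e4": 0.55, "d4": 0.50, "c4": 0.45, "Nf3": 0.50}
    return random.random() < bias[opening]

stats = defaultdict(lambda: [0, 0])  # opening -> [wins, games]
for _ in range(10000):
    opening = random.choice(OPENINGS)
    stats[opening][0] += int(simulate_game(opening))
    stats[opening][1] += 1

# "Studying the games for regularities" at its crudest: rank openings by
# empirical win rate and rebuild the playing strategy around the best one.
best = max(stats, key=lambda m: stats[m][0] / stats[m][1])
print("preferred opening:", best)  # almost certainly e4
```

Even this crude version exhibits the study-and-redesign pattern the quote describes, in a degenerate form.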

That's all well and good, but it doesn't show that the system has a "goal of winning chess games" in the intuitive sense of that phrase. Unlike a human being or other mammal or bird, say, its pursuit of this "goal" might turn out to be quite fragile. That is, changing the context slightly might have the system happily solving some other, mathematically similar problem, oblivious to the difference. It could dramatically fail to have robust semantics for key "goal" concepts like "winning at chess".

For example, a chess playing system might choose U to be the total number of games that it wins in a universe history.

That seems highly unlikely. More likely, the system would be programmed to maximize the percentage of its games that end in a win, conditional on the number of games it expects to play and the resources it has been given. It would care neither how many games were played nor how many resources it was allotted.
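The contrast is easy to make concrete. A toy sketch (my own illustration, not from the paper) of the two candidate utility functions: total wins grows without bound as more games, and hence more resources, become available, while win rate is invariant to the scale of play.

```python
# Two candidate utility functions for a chess system (my own toy
# illustration, not from Omohundro's paper).

def u_total_wins(wins, games):
    """Omohundro's example: U = total number of games won in a universe
    history. Strictly increasing in games won, so more games -- and the
    resources to play them -- always help."""
    return float(wins)

def u_win_rate(wins, games):
    """The alternative reading: U = fraction of played games won.
    Indifferent to the sheer number of games, so extra games and extra
    resources add nothing by themselves."""
    return wins / games if games > 0 else 0.0

# Doubling the scale of play doubles total wins but leaves win rate alone:
print(u_total_wins(80, 100), u_total_wins(160, 200))  # 80.0 160.0
print(u_win_rate(80, 100), u_win_rate(160, 200))      # 0.8 0.8
```

Only the first makes unbounded resource acquisition instrumentally useful, which is why the choice of U matters for the resource-seeking behavior at issue.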

On the other hand, Omohundro is making things too convenient for me by his choice of example. So let's say we have a system intended to play the stock market and to maximize profits for XYZ Corporation. Further let's suppose that the programmers do their best to make it true that the system has a robust semantics for the concept "maximize profits".

OK, so they try. The question is, do they succeed? Bear in mind, again, that we are considering a "heuristic soup" approach.

Comment author: Kaj_Sotala 22 January 2013 01:21:20PM 2 points

Even at the risk of sounding like someone who's arguing by definition, I don't think that a system without any strongly goal-directed behavior qualifies as an AGI; at best it's an early prototype on the way towards AGI. Even an oracle needs the goal of accurately answering questions in order to do anything useful, and proposals of "tool AGI" just sound incoherent to me.

Of course, that raises the question of whether a heuristic soup approach can be used to make strongly goal-directed AGI. It's clearly not impossible, given that humans are heuristic soups themselves; but it might be arbitrarily difficult, and it could turn out that a more purely math-based AGI was far easier to make both tractable and goal-oriented. Or it could turn out that it's impossible to make a tractable and goal-oriented AGI by the math route, and the heuristic soup approach worked much better. I don't think anybody really knows the answer to that, at this point, though a lot of people have strong opinions one way or the other.

Comment author: Wei_Dai 10 January 2013 11:41:37PM 0 points

it explicitly granted that if we presume a FOOM, then yes, trying to do anything with heuristic soups seems useless and just something that will end up killing us all.

Maybe it shouldn't be granted so readily?

and whether there could instead be a soft takeoff during which some people prevented those powerful-but-not-yet-superintelligent heuristic soups from killing everyone while others put the finishing touches on the AGI that could actually be trusted to remain Friendly when it finally did FOOM.

I'm not sure how this could work, if provably-Friendly AI has a significant speed disadvantage, as the OP argues. You can develop all kinds of safety "plugins" for heuristic AIs, but if some people just don't care about the survival of humans or of humane values (as we understand them), then they're not going to use your ideas.

Comment author: JoshuaFox 11 January 2013 09:34:27AM 2 points

provably-Friendly AI has a significant speed disadvantage, as the OP argues.

Yes, the OP made that point. But I have heard the opposite from SI-ers -- or at least they said that in the future SI's research may lead to implementation secrets that should not be shared with others. I didn't understand why that should be.

Comment author: Wei_Dai 11 January 2013 01:16:59PM 4 points

or at least they said that in the future SI's research may lead to implementation secrets that should not be shared with others. I didn't understand why that should be.

It seems pretty understandable to me... SI may end up having some insights that could speed up UFAI progress if made public, and at the same time provably-Friendly AI may be much more difficult than UFAI. For example, suppose that in order to build a provably-Friendly AI, you first have to understand how to build an AI that works with an arbitrary utility function, and it then takes much longer to figure out how to specify the correct utility function.