Nick_Beckstead comments on Reply to Holden on 'Tool AI' - Less Wrong

94 Post author: Eliezer_Yudkowsky 12 June 2012 06:00PM




Comment author: Nick_Beckstead 12 June 2012 05:32:20PM 1 point [-]

For context, I pointed this out because it looks like Eliezer is going for the first reading and criticizing that.

Comment author: Eliezer_Yudkowsky 12 June 2012 06:04:15PM 7 points [-]

Nope, I was assuming the second reading. The first reading is too implausible to be considered at all.

Comment author: Nick_Beckstead 12 June 2012 06:13:22PM 5 points [-]

Good. But now I find this response less compelling:

If Holden says there's 90% doom probability left over no matter what sane intelligent people do (all of which goes away if you just build Google Maps AGI, but leave that aside for now) I would ask him what he knows now, in advance, that all those sane intelligent people will miss. I don't see how you could (well-justifiedly) access that epistemic state. [emphasis added]

Holden might think that these folks will be of the opinion, "I can't see an error, but I'm really not confident that there isn't an error." He doesn't have to think that he knows something they don't. In particular, he doesn't have to think that there is some special failure mode he's thought of that none of them have thought of.

Comment author: Eliezer_Yudkowsky 12 June 2012 08:58:37PM 4 points [-]

Nonetheless, where is he getting the 90% doom probability from?

Comment author: Nick_Beckstead 12 June 2012 09:21:34PM *  1 point [-]

I'm with you, 90% seems too high given the evidence he cites or any evidence I know of.

Comment author: Arepo 13 June 2012 12:01:52PM 4 points [-]

Assuming you accept the reasoning, 90% seems quite generous to me. What percentage of complex computer programmes when run for the first time exhibit behaviour the programmers hadn't anticipated? I don't have much of an idea, but my guess would be close to 100. If so, the question is how likely unexpected behaviour is to be fatal. For any programme that will eventually gain access to the world at large and quickly become AI++, that seems (again, no data to back this up - just an intuitive guess) pretty likely, perhaps almost certain.

For any parameter of human comfort (eg 293 degrees Kelvin, 60% water, 40-hour working weeks), a misplaced decimal point seems like it would destroy the economy at best and life on earth at worst.

If Holden’s criticism is appropriate, the best response might be to look for other options rather than making a doomed effort to make FAI – for example trying to prevent the development of AI anywhere on earth, at least until we can self-improve enough to keep up with it. That might have a low probability of success, but if FAI has sufficiently low probability, it would still seem like a better bet.

Comment author: TheOtherDave 13 June 2012 01:30:06PM 4 points [-]

You know, the idea that SI might at any moment devote itself to suppressing AI research is one that pops up from time to time, the logic pretty much being what you suggest here, and until this moment I have always treated it as a kind of tongue-in-cheek dig at SI.

I have only just now come to realize that the number of people (who are not themselves affiliated with SI) who really do seem to consider suppressing AI research to be a reasonable course of action given the ideas discussed on this forum has a much broader implication in terms of the social consequences of these ideas. That is, I've only just now come to realize that what the community of readers does is just as important, if not more so, than what SI does.

I am now becoming genuinely concerned that, by participating in a forum that encourages people to take seriously ideas that might lead them to actively suppress AI research, I might be doing more harm than good.

I'll have to think about that a bit more.

Arepo, this is not particularly directed at you; you just happen to be the data point that caused this realization to cross an activation threshold.

Comment author: Bruno_Coelho 15 June 2012 01:09:38AM 0 points [-]

People with similar backgrounds are entering the AI field because they want to reduce x-risks, so it's not obvious this is happening. If safety-guided research suppresses AI research, then so be it. Extremely rapid advance per se is not good if the consequence is extinction.

Comment author: shminux 13 June 2012 04:12:15PM 0 points [-]

I am now becoming genuinely concerned that, by participating in a forum that encourages people to take seriously ideas that might lead them to actively suppress AI research, I might be doing more harm than good.

Assuming that you think that more AI research is good, wouldn't adding your voice to those who advocate it here be a good thing? It's not like your exalted position and towering authority lends credence to a contrary opinion just because you mention it.

Comment author: TheOtherDave 13 June 2012 04:25:41PM 1 point [-]

I think better AI (of the can-be-engineered-given-what-we-know-today, non-generally-superhuman sort) is good, and I suspect that more AI research is the most reliable way to get it.

I agree that my exalted position and towering authority doesn't lend credence to contrary opinions I mention.

It's not clear to me whether advocating AI research here would be a better thing than other options, though it might be.

Comment author: falenas108 13 June 2012 04:16:57PM 3 points [-]

What percentage of complex computer programmes when run for the first time exhibit behaviour the programmers hadn't anticipated? I don't have much of an idea, but my guess would be close to 100.

That's for normal programs, where errors don't matter. If you look at ones where people carefully look over the code because lives are at stake (like NASA rockets), then you'll have a better estimate.

Probably still not accurate, because much more is at stake for AI than just a few lives, but it will be closer.

Comment author: TheOtherDave 13 June 2012 04:28:25PM 2 points [-]

I suspect that unpacking "run a program for the first time" more precisely would be useful here; it's not clear to me that everyone involved in the conversation has the same referents for it.

Comment author: Nick_Beckstead 13 June 2012 06:59:23PM *  1 point [-]

This. I see that if you have one and only one chance to push the Big Red Button and you're not allowed to use any preliminary testing of components or boxing strategies (or you're confident that those will never work) and you don't get most of the experts to agree that it is safe, then 90% is more plausible. If you envision more of these extras to make it safer--which seems like the relevant thing to envision--90% seems too high to me.

Comment author: DanArmak 13 June 2012 08:29:43PM 1 point [-]

Surely NASA code is thoroughly tested in simulation runs. It's the equivalent of having a known-perfect method of boxing an AI.

Comment author: asparisi 14 June 2012 11:18:31PM 0 points [-]

Huh. This brings up the question of whether or not it would be possible to simulate the AGI code in a test-run without regular risks. Maybe create some failsafe that is invisible to the AGI that destroys it if it is "let out of the box" or (to incorporate Holden's suggestion, since it just came to me) having a "tool mode" where the AGI's agent-properties (decision making, goal setting, etc.) are non-functional.

Comment author: Eliezer_Yudkowsky 14 June 2012 09:26:34PM -1 points [-]

But NASA code can't check itself - there's no attempt at having an AI go over it.

Comment author: DanArmak 15 June 2012 06:45:40AM 0 points [-]

Yes, but even ordinary simulation testing produces software that's much better on its first real run than software that has never been run at all.

Comment author: Randaly 13 June 2012 09:12:48PM 0 points [-]

The last three versions of [NASA's] program, each 420,000 lines long, had just one error each. The last 11 versions of this software had a total of 17 errors.

From They Write the Right Stuff

Note, however, that a) this is after many years of debugging from practice, b) NASA was able to safely 'box' their software, and c) even one error, if in the wrong place, would be really bad.

Comment author: Strange7 13 June 2012 09:30:37PM 0 points [-]

How hard would it actually be to "box" an AI that's effectively had its brain sliced up into very small chunks?

A program could, if it was important enough and people were willing to take the time to do so, be broken down into pieces and each of the pieces tested separately. Any given module has particular sorts of input it's designed to receive, and particular sorts of output it's supposed to pass on to the next module.

Testers give the module different combinations of valid inputs and try to get it to produce an invalid output. When they succeed, either the module is revised and the testing process on that module starts over from the beginning, or the definition of valid inputs is narrowed, which changes the limits for valid outputs and forces some other module further back to be redesigned and retested. A higher-level analysis, which is strictly theoretical, also tries to come up with sequences of valid inputs and outputs which could lead to a bad outcome.

Eventually, after years of work and countless iterations of throwing out massive bodies of work to start over, you get a system which is very tightly specified to be safe, and meets those specs under all conceivable conditions, but has never actually been plugged in and run as a whole.
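The test-against-the-contract loop described here can be sketched in a few lines. This is a toy illustration only, not anyone's actual verification process; the module, its input/output ranges, and all names (`clamp_reading`, `check_module_contract`) are invented for the example.

```python
import random

# Hypothetical module under test: maps a raw sensor reading into a safe range.
# Its contract: for any float input in [-1000.0, 1000.0] (the valid inputs),
# the output must lie in [0.0, 100.0] (the valid outputs).

def clamp_reading(x: float) -> float:
    """Module under test: clamp a raw reading into the 0-100 safe range."""
    return min(100.0, max(0.0, x))

def check_module_contract(module, trials: int = 10_000) -> bool:
    """Feed the module many random valid inputs; flag any invalid output."""
    rng = random.Random(0)  # fixed seed so any failure is reproducible
    for _ in range(trials):
        x = rng.uniform(-1000.0, 1000.0)   # a valid input
        y = module(x)
        if not (0.0 <= y <= 100.0):        # invalid output: contract broken
            return False
    return True

print(check_module_contract(clamp_reading))  # → True (no violation found)
```

In a real high-assurance process each module would get this treatment plus the "strictly theoretical" higher-level analysis of how valid outputs compose across modules, which is exactly the part random testing cannot cover.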

Comment author: TheOtherDave 13 June 2012 10:09:31PM 0 points [-]

The conceptually tricky part of this, of course, (as opposed to merely difficult to implement) is getting from "these pieces are individually certified to exhibit these behaviors" to "the system as a whole is certified to exhibit these behaviors"

Comment author: JenniferRM 12 June 2012 08:19:10PM *  4 points [-]

It seems like this is turning into a statement about human technical politics.

Yes, I am highly confident that this is in fact safe.

No, I couldn't identify any way in which this will kill us, though that doesn't mean it won't kill us.

The latter is stereotypically something a cautious engineer in cover-your-ass mode is likely to say no matter how much quality assurance has happened. The former is something that an executive in selling-to-investors-and-the-press mode is likely to say once they estimate it will have better outcomes than saying something else to the investors and the press, perhaps just because they know of something worse that will happen outside their control that seems very likely to be irreversible and less likely to be good. Between these two stereotypes lies a sort of "reasonable rationalist speaking honestly but pragmatically"?

This is a hard area to speak about clearly between individuals without significant interpersonal calibration on the functional meaning of "expert", because you run into Dunning-Kruger effects if you aren't careful and a double illusion of transparency can prevent you from even noticing the miscommunication.

There are conversations that can allow specific people to negotiate a common definition with illustrations grounded in personal experience here, but they take many minutes or hours, and are basically a person-to-person protocol. The issue is doubly hard with a general audience because wildly different gut reactions will be elicited and there will be bad faith participation by at least some people, and so on. Rocket scientists get this wrong sometimes. It is a hard problem.