How much friendliness is enough?

7 Post author: cousin_it 27 March 2011 10:27AM

According to Eliezer, making AI safe requires solving two problems:

1) Formalize a utility function whose fulfillment would constitute "good" to us. CEV is intended as a step toward that.

2) Invent a way to code an AI so that it's mathematically guaranteed not to change its goals after many cycles of self-improvement, negotiations etc. TDT is intended as a step toward that.

It is obvious to me that (2) must be solved, but I'm not sure about (1). The problem in (1) is that we're asked to formalize a whole lot of things that don't look like they should be necessary. If the AI is tasked with building a faster and more efficient airplane, does it really need to understand that humans don't like to be bored?

To put the question sharply, which of the following looks easier to formalize:

a) Please output a proof of the Riemann hypothesis, and please don't get out of your box along the way.

b) Please do whatever the CEV of humanity wants.

Note that I'm not asking if (a) is easy in absolute terms, only if it's easier than (b). If you disagree that (a) looks easier than (b), why?

Comments (75)

Comment author: atucker 27 March 2011 02:35:45PM 5 points [-]

I think that a is just a special case of a narrow AI.

Like, GAI is dangerous because it can do anything, and would probably ruin this section of the universe for us if its goals were misaligned with ours.

I'm not sure if GAI is needed to do highly domain-specific tasks like a.

Comment author: cousin_it 27 March 2011 02:44:27PM *  3 points [-]

Yeah, this looks right. I guess you could rephrase my post as saying that narrow AI could solve most problems we'd want an AI to solve, but with less danger than the designs discussed on LW (e.g. UDT over Tegmark multiverse).

Comment author: Vladimir_Nesov 27 March 2011 03:13:32PM *  4 points [-]

That's what evolution was saying. Since recently I expect narrow AI developments to be directly on track to an eventual intelligence explosion.

Comment author: cousin_it 27 March 2011 03:49:27PM 6 points [-]

What narrow AI developments do you have in mind?

Comment author: Dr_Manhattan 28 March 2011 02:06:17PM 0 points [-]

That's what evolution was saying

Who's 'evolution'?

Comment author: Dr_Manhattan 28 March 2011 05:07:04PM 1 point [-]

Apparently whoever downvoted understood what Vladimir was saying, can you please explain? I can't parse "what evolution was saying".

Comment author: FAWS 28 March 2011 05:34:32PM *  7 points [-]

Vladimir's writing style has high information density, but he leaves the work of unpacking to the reader. In this context "that's what evolution was saying" seems to be a shorthand for something like:

Evolution optimized for goals that did not necessarily imply general intelligence, nor did evolution ever anticipate creating a general intelligence. Nevertheless a general intelligence appeared as the result of evolution's optimizations. By analogy we should not be too sure about narrow AI developments not leading to AGI.

Comment author: Dr_Manhattan 28 March 2011 08:54:56PM 3 points [-]

Ah. This seems about right, though I think Vladimir's statement was either denser or more ambiguous than usual.

Comment author: [deleted] 27 March 2011 06:51:30PM 10 points [-]

It strikes me that this is the wrong way to look at the issue.

The problem scenario is if someone, anywhere, develops a powerful AGI that isn't safe for humanity. How do you stop the invention and proliferation of an unsafe technology? Well, you can either try to prevent anybody from building an AI without authorization; or you can try to make your own powerful friendly AGI before anybody else gets unfriendly AGI. The latter has the advantage that you only have to be really good at technology, you don't have to enforce an unenforceable worldwide law.

Building an AI that doesn't want to get out of its box doesn't solve the problem that somewhere, somebody may build an AI that does want to get out of its box.

Comment author: Wei_Dai 29 March 2011 08:32:14PM *  3 points [-]

It might be worth noting that I often phrase questions as "how would we design an FAI to think about that" not because I want to build an FAI, but because I want the answer to some philosophical question for myself, and phrasing it in terms of FAI seems to be (1) an extremely productive way of framing the problem, and (2) generates interest among those who have good philosophy skills and are already interested in FAI.

ETA: Even if we don't build an FAI, eventually humanity might have god-like powers, and we'd need to solve those problems to figure out what we want to do.

Comment author: XiXiDu 27 March 2011 02:12:26PM 4 points [-]

If you figured out artificial general intelligence that is capable of explosive recursive self-improvement, and know how to achieve goal-stability, and know how to constrain it, then you ought to concentrate on taking over the universe, both because of the multiple discovery hypothesis and because you can't expect other humans to be friendly.

Comment author: CuSithBell 28 March 2011 03:24:32PM 3 points [-]

Why is this downvoted? Isn't this one of the central theses of FAI?

Comment author: XiXiDu 28 March 2011 05:49:40PM 2 points [-]

Why is this downvoted? Isn't this one of the central theses of FAI?

Possible reasons:

  • I implicitly differentiated between AGI in general and the ability to recursively self-improve (which is usually lumped together on LW). I did this on purpose.
  • I included the ability to constrain such an AGI as a prerequisite to run it. I did this on purpose because friendliness is not enough if the AGI is free to hunt for vast utilities regardless of tiny probabilities. Even an AGI equipped with perfect human-friendliness might try to hack the Matrix to support 3^^^^3 people rather than just a galactic civilisation. This problem isn't solved and therefore, as suggested by Yudkowsky, it needs to be constrained using a "hack".
  • I used the phrasing "taking over the universe" which is badly received yet factually correct if you got a fooming AI and want to use it to spawn a positive Singularity.
  • I said that you can't expect other humans to be friendly, although unfriendliness is not the biggest problem; stupidity is.
  • I said one "ought" to concentrate on taking over the universe. I said this on purpose to highlight that I actually believe that to be the only sensible thing to do once fooming AI is possible because if you waste too much time with spatiotemporal bounded versions then someone who is ignorant of friendliness will launch one that isn't constrained that way.
  • The comment might have been deemed unhelpful because it added nothing new to the debate.

That's my analysis of why the comment might have initially been downvoted. Sadly most people who downvote don't explain themselves, but I decided to stop complaining about that recently.

Comment author: CuSithBell 28 March 2011 07:39:49PM 1 point [-]

Awesome, thanks for the response. Do you know if there's been any progress on the "expected utility maximization makes you do arbitrarily stupid things that won't work" problem?

Though, stupidity is a form of un-Friendliness, isn't it?

Comment author: XiXiDu 29 March 2011 02:51:15PM 0 points [-]

I only found out about the formalized version of that dilemma around a week ago. As far as I can tell it has not been shown that giving in to a Pascal's mugging scenario would be irrational. It is merely our intuition that makes us believe that something is wrong with it. I am currently far too uneducated to talk about this in detail. What I am worried about is that basically all probability/utility calculations could be put into the same category (e.g. working to mitigate low-probability existential risks), where do you draw the line? You can be your own mugger if you weigh in enough expected utility to justify taking extreme risks.

Comment author: jimrandomh 29 March 2011 04:04:47PM *  0 points [-]

What I am worried about is that basically all probability/utility calculations could be put into the same category (e.g. working to mitigate low-probability existential risks), where do you draw the line?

There's a formalization I gave earlier that distinguishes Pascal's Mugging from problems that just have big numbers in them. It's not enough to have a really big utility; a Pascal's Mugging is when you have a statement provided by another agent, such that just saying a bigger number (without providing additional evidence) increases what you think your expected utility is for some action, without bound.

This question has resurfaced enough times that I'm starting to think I ought to expand that into an article.
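The distinction can be sketched in code. This is a toy illustration of the behavior described above, not jimrandomh's actual formalism; the names and the decay rate in `credence` are my own assumptions, chosen only to show the qualitative signature.

```python
# A naive expected-utility maximizer facing a mugger who may name any number N.
# If the agent's credence in the mugger's claim shrinks slower than N grows,
# then the expected utility of paying grows without bound in N -- the
# Pascal's Mugging signature: a *stated* number alone, with no additional
# evidence, moves the agent's expectation.

import math

def credence(n: int) -> float:
    """Toy complexity-penalized credence: naming a bigger number costs only
    ~log(n) bits of description, so credence falls like 1/(1 + log n)."""
    return 0.01 / (1.0 + math.log(n))

def expected_utility_of_paying(n: int, cost: float = 1.0) -> float:
    """Expected utility of paying the mugger who promises utility n."""
    return credence(n) * n - cost

# The mugger just names ever-bigger numbers; no new evidence is offered,
# yet the naive expected utility keeps increasing.
for n in [10**3, 10**6, 10**9]:
    print(n, expected_utility_of_paying(n))
```

Under this sketch, either a bounded utility function or a credence that decays at least as fast as the promised utility grows would cap the expectation; the problem cases are exactly the agents where neither holds.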

Comment author: JamesAndrix 17 June 2011 09:06:32PM 0 points [-]

Minor correction: It may need a hack if it remains unsolved.

Comment author: JoshuaZ 27 March 2011 04:31:00PM 2 points [-]

I'm not sure about the Riemann hypothesis, since there's a likely chance that RH is undecidable in ZFC. But this might be safer if one adds a time limit to when one wants the answer by.

But simply in terms of specification I agree that formalizing "don't get out of your box" is probably easier than formalizing what all of humanity wants.

Comment author: AlephNeil 27 March 2011 06:38:18PM 1 point [-]

a likely chance that RH is undecidable in ZFC

Why? I know certain people (i.e. Chaitin, who's a bit cranky in this regard) have toyed around with the idea, but is there any reason to believe it?

Comment author: JoshuaZ 27 March 2011 06:59:05PM *  2 points [-]

Why? I know certain people (i.e. Chaitin, who's a bit cranky in this regard) have toyed around with the idea, but is there any reason to believe it?

Not any strong one. We do know that some systems similar to the integers have analogs of RH that are false, but for most analogs (such as the finite field case) it seems to be true. That's very weak evidence for undecidability. However, I was thinking more in contrast to something like the classification of finite simple groups as of 1975, where there was a general program of what to do that had no obvious massive obstructions.

Comment author: ciphergoth 27 March 2011 10:50:29AM 5 points [-]

Don't preemptively refer to anyone who disagrees with you as brainwashed.

Comment author: Johnicholas 28 March 2011 12:39:57AM 3 points [-]

The primary task that EY and SIAI have in mind for Friendly AI is "take over the world". (By the way, I think this is utterly foolish, exactly the sort of appealing paradox (like "warring for peace") that can nerd-snipe the best of us.)

To some extent technology itself (lithography, for example) is actually Safe technology (or BelievedSafe technology). As part of the development of the technology, we also develop the safety procedures around it. The questions and problems about "how should you correctly draw up a contract with the devil" come from:

  1. Explicitly pursuing recursive self-improvement, that is, self-modifying code where every potentially limiting component is on the table to be redesigned.
  2. Using a theological-reasoning strategy regarding the fixpoint of the self-modifications.

If you do not pursue no-holds-barred recursive self-improvement so vigorously, then your task of developing a Riemann-Hypothesis-machine doesn't have to involve theological reasoning at all. Indeed, I'm sure there are many mathematicians and computer scientists who have worked on RH machines, and they have not had problems with their creations running amok.

Comment author: cousin_it 29 March 2011 11:00:43AM 1 point [-]

The primary task that EY and SIAI have in mind for Friendly AI is "take over the world". (By the way, I think this is utterly foolish, exactly the sort of appealing paradox (like "warring for peace") that can nerd-snipe the best of us.)

Could you explain this in more detail?

Comment author: Johnicholas 29 March 2011 07:34:00PM *  4 points [-]

As I understand it, EY worked through a chain of reasoning about a decade ago, in his book "Creating Friendly AI". The chain of reasoning is long and I won't attempt to recap it here, but there are two relevant conclusions.

First, that self-improving artificial intelligences are dangerous, and that projects to build self-improving artificial intelligence, or general intelligence that might in principle become self-modifying (such as Goertzel's), are increasing existential risk. Second, that the primary defense against self-improving artificial intelligences is a Friendly self-improving artificial intelligence, and so, in order to reduce existential risk, EY must work on developing (a restricted subset of) self-improving artificial intelligence.

This seems nigh-paradoxical (and unnecessarily dramatic) to me - you should not do <dangerous thing>, and yet EY must do <dangerous thing>. As I said before, this "cancel infinities against one another" sort of thinking (another example might be the MAD doctrine) has enormous appeal to a certain (geeky) kind of person. The phenomenon is named "nerd-sniping" in the xkcd comic: http://xkcd.com/356/

Rather than pursuing Friendly AGI vigorously as last/best/only hope for humanity, we should do at least two things: 1. Look hard for errors in the long chain of reasoning that led to these peculiar conclusions, on the grounds that reality rarely calls for that kind of nigh-paradoxical action, and it's far more likely that either all AI development is generally a good thing for existential risks, or all AI development is a generally bad thing for existential risks - EY shouldn't get any special AI-development license. 2. Look hard for more choices - for example, building entities that are very capable at defeating rogue Unfriendly AGI takeoffs, and yet which are not themselves a threat to humanity in general, nor prone to hard takeoffs. It may be difficult to imagine such entities, but all the reduce-existential-risk tasks are very difficult.

Comment author: TheOtherDave 29 March 2011 08:21:53PM 3 points [-]

reality rarely calls for that kind of nigh-paradoxical action

In my experience, reality frequently includes scenarios where the best way to improve my ability to defend myself involves also improving my ability to harm others, should I decide to do that. So it doesn't seem that implausible to me.

Indeed, militaries are pretty much built on this principle, and are fairly common.

But, sure... there are certainly alternatives.

Comment author: Johnicholas 30 March 2011 12:01:00AM *  2 points [-]

I am familiar with the libertarian argument that if everyone has more destructive power, the society is safer. The analogous position would be that if everyone pursues (Friendly) AGI vigorously, existential risk would be reduced. That might well be reasonable, but as far as I can tell, that's NOT what is advocated.

Rather, we are all asked to avoid AGI research (and go into software development and make money and donate? How much safer is general software development for a corporation than careful AGI research?) and instead sponsor SIAI/EY doing (Friendly) AGI research while SIAI/EY is fairly closed-mouth about it.

It just seems to me like it would take a terribly delicate balance of probabilities to make this the safest course forward.

Comment author: cousin_it 29 March 2011 08:12:17PM *  2 points [-]

I have similar misgivings, they prompted me to write the post. Fighting fire with fire looks like a dangerous idea. The problem statement should look like "how do we stop unfriendly AIs", not "how do we make friendly AIs". Many people here (e.g. Nesov and SarahC) seem convinced that the latter is the most efficient way of achieving the former. I hope we can find a better way if we think some more.

Comment author: Wei_Dai 29 March 2011 10:27:24PM 7 points [-]

The problem statement should look like "how do we stop unfriendly AIs", not "how do we make friendly AIs".

If the universe is capable of running super-intelligent beings, then eventually either there will be one, or civilization will collapse. Maintaining the current state where there are no minds more intelligent than base humans seems very unlikely to be stable in the long run.

Given that, it seems the problem should be framed as "how do we end up with a super-intelligent being (or beings) that will go on to rearrange the universe the way we prefer?" which is not too different from "how do we make friendly AIs" if we interpret things like recursively-improved uploads as AIs.

Comment author: Vladimir_Nesov 27 March 2011 11:43:47AM *  2 points [-]

making AI friendly requires solving two problems

The goal is not to "make an AI friendly" (non-lethal), it's to make a Friendly AI. That is, not to make some powerful agent that doesn't kill you (and does something useful), but make an agent that can be trusted with autonomously building the future. For example, a merely non-lethal AI won't help with preventing UFAI risks.

So it's possible that some kind of Oracle AI can be built, but so what? And the risk of unknown unknowns remains, so it's probably a bad idea even if it looks provably safe.

Comment author: Lightwave 27 March 2011 12:44:12PM *  5 points [-]

And the risk of unknown unknowns remains, so it's probably a bad idea even if it looks provably safe.

Doesn't this also apply to provably Friendly AI? Perhaps even more so, given that it is a project of higher complexity.

Comment author: Vladimir_Nesov 27 March 2011 01:14:07PM 2 points [-]

With FAI, you have a commensurate reason to take the risk.

Comment author: Lightwave 27 March 2011 02:55:33PM *  3 points [-]

With FAI, you have a commensurate reason to take the risk.

Sure, but if the Oracle AI is used as a stepping stone towards FAI, then you also have a reason to take the risk.

I guess you could argue that the risk of Oracle + Friendly AI is higher than just going straight for FAI, but you can't be sure how much the FAI risk could be mitigated by the Oracle AI (or any other type of not-so-powerful / constrained / narrow-domain AI). At least it doesn't seem obvious to me.

Comment author: Vladimir_Nesov 27 March 2011 03:23:43PM *  2 points [-]

To the extent you should expect it to be useful. It's not clear in what way it can even in principle help with specifying morality. (See also this thread.)

Assume you have a working halting oracle. Now what? (Actually you could get inside to have infinite time to think about the problem.)

Comment author: ShardPhoenix 28 March 2011 12:26:27PM 0 points [-]

I think he means Oracle as in a general powerful question-answerer, not as in a halting oracle. A halting oracle could be used to answer many mathematical questions (like the aforementioned Riemann Hypothesis), though.
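The halting-oracle route to RH can be made concrete. Lagarias (2002) showed RH is equivalent to an elementary inequality over the integers, so RH reduces to the question of whether a counterexample search halts. In the sketch below, only the inequality check is real code; the halting oracle itself is an assumed, unimplementable black box.

```python
# RH is equivalent (Lagarias 2002) to: for all n >= 1,
#   sigma(n) <= H_n + exp(H_n) * ln(H_n),
# where sigma(n) is the sum of divisors of n and H_n the n-th harmonic number.
# So "is RH true?" becomes "does this counterexample search run forever?"

import math

def sigma(n: int) -> int:
    """Sum of the divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def violates_lagarias(n: int) -> bool:
    """True iff n is a counterexample to Lagarias' RH-equivalent inequality."""
    h = sum(1.0 / k for k in range(1, n + 1))  # harmonic number H_n
    return sigma(n) > h + math.exp(h) * math.log(h)

def counterexample_search() -> int:
    """Halts (returning a counterexample) iff RH is false."""
    n = 2
    while not violates_lagarias(n):
        n += 1
    return n

# The assumed superpower -- no such function exists or can exist:
# rh_is_true = not halting_oracle(counterexample_search)
```

Note that this is exactly why a halting oracle is an upper bound on Oracle usefulness for mathematics: any Pi-1 statement reduces to one yes/no query.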

Comment author: Vladimir_Nesov 28 March 2011 04:01:42PM *  2 points [-]

I know he doesn't mean a halting oracle. A halting oracle is a well-specified superpower that can do more than real Oracles. The thought experiment I described considers an upper bound on usefulness of Oracles.

Comment author: timtyler 27 March 2011 03:04:41PM *  2 points [-]

I figure we will build experts and forecasters before both oracles and full machine intelligence. That will be good - since forecasters will help to give us foresight - which we badly need.

Generally speaking, replacing the brain's functions one-at-a-time seems more desirable than replacing them all-at-once. It is likely to result in a more gradual shift, and a smoother transfer - with a reduced chance of the baton getting dropped during the switch over.

Comment author: Alexandros 27 March 2011 11:59:14AM *  3 points [-]

If:

(1) There is a way to make an AI that is useful and provably not-unfriendly

(2) This requires a subset of the breakthroughs required for a true FAI

(3) It can be used to provide extra leverage towards building a FAI (i.e. using it to generate prestige and funds for hiring and training the best brains available. How? Start by solving protein folding or something.)

Then this safe & useful AI should certainly be a milestone on the way towards FAI.

Comment author: Vladimir_Nesov 27 March 2011 12:06:15PM *  0 points [-]

Just barely possible, but any such system is also a recipe for destroying the universe, if mixed in slightly different proportions. Which on the net makes the plan wrong (destroy-the-universe wrong).

Comment author: Alexandros 27 March 2011 02:34:58PM 5 points [-]

I just don't think that this assertion has been adequately backed up.

Comment author: benelliott 27 March 2011 02:10:04PM 1 point [-]

If we get a working Oracle AI, couldn't we just ask it how to build an FAI? I just don't think this is of much use since the Oracle route doesn't really seem much easier than the FAI route.

Comment author: Vladimir_Nesov 27 March 2011 03:08:03PM 5 points [-]

If we get a working Oracle AI, couldn't we just ask it how to build an FAI?

No, it won't know what you mean. Even you don't know what you mean, which is part of the problem.

Comment author: timtyler 27 March 2011 04:52:09PM *  0 points [-]

I just don't think this is of much use since the Oracle route doesn't really seem much easier than the FAI route.

Experts and general forecasters are easier to build than general intelligent agents - or so I argue in my section on Machine Forecasting Implications. That is before we even get to constraints on how we want them to behave.

If we get a working Oracle AI, couldn't we just ask it how to build an FAI?

At a given tech level, trying to use a general oracle on its own to create a general intelligence would probably produce a less intelligent agent than could be produced by other means, using a broader set of tools. An oracle might well be able to help, though.

Comment author: wedrifid 27 March 2011 02:20:16PM -1 points [-]

How much friendliness is enough?

I'm for 'bool friendly = true'.

Comment author: Oscar_Cunningham 27 March 2011 10:57:16AM *  1 point [-]

The Riemann hypothesis seems like a special case, since it's a purely mathematical proposition. A real world problem is more likely to require Eliezer's brand of FAI.

Also, I believe solving FAI requires solving a problem not on your list, namely that of solving GAI. :-)

If you disagree that (a) looks easier than (b), congratulations, you've been successfully brainwashed by Eliezer :-)

This was supposed to be humour, right?

Comment author: cousin_it 27 March 2011 11:03:45AM *  3 points [-]

This was supposed to be humour, right?

OK, that didn't come across as intended. Edited the post.

A real world problem is more likely to require Eliezer's brand of FAI.

It seems to me that human engineers don't spend a lot of time thinking about the value of boredom or the problem of consciousness when they design airplanes. Why should an AI need to do that? If the answer involves "optimizing too hard", then doesn't the injunction "don't optimize too hard" look easier to formalize than CEV?

Comment author: timtyler 27 March 2011 05:30:54PM *  3 points [-]

doesn't the injunction "don't optimize too hard" look easier to formalize than CEV?

"Don't optimise for too long" looks easier to formalise. Or so I argued here.
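A toy contrast, in my own framing rather than anything from timtyler's post: "don't optimize too hard" needs a formal measure of optimization power, which nobody has, whereas "don't optimize too long" is just a resource bound, which is trivially formal.

```python
# A maximizer with a hard cap on how many candidates it may evaluate.
# The bound is on effort spent, not on how good the answer is allowed to be.

import itertools

def bounded_argmax(utility, candidates, budget=1000):
    """Return the best candidate found within a fixed evaluation budget."""
    best, best_u = None, float("-inf")
    for x in itertools.islice(candidates, budget):
        u = utility(x)
        if u > best_u:
            best, best_u = x, u
    return best

# With budget=10 over candidates 0..99, the search still sees the optimum
# of -(x-3)^2 at x=3; with budget=5 over an increasing utility it stops early.
```

Whether such a bound survives self-modification is of course the separate problem raised elsewhere in this thread; this only shows that stating the bound is easy.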

Comment author: Vladimir_Nesov 27 March 2011 11:55:49AM *  0 points [-]

If the answer involves "optimizing too hard", then doesn't the injunction "don't optimize too hard" look easier to formalize than CEV?

Injecting randomness doesn't look like a property of reasoning that would stand (or, alternatively, support) self-modification. This leaves the option of limiting self-modification (for the same reason), although given enough time and sanity even a system with low optimization pressure could find a reliable path to improvement.

Comment author: cousin_it 27 March 2011 12:03:54PM *  0 points [-]

Superintelligence isn't a goal in itself. I'll take super-usefulness over superintelligence any day. I know you want to build superintelligence because otherwise someone else will, but the same reasoning was used to justify nuclear weapons, so I suspect we should be looking for other ways to save the world.

(I see you've edited your comment. My reply still applies, I think.)

Comment author: timtyler 27 March 2011 01:30:17PM *  4 points [-]

I know you want to build superintelligence because otherwise someone else will, but the same reasoning was used to justify nuclear weapons, so I suspect we should be looking for other ways to save the world.

Are you arguing that the USA should not have developed nuclear weapons?

Use of nuclear weapons is often credited with shortening the war - and saving many lives - e.g. see here:

Supporters of Truman's decision to use the bomb argue that it saved hundreds of thousands of lives that would have been lost in an invasion of mainland Japan. In 1954, Eleanor Roosevelt said that Truman had "made the only decision he could," and that the bomb's use was necessary "to avoid tremendous sacrifice of American lives." Others have argued that the use of nuclear weapons was unnecessary and inherently immoral. Truman himself wrote later in life that, "I knew what I was doing when I stopped the war ... I have no regrets and, under the same circumstances, I would do it again."

Comment author: sark 30 March 2011 05:07:24PM *  0 points [-]

Well, that was what in fact happened. But what could have happened was perhaps a nuclear war leading to "significant curtailment of humankind's potential".

cousin_it's point was that perhaps we should not even begin the arms race.

Consider the Terminator scenario where they send the terminator back in time to fix things, but this sending back of the terminator is precisely what provided the past with the technology that will eventually lead to the cataclysm in the first place.

EDIT: included Terminator scenario

Comment author: Vladimir_Nesov 27 March 2011 12:17:05PM *  0 points [-]

I'll take super-usefulness over superintelligence any day.

Of course. But super-usefulness unfortunately requires superintelligence, and superintelligence is super-dangerous. Limited intelligence gives only limited usefulness, and in the long run even limited intelligence would tend to improve its capability, so it's not reliably safe. And not very useful.

I know you want to build superintelligence because otherwise someone else will,

Someone will eventually make an intelligence explosion that destroys the world. That would be bad. Any better ideas on how to mitigate the problem?

but the same reasoning was used to justify nuclear weapons

This is an analogy that you use as an argument? As if we don't already understand the details of the situation a few levels deeper than is covered by the surface similarity here. In making this argument, you appeal to intuition, but individual intuitions (even ones that turn out to be correct in retrospect or on reflection) are unreliable, and we should do better than that, find ways of making explicit reasoning trustworthy.

Comment author: [deleted] 27 March 2011 01:33:28PM 2 points [-]

Of course. But super-usefulness unfortunately requires superintelligence, and superintelligence is super-dangerous. Limited intelligence gives only limited usefulness, and in the long run even limited intelligence would tend to improve its capability, so it's not reliably safe. And not very useful.

Is this not exactly the point that cousin_it is questioning in the OP? I'd think a "limited" intelligence that was capable of solving the Riemann hypothesis might also be capable of cracking some protein-folding problems or whatever.

Comment author: Vladimir_Nesov 27 March 2011 01:45:01PM *  0 points [-]

If it's that capable, it's probably also that dangerous. But at this point the only way to figure out more about how it actually is, is to consider specific object-level questions about a proposed design. Absent design, all we can do is vaguely guess.

Comment author: cousin_it 27 March 2011 02:09:56PM *  6 points [-]

If it's that capable, it's probably also that dangerous.

No. We already have computers that help design better airplanes etc., and they are not dangerous at all. Sewing-Machine's question is right on.

Building machines that help us solve intelligence-bound problems (even if these problems are related to the real world, like building better airplanes) seems to be massively easier than building machines that will "understand" the existence of the real world and try to take it over for whatever reason. Evidence: we have had much success with the former task, but practically no progress on the latter. Moreover, the latter task looks very dangerous, kinda like nuclear weaponry.

Why do some people become so enamored with the singleton scenario that they can't settle for anything less? What's wrong with humans using "smart enough" machines to solve world hunger and such, working out any ethical issues along the way, instead of delegating the whole task to one big AI? If you think you need the singleton to protect you from some danger, what can be more dangerous than a singleton?

Comment author: Vladimir_Nesov 27 March 2011 03:02:56PM *  2 points [-]

Why do some people become so enamored with the singleton scenario that they can't settle for anything less? What's wrong with humans using "smart enough" machines to solve world hunger and such, working out any ethical issues along the way, instead of delegating the whole task to one big AI?

It's potentially dangerous, given the uncertainty about what exactly you are talking about. If it's not dangerous, go for it.

Settling for something less than a singleton won't solve the problem of human-indifferent intelligence explosion.

If you think you need the singleton to protect you from some danger, what can be more dangerous than a singleton?

Another singleton, which is part of the danger in question.

Comment author: [deleted] 27 March 2011 01:55:25PM *  1 point [-]

There are already computer programs that have solved open problems. Those were much simpler and less interesting questions than the Riemann Hypothesis, but I don't know that they're fundamentally different or less dangerous than what cousin_it is proposing.

Comment author: Vladimir_Nesov 27 March 2011 02:59:03PM 2 points [-]

Yes, there are non-dangerous useful things, but we were presumably talking about AI capable of open-ended planning.

Comment author: wedrifid 27 March 2011 03:09:17PM 0 points [-]

2) Invent a way to code an AI so that it's mathematically guaranteed not to change its goals after many cycles of self-improvement, negotiations etc. TDT is intended as a step toward that.

Only superficially. It would be possible to create an AI with said properties with CDT.

Comment author: wedrifid 27 March 2011 02:21:36PM 0 points [-]

To put the question sharply, which of the following looks easier to formalize:

a) Please output a proof of the Riemann hypothesis, and please don't get out of your box along the way.

b) Please do whatever the CEV of humanity wants.

The difficulty level seems on the same order of magnitude.

Comment author: cousin_it 27 March 2011 02:26:31PM *  6 points [-]

This looks suspicious. Imagine you didn't know about Risch's algorithm for finding antiderivatives. Would you then consider the problem "find me the antiderivative of this function, and please don't get out of the box" to be on the same order of difficulty as (b)? Does Wolfram Alpha overturn your worldview? Last I looked, it wasn't trying to get out...

Comment author: wedrifid 27 March 2011 03:05:59PM *  -1 points [-]

Does Wolfram Alpha overturn your worldview?

Not even remotely. I don't accept the analogy.

Comment author: timtyler 27 March 2011 02:55:27PM *  -1 points [-]

Does Wolfram Alpha overturn your worldview? Last I looked, it wasn't trying to get out...

Wolfram Alpha isn't really "in a box" in the first place.

Like most modern machines, its sensors and actuators extend into the real world.

We do restrain machines - but mostly when testing them. Elsewhere, constraints are often considered to be unnecessary expense. If a machine is dangerous, we typically keep humans away from it - and not the other way around.