Edit: Some people have misunderstood my intentions here. I do not in any way expect this to be the NEXT GREAT IDEA. I just couldn't see anything wrong with this, which almost certainly meant there were gaps in my knowledge. I thought the fastest way to see where I went wrong would be to post my idea here and see what people say. I apologise for any confusion I caused. I'll try to be more clear next time.
(I really can't think of any major problems in this, so I'd be very grateful if you guys could tell me what I've done wrong).
So, a while back I was listening to a discussion about the difficulty of making an FAI. One of the ways suggested to circumvent this was to program an AGI whose goal is to solve FAI. Someone else pointed out the problems with this: amongst other things, one would have no idea what the AI would do in pursuit of its primary goal. Furthermore, it would already be a monumental task to program an AI whose primary goal is to solve the FAI problem, though doing so should still be easier than solving FAI directly, I should think.
So, I started to think about this for a little while, and I thought 'how could you make this safer?' Well, first off, you don't want an AI that completely outclasses humanity in terms of intellect. If things went Wrong, you'd have little chance of stopping it. So, you want to limit the AI's intellect to genius level; that way, if something did go Wrong, the AI would not be unstoppable. It might do quite a bit of damage, but a large group of intelligent people with a lot of resources on their hands could stop it.
Therefore, the AI must be prevented from modifying parts of its own source code. You must try to stop an intelligence explosion from taking off. So: limited access to its source code, and a limit on how much computing power it can have on hand. This is problematic, though, because the AI would not be able to solve FAI very quickly. After all, we have a few genius-level people trying to solve FAI and they're struggling with it, so why should a genius-level computer do any better? Well, an AI would have fewer biases, and could accumulate much more expertise relevant to the task at hand. It would be about as capable at solving FAI as the most capable human could possibly be; perhaps even more so. Essentially, you'd get someone like Turing, von Neumann, Newton, and others all rolled into one working on FAI.
But there's still another problem. The AI, if left working on FAI for 20 years, let's say, would have accumulated enough skills to cause major problems if something went wrong. Sure, it would be as intelligent as Newton, but it would be far more skilled. Humanity fighting against it would be like sending a young Miyamoto Musashi against his future self at his zenith, i.e. completely one-sided.
What must be done, then, is to give the AI a time limit of a few years (or less); after that time has passed, it is put to sleep. We look at what it accomplished, see what worked and what didn't, boot up a fresh version of the AI with any required modifications, and tell it what the old AI did. Repeat the process for a few years, and we should end up with FAI solved.
After that, we just make an FAI, and wake up the originals, since there's no point in killing them off at this point.
But there are still some problems. First, time. Why try this when we could solve FAI ourselves? Well, I would only implement something like this if it were clear that AGI will be solved before FAI is. A backup plan, if you will. Second, what if FAI is just too much for people at our current level? Sure, we have people who are one in ten thousand and better working on this, but what if we need someone who's one in a hundred billion? Someone who represents the peak of human ability? We shouldn't just wait around for them, since some idiot would probably make an AGI in the meantime, thinking it would love us all anyway.
So, what do you guys think? As a plan, is this reasonable? Or have I just overlooked something completely obvious? I'm not saying this would be easy in any way, but it would be easier than solving FAI.
I have not read the materials yet, but there is something fundamental I don't understand about the superintelligence problem.
Are there really serious reasons to think that intelligence is such a hugely useful thing that a 1000-IQ being would acquire superpowers? Somehow I never had an intuitive trust in the importance of intelligence (my own was more often a hindrance than an asset, suppressing my instincts). A superintelligence could figure out how to do anything, but there is a huge gap between figuring things out and actually doing them. Today, many of the most intelligent people alive basically do nothing but play chess (Polgar, Kasparov); Marilyn vos Savant runs a column entertaining readers by solving their toy logic puzzles; Rick Rosner is a TV writer; and James Woods became an actor, giving up an academic career for it. They are all over IQ 180.
My point is, what are the reasons to think a superintelligent AI will actually exercise power to change the world, instead of just entertaining itself with chess puzzles, collecting jazz, writing fan fiction, or similar "savant" hobbies?
What are the chances of a no-fucks-given superintelligence? Was this even considered, or is it just assumed ab ovo that intelligence must be a fearsomely powerful thing?
I suspect a Silicon Valley bias here. You guys in the Bay Area are very much used to people using their intelligence to change the world. But that does not seem to be the default. It seems more common for savants to care only about e.g. chess and basically withdraw from the world. If anything, the Valley is an exception. Outside it, in most of the world, intelligence is more of a hindrance, suppressing instincts and making people unhappy in the menial jobs they are given. Why assume a superintelligent AI would both have a Valley-type personality, i.e. actually be interested in using reasoning to change the world, and be put into an environment where it has the resources to do so? I could easily imagine an AI being kind of depressed because it has to do menial tasks, and entertaining itself with chess puzzles. I mean, this is how intelligence most often works in the world. Most often it is not combined with ambition, motivation, and lucky circumstances.
In my opinion, intelligence, rationality, is like aiming an arrow with a bow. It is very useful to aim accurately, but the difference between a nanometer and a picometer of inaccuracy is negligible, so you quickly hit diminishing marginal returns there, and then other things matter much more: how strong your bow is, how many arrows you have, how many targets you have, and all that.
Am I missing something? I am simply looking at what difference intelligence makes in the world-changing ability of humans and extrapolating from that. Most savants simply don't care about changing the world; some of those who do realize that skills other than intelligence are needed; and most are not put into a highly meritocratic Silicon Valley environment but into more stratified ones, where the 190-IQ son of a waiter is probably a cook. Why would AI be different? Any good reasons?
I'm not sure how useful or relevant a point this is, but I was just thinking about this when I saw the comment: IQ is defined within the range of human ability, where an arbitrarily large IQ just means being the smartest human in an arbitrarily large human population. "IQ 1000" and "IQ 180" might both be close enough to the asymptote of the upper limit of natural human ability that the difference is indiscernible. Quantum probabilities of humans being born with really weird "superhuman" brain architectures notwithstanding, a truly superintelligent being might have a true IQ of "infinity" or "N/A", which sounds much less likely to stick to the expectations we have of human savants.
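The asymptote point can be made concrete with a back-of-the-envelope calculation. A rough sketch, assuming the usual normal model of IQ (mean 100, SD 15; an idealisation that certainly breaks down this far out in the tail), of how large a population you'd need before a given IQ is expected to appear roughly once:

```python
import math

def log10_population(iq, mean=100.0, sd=15.0):
    """log10 of the population size in which someone of this IQ is
    expected to appear about once, under a normal-curve idealisation."""
    z = (iq - mean) / sd
    if z < 25:
        # erfc is still accurate at moderate z
        tail = 0.5 * math.erfc(z / math.sqrt(2))
        return -math.log10(tail)
    # For huge z, erfc underflows to 0.0 in double precision, so use
    # the asymptotic tail: P(Z > z) ~ exp(-z^2 / 2) / (z * sqrt(2*pi))
    return (z * z / 2 + math.log(z * math.sqrt(2 * math.pi))) / math.log(10)

print(round(log10_population(180), 1))   # ~7.3: IQ 180 is about 1 in 20 million
print(round(log10_population(1000), 1))  # ~784: "IQ 1000" would be 1 in 10^784
```

So on a naive reading, "IQ 1000" requires a population of about 10^784 humans, vastly more than the ~10^80 atoms in the observable universe, which is one way of seeing that human-referenced IQ simply stops being a meaningful scale for superintelligence.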