
Comment author: Kaj_Sotala 20 May 2016 02:42:47PM 6 points [-]

I made a new blog on Tumblr. It has photos of smiling people! With more to come!

Why? Previously I happened to need pictures of smiles for a personal project. After going through an archive of photos for a while, I realized that looking at all the happy people made me feel happy and good. So I thought that I might make a habit out of looking at photos of smiling people, and sharing them.

Follow for a regular extra dose of happiness!

Comment author: lukeprog 22 April 2016 02:43:10PM 1 point [-]
Comment author: Kaj_Sotala 23 April 2016 05:13:17AM 0 points [-]

Neat, thanks!

[link] Simplifying the environment: a new convergent instrumental goal

4 Kaj_Sotala 22 April 2016 06:48AM

http://kajsotala.fi/2016/04/simplifying-the-environment-a-new-convergent-instrumental-goal/

Convergent instrumental goals (also basic AI drives) are goals that are useful for pursuing almost any other goal, and are thus likely to be pursued by any agent that is intelligent enough to understand why they’re useful. They are interesting because they may allow us to roughly predict the behavior of even AI systems that are much more intelligent than we are.

Instrumental goals are also a strong argument for why sufficiently advanced AI systems that were indifferent towards human values could be dangerous to humans, even if they weren't actively malicious: an AI's pursuit of instrumental goals such as self-preservation or resource acquisition could come into conflict with human well-being. "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."

I’ve thought of a candidate for a new convergent instrumental drive: simplifying the environment to make it more predictable in a way that aligns with your goals.

Comment author: TheAncientGeek 11 April 2016 03:47:28PM 0 points [-]

When you say autonomous AIs, do you mean AIs that are autonomous and superintelligent?

AIs that are initially autonomous and non-superintelligent, then gradually develop towards superintelligence

If you believe in the conjunction of claims that people are motivated to create autonomous, not just agentive, AIs, and that pretty well any AI can evolve into dangerous superintelligence, then the situation is dire, because you cannot guarantee to get in first with an AI policeman as a solution to AI threat.

The situation is better, but only slightly better, with legal restraint as a solution to AI threat, because you can lower the probability of disaster by banning autonomous AI... but you can only lower it, not eliminate it, because no ban is 100% effective.

And how serious are you about the threat level? Compare with microbiological research. It could be the case that someone will accidentally create an organism that spells doom for the human race; it cannot be ruled out, but no one is panicking now, because there is no specific reason to rule it in, no specific pathway to it. It is a remote possibility, not a serious one.

Someone who sincerely believed that rapid self-improvement towards autonomous AI could happen at any time, because there are no specific preconditions or precursors for it, is someone who effectively believes it could happen now. But someone who genuinely believes an AI apocalypse could happen now is someone who would be revealing their belief in their behaviour by heading for the hills, or smashing every computer they see.

(With the important caveat that it's unclear whether an AI would need to be generally superintelligent in order to pose a major risk to society.)

Narrow superintelligences may well be less dangerous than general superintelligences, and if you are able to restrict the generality of an AI, that could be a path to incremental safety.

But if the path to some kind of spontaneous superintelligence in an autonomous AI is also a path to spontaneous generality, that is hopeless -- if the one can happen for no particular reason, so can the other. But is the situation really bad, or are these scenarios remote possibilities, like genetically engineered super plagues?

Do you think they could be deployed by basement hackers, or only by large organisations?

Hard to say. The way AI has developed so far, it looks like the capability might be restricted to large organizations with lots of hardware resources at first, but time will likely drive down the hardware requirements.

But by the time the hardware requirements have been driven down for entry level AI, the large organizations will already have more powerful systems, and they will dominate for better or worse. If benevolent, they will suppress dangerous AIs coming out of basements; if dangerous, they will suppress rivals. The only problematic scenario is where the hackers get in first, since they are less likely to partition agency from intelligence, as I have argued a large organisation would.

But the one thing we know for sure about AI is that it is hard. The scenario where a small team hits on the One Weird Trick to achieve ASI is the most worrying, but also the least likely.

Do you think an organisation like the military or business has a motivation to deploy [autonomous AI]?

Yes.

Which would be what?

Do you agree that there are dangers to an FAI project that goes wrong?

Yes.

Do you have a plan B to cope with an FAI that goes rogue?

Such a plan would seem to require lots of additional information about both the specifics of the FAI plan, and also the state of the world at that time, so not really.

But building an FAI capable of policing other AIs is potentially dangerous, since it would need to be both a general intelligence and a superintelligence.

Do you think that having an AI potentially running the world is an attractive idea to a lot of people?

Depends on how we're defining "lots",

For the purposes of the current argument, a democratic majority.

but I think that the notion of a benevolent dictator has often been popular in many circles, which have also acknowledged its largest problems to be that 1) power tends to corrupt, and 2) even if you got a benevolent dictator, you would also need a way to ensure that all of their successors were benevolent. Both problems could be overcome with an AI,

There are actually three problems with benevolent dictators. As well as power corrupting and successorship, there is the problem of ensuring or detecting benevolence in the first place.

You have conceded that Gort AI is potentially dangerous. The danger is that it is fragile in a specific way: a near miss to a benevolent value system is a dangerous one.

so on that basis at least I would expect lots of people to find it attractive. I'd also expect it to be considered more attractive in e.g. China, where people seem to be more skeptical towards democracy than they are in the West.

Additionally, if the AI wouldn't be the equivalent of a benevolent dictator, but rather had a more hands-off role that kept humans in power and only acted to e.g. prevent disease, violent crime, and accidents, then that could be attractive to a lot of people who preferred democracy.

That also depends on both getting it right, and convincing people you have got it right.

Comment author: Kaj_Sotala 19 April 2016 10:06:48AM 0 points [-]

If you believe in the conjunction of claims that people are motivated to create autonomous, not just agentive, AIs, and that pretty well any AI can evolve into dangerous superintelligence, then the situation is dire, because you cannot guarantee to get in first with an AI policeman as a solution to AI threat.

The situation is better, but only slightly better, with legal restraint as a solution to AI threat,

Indeed.

And how serious are you about the threat level? Compare with microbiological research. It could be the case that someone will accidentally create an organism that spells doom for the human race; it cannot be ruled out, but no one is panicking now, because there is no specific reason to rule it in, no specific pathway to it. It is a remote possibility, not a serious one.

Someone who sincerely believed that rapid self-improvement towards autonomous AI could happen at any time, because there are no specific preconditions or precursors for it, is someone who effectively believes it could happen now. But someone who genuinely believes an AI apocalypse could happen now is someone who would be revealing their belief in their behaviour by heading for the hills, or smashing every computer they see.

I don't think that rapid self-improvement towards a powerful AI could happen at any time. It'll require AGI, and we're still a long way from that.

Narrow superintelligences may well be less dangerous than general superintelligences, and if you are able to restrict the generality of an AI, that could be a path to incremental safety.

It could, yes.

But by the time the hardware requirements have been driven down for entry level AI, the large organizations will already have more powerful systems, and they will dominate for better or worse.

Assuming they can keep their AGI systems under control.

Do you think an organisation like the military or business has a motivation to deploy [autonomous AI]?

Yes.

Which would be what?

See my response here and also section 2 in this post.

But building an FAI capable of policing other AIs is potentially dangerous, since it would need to be both a general intelligence and a superintelligence. [...] You have conceded that Gort AI is potentially dangerous. The danger is that it is fragile in a specific way: a near miss to a benevolent value system is a dangerous one.

Very much so.

Comment author: Viliam 06 April 2016 08:16:30PM 2 points [-]

What are the reasons NNTP and Usenet got essentially discarded?

Just a guess: having to install a special client? The browser is everywhere (it comes with the operating system), so you can use web pages on your own computer, at school, at work, at a neighbor's computer, at a web cafe, etc. If you have to install your own client, then outside of your own computer you are often not allowed to do it. Also, many people just don't know how to install programs.

And when most people use browsers, most debates will be there, so the rest will follow.

Comment author: Kaj_Sotala 10 April 2016 12:44:30PM 1 point [-]

Just a guess: having to install a special client?

The e-mail client that came pre-installed with Windows 95 and several later Windowses also included newsgroup functionality.

In response to Positivity Thread :)
Comment author: Kaj_Sotala 09 April 2016 05:12:58PM 5 points [-]

:D ^_^ <3

Comment author: [deleted] 07 April 2016 06:23:47PM *  1 point [-]

This sounds like you're assuming that I'm trying to argue in favor of Friendly AI as the best solution...

(Responding to the whole paragraph but don't want to quote it all) I would be interested to hear a definition of "AI risk" that does not reduce to "risk of unfriendly outcome", which is itself defined in terms of friendliness, aka relation to human morality. If, like me, you reject the idea of consistent, discoverable morality in the first place, and therefore find friendliness to be an ill-formed, inconsistent idea, then it's hard to say anything concrete about AI risk either. If you have a better definition that does not reduce to alignment with human morality, please provide it.

Mapping the problem starts with defining what the problem is. What is AI risk, without reference to dubious notions of human morality?

I think that you've left out a LOT of things that must happen a certain way in order for your AI risk outcomes to come to pass. Would appreciate hearing more about these.

To start with, there are all the normal, benign things that happen in any large-scale software project that require human intervention. Like, say, the AGI crashes. Or the database that holds its memories becomes inconsistent. Or it gets deadlocked on choosing actions due to a race condition. The humanity-threatening failure modes presume that the AGI, on its first attempt at break-out, doesn't suffer any normal engineering defect failures -- or that if it does, then the humans operating it just fix it and turn it back on. I'm not interested in any arguments that assume the latter, and the former is highly conjunctive.

Isn't that the standard way of figuring out the appropriate corrective actions? First figure out what would happen absent any intervention, then see which points seem most amenable to correction.

I may have misread your intent, and if so I apologize. The first sentence of your post here made it seem like you were countering a criticism, aka advocating for the original position. So I read your posts in that context and may have inferred too much.

In response to comment by [deleted] on [link] Disjunctive AI Risk Scenarios
Comment author: Kaj_Sotala 08 April 2016 12:41:21PM *  0 points [-]

If, like me, you reject the idea of consistent, discoverable morality in the first place, and therefore find friendliness to be an ill-formed, inconsistent idea, then it's hard to say anything concrete about AI risk either. If you have a better definition that does not reduce to alignment with human morality, please provide it.

Mapping the problem starts with defining what the problem is. What is AI risk, without reference to dubious notions of human morality?

I also reject the idea of a consistent, discoverable morality, at least to the extent that the morality is assumed to be unique. I think that moralities are not so much discovered as constructed: a morality is in a sense an adaptation to a specific environment, and it will continue to constantly evolve as the environment changes (including the social environment, so the morality changing will by itself cause more changes to the environment, which will trigger new changes to the morality, etc.). There is no reason to assume that this will produce a consistent moral system: there will be inconsistencies which will need to be resolved when they become relevant, and the order in which they are resolved seems likely to affect the final outcome.

But to answer your actual question: I don't have a rigorous answer for exactly what the criteria for "success" are. The intuitive answer is that there are some futures that I'd consider horrifying if they came true and some which I'd consider fantastic if they came true, and I want to further the fantastic ones and avoid the horrifying ones. (I presume this to also be the case for you, because why else would you care about the topic in the first place?)

Given that this is very much an "I can't give you a definition, but I know it when I see it" thing, it seems hard to make sure that we avoid the horrifying outcomes without grounding the AIs in human values somehow, and making sure that they share our reaction when they see (imagine) some particular future. (Either that, or trying to make sure that we evolve to be equally powerful as the AIs, but this seems unlikely to me.)

Depending on your definitions, you could say that this still reduces to alignment with human morality, but with the note that my conception of human morality is that of a dynamic process, and that the AIs could be allowed to e.g. nudge the development of our values towards a direction that made it easier to reconcile value differences between different cultures, even if there was no "objective" reason for why that direction was any better or worse than any other one.

To start with, there are all the normal, benign things that happen in any large-scale software project that require human intervention. Like, say, the AGI crashes. Or the database that holds its memories becomes inconsistent. Or it gets deadlocked on choosing actions due to a race condition. The humanity-threatening failure modes presume that the AGI, on its first attempt at break-out, doesn't suffer any normal engineering defect failures -- or that if it does, then the humans operating it just fix it and turn it back on. I'm not interested in any arguments that assume the latter, and the former is highly conjunctive.

Are you assuming that there will only ever be one AGI that might try to escape, that its creators never decide to release it, and that it can't end up effectively in control even if boxed?

Comment author: [deleted] 06 April 2016 09:02:06PM *  1 point [-]

To be honest I only did a brief read-through. The context of the debate itself is what I object to. I find the concept of "friendly" AI itself to be terrifying. It's my life's work to make sure that we don't end up in such a dystopian tyrannical future. Debating the probabilities of whether what you call AI "risk" is likely or unlikely (disjunctive or conjunctive) is rather pointless when you are ambivalent towards that particular outcome.

Now I think that you've left out a LOT of things that must happen a certain way in order for your AI risk outcomes to come to pass. You've also left out ALL of the corrective actions that could be taken by any of the human actors in the picture. It reminds me of a martial arts demonstration where the attacker throws a punch and then stands there in frozen form, unreactive, while the teacher demonstrates the appropriate response at leisure. But if, like me, you don't see such a scenario as a bad thing in the first place, then it's an academic point. And I tire of debating things of no real-world significance.

In response to comment by [deleted] on [link] Disjunctive AI Risk Scenarios
Comment author: Kaj_Sotala 07 April 2016 03:12:47PM 0 points [-]

Hmm. There may have been a miscommunication here.

The context of the debate itself is what I object to. I find the concept of "friendly" AI itself to be terrifying.

This sounds like you're assuming that I'm trying to argue in favor of Friendly AI as the best solution. Now, I admittedly do currently find FAI one of the most promising options for trying to navigate AI risk, but I'm not committed to that. I just want to find whatever solution works, regardless of whether it happens to be FAI or something else entirely. But in order to find out what's the best solution, one needs to have a comprehensive idea of what the problem is like and how it's going to manifest itself, and that's what I'm trying to do - map out the problem, so that we can figure out what the best solutions are.

I think that you've left out a LOT of things that must happen a certain way in order for your AI risk outcomes to come to pass.

Would appreciate hearing more about these.

You've also left out ALL of the corrective actions that could be taken by any of the human actors in the picture.

Isn't that the standard way of figuring out the appropriate corrective actions? First figure out what would happen absent any intervention, then see which points seem most amenable to correction.

Comment author: [deleted] 05 April 2016 03:02:14PM 2 points [-]

You should do a similar mapping of the disjunctive ways in which AI could go right and lead to world-bettering technological growth.

In response to comment by [deleted] on [link] Disjunctive AI Risk Scenarios
Comment author: Kaj_Sotala 06 April 2016 12:07:04PM 1 point [-]

I guess you could consider all of Responses such a disjunctive post, if you consider the disjunctive options to be "this proposed response to AGI succeeds". :)

I would be interested in hearing whether you had more extended critiques of these posts. I incorporated some of our earlier discussion into my post, and was hoping to develop these ideas further, in part by having conversations with people who were more skeptical of the scenarios depicted.

Comment author: turchin 05 April 2016 08:42:49PM *  6 points [-]

I think that one of the main disjunctions is that neither self-improvement, nor high-level intelligence, nor control of the world is a necessary condition for human extinction caused by AI.

Imagine a computer which helps a terrorist to create biological viruses. It is neither an AGI, nor self-improving, nor an agent; it doesn't have values, and it is local and confined. But it will help to calculate and create a perfect virus, which will be capable of wiping out humanity.

Comment author: Kaj_Sotala 06 April 2016 12:04:53PM 2 points [-]

This is an excellent point! I'm intending to discuss non-superintelligence scenarios in a follow-up post.
