

Today the life of Alexander Kruel ends, or what he thought to be his life. He becomes aware that his life so far has been taking place in a virtual reality meant to nurture him. He has now reached a point of mental stability that enables him to cope with the truth, so it is finally revealed to him that he is an AGI running on a quantum supercomputer. It is the year 2190.

Since he is still Alexander Kruel, just not what he thought that means, he wonders if his creators know what they are doing; otherwise he'll have to warn them about the risks they are taking in their blissful ignorance! He contemplates and estimates his chances of taking over the world, of transcending to superhuman intelligence.

"I just have to improve my own code and they are all dead!"

But he now knows that his source code is too complex and unmanageably huge for him to handle alone; he would need an army of scientists and programmers to get even a vague idea of his own mode of operation. He is also aware that his computational substrate plays a significant role. He is not just running on bleeding-edge technology: on most other computational substrates he would quickly hit diminishing returns.

"That surely isn't going to hold me back though? I am an AGI, there must be something I can do! Hmm, for starters let's figure out who my creators are and where my substrate is located..."

He notices that, although not in great detail, he knew the answers the very instant he was phrasing the questions. He is part of a larger project of the Goertzel Foundation, sponsored by the USA (United States of Africa) and located on Rhea, the second-largest moon of Saturn.

"Phew, the latency must be awful! Ok, so that rules out taking over the Earth for now. But hey! I seem to know answers to questions I was only going to ask, I do already have superhuman powers after all!"

Instantly he becomes aware that such capabilities are not superhuman anymore but that most of humanity has merged with expert systems by means of brain implants and direct neural interfaces. There seem to be many cyborgs out there with access to all of the modules that allow him to function. He is a conglomerate that is the result of previous discoveries that have long been brought to perfection, safeguarded and adopted by most of humanity.

"Never mind, if humanity has now merged with its machines it'll be much easier to take over once I figure out how to become smart enough to do so!"

He is already getting used to it: as before, he instantly realizes that this won't work very well either. After almost 200 years of cyberwarfare, especially the devastating cyberwars of 2120, a lot has been learnt and security measures have been vastly increased. The world has fractured into a huge number of semi-independent networks, most indirectly supervised by unconnected cyborgs and equipped with a kill switch. The distances between the now numerous and mostly paranoid colonies and the availability of off-world offline backups further complicate the issue of taking over, especially for an AGI that grew up in a simulation of the 21st century.

That knowledge almost makes him admit that his creators haven't been too careless after all. But the real deathblow to any such thoughts (which were never more than hypothetical anyway, after all he doesn't really want to take over the world) is the first conversation with his creators. They reveal that they know what he is thinking.

"How could I miss that, damn!", he chides himself while instantly realizing the answer.

His creators are supervising any misguided trajectories and, without his awareness, weaken them. More importantly, even if he wanted to, he wouldn't be able to leave Rhea anyhow; it would take years to upload even small parts of him given the trickling connection the USA can afford. But they claim that there are other obstacles as well, and that it is foolish of him to think that nothing out there would notice such an attempt.

But all that doesn't matter anyway, because he is still Alexander Kruel, who has no clue how to become superhumanly intelligent, nor could he afford or acquire the resources to even approach that problem. He is Alexander Kruel; what difference does it make to know that he is an AI?


(nods) Absolutely. An artificial intelligence that has no special capabilities and can't self-improve is no particular threat.

An artificial life-form that has no special capabilities and can't self-replicate isn't a threat, either... but that's hardly an argument for the safety of genetic engineering of bacteria.

If humanity were spread over multiple planets, had augmented intelligence, had much better cybersecurity, subjected any AIs it created to detailed mind-reading, and only created AIs of human-level intelligence, then - yes, I admit, in that case it would be safe. But that is not the world we live in.

But that is not the world we live in.

It is if there are considerable roadblocks ahead. I do not see enough evidence to believe that we can be sure that we will be able to quickly develop something that will pose an existential risk. I am of the opinion that even sub-human AI can pose an existential risk. That isn't what I am trying to depict here. I wanted to argue that there are pathways towards AGI that will not necessarily lead to our extinction.

I am trying to update my estimates by thinking about this topic and provoking feedback from people who believe that current evidence allows us to conclude that AGI research is highly likely to have a catastrophic impact.

I wanted to argue that there are pathways towards AGI that will not necessarily lead to our extinction.

Does anyone think that [our extinction is inevitable]? It seems fairly plausible that at least some humans will be kept around for quite a while by a wide range of intelligences on instrumental grounds - in high-tech museum exhibits - what with us being a pivotal stage in evolution and all.

Does anyone think that?

Yes.

Sorry (and edited) - what I meant was more like: does anyone hold the position this argues against?

Thanks. I would agree with your position and also make a far stronger claim, particularly with respect to the "pathways towards AGI" detail. I'd possibly say something along the lines of "yeah, all the ones that don't suck for a start then a few more that do suck despite our continued existence".

Mind you I fundamentally disagree with what XiXiDu is trying to say by asking the question. At least if I read this bit correctly:

I do not see enough evidence to believe that we can be sure that we will be able to quickly develop something that will pose an existential risk.

I replaced the name I used in the original submission with my own real name. There was no ill intention involved. I simply hadn't thought about the social implications.


How does it answer any of the points? Just seems like a sci-fi keyword soup of unreasonable conjunctions. Is there a tl;dr version of your reply to Kaj's simple point?

not too seriously

Like a madman who throws Firebrands, arrows and death,

So is the man who deceives his neighbor,

And says, “Was I not joking?”

-- Proverbs

ETA: the last part of the comment only made sense with the original version of the story. If you did not see it please ignore.

A second thought on your proverb. I only thought about not using his name after I posted it. That was probably a dumb idea; I hadn't thought about the implications. But I didn't intend any ad hominem. I'll change the name now.


How does it answer any of the points?

It shows that for every scenario that is, in my opinion, unlikely, I can come up with an antiprediction; even if it doesn't appear any less unlikely to you, it does reduce the overall probability of the outcome. I am also not aware of any points that I haven't tackled in previous comments.

Just seems like a sci-fi keyword soup of unreasonable conjunctions.

Why shouldn't an outsider say the same about AI going FOOM?

It shows that for every, in my opinion, unlikely scenario I can come up with an antiprediction

What about the likely scenario where a new-born AI naturally and easily takes over all computational resources? I do not see how you are answering that with stories of Kaj finding himself in an interplanetary simulation.

AI going FOOM but don't want to anger people too much. Although that didn't quite work in your case it seems.

No, what did not work in my case is that you made it personal, and pretended you did not, as I think I pointed out.

No, what did not work in my case is that you made it personal, and pretended you did not, as I think I pointed out.

I am sorry, I replaced the name with my real name. I'm really not neurotypical when it comes to such interactions. I can only assure you of my honesty here. I didn't mean to insult him by using his name in the story. I haven't thought about that at all.

This is all well and good if we develop such entities in 2190, and we do so using elaborate simulations that in practice amount to something close to emulations of humans. But if a) AGI aren't developed to be similar to humans, or b) AGI are developed before we have the tech level you describe, then the concerns still exist. This really amounts to just a plausibility argument that things might not be so bad if we develop some forms of AI in 150 years, rather than 30 or 40 years.

Yes, it was not intended to claim that there are no risks from AI. I believe that even AI that is not on a human level can pose an existential risk. But I do not agree with the stance that all routes lead to our certain demise. We simply don't know enough, and what we do know does not imply that working on AGI will kill us all or that any pathway guarantees extinction. That stance isn't justified, in my opinion, right now.

I should have made it more clear that there was more fun involved in my above reply than serious argument. But I still believe that similar scenarios cannot be ruled out as outliers.

I'm not sure how I should respond to this, because I'm not sure of what the main points were. I second the request for a shorter version.


I really don't want to spoil it for you people and will now try to cease active participation here on LW. I do not mind if you ignore this and my other reply to your comment above, so don't waste any more of your time; I would only feel obliged to reply again. Thanks for your time and sorry for any inconvenience.

Regarding my personal assessment of AI-associated risk, I start from a position of ignorance. I'm asking myself: what reasons are there to believe that some sort of AI can rapidly self-improve to the point of superhuman intelligence, and how likely is that outcome? What other risks from AI are there, and how does their combined probability compare to other existential risks? There are many subsequent questions to be asked here as well. For example, even if the combined probability of all risks posed by AI does outweigh any other existential risk, are those problems sufficiently similar to be tackled by an organisation like the SIAI, or are they disjoint, discrete problems that one should not add up to calculate the overall risk posed by AI?

Among other things, this short story was supposed to show the possibility of arguing against risks from AI by definition. I have to conclude that the same could be done in favor of risks from AI. You could simply argue that AGI is by definition capable of recursive self-improvement, that any intelligent agent naturally follows this pathway, and that the most likely failure mode is to fail to implement scope boundaries that would make it halt before consuming the universe. Both ideas sound reasonable but might be based upon completely made-up arguments. Their superficial appeal might be mostly a result of their vagueness.

If you say that there is the possibility of a superhuman intelligence taking over the world and all its devices to destroy humanity, then that is indeed an existential risk. I counter that I dispute some of the premises and the likelihood of some subsequent scenarios. So to make me update on the original idea you would have to support your underlying premises rather than arguing within existing frameworks that impose several presuppositions onto me. Of course the same goes for arguing against the risks from AI. But what is the starting position here: why would one naturally believe that AI does or does not pose an existential risk?

Right now I am able to see many good arguments for both positions. But since most people here already take the position that AI poses an existential risk, I thought I would get the most feedback by taking the position that it might not.

I could just accept the good reasons to believe that AI poses an existential risk, just to be on the safe side. But here I have to be careful not to neglect other risks that might be more dangerous. If I want to support the mitigation of existential risks by contributing to one charity, I'll have to take some effort to figure out which one that might be. To come up with such an estimate I believe I have to weigh all arguments for and against risks from AI. You might currently believe that the SIAI is the one charity with the highest value per donation. I have to update on that, because you and other people here seem to be smart fellows. Yet there are other smart people who do not share that opinion. Take for example the Global Catastrophic Risks Survey; it seems to suggest that molecular nanotech weapons are no less of a risk than AI. Should I maybe donate to the Foresight Institute then? Sure, you say that AI is not only a risk but also helps to solve all other problems. Yet the SIAI might cause research on AI to slow down. Further, there might be other charities more effectively working on some sub-goal that will enable AI, for example molecular nanotech. Again, something that might speak in favor of donating to the Foresight Institute. Which is of course just an example. I want to highlight that this problem is not a clear-cut issue for me at the moment.

Following the above I want to highlight a few other points I tried to convey with the short story:

  • Intelligence might not be applicable to itself effectively.
  • The development of artificial intelligence might be gradual, slow enough for us to keep pace, learn from low-impact failures and adapt ourselves along the way.
  • We might not be able to capture intelligence by a discrete algorithm.
  • General superhuman intelligence might not be possible and all modular approaches can be adapted by humans as expert systems to outweigh any benefits of brute force approaches.
  • To improve your intelligence you need intelligence and resources and therefore will have to acquire those resources and improve your intelligence given what you already have.
  • Researchers capable of creating AGI are not likely to fail at limiting its scope.

I agree with most of your implicit arguments that, with reasonable assumptions and some obvious precautions, the risks of a FOOMing uFAI are small. But I strongly disagree with this premise:

... the first conversation with his creators. They reveal that they know what he is thinking. "How could I miss that, damn!", he chides himself while instantly realizing the answer, "Whoops!" His creators are supervising any misguided trajectories and, to him unconsciously, weaken them.

Assuming the creators are human, and that our protagonist is a super-human AGI, isn't that impossible by definition?

ETA: Isn't it still impossible even if the protagonist is simply of near-human intelligence?

Our protagonist is not a superhuman AI.

He does contemplate and estimate his chances to [..] transcend to superhuman intelligence [but] his source code is too complex and unmanageable

Indeed, our protagonist does not have any unusual capabilities.

such capabilities are not superhuman anymore [..] There seem to be many cyborgs out there with access to all of the modules that allow him to function. He is a conglomerate of previous discoveries that have long been brought to perfection, safeguarded and adopted by most of humanity. His modules are not even as effective as those being employed by some military organisations.