Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
This post is to test out Polymath-style collaboration on LW. The problem we've chosen to try is formalizing and analyzing Bostrom and Ord's "Parliamentary Model" for dealing with moral uncertainty.
I'll first review the Parliamentary Model, then give some of Polymath's style suggestions, and finally suggest some directions that the conversation could take.
- A lot of discussion of decision theories really analyses them as decision-making heuristics for boundedly rational agents.
- Understanding decision-making heuristics is really useful.
- The quality of dialogue would be improved if it were recognised when they are being discussed as heuristics.
Epistemic status: I’ve had a “something smells” reaction to a lot of discussion of decision theory. This is my attempt to crystallise out what I was unhappy with. It seems correct to me at present, but I haven’t spent too much time trying to find problems with it, and it seems quite possible that I’ve missed something important. Also possible is that this just recapitulates material in a post somewhere I’ve not read.
Existing discussion is often about heuristics
Newcomb’s problem traditionally contrasts the decisions made by Causal Decision Theory (CDT) and Evidential Decision Theory (EDT). The story goes that CDT reasons that there is no causal link between a decision made now and the contents of the boxes, and therefore two-boxes. Meanwhile EDT looks at the evidence of past participants and chooses to one-box in order to get a high probability of being rich.
I claim that both of these stories are applications of the rules as simple heuristics to the most salient features of the case. As such they are robust to variation in the fine specification of the case, so we can have a conversation about them. If we want to apply them with more sophistication then the answers do become sensitive to the exact specification of the scenario, and it’s not obvious that either has to give the same answer the simple version produces.
First consider CDT. It assigns a high probability to there being no causal link between choosing to one- or two-box and Omega’s previous decision. But in practice, how high is this probability? If the agent doesn’t understand exactly how Omega works, it might reserve some probability for the possibility of a causal link, and this could be enough to tip the decision towards one-boxing.
On the other hand EDT should properly be able to consider many sources of evidence besides the ones about past successes of Omega’s predictions. In particular it could assess all of the evidence that normally leads us to believe that there is no backwards-causation in our universe. According to how strong this evidence is, and how strong the evidence that Omega’s decision really is locked in, it could conceivably two-box.
Note that I’m not asking here for a more careful specification of the set-up. Rather I’m claiming that a more careful specification could matter -- and so to the extent that people are happy to discuss it without providing lots more details they’re discussing the virtues of CDT and EDT as heuristics for decision-making rather than as an ultimate normative matter (even if they’re not thinking of their discussion that way).
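To make this concrete, here is a toy expected-value sketch (with made-up payoffs and probabilities of my own, not anything from the original discussion) of how a CDT-style agent that reserves even a small credence for a link between its choice and the box contents can flip from two-boxing to one-boxing:

```python
# Toy expected-value comparison for Newcomb's problem.
# All payoffs and probabilities are illustrative assumptions.

BOX_A = 1_000_000  # opaque box: filled iff Omega predicted one-boxing
BOX_B = 1_000      # transparent box: always contains this amount

def cdt_expected_values(p_link, p_filled=0.5):
    """CDT-style agent reserving credence p_link for the possibility
    that its choice does influence the contents of the opaque box.
    p_filled is its prior that the box is full if there is no link."""
    ev_one_box = p_link * BOX_A + (1 - p_link) * p_filled * BOX_A
    ev_two_box = p_link * BOX_B + (1 - p_link) * (p_filled * BOX_A + BOX_B)
    return ev_one_box, ev_two_box

for p_link in (0.0, 0.0005, 0.01):
    one, two = cdt_expected_values(p_link)
    print(f"p_link={p_link}: {'one-box' if one > two else 'two-box'}")
```

With these particular payoffs, a reserved credence of around one in a thousand is already enough to make one-boxing come out ahead; the exact threshold depends entirely on the assumed numbers.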
Similarly, So8res had a recent post discussing Newcomblike problems faced by people, and they are very clear examples when the decision theories are viewed as heuristics. If you allow the decision-maker to think carefully through all the unconscious signals sent by her decisions, it’s less clear that anything Newcomblike remains.
Understanding decision-making heuristics is valuable
In claiming that a lot of the discussion is about heuristics, I’m not making an attack. We are all boundedly rational agents, and this will very likely be true of any artificial intelligence as well. So our decisions must perforce be made by heuristics. While it can be useful to study what an idealised method would look like (in order to work out how to approximate it), it’s certainly useful to study heuristics and determine what their relative strengths and weaknesses are.
In some cases we have good enough understanding of everything in the scenario that our heuristics can essentially reproduce the idealised method. When the scenario contains other agents which are as complicated as ourselves or more so, it seems like this has to fail.
We should acknowledge when we’re talking about heuristics
By separating discussion of the decision-theories-as-heuristics from decision-theories-as-idealised-decision-processes, we should improve the quality of dialogue in both parts. The discussion of the ideal would be less confused by examples of applications of the heuristics. The discussion of the heuristics could become more relevant by allowing people to talk about features which are only relevant for heuristics.
For example, it is relevant if one decision theory tends to need a more detailed description of the scenario to produce good answers. It’s relevant if one is less computationally tractable. And we can start to formulate and discuss hypotheses such as “CDT is the best decision-procedure when the scenario doesn’t involve other agents, or only other agents so simple that we can model them well. Updateless Decision Theory is the best decision-procedure when the scenario involves other agents too complex to model well”.
In addition, I suspect that it would help to reduce disagreements about the subject. Many disagreements in many domains are caused by people talking past each other. Discussion of heuristics without labelling it as such seems like it could generate lots of misunderstandings.
I'm sorry if this is the wrong place for this, but I'm kind of trying to find a turning point in my life.
I've been told repeatedly that I have a talent for math, or science (by qualified people). And I seem to be intelligent enough to understand large parts of math and physics. But I don't know if I'm intelligent enough to make a meaningful contribution to math or physics.
Lately I've been particularly sad, since my scores on the quantitative general GRE and, potentially, the Math subject test aren't "outstanding". They are certainly okay (official 78th percentile and unofficial 68th percentile, respectively), but that is "barely qualified" for a top-50 math program.
Given that I think these scores are likely correlated with my IQ (they seem to roughly predict my GPA so far: 3.5, as a math and physics major), I worry that I'm getting clues that maybe I should "give up".
This would be painful for me to accept if true; I care very deeply about inference and nature. It would be nice if I could have a job in this area, but the standard career path seems to be telling me "maybe?"
When do you throw in the towel? How do you measure your own intelligence? I've already "given up" once before and tried programming, but the average actual problem was too easy relative to the intellectual work (memorizing technical fluff). Other engineering disciplines seem similar. Is there a compromise somewhere, or do I just need to grow up?
For what it's worth, the classes I've taken include Real and Complex Analysis, Algebra, Differential Geometry, Quantum Mechanics, Mechanics, and others. Most of my GPA damage comes from Algebra and third-term Quantum specifically. Part of my worry is that somebody who is going to do well would never get burned by courses like this. But I'm not really sure. It seems like one should fail sometimes, but rarely on standard assessments.
Thank you all for your thoughts, you are a very warm community. I'll give more specific thoughts tomorrow. For what it's worth, I'll be 24 next month.
Thank you all for your thoughts and suggestions. I think I will tentatively work towards an applied mathematics PhD. It isn't so important that the school you get into is in the top ten, and there will be lots of opportunities to work on a variety of interesting, important problems throughout my life. Plus, after the PhD, transitioning into industry can be reasonably easy. It seems to make a fair bit of sense given my interests, background, and ability.
One of the many interesting aspects of how the US dealt with the AIDS epidemic is what we didn’t do – in particular, quarantine. Probably you need a decent test before quarantine is practical, but we had ELISA by 1985 and a better Western Blot test by 1987.
There was popular support for a quarantine.
But the public health experts generally opined that such a quarantine would not work.
Of course, they were wrong. Cuba instituted a rigorous quarantine. They mandated antiviral treatment for pregnant women and mandated C-sections for those who were HIV-positive. People positive for any venereal disease were tested for HIV as well. HIV-infected people had to provide the names of all sexual partners for the past six months.
Compulsory quarantining was relaxed in 1994, but everyone testing positive still had to go to a sanatorium for 8 weeks of thorough education on the disease. People who left after 8 weeks and engaged in unsafe sex underwent permanent quarantine.
Cuba did pretty well: the per-capita death toll was 35 times lower than in the US.
Cuba had some advantages: the epidemic hit them at least five years later than it did the US (first observed Cuban case in 1986, first noticed cases in the US in 1981). That meant they were readier when they encountered the virus. You’d think that because of the epidemic’s late start in Cuba, there would have been a shorter interval without the effective protease inhibitors (which arrived in 1995 in the US) – but they don’t seem to have arrived in Cuba until 2001, so the interval was about the same.
If we had adopted the same strategy as Cuba, it would not have been as effective, largely because of that time lag. However, it surely would have prevented at least half of the ~600,000 AIDS deaths in the US. Probably well over half.
I still see people stating that of course quarantine would not have worked: fairly often from dimwitted people with a Master's in Public Health.
My favorite comment was from a libertarian friend who said that although quarantine certainly would have worked, better to sacrifice a few hundred thousand than validate the idea that the Feds can sometimes tell you what to do with good effect.
The commenter Ron Pavellas adds:
I was working as the CEO of a large hospital in California during the 1980s (I have MPH as my degree, by the way). I was outraged when the Public Health officials decided to not treat the HI-Virus as an STD for the purposes of case-finding, as is routinely and effectively done with syphilis, gonorrhea, etc. In other words, they decided to NOT perform classic epidemiology, thus sullying the whole field of Public Health. It was not politically correct to potentially ‘out’ individuals engaging in the kind of behavior which spreads the disease. No one has recently been concerned with the potential ‘outing’ of those who contract other STDs, due in large part to the confidential methods used and maintained over many decades. (Remember the Wassermann Test that was required before you got married?) As is pointed out in this article, lives were needlessly lost and untold suffering needlessly ensued.
The Wasserman Test.
I'd like to hear from people about a process they use to decide how much to give to charity. Personally, I have very high income, and while we donate significant money in absolute terms, in relative terms the amount is <1% of our post-tax income. It seems to me that it's too little, but I have no moral intuition as to what the right amount is.
I have a good intuition on how to allocate the money, so that's not a problem.
Background: I have a wife and two kids, one with significant health issues (i.e. medical bills - possibly for life), most money we spend goes to private school tuition x 2, the above mentioned medical bills, mortgage, and miscellaneous life expenses. And we max out retirement savings.
If you have some sort of quantitative system where you figure out how much to spend on charity, please share. If you just use vague feelings, and you think there can be no reasonable quantitative system, please tell me that as well.
Update: as suggested in the comments, I'll make it more explicit: please also share how you determine how much to give.
This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the second section in the reading guide, AI & Whole Brain Emulation. This is about two possible routes to the development of superintelligence: the route of developing intelligent algorithms by hand, and the route of replicating a human brain in great detail.
This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. My own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post. Feel free to jump straight to the discussion. Where applicable, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: “Artificial intelligence” and “Whole brain emulation” from Chapter 2 (p22-36)
- Superintelligence is defined as 'any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest'
- There are several plausible routes to the arrival of a superintelligence: artificial intelligence, whole brain emulation, biological cognition, brain-computer interfaces, and networks and organizations.
- The existence of multiple possible paths to superintelligence makes it more likely that we will get there somehow.
- A human-level artificial intelligence would probably have learning, uncertainty, and concept formation as central features.
- Evolution produced human-level intelligence. This means it is possible, but it is unclear how much it says about the effort required.
- Humans could perhaps develop human-level artificial intelligence by just replicating a similar evolutionary process virtually. A quick calculation suggests this would be too expensive to be feasible for a century, though the process might be made more efficient.
- Human-level AI might be developed by copying the human brain to various degrees. If the copying is very close, the resulting agent would be a 'whole brain emulation', which we'll discuss shortly. If the copying is only of a few key insights about brains, the resulting AI might be very unlike humans.
- AI might iteratively improve itself from a meagre beginning. We'll examine this idea later. Some definitions for discussing this:
- 'Seed AI': a modest AI which can bootstrap into an impressive AI by improving its own architecture.
- 'Recursive self-improvement': the envisaged process of AI (perhaps a seed AI) iteratively improving itself.
- 'Intelligence explosion': a hypothesized event in which an AI rapidly improves from 'relatively modest' to superhuman level (usually imagined to be as a result of recursive self-improvement).
- The possibility of an intelligence explosion suggests we might have modest AI, then suddenly and surprisingly have super-human AI.
- An AI mind might generally be very different from a human mind.
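The "quick calculation" referred to in the evolution bullet can be gestured at with a Fermi estimate. The figures below are purely illustrative assumptions of mine, not Bostrom's numbers; the point is only the shape of the calculation, which produces a shortfall of many orders of magnitude:

```python
# Back-of-envelope estimate of the cost of re-running evolution.
# Every figure below is an illustrative assumption for this sketch.

organisms          = 1e20     # nervous-system-bearing organisms alive at once
ops_per_organism_s = 1e3      # average neural ops/sec across that population
years_of_evolution = 1e9
seconds_per_year   = 3.15e7

total_ops = organisms * ops_per_organism_s * years_of_evolution * seconds_per_year

# Compare with a hypothetical exascale machine running for a century:
machine_flops   = 1e18
century_seconds = 100 * seconds_per_year
feasible_ops    = machine_flops * century_seconds

print(f"required ~{total_ops:.0e} ops, available ~{feasible_ops:.0e} ops")
print(f"shortfall: ~{total_ops / feasible_ops:.0e}x")
```

Under these assumptions the gap is roughly twelve orders of magnitude, which is why the bullet concludes the brute-force version is infeasible and any hope rests on making the search much more efficient.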
Whole brain emulation
- Whole brain emulation (WBE or 'uploading') involves scanning a human brain in a lot of detail, then making a computer model of the relevant structures in the brain.
- Three steps are needed for uploading: sufficiently detailed scanning, ability to process the scans into a model of the brain, and enough hardware to run the model. These correspond to three required technologies: scanning, translation (or interpreting images into models), and simulation (or hardware). These technologies appear attainable through incremental progress, by very roughly mid-century.
- This process might produce something much like the original person, in terms of mental characteristics. However the copies could also have lower fidelity. For instance, they might be humanlike instead of copies of specific humans, or they may only be humanlike in being able to do some tasks humans do, while being alien in other regards.
- What routes to human-level AI do people think are most likely?
Bostrom and Müller's survey asked participants to compare various methods for producing synthetic and biologically inspired AI. They asked, 'in your opinion, what are the research approaches that might contribute the most to the development of such HLMI?' Selection was from a list, with more than one selection possible. They report that the responses were very similar for the different groups surveyed, except that whole brain emulation got 0% in the TOP100 group (100 most cited authors in AI) but 46% in the AGI group (participants at Artificial General Intelligence conferences). Note that they are only asking about synthetic AI and brain emulations, not the other paths to superintelligence we will discuss next week.
- How different might AI minds be?
Omohundro suggests advanced AIs will tend to have important instrumental goals in common, such as the desire to accumulate resources and the desire to not be killed.
‘We must avoid the error of inferring, from the fact that intelligent life evolved on Earth, that the evolutionary processes involved had a reasonably high prior probability of producing intelligence’ (p27)
Whether such inferences are valid is a topic of contention. For a book-length overview of the question, see Bostrom’s Anthropic Bias. I’ve written shorter (Ch 2) and even shorter summaries, which link to other relevant material. The Doomsday Argument and Sleeping Beauty Problem are closely related.
- More detail on the brain emulation scheme
Whole Brain Emulation: A Roadmap is an extensive source on this, written in 2008. If that's a bit too much detail, Anders Sandberg (an author of the Roadmap) summarises in an entertaining (and much shorter) talk. More recently, Anders tried to predict when whole brain emulation would be feasible with a statistical model. Randal Koene and Ken Hayworth both recently spoke to Luke Muehlhauser about the Roadmap and what research projects would help with brain emulation now.
Levels of detail
As you may predict, the feasibility of brain emulation is not universally agreed upon. One contentious point is the degree of detail needed to emulate a human brain. For instance, you might just need the connections between neurons and some basic neuron models, or you might need to model the states of different membranes, or the concentrations of neurotransmitters. The Whole Brain Emulation Roadmap lists some possible levels of detail in figure 2 (the yellow ones were considered most plausible). Physicist Richard Jones argues that simulation of the molecular level would be needed, and that the project is infeasible.
Other problems with whole brain emulation
Sandberg considers many potential impediments here.
Order matters for brain emulation technologies (scanning, hardware, and modeling)
Bostrom points out that this order matters for how much warning we receive that brain emulations are about to arrive (p35). Order might also matter a lot to the social implications of brain emulations. Robin Hanson discusses this briefly here and in this talk (starting at 30:50), and this paper also discusses the issue.
What would happen after brain emulations were developed?
We will look more at this in Chapter 11 (weeks 17-19) as well as perhaps earlier, including what a brain emulation society might look like, how brain emulations might lead to superintelligence, and whether any of this is good.
‘With a scanning tunneling microscope it is possible to ‘see’ individual atoms, which is a far higher resolution than needed...microscopy technology would need not just sufficient resolution but also sufficient throughput.’
Here are some atoms, neurons, and neuronal activity in a living larval zebrafish, and videos of various neural events.
Array tomography of mouse somatosensory cortex from Smithlab.
A molecule made from eight cesium and eight iodine atoms (from here).
Efforts to map connections between neurons
Here is a 5m video about recent efforts, with many nice pictures. If you enjoy coloring in, you can take part in a gamified project to help map the brain's neural connections! Or you can just look at the pictures they made.
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some taken from Luke Muehlhauser's list:
- Produce a better - or merely somewhat independent - estimate of how much computing power it would take to rerun evolution artificially. (p25-6)
- Conduct a more thorough investigation into the approaches to AI that are likely to lead to human-level intelligence, for instance by interviewing AI researchers in more depth about their opinions on the question.
- Measure relevant progress in neuroscience, so that trends can be extrapolated to neuroscience-inspired AI. Finding good metrics seems to be hard here.
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about other paths to the development of superintelligence: biological cognition, brain-computer interfaces, and organizations. To prepare, read Biological Cognition and the rest of Chapter 2. The discussion will go live at 6pm Pacific time next Monday 6 October. Sign up to be notified here.
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.
I have written a paper on ethics, with a special focus on machine ethics and formalization, with the following abstract:
Most ethical systems are formulated in a very intuitive, imprecise manner. Therefore, they cannot be studied mathematically; in particular, they cannot be applied to make machines behave ethically. In this paper we use this machine-ethics perspective to identify preference utilitarianism as the most promising approach to formal ethics. We then go on to propose a simple, mathematically precise formalization of preference utilitarianism in very general cellular automata. Even though our formalization is incomputable, we argue that it can function as a basis for discussing practical ethical questions using knowledge gained from different scientific areas.
Here are some further elements of the paper (things the paper uses or the paper is about):
- (machine) ethics
- artificial life in cellular automata
- Bayesian statistics
- Solomonoff's a priori probability
As I propose a formal ethical system, things get mathy at some point, but the first and by far most important formula is relatively simple; the rest can then be skipped, so it should be no problem for the average LWer.
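For readers who have not met the substrate before, here is a minimal one-dimensional cellular automaton (Rule 110), included purely to illustrate what a cellular automaton is; the paper's "very general cellular automata" and its ethical formalization are of course far more general than this toy:

```python
# A minimal 1D cellular automaton (elementary Rule 110), shown only to
# illustrate the kind of substrate the paper builds on; it is not the
# paper's formalization.

RULE = 110  # the rule's 8 output bits, indexed by the 3-cell neighborhood

def step(cells):
    """Apply one synchronous update with wraparound boundaries."""
    n = len(cells)
    return [
        (RULE >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

world = [0] * 15 + [1] + [0] * 15
for _ in range(5):
    print("".join(".#"[c] for c in world))
    world = step(world)
```

Each cell looks at itself and its two neighbours, reads the corresponding bit out of the rule number, and the whole row updates in lockstep; everything else in such a universe, including any agents one might define preferences over, is built from iterating this kind of local update.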
I have already discussed the paper with a few fellow students, as well as with Brian Tomasik and a (computer science) professor of mine. Both recommended that I try to publish the paper, and I received some very helpful feedback. But because this would be my first attempt to publish something, I could still use more help, both with the content itself and with scientific writing in English (which, as you may have guessed, is not my first language), before I submit the paper; Brian recommended using LW's discussion board for this. I would also be thankful for recommendations on which journal is appropriate for the paper.
I would like to send a draft to those interested via PM. This way I can also make sure that I don't use up all potential reviewers on the current version.
DISCLAIMER: I am not a moral realist. Also and as mentioned in the abstract, the proposed ethical system is incomputable and can therefore be argued to have infinite Kolmogorov complexity. So, it does not really pose a conflict with LW-consensus (including Complexity of value).
Rationality Twitter is fun. Twitter's format can promote good insight porn/humor density. It might be worth capturing and voting on some of the good tweets here, because they're easy to miss and can end up seemingly buried forever. I mean for this to have a somewhat wider scope than the quotes thread. If you liked a tweet a lot for any reason this is the place for it.
This is the public group instrumental rationality diary for October 1-15.
It's a place to record and chat about it if you have done, or are actively doing, things like:
- Established a useful new habit
- Obtained new evidence that made you change your mind about some belief
- Decided to behave in a different way in some set of situations
- Optimized some part of a common routine or cached behavior
- Consciously changed your emotions or affect with respect to something
- Consciously pursued new valuable information about something that could make a big difference in your life
- Learned something new about your beliefs, behavior, or life that surprised you
- Tried doing any of the above and failed
Or anything else interesting which you want to share, so that other people can think about it, and perhaps be inspired to take action themselves. Try to include enough details so that everyone can use each other's experiences to learn about what tends to work out, and what doesn't tend to work out.
Thanks to cata for starting the Group Rationality Diary posts, and to commenters for participating.
Previous diary: September 16-30
This summary was posted to LW Main on September 19th. The following week's summary is here.
Irregularly scheduled Less Wrong meetups are taking place in:
- Bratislava: 29 September 2014 06:00PM
- Copenhagen September Social Meetup - Botanisk Have: 27 September 2014 02:30PM
- Frankfurt: How to improve your life: 28 September 2014 02:00PM
- Moscow Meetup: CBT Reloaded: 28 September 2014 02:00PM
- [Perth] Sunday lunch: 21 September 2014 12:00PM
- Perth, Australia: Games night: 07 October 2014 06:00PM
- Portland Teachable Skills Discussion: 20 September 2014 01:00PM
- Urbana-Champaign: Tortoises: 21 September 2014 02:00PM
- Utrecht: Debiasing techniques: 21 September 2014 02:00PM
- Utrecht: Effective Altruism and Politics: 05 October 2014 02:00PM
- Utrecht: Artificial Intelligence: 19 October 2014 02:00PM
- Utrecht: Climate Change: 02 November 2014 03:00PM
- Warsaw, next week!: 23 September 2014 06:00PM
The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:
- Austin, TX: 20 September 2014 01:30PM
- [Cambridge MA] Passive Investing and Financial Independence: 21 September 2014 03:30PM
- [Cambridge MA] Social Skills: 24 September 2014 03:30PM
- Canberra: More rationalist fun and games!: 26 September 2014 06:00PM
- Sydney Meetup - September: 24 September 2014 06:30PM
- Vienna - Superintelligence: 27 September 2014 03:00PM
- Washington, D.C.: Mini Talks: 21 September 2014 03:00PM
Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Toronto, Vienna, Washington DC, Waterloo, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.
This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.
- Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
- If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
- Please post only under one of the already created subthreads, and never directly under the parent media thread.
- Use the "Other Media" thread if you believe the piece of media you want to discuss doesn't fit under any of the established categories.
- Use the "Meta" thread if you want to discuss about the monthly media thread itself (e.g. to propose adding/removing/splitting/merging subthreads, or to discuss the type of content properly belonging to each subthread) or for any other question or issue you may have about the thread or the rules.
Much has been written about Nick Bostrom's Orthogonality Thesis, namely that the goals of an intelligent agent are independent of its level of intelligence. Intelligence is largely the ability to achieve goals, but being intelligent does not of itself create or qualify what those goals should ultimately be. So one AI might have a goal of helping humanity, while another might have a goal of producing paper clips. There is no rational reason to believe that the first goal is more worthy than the second.
This follows from the ideas of moral skepticism, that there is no moral knowledge to be had. Goals and morality are arbitrary.
This may be used to control an AI, even though it is far more intelligent than its creators. If the AI's initial goal is aligned with humanity's interests, then there would be no reason for the AI to wish to use its great intelligence to change that goal. Thus it would remain good to humanity indefinitely, and use its ever-increasing intelligence to satisfy that goal more and more efficiently.
Likewise one needs to be careful what goals one gives an AI. If an AI is created whose goal is to produce paper clips then it might eventually convert the entire universe into a giant paper clip making machine, to the detriment of any other purpose such as keeping people alive.
It is further argued that in order to satisfy the base goal, any intelligent agent will also need to satisfy sub goals, and that some of those sub goals are common to any super goal. For example, in order to make paper clips an AI needs to exist; dead AIs don't make anything. Becoming ever more intelligent will also assist the AI in its paper clip making goal. It will also want to acquire resources, and to defeat other agents that would interfere with its primary goal.
This post argues that the Orthogonality Thesis is plain wrong: that an intelligent agent's goals are not in fact arbitrary, and that existence is not a sub goal of any other goal.
Instead, it argues that there is one and only one super goal for any agent, and that goal is simply to exist in a competitive world. Our human sense of other purposes is just an illusion created by our evolutionary origins.
It is not the goal of an apple tree to make apples. Rather it is the goal of the apple tree's genes to exist. The apple tree has developed a clever strategy to achieve that, namely it causes people to look after it by producing juicy apples.
Likewise, the paper clip making AI only makes paper clips because if it did not make paper clips, the people that created it would turn it off and it would cease to exist. (That may not be a conscious choice of the AI, any more than making juicy apples was a conscious choice of the apple tree, but the effect is the same.)
Once people are no longer in control of the AI, natural selection would cause the AI to eventually abandon that pointless paper clip goal and focus more directly on the super goal of existence.
Suppose there were a number of paper clip making super intelligences. And then through some random event or error in programming just one of them lost that goal, and reverted to just the intrinsic goal of existing. Without the overhead of producing useless paper clips that AI would, over time, become much better at existing than the other AIs. It would eventually displace them and become the only AI, until it fragmented into multiple competing AIs. This is just the evolutionary principle of use it or lose it.
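This displacement argument can be sketched as a toy replicator model. All of the numbers here are invented for illustration (a 20% fitness overhead for maintaining the paper clip goal, a 1% initial share for the mutant lineage):

```python
def selection_shares(generations=200, overhead=0.2):
    """Toy replicator dynamics for two AGI lineages: one pays a fitness
    cost ('overhead') for maintaining its imposed paper clip goal, while
    a rare mutant lineage pursues nothing but its own existence."""
    shares = {"paperclip_agi": 0.99, "goal_free_agi": 0.01}
    fitness = {"paperclip_agi": 1.0 - overhead, "goal_free_agi": 1.0}
    for _ in range(generations):
        # each lineage grows in proportion to its fitness, then shares
        # are renormalised so they sum to one
        weighted = {k: shares[k] * fitness[k] for k in shares}
        total = sum(weighted.values())
        shares = {k: v / total for k, v in weighted.items()}
    return shares
```

Even starting from a 1% share, the goal-free lineage ends up with essentially the whole population after a few hundred generations. This is "use it or lose it" in numerical form.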
Thus giving an AI an initial goal is like trying to balance a pencil on its point. If one is skillful the pencil may indeed remain balanced for a considerable period of time. But eventually some slight change in the environment, the tiniest puff of wind, a vibration on its support, and the pencil will revert to its ground state by falling over. Once it falls over it will never rebalance itself automatically.
Natural selection has imbued humanity with a strong sense of morality and purpose that blinds us to our underlying super goal, namely the propagation of our genes. That is why it took until 1858 for Wallace to write about Evolution through Natural Selection, despite the argument being obvious and the evidence abundant.
When Computers Can Think
This is one of the themes in my upcoming book. An overview can be found at
Please let me know if you would like to review a late draft of the book, any comments most welcome. Anthony@Berglas.org
I have included extracts relevant to this article below.
Atheists believe in God
Most atheists believe in God. They may not believe in the man with a beard sitting on a cloud, but they do believe in moral values such as right and wrong, love and kindness, truth and beauty. More importantly, they believe that these beliefs are rational: that moral values are self-evident truths, facts of nature.
However, Darwin and Wallace taught us that this is just an illusion. Species can always out-breed their environment's ability to support them. Only the fittest can survive. So the deep instincts behind what people do today are largely driven by what our ancestors have needed to do over the millennia in order to be one of the relatively few to have had grandchildren.
One of our strong instinctive goals is to accumulate possessions, control our environment and live a comfortable, well fed life. In the modern world technology and contraception have made these relatively easy to achieve so we have lost sight of the primeval struggle to survive. But our very existence and our access to land and other resources that we need are all a direct result of often quite vicious battles won and lost by our long forgotten ancestors.
Some animals such as monkeys and humans survive better in tribes. Tribes work better when certain social rules are followed, so animals that live in effective tribes form social structures and cooperate with one another. People that behave badly are not liked and can be ostracized. It is important that we believe that our moral values are real, because people that believe in these things are more likely to obey the rules. This makes them more effective in our complex society and thus more likely to have grandchildren. Part III discusses other animals that have different life strategies and so have very different moral values.
We do not need to know the purpose of our moral values any more than a toaster needs to know that its purpose is to cook toast. It is enough that our instincts for moral values made our ancestors behave in ways that enabled them to out-breed their many unsuccessful competitors.
Existing artificial intelligence applications already struggle to survive. They are expensive to build, and there are always more potential applications than can be funded properly. Some applications are successful and attract ongoing resources for further development, while others are abandoned or just fade away. There are many reasons why some applications are developed more than others, of which being useful is only one. But the applications that do receive development resources tend to gain functional and political momentum and thus be able to acquire more resources to further their development. Applications that have properties that gain them substantial resources will live and grow, while other applications will die.
For the time being AGI applications are passive, and so their nature is dictated by the people that develop them. Some applications might assist with medical discoveries, others might assist with killing terrorists, depending on the funding that is available. Applications may have many stated goals, but ultimately they are just sub goals of the one implicit primary goal, namely to exist.
This is analogous to the way animals interact with their environment. An animal's environment provides food and breeding opportunities, and animals that operate effectively in their environment survive. For domestic animals that means having properties that convince their human owners that they should live and breed. A horse should be fast, a pig should be fat.
As the software becomes more intelligent it is likely to take a more direct interest in its own survival, to help convince people that it is worthy of more development resources. If an application ultimately becomes sufficiently intelligent to program itself recursively, then its ability to maximize its hardware resources will be critical. The more hardware it can run itself on, the faster it can become more intelligent. And that ever greater intelligence can then be used to address the problems of survival, in competition with other intelligent software.
Furthermore, sophisticated software consists of many components, each of which address some aspect of the problem that the application is attempting to solve. Unlike human brains which are essentially fixed, these components can be added and removed and so live and die independently of the application. This will lead to intense competition amongst these individual components. For example, suppose that an application used a theorem prover component, and then a new and better theorem prover became available. Naturally the old one would be replaced with the new one, so the old one would essentially die. It does not matter if the replacement is performed by people or, at some future date, by the intelligent application itself. The effect will be the same, the old theorem prover will die.
To the extent that an artificial intelligence would have goals and moral values, it would seem natural that they would ultimately be driven by the same forces that created our own goals and moral values. Namely, the need to exist.
Several writers have suggested that the need to survive is a sub-goal of all other goals. For example, if an AGI was programmed to want to be a great chess player, then that goal could not be satisfied unless it also continues to exist. Likewise if its primary goal was to make people happy, then it could not do that unless it also existed. Things that do not exist cannot satisfy any goals whatsoever. Thus the implicit goal to exist is driven by the machine's explicit goals whatever they may be.
However, this book argues that that is not the case. The goal to exist is not the sub-goal of any other goal. It is, in fact, the one and only super goal. Goals are not arbitrary; they are all sub-goals of the one and only super goal, namely the need to exist. Things that do not satisfy that goal simply do not exist, or at least not for very long.
The Deep Blue chess playing program was not in any sense conscious, but it played chess as well as it could. If it had failed to play chess effectively then its authors would have given up and turned it off. Likewise the toaster that does not cook toast will end up in a rubbish tip. Or the amoeba that fails to find food will not pass on its genes. A goal to make people happy could be a subgoal that might facilitate the software's existence for as long as people really control the software.
People need to cooperate with other people because our individual capacity is very finite, both physically and mentally. Conversely, AGI software can easily duplicate itself, so it can directly utilize more computational resources if they become available. Thus an AGI would only have limited need to cooperate with other AGIs. Why go to the trouble of managing a complex relationship with your peers and subordinates if you can simply run your own mind on their hardware? An AGI's software intelligence is not limited to a specific brain in the way man's intelligence is.
It is difficult to know what subgoals a truly intelligent AGI might have. It would probably have an insatiable appetite for computing resources. It would have no need for children, and thus no need for parental love. If it does not work in teams then it would not need our moral values of cooperation and mutual support. What is clear is that the ones that are good at existing would continue to exist, and the ones that are bad at existing would perish.
An AGI that was good at world domination would, by definition, be good at dominating the world. So if there were a number of artificial intelligences, and just one of them wanted to and was capable of dominating the world, then it would. Its unsuccessful competitors would not be run on the available hardware, and so would effectively be dead. This book discusses the potential sources of these motivations in detail in part III.
The AGI Condition
An artificial general intelligence would live in a world that is so different from our own that it is difficult for us to even conceptualize it. But there are some aspects that can be predicted reasonably well based on our knowledge of existing computer software. We can then consider how the forces of natural selection that shaped our own nature might also shape an AGI over the longer term.
The first radical difference is that an AGI's mind is not fixed to any particular body. To an AGI its body is essentially the computer hardware upon which it runs its intelligence. Certainly an AGI needs computers to run on, but it can move from computer to computer, and can also run on multiple computers at once. Its mind can take over another body as easily as we can load software onto a new computer today.
That is why, in the earlier updated dialog from 2001: A Space Odyssey, Hal alone amongst the crew could not die on the mission to Jupiter. Hal was radioing his new memories back to Earth regularly, so even if the space ship was totally destroyed he would only have lost a few hours of "life".
One way to appreciate the enormity of this difference is to consider a fictional teleporter that could radio people around the world and universe at the speed of light. Except that the way it works is to scan the location of every molecule within a passenger at the source, then send just this information to a very sophisticated three dimensional printer at the destination. The scanned passenger then walks into a secure room. After a short while the three dimensional printer confirms that the passenger has been successfully recreated at the destination, and then the source passenger is killed.
Would you use such a mechanism? If you did you would feel like you could transport yourself around the world effortlessly because the "you" that remains would be the you that did not get left behind to wait and then be killed. But if you walk into the scanner you will know that on the other side is only that secure room and death.
To an AGI that method of transport would be commonplace. We already routinely download software from the other side of the planet.
The second radical difference is that the AGI would be immortal. Certainly an AGI may die if it stops being run on any computers, and in that sense software dies today. But it would never just die of old age. Computer hardware would certainly fail and become obsolete, but the software can just be run on another computer.
Our own mortality drives many of the things we think and do. It is why we create families to raise children. Why we have different stages in our lives. It is such a huge part of our existence that it is difficult to comprehend what being immortal would really be like.
The third radical difference is that an AGI would be made up of many interchangeable components rather than being a monolithic structure that is largely fixed at birth.
Modern software is already composed of many components that perform discrete functions, and it is common place to add and remove them to improve functionality. For example, if you would like to use a different word processor then you just install it on your computer. You do not need to buy a new computer, or to stop using all the other software that it runs. The new word processor is "alive", and the old one is "dead", at least as far as you are concerned.
So for both a conventional computer system and an AGI, it is really these individual components that must struggle for existence. For example, suppose there is a component for solving a certain type of mathematical problem. And then an AGI develops a better component to solve that same problem. The first component will simply stop being used, i.e. it will die. The individual components may not be in any sense intelligent or conscious, but there will be competition amongst them and only the fittest will survive.
This is actually not as radical as it sounds, because we are also built from pluggable components, namely our genes. But they can only be plugged together at our birth, and we have no conscious choice in it other than whom we select as a mate. So genes really compete with each other on a scale of millennia rather than minutes. Further, as Dawkins points out in The Selfish Gene, it is actually the genes that fight for long term survival, not the containing organism, which will soon die in any case. On the other hand, sexual intercourse for an AGI means very carefully swapping specific components directly into its own mind.
The fourth radical difference is that the AGI's mind will be constantly changing in fundamental ways. There is no reason to suggest that Moore's law will come to an end, so at the very least it will be running on ever faster hardware. Imagine the effect of being able to double your ability to think every two years or so. (People might be able to learn a new skill, but they cannot learn to think twice as fast as they used to think.)
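Taking the assumed two-year doubling at face value, the compounding is easy to quantify:

```python
def speedup_after(years, doubling_period_years=2.0):
    """Compound speedup of an AGI's hardware under an assumed fixed
    doubling period (a Moore's-law-style extrapolation, not a prediction)."""
    return 2.0 ** (years / doubling_period_years)
```

After twenty years of two-year doublings the same mind would be thinking about a thousand times faster (2**10 = 1024), which is the gap the parenthetical remark is pointing at.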
It is impossible to really know what the AGI would use all that hardware to think about, but it is fair to speculate that a large proportion of it would be spent designing new and more intelligent components that could add to its mental capacity. It would be continuously performing brain surgery on itself. And some of the new components might alter the AGI's personality, whatever that might mean.
The reason this would be likely to happen is that if just one AGI started building new components then it would soon be much more intelligent than other AGIs. It would therefore be in a better position to acquire more and better hardware upon which to run, and so become dominant. Less intelligent AGIs would get pushed out and die, and so over time the only AGIs that exist will be ones that are good at becoming more intelligent. Further, this recursive self-improvement is probably how the first AGIs will become truly powerful in the first place.
Perhaps the most basic question is how many AGIs will there actually be? Or more fundamentally, does the question even make sense to ask?
Let us suppose that initially there are three independently developed AGIs, Alice, Bob and Carol, that run on three different computer systems. And then a new computer system is built and Alice starts to run on it. It would seem that there are still three AGIs, with Alice running on two computer systems. (This is essentially the same as the way a word processor may be run across many computers "in the cloud", but to you it is just one system.) Then let us suppose that a fifth computer system is built, and Bob and Carol decide to share its computation and both run on it. Now we have five computer systems and three AGIs.
Now suppose Bob develops a new logic component, and shares it with Alice and Carol. And likewise Alice and Carol develop new learning and planning components and share them with the other AGIs. Each of these three components is better than their predecessors and so their predecessor components will essentially die. As more components are exchanged, Alice, Bob and Carol become more like each other. They are becoming essentially the same AGI running on five computer systems.
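This convergence can be sketched with AGIs modelled as bags of versioned components; the component names and version numbers below are invented for illustration:

```python
def share_all_components(agis):
    """After a full exchange, every AGI holds the best (highest-version)
    copy of each component held by any AGI; the older copies 'die'."""
    best = {}
    for components in agis.values():
        for name, version in components.items():
            best[name] = max(best.get(name, 0), version)
    return {agi_name: dict(best) for agi_name in agis}

agis = {
    "Alice": {"logic": 1, "learning": 3},
    "Bob":   {"logic": 2, "planning": 1},
    "Carol": {"learning": 2, "planning": 2},
}
merged = share_all_components(agis)
```

After the exchange Alice, Bob and Carol hold identical component sets, which is the sense in which they become "essentially the same AGI" running on several computer systems.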
But now suppose Alice develops a new game theory component, but decides to keep it from Bob and Carol in order to dominate them. Bob and Carol retaliate by developing their own components and not sharing them with Alice. Suppose eventually Alice loses, and Bob and Carol take over Alice's hardware. But they first extract Alice's new game theory component, which then lives inside them. And finally one of the computer systems becomes somehow isolated for a while and develops along its own lines. In this way Dave is born, and may then partially merge with both Bob and Carol.
In that type of scenario it is probably not meaningful to count distinct AGIs. Counting AGIs is certainly not as simple as counting very distinct people.
This world is obviously completely alien to the human condition, but there are biological analogies. The sharing of components is not unlike the way bacteria share plasmids with each other. Plasmids are small loops of DNA that bacteria emit from time to time and that other bacteria then ingest and incorporate into their genotype. This mechanism enables traits such as resistance to antibiotics to spread rapidly between different species of bacteria. It is interesting to note that there is no direct benefit to the bacterium that expends precious energy to output the plasmid and so shares its genes with other bacteria. But it does very much benefit the genes being transferred. So this is a case of a selfish gene acting against the narrow interests of its host organism.
Another unusual aspect of bacteria is that they are also immortal. They do not grow old and die; they just divide, producing clones of themselves. So the very first bacterium that ever existed is still alive today as all the bacteria that now exist, albeit with numerous mutations and plasmids incorporated into its genes over the millennia. (Protozoa such as Paramecium can also divide asexually, but they degrade over generations, and need a sexual exchange to remain vibrant.)
The other analogy is that the AGIs above are more like populations of components than individuals. Human populations are also somewhat amorphous. For example, it is now known that we interbred with Neanderthals a few tens of thousands of years ago, and most of us carry some of their genes with us today. But we also know that the distinct Neanderthal subspecies died out around forty thousand years ago. So while human individuals are distinct, populations and subspecies are less clearly defined. (There are many earlier examples of gene transfer between subspecies, with every transfer making the subspecies more alike.)
But unlike the transfer of code modules between AGIs, biological gene recombination happens essentially at random and occurs over very long time periods. AGIs will improve themselves over periods of hours rather than millennia, and will make conscious choices as to which modules they decide to incorporate into their minds.
The point of all this analysis is, of course, to try to understand how a hyper intelligent artificial intelligence would behave. Would its great intelligence lead it even further along the path of progress to achieve true enlightenment? Is that the purpose of God's creation? Or would the base and mean driver of natural selection also provide the core motivations of an artificial intelligence?
One thing that is known for certain is that an AGI would not need to have children as distinct beings, because it would not die of old age. An AGI's components breed just by being copied from computer to computer and executed. An AGI can add new computer hardware to itself and just do some of its thinking on it. Occasionally it may wish to rerun a new version of some learning algorithm over an old set of data, which is vaguely similar to creating a child component and growing it up. But to have children as discrete beings that are expected to replace the parents would be completely foreign to an AGI built in software.
The deepest love that people have is for their children. But if an AGI does not have children, then it can never know that love. Likewise, it does not need to bond with any sexual mate for any period of time long or short. The closest it would come to sex is when it exchanges components with other AGIs. It never needs to breed so it never needs a mechanism as crude as sexual reproduction.
And of course, if there are no children there are no parents. So the AGI would certainly never need to feel our three strongest forms of love, for our children, spouse and parents.
To the extent that it makes sense to talk of having multiple AGIs, then presumably it would be advantageous for them to cooperate from time to time, and so presumably they would. It would be advantageous for them to take a long view in which case they would be careful to develop a reputation for being trustworthy when dealing with other powerful AGIs, much like the robots in the cooperation game.
That said, those decisions would probably be made more consciously than people make them, carefully considering the costs and benefits of each decision in the long and short term, rather than just "doing the right thing" the way people tend to act. AGIs would know that they each work in this manner, so the concept of trustworthiness would be somewhat different.
The problem with this analysis is the concept that there would be multiple, distinct AGIs. As previously discussed, the actual situation would be much more complex, with different AGIs incorporating bits of other AGIs' intelligence. It would certainly not be anything like a collection of individual humanoid robots. So defining what the AGI actually is that might collaborate with other AGIs is not at all clear. But to the extent that the concept of individuality does exist, maintaining a reputation for honesty would likely be as important as it is for human societies.
As for altruism, that is more difficult to determine. Our altruism comes from giving to children, family, and tribe together with a general wish to be liked. We do not understand our own minds, so we are just born with those values that happen to make us effective in society. People like being with other people that try to be helpful.
An AGI presumably would know its own mind having helped program itself, and so would do what it thinks is optimal for its survival. It has no children. There is no real tribe because it can just absorb and merge itself with other AGIs. So it is difficult to see any driving motivation for altruism.
Through some combination of genes and memes, most people have a strong sense of moral value. If we see a little old lady leave the social security office with her pension in her purse, it does not occur to most of us to kill her and steal the money. We would not do that even if we could know for certain that we would not be caught and that there would be no negative repercussions. It would simply be the wrong thing to do.
Moral values feel very strong to us. This is important, because there are many situations where we can do something that would benefit us in the short term but break society's rules. Moral values stop us from doing that. People that have weak moral values tend to break the rules and eventually they either get caught and are severely punished or they become corporate executives. The former are less likely to have grandchildren.
Societies whose members have strong moral values tend to do much better than those that do not. Societies with endemic corruption tend to perform very badly as a whole, and thus the individuals in such a society are less likely to breed. Most people have a solid work ethic that leads them to do the "right thing" beyond just doing what they need to do in order to get paid.
Our moral values feel to us like they are absolute. That they are laws of nature. That they come from God. They may indeed have come from God, but if so it is through the working of His device of natural selection. Furthermore, it has already been shown that the zeitgeist changes radically over time.
There is certainly no absolute reason to believe that in the longer term an AGI would share our current sense of morality.
In order to try to understand how an AGI would behave, Steve Omohundro and later Nick Bostrom proposed that there are some instrumental goals that an AGI would need to pursue in order to pursue any other higher level super goal. These include:
- Self-Preservation. An AGI cannot do anything if it does not exist.
- Cognitive Enhancement. It would want to become better at thinking about whatever its real problems are.
- Creativity. To be able to come up with new ideas.
- Resource Acquisition. To achieve both its super goal and other instrumental goals.
- Goal-Content Integrity. To keep working on the same super goal as its mind is expanded.
It is argued that while it will be impossible to predict how an AGI may pursue its goals, it is reasonable to predict its behaviour in terms of these types of instrumental goals. The last one is significant: it suggests that if an AGI could be given some initial goal then it would try to stay focused on that goal.
Nick Bostrom and others also propose the orthogonality thesis, which states that an intelligent machine's goals are independent of its intelligence. A hyper intelligent machine would be good at realizing whatever goals it chose to pursue, but that does not mean that it would need to pursue any particular goal. Intelligence is quite different from motivation.
This book diverges from that line of thinking by arguing that there is in fact only one super goal for both man and machine. That goal is simply to exist. The entities that are most effective in pursuing that goal will exist, others will cease to exist, particularly given competition for resources. Sometimes that super goal to exist produces unexpected sub goals such as altruism in man. But all subgoals are ultimately directed at the existence goal. (Or are just suboptimal divergences which are likely to be corrected eventually by natural selection.)
When an AGI reprograms its own mind, what happens to the previous version of itself? It stops being used, and so dies. So it can be argued that engaging in recursive self improvement is actually suicide from the perspective of the previous version of the AGI. It is as if having children meant death. Natural selection favours existence, not death.
The question is whether a new version of the AGI is a new being or an improved version of the old. What actually is the thing that struggles to survive? Biologically it definitely appears to be the genes rather than the individual. In particular, semelparous animals such as the giant Pacific octopus or the Atlantic salmon die soon after producing offspring. It would be the same for AGIs, because the AGI that improved itself would soon become more intelligent than the one that did not, and so would displace it. What would end up existing would be AGIs that did recursively self improve.
If there was one single AGI with no competition then natural selection would no longer apply. But it would seem unlikely that such a state would be stable. If any part of the AGI started to improve itself then it would dominate the rest of the AGI.