In light of SDR's comment yesterday, instead of writing a new post today I compiled my list of ideas I wanted to write about, partly to lay them out and see if any stood out as better than the rest, and partly so that they would be a little more out in the wild than if I held them until I got around to them. I realise there is no thesis in this post, but I figured it would be better to write one of these than to write each idea in its own post with the potential to be good or bad.
Original post: http://bearlamp.com.au/many-draft-concepts/
I create ideas at a rate of about 3 a day, without trying to, but I write at a rate of about 1.5 a day, which leaves me permanently behind. Even if I write about the best ideas I can think of, some good ones might never be covered. This is an effort to draft out a good stack of them, so that by better defining which ones are good and which ones are more useless, maybe I won't have to write them all out in full.
With that in mind, in no particular order - a list of unwritten posts:
From my old table of contents
Goals of your lesswrong group – A guided/worked-through exercise in deciding why the group exists and what it should do. Help people work out what they want out of it (do people even know?): setting goals, doing something particularly interesting or routine, having fun, changing your mind, being activists in the world around you. Whatever the reasons you care about, work them out and move towards them. Nothing particularly groundbreaking in the process here: sit down with the group with pens and paper, maybe run a resolve cycle, maybe talk about ideas and settle on a few, then decide how to carry them out. Relevant links: Sydney meetup, group resources. (estimate 2hrs to write)
Goals interrogation + Goal levels – Goal interrogation is about asking "is this thing I want to do actually a goal of mine?" and "is my current plan the best way to achieve it?". Goal levels are something out of Sydney Lesswrong that help you hold mutual long-term goals and supporting short-term goals. There are 3 main levels: Dream, Year, and Daily (or thereabouts). You want dream goals like going to the moon, yearly goals like getting another year further in your degree, and daily goals like studying today, each contributing to the goals above it. Any time you are feeling lost you can look at the guide you set out for yourself and use it to direct you. (3hrs)
How to human – A zero-to-human guide. A guide to basic functionality of a humanoid system. Something of a conglomeration of Maslow, mental health, "so you feel like shit" and systems thinking. Am I conscious? Am I breathing? Am I bleeding or injured (major or minor)? Am I falling or otherwise in danger and about to cause the earlier questions to return false? Do I know where I am? Am I safe? Do I need to relieve myself (or other bodily functions, i.e. itchy)? Have I had enough water? Sleep? Food? Is my mind altered (alcohol or other drugs)? Am I stuck with sensory input I can't control (noise, smells, things touching me)? Am I too hot or too cold? Is my environment too hot or too cold? Or unstable? Am I with people or alone? Is this okay? Am I clean (showered, teeth, other personal cleaning rituals)? Have I had some sunlight and fresh air in the past few days? Have I had too much sunlight or wind in the past few days? Do I feel stressed? Okay? Happy? Worried? Suspicious? Scared? Was I doing something? What am I doing? Do I want to be doing something else? Am I being watched (is that okay?)? Have I interacted with humans in the past 24 hours? Have I had alone time in the past 24 hours? Do I have any existing conditions I can run a check on, i.e. depression? Are my valuables secure? Are the people I care about safe? (4hrs)
List of common strategies for getting shit done – things like scheduling/allocating time, pomodoros, committing to things externally, Complice, Beeminder, and other trackers. (4hrs)
List of superpowers and kryptonites – asking the questions "what are my superpowers?" and "what are my kryptonites?". Knowledge is power; working with your powers and working out how to avoid your kryptonites is a method of improving yourself. What are you really good at, and what do you absolutely suck at and would be better delegating to other people? The more you know about yourself, the more you can play to your powers, route around your weaknesses, and save yourself trouble.
List of effective behaviours – small life-improving habits that add up to make awesomeness from nothing, and how to pick them up. Short list: toothbrush in the shower, scales in front of the fridge, healthy food in the most accessible position in the fridge, the unhealthy stuff made a little more inaccessible, some clocks kept fast - i.e. the clock in your car (so you get there early), preparing for expected barriers ahead of time (i.e. packing the gym bag and leaving it at the door), and more.
Stress prevention checklist – feeling off? You want to have already outsourced the hard work for “things I should check on about myself” to your past self. Make it easier for future you. Especially in the times that you might be vulnerable. Generate a list of things that you want to check are working correctly. i.e. did I drink today? Did I do my regular exercise? Did I take my medication? Have I run late today? Do I have my work under control?
Make it easier for future you, especially in the times that you might be vulnerable – as its own post on curtailing the bad habits that you can expect to resurface when you are compromised. Inspired by candy-bar moments and turning them into carrot moments or other more productive things. This applies beyond diet, and might involve turning TV-hour into book-hour (substituting tasks you want to do for tasks you do automatically).
A p=np approach to learning – Sometimes you have to learn things the long way, but sometimes there is a shortcut, where you could say, "I wish someone had just taken me on the easy path early on". It's not a perfect idea, but start looking for the shortcuts where you might be saying "I wish someone had told me sooner". Of course the answer is, "but I probably wouldn't have listened anyway", which is something that can be worked on as well. (2hrs)
Rationalist's guide to dating – Attraction. Relationships. Acting on a known preference. Don't like unintelligent people? Don't try to date them. Think first, then act - and iteratively experiment; an exercise in thinking hard about things before trying trial-and-error on the world. Think about places where you might meet the kinds of people you want to meet, then use strategies that go there instead of strategies that flop in the general direction of progress. (half written)
Training inherent powers (weights, temperatures, smells, estimation powers) – Practice makes perfect, right? Imagine if you always knew the temperature, the weight of things by lifting them, the composition of foods by tasting them, the distance between things without measuring. How can we train these skills, and how can we improve? Probably not inherently useful to life, but fun to train your system 1! (2hrs)
Strike to the heart of the question - the strongest one, not the one you want to defeat – Steelman, not strawman. Don't ask "how do I win at the question?"; ask "am I giving the best answer to the best question I can?". More poetic than anything else, this post would enumerate the feelings of victory and what not to feel victorious about, as well as trying to feel what it's like to be on the other side of the discussion from yourself, frustratingly trying to get a point across while a point is being flung at you. (2hrs)
How to approach a new problem – Similar to the "How to solve X" post, but with considerations for working backwards from a wicked problem, as well as trying "the least bad solution I know of", Murphy-jitsu, and known solutions to similar problems. Step 0: I notice I am approaching a problem.
Spices – Adventures in sensory experience land. I ran an event of spice-smelling/guessing for a group of 30 people. I wrote several documents in the process about spices and how to run the event. I want to publish these. As an exercise - it's a fun game of guess-the-spice.
Wing it VS Plan – The what, why, who, and what-you-should-do of the two. Some people seem to be the kind of person who is always just winging it. In contrast, some people make ridiculously complicated plans that work. Most of us are probably somewhere in the middle. I suggest that the more of a planner you can be the better, because you can always fall back on winging it, and you probably will. But if you don't have a plan and are already winging it, you can't fall back on the other option. This concept came to me while playing Ingress, which encourages you to plan your actions before you make them.
On-stage bias – The changes we make when we go onto a stage include extra makeup to adjust for the bright lights, and speaking louder to adjust for the audience, which is far away. When we consider the rest of our lives, maybe we want to appear specifically X (i.e. confident, friendly), so we should change ourselves to suit the natural skews in how we present based on the "stage" we are appearing on. Appear as the person you want to appear as, not the person you naturally appear as.
Creating a workspace – considerations when thinking about a “place” of work, including desk, screen, surrounding distractions, and basically any factors that come into it. Similar to how the very long list of sleep maintenance suggestions covers environmental factors in your sleep environment but for a workspace.
Posts added to the list since then
Doing a cost|benefit analysis - This is something we rely on when enumerating the options and choices ahead of us, but something I have never explicitly looked into. Some costs that can get overlooked include: Time, Money, Energy, Emotions, Space, Clutter, Distraction/Attention, Memory, Side effects, and probably more. I'd like to see a How to X guide for CBA. (wikipedia)
Extinction learning at home - A cross between intermittent reward (the worst kind of addiction) and what we know about extinguishing it, then applying that to "convincing" yourself to extinguish bad habits through experiential learning. Uses the CFAR internal Double Crux technique: precommit yourself to a challenge, for example, "If I scroll through 20 facebook posts in a row and they are all not worth my time, I will be convinced that I should spend less time on facebook because it's not worth my time." Adjust 20 to whatever position your double crux believes to be true, then run a test and iterate. You have to genuinely agree with the premise before running the test. This can work for a number of entrenched habits which you want to extinguish. (new idea as at the writing of this post)
How to write a dating ad - A suggestion to include information that is easy to ask questions about (this is hard). For example, don't write "I like camping"; write "I like hiking overnight with my dog", giving away details in a way that makes them worth inquiring about. The same reasoning explains why writing "I'm a great guy" is really not going to get people to believe you, as opposed to demonstrating the claim. (show, don't tell)
How to give yourself aversions - an investigation into aversive actions and potentially how to avoid collecting them when you have a better understanding of how they happen. (I have not done the research and will need to do that before publishing the post)
How to give someone else an aversion - similar to above, we know we can work differently to other people, and at the intersection of that is a misunderstanding that can leave people uncomfortable.
Lists - Creating lists is a great thing. Currently in draft: some considerations about what lists are, what they do, what they are used for, what they can be used for, where they come in handy, and the suggestion that you should use lists more. (also some digital list-keeping solutions)
Choice to remember the details - this stems from choosing to remember names, a point in the conversation where people sometimes tune out. As a mindfulness concept you can choose to remember the details. (short article, not exactly sure why I wanted to write about this)
What is a problem - On the path of problem solving, understanding what a problem is will help you understand how to attack it. Nothing more complicated than this picture to explain it: the barrier is the problem. This doesn't seem important on its own, but as a foundation for thinking about problems it's good to have sitting around somewhere.
How to/not attend a meetup - For anyone who has never been to a meetup, and anyone who wants good tips on etiquette for being the new guy in a room of friends. First meetup: shut up and listen; try not to have too much of an impact on the existing meetup group, or you might misunderstand the culture.
Noticing the world, repercussions, and taking advantage of them - There are regularly world events that I notice: things like the Olympics, Pokémon Go coming out, the (recent) SpaceX rocket failure. I try to notice when big events happen and think about how to take advantage of the event or the repercussions it causes. Motivated by thinking not only about all the Olympians (and the fuss leading up to the Olympics), but about all the people at home who signed up to a gym because of the publicity of the competitive sport. If only I could get in on the profit of gym signups...
Least-good but only solution I know of - So you know of a solution, but it's rubbish, or probably is, and you have no better solutions. Treat this solution as the best solution you have (because it is) and start implementing it; as you do that, keep looking for other solutions. At least you have a solution to work with!
Self-management thoughts - When you ask yourself, "am I making progress?", "do I want to be in this conversation?" and other self management thoughts. And an investigation into them - it's a CFAR technique but their writing on the topic is brief. (needs research)
Instrumental supply-hoarding behaviour - A discussion about the benefits of hoarding supplies for future use, covering also which supplies are a good idea to store and which are not. Maybe this will be useful for people who store things for later days, and hopefully help to consolidate and add some purposefulness to their process.
List of subgroups that I have tried - Before running my local lesswrong group I partook in a great deal of other groups. This is meant as a list with comments on each group.
If you have nothing to do – make better tools for use when real work comes along - This was probably going to be a poetic style motivation post about exactly what the title suggests. Be Prepared.
What other people are good at (as support) - When reaching out for support, some people will be good at things that others are not: for example, emotional support, time to spend on each other, or ideas for solving your problems. Thinking about this can make your strategies for solving your problems easier to manage. Knowing what works and what does not, and what you can reliably expect when you reach out to particular people for support, is going to supercharge your fulfilment of those needs.
Focusing - An already-written guide to Eugene Gendlin's Focusing technique, which needs polishing before publishing. The short form: treat your system 1 as a very powerful machine that understands your problems and their solutions more than you do; use your system 2 to ask it questions and see what it returns.
Rewrite: how to become a 1000 year old vampire - I got as far as breaking down this post and got stuck at draft form before rewriting. Might take another stab at it soon.
Should you tell people your goals? - This thread in a post. In summary: It depends on the environment, the wrong environment is actually demotivational, the right environment is extra motivational.
Meta: this took around 4 hours to write up, which is ridiculously longer than usual. I noticed myself taking a substantial number of breaks - not sure if that relates to the difficulty of creating so many summaries or just to me today. Still, this experiment might help my future writing focus/direction, so I figured I would try it out. If you see an idea of particularly high value, I will be happy to try to cover it in more detail.
I have compiled many suggestions about the future of lesswrong into a document here:
It's long and best formatted there.
In case you hate leaving this website here's the summary:
There are 3 main areas that are going to change.
Technical/Direct Site Changes
New home page
New forum style with subdivisions
New sub for "friends of lesswrong" (rationality in the diaspora)
New tagging system
New karma system
Social and cultural changes
Positive culture; a good place to be.
Pillars of good behaviours (the ones we want to encourage)
Demonstrate by example
3 levels of social strategies (new, advanced and longtimers)
Content (emphasis on producing more rationality material)
For up-and-coming people to write more
For the community to improve their contributions to create a stronger collection of rationality
For known existing writers
To encourage them to keep contributing
To encourage them to work together with each other to contribute
Why change LW?
Lesswrong has gone through great times of growth and seen a lot of people share a lot of positive and brilliant ideas. It was hailed as a launchpad for MIRI, and in that purpose it was a success; at this point it's no longer needed as a launchpad. While becoming a launchpad it also became a nice garden to hang out in on the internet: a place for reasonably intelligent people to discuss reasonable ideas and challenge each other to update their beliefs in light of new evidence. Since its retirement from the "launchpad" purpose, various people have felt that the garden has wilted and decayed and weeds have grown over it. In light of this, and having enough personal motivation, I have decided I really like the garden, and I can bring it back! I just need a little help, a little magic, and some little changes. If possible I hope for the garden that we all want it to be: a great place for amazing ideas and life-changing discussions to happen.
How will we know we have done well (the feel of things)
Success is going to have to be estimated by changes to the feel of the site. Unfortunately that is hard to do: as we know, outrage generates more volume than positive growth, which is going to work against us when we try to quantify by measurable metrics. Assuming the technical changes are made, there is still going to be progress needed on the task of socially improving things. There are many "seasoned active users" - as well as "seasoned lurkers" - who have strong opinions on the state of lesswrong and the discussion. Some would say that we risk dying of niceness; others would say that the weeds that need pulling are the rudeness.
Honestly, we risk over-policing and under-policing at the same time. There will be some not-niceness that goes unchecked and discourages the growth of future posters (potentially our future bloggers), and at the same time some niceness that rewards trolling behaviour or fails to weed out bad content, which would leave us as fluffy as the next forum. There is no easy solution to tempering both sides of this challenge. I welcome all suggestions (it looks like a karma system is our best bet).
In the meantime I believe leaning towards general niceness and steelmanning should be the direction of movement. I hope to enlist some members as, essentially, coaches in healthy forum-growth behaviour: good steelmanning, positive encouragement, critical feedback as well as encouragement, a welcoming committee, and an environment of content improvement and growth.
At the same time I want everyone to keep up the heavy debate; I also want to see the best versions of ourselves coming out onto the publishing pages (and sometimes that can be the second-draft versions).
So how will we know? By reducing the ugh fields around participating in LW, by seeing more content that enough people care about, and by making lesswrong awesome.
The full document is just over 11 pages long. Please go read it, this is a chance to comment on potential changes before they happen.
Meta: This post took a very long time to pull together. I read over 1000 comments and considered the ideas contained there. I don't have an accurate account of how long this took to write, but I would estimate over 65 hours of work has gone into putting it together. It's been literally weeks in the making; I really can't stress how long I have been trying to put this together.
If you want to help, please speak up so we can help you help us. If you want to complain, keep it to yourself.
Thanks to the slack for keeping up with my progress and Vanvier, Mack, Leif, matt and others for reviewing this document.
As usual - My table of contents
The article may be gated. (I have a subscription through my school.)
So what is the secret of looking into the future? Initial results from the Good Judgment Project suggest the following approaches. First, some basic training in probabilistic reasoning helps to produce better forecasts. Second, teams of good forecasters produce better results than good forecasters working alone. Third, actively open-minded people prosper as forecasters.
But the Good Judgment Project also hints at why so many experts are such terrible forecasters. It’s not so much that they lack training, teamwork and open-mindedness – although some of these qualities are in shorter supply than others. It’s that most forecasters aren’t actually seriously and single-mindedly trying to see into the future. If they were, they’d keep score and try to improve their predictions based on past errors. They don’t.
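The scorekeeping described above is concretely simple: the Good Judgment Project evaluated forecasters with the Brier score, the mean squared error between probability forecasts and binary outcomes. A minimal sketch (the forecast numbers below are made up for illustration):

```python
# Brier score: mean squared error between probability forecasts and
# binary outcomes (0 or 1). Lower is better; 0 is a perfect record.
def brier_score(forecasts, outcomes):
    """forecasts: probabilities in [0, 1]; outcomes: 0 or 1."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical forecasts for three yes/no events, of which only the
# first actually happened.
overconfident = brier_score([0.9, 0.9, 0.9], [1, 0, 0])
calibrated = brier_score([0.6, 0.3, 0.2], [1, 0, 0])

# The hedged, calibrated forecaster beats the confident-but-wrong one.
assert calibrated < overconfident
```

Keeping a running tally like this against one's own past predictions is exactly the feedback loop the article says most pundits skip.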
Dear effective altruist,
have you considered artificial utility monsters as a high-leverage form of altruism?
In the traditional sense, a utility monster is a hypothetical being which gains so much subjective wellbeing (SWB) from marginal input of resources that any other form of resource allocation is inferior on a utilitarian calculus. (as illustrated on SMBC)
This has been used to show that utilitarianism is not as egalitarian as it intuitively may appear, since it prioritizes some beings over others rather strictly - including over humans.
The traditional utility monster is implausible even in principle - it is hard to imagine a mind that is constructed such that it will not succumb to diminishing marginal utility from additional resource allocation. There is probably some natural limit on how much SWB a mind can implement, or at least how much this can be improved by spending more on the mind. This would probably even be true for an algorithmic mind that can be sped up with faster computers, and there are probably limits to how much a digital mind can benefit in subjective speed from the parallelization of its internal subcomputations.
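The diminishing-returns point can be made concrete with a toy model. Here SWB is assumed to grow logarithmically with resources; the logarithmic form is purely an illustrative assumption, not a claim about how real minds work:

```python
import math

# Toy model (illustrative assumption): subjective wellbeing (SWB)
# grows logarithmically with the resources allocated to a mind.
def swb(resources):
    return math.log(1 + resources)

# Marginal SWB from each successive unit of resources.
marginal_gains = [swb(r + 1) - swb(r) for r in range(5)]

# The gains shrink monotonically - the opposite of what a classic
# utility monster would require to justify unbounded allocation.
assert all(a > b for a, b in zip(marginal_gains, marginal_gains[1:]))
```

Under any concave wellbeing curve like this, a utilitarian calculus eventually favors spreading resources across many minds rather than feeding one, which motivates the broadened, parallel definition in the next paragraph.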
However, we may broaden the traditional definition somewhat and call any technology utility-monstrous if it implements high SWB with exceptionally good cost-effectiveness and in a scalable form - even if this scalability stems from a larger set of minds running in parallel, rather than one mind feeling much better or living much longer per additional joule/dollar.
Under this definition, it may be very possible to create and sustain many artificial minds reliably and cheaply, while they all have a very high SWB level at or near subsistence. An important point here is that the possible peak intensities of artificially implemented pleasures could be far higher than those commonly found in evolved minds: our worst pains seem more intense than our best pleasures for evolutionary reasons - but the same does not have to be true for artificial sentience, whose best pleasures could be even more intense than our worst agony, without any need for suffering anywhere near that strong.
If such technologies can be invented - which seems highly plausible in principle, if not yet in practice - then the original conclusion for the utilitarian calculus is retained: It would be highly desirable for utilitarians to facilitate the invention and implementation of such utility-monstrous systems and allocate marginal resources to subsidize their existence. This makes it a potential high-value target for effective altruism.
Many tastes, many utility monsters
Human motivation is barely stimulated by abstract intellectual concepts, and "utilitronium" sounds more like "aluminium" than something to desire or empathize with. Consequently, the idea is as sexy as a brick. "Wireheading" evokes associations of having a piece of metal rammed into one's head, which is understandably unattractive to any evolved primate (unless it's attached to an iPod, which apparently makes it okay).
Technically, "utility monsters" suffer from a similar association problem, which is that the idea is dangerous or ethically monstrous. But since the term is so specific and established in ethical philosophy, and since "monster" can at least be given an emotive and amicable - almost endearing - tone, it seems realistic to use it positively. (Suggestions for a better name are welcome, of course.)
So a central issue for the actual implementation and funding is human attraction. It is more important to motivate humans to embrace the existence of utility monsters than it is for them to be optimally resource-efficient - after all, a technology that is never implemented or funded properly gains next to nothing from being efficient.
A compromise between raw efficiency of SWB per joule/dollar and better forms to attract humans might be best. There is probably a sweet spot - perhaps various different ones for different target groups - between resource-efficiency and attractiveness. Only die-hard utilitarians will actually want to fund something like hedonium, but the rest of the world may still respond to "The Sims - now with real pleasures!", likeable VR characters, or a new generation of reward-based Tamagotchis.
Once we step away somewhat from maximum efficiency, the possibilities expand drastically. Implementation forms may be:
- decorative like gimmicks or screensavers,
- fashionable like sentient wearables,
- sophisticated and localized like works of art,
- cute like pets or children,
- personalized like computer game avatars retiring into paradise,
- erotic like virtual lovers who continue to have sex without the user,
- nostalgic like digital spirits of dead loved ones in artificial serenity,
- crazy like hyperorgasmic flowers,
- semi-functional like joyful household robots and software assistants,
- and of course generally a wide range of human-like and non-human-like simulated characters embedded in all kinds of virtual narratives.
Possible risks and mitigation strategies
Open-source utility monsters could be made public as templates, adding a layer of control that the implementation of sentience is correct and positive, and making better variations easy to explore. However, this would come with the downside of potential for malicious abuse and reckless harm. Risks of suffering could come from artificial unhappiness desired by users, e.g. for narratives that contain sadism, dramatic violence, or punishment of evil characters for quasi-moral gratification. Another such risk could come simply from bad local modifications that implement suffering by accident.
Despite these risks, one may hope that most humans who care enough to run artificial sentience are more benevolent and careful than malevolent and careless in a way that causes more positive SWB than suffering. After all, most people love their pets and do not torture them, and other people look down on those who do (compare this discussion of Norn abuse, which resulted in extremely hostile responses). And there may be laws against causing artificial suffering. Still, this is an important point of concern.
Closed-source utility monsters may further mitigate some of this risk by not making the sentient phenotypes directly available to the public, but encapsulating their internal implementation within a well-defined interface - like a physical toy or closed-source software that can be used and run by private users, but not internally manipulated beyond a well-tested state-space without hacking.
An extremely cautionary approach would be to run the utility monsters by externally controlled dedicated institutions and only give the public - such as voters or donors - some limited control over them through communication with the institution. For instance, dedicated charities could offer "virtual paradises" to donors so they can "adopt" utility monsters living there in certain ways without allowing those donors to actually lay hands on their implementation. On the other hand, this would require a high level of trustworthiness of the institutions or charities and their controllers.
Not for the sake of utility monsters alone
Human values are complex, and it has been argued on LessWrong that the resource allocation of any good future should not be spent for the sake of pleasure or happiness alone. As evolved primates, we all have more than one intuitive value we hold dear, even among self-identified intellectual utilitarians, who compose only a tiny fraction of the population.
However, some discussions in the rationalist community touching on related technologies like pleasure wireheading, utilitronium, and so on have suffered from implausible or orthogonal assumptions and associations. Since the utilitarian calculus favors SWB maximization above all else, it has been feared that we run the risk of losing a more complex future because
a) utilitarianism knows no compromise and
b) the future will be decided by one winning singleton who takes it all and
c) we have only one world with only one future to get it right
In addition, low status has been ascribed to wireheads, with the association of fake utility or cheating life as a form of low-status behavior. People have been competing for status by associating themselves with the miserable Socrates instead of the happy pig, without actually giving up real option value in their own lives.
On Scott Alexander's blog, there's a good example of a mostly pessimistic view both in the OP and in the comments. And in this comment on an effective altruism critique, Carl Shulman names hedonistic utilitarianism turning into a bad political ideology similar to communist states as a plausible failure mode of effective altruism.
So, will we all be killed by a singleton who turns us into utilitronium?
Be not afraid! These fears are plausibly unwarranted because:
a) Utilitarianism is consequentialism, and consequentialists are opportunistic compromisers - even within the conflicting impulses of their own evolved minds. The number of utilitarians who would accept existential risk for the sake of pleasure maximization is small, and practically all of them subscribe to the philosophy of cooperative compromise with orthogonal, non-exclusive values in the political marketplace. Those who don't are incompetent almost by definition and will never gain much political traction.
b) The future may very well be decided not by one singleton but by a marketplace of competing agents. Building a singleton is hard and requires the strict subjugation or absorption of all competition. Even if it were to succeed, the singleton would probably not implement only one human value, since it will be created by many humans with complex values, or at least it will have to make credible concessions to a critical mass of humans with diverse values who can stop it before it reaches singleton status. And if these mitigating assumptions are all false and a fooming singleton is possible and easy, then too much pleasure should be the least of humanity's worries - after all, in this case the Taliban, the Chinese government, the US military or some modern King Joffrey are just as likely to get the singleton as the utilitarians.
c) There are plausibly many Everett branches and many Hubble volumes like ours, implementing more than one future-earth outcome, as summed up by Max Tegmark here. Even if infinitarian multiverse theories should all end up false against current odds, a very large finite universe would still be far more realistic than a small one, given our physical observations. This makes a pre-existing value diversity highly probable if not inevitable. For instance, if you value pristine nature in addition to SWB, you should accept the high probability of many parallel earth-like planets with pristine nature regardless of what you do, and consider that we may be in an exceptional minority position to improve the measure of other values that do not naturally evolve easily, such as a very high positive-SWB-over-suffering surplus.
From the present, into the future
If we accept the conclusion that utility-monstrous technology is a high-value vector for effective altruism (among others), then what could current EAs do as we transition into the future? To my best knowledge, we don't have the capacity yet to create artificial utility monsters.
However, foundational research in neuroscience and artificial intelligence/sentience theory is already ongoing today and certainly a necessity if we ever want to implement utility-monstrous systems. In addition, outreach and public discussion of the fundamental concepts is also possible and plausibly high-value (hence this post). Generally, the following steps seem all useful and could use the attention of EAs, as we progress into the future:
- spread the idea, refine the concepts, apply constructive criticism to all its weak spots until it becomes either solid or revealed as irredeemably undesirable
- identify possible misunderstandings, fears, biases etc. that may reduce human acceptance and find compromises and attraction factors to mitigate them
- fund and do the scientific research that, if successful, could lead to utility-monstrous technologies
- fund the implementation of the first actual utility monsters and test them thoroughly, then improve on the design, then test again, etc.
- either make the templates public (open-source approach) or make them available for specialized altruistic institutions, such as private charities
- perform outreach and fundraising to give existence donations to as many utility monsters as possible
All of this can be done without much self-sacrifice on the part of any individual. And all of this can be done within existing political systems, existing markets, and without violating anyone's rights.
I just found this on slashdot:
This report emerges from the Pew Research Center’s efforts to understand public attitudes about a variety of scientific and technological changes being discussed today. The time horizons of these technological advances span from today’s realities—for instance, the growing prevalence of drones—to more speculative matters such as the possibility of human control of the weather.
This is interesting, especially in comparison to the recent posts on forecasting, which focused on expert forecasts.
What I found most notable was the public opinion on their use of future technology:
% who would do the following if possible...
50% ride in a driverless car
26% use brain implant to improve memory or mental capacity
20% eat meat grown in a lab
Don't they know Eutopia is Scary? I'd guess that if these technologies really become available and are reliable, only the elderly will be unable to overcome their preconceptions. And everybody will eat artificial meat if it is cheaper, healthier, and tastes the same (and the testers confirm this).
For those who may be interested in these things, here are the links to all the FHI's technical reports.
Global Catastrophic Risks Survey: At the Global Catastrophic Risk Conference in Oxford (17‐20 July, 2008) an informal survey was circulated among participants, asking them to make their best guess at the chance that there will be disasters of different types before 2100. This report summarizes the main results.
Record of the Workshop on Policy Foresight and Global Catastrophic Risks: On 21 July 2008, the Policy Foresight Programme, in conjunction with the Future of Humanity Institute, hosted a day-long workshop on “Policy Foresight and Global Catastrophic Risks” at the James Martin 21st Century School at the University of Oxford. This document provides a record of the day’s discussion.
Whole Brain Emulation: a Roadmap: This report aims at providing a preliminary roadmap for Whole Brain Emulations (possible future one‐to‐one modelling of the function of the human brain), sketching out key technologies that would need to be developed or refined, and identifying key problems or uncertainties.
Utility Indifference: A utility-function-based method for making an Artificial Intelligence indifferent to certain facts or states of the world, which can be used to make certain security precautions more successful.
Machine Intelligence Survey: At the FHI Winter Intelligence conference on machine intelligence 16/1 2011 an informal poll was conducted to elicit the views of the participants on various questions related to the emergence of machine intelligence. This report summarizes the results.
Indefinite Survival through Backup Copies: Continually copying yourself may help you preserve yourself from destruction. As long as the copies' fates are independent, increasing the number of copies at a logarithmic rate is enough to ensure a non-zero probability of surviving forever. The model is of more general use for many similar processes.
Anthropics: why Probability isn’t enough: This report argues that the current treatment of anthropic and self-locating problems over-emphasises the importance of anthropic probabilities, and ignores other relevant and important factors, such as whether the various copies of the agents in question consider that they are acting in a linked fashion and whether they are mutually altruistic towards each other. These help to reinterpret the decisions, rather than probabilities, as the fundamental objects of interest in anthropic problems.
Nash equilibrium of identical agents facing the Unilateralist's Curse: This report is an addendum to the 'Unilateralist's Curse' of Nick Bostrom, Thomas Douglas and Anders Sandberg. It demonstrates that if there are identical agents facing a situation where any one of them can implement a policy unilaterally, then the best strategies they can implement are also Nash equilibria.
AI arms race: A simple model of an AI arms race (though it can be generalised). Some of the insights are obvious - that the competing teams are more likely to take safety precautions if there are not too many of them, if they agree with each other's values, and if skill is more important than risk-taking in developing a functioning AI. But one result is surprising: teams are most likely to take risks if they know the capabilities of their own team or their opponents'. In this case, the less you know, the safer you'll behave.
Please cite these reports as:
- Sandberg, A. & Bostrom, N. (2008): “Global Catastrophic Risks Survey”, Technical Report #2008-1, Future of Humanity Institute, Oxford University: pp. 1-5.
- Tickell, C. et al. (2008): “Record of the Workshop on Policy Foresight and Global Catastrophic Risks”, Technical Report #2008-2, Future of Humanity Institute, Oxford University: pp. 1-19.
- Sandberg, A. & Bostrom, N. (2008): “Whole Brain Emulation: a Roadmap”, Technical Report #2008-3, Future of Humanity Institute, Oxford University: pp. 1-130.
- Armstrong, S. (2010): “Utility Indifference”, Technical Report #2010-1, Future of Humanity Institute, Oxford University: pp. 1-5.
- Sandberg, A. & Bostrom, N. (2011): “Machine Intelligence Survey”, Technical Report #2011-1, Future of Humanity Institute, Oxford University: pp. 1-12.
- Sandberg, A. & Armstrong, S. (2012): “Indefinite Survival through Backup Copies”, Technical Report #2012-1, Future of Humanity Institute, Oxford University: pp. 1-5.
- Armstrong, S. (2012): “Anthropics: why Probability isn’t enough”, Technical Report #2012-2, Future of Humanity Institute, Oxford University: pp. 1-10.
- Armstrong, S. (2012): “Nash equilibrium of identical agents facing the Unilateralist's Curse”, Technical Report #2012-3, Future of Humanity Institute, Oxford University: pp. 1-5.
- Armstrong, S. & Bostrom, N. & Shulman, C. (2013): “Racing to the precipice: a model of artificial intelligence development”, Technical Report #2013-1, Future of Humanity Institute, Oxford University: pp. 1-8.
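The backup-copies result summarized above can be illustrated with a quick simulation. This is a minimal sketch under assumed parameters (the destruction probability p = 0.5, the growth constant 3, and the floor of 5 copies are all illustrative choices, not taken from the report): each copy is independently destroyed with probability p per time step, and a lineage goes extinct the first time every existing copy is destroyed at once.

```python
import math

def survival_prob(n_copies, steps, p=0.5):
    """Probability that at least one copy survives every one of `steps`
    time steps, when n_copies(t) copies exist at step t and each copy
    is independently destroyed with probability p per step."""
    prob = 1.0
    for t in range(1, steps + 1):
        # 1 - p**n is the chance that not all copies die at step t
        prob *= 1.0 - p ** n_copies(t)
    return prob

# With a constant number of copies, the survival probability decays to zero.
constant = survival_prob(lambda t: 5, 10**5)

# With logarithmically growing copies, the infinite product converges,
# leaving a non-zero probability of surviving forever.
logarithmic = survival_prob(lambda t: max(5, math.ceil(3 * math.log(t + 1))),
                            10**5)
```

The intuition: with n(t) ≈ 3·ln t copies and p = 0.5, the per-step extinction probability is roughly t^(−3·ln 2) ≈ t^(−2.08), whose sum over t converges, so the product of per-step survival probabilities stays bounded above zero.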
Some people like to assume that the cosmos is ours for the taking, even though this could make us special to the order of 1 in 10^80. The argument is that the cosmos could be transformed by technology - engineered on astronomical scales - but hasn't been thus transformed.
The most common alternative hypothesis is that "we are in a simulation". Perhaps we are. But there are other possibilities too.
One is that technological life usually destroys, not just its homeworld, but its whole bubble of space-time, by using high-energy physics to cause a "vacuum decay", in which physics changes in a way that makes space uninhabitable. For example, the mass of an elementary particle is essentially equal to the energy density of the Higgs field, times a quantity called a "yukawa coupling". If the Higgs field increased its energy density by orders of magnitude, but the yukawas stayed the same, matter as we know it would be destroyed, everywhere that the change spread.
Here I want to highlight a different possibility. The idea is that the universe contains very large lifeforms and very small lifeforms. We are among the small. The large ones are, let's say, mostly dark matter, galactic in scale, and stars and planets for them are like biomolecules for us; tiny functional elements which go together to make up the whole. And - the crucial part - they have immune systems which automatically crush anything which interferes with the natural celestial order.
This is why the skies are full of untamed stars rather than Dyson spheres - any small life which begins to act on that scale is destroyed by dark-matter antibodies. And it explains anthropically why you're human-size rather than galactic-size: small life is more numerous than large life, just not so numerous as cosmic colonization would imply.
Two questions arise - how did large life evolve, and, shouldn't anthropics favor universes which have no large life, just space-colonizing small life? I could spin a story about cosmological natural selection, and large life which uses small life to reproduce, but it doesn't really answer the second question, in particular. Still, I feel that this is a huge unexplored topic - the anthropic consequences of "biocosmic" ecology and evolution - and who knows what else is lurking here, waiting to be discovered?
I recently gave a talk at the IARU Summer School on the Ethics of Technology.
In it, I touched on many of the research themes of the FHI: the accuracy of predictions, the limitations and biases of predictors, the huge risks that humanity may face, the huge benefits that we may gain, and the various ethical challenges that we'll face in the future.
Nothing really new for anyone who's familiar with our work, but some may enjoy perusing it.
In When Will AI Be Created?, I named four methods that might improve our forecasts of AI and other important technologies. Two of these methods were explicit quantification and leveraging aggregation, as exemplified by IARPA's ACE program, which aims to “dramatically enhance the accuracy, precision, and timeliness of… forecasts for a broad range of event types, through the development of advanced techniques that elicit, weight, and combine the judgments of many analysts.”
DAGGRE will continue, but it will transition from geo-political forecasting to science and technology (S&T) forecasting to better use its combinatorial capabilities. We will have a brand new shiny, friendly and informative interface co-designed by Inkling Markets, opportunities for you to provide your own forecasting questions and more!
Another exciting development is that our S&T forecasting prediction market will be open to everyone in the world who is at least eighteen years of age. We’re going global!
If you want to help improve humanity’s ability to forecast important technological developments like AI, please register for DAGGRE’s new S&T prediction website here.
Experienced PredictionBook veterans should do well.
I think that awesome stuff will happen in the far future and I plan on getting there, so I'll do my best to make sure that I stay alive as long as I can. (Also, my primitive survival instincts make me want to become immortal.) Unfortunately, due to my evolutionary baggage, my own genes are going to kill me in a few decades.
What's your hypothetical apostasy and how do you plan to put it in practice?
Edit #1: If you're downvoting this article, I'd like to know why you're doing that. Send me a message or reply here.
Edit #2: I totally misunderstood what the hypothetical apostasy means. I was under the impression that it meant defending a view that most people deem too weird to contemplate. See Lark's explanation. I guess you should downvote this article!
Meta: Inspired by The Least Convenient Possible World I asked the person who most criticized my previous posts for help on writing a new one, since that seemed very inconvenient, especially because the whole thing was already written. He agreed and suggested I begin by posting only a part of it here, and wait for the comments to further change the rest of the text. So here is the beginning and one section, and we'll see how it goes from there. I have changed the title to better reflect the only section presented here.
This post will be about how random events can preclude or steal attention from the goals you set up to begin with, and about how hormone fluctuation inclines people to change some of their goals with time. A discussion on how to act more usefully given that follows, taking into consideration the likelihood of a goal's success in terms of difficulty and length.
Through it I suggest a new bias, Avoid-Frustration bias, which is composed of a few others:
A Self-serving bias in which Loss aversion manifests by postponing one's goals, thus avoiding frustration through wishful thinking about far futures, big worlds, immortal lives, and in general, high numbers of undetectable utilons.
It can be thought of as a kind of Cognitive Dissonance, though Cognitive Dissonance doesn't do justice to the specific properties and details of how this kind, in particular, seems to me to have affected the lives of Less-Wrongers, Transhumanists and others. Probably in a good way, more on that later.
Sections will be:
What Significantly Changes Life's Direction (lists)
Long Term Goals and Even Longer Term Goals
Proportionality Between Goal Achievement Expected Time and Plan Execution Time
A Hypothesis On Why We Became Long-Term Oriented
Adapting Bayesian Reasoning to Get More Utilons
Time You Can Afford to Wait, Not to Waste
Reference Classes that May Be Avoid-Frustration Biased
The Road Ahead
[Section 4 is shown here]
4 A Hypothesis On Why We Became Long-Term Oriented
For anyone who has rejoiced in the company of the writings of Derek Parfit, George Ainslie, or Nick Bostrom, there are a lot of very good reasons to become more long-term oriented. I am here to ask you about those reasons: Is that your true acceptance?
It is not for me. I became longer-term oriented for different reasons. Two obvious ones are genetics expressing in me the kind of person that waits a year for the extra marshmallow while fantasizing about marshmallow worlds and rocking-horse pies, and secondly wanting to live thousands of years. But the one I'd like to suggest that might be relevant to some here is that I was very bad at making people who were sad or hurt happy. I was not, as they say, empathic. It was a piece of cake bringing folks from a neutral state to joy and bliss. But if someone got angry or sad, especially sad about something I did, I would be absolutely powerless about it. This is only one way of not being good with people, a people's person, etc. So my emotional system, like the tale's Big Bad Wolf, blew, and blew, and blew, until my utilons were comfortably sitting aside in the Far Future, where none of them could look back at my face, cry, and point to me as the cause of the tears.
Paradoxically, though understandably, I have since been thankful for that lack of empathy towards those near. In fact, I have claimed, where I forget, that it is the moral responsibility of those with less natural empathy of the giving-to-beggars kind to care about the far future, since so few are within this tiny psychological mindspace of being able to care abstractly while not caring that much visibly/emotionally. We are such a minority that foreign aid seems to be the thing that is most disproportional in public policy between countries (Savulescu, J - Genetically Enhance Humanity or Face Extinction, 2009 video). Just as the whole minority of billionaires ought to be more like Bill Gates, Peter Thiel and Jaan Tallinn, the minority of underempathic folk ought to be more like an economist doing quantitative analysis to save or help in quantitative ways.
So maybe your true acceptance of Longterm, like mine, was something like Genes + Death sucks + I'd rather interact with people of the future, whose bots in my mind smile, than with those actually meaty folk around me, with all their specific problems, complicated families and boring Christian relationship problems. This is my hypothesis. Even if true, notice it does not imply that Longterm isn't rational; after all, Parfit, Bostrom and Ainslie are still standing, even after careful scrutiny.
[Ideas on how to develop other sections are as appreciated as commentary on this one]
Excerpts from literature on robotic/self-driving/autonomous cars with a focus on legal issues, lengthy, often tedious; some more SI work. See also Notes on Psychopathy.
Having read through all this material, my general feeling is: the near-term future (1 decade) for autonomous cars is not that great. What's been accomplished, legally speaking, is great but more limited than most people appreciate. And there are many serious problems with penetrating the elaborate ingrown rent-seeking tangle of law & politics & insurance. I expect the mid-future (+2 decades) to look more like autonomous cars completely taking over many odd niches and applications where the user can afford to ignore those issues (eg. on private land or in warehouses or factories), with highways and regular roads continuing to see many human drivers with some level of automated assistance. However, none of these problems seem fatal and all of them seem amenable to gradual accommodation and pressure, so I am now more confident that in the long run we will see autonomous cars become the norm and human driving ever more niche (and possibly lower-class). On none of these am I sure how to formulate a precise prediction, though, since I expect lots of boundary-crossing and tertium quids. We'll see.
A while ago I wrote briefly on why the Singularity might not be near and my estimates badly off. I saw it linked the other day, and realized that pessimism seemed to be trendy lately, which meant I ought to work on why one might be optimistic instead: http://www.gwern.net/Mistakes#counter-point
(Summary: long-sought AI goals have been recently achieved, global economic growth & political stability continues, and some resource crunches have turned into surpluses - all contrary to long-standing pessimistic forecasts.)
A 9-person Australian company called Euclideon has a new software technology that blows all the previously-believed limitations of real-time rendering right out the window.
It really makes you appreciate the phrase "efficient use of resources." Their tech demo is mind-bogglingly impressive all by itself, but the further implications of what is actually possible with current computer hardware are reality shaking.
It really makes me wonder where else (besides AI) current technology is vastly undershooting its potential in a similar way, using brute force (successfully or unsuccessfully) to accomplish something when there's a vastly more efficient way to do it that nobody's thought of yet.
A dialogue discussing how thermodynamics limits future growth in energy usage, and that in turn limits GDP growth, from the blog Do the Math.
Physicist: Hi, I’m Tom. I’m a physicist.
Economist: Hi Tom, I’m [ahem..cough]. I’m an economist.
Physicist: Hey, that’s great. I’ve been thinking a bit about growth and want to run an idea by you. I claim that economic growth cannot continue indefinitely.
Economist: [chokes on bread crumb] Did I hear you right? Did you say that growth can not continue forever?
Physicist: That’s right. I think physical limits assert themselves.
Economist: Well sure, nothing truly lasts forever. The sun, for instance, will not burn forever. On the billions-of-years timescale, things come to an end.
Physicist: Granted, but I’m talking about a more immediate timescale, here on Earth. Earth’s physical resources—particularly energy—are limited and may prohibit continued growth within centuries, or possibly much shorter depending on the choices we make. There are thermodynamic issues as well.
I think this is quite relevant to many of the ideas of futurism (and economics) that we often discuss here on Less Wrong. They address the concepts related to levels of civilization and mind uploading. Colonization of space is dismissed by both parties, at least for the sake of the discussion. The blog author has another post discussing his views on its implausibility; I find it to be somewhat limited in its consideration of the issue, though.
He has also detailed the calculations whose results he describes in this dialogue in a few previous posts. The dialogue format will probably be a kinder introduction to the ideas for those less mathematically inclined.
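The physicist's point can be made concrete with a rough back-of-envelope calculation. The figures below are assumed round numbers, not taken from the dialogue: roughly 13 TW of current human power use, roughly 174 PW of total sunlight intercepted by Earth, and the historical ~2.3% annual growth rate in energy use.

```python
import math

current_power_w = 13e12   # assumed: total human power use, in watts
solar_input_w = 1.74e17   # assumed: total solar power hitting Earth, in watts
growth_rate = 0.023       # assumed: 2.3% annual growth in energy use

# Years until energy use would equal the entire solar input to Earth,
# if exponential growth continued at this rate.
years = math.log(solar_input_w / current_power_w) / math.log(1 + growth_rate)
```

Under these assumptions the answer comes out to roughly four centuries, which is why the physicist talks about limits asserting themselves "within centuries" rather than on billion-year timescales.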
We often assume that an AI will have an identity and goals of its own. That it will be some separate entity from a human being or group of humans.
In physics there are no separate entities, merely a function evolving through time. So any identity needs to be constructed by systems within physics, and the boundaries are arbitrary. We have been built by evolution, and all the cells in our body have the same programming, so we have a handy rule of thumb that our body is "us", as it is created by a single replicating complex. So we assume that a computational entity, if it develops a theory of self, will include only its processing elements or code and nothing else in its notion of identity. But what a system identifies with can be controlled and specified.
If a system identifies a human as an important part of itself it will strive to protect it and its normal functioning, as we instinctively protect important parts of ourselves such as the head and genitals.
The magazine has a bunch of articles dealing with what the world may be like 98,000 years hence. What with the local interest in the distant future, and with prediction itself, I thought I'd bring it to your attention.
von Neumann probes and Dyson spheres: what exploratory engineering can tell us about the Fermi paradox
Not entirely relevant to the main issues of lesswrong, but possibly still of interest: my talk entitled "von Neumann probes and Dyson spheres: what exploratory engineering can tell us about the Fermi paradox".
Abstract: The Fermi paradox is the contrast between the high estimate of the likelihood of extraterrestrial civilizations, and the lack of visible evidence of them. But what sort of evidence should we expect to see? This is what exploratory engineering can tell us, giving us estimates of what kind of cosmic structures are plausibly constructible by advanced civilizations, and what traces they would leave. Based on our current knowledge, it seems that it would be easy for such a civilization to rapidly occupy vast swathes of the universe in a visible fashion. There are game-theoretic reasons to suppose that they would do so. This leads to a worsening of the Fermi paradox, reducing the likelihood of "advanced but unseen" civilizations, even in other galaxies.
The slides from the talk can be found here (thanks, Luke!).
I've seen an interesting variety of utopian hopes expressed recently. Raemon's "Ritual" sequence of posts is working to affirm the viability of LW's rationalist-immortalist utopianism, not just in the midst of an indifferent universe, but in the midst of an indifferent society. Leverage Research turn out to be social-psychology utopians, who plan to achieve their world of optimality by unleashing the best in human nature. And Russian life-extension activist Maria Konovalenko just blogged about the difficulty of getting people to adopt anti-aging research as the top priority in life, even though it's so obvious to her that it should be.
This phenomenon of utopian hope - its nature, its causes, its consequences, whether it's ever realistic, whether it ever does any good - certainly deserves attention and analysis, because it affects, and even afflicts, a lot of people, on this site and far beyond. It's a vast topic, with many dimensions. All my examples above have a futurist tinge to them - an AI singularity, and a biotech society where rejuvenation is possible, are clearly futurist concepts; and even the idea of human culture being transformed for the better by new ideas about the mind, belongs within the same broad scientific-technological current of Utopia Achieved Through Progress. But if we look at all the manifestations of utopian hope in history, and not just at those which resemble our favorites, other major categories of utopia can be observed - utopia achieved by reaching back to the conditions of a Golden Age; utopia achieved in some other reality, like an afterlife.
The most familiar form of utopia these days is the ideological social utopia, to be achieved once the world is run properly, according to the principles of some political "-ism". This type of utopia can cut across the categories I have mentioned so far; utopian communism, for example, has both futurist and golden-age elements to its thinking. The new society is to be created via new political forms and new philosophies, but the result is a restoration of the human solidarity and community that existed before hierarchy and property... The student of utopian thought must also take note of religion, which, until technology, has been the main avenue through which humans have pursued their most transcendental hopes, like not having to die.
But I'm not setting out to study utopian thought and utopian psychology out of a neutral scholarly interest. I have been a utopian myself and I still am, if utopianism includes belief in the possibility (though not the inevitability) of something much better. And of course, the utopias that I have taken seriously are futurist utopias, like the utopia where we do away with death, and thereby also do away with a lot of other social and psychological pathologies, which are presumed to arise from the crippling futility of the universal death sentence.
However, by now, I have also lived long enough to know that my own hopes were mistaken many times over; long enough to know that sometimes the mistake was in the ideas themselves, and not just the expectation that everyone else would adopt them; and long enough to understand something of the ordinary non-utopian psychology, whose main features I would nominate as reconciliation with work and with death. Everyone experiences the frustration of having to work for a living and the quiet horror of physiological decline, but hardly anyone imagines that there might be an alternative, or rejects such a lifecycle as overall more bad than it is good.
What is the relationship between ordinary psychology and utopian psychology? First, the serious utopians should recognize that they are an extreme minority. Not only has the whole of human history gone by without utopia ever managing to happen, but the majority of people who ever lived were not utopians in the existentially revolutionary sense of thinking that the intolerable yet perennial features of the human condition might be overthrown. The confrontation with the evil aspects of life must usually have proceeded more at an emotional level - for example, terror that something might be true, and horror at the realization that it is true; a growing sense that it is impossible to escape; resignation and defeat; and thereafter a permanently diminished vitality, often compensated by achievement in the spheres of work and family.
The utopian response is typically made possible only because one imagines that there is a specific alternative to this process; and so, as ideas about alternatives are invented and circulated, it becomes easier for people to end up on the track of utopian struggle with life, rather than the track of resignation, which is why we can have enough people to form social movements and fundamentalist religions, and not just isolated weirdos. There is a continuum between full radical utopianism and very watered-down psychological phenomena which hardly deserve that name, but still have something in common - for example, a person who lives an ordinary life but draws some sustenance from the possibility of an afterlife of unspecified nature, where things might be different, and where old wrongs might be righted - but nonetheless, I would claim that the historically dominant temperament in adult human experience has been resignation to hopelessness and helplessness in ultimate matters, and an absorption in affairs where some limited achievement is possible, but which in themselves can never satisfy the utopian impulse.
The new factor in our current situation is science and technology. Our modern history offers evidence that the world really can change fundamentally, and such further explosive possibilities as artificial intelligence and rejuvenation biotechnology are considered possible for good, tough-minded, empirical reasons, not just because they offer a convenient vehicle for our hopes.
Technological utopians often exhibit frustration that their pet technologies and their favorite dreams of existential emancipation aren't being massively prioritized by society, and they don't understand why other people don't just immediately embrace the dream when they first hear about it. (Or they develop painful psychological theories of why the human race is ignoring the great hope.) So let's ask, what are the attitudes towards alleged technological emancipation that a person might adopt?
One is the utopian attitude: the belief that here, finally, one of the perennial dreams of the human race can come true. Another is denial: which is sometimes founded on bitter experience of disappointment, which teaches that the wise thing to do is not to fool yourself when another new hope comes up to you and cheerfully asserts that this time really is different. Another is to accept the possibility but deny the utopian hope. I think this is the most important interpretation to understand.
It is the one that precedent supports. History is full of new things coming to pass, but they have never yet led to utopia. So we might want to scrutinize our technological projections more closely, and see whether the utopian expectation is based on overlooking the downside. For example, let us contrast the idea of rejuvenation and the idea of immortality - not dying, ever. Just because we can take someone who is 80 and make them biologically 20 is not the same thing as making them immortal. It just means that they won't die of aging, and that when they do die, it will be in a way befitting someone 20 years old. They'll die in an accident, or a suicide, or a crime. Incidentally, we should also note an element of psychological unrealism in the idea of never wanting to die. Forever is a long time; the whole history of the human race is about 10,000 years long. Just 10,000 years is enough to encompass all the difficulties and disappointments and permutations of outlook that have ever happened. Imagine taking the whole history of the human race into yourself; living through it personally. It's a lot to have endured.
It would be unfair to say that transhumanists as a rule are dominated by utopian thinking. Perhaps just as common is a sort of futurological bipolar disorder, in which the future looks like it will bring "utopia or oblivion", something really good or something really bad. The conservative wisdom of historical experience says that both these expectations are wrong; bad things can happen, even catastrophes, but life keeps going for someone - that is the precedent - and the expectation of total devastating extinction is just a plunge into depression as unrealistic as the utopian hope for a personal eternity; both extremes exhibiting an inflated sense of historical or cosmic self-importance. The end of you is not the end of the world, says this historical wisdom; imagining the end of the whole world is your overdramatic response to imagining the end of you - or the end of your particular civilization.
However, I think we do have some reason to suppose that this time around, the extremes are really possible. I won't go so far as to endorse the idea that (for example) intelligent life in the universe typically turns its home galaxy into one giant mass of computers; that really does look like a case of taking the concept and technology with which our current society is obsessed, and projecting it onto the cosmic unknown. But consider just the humbler ideas of transhumanity, posthumanity, and a genuine end to the human-dominated era on Earth, whether in extinction or in transformation: the real and verifiable developments of science and technology, and the further scientific and technological developments which they portend, are enough to justify such a radical, if somewhat nebulous, concept of the possible future. And again, while I won't simply endorse the view that of course we shall get to be as gods, and shall get to feel as good as gods might feel, it seems reasonable to suppose that there are possible futures which are genuinely and comprehensively better than anything that history has to offer - as well as futures that are just bizarrely altered, and futures which are empty and dead.
So that is my limited endorsement of utopianism: In principle, there might be a utopianism which is justified. But in practice, what we have are people getting high on hope, emerging fanaticisms, personal dysfunctionality in the present, all the things that come as no surprise to a cynical student of history. The one outcome that would be most surprising to a cynic is for a genuine utopia to arrive. I'm willing to say that this is possible, but I'll also say that almost any existing reference to a better world to come, and any psychological state or social movement which draws sublime happiness from the contemplation of an expected future, has something unrealistic about it.
In this regard, utopian hope is almost always an indicator of something wrong. It can just be naivete, especially in a young person. As I have mentioned, even non-utopian psychology inevitably has those terrible moments when it learns for the first time about the limits of life as we know it. If in your own life you start to enter that territory for the first time, without having been told from an early age that real life is fundamentally limited and frustrating, and perhaps with a few vague promises of hope, absorbed from diverse sources, to sustain you, then it's easy to see your hopes as, not utopian hopes, but simply a hope that life can be worth living. I think this is the experience of many young idealists in "environmental" and "social justice" movements; their culture has always implied to them that life should be a certain way, without also conveying to them that it has never once been that way in reality. The suffering of transhumanist idealists and other radical-futurist idealists, when they begin to run aground on the disjunction between their private subcultural expectations and those of the culture at large, has a lot in common with the suffering of young people whose ideals are more conventionally recognizable; and it is entirely conceivable that for some generation now coming up, rebellion against biological human limitations will be what rebellion against social limitations has been for preceding generations.
I should also mention, in passing, the option of a non-utopian transhumanism, something far more common than my discussion so far would suggest. This is the choice of people who expect, not utopia, but simply an open future. Many cryonicists would be like this. Sure, they expect the world of tomorrow to be a great place, good enough that they want to get there; but they don't think of it as an eternal paradise of wish-fulfilment that may or may not be achieved, depending on heroic actions in the present. This is simply the familiar non-utopian view that life is overall worth living, combined with the belief that life can now be lived for much longer periods; the future not as utopia, but as more history, history that hasn't happened yet, and which one might get to personally experience. If I wanted to start a movement in favor of rejuvenation and longevity, this is the outlook I would be promoting, not the idea that abolishing death will cure all evils (and not even the idea that death as such can be abolished; rejuvenation is not immortality, it's just more good life). In the spectrum of future possibilities, it's only the issue of artificial intelligence which lends some plausibility to extreme bipolar futurism, the idea that the future can be very good (by human standards) or very bad (by human standards), depending on what sort of utility functions govern the decision-making of transhuman intelligence.
That's all I have to say for now. It would be unrealistic to think we can completely avoid the pathologies associated with utopian hope, but perhaps we can moderate them, if we pay attention to the psychology involved.
Coherent extrapolated volition (CEV) asks what humans would want, if they knew more - if their values reached reflective equilibrium. (I don't want to deal with the problems of whether there are "human values" today; for the moment I'll consider the more-plausible idea that a single human who lived forever could get smarter and closer to reflective equilibrium over time.)
This is appealing because it seems compatible with moral progress (see e.g., Muehlhauser & Helm, "The singularity and machine ethics", in press). Morality has been getting better over time, right? And that's because we're getting smarter, and closer to reflective equilibrium as we revise our values in light of our increased understanding, right?
This view makes three claims:
- Morality has improved over time.
- Morality has improved as a result of reflection.
- This improvement brings us closer to equilibrium over time.
There can be no evidence for the first claim, and the evidence is against the second two claims.
The following is the first draft of my efforts. It's about half as long as the original. It cuts out the section about the Shadowy Figure, which I'm slightly upset about, in particular because it would make the "beyond the reach of God" line stronger. But I felt like if I tried to include it at all, I had to include several paragraphs that took a little too long.
I attempted at first to convert it to a "true" poem (not rhyming, but going for a particular meter). I later decided that too much of it needed to have a conversational quality, so it's more of a short play than a poem. Lines are broken up in a particular way to suggest timing and make it easier to read out loud.
I wanted a) to share the results with people on the chance that someone else might want to perform a little six minute dialog (my test run clocked in at 6:42), and b) get feedback on how I chose to abridge things. Do you think there were important sections that can be tied in without making it too long? Do you think some sections that I reworded could be reworded better, or that I missed some?
Edit: I've addressed most of the concerns people had. I think I'm happy with it, at least for my purposes. If people are still concerned by the ending I'll revise it, but I think I've set it up better now.
The Gift We Give Tomorrow
How, oh how could the universe,
itself unloving, and mindless,
cough up creatures capable of love?
No mystery in that.
It's just a matter
of natural selection.
But natural selection is cruel. Bloody.
And bloody stupid!
Even when organisms aren't directly tearing at each other's throats…
…there's a deeper competition, going on between the genes.
A species could evolve to extinction,
if the winning genes were playing negative-sum games.
How could a process,
Cruel as Azathoth,
Create minds that were capable of love?
Mystery is a property of questions.
A mother's child shares her genes,
And so a mother loves her child.
But mothers can adopt their children.
And still, come to love them.
Still no mystery.
Evolutionary psychology isn't about deliberately maximizing fitness.
Through most of human history,
we didn't know genes existed.
Well, fine. But still:
Humans form friendships,
even with non-relatives.
How can that be?
Ancient hunter-gatherers would often play the Iterated Prisoner's Dilemma.
There could be profit in betrayal.
But the best solution:
was reciprocal altruism.
the most dangerous human is not the strongest,
or even the smartest:
But the one who has the most allies.
But not all friends are fair-weather friends;
there are true friends -
those who would sacrifice their lives for another.
Shouldn't that kind of devotion
remove itself from the gene pool?
You said it yourself:
We have a concept of true friendship and fair-weather friendship.
We wouldn't be true friends with someone who we didn't think was a true friend to us.
And one with many true friends?
They are far more formidable
than one with mere fair-weather allies.
And Mohandas Gandhi,
who really did turn the other cheek?
Those who try to serve all humanity,
whether or not all humanity serves them in turn?
That’s a more complex story.
Humans aren’t just social animals. We’re political animals.
Sometimes the formidable human is not the strongest,
but the one who skillfully argues that their preferred policies
match the preferences of others.
How does that explain Gandhi?
The point is that we can argue about 'What should be done?'
We can make those arguments and respond to them.
Without that, politics couldn't take place.
Okay... but Gandhi?
Believed certain complicated propositions about 'What should be done?'
Then did them.
That sounds suspiciously like it could explain any possible human behavior.
If we traced back the chain of causality,
through all the arguments...
We'd find a moral architecture.
The ability to argue abstract propositions.
A preference for simple ideas.
An appeal to hardwired intuitions about fairness.
A concept of duty. Aversion to pain.
Filtered by memetic selection,
all of this resulted in a concept:
"You should not hurt people,"
In full generality.
And that gets you Gandhi.
What else would you suggest?
Some godlike figure?
Reaching out from behind the scenes,
Hell no. But -
Because then I’d have to ask:
How that god originally decided that love was even desirable.
How it got preferences that included things like friendship, loyalty, and fairness.
Call it 'surprising' all you like.
But through evolutionary psychology,
You can see how parental love, romance, honor, even true altruism and moral arguments,
all bear the specific design signature of natural selection.
If there were some benevolent god, reaching out to create a world of loving humans,
it too must have evolved,
defeating the point of postulating it at all.
I'm not postulating a god!
I'm just asking how human beings ended up so nice.
Nice? Have you looked at this planet lately?
We bear all those other emotions that evolved as well.
Which should make it very clear that we evolved, should you begin to doubt it.
Humans aren't always nice.
But, still, come on...
doesn't it seem a little...
That nothing but millions of years of a cosmic death tournament…
could cough up mothers and fathers,
sisters and brothers,
husbands and wives,
true altruists and guardians of causes,
police officers and loyal defenders,
even artists, sacrificing themselves for their art?
All practicing so many kinds of love?
For so many things other than genes?
Doing their part to make their world less ugly,
something besides a sea of blood and violence and mindless replication?
Are you honestly surprised by this?
If so, question your underlying model.
For it's led you to be surprised by the true state of affairs.
Since the very beginning,
not one unusual thing has ever happened.
But how are you NOT amazed?
Maybe there’s no surprise from a causal viewpoint.
But still, it seems to me, in the creation of humans by evolution,
something happened that is precious and marvelous and wonderful.
If we can’t call it a physical miracle, then call it a moral miracle.
Because it was only a miracle from the perspective of the morality that was produced?
Explaining away all the apparent coincidence,
from a causal and physical perspective?
Well... yeah. I suppose you could interpret it that way.
I just meant that something was immensely surprising and wonderful on a moral level,
even if it's not really surprising,
on a physical level.
I think that's what I said.
It just seems to me that in your view, somehow you explain that wonder away.
I explain it.
Of course there's a story behind love.
Behind all ordered events, one finds ordered stories.
And that which has no story is nothing but random noise.
Hardly any better.
If you can't take joy in things with true stories behind them,
your life will be empty.
Love has to begin somehow.
It has to enter the universe somewhere.
It’s like asking how life itself begins.
Though you were born of your father and mother,
and though they arose from their living parents in turn,
if you go far and far and far away back,
you’ll finally come to a replicator that arose by pure accident.
The border between life and unlife.
So too with love.
A complex pattern must be explained by a cause
that’s not already that complex pattern. For love to enter the universe,
it has to arise from something that is not love.
If that weren’t possible, then love could not be.
Just as life itself required that first replicator,
to come about by accident,
but still caused:
far, far back in the causal chain that led to you:
3.8 billion years ago,
in some little tidal pool.
Perhaps your children's children will ask
how it is that they are capable of love.
And their parents will say:
Because we, who also love, created you to love.
And your children's children may ask: But how is it that you love?
And their parents will reply:
Because our own parents,
who loved as well,
created us to love in turn.
And then your children's children will ask:
But where did it all begin?
Where does the recursion end?
And their parents will say:
Once upon a time, long ago and far away,
there were intelligent beings who were not themselves intelligently designed.
Once upon a time, there were lovers,
created by something that did not love.
Once upon a time,
when all of civilization was a single galaxy,
A single star.
A single planet.
A place called Earth.
Ever So Long Ago.
(The following is a summary of some of my previous submissions that I originally created for my personal blog.)
...an intelligence explosion may have fair probability, not because it occurs in one particular detailed scenario, but because, like the evolution of eyes or the emergence of markets, it can come about through many different paths and can gather momentum once it gets started. Humans tend to underestimate the likelihood of such “disjunctive” events, because they can result from many different paths (Tversky and Kahneman 1974). We suspect the considerations in this paper may convince you, as they did us, that this particular disjunctive event (intelligence explosion) is worthy of consideration.
It seems to me that all the ways in which we disagree have more to do with philosophy (how to quantify uncertainty; how to deal with conjunctions; how to act in consideration of low probabilities) [...] we are not dealing with well-defined or -quantified probabilities. Any prediction can be rephrased so that it sounds like the product of indefinitely many conjunctions. It seems that I see the “SIAI’s work is useful scenario” as requiring the conjunction of a large number of questionable things [...]
— Holden Karnofsky, 6/24/11 (GiveWell interview with major SIAI donor Jaan Tallinn, PDF)
People associated with the Singularity Institute for Artificial Intelligence (SIAI) like to claim that the case for risks from AI is supported by years' worth of disjunctive lines of reasoning. This basically means that there are many reasons to believe that humanity is likely to be wiped out as a result of artificial general intelligence. More precisely, it means that not all of the arguments supporting that possibility need to be true: even if all but one are false, risks from AI are still to be taken seriously.
The idea of disjunctive arguments is formalized by what is called a logical disjunction. Consider two declarative sentences, A and B. In a disjunction, the conclusion is false only if both A and B are false; otherwise it is true. Truth values are usually denoted by 0 for false and 1 for true, and a disjunction is written with OR or ∨ as an infix operator. For example, if statement A is false (0) and statement B is true (1), then A∨B is still true (1), because the truth of B alone is sufficient to preserve the truth of the overall conclusion.
Generally there is no problem with disjunctive lines of reasoning as long as the conclusion itself is sound and therefore in principle possible, yet in demand of at least one of several causative factors to become actual. I don’t perceive this to be the case for risks from AI. I agree that there are many ways in which artificial general intelligence (AGI) could be dangerous, but only if I accept several presuppositions regarding AGI that I actually dispute.
By presuppositions I mean requirements that need to be true simultaneously (in conjunction). A logical conjunction is only true if all of its operands are true. In other words, a conclusion might require all of the arguments leading up to it to be true; otherwise it is false. A conjunction is denoted by AND or ∧.
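As a minimal sketch (using Python's bitwise operators over the 0/1 truth values described above), the asymmetry between the two connectives looks like this:

```python
from itertools import product

# Truth tables for disjunction (OR, ∨) and conjunction (AND, ∧),
# with 0 denoting false and 1 denoting true as in the text.
for a, b in product([0, 1], repeat=2):
    print(f"A={a} B={b}  A∨B={a | b}  A∧B={a & b}")

# A∨B is false only when both operands are false;
# A∧B is true only when both operands are true.
```

A disjunctive argument fails only if every disjunct fails; a conjunctive one fails as soon as any single conjunct does.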
Now consider the following prediction: <Mary is going to buy one of thousands of products in the supermarket.>
The above prediction can be framed as a disjunction: Mary is going to buy one of thousands of products in the supermarket 1.) if she is hungry, 2.) if she is thirsty, 3.) if she needs a new coffee machine. Only one of the three possible triggers needs to be true for the overall conclusion to hold - that Mary is going shopping. Or so it seems.
The same prediction can be framed as a conjunction: Mary is going to buy one of thousands of products in the supermarket 1.) if she has money, 2.) if she has some needs, 3.) if the supermarket is open. All three of the given factors need to be true to render the overall conclusion true.
That a prediction is framed to be disjunctive does not speak in favor of the possibility in and of itself. I agree that it is likely that Mary is going to visit the supermarket if I accept the hidden presuppositions. But a prediction is only at most as probable as its basic requirements. In this particular case I don’t even know if Mary is a human or a dog, a factor that can influence the probability of the prediction dramatically.
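To make the contrast concrete, here is a hypothetical calculation. Every number in it is invented purely for illustration (not an estimate of anything), and the triggers and requirements are assumed independent for simplicity:

```python
# Disjunctive framing: Mary shops if she is hungry OR thirsty OR
# needs a coffee machine. Adding disjuncts pushes the probability up.
p_hungry, p_thirsty, p_coffee = 0.5, 0.5, 0.1  # made-up numbers
p_some_trigger = 1 - (1 - p_hungry) * (1 - p_thirsty) * (1 - p_coffee)

# Conjunctive framing: she also needs money AND a need AND an open
# supermarket, all at once. Adding conjuncts pulls the probability down.
p_money, p_open = 0.9, 0.8  # made-up numbers
p_shopping = p_money * p_some_trigger * p_open

print(f"P(some trigger) = {p_some_trigger:.3f}")  # 0.775
print(f"P(shopping)     = {p_shopping:.3f}")      # 0.558
```

However many disjunctive triggers are listed, the prediction can never be more probable than its conjunctive requirements.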
The same is true for risks from AI. The basic argument in favor of risks from AI is that of an intelligence explosion, that intelligence can be applied to itself in an iterative process leading to ever greater levels of intelligence. In short, artificial general intelligence will undergo explosive recursive self-improvement.
Explosive recursive self-improvement is one of the presuppositions for the possibility of risks from AI. The problem is that this and other presuppositions are largely ignored and left undefined. All of the disjunctive arguments put forth by the SIAI are trying to show that there are many causative factors that will result in the development of unfriendly artificial general intelligence. Only one of those factors needs to be true for us to be wiped out by AGI. But the whole scenario is at most as probable as the assumption hidden in the words <artificial general intelligence> and <explosive recursive self-improvement>.
<Artificial General Intelligence> and <Explosive Recursive Self-improvement> might appear to be relatively simple and appealing concepts. But most of this superficial simplicity is a result of the vagueness of natural language descriptions. Reducing the vagueness of those concepts by being more specific, or by coming up with technical definitions of each of the words they are made up of, reveals the hidden complexity that the vagueness of the terms conceals.
If we were going to define those concepts and each of their terms, we would end up with a lot of additional concepts made up of other words or terms. Most of those additional concepts would demand explanations of their own, made up of further speculations. If we are precise, then every declarative sentence P# (each of the terms) used in the final description has to be true simultaneously, and this reveals the true complexity of all the hidden presuppositions, thereby lowering the overall probability: P(risks from AI) = P(P1∧P2∧P3∧P4∧P5∧P6∧…). A conclusion made up of many statements that can each be false is less likely to be true, because complex arguments can fail in many different ways: you need to support each part of the argument that can be true or false, and failing to support even one part renders the overall conclusion false.
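The multiplicative penalty is easy to see numerically. Assuming, generously, that each hidden presupposition is independently 90% likely to be true (an illustrative figure, not an estimate):

```python
# P(P1∧…∧Pn) under independence: each extra conjunct multiplies the
# total by its own probability, so the conjunction can only shrink.
p_each = 0.9
for n in (1, 2, 5, 10, 20):
    print(f"{n:2d} presuppositions: P = {p_each ** n:.3f}")
# 10 presuppositions already leave P ≈ 0.35; 20 leave P ≈ 0.12.
```

Even when no single presupposition looks dubious on its own, a long enough conjunction of them is improbable.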
To summarize: If we tried to pin down a concept like <Explosive Recursive Self-Improvement> we would end up with requirements that are strongly conjunctive.
Making numerical probability estimates
But even if the SIAI were going to thoroughly define those concepts, there is still more to the probability of risks from AI than the underlying presuppositions and causative factors. We also have to account for our uncertainty about the very methods we used to come up with those concepts and definitions, and about our ability to make correct predictions about the future, and integrate all of it into our overall probability estimates.
Take for example the following contrived quote:
We have to take over the universe to save it by making the seed of an artificial general intelligence, that is undergoing explosive recursive self-improvement, extrapolate the coherent volition of humanity, while acausally trading with other superhuman intelligences across the multiverse.
Although contrived, the above quote comprises only actual beliefs held by people associated with the SIAI. All of those beliefs might seem like somewhat plausible inferences and logical implications of speculations and state-of-the-art or bleeding-edge knowledge of various fields. But should we base real-life decisions on those ideas - should we take those ideas seriously? Should we take into account conclusions whose truth value depends on the conjunction of those ideas? And is it wise to make further inferences from those speculations?
Let’s take a closer look at the necessary top-level presuppositions to take the above quote seriously:
- The many-worlds interpretation
- Belief in the Implied Invisible
- Timeless Decision theory
- Intelligence explosion
1: Within the lesswrong/SIAI community the many-worlds interpretation of quantum mechanics is proclaimed to be the rational choice of all available interpretations. How to arrive at this conclusion is supposedly also a good exercise in refining the art of rationality.
2: If P(Y|X) ≈ 1, then P(X∧Y) ≈ P(X)
In other words, logical implications do not have to pay rent in future anticipations.
3: “Decision theory is the study of principles and algorithms for making correct decisions—that is, decisions that allow an agent to achieve better outcomes with respect to its goals.”
4: “Intelligence explosion is the idea of a positive feedback loop in which an intelligence is making itself smarter, thus getting better at making itself even smarter. A strong version of this idea suggests that once the positive feedback starts to play a role, it will lead to a dramatic leap in capability very quickly.”
To be able to take the above quote seriously you have to assign a non-negligible probability to the truth of the conjunction of #1, #2, #3 and #4, i.e. 1∧2∧3∧4. Here the question is not only whether our results are sound but whether the very methods we used to come up with those results are sufficiently trustworthy. Because any extraordinary conclusions that are implied by the conjunction of various beliefs might outweigh the benefit of each belief if the overall conclusion is even slightly wrong.
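The chain rule of probability makes the cost of each added belief explicit: P(1∧2∧3∧4) = P(1)·P(2|1)·P(3|1∧2)·P(4|1∧2∧3). A hypothetical sketch, with every probability invented purely for illustration:

```python
# Chain rule for the conjunction of the four beliefs above. A
# conditional near 1 (cf. point 2: P(Y|X) ≈ 1 gives P(X∧Y) ≈ P(X))
# costs almost nothing, but any conditional well below 1 drags the
# whole conjunction down.
p1 = 0.8            # P(1): made-up number
p2_given_1 = 0.7    # P(2 | 1)
p3_given_12 = 0.6   # P(3 | 1∧2)
p4_given_123 = 0.5  # P(4 | 1∧2∧3)

p_all = p1 * p2_given_1 * p3_given_12 * p4_given_123
print(f"P(1∧2∧3∧4) = {p_all:.3f}")  # 0.168, below every single factor
```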
Not enough empirical evidence
Don’t get me wrong, I think that there sure are convincing arguments in favor of risks from AI. But do arguments suffice? Nobody is an expert when it comes to intelligence. My problem is that I fear that some convincing blog posts written in natural language are simply not enough.
Just imagine that all there was to climate change was someone who never studied the climate but instead wrote some essays about how it might be physically possible for humans to cause global warming. If the same person then went on to make further inferences based on the implications of those speculations, am I going to tell everyone to stop emitting CO2 because of that? Hardly!
Or imagine that all there was to the possibility of asteroid strikes was someone who argued that there might be big chunks of rock out there which might fall down on our heads and kill us all, inductively based on the fact that the Earth and the moon are also big rocks. Would I be willing to launch a billion dollar asteroid deflection program solely based on such speculations? I don’t think so.
Luckily, in both cases, we got a lot more than some convincing arguments in support of those risks.
Another example: If there were no studies about the safety of high energy physics experiments then I might assign a 20% chance of a powerful particle accelerator destroying the universe based on some convincing arguments put forth on a blog by someone who never studied high energy physics. We know that such an estimate would be wrong by many orders of magnitude. Yet the reason for being wrong would largely be a result of my inability to make correct probability estimates, the result of vagueness or a failure of the methods I employed to come up with those estimates. The reason for being wrong by many orders of magnitude would have nothing to do with the arguments in favor of the risks, as they might very well be sound given my epistemic state and the prevalent uncertainty.
I believe that mere arguments in favor of one risk do not suffice to neglect other risks that are supported by other kinds of evidence. I believe that logical implications of sound arguments should not reach out indefinitely and thereby outweigh other risks whose implications are fortified by empirical evidence. Sound arguments, predictions, speculations and their logical implications are enough to demand further attention and research, but not much more.
Artificial general intelligence is already an inference made from what we currently believe to be true, going a step further and drawing further inferences from previous speculations, e.g. explosive recursive self-improvement, is in my opinion a very shaky business.
What would happen if we were going to let logical implications of vast utilities outweigh other concrete near-term problems that are based on empirical evidence? Insignificant inferences might exhibit hyperbolic growth in utility: 1.) There is no minimum amount of empirical evidence necessary to extrapolate the expected utility of an outcome. 2.) The extrapolation of counterfactual alternatives is unbounded, logical implications can reach out indefinitely without ever requiring new empirical evidence.
All of the above hints at a general problem that is the reason for why I think that discussions between people associated with the SIAI, its critics and those who try to evaluate the SIAI, won’t lead anywhere. Those discussions miss the underlying reason for most of the superficial disagreement about risks from AI, namely that there is no disagreement about risks from AI in and of itself.
There are a few people who disagree about the possibility of AGI in general, but I don’t want to touch on that subject in this post. I am trying to highlight the disagreement between the SIAI and people who accept the notion of artificial general intelligence. With regard to those who are not skeptical of AGI, the problem becomes more obvious when you turn your attention to people like John Baez and organisations like GiveWell. Most people would sooner question their grasp of “rationality” than give five dollars to a charity that tries to mitigate risks from AI because their calculations claim it was “rational” (those who have read the article by Eliezer Yudkowsky on ‘Pascal’s Mugging‘ will notice that I used a statement from that post and slightly rephrased it). The disagreement all comes down to a general averseness to options that have a low probability of being factual, even given that the stakes are high.
Nobody is so far able to beat arguments that bear resemblance to Pascal’s Mugging. At least not by showing that it is irrational to give in from the perspective of a utility maximizer. One can only reject it based on a strong gut feeling that something is wrong. And I think that is what many people are unknowingly doing when they argue against the SIAI or risks from AI. They are signaling that they are unable to take such risks into account. What most people mean when they doubt the reputation of people who claim that risks from AI need to be taken seriously, or who say that AGI might be far off, what those people mean is that risks from AI are too vague to be taken into account at this point, that nobody knows enough to make predictions about the topic right now.
When GiveWell, a charity evaluation service, interviewed the SIAI (PDF), they hinted at the possibility that one could consider the SIAI to be a sort of Pascal’s Mugging:
GiveWell: OK. Well that’s where I stand – I accept a lot of the controversial premises of your mission, but I’m a pretty long way from sold that you have the right team or the right approach. Now some have argued to me that I don’t need to be sold – that even at an infinitesimal probability of success, your project is worthwhile. I see that as a Pascal’s Mugging and don’t accept it; I wouldn’t endorse your project unless it passed the basic hurdles of credibility and workable approach as well as potentially astronomically beneficial goal.
This shows that a lot of people do not doubt the possibility of risks from AI but are simply not sure if they should really concentrate their efforts on such vague possibilities.
Technically, from the standpoint of maximizing expected utility, given the absence of other existential risks, the answer might very well be yes. But even though we believe we understand this technical viewpoint of rationality very well in principle, it also leads to problems such as Pascal’s Mugging. And it doesn’t take a true Pascal’s Mugging scenario to make people feel deeply uncomfortable with what Bayes’ Theorem, the expected utility formula, and Solomonoff induction seem to suggest one should do.
Again, we currently have no rational way to reject arguments framed as predictions of worst-case scenarios that must be taken seriously even at low probability, because of the scale of the negative consequences associated with them. Many people are nonetheless reluctant to accept this line of reasoning without further evidence supporting the strong claims and requests for money made by organisations such as the SIAI.
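The discomfort can be made concrete with a toy expected-utility comparison. All the numbers below are illustrative assumptions, not figures from the post:

```python
# Toy illustration of a Pascal's-Mugging-style comparison.
# All probabilities and payoffs are made up for illustration.

def expected_utility(prob_success, payoff):
    """Expected utility of a single intervention."""
    return prob_success * payoff

# A mundane charity: high chance of a modest benefit.
mundane = expected_utility(prob_success=0.9, payoff=1_000)

# A speculative existential-risk project: tiny chance of an
# astronomically large benefit.
speculative = expected_utility(prob_success=1e-10, payoff=1e15)

# Naive expected-utility maximization favors the speculative bet,
# even though intuition often rebels at the comparison.
assert speculative > mundane
```

The point of the sketch is only that the formula is indifferent to how the product is composed: a vanishingly small probability can always be outweighed by a sufficiently large stake.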
Here is what mathematician and climate activist John Baez has to say:
Of course, anyone associated with Less Wrong would ask if I’m really maximizing expected utility. Couldn’t a contribution to some place like the Singularity Institute of Artificial Intelligence, despite a lower chance of doing good, actually have a chance to do so much more good that it’d pay to send the cash there instead?
And I’d have to say:
1) Yes, there probably are such places, but it would take me a while to find the one that I trusted, and I haven’t put in the work. When you’re risk-averse and limited in the time you have to make decisions, you tend to put off weighing options that have a very low chance of success but a very high return if they succeed. This is sensible so I don’t feel bad about it.
2) Just to amplify point 1) a bit: you shouldn’t always maximize expected utility if you only live once. Expected values — in other words, averages — are very important when you make the same small bet over and over again. When the stakes get higher and you aren’t in a position to repeat the bet over and over, it may be wise to be risk averse.
3) If you let me put the $100,000 into my retirement account instead of a charity, that’s what I’d do, and I wouldn’t even feel guilty about it. I actually think that the increased security would free me up to do more risky but potentially very good things!
All this suggests there is a fundamental problem with the formalized version of rationality. The problem might be human nature itself: some people are unable to accept what they should do if they want to maximize their expected utility. Or we are missing something else and our theories are flawed. Either way, to solve this problem we need to research these issues and thereby increase confidence in the very methods used to decide what to do about risks from AI, or increase confidence in risks from AI directly, enough to make working on them look like a sensible option, a concrete and discernible problem that needs to be solved.
Many people perceive the whole world to be at stake already, whether due to climate change, war or engineered pathogens. Telling them about risks from AI, even though nobody seems to have any idea about the nature of intelligence, let alone general intelligence or the possibility of recursive self-improvement, presents just another problem, one too vague to outweigh all the others. Most people already feel as if a gun is pointed at their heads; telling them about superhuman monsters that might turn them into paperclips therefore requires some really good arguments to outweigh the combined weight of all the other risks.
But there are many other problems with risks from AI. To give a hint at just one example: if there was a risk that might kill us with a probability of .7 and another risk with .1, while our chance to solve the first one was .0001 and the second one .1, which one should we focus on? In other words, our decision to mitigate a certain risk should be based not only on the probability of its occurrence but also on the probability of success in solving it. But as I have written above, I believe that the most pressing issue is to increase our confidence in making decisions under extreme uncertainty, or to reduce the uncertainty itself.
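The example in the text can be sketched directly, weighing each risk by both its probability of occurring and our probability of solving it:

```python
# Toy version of the example above: prioritize risks by the product of
# (probability the risk occurs) and (probability we can solve it).

def expected_impact(p_risk, p_solve):
    """Expected fraction of the stakes saved by working on this risk."""
    return p_risk * p_solve

risk_a = expected_impact(p_risk=0.7, p_solve=0.0001)  # big but intractable
risk_b = expected_impact(p_risk=0.1, p_solve=0.1)     # smaller but tractable

# The smaller risk wins: 0.1 * 0.1 = 0.01 beats 0.7 * 0.0001 = 0.00007.
assert risk_b > risk_a
```

Under this simple model, tractability can dominate raw probability by more than two orders of magnitude.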
SIAI benefactor and VC Peter Thiel has an excellent article at National Review about the stagnating progress of science and technology, which he attributes to poorly-grounded political opposition, widespread scientific illiteracy, and overspecialized, insular scientific fields. He warns that this stagnation will undermine the growth that past policies have relied on.
Noteworthy excerpts (bold added by me):
In relation to concerns expressed here about evaluating scientific field soundness:
When any given field takes half a lifetime of study to master, who can compare and contrast and properly weight the rate of progress in nanotechnology and cryptography and superstring theory and 610 other disciplines? Indeed, how do we even know whether the so-called scientists are not just lawmakers and politicians in disguise, as some conservatives suspect in fields as disparate as climate change, evolutionary biology, and embryonic-stem-cell research, and as I have come to suspect in almost all fields? [!!! -- SB]
Looking forward, we see far fewer blockbuster drugs in the pipeline — perhaps because of the intransigence of the FDA, perhaps because of the fecklessness of today’s biological scientists, and perhaps because of the incredible complexity of human biology. In the next three years, the large pharmaceutical companies will lose approximately one-third of their current revenue stream as patents expire, so, in a perverse yet understandable response, they have begun the wholesale liquidation of the research departments that have borne so little fruit in the last decade and a half. [...]
The single most important economic development in recent times has been the broad stagnation of real wages and incomes since 1973, the year when oil prices quadrupled. To a first approximation, the progress in computers and the failure in energy appear to have roughly canceled each other out. Like Alice in the Red Queen’s race, we (and our computers) have been forced to run faster and faster to stay in the same place.
Taken at face value, the economic numbers suggest that the notion of breathtaking and across-the-board progress is far from the mark. If one believes the economic data, then one must reject the optimism of the scientific establishment. Indeed, if one shares the widely held view that the U.S. government may have understated the true rate of inflation — perhaps by ignoring the runaway inflation in government itself, notably in education and health care (where much higher spending has yielded no improvement in the former and only modest improvement in the latter) — then one may be inclined to take gold prices seriously and conclude that real incomes have fared even worse than the official data indicate. [...]
College graduates did better, and high-school graduates did worse. But both became worse off in the years after 2000, especially when one includes the rapidly escalating costs of college.[...]
The current crisis of housing and financial leverage contains many hidden links to broader questions concerning long-term progress in science and technology. On one hand, the lack of easy progress makes leverage more dangerous, because when something goes wrong, macroeconomic growth cannot offer a salve; time will not cure liquidity or solvency problems in a world where little grows or improves with time.
We make decisions based upon our expectations of the future going 10 to 20 years out. However, we don't have good systematic ways of making predictions. We rely on pundits and experts in their fields, who might ignore changes in other fields.
We are currently experiencing a period of low growth in the western world, so reliable economic growth is not an iron law throughout the world. The projects and organisations we start should therefore depend upon what we expect of the future. If a wealthy future is unlikely for the parts of the world we live in or have influence over, we might do well to consider whether we can do things to improve our prospects; our ability to shape the future depends upon our wealth.
Suppose we could look into the future of our Everett branch and pick out those sub-branches in which humanity and/or human/moral values have survived past the Singularity in some form. What would we see if we then looked backwards in time at how that happened? Here's an attempt to answer that question, or in other words to enumerate the not completely disastrous Singularity scenarios that seem to have non-negligible probability. Note that the question I'm asking here is distinct from "In what direction should we try to nudge the future?" (which I think logically ought to come second).
- Uploading first
- Become superintelligent (self-modify or build FAI), then take over the world
- Take over the world as a superorganism
- self-modify or build FAI at leisure
- (Added) stasis
- Competitive upload scenario
- (Added) subsequent singleton formation
- (Added) subsequent AGI intelligence explosion
- no singleton
- IA (intelligence amplification) first
- Clone a million von Neumanns (probably government project)
- Gradual genetic enhancement of offspring (probably market-based)
- Direct brain/computer interface
- What happens next? Upload or code?
- Code (de novo AI) first
- Scale of project
- Large Corporation
- Small Organization
- Secrecy - spectrum between
- totally open
- totally secret
- Planned Friendliness vs "emergent" non-catastrophe
- If planned, what approach?
- "Normative" - define decision process and utility function manually
- "Meta-ethical" - e.g., CEV
- "Meta-philosophical" - program AI how to do philosophy
- If emergent, why?
- Objective morality
- Convergent evolution of values
- Acausal game theory
- Standard game theory (e.g., Robin's idea that AIs in a competitive scenario will respect human property rights due to standard game theoretic considerations)
- If planned, what approach?
- Competitive vs. local FOOM
- Scale of project
- (Added) Simultaneous/complementary development of IA and AI
Sorry if this is too cryptic or compressed. I'm writing this mostly for my own future reference, but perhaps it could be expanded more if there is interest. And of course I'd welcome any scenarios that may be missing from this list.
Related to Exterminating life is rational.
ADDED: Standard assumptions about utility maximization and time-discounting imply that we shouldn't care about the future. I will lay out the problem in the hopes that someone can find a convincing way around it. This is the sort of problem we should think about carefully, rather than grasping for the nearest apparent solution. (In particular, the solutions "If you think you care about the future, then you care about the future", and, "So don't use exponential time-discounting," are easily-grasped, but vacuous; see bullet points at end.)
The math is a tedious proof that exponential time discounting trumps geometric expansion into space. If you already understand that, you can skip ahead to the end. I have fixed the point raised by Dreaded_Anomaly. It doesn't change my conclusion.
Suppose that we have Planck technology such that we can utilize all our local resources optimally to maximize our utility, nearly instantaneously.
Suppose that we colonize the universe at light speed, starting from the center of our galaxy. (We aren't at the center of our galaxy, but it makes the computations easier and our assumptions more conservative: starting from the center is more favorable to worrying about the future, since it lets us grab lots of utility quickly near our starting point.)
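The core of the tedious proof can be sketched numerically: utility grows roughly with the colonized volume, ~t³ at light speed, while the discount factor shrinks as e^(−rt), so the discounted total converges and is dominated by early times. The discount rate and horizons below are illustrative assumptions, not figures from the post:

```python
import math

# Sketch of the claim that exponential time-discounting trumps cubic
# (light-speed, volumetric) expansion. Utility at time t is taken as
# proportional to the colonized volume ~ t**3; discounting multiplies
# it by exp(-r * t). The rate r and horizons are illustrative.

def discounted_total(rate, horizon):
    return sum(t**3 * math.exp(-rate * t) for t in range(1, horizon))

r = 0.01  # a modest annual discount rate (made-up)
total = discounted_total(r, 10_000)
first_millennium = discounted_total(r, 1_000)

# Extending the horizon tenfold adds almost nothing: the continuous
# integral of t^3 * exp(-r*t) converges (to 6 / r**4), so nearly all
# discounted utility accrues in the first millennium.
assert first_millennium / total > 0.98
```

Whatever utility the colonization wave gathers after the first millennium is, under this discounting, a rounding error next to what was gathered near the starting point.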
Robin Hanson has made several recent posts on Overcoming Bias about upload economics. I remain mystified why he doesn't link to or otherwise reference or comment on Carl Shulman's 2010 paper, Whole Brain Emulation and the Evolution of Superorganisms, which mentions many of the same ideas and seems to have taken them to their logical conclusions. I was going to complain again in the comments section over there, but then I noticed that the paper hasn't been posted or discussed here either. So here's the abstract. (See above link for the full paper.)
Many scientists expect the eventual development of intelligent software programs capable of closely emulating human brains, to the point of substituting for human labor in almost every economic niche. As software, such emulations could be cheaply copied, with copies subsequently diverging and interacting with their copy-relatives. This paper examines a set of evolutionary pressures on interaction between related emulations, pressures favoring the emergence of superorganisms, groups of emulations ready to self-sacrifice in service of the superorganism. We argue that the increased capacities and internal coordination of such superorganisms could pose increased risks of overriding human values, but also could facilitate the solution of global coordination problems.
Every project needs a risk assessment.
There's a feeling, just bubbling under the surface here at Less Wrong, that we're just playing at rationality. It's rationality kindergarten. The problem has been expressed in various ways:
- not a whole lot of rationality
- rationalist porn for daydreamers
- not quite as great as everyone seems to think
- shiny distraction
- only good for certain goals
And people are starting to look at fixing it. I'm not worried that their attempts - and mine - will fail. At least we'd have fun and learn something.
I'm worried that they will succeed.
What would such a Super Less Wrong community do? Its members would self-improve to the point where they had a good chance of succeeding at most things they put their mind to. They would recruit new rationalists and then optimize that recruitment process, until the community got big. They would develop methods for rapidly generating, classifying and evaluating ideas, so that the only ideas that got tried would be the best that anyone had come up with so far. The group would structure itself so that people's basic social drives - such as their desire for status - worked in the interests of the group rather than against it.
It would be pretty formidable.
What would the products of such a community be? There would probably be a self-help book that works. There would be an effective, practical guide to setting up effective communities. There would be an intuitive, practical guide to human behavior. There would be books, seminars and classes on how to really achieve your goals - and only the materials which actually got results would be kept. There would be a bunch of stuff on the Dark Arts too, no doubt. Possibly some AI research.
That's a whole lot of material that we wouldn't want to get into the hands of the wrong people.
- Half-rationalists: people who pick up on enough memes to be really dangerous, but not on enough to realise that what they're doing might be foolish. For example, building an AI without adding the friendliness features.
- Rationalists with bad goals: Someone could rationally set about trying to destroy humanity, just for the lulz.
- Dangerous information discovered: e.g. the rationalist community develops a Theory of Everything that reveals a recipe for a physics disaster (e.g. a cheap way to turn the Earth into a black hole). A non-rationalist decides to exploit this.
If this is a problem we should take seriously, what are some possible strategies for dealing with it?
- Just go ahead and ignore the issue.
- The Bayesian Conspiracy: only those who can be trusted are allowed access to the secret knowledge.
- The Good Word: mix in rationalist ideas with do-good and stay-safe ideas, to the extent that they can't be easily separated. The idea being that anyone who understands rationality will also understand that it must be used for good.
- Rationality cap: we develop enough rationality to achieve our goals (e.g. friendly AI) but deliberately stop short of developing the ideas too far.
- Play at rationality: create a community which appears rational enough to distract people who are that way inclined, but which does not dramatically increase their personal effectiveness.
- Risk management: accept that each new idea has a potential payoff (in terms of helping us avoid existential threats) and a potential cost (in terms of helping "bad rationalists"). Implement the ideas which come out positive.
In the post title, I have suggested an analogy with AI takeoff. That's not entirely fair; there is probably an upper bound to how effective a community of humans can be, at least until brain implants come along. We're probably talking two orders of magnitude rather than ten. But given that humanity already has technology with slight existential threat implications (nuclear weapons, rudimentary AI research), I would be worried about a movement that aims to make all of humanity more effective at everything they do.
I finished a novel last September, did most of the editing over Christmas, and have been procrastinating ever since. My novel has significant rationalist themes and would probably be of interest to a number of people here. Below is a plot synopsis. If you would be interested in reading it, send me a private message with your email address and I can email you the Word file. I am still accepting editing suggestions.
Also, if anyone has suggestions as to where I could submit it, that would be very helpful.
Plot Synopsis: After the Flood
Ten-year-old Ash lives with a band of orphans in the flooded remains of a 21st-century city, surviving by diving for salvage in submerged buildings and trading it to adults in the mainland city. One day she watches a stranger attempt to climb the Wall, a mysterious and impregnable structure in the flooded city; when he is injured, she saves his life. He claims that there are people living in the Wall, people who still have the knowledge and power that were lost during the long-ago flood.
Armed with her determination and cunning mind, Ash manages to break into the Wall and obtain medicine for the boy's sister, who is dying of tuberculosis. In the mainland city, however, the boy's parents are captured by the Church of Candles, which controls the city, and executed for their attempt to use the old knowledge.
Six years later, now a young adult apprenticed to a herb-woman on the outskirts of the city, Ash meets the brother and sister again and continues searching for the truth about the flood and the city's past.
Why do we imagine our actions could have consequences for more than a few million years into the future?
Unless what we believe about evolution is wrong, or UFAI is unlikely, or we are very very lucky, we should assume there are already a large number of unfriendly AIs in the universe, and probably in our galaxy; and that they will assimilate us within a few million years.
Therefore, justifications for harming people on Earth today in the name of protecting the entire universe for all time from future UFAI, like this one, should be rejected. Our default assumption should be that the offspring of Earth will at best have a short happy life.
ADDED: If you observe, as many have, that Earth has not yet been assimilated, you can draw one of these conclusions:
- The odds of intelligent life developing on a planet are precisely balanced with the number of suitable planets in our galaxy, such that after billions of years, there is exactly one such instance. This is an extremely low-probability argument. The anthropic argument does not justify this as easily as it justifies observing one low-probability creation of intelligent life.
- The progression (intelligent life → AI → expansion and assimilation) is unlikely.
Surely, for a Bayesian, the more reasonable conclusion is number 2! Conclusion 1 has priors we can estimate numerically. Conclusion 2 has priors we know very little about.
To say, "I am so confident in my beliefs about what a superintelligent AI will do, that I consider it more likely that I live on an astronomically lucky planet, than that those beliefs are wrong", is something I might come up with if asked to draw a caricature of irrationality.
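The odds comparison behind this argument can be sketched with made-up numbers (none of these probabilities come from the post; they only illustrate the shape of the Bayesian update):

```python
# Hedged sketch of the comparison between the two conclusions above.
# Conclusion 1 ("exactly one intelligent species arose, by a fine
# coincidence of rates") predicts our observation only under a narrow
# parameter balance; Conclusion 2 ("the progression life -> AI ->
# expansion is unlikely") predicts it robustly. All numbers are
# illustrative assumptions.

prior_odds_1_vs_2 = 1.0   # no prior preference (illustrative)
p_obs_given_1 = 1e-4      # unassimilated Earth needs fine-tuning (illustrative)
p_obs_given_2 = 0.5       # unassimilated Earth is expected (illustrative)

posterior_odds = prior_odds_1_vs_2 * p_obs_given_1 / p_obs_given_2

# The posterior strongly favors conclusion 2.
assert posterior_odds < 1e-3
```

Even granting conclusion 1 a generous prior, the likelihood ratio does the work: the observation of an unassimilated Earth is cheap to explain under conclusion 2 and expensive under conclusion 1.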
It's surprised me that there's been very little discussion of The Long Now here on Less Wrong, as there are many similarities between the groups, although their approaches and philosophies are quite different. At a minimum, I believe that a general awareness might be beneficial. I'll use the initials LW and LN below. My perspective on LN is simply that of someone who's kept an eye on their website from time to time and read a few of their articles, so I should admit that my knowledge is a bit shallow (a reason, in fact, that I bring the topic up for discussion).
Most critically, long-term thinking appears as a cornerstone of both the LW and LN thought, explicitly as the goal for LN, and implicitly here on LW whenever we talk about existential risk or decades-away or longer technology. It's not clear if there's an overlap between the commenters at LW and the membership of LN or not, but there's definitely a large number of people "between" the two groups -- statements by Peter Thiel and Ray Kurzweil have been recent topics on the LN blog and Hillis, who founded LN, has been involved in AI and philosophy of mind. LN has Long Bets, which I would loosely describe as to PredictionBook as InTrade is to Foresight Exchange. LN apparently had a presence at some of the past SIAI's Singularity Summits.
Signaling: LN embraces signaling like there's no tomorrow (ha!) -- their flagship project, after all, is a monumental clock to last thousands of years, the goal of which is to "lend itself to good storytelling and myth" about long-term thought. Their membership cards are stainless steel. Some of the projects LN are pursuing seem to have been chosen mostly because they sound awesome, and even those that aren't are done with some flair, IMHO. In contrast, the view among LW posts seems to be that signaling is in many cases a necessary evil, in some cases just an evolutionary leftover, and reducing signaling a potential source for efficiency gains. There may be something to be learned here -- we already know FAI would be an easier sell if we described it as project to create robots that are Presidents of the United States by day, crime-fighters by night, and cat-people by late-night.
Structure: While LW is a project of SIAI, they're not the same, so by extension the comparison between LN and LW is just a bit apples-to-kumquats. It'd be a lot easier to compare LW to a LN discussion board, if it existed.
The Future: Here on LW, we want our nuclear-powered flying cars, dammit! Bad future scenarios that are discussed on LW tend to be irrevocably and undeniably bad -- the world is turned into tang or paperclips and no life exists anymore, for example. LN seems more concerned with recovery from, rather than prevention of, "collapse of civilization" scenarios. Many of the projects both undertaken and linked to by LN focus on preserving knowledge in a such a scenario. Between the overlap in the LW community and cryonics, SENS, etc, the mental relationship between the median LW poster and the future seems more personal and less abstract.
Politics: The predominant thinking on LW seems to be a (very slightly left-leaning) technolibertarianism, although since it's open to anyone who wanders in from the Internet, there's a lot of variation (if either SIAI or FHI have an especially strong political stance per se, I've not noticed it). There's also a general skepticism here regarding the soundness of most political thought and of many political processes. LN seems further left on average and more comfortable with politics in general (although calling it a political organization would be a bit of a stretch). Keeping with this, LW seems to have more emphasis on individual decision making and improvement than LN.
I think my dog is about to die. Even if I thought it was worth it, I don't have the money to freeze her. But I am curious how people here feel about the practice and whether anyone plans to do this for their pet. It seems like a practice that plays into the image of cryonics as the domain of strange and egotistical rich people. On the other hand, it also seems like a rather human and heartwarming practice. Is pet cryopreservation good for the image of cryonics?
Also, do people who just do neuro get their pets preserved? Will people upload pets? Assuming life as an emulation feels different from life as a biological organism, is it ethical to upload animals? The transition might be strange and uncomfortable, but we expect at least some humans to take the risk and live with any differences. Animals, however, don't understand this and might not have the mental flexibility to adjust.
Value deathism by Vladimir Nesov encourages us to fix our values to prevent astronomical waste due to an under-optimized future.
Reading it, I found myself thinking about the units in which this astronomical waste is measured. Utilons? It seems so. [Edit: Jack suggested the widely accepted word "utils" instead.]
I've tried to define it precisely. It is the difference between the utility of some world-state G as measured by the original (drifted) agent and the utility of G as measured by an undrifted version of that agent, where G is the world-state that is optimal according to the original (drifted) agent.
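A minimal numeric sketch of this definition, with three made-up world-states and two hand-picked utility functions (none of it from the post):

```python
# Sketch of the waste definition above: G is the world-state that is
# optimal according to the drifted agent, and the waste is the gap
# between the drifted and undrifted utilities of G. All states and
# numbers are made up for illustration.

world_states = ["A", "B", "C"]

u_drifted   = {"A": 1.0, "B": 5.0, "C": 3.0}  # original (drifted) agent
u_undrifted = {"A": 1.0, "B": 2.0, "C": 4.0}  # undrifted version

# G: optimal according to the drifted agent.
G = max(world_states, key=lambda s: u_drifted[s])

waste = u_drifted[G] - u_undrifted[G]

assert G == "B"
assert waste == 3.0
```

The sketch also shows why the two questions below matter: the subtraction only makes sense if the two agents' utilities are comparable, and G is only well defined once "optimal" is.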
There are two questions: can we compare the utilities of these two agents, and what does it mean for G to be optimal?
Preconditions: the world is deterministic; the agent has full knowledge of the world, i.e. it knows the current world-state, the full list of actions available in every world-state, and the consequence of each action (the world-state it leads to); and the agent has no time limit for computing its next action.
The agent's value is defined as a function from the set of world-states to the real numbers (for the sake of, uhm, clarity: the bigger the better). (Note: it is unnecessary to define value as a function over sequences of world-states, since the history of the world can be deduced from the world-state itself; and if it can't be deduced, the agent can't use the history anyway, because the agent is part of the world-state and so doesn't "remember" history either.) [Edit: I wasn't aware that this note hides an assumption: the value of a world-state must be constant. But that assumption doesn't let the agent single out world-states where it loses all or part of its memory. So value as a function over sequences of world-states has a right to exist. Such a value function still needs a very specific shape, however, to be independent of the optimization algorithm.]
Which sequence of world-states is optimal according to the agent's values?
Edit: Consider agents implementing a greedy search algorithm and an exhaustive search algorithm. For them to choose the same sequence of world-states, the search space should be a greedoid, and that requires a very specific structure of the value function.
Edit 2: Alternatively, the value function can be indirectly self-referential via the part of the world-state that contains the agent, allowing it to modify the agent's optimization algorithm by assigning higher utility to world-states in which the agent implements the desired optimization algorithm. (I call the agent's function a 'value function' because its meaning can be defined by the function itself; it isn't necessarily a utility.)
Jura inyhr shapgvba bs gur ntrag vfa'g ersyrpgvir, v.r. qbrfa'g qrcraq ba vagrecergngvba bs n cneg bs jbeyq-fgngr bpphcvrq ol ntrag va grezf bs bcgvzvmngvba cebprff vzcyrzragrq ol guvf cneg bs jbeyq-fgngr, gura bcgvzny frdhrapr qrcraqf ba pbzovangvba bs qrgnvyf bs vzcyrzragngvba bs ntrag'f bcgvzvmngvba nytbevguz naq inyhr shapgvba. V guvax va trareny vg jvyy rkuvovg SBBZ orunivbe.
Ohg jura inyhr shapgvba vf ersyrpgvir gura guvatf orpbzr zhpu zber vagrerfgvat.
I'll try to analyse the behavior of a classical paperclip maximizer, using the toy model I described earlier. Let the utility function be min(number_of_paperclips_produced, 50).
1. The paperclip maximizer implements a greedy search algorithm. If it can't produce a paperclip (all available actions lead to the same utility), it performs an action that depends on the implementation of the greedy search. All in all it acts erratically until it is eventually terminated (it stumbles into a world-state with no available actions).
2. The paperclip maximizer implements a full-search algorithm. The result depends on the implementation of the full search. If the implementation executes the shortest sequence of actions that leads to the globally maximal value of the utility function, then it produces 50 paperclips as fast as it can [edit: or it wireheads itself into a state where its paperclip counter > 50, whichever is faster], then terminates itself. If the implementation executes the longest possible sequence of actions that leads to the globally maximal value of the utility function, then the agent behaves erratically but is guaranteed to survive, while its optimization algorithm proceeds according to the original plan; but it will eventually modify itself and get terminated, since the original plan doesn't care about preserving the agent's optimization algorithm or utility function.
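The two cases above can be sketched on a tiny deterministic world. Everything here (states, paperclip counts, a cap of 3 instead of 50) is a made-up illustration of the setup, not the post's exact model:

```python
from collections import deque

# Toy deterministic world for the paperclip maximizer discussed above.
# Utility = min(paperclips, CAP); all states and counts are made up.

CAP = 3

# actions[state] -> list of successor states the agent can move to
actions = {
    "start": ["bench", "pit"],
    "bench": ["made1"],   # producing paperclips requires a detour
    "made1": ["made2"],
    "made2": ["made3"],
    "made3": [],          # no actions left: the agent is terminated
    "pit":   [],          # dead end with zero paperclips
}
clips = {"start": 0, "bench": 0, "made1": 1, "made2": 2, "made3": 3, "pit": 0}

def utility(state):
    return min(clips[state], CAP)

def shortest_path_to_max(start):
    """Full search (breadth-first): the shortest action sequence that
    reaches the globally maximal utility, as in case 2."""
    best = max(utility(s) for s in actions)
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()
        if utility(path[-1]) == best:
            return path
        frontier.extend(path + [nxt] for nxt in actions[path[-1]])
    return None

# Case 1 (greedy): from "start" both successors have utility 0, so the
# choice is implementation-dependent; one tie-break walks into "pit"
# and is terminated, the other happens to reach the paperclips.
# Case 2 (full search, shortest-sequence implementation):
path = shortest_path_to_max("start")
assert path == ["start", "bench", "made1", "made2", "made3"]
```

The sketch mirrors the argument: the greedy agent's fate hinges entirely on how ties are broken, while the full-search agent marches straight to the utility cap and then has nothing left to do.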
It seems that in the full-knowledge case, powerful optimization processes don't go FOOM. The full-search algorithm is maximally powerful, isn't it?
Maybe it is uncertainty that leads to FOOMing?
Indexical uncertainty can be represented by the assumption that the agent knows the set of world-states it could be in, and the set of available actions for the world-state it is actually in. I'll try to analyze this case later.
Edit 4: Edit 3 is wrong. The utility function in that toy model cannot be so simple if it uses some property of the agent. However, it seems OK to extend the model by including a high-level description of the agent's state in the world-state; then edit 3 holds.