Hawking/Russell/Tegmark/Wilczek on dangers of Superintelligent Machines [link]
http://www.huffingtonpost.com/stephen-hawking/artificial-intelligence_b_5174265.html
Very surprised no one has linked to this yet:
TL;DR: AI is a very underfunded existential risk.
Nothing new here, but it's the biggest endorsement the cause has gotten so far. I'm greatly pleased they got Stuart Russell, though not Peter Norvig, who seems to remain lukewarm to the cause. It's also too bad this ran in the Huffington Post rather than something more respectable. With some thought I think we could have made the list more inclusive and found a better publication; still, I think this is pretty huge.
Southern California FAI Workshop
This Saturday, April 26th, we will be holding a one day FAI workshop in southern California, modeled after MIRI's FAI workshops. We are a group of individuals who, aside from attending some past MIRI workshops, are in no way affiliated with the MIRI organization. More specifically, we are a subset of the existing Los Angeles Less Wrong meetup group that has decided to start working on FAI research together.
The event will start at 10:00 AM, and the location will be:
USC Institute for Creative Technologies
12015 Waterfront Drive
Playa Vista, CA 90094-2536.
This first workshop will be open to anyone who would like to join us. If you are interested, please let us know in the comments or by private message. We plan to have more of these in the future, so if you are interested but unable to make this event, please also let us know. You are welcome to decide to join at the last minute; if you do, still comment here so we can give you the necessary phone numbers.
Our hope is to produce results that will be helpful for MIRI, and so we are starting off by going through the MIRI workshop publications. If you will be joining us, it would be nice if you read the papers linked to here, here, here, here, and here before Saturday. Reading all of these papers is not necessary, but it would help to take a look at one or two of them to get an idea of what we will be doing.
Experience in artificial intelligence will not be at all necessary, but experience in mathematics probably is. If you can follow the MIRI publications, you should be fine. Even if you are under-qualified, there is very little risk of holding anyone back or otherwise having a negative impact on the workshop. If you think you would enjoy the experience, go ahead and join us.
This event will be in the spirit of collaboration with MIRI, and will attempt to respect their guidelines on doing research that will decrease, rather than increase, existential risk. As such, practical implementation questions related to making an approximate Bayesian reasoner fast enough to operate in the real world will not be on-topic. Rather, the focus will be on the abstract mathematical design of a system capable of having reflectively consistent goals, performing naturalistic induction, et cetera.
Food and refreshments will be provided for this event, courtesy of MIRI.
Bostrom versus Transcendence
Nick Bostrom takes on the facts, the fictions and the speculations in the movie Transcendence:

Could you upload Johnny Depp's brain? Oxford Professor on Transcendence
How soon until machine intelligence? Oxford professor on Transcendence
Would you have warning before artificial superintelligence? Oxford professor on Transcendence
Oxford professor on Transcendence: how could you get a machine intelligence?
How long will Alcor be around?
The Drake equation for cryonics is pretty simple: work out all the things that need to happen for cryonics to succeed one day, estimate the probability of each thing occurring independently, then multiply all those numbers together. Here’s one example of the breakdown from Robin Hanson. According to the 2013 LW survey, LW believes the average probability that cryonics will be successful for someone frozen today is 22.8% assuming no major global catastrophe. That seems startlingly high to me – I put the probability at at least two orders of magnitude lower. I decided to unpick some of the assumptions behind that estimate, particularly focussing on assumptions which I could model.
EDIT: This needs a health warning; here be overconfidence dragons. There are psychological biases that can lead you to estimate these numbers badly depending on how many terms you're asked to evaluate, and statistical biases that lead to correlated events being treated as independent by these kinds of models; overall, this can lead to suicidal overconfidence if you take the nice neat number these equations spit out as gospel.
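As a minimal sketch of the arithmetic (with made-up illustrative probabilities, not anyone's actual estimates), the whole model is just a product of independent terms:

```python
from math import prod

# Hypothetical, illustrative factor estimates only -- not endorsed numbers.
factors = {
    "cryonics preserves the relevant brain information": 0.5,
    "revival technology is eventually developed":        0.5,
    "your provider survives until that point":           0.5,
    "no global catastrophe destroys the facility":       0.8,
    "you personally are frozen promptly and well":       0.5,
}

p_success = prod(factors.values())
print(f"P(revival) = {p_success:.3f}")   # ~0.05 with these made-up numbers
```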
Every breakdown includes a component for ‘the probability that the company you freeze with goes bankrupt’ for obvious reasons. In fact, the probabilities of bankruptcy (and of global catastrophe) are particularly interesting terms because they are the only ones which are ‘time dependent’ in the usual Drake equation. What I mean by this is that if you know your body will be frozen intact forever, then it doesn’t matter to you when effective unfreezing technology is developed (except to the extent you might have a preference to live in a particular time period). By contrast, if you know safe unfreezing techniques will definitely be developed one day, it matters very much to you that it occurs sooner rather than later, because if you unfreeze before the development of these techniques then they are totally wasted on you.
The probability of bankruptcy is also very interesting because – I naively assumed last week – we must have excellent historical data on the probability of bankruptcy given the size, age and market penetration of a given company. From this – I foolishly reasoned – we must be able to calculate the actual probability of the ‘bankruptcy’ component in the Cryo-Drake equation and slightly update our beliefs.
I began by searching for the expected lifespan of an average company and got two estimates which I thought would give a useful upper and lower bound. Startup companies have an average lifespan of four years. S&P 500 companies have an average lifespan of fifteen years. My logic here was that startups must be the most volatile kind of company, S&P 500 companies must be the least volatile, and cryonics firms must be somewhere in the middle. Since the two sources only report the average lifespan, I modelled the average as a half-life. The results really surprised me; take a look at the following graph:

(Graph: company survival probability over time under the half-life model: http://imgur.com/CPoBN9u.jpg)
Even assuming cryonics firms are as well managed as S&P 500 companies, a 22.8% chance of success depends on every single other factor in the Drake equation being absolutely certain AND unfreezing technology being developed within 37 years.
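For concreteness, here is a minimal sketch of the half-life model behind that graph. Treating an average lifespan as a half-life is the modelling choice described above; the 4-year and 15-year figures are the startup and S&P 500 numbers quoted earlier.

```python
# Half-life survival model: treat the reported average lifespan as a
# half-life, so P(company still exists after t years) = 0.5 ** (t / half_life).

def survival(t_years, half_life_years):
    return 0.5 ** (t_years / half_life_years)

for half_life in (4, 15):          # startup vs S&P 500 figures from the post
    curve = {t: round(survival(t, half_life), 3) for t in (10, 20, 40, 80)}
    print(f"half-life {half_life:2d} years:", curve)
```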
But I noticed I was confused; Alcor has been around forty-ish years. Assuming it started life as a small company, the chance of that happening was one in ten thousand. That both Alcor AND The Cryonics Institute have been successfully freezing people for forty years seems literally beyond belief. I formed some possible hypotheses to explain this:
- Many cryo firms have been set up, and I only know about the successes (a kind of anthropic argument)
- Cryonics firms are unusually well-managed
- The data from one or both of my sources was wrong
- Modelling an average life expectancy as a half-life was wrong
- Some extremely unlikely event that is still more likely than the one-in-a-billion chance my model predicts – for example, the BBC article is an April Fools' joke that I don't understand.
I’m pretty sure I can rule out 1; if many cryo firms had been set up I’d expect to see four lasting twenty years and eight lasting ten years, but in fact we see one lasting about five years and two lasting indefinitely. We can also probably rule out 2; if cryo firms were demonstrably better managed than S&P 500 companies, the CEO of Alcor could go and run Microsoft and use the pay differential to support cryo research (if he was feeling altruistic). Since I can’t do anything about 5, I decided to focus my analysis on 3 and 4. In fact, I think 3 and 4 are both correct explanations. My source for the S&P 500 companies counted dropping out of the S&P 500 as a company ‘death’, when in fact you might drop out because you got taken over, because your industry became less important (but kept existing), or because other companies overtook you – your company can’t do anything about Facebook or Apple displacing it from the S&P 500, but Facebook and Apple don’t make it any more likely to fail. Additionally, modelling an average lifespan as a half-life must have been flawed; a company that has survived one hundred years and a company that has survived one year are not equally likely to collapse!
Consequently I searched Google Scholar for a proper academic source. I found one, but I should introduce the following caveats:
- It is UK data, so may not be comparable to the US (my understanding is that the US is a lot more forgiving of a business going bankrupt, so the UK businesses may liquidate slightly less frequently).
- It uses data from 1980. As well as being old data, there are specific reasons to believe that this time period overestimates the true survival of companies. For example, the mid-1980s saw an economic boom in the UK, and 1980-1985 misses both major UK financial crashes of modern times (Black Wednesday and the sub-prime crash). If the BBC is to be believed, the trend has been for companies to go bankrupt more and more frequently since the 1920s.
I found it really shocking that this question was not better studied. Anyway, the key table that informed my model was this one, which unfortunately seems to break the website when I try to embed it. The source is Dunne, Paul, and Alan Hughes. "Age, size, growth and survival: UK companies in the 1980s." The Journal of Industrial Economics (1994): 115-140.
On the left is the size of the company in 1980 (£1 in 1980 is worth about £2.50 now). Along the top is the size of the company in 1985, with additional columns for ‘taken over’, ‘bankrupt’ and ‘other’. Even though a takeover might signal the end of a particular product line within a company, I have only counted bankruptcies as representing a threat to a frozen body; it is unlikely Alcor will be bought out by anyone unless they have an interest in cryonics.
The model is a Discrete Time Markov Chain analysis in five-year increments. What this means is that I start my hypothetical cryonics company at <£1m and then allow it to either grow or go bankrupt at the rate indicated in the article. After the first period I look at the new size of the company and allow it to grow, shrink or go bankrupt in accordance with the new probabilities. The only slightly confusing decision was what to do with takeovers. In the end I decided to ignore takeovers completely, and redistribute the probability mass they represented to all other survival scenarios.
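For concreteness, here is a minimal sketch of that kind of Markov chain calculation. The size bands mirror the article's, but the transition probabilities below are placeholders only, since the actual figures from the Dunne and Hughes table are not reproduced here.

```python
import numpy as np

# States: four size bands plus an absorbing "bankrupt" state.
states = ["<£1m", "£1m-£5m", "£5m-£25m", ">£25m", "bankrupt"]

# Placeholder 5-year transition probabilities (each row sums to 1). The real
# numbers come from Dunne & Hughes (1994); takeover probability is assumed to
# have been redistributed across the survival states, as described above.
P = np.array([
    [0.55, 0.25, 0.05, 0.00, 0.15],   # <£1m
    [0.15, 0.50, 0.20, 0.02, 0.13],   # £1m-£5m
    [0.03, 0.15, 0.55, 0.17, 0.10],   # £5m-£25m
    [0.00, 0.02, 0.15, 0.75, 0.08],   # >£25m
    [0.00, 0.00, 0.00, 0.00, 1.00],   # bankrupt (absorbing)
])

dist = np.array([1.0, 0.0, 0.0, 0.0, 0.0])   # start as a <£1m company
for step in range(1, 21):                     # 20 steps of 5 years = 100 years
    dist = dist @ P
    if step % 4 == 0:
        print(f"after {step * 5:3d} years: P(still exists) = {1 - dist[-1]:.2f}")
```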
The results are astonishingly different:

(Graph: company survival probability over time under the Markov chain model: http://imgur.com/CkQirYD.jpg)
Now your body can remain frozen for 415 years and still have a 22.8% chance of revival (assuming all other probabilities are certain). Perhaps more usefully, if you estimate the year you expect revival to occur, you can read across the x axis to find the probability that your cryo company will still exist by then. For example, in the OvercomingBias link above, Hanson estimates that this will occur in 2090, meaning he should probably assign something like 0.65 probability to his cryo company still being around.
Remember you don’t actually need to estimate the year YOUR revival will occur, only the year in which the first successful revival proves that cryogenically frozen bodies are ‘alive’ in a meaningful sense and therefore receive protection under the law if your company goes bankrupt. In fact, you could instead estimate the year Congress passes a ‘right to not-death’ law which would protect your body in the event of a bankruptcy even before routine unfreezing, or the year when brain-state scanning becomes advanced enough that it doesn’t matter what happens to your meatspace body because a copy of your brain exists on the internet.
My conclusion is that the survival of your cryonics firm is a lot more likely than the average person in the street thinks, but probably a lot less likely than you think if you are strongly into cryonics. This is probably not news to you; most of you will be aware of over-optimism bias, and will have tried to correct for it. Hopefully these concrete numbers will be useful next time you consider the Cryo-Drake equation and the net present value of investing in cryonics.
LINK-Cryonics Institute documentary
"WE WILL LIVE AGAIN looks inside the unusual and extraordinary operations of the Cryonics Institute. The film follows Ben Best and Andy Zawacki, the caretakers of 99 deceased human bodies stored at below freezing temperatures in cryopreservation. The Institute and Cryonics Movement were founded by Robert Ettinger who, in his nineties and long retired from running the facility, still self-publishes books on cryonics, awaiting the end of his life and eagerly anticipating the next."
HPMOR hypothesis: Harry will use Timeless Decision Theory to resolve some of the time-turner paradoxes
Evidence:
- The obsession with precise times in the last few chapters, the prominence of time-turners in the plot in general, and Harry's vow to revive Hermione all indicate use of time-turners in the final arc.
- EY has worked many of his favorite ideas and themes (especially from the sequences) into HPMOR already. Timeless Decision Theory is without a doubt among his most prominent interests.
- Harry has already gained two superpowers (super-patronus and partial transfiguration) by virtue of, well, being a proponent of EY's favorite themes essentially. Why not a third?
- One specific concrete use would be to coordinate an indefinite number of selves in the way that Harry failed to do during the prime-factoring experiment in the early chapters. Why did that experiment fail? Not because time is impossible to mess with, but because one of the Harrys messed up. But since then, Harry has been pushing the bounds of paradox. If he could firmly pre-commit to follow through on a course of action (perhaps with an Unbreakable Vow?), he could have an indefinite number of Harrys coordinate on some action. There are many ways this could be useful.
Solomonoff induction on a random string
So, I've been hearing a lot about the awesomeness of Solomonoff induction, at least as a theoretical framework. However, my admittedly limited understanding of Solomonoff induction suggests that it would form epically bad hypotheses if given a random string. So my question is: if I misunderstood, how does it deal with randomness? And if I understood correctly, isn't this a rather large gap?
Edit: Thanks for all the comments! My new understanding is that Solomonoff induction is able to understand that it is dealing with a random string (because it finds itself giving equal weight to programs that output a 1 or a 0 for the next bit), but despite that it is designed to keep looking for a pattern forever. While this is a horrible use of resources, SI is a theoretical framework with infinite resources, so that's not a meaningful criticism. Overall this seems acceptable, though if you want to actually implement an SI you'll need to instruct it on giving up. Furthermore, the SI will not include randomness as a distinct function in its hypothesis, which could lead to improper weighting of priors, but it will still have good predictive power -- and considering that Solomonoff induction was only meant for computable functions, this is a pretty good result.
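To make the "converges to 50/50" point concrete, here is a toy stand-in (my own sketch; real Solomonoff induction is uncomputable): a mixture over simple k-th order Markov predictors weighted by a crude complexity prior. On a random bit string, its next-bit prediction drifts toward 0.5, i.e. maximal uncertainty, even though it never stops entertaining pattern hypotheses.

```python
import random

def markov_predict(bits, k):
    """Laplace-smoothed estimate of P(next bit = 1) given the last k bits."""
    if len(bits) < k:
        return 0.5
    context = bits[len(bits) - k:]
    ones = zeros = 0
    for i in range(k, len(bits)):
        if bits[i - k:i] == context:
            if bits[i] == 1:
                ones += 1
            else:
                zeros += 1
    return (ones + 1) / (ones + zeros + 2)

random.seed(0)
bits = [random.randint(0, 1) for _ in range(2000)]   # a "random string"

orders = range(4)                                  # predictors of order 0..3
weights = [2.0 ** -(k + 1) for k in orders]        # crude complexity prior
# (A full mixture would also reweight each predictor by its likelihood on the
# data seen so far; on random data they all do about equally badly.)

for n in (10, 100, 500, 2000):
    prefix = bits[:n]
    preds = [markov_predict(prefix, k) for k in orders]
    mixture = sum(w * p for w, p in zip(weights, preds)) / sum(weights)
    print(f"after {n:4d} random bits, mixture P(next bit = 1) = {mixture:.3f}")
```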
The Case For Free Will or Why LessWrong must commit to self determination
This is intended to eventually be a Main post and part of sequences on free will and religion. It will be part of the Free Will sequence.
Please comment if you do or do not think this post is ready for Main. I intend to move it there eventually. As with any post at LessWrong, I'm completely open to criticism, but I hope it's directed at improving the quality of the thinking here rather than kneejerk opposition to my ideas.
------------------------------------------------------
The main point of this post is that I intend to convince every rationalist here, and every casual reader, to commit to allowing others to have free will.
First a bit of background. I'm a conservative Christian. Growing up I considered myself a rationalist. Now that I've known about Less Wrong for several years and have read the sequences, I no longer think I can classify myself that way <grin>. Nowadays I usually consider myself a pragmatist. "Being a rationalist" now carries with it a significant weight in my mind of formal Bayes' theorem and such that I've never had time to fully follow through on and practice. I also have a little fear that completely committing to being Bayesian would eventually create a huge conflict between my faith and Bayesian reasoning - just a little fear; in the years I've been reading Less Wrong, those conflicts have all been resolved to my satisfaction. I also haven't committed simply because the math that gets thrown around here in Bayes' theorem discussions looks like it would take too much time for me to understand, and I'm already very busy (and, being an engineer and not a math major, a bit intimidated).
The main reason I come here is because this community thinks about thinking, which so few people around me do. I crave that introspection that happens here, and so I'm drawn back to it. Not always often, but enough to generally stay abreast of what's going on. (I also have to admit to myself that I come back because you people are very smart, and I want you to think of me as smart too, and have your approval, but I try to keep that in check <grin>)
Now that I've been here (online only - no meetups yet) and learned with you over the years, another reason I stay here is because of the clear success of Evolutionary Psychology in predicting human behavior. The clearest example I've ever had is this:
My children and I love to chase each other around the house. It drives my wife crazy, especially when it happens right at bedtime. At some point after I read about evolutionary psychology, this chain of logic dawned on me: The natural genetic behavior that's successful gets reinforced over generations -> Things you love to do naturally are joyful to you -> You pass those things on to your children through play the way lions play hunt with cubs -> Human parents and children get true joy from chasing each other because their ancestors loved the hunt and were successful at it!
Now THAT was an eye opener! It was the answer to a question I'd never known I had, which was this: Why do children love to chase, and why do I love to chase them? Because their ancestors survived that way and it was passed down to them genetically. I even like to playfully almost-catch-them-and-let-them-escape. I even playfully let them catch me, too. And we love it.
Religion has no answer to this question. Religion doesn't even know how to ask this question. But it flowed naturally out of Evolutionary Psychology just by my knowing that the concept existed! Powerful! Now, this post isn't really about religion, so I won't go into why that doesn't break my faith. I'll handle that in other posts. The reason I'm talking about it now is to get you to recognize that you are a tribal hunter by ancestry, even more fundamentally than you are the descendant of conquerors. And knowing that Politics Is The Mind Killer, you'll listen to this next part and take it seriously.
Less Wrong rationalists are growing, and being recognized by the religious community. As militant atheists. It's reported that this is a new thing among atheists, this new desire to spread atheist philosophies as strongly as any religion spreads its beliefs. I've seen it in a couple of places now, in about the last year.
I have a huge, scary concern for the future of our world. It's not atheism. And it's not religion. I fear future wars. As a military history enthusiast and a veteran, I've learned a lot about war. A lot. And the principle is true that those who don't learn from history are doomed to repeat it. Knowing that we are tribal animals, I see atheists as one tribe and religionists as another. Now that I see the growth and success of LW, I see a future pattern emerging in the United States:
Few atheists among an overwhelming Christian majority -> shrinking Christianity, growing atheism -> atheist tribalism growing well connected and strong -> natural tribal impulse not to tolerate different voices -> war between atheists and Christians.
Don't try to say this won't happen, and that rationalists will always allow other people to believe differently. Coherent Extrapolated Volition, Politics is the Mind Killer, and Eliezer's success in creating LW and the rationalist movement say otherwise. Now, today, the commitment to altruism seems like a solution, but it isn't. You all here are so very intelligent, and you seriously look down on those of faith. I see it all over the place. It's a real blind spot that you can't see because it's inside your mental algorithms. Altruism is very easily perverted into coercing other people because you know what is best for them. It's not enough by itself. It needs something else attached.
Someday there will come a time when new leaders will come up through the rationalist movement who don't have Eliezer's commitment to freedom. And power corrupts even good, compassionate people. So now I come to my request.
The rationalist movement needs this principle: a guarantee of free will for others who disagree with you, EVEN IF THEY ARE WRONG.
I know religions have not always had this either. Be better than the religions you despise. Recognize that they also are tribal animals trying to become civilized tribal animals.
I ask you personally to commit to making free will for all a part of your personal philosophy. And I ask you to formalize that as part of Less Wrong, the rationalist community, and your evangelical atheism. Plant the seed now so that it has time to grow. It is my fear that if you don't, your children's children, and my children's children, will know a brutal war of philosophies unlike any we have ever seen.
In a future post I'll cover how religions are the empirically determined solution to problems that prevented civilization from arising, and how rationalism is the modern, more specifically planned version. And why religion is not evil like you think it is.
Sincerely,
Troshen
Siren worlds and the perils of over-optimised search
tl;dr An unconstrained search through possible future worlds is a dangerous way of choosing positive outcomes. Constrained, imperfect or under-optimised searches work better.
Some suggested methods for designing AI goals, or controlling AIs, involve unconstrained searches through possible future worlds. This post argues that this is a very dangerous thing to do, because of the risk of being tricked by "siren worlds" or "marketing worlds". The thought experiment starts with an AI designing a siren world to fool us, but that AI is not crucial to the argument: it's simply an intuition pump to show that siren worlds can exist. Once they exist, there is a non-zero chance of us being seduced by them during an unconstrained search, whatever the search criteria are. This is a feature of optimisation: satisficing and similar approaches don't have the same problems.
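As a toy illustration of that last point (my own sketch, not from the post): if the evaluation of candidate worlds can be fooled, taking the argmax over a huge candidate pool selects precisely for whatever fools the evaluation most, whereas a satisficer that stops at the first "good enough" candidate is far less exposed to that tail.

```python
import random

# Each candidate "world" has a true value and an apparent value, where the
# apparent value includes an error term our inspection cannot see.
random.seed(1)

def candidate():
    true_value = random.gauss(0, 1)
    deception = random.gauss(0, 1)          # how much the evaluation is fooled
    return true_value, true_value + deception

pool = [candidate() for _ in range(100_000)]

# Optimiser: search the whole pool for the highest *apparent* value.
best_apparent = max(pool, key=lambda c: c[1])

# Satisficer: take the first candidate whose apparent value clears a threshold.
threshold = 2.0
good_enough = next(c for c in pool if c[1] >= threshold)

print("optimiser  -> true value %.2f (apparent %.2f)" % best_apparent)
print("satisficer -> true value %.2f (apparent %.2f)" % good_enough)
```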
The AI builds the siren worlds
Imagine that you have a superintelligent AI that's not just badly programmed, or lethally indifferent, but actually evil. Of course, it has successfully concealed this fact, as "don't let humans think I'm evil" is a convergent instrumental goal for all AIs.
We've successfully constrained this evil AI in an Oracle-like fashion. We ask the AI to design future worlds and present them to human inspection, along with an implementation pathway to create those worlds. Then, if we approve of those future worlds, the implementation pathway will cause them to exist (assume perfect deterministic implementation for the moment). The constraints we've programmed mean that the AI will do all these steps honestly. Its opportunity to do evil is limited exclusively to its choice of worlds to present to us.
The AI will attempt to design a siren world: a world that seems irresistibly attractive while concealing hideous negative features. If the human mind is hackable in the crude sense - maybe through a series of coloured flashes - then the AI would design the siren world to be subtly full of these hacks. It might be that there is some standard of "irresistibly attractive" that is actually irresistibly attractive: the siren world would be full of genuine sirens.
Even without those types of approaches, there's so much manipulation the AI could indulge in. I could imagine myself (and many people on Less Wrong) falling for the following approach:
AI risk, executive summary
MIRI recently published "Smarter than Us", a 50 page booklet laying out the case for considering AI as an existential risk. But many people have asked for a shorter summary, to be handed out to journalists for example. So I put together the following 2-page text, and would like your opinion on it.
In this post, I'm not so much looking for comments along the lines of "your arguments are wrong", but more "this is an incorrect summary of MIRI/FHI's position" or "your rhetoric is ineffective here".
AI risk
Bullet points
- The risks of artificial intelligence are strongly tied with the AI’s intelligence.
- There are reasons to suspect a true AI could become extremely smart and powerful.
- Most AI motivations and goals become dangerous when the AI becomes powerful.
- It is very challenging to program an AI with safe motivations.
- Mere intelligence is not a guarantee of safe interpretation of its goals.
- A dangerous AI will be motivated to seem safe in any controlled training setting.
- Not enough effort is currently being put into designing safe AIs.
Executive summary
The risks from artificial intelligence (AI) in no way resemble the popular image of the Terminator. That fictional mechanical monster is distinguished by many features – strength, armour, implacability, indestructability – but extreme intelligence isn’t one of them. And it is precisely extreme intelligence that would give an AI its power, and hence make it dangerous.