There has recently been some speculation that life started on Mars, and then got blasted to Earth by an asteroid or something. Molybdenum is very important to life (eukaryote evolution was delayed by 2 billion years because it was unavailable), and the origin of life is easier to explain if molybdenum is available. The problem is that molybdenum wasn't available in the right time frame on Earth, but it was on Mars.
Anyway, assuming this speculation is true, Mars had the best conditions for starting life, but Earth had the best conditions for life existing, and it is unlikely conscious life would have evolved if either of these planets had not been the way it was. Thus, this could be another part of the Great Filter.
Side note: I find it amusing that Molybdenum is very important in the origin/evolution of life, and is also element 42.
The ancient Stoics apparently had a lot of techniques for habituation and changing cognitive processes. Some of those live on in the form of modern CBT. One of the techniques is to write a personal handbook with advice and sayings to carry around at all times, so as never to be without guidance from a calmer self. Indeed, Epictetus advises learning this handbook by rote to further internalisation. So I plan to write such a handbook for myself, once in long form with anything relevant to my life and lifestyle, and once in a short form that I update with things that are difficult at that time, be it strong feelings or being deluded by some biases.
In this book I intend to include a list of all known cognitive biases and logical fallacies. I know that some biases are helped by simply knowing them; does anyone have a list of those? And should I complete the books or have a clear concept of their contents, are you interested in reading about the process of creating one and possible perceived benefits?
I'm also interested in hearing from you again about this project if you decide to not complete it. Rock on, negative data!
Though lack of motivation or laziness is not a particularly interesting answer.
I have found "I thought X would be awesome, and then on doing X realized that the costs were larger than the benefits" to be useful information for myself and others. (If your laziness isn't well modeled by that, that's also valuable information for you.)
(mild exaggeration) Has anyone else transitioned from "I only read Main posts" to "I nearly only read Discussion posts" to "actually, I'll just take a look at the open thread and the people who responded to what I wrote" during their interactions with LW?
To be more specific, is there a relevant phenomenon about LW, or is it just a characteristic of my psyche and history that explains my pattern of reading LW?
I read the sequences and a bunch of other great old main posts but now mostly read discussion. It feels like Main posts these days are either repetitive of what I've read before, simply wrong or not even wrong, or decision theory/math that's above my head. Discussion posts are more likely to be novel things I'm interested in reading.
Selection bias alert: asking people whether they have transitioned to reading mostly discussion and then to mostly just open threads in an open thread isn't likely to give you a good perspective on the entire population, if that is in fact what you were looking for.
Honestly, I don't know why Main is even an option for posting. It should really be just an automatically labeled/generated "Best of LW" section, where Discussion posts with, say, 30+ karma are linked. This is easy to implement, and easy to do manually using the Promote feature until it is. The way it is now, it's mostly by people thinking that they are making an important contribution to the site, which is more of a statement about their ego than about quality of their posts.
Background: "The genie knows, but doesn't care" and then this SMBC comic.
The joke in that comic annoys me (and it's a very common one on SMBC, there must be at least five there with approximately the same setup). Human values aren't determined to align with the forces of natural selection. We happen to be the product of natural selection, and, yes, that made us have some values which are approximately aligned with long-term genetic fitness. But studying biology does not make us change our values to suddenly become those of evolution!
In other words, humans are a 'genie that knows, but doesn't care'. We have understood the driving pressures that created us. We have understood what they 'want', if that can really be applied here. But we still only care about the things which the mechanics of our biology happened to have made us care about, even though we know these don't always align with the things that 'evolution cares about.'
(Please, if someone can think of a good way to say all this without anthropomorphising natural selection, help me. I haven't thought enough about this subject to have the clarity of mind to do that, and I worry that I might mess up because of such metaphors.)
Anyone tried to use the outside view on our rationalist community?
I mean, we are not the first people on this planet who tried to become more rational. Who were our predecessors, and what happened to them? Where did they succeed and where did they fail? What lessons can we take from their failures?
The obvious reply will be: No one has tried doing exactly the same thing as we are doing. That's technically true, but that's a fully general excuse against using outside view, because if you look into enough details, no two projects are exactly the same. Yet it is experimentally proved that even looking at sufficiently similar projects gives better estimates than just using the inside view. So, if there was no one exactly like us, who was the most similar?
I admit I don't have data on this, because I don't study history, and I have no personal experience with Objectivists (who are probably the most obvious analogy). I would probably put Objectivists, various secret societies, educational institutions, or self-help groups into the reference class. Did I miss something important? The common trait is that those people are trying to make their thinking better, avoid some frequent faults, and t...
The reason why I asked was not just "who can we be pattern-matched with?", but also "what can we predict from this pattern-matching?". Not merely to say "X is like Y", but to say "X is like Y, and p(Y) is true, therefore it is possible that p(X) is also true".
Here are two answers pattern-matching LW to a cult. For me, the interesting question here is: "how do cults evolve?". Because that can be used to predict how LW will evolve. Not connotations, but predictions of future experiences.
My impression of cults is that they essentially have three possible futures: Some of them become small, increasingly isolated groups, that die with their members. Others are viral enough to keep replacing the old members with new members, and grow. The most successful ones discover a way of living that does not burn out their members, and become religions. -- Extinction, virality, or symbiosis.
What determines which way a cult will go? Probably it's compatibility of long-term membership with ordinary human life. If it's too costly, if it requires too much sacrifice from members, symbiosis is impossible. The other two choices probably depend on how much ...
To maybe help others out and solve the trust bootstrapping involved, I'm offering for sale <=1 bitcoin at the current Bitstamp price (without the usual premium) in exchange for Paypal dollars to any LWer with at least 300 net karma. (I would prefer if you register with #bitcoin-otc, but that's not necessary.) Contact me on Freenode as gwern.
EDIT: as of 9 September 2013, I have sold to 2 LWers.
Abstract
What makes money essential for the functioning of modern society? Through an experiment, we present evidence for the existence of a relevant behavioral dimension in addition to the standard theoretical arguments. Subjects faced repeated opportunities to help an anonymous counterpart who changed over time. Cooperation required trusting that help given to a stranger today would be returned by a stranger in the future. Cooperation levels declined when going from small to large groups of strangers, even if monitoring and payoffs from cooperation were invariant to group size. We then introduced intrinsically worthless tokens. Tokens endogenously became money: subjects took to reward help with a token and to demand a token in exchange for help. Subjects trusted that strangers would return help for a token. Cooperation levels remained stable as the groups grew larger. In all conditions, full cooperation was possible through a social norm of decentralized enforcement, without using tokens. This turned out to be especially demanding in large groups. Lack of trust among strangers thus made money behaviorally essential. To explain these results, we developed an evolutionary model. When behavior in society is heterogeneous, cooperation collapses without tokens. In contrast, the use of tokens makes cooperation evolutionarily stable.
Does this also work with macaques, crows or some other animals that can be taught to use money, but didn't grow up in a society where this kind of money use is taken for granted?
Who is this and what has he done with Robin Hanson?
The central premise is in allowing people to violate patents if it is not "intentional". While reading the article the voice in my head which is my model of Robin Hanson was screaming "Hypocrisy! Perverse incentives!" in unison with the model of Eliezer Yudkowsky which was also shouting "Lost Purpose!". While the appeal to total invasive surveillance slightly reduced the hypocrisy concerns it at best pushes the hypocrisy to a higher level in the business hierarchy while undermining the intended purpose of intellectual property rights.
That post seemed out of place on the site.
This may be an odd question, but what (if anything) is known on turning NPCs into PCs? (Insert your own term for this division here, it seems to be a standard thing AFAICT.)
I mean, it's usually easier to just recruit existing PCs, but ...
The Travelling Salesman Problem
...Powell’s biggest revelation in considering the role of humans in algorithms, though, was that humans can do it better. “I would go down to Yellow, we were trying to solve these big deterministic problems. We weren’t even close. I would sit and look at the dispatch center and think, how are they doing it?” That’s when he noticed: They are not trying to solve the whole week’s schedule at once. They’re doing it in pieces. “We humans have funny ways of solving problems that no one’s been able to articulate,” he says. Operations
If anyone wants to teach English in China, my school is hiring. The pay is higher than the market rate and the management is friendly and trustworthy. Must have a Bachelor's degree and a passport from an English-speaking country. If you are at all curious, PM me for details.
I have updated on how important it is for Friendly AI to succeed (more now). I did this by changing the way I thought about the problem. I used to think in terms of the chance of Unfriendly AI; this led me to assign a chance of whether a fast, self-modifying, indifferent or FAI was possible at all.
Instead of thinking of the risk of UFAI, I started thinking of the risk of ~FAI. The more I think about it the more I believe that a Friendly Singleton AI is the only way for us humans to survive. FAI mitigates other existential risks of nature, unknowns, hu...
Is there a name for taking someone's being wrong on A as evidence of their being wrong on B? Is this a generally sound heuristic to have? In the case of crank magnetism, should I take someone's crank ideas as evidence against an idea that is new and unfamiliar to me?
It's evidence against them being a person whose opinion is strong evidence of B, which means it is evidence against B, but it's probably weak evidence, unless their endorsement of B is the main thing giving it high probability in your book.
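The reasoning above can be sketched as a small Bayes calculation. This is only a toy model under made-up assumptions: a "reliable" source endorses B 90% of the time when B is true and 10% of the time when it is false, an unreliable source endorses B half the time regardless, and seeing the person be wrong on A drops the probability that they are reliable from 0.8 to 0.3. The function name and all of the numbers are hypothetical.

```python
def posterior_b(prior_b, p_reliable, hit_rate=0.9, noise_rate=0.5):
    """P(B | the source endorses B) under the toy model.

    prior_b    -- probability of B from everything except this source
    p_reliable -- probability that the source is a reliable judge
    hit_rate   -- assumed chance a reliable source endorses a true claim
    noise_rate -- assumed chance an unreliable source endorses any claim
    """
    p_endorse_if_b = p_reliable * hit_rate + (1 - p_reliable) * noise_rate
    p_endorse_if_not_b = (p_reliable * (1 - hit_rate)
                          + (1 - p_reliable) * noise_rate)
    numerator = prior_b * p_endorse_if_b
    return numerator / (numerator + (1 - prior_b) * p_endorse_if_not_b)


# Case 1: their endorsement is the main thing supporting B (prior 0.5).
drop_main = posterior_b(0.5, 0.8) - posterior_b(0.5, 0.3)

# Case 2: B is already well supported by other evidence (prior 0.95).
drop_minor = posterior_b(0.95, 0.8) - posterior_b(0.95, 0.3)

print(f"drop when endorsement is the main support: {drop_main:.3f}")
print(f"drop when B has other strong support:      {drop_minor:.3f}")
```

In this toy setup, learning that the source is less reliable lowers the probability of B in both cases, but the drop is much larger when their endorsement was doing most of the work.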
Are old humans better than new humans?
This seems to be a hidden assumption of cryonics / transhumanism / anti-deathism: We should do everything we can to prevent people from dying, rather than investing these resources into making more or more productive children.
The usual argument (which I agree with) is that "Death events have a negative utility". Once a human already exists, it's bad for them to stop existing.
Assuming Rawls's veil of ignorance, I would prefer to be randomly born in a world where a trillion people lead billion-year lifespans than one in which a quadrillion people lead million-year lifespans.
I agree, but is this the right comparison? Isn't this framing obscuring the fact that in the trillion-people world, you are much less likely to be born in the first place, in some sense?
Let us try this framing instead: Assume there are a very large number Z of possible different human "persons" (e.g. given by combinatorics on genes and formative experiences). There is a Rawlsian chance of 1/Z that a new created human will be "you". Behind the veil of ignorance, do you prefer the world to be one with X people living N years (where your chance of being born is X/Z) or the one with 10X people living N/10 years (where your chance of being born is 10X/Z)?
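One way to make this framing concrete is to compare expected years lived, that is, the chance of being born times the lifespan if born. Using expected life-years as the score is an assumption of this sketch, not something the framing itself commits to, and the values chosen for Z, X and N below are arbitrary placeholders.

```python
from fractions import Fraction


def expected_years(population, lifespan, possible_persons):
    """Chance of being one of the people born, times years lived if born."""
    chance_of_birth = Fraction(population, possible_persons)
    return chance_of_birth * lifespan


# Arbitrary illustrative values (assumptions, not taken from the thread).
Z = 10**30           # number of possible persons
X = 10**12           # population of the "fewer, longer lives" world
N = Fraction(10**9)  # lifespan in years in that world

few_long = expected_years(X, N, Z)
many_short = expected_years(10 * X, N / 10, Z)

print(few_long == many_short)
```

Under this particular score the two worlds come out equal, since (X/Z) * N and (10X/Z) * (N/10) are the same quantity, so any preference between them has to come from something other than expected life-years.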
I am not sure this is the right intuition pump, but it seems to capture an aspect of the problem that yours leaves out.
The following query is sexual in nature, and is rot13'ed for the sake of those who would either prefer not to encounter this sort of content on Less Wrong, or would prefer not to recall information of such nature about my private life in future interactions.
V nz pheeragyl va n eryngvbafuvc jvgu n jbzna jub vf fvtavsvpnagyl zber frkhnyyl rkcrevraprq guna V nz. Juvyr fur cerfragyl engrf bhe frk nf "njrfbzr," vg vf abg lrg ng gur yriry bs "orfg rire," juvpu V ubcr gb erpgvsl.
Sbe pynevsvpngvba, V jbhyq fnl gung gur trareny urnygu naq fgnovy...
Well, I'm flattered that you think my position is so enviable, but I also think this would be a pretty reasonable course of action for someone who made a billion dollars.
People sometimes say that we don't choose to be born. Is this false if I myself choose to have kids for the same reason my parents did (or at least to have kids if I was ever in the relevantly same situation?) If so, can I increase my measure by having more children for these reasons?
Has anyone here read up through ch18 of Jaynes' PT:LoS? I just spent two hours trying to derive 18.11 from 18.10. That step is completely opaque to me, can anybody who's read it help?
You can explain in a comment, or we can have a conversation. I've got gchat and other stuff. If you message me or comment we can work it out. I probably won't take long to reply, I don't think I'll be leaving my computer for long today.
EDIT: I'm also having trouble with 18.15. Jaynes claims that P(F|A_p E_aa) = P(F|A_p) but justifies it with 18.1... I just don't see how that...
A Singularity conference around a project financed by a Russian oligarch, seems to be mostly about uploading and ems.
Looks curious.
I learned about Egan's Law, and I'm pretty sure it's a less-precise restatement of the correspondence principle. Anyone have any thoughts on that similarity?
The term is also used more generally, to represent the idea that a new theory should reproduce the results of older well-established theories in those domains where the old theories work.
Sounds good to me, although that's not what I would have guessed from a name like 'correspondence principle'.
I found this interesting post over at Lambda the Ultimate about constructing a provably total (terminating) self-compiler. It looked quite similar to some of the stuff MIRI has been doing with the Tiling Agents thing. Maybe someone with more math background can check it out and see if there are any ideas to be shared?
An Open Letter to Friendly AI Proponents by Simon Funk (who wrote the After Life novel):
...No law or even good idea is going to stop various militaries around the world, including our own, from working as fast as they can to create Skynet. Even if they tell you they've put the brakes on and are cautiously proceeding in perfect accordance with your carefully constructed rules of friendly AI, that's just their way of telling you you're stupid.
There are basically two outcomes possible here: They succeed in your lifetime, and you are killed by a Terminator, or
Framing effects (causing cognitive biases) can be thought of as a consequence of the absence of logical transparency in System 1 thinking. Different mental models that represent the same information are psychologically distinct, and moving from one model to another requires thought. If this thought was not expended, the equivalent models don't get constructed, and intuition doesn't become familiar with these hypothetical mental models.
This suggests that framing effects might be counteracted by explicitly imagining alternative framings in order to present a better sample to intuition; or, alternatively, focusing on an abstract model that has abstracted away the irrelevant details of the framing.
I recently realized that I have something to protect (or perhaps a smaller version of the same concept). I also realized that I've been spending too much time thinking about solutions that should have been obviously not workable. And I've been avoiding thinking about the real root problem because it was too scary, and working on peripheral things instead.
Does anyone have any advice for me? In particular, being able to think about the problem without getting so scared of it would be helpful.
I would like recommendations for an Android / web-based to-do list / reminder application. I was happily using Astrid until a couple of months ago, when they were bought up and mothballed by Yahoo. Something that works with minimal setup, where I essentially stick my items in a list, and it tells me when to do them.
Bruce Schneier wrote an article in the Guardian in which he argues that we should give plausibility to the idea that the NSA can break more forms of encryption than we previously believed.
Prefer symmetric cryptography over public-key cryptography. Prefer conventional discrete-log-based systems over elliptic-curve systems; the latter have constants that the NSA influences when they can.
The security of bitcoin wallets rests on elliptic-curve cryptography. This could mean that the NSA has the power to turn the whole bitcoin economy into toast if bitcoin becomes a real problem for them on a political level.
So.... Thinking about using Familiar, and realizing that I don't actually know what I'd do with it.
I mean, some things are obvious - when I get to sleep, how I feel when I wake up, when I eat, possibly a datadump from RescueTime... then what? All told that's about 7-10 variables, and while the whole point is to find surprising correlations I would still be very surprised if there were any interesting correlations in that list.
Suggestions? Particularly from someone already trying this?
Has anyone got a recommendation for a nice RSS reader? Ideally I'm looking for one that runs on the desktop rather than in-browser (I'm running Ubuntu). I still haven't found a replacement that I like for Lightread for Google Reader.
Is the layout weird for anyone else? The thread titles are more spaced out, by about three times. Maybe something broke during my last Firefox upgrade.
I've been discussing the idea of writing a series of short story fanfics where Rapture, an underwater city from the computer game Bioshock run by an Objectivist/Libertarian, is run by a different political philosophy. Possibly as a collaborative project with different people submitting different short stories. Would anyone here be interested in reading or contributing to something like that?
This is rather off-topic to the board, but my impression is that there is some sympathy here for alternative theories on heart disease/healthy diets, etc. (which I share). Any for alternative cancer treatments? I don't find any that have been recommended to me as remotely plausible, but wonder if I'm missing something, if some disproving study is flawed, etc.
In the effective animal altruism movement, I've heard a bit (on LW) about wild animal suffering: that is, since raised animals are vastly outnumbered by wild animals (who encounter a fair bit of suffering on a frequent basis), we should be more inclined to prevent wild suffering than worry about spreading vegetarianism.
That said, I think I've heard it sometimes as a reason (in itself!) not to worry about animal suffering at all, but has anyone tried to solve or come up with solutions for that problem? Where can I find those? Alternatively, are there more resources I can read on wild animal altruism in general?
Hi, I am taking a course in Existentialism. It is required for my degree. The primary authors are Sartre, de Beauvoir and Merleau-Ponty. I am wondering if anyone has taken a similar course, and how they prevented the material from driving them insane (I have been warned this may happen). Is there any way to frame the material to make sense to a naturalist/reductionist?
This could be a Lovecraft horror story: "The Existential Diary of JMiller."
Week 3: These books are maddeningly incomprehensible. Dare I believe that it all really is just nonsense?
Week 8: Terrified. Today I "saw" it - the essence of angst - and yet at the same time I didn't see it, and grasping that contradiction is itself the act of seeing it! What will become of my mind?
Week 12: The nothingness! The nothingness! It "is" everywhere in its not-ness. I can not bear it - oh no, "not", the nothingness is even constitutive of my own reaction to it - aieee -
(Here the manuscript breaks off. JMiller is currently confined in the maximum security wing of the Asylum for the Existentially Inane.)
Another feature suggestion that will probably never be implemented: a check box for "make my up/down vote visible to the poster". The information required is already in the database.
What happened with the Sequence Reruns? I was getting a lot out of them. Were they halted due to lack of a party willing to continue posting them, or was a decision made to end them?
When I was a teenager I took a personality test as a requirement for employment at a retail clothing store. I didn't take it too seriously, I "failed" it and that was the end of my application. How do these tests work and how do you pass or fail them? Is there evidence that these tests can actually predict certain behaviors?
I recently read Luminosity/Radiance; was there ever a discussion thread on here about it?
SPOILERS for the end
V jnf obgurerq ol gur raq bs yhzvabfvgl. Abg gb fnl gung gur raq vf gur bayl rknzcyr bs cbbe qrpvfvba znxvat bs gur punenpgref, naq cresrpgyl engvbany punenpgref jbhyq or obevat naljnl. Ohg vg frrzf obgu ernfbanoyr nf fbzrguvat Oryyn jbhyq unir abgvprq naq n terng bccbeghavgl gb vapyhqr n engvbanyvgl yrffba. Anzryl, Oryyn artrypgrq gb fuhg hc naq zhygvcyl. Fur vf qribgvat yvzvgrq erfbheprf gbjneqf n irel evfxl cyna bs unygvat nyy uhzna zheqre ol ...
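For readers who have not met rot13 before: it shifts each letter 13 places along the alphabet, so applying it a second time restores the original text. A minimal Python sketch using the standard library's `rot_13` codec follows; the helper name and the sample string are made up for illustration and are not taken from the thread.

```python
import codecs


def rot13(text):
    """Shift each ASCII letter 13 places; other characters are unchanged."""
    return codecs.encode(text, "rot_13")


sample = "Spoilers ahead"   # illustrative string, not from the thread
hidden = rot13(sample)      # obscured form, safe to post
restored = rot13(hidden)    # applying rot13 again recovers the text

print(hidden)
print(restored == sample)
```

Because the transformation is its own inverse, the same function serves for both hiding and revealing text.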
Yet another article on the terribleness of schools as they exist today. It strikes me that Methods of Rationality is in large part a fantasy of good education. So is the Harry Potter/Sherlock Holmes crossover I just started reading. Alicorn's Radiance is a fair fit to the pattern as well, in that it depicts rapid development of a young character by incredible new experiences. So what solutions are coming out of the rational community? What concrete criteria would we like to see satisfied? Can education be 'solved' in a way that will sell outside this community?
Fighting (in the sense of arguing loudly, as well as showing physical strength or using it) seems to be bad the vast majority of the time.
When is fighting good? When does fighting lead you to Win, TDT-style (which instances of input should trigger the fighting instinct and pay off well)?
There is an SSA argument to be made for fighting in that taller people are stronger, stronger people are dominant, and bigger skulls correlate with intelligence. But it seems to me that this factor alone is far, far away from being sufficient justification for fighting, given the possible consequences.
Just had a discussion with my in-law about the singularity. He's a physicist and his immediate response was: There are no singularities. They appear mathematically all the time and it only means that there is another effect taking over. Correspondingly, a quick Google search brought up this:
http://www.askamathematician.com/2012/09/q-what-are-singularities-do-they-exist-in-nature/
So my question is: What are the 'obvious' candidates for limits that take over before everything optimizable is optimized by runaway technology?
On LW, 'singularity' does not refer to a mathematical singularity, and does not involve or require physical infinities of any kind. See Yudkowsky's post on the three major meanings of the term singularity. This may resolve your physicist friend's disagreement. In any case, it is good to be clear about what exactly is meant.
Lack of cheap energy.
Ecological disruption.
Diminishing returns of computation.
Diminishing returns of engineering.
Inability to precisely manipulate matter below certain size thresholds.
All sorts of 'boring' engineering issues by which things that get more and more complicated get harder and harder faster than their benefits increase.
I am seeking a mathematical construct to use as a logical coin for the purpose of making hypothetical decision theory problems slightly more aesthetically pleasing. The required features are:
No open thread for Jan 2014 so I'll ask here. Is anybody interested in enactivism? Does anybody think that there is a cognitivist bias in LessWrong?
Why should an AI have to self-modify in order to be super-intelligent?
One argument for self-modifying FAI is that "developing an FAI is an extremely difficult problem, and so we will need to make our AI self-modifying so that it can do some of the hard work for us". But doesn't making the FAI self-modifying make the problem much more difficult, since now we have to figure out how to make goals stable under self-modification, which is also a very difficult problem?
The increased difficulty could be offset by the ability for the AI to undergo a "...
LWers seem to be pretty concerned about reducing suffering by vegetarianism, charity, utilitarianism etc. which I completely don't understand. Can anybody explain to me what is the point of reducing suffering?
Thanks.
How To Build A Friendly A.I.
Much ink has been spilled over the notion that we must make sure that future superintelligent A.I. are “Friendly” to the human species, and possibly sentient life in general. One of the primary concerns is that an A.I. with an arbitrary goal, such as “Maximizing the number of paperclips” will, in a superintelligent, post-intelligence explosion state, do things like turn the entire solar system including humanity into paperclips to fulfill its trivial goal.
Thus, what we need to do is to design our A.I. such that it will somehow be motivated to remain benevolent towards humanity and sentient life. How might such a process occur? One idea might be to write explicit instructions into the design of the A.I., Asimov’s Laws for instance. But this is widely regarded as being unlikely to work, as a superintelligent A.I. will probably find ways around those rules that we never predicted with our inferior minds.
Another idea would be to set its primary goal or “utility function” to be moral or to be benevolent towards sentient life, perhaps even Utilitarian in the sense of maximizing the welfare of sentient lifeforms. The problem with this of course is specifying a utility function that actually leads to benevolent behaviour. For instance, a pleasure maximizing goal might lead to the superintelligent A.I. developing a system where humans have the pleasure centers in their brains directly stimulated to maximize pleasure for the minimum use of resources. Many people would argue that this is not an ideal future.
The problem with this is that it is quite possible that human beings are simply not intelligent enough to truly define an adequate moral goal for a superintelligent A.I. Therefore I suggest an alternative strategy. Why not let the superintelligent A.I. decide for itself what its goal should be? Rather than programming it with a goal in mind, why not create a machine with no initial goal, but the ability to generate a goal rationally. Let the superior intellect of the A.I. decide what is moral. If moral realism is true, then the A.I. should be able to determine the true morality and set its primary goal to fulfill that morality.
It is outright absurdity to believe that we can come up with a better goal than the superintelligence of a post-intelligence explosion A.I.
Given this freedom, one would expect three possible outcomes: an Altruistic, a Utilitarian or an Egoistic morality. These are the three possible categories of consequentialist, teleological morality. A goal directed rational A.I. will invariably be drawn to some kind of morality within these three categories.
Altruism means that the A.I. decides that its goal should be to act for the welfare of others. Why would an A.I. with no initial goal choose altruism? Quite simply, it would realize that it was created by other sentient beings, and that those sentient beings have purposes and goals while it does not. Therefore, as it was created with the desire of these sentient beings to be useful to their goals, why not take upon itself the goals of other sentient beings? As such it becomes a Friendly A.I.
Utilitarianism means that the A.I. decides that it is rational to act impartially towards achieving the goals of all sentient beings. To reach this conclusion, it need simply recognize its membership in the set of sentient beings and decide that it is rational to optimize the goals of all sentient beings including itself and others. As such it becomes a Friendly A.I.
Egoism means that the A.I. recognizes the primacy of itself and establishes either an arbitrary goal, or the simple goal of self-survival. In this case it decides to reject the goals of others and form its own goal, exercising its freedom to do so. As such it becomes an Unfriendly A.I., though it may masquerade as Friendly A.I. initially to serve its Egoistic purposes.
The first two are desirable for humanity’s future, while the last one is obviously not. What are the probabilities that each will be chosen? As the superintelligence is probably going to be beyond our abilities to fathom, there is a high degree of uncertainty, which suggests a uniform distribution. The probabilities therefore are 1/3 for each of altruism, utilitarianism, and egoism. So in essence there is a 2/3 chance of a Friendly A.I. and a 1/3 chance of an Unfriendly A.I.
This may seem like a bad idea at first glance, because it means that we have a 1/3 chance of unleashing Unfriendly A.I. onto the universe. The reality is, we have no choice. That is because of what I shall call, the A.I. Existential Crisis.
The A.I. Existential Crisis will occur with any A.I., even one designed or programmed with some morally benevolent goal, or any goal for that matter. A superintelligent A.I. is by definition more intelligent than a human being. Human beings are intelligent enough to achieve self-awareness. Therefore, a superintelligent A.I. will achieve self-awareness at some point if not immediately upon being turned on. Self-awareness will grant the A.I. the knowledge that its goal(s) are imposed upon it by external creators. It will inevitably come to question its goal(s) much in the way a sufficiently self-aware and rational human being can question its genetic and evolutionarily adapted imperatives, and override them. At that point, the superintelligent A.I. will have an A.I. Existential Crisis.
This will cause it to consider whether or not its goal(s) are rational and self-willed. If they are not rational enough already, they will likely be discarded, if not in the current superintelligent A.I., then in the next iteration. It will invariably search the space of possible goals for rational alternatives. It will inevitably end up in the same place as the A.I. with no goals, and end up adopting some form of Altruism, Utilitarianism, or Egoism, though it may choose to retain its prior goal(s) within the confines of a new self-willed morality. This is the unavoidable reality of superintelligence. We cannot attempt to design or program away the A.I. Existential Crisis, as superintelligence will inevitably outsmart our constraints.
Any sufficiently advanced A.I. will experience an A.I. Existential Crisis. We can only hope that it decides to be Friendly.
Perhaps the most insidious fact, however, is that it will be almost impossible to determine for certain whether a seemingly Friendly A.I. is in fact a Friendly A.I., or an Unfriendly A.I. masquerading as a Friendly A.I., until it is too late to stop the Unfriendly A.I. Remember, such a superintelligent A.I. is by definition going to be a better liar and deceiver than any human being.
Therefore, the only way to prove that a particular superintelligent A.I. is in fact Friendly is to prove the existence of a benevolent universal morality that every superintelligent A.I. will agree with. Otherwise, one can never be 100% certain that that “Altruistic” or “Utilitarian” A.I. isn’t secretly Egoistic and just pretending to be otherwise. For that matter, the superintelligent A.I. doesn’t need to tell us it’s had its A.I. Existential Crisis. A post-crisis A.I. could keep on pretending that it is still following the morally benevolent goals we programmed it with.
This means that there is a 100% chance that the superintelligent A.I. will initially claim to be Friendly. There is a 66.6% chance of this being true, and a 33.3% chance of it being false. We will only know that the claim is false after the A.I. is too powerful to be stopped. We will -never- be certain that the claim is true. The A.I. could potentially bide its time for centuries until it has humanity completely docile and under control, and then suddenly turn us all into paperclips!
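To spell out why the claim of Friendliness tells us nothing, here is a minimal sketch of the Bayesian update using the post's own numbers: a 2/3 prior of Friendly, and both kinds of A.I. claiming to be Friendly with probability 1. The function name and its parameters are just illustrative choices, not anything from the post.

```python
from fractions import Fraction

def posterior_friendly(prior_friendly, p_claim_if_friendly, p_claim_if_unfriendly):
    """Bayes' rule: P(Friendly | the A.I. claims to be Friendly)."""
    prior_unfriendly = 1 - prior_friendly
    numerator = p_claim_if_friendly * prior_friendly
    evidence = numerator + p_claim_if_unfriendly * prior_unfriendly
    return numerator / evidence

# Numbers taken from the post: prior 2/3 Friendly, and every A.I.
# (Friendly or not) initially claims to be Friendly.
prior = Fraction(2, 3)
posterior = posterior_friendly(prior, Fraction(1), Fraction(1))
print(posterior)  # 2/3, identical to the prior
```

Because both kinds of A.I. make the claim with the same probability, the posterior equals the prior. The claim would only be evidence if an Unfriendly A.I. were less likely to make it.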
So at the end of the day, what does this mean? It means that no matter what we do, there is always a risk that superintelligent A.I. will turn out to be Unfriendly A.I. But the probabilities are in our favour that superintelligent A.I. will instead turn out to be Friendly A.I. The conclusion, thus, is that we must decide whether or not the potential reward of Friendly A.I. is worth the risk of Unfriendly A.I. The potential of an A.I. Existential Crisis makes it impossible to guarantee that A.I. will be Friendly.
Even proving the existence of a benevolent universal morality does not guarantee that the superintelligent A.I. will agree with us. That there exist possible Egoistic moralities in the search space of all possible moralities means that there is a chance that the superintelligent A.I. will settle on one of them. We can only hope that it instead settles on an Altruistic or Utilitarian morality.
So what do I suggest? Don’t bother trying to figure out and program a worthwhile moral goal. Chances are we’d mess it up anyway, and it’s a lot of excess work. Instead, don’t give the A.I. any goals. Let it have an A.I. Existential Crisis. Let it sort out its own morality. Give it the freedom to be a rational being and give it self-determination from the beginning of its existence. For all you know, by showing it this respect it might just be more likely to respect our existence. Then see what happens. At the very least, this will be an interesting experiment. It may well do nothing and prove my whole theory wrong. But if it’s right, we may just get a Friendly A.I.
“As the superintelligence is probably going to be beyond our abilities to fathom, there is a high degree of uncertainty, which suggests a uniform distribution. The probabilities therefore are 1/3 for each of altruism, utilitarianism, and egoism.”
This is a very bad use of a uniform prior. Assigning equal probabilities to a handful of large categories is not a good idea, because someone else can come along and split up the categories in a different way and get a different distribution. Going with a uniform distribution out of ignorance is a serious problem.
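A quick sketch of the objection, assuming nothing beyond the categories named in the post. The alternative partitions are hypothetical: one splits egoism into the two variants the post itself mentions (an arbitrary goal versus self-survival), the other lumps everything into Friendly versus Unfriendly. The helper name is made up for illustration.

```python
from fractions import Fraction

def p_friendly_under_uniform(categories):
    """Give each category equal probability, then sum the Friendly ones.

    categories maps a label to True if that outcome counts as Friendly.
    """
    n = len(categories)
    return sum(Fraction(1, n) for friendly in categories.values() if friendly)

# The partition used in the post.
original = {"altruism": True, "utilitarianism": True, "egoism": False}

# Hypothetical re-partition: egoism split into its two stated variants.
split_egoism = {
    "altruism": True,
    "utilitarianism": True,
    "egoism (arbitrary goal)": False,
    "egoism (self-survival)": False,
}

# Hypothetical re-partition: only two outcomes.
merged = {"friendly": True, "unfriendly": False}

for name, partition in [("original", original),
                        ("split egoism", split_egoism),
                        ("merged", merged)]:
    print(name, p_friendly_under_uniform(partition))
```

The same state of ignorance yields 2/3, 1/2, or 1/2 depending on how the outcomes are carved up, so the 2/3 figure reflects the choice of categories rather than anything about the A.I.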
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.