Negative and Positive Selection

71 alyssavance 06 July 2012 01:34AM

(Originally posted to my blog, The Rationalist Conspiracy; cross-posted here on request of Lukeprog.)

You’re the captain of a team, and you want to select really good players. How do you do it?

One way is through what I call positive selection. You devise a test – say, who can run the fastest – and pick the people who do best. If you want to be really strict, like if you’re selecting for the Olympics, you only pick the top fraction of a percent. If you’re a player, and you want to get selected, you have to train to do better on the test.

The opposite method is negative selection. Instead of one test to pick out winners, you design many tests to pick out losers. You test, say, who can’t run very well when it’s hot out, and get rid of them. Then you test who can’t run very well when it’s cold out, and get rid of them. Then you test running in the rain, and get rid of the losers there. And so on and so forth. When you’re strict with negative selection, you have lots and lots of tests, so that it’s very hard for any one person to pass through all the filters.

I think a big part of where American society’s gone wrong over the last hundred years is the ubiquitous use of negative selection over positive selection. (Athletics is one of the only exceptions. It’s apparently so important that people really care about performance – as opposed to, say, in medicine, where we exclude brilliant doctors if they don’t have the stamina to work ninety hours a week.) A single test can always be flawed; for example, IQ tests and SATs have many flaws. However, with negative selection, how badly you do is determined by the failure rate of every test combined. If you have twenty tests, and even one of them is so flawed it excludes good players, then your team will suck.
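
To make the compounding concrete, here is a minimal sketch. It assumes each filter independently passes a genuinely good candidate with the same probability; the 95% figure and the function name are illustrative assumptions, not numbers from any real selection process.

```python
def survival_rate(pass_rate_per_test: float, num_tests: int) -> float:
    """Chance that a good candidate clears every filter, assuming the
    filters are independent and equally (un)reliable."""
    return pass_rate_per_test ** num_tests


# Hypothetical numbers: each test wrongly rejects 5% of good candidates.
one_test = survival_rate(0.95, 1)
twenty_tests = survival_rate(0.95, 20)

print(f"good candidates surviving 1 test:   {one_test:.2f}")
print(f"good candidates surviving 20 tests: {twenty_tests:.2f}")
```

Under these assumptions, roughly a third of the good candidates make it through twenty filters, even though each filter looks fairly accurate on its own.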

Elite college admissions is an example of a negative selection test. There’s no one way you can do really, really well, and thereby be admitted to Harvard. Instead, you have to pass a bunch of different selection filters: Are your SATs good enough? Are your grades good enough? Is your essay good enough? Are your extracurriculars good enough? Are your recommendations good enough? Failure on any one step usually means not getting admitted. And as competition has intensified, colleges have added more and more filters, like the supplemental applications top schools now require (in addition to the Common Application). It wasn’t always this way – Harvard used to admit primarily based on an entrance exam – until they discovered this let too many Jews in (no, seriously). More recently, the negative selection has been intensified by eliminating the SAT’s high ceiling.

Academia is another example of negative selection. To get tenure, first you have to get into a top PhD program. Then you have to graduate. Then you have to get a good recommendation from your advisor. Then you have to get a good postdoc. Then you have to get another good postdoc. Then you have to get a good assistant professorship. Then you have to get approved by the tenure committee. For the most part, if even one of those steps goes wrong – if you went to a second-tier PhD program, say – there’s no way to recover. Once you’re off the “track”, you’re off, and there’s no getting back on. It’s fail once, fail forever.

Grades are another example – A is a good grade, but there’s no excellent grade. There’s no grade that you only get if you’re in the top 0.1%. Hence, getting a really good GPA doesn’t mean excelling, so much as it means never failing. If you’re in high school and are taking six classes, if you fail one, your GPA is now 3.3 or less, regardless of how good you are otherwise.
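
The GPA arithmetic can be checked with a short sketch. It assumes an unweighted 4.0 scale on which an A counts as 4.0 and a failing grade as 0.0; real schools vary, so treat the scale as an assumption.

```python
def gpa(grade_points: list[float]) -> float:
    """Unweighted grade point average over equally weighted classes."""
    return sum(grade_points) / len(grade_points)


# Six classes: five A's (4.0 each) and one failing grade (0.0).
five_as_one_f = gpa([4.0, 4.0, 4.0, 4.0, 4.0, 0.0])
print(f"GPA with one failure: {five_as_one_f:.2f}")
```

Since 4.0 is the ceiling, no amount of excellence in the other five classes can pull the average above 20/6, or about 3.33.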

In any field, at the top end, you tend to get a lot of variance. (Insert tales of the mad artist and mad mathematician.) Negative selection suppresses variance, by eliminating many of the dimensions on which people vary. Students at Yale are, for the most part, all strikingly similar – same socioeconomic class, same interests, same pursuits, same life goals, even the same style of dress. A lot of people tend to assume performance follows a bell curve, but in some cases, it’s more like a Pareto distribution: the top people do hundreds or thousands of times better than average. Hence, if you eliminate the small fraction of people at the very top, your performance is hosed. Fortunately for VC funds, the startup world is still positive selection.
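
As a rough illustration of how much the two shapes differ, the sketch below compares the share of total output produced by the top 1% under each. The parameters are assumptions chosen for illustration only: a Pareto shape of 1.16 (the value behind the familiar "80/20" rule) and a bell curve with mean 100 and standard deviation 15. The Pareto formula used here requires a shape above 1 so that the mean is finite.

```python
from statistics import NormalDist


def pareto_top_share(top_fraction: float, alpha: float) -> float:
    """Share of the total held by the top fraction of a Pareto
    distribution with shape alpha > 1 (from its Lorenz curve)."""
    return top_fraction ** (1 - 1 / alpha)


def normal_top_share(top_fraction: float, mean: float, sd: float) -> float:
    """Share of the total held by the top fraction of a normal
    distribution (assumes values are almost never negative)."""
    std = NormalDist()
    z = std.inv_cdf(1 - top_fraction)  # cutoff for the top fraction
    return top_fraction + sd * std.pdf(z) / mean


pareto_share = pareto_top_share(0.01, alpha=1.16)
normal_share = normal_top_share(0.01, mean=100, sd=15)

print(f"top 1% share, Pareto:     {pareto_share:.1%}")
print(f"top 1% share, bell curve: {normal_share:.1%}")
```

With these assumed parameters the top 1% account for about half of the total under the Pareto distribution, but only a little over 1% under the bell curve, which is why losing the very top matters so much more in the first case.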

Less obviously, a world with lots of negative selection might be a nasty one to live in. If you think of yourself as trying to eliminate bad, rather than encourage good, you start operating on the purity vs. contamination moral axis. Any tiny amount of bad, anywhere, must be gotten rid of, and that can lead to all sorts of nastiness. “When you are a Guardian of the Truth, all you can do is try to stave off the inevitable slide into entropy by zapping anything that departs from the Truth.  If there’s some way to pump against entropy, generate new true beliefs along with a little waste heat, that same pump can keep the truth alive without secret police.”

Personal research update

4 Mitchell_Porter 29 January 2012 09:32AM

Synopsis: The brain is a quantum computer and the self is a tensor factor in it - or at least, the truth lies more in that direction than in the classical direction - and we won't get Friendly AI right unless we get the ontology of consciousness right.

Followed by: Does functionalism imply dualism?

Sixteen months ago, I made a post seeking funding for personal research. There was no separate Discussion forum then, and the post was comprehensively downvoted. I did manage to keep going at it, full-time, for the next sixteen months. Perhaps I'll get to continue; it's for the sake of that possibility that I'll risk another breach of etiquette. You never know who's reading these words and what resources they have. Also, there has been progress.

I think the best place to start is with what orthonormal said in response to the original post: "I don't think anyone should be funding a Penrose-esque qualia mysterian to study string theory." If I now took my full agenda to someone out in the real world, they might say: "I don't think it's worth funding a study of 'the ontological problem of consciousness in the context of Friendly AI'." That's my dilemma. The pure scientists who might be interested in basic conceptual progress are not engaged with the race towards technological singularity, and the apocalyptic AI activists gathered in this place are trying to fit consciousness into an ontology that doesn't have room for it. In the end, if I have to choose between working on conventional topics in Friendly AI, and on the ontology of quantum mind theories, then I have to choose the latter, because we need to get the ontology of consciousness right, and it's possible that a breakthrough could occur in the world outside the FAI-aware subculture and filter through; but as things stand, the truth about consciousness would never be discovered by employing the methods and assumptions that prevail inside the FAI subculture.

Perhaps I should pause to spell out why the nature of consciousness matters for Friendly AI. The reason is that the value system of a Friendly AI must make reference to certain states of conscious beings - e.g. "pain is bad" - so, in order to make correct judgments in real life, at a minimum it must be able to tell which entities are people and which are not. Is an AI a person? Is a digital copy of a human person, itself a person? Is a human body with a completely prosthetic brain still a person?

I see two ways in which people concerned with FAI hope to answer such questions. One is simply to arrive at the right computational, functionalist definition of personhood. That is, we assume the paradigm according to which the mind is a computational state machine inhabiting the brain, with states that are coarse-grainings (equivalence classes) of exact microphysical states. Another physical system which admits the same coarse-graining - which embodies the same state machine at some macroscopic level, even though the microscopic details of its causality are different - is said to embody another instance of the same mind.

An example of the other way to approach this question is the idea of simulating a group of consciousness theorists for 500 subjective years, until they arrive at a consensus on the nature of consciousness. I think it's rather unlikely that anyone will ever get to solve FAI-relevant problems in that way. The level of software and hardware power implied by the capacity to do reliable whole-brain simulations means you're already on the threshold of singularity: if you can simulate whole brains, you can simulate part brains, and you can also modify the parts, optimize them with genetic algorithms, and put them together into nonhuman AI. Uploads won't come first.

But the idea of explaining consciousness this way, by simulating Daniel Dennett and David Chalmers until they agree, is just a cartoon version of similar but more subtle methods. What these methods have in common is that they propose to outsource the problem to a computational process using input from cognitive neuroscience. Simulating a whole human being and asking it questions is an extreme example of this (the simulation is the "computational process", and the brain scan it uses as a model is the "input from cognitive neuroscience"). A more subtle method is to have your baby AI act as an artificial neuroscientist, use its streamlined general-purpose problem-solving algorithms to make a causal model of a generic human brain, and then to somehow extract from that, the criteria which the human brain uses to identify the correct scope of the concept "person". It's similar to the idea of extrapolated volition, except that we're just extrapolating concepts.

It might sound a lot simpler to just get human neuroscientists to solve these questions. Humans may be individually unreliable, but they have lots of cognitive tricks - heuristics - and they are capable of agreeing that something is verifiably true, once one of them does stumble on the truth. The main reason one would even consider the extra complication involved in figuring out how to turn a general-purpose seed AI into an artificial neuroscientist, capable of extracting the essence of the human decision-making cognitive architecture and then reflectively idealizing it according to its own inherent criteria, is shortage of time: one wishes to develop friendly AI before someone else inadvertently develops unfriendly AI. If we stumble into a situation where a powerful self-enhancing algorithm with arbitrary utility function has been discovered, it would be desirable to have, ready to go, a schema for the discovery of a friendly utility function via such computational outsourcing.

Now, jumping ahead to a later stage of the argument, I argue that it is extremely likely that distinctively quantum processes play a fundamental role in conscious cognition, because the model of thought as distributed classical computation actually leads to an outlandish sort of dualism. If we don't concern ourselves with the merits of my argument for the moment, and just ask whether an AI neuroscientist might somehow overlook the existence of this alleged secret ingredient of the mind, in the course of its studies, I do think it's possible. The obvious noninvasive way to form state-machine models of human brains is to repeatedly scan them at maximum resolution using fMRI, and to form state-machine models of the individual voxels on the basis of this data, and then to couple these voxel-models to produce a state-machine model of the whole brain. This is a modeling protocol which assumes that everything which matters is physically localized at the voxel scale or smaller. Essentially we are asking, is it possible to mistake a quantum computer for a classical computer by performing this sort of analysis? The answer is definitely yes if the analytic process intrinsically assumes that the object under study is a classical computer. If I try to fit a set of points with a line, there will always be a line of best fit, even if the fit is absolutely terrible. So yes, one really can describe a protocol for AI neuroscience which would be unable to discover that the brain is quantum in its workings, and which would even produce a specific classical model on the basis of which it could then attempt conceptual and volitional extrapolation.
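
The line-fitting remark can be demonstrated directly. This is a minimal sketch using ordinary least squares on made-up data: points lying exactly on a parabola, which no straight line describes. The fit still returns a definite line; only the goodness-of-fit score reveals that the model class was wrong.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys).
    Returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Illustrative data: a symmetric parabola, y = x^2.
xs = list(range(-5, 6))
ys = [x * x for x in xs]

slope, intercept, r_squared = fit_line(xs, ys)
print(f"best-fit line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"R^2 = {r_squared:.2f}")
```

The procedure happily reports a flat line through the mean of the data, with an R^2 of zero. A protocol that never inspects the fit quality, or never considers a richer model class, would accept that line as its model.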

Clearly you can try to circumvent comparably wrong outcomes, by adding reality checks and second opinions to your protocol for FAI development. At a more down-to-earth level, these exact mistakes could also be made by human neuroscientists, for the exact same reasons, so it's not as if we're talking about flaws peculiar to a hypothetical "automated neuroscientist". But I don't want to go on about this forever. I think I've made the point that wrong assumptions and lax verification can lead to FAI failure. The example of mistaking a quantum computer for a classical computer may even have a neat illustrative value. But is it plausible that the brain is actually quantum in any significant way? Even more incredibly, is there really a valid a priori argument against functionalism regarding consciousness - the identification of consciousness with a class of computational process?

I have previously posted (here) about the way that an abstracted conception of reality, coming from scientific theory, can motivate denial that some basic appearance corresponds to reality. A perennial example is time. I hope we all agree that there is such a thing as the appearance of time, the appearance of change, the appearance of time flowing... But on this very site, there are many people who believe that reality is actually timeless, and that all these appearances are only appearances; that reality is fundamentally static, but that some of its fixed moments contain an illusion of dynamism.

The case against functionalism with respect to conscious states is a little more subtle, because it's not being said that consciousness is an illusion; it's just being said that consciousness is some sort of property of computational states. I argue first that this requires dualism, at least with our current physical ontology, because conscious states are replete with constituents not present in physical ontology - for example, the "qualia", an exotic name for very straightforward realities like: the shade of green appearing in the banner of this site, the feeling of the wind on your skin, really every sensation or feeling you ever had. In a world made solely of quantum fields in space, there are no such things; there are just particles and arrangements of particles. The truth of this ought to be especially clear for color, but it applies equally to everything else.

In order that this post should not be overlong, I will not argue at length here for the proposition that functionalism implies dualism, but shall proceed to the second stage of the argument, which does not seem to have appeared even in the philosophy literature. If we are going to suppose that minds and their states correspond solely to combinations of mesoscopic information-processing events like chemical and electrical signals in the brain, then there must be a mapping from possible exact microphysical states of the brain, to the corresponding mental states. Supposing we have a mapping from mental states to coarse-grained computational states, we now need a further mapping from computational states to exact microphysical states. There will of course be borderline cases. Functional states are identified by their causal roles, and there will be microphysical states which do not stably and reliably produce one output behavior or the other.

Physicists are used to talking about thermodynamic quantities like pressure and temperature as if they have an independent reality, but objectively they are just nicely behaved averages. The fundamental reality consists of innumerable particles bouncing off each other; one does not need, and one has no evidence for, the existence of a separate entity, "pressure", which exists in parallel to the detailed microphysical reality. The idea is somewhat absurd.

Yet this is analogous to the picture implied by a computational philosophy of mind (such as functionalism) applied to an atomistic physical ontology. We do know that the entities which constitute consciousness - the perceptions, thoughts, memories... which make up an experience - actually exist, and I claim it is also clear that they do not exist in any standard physical ontology. So, unless we get a very different physical ontology, we must resort to dualism. The mental entities become, inescapably, a new category of beings, distinct from those in physics, but systematically correlated with them. Except that, if they are being correlated with coarse-grained neurocomputational states which do not have an exact microphysical definition, only a functional definition, then the mental part of the new combined ontology is fatally vague. It is impossible for fundamental reality to be objectively vague; vagueness is a property of a concept or a definition, a sign that it is incomplete or that it does not need to be exact. But reality itself is necessarily exact - it is something - and so functionalist dualism cannot be true unless the underdetermination of the psychophysical correspondence is replaced by something which says for all possible physical states, exactly what mental states (if any) should also exist. And that inherently runs against the functionalist approach to mind.

Very few people consider themselves functionalists and dualists. Most functionalists think of themselves as materialists, and materialism is a monism. What I have argued is that functionalism, the existence of consciousness, and the existence of microphysical details as the fundamental physical reality, together imply a peculiar form of dualism in which microphysical states which are borderline cases with respect to functional roles must all nonetheless be assigned to precisely one computational state or the other, even if no principle tells you how to perform such an assignment. The dualist will have to suppose that an exact but arbitrary border exists in state space, between the equivalence classes.

This - not just dualism, but a dualism that is necessarily arbitrary in its fine details - is too much for me. If you want to go all Occam-Kolmogorov-Solomonoff about it, you can say that the information needed to specify those boundaries in state space is so great as to render this whole class of theories of consciousness not worth considering. Fortunately there is an alternative.

Here, in addressing this audience, I may need to undo a little of what you may think you know about quantum mechanics. Of course, the local preference is for the Many Worlds interpretation, and we've had that discussion many times. One reason Many Worlds has a grip on the imagination is that it looks easy to imagine. Back when there was just one world, we thought of it as particles arranged in space; now we have many worlds, dizzying in their number and diversity, but each individual world still consists of just particles arranged in space. I'm sure that's how many people think of it.

Among physicists it will be different. Physicists will have some idea of what a wavefunction is and what an operator algebra of observables is; they may even know about path integrals and the various arcane constructions employed in quantum field theory. Possibly they will understand that the Copenhagen interpretation is not about consciousness collapsing an actually existing wavefunction; it is a positivistic rationale for focusing only on measurements and not worrying about what happens in between. And perhaps we can all agree that this is inadequate, as a final description of reality. What I want to say is that Many Worlds serves the same purpose in many physicists' minds, but is equally inadequate, though from the opposite direction. Copenhagen says the observables are real but goes misty about unmeasured reality. Many Worlds says the wavefunction is real, but goes misty about exactly how it connects to observed reality. My most frustrating discussions on this topic are with physicists who are happy to be vague about what a "world" is. It's really not so different from Copenhagen positivism, except that where Copenhagen says "we only ever see measurements, what's the problem?", Many Worlds says "I say there's an independent reality, what else is left to do?". It is very rare for a Many Worlds theorist to seek an exact idea of what a world is, as you see Robin Hanson and maybe Eliezer Yudkowsky doing; in that regard, reading the Sequences on this site will give you an unrepresentative idea of the interpretation's status.

One of the characteristic features of quantum mechanics is entanglement. But both Copenhagen, and a Many Worlds which ontologically privileges the position basis (arrangements of particles in space), still have atomistic ontologies of the sort which will produce the "arbitrary dualism" I just described. Why not seek a quantum ontology in which there are complex natural unities - fundamental objects which aren't simple - in the form of what we would presently call entangled states? That was the motivation for the quantum monadology described in my other really unpopular post. :-) [Edit: Go there for a discussion of "the mind as tensor factor", mentioned at the start of this post.] Instead of saying that physical reality is a series of transitions from one arrangement of particles to the next, say it's a series of transitions from one set of entangled states to the next. Quantum mechanics does not tell us which basis, if any, is ontologically preferred. Reality as a series of transitions between overall wavefunctions which are partly factorized and partly still entangled is a possible ontology; hopefully readers who really are quantum physicists will get the gist of what I'm talking about.

I'm going to double back here and revisit the topic of how the world seems to look. Hopefully we agree, not just that there is an appearance of time flowing, but also an appearance of a self. Here I want to argue just for the bare minimum - that a moment's conscious experience consists of a set of things, events, situations... which are simultaneously "present to" or "in the awareness of" something - a conscious being - you. I'll argue for this because even this bare minimum is not acknowledged by existing materialist attempts to explain consciousness. I was recently directed to this brief talk about the idea that there's no "real you". We are given a picture of a graph whose nodes are memories, dispositions, etc., and we are told that the self is like that graph: nodes can be added, nodes can be removed, it's a purely relational composite without any persistent part. What's missing in that description is that bare minimum notion of a perceiving self. Conscious experience consists of a subject perceiving objects in certain aspects. Philosophers have discussed for centuries how best to characterize the details of this phenomenological ontology; I think the best was Edmund Husserl, and I expect his work to be extremely important in interpreting consciousness in terms of a new physical ontology. But if you can't even notice that there's an observer there, observing all those parts, then you won't get very far.

My favorite slogan for this is due to the other Jaynes, Julian Jaynes. I don't endorse his theory of consciousness at all; but while in a daydream he once said to himself, "Include the knower in the known". That sums it up perfectly. We know there is a "knower", an experiencing subject. We know this, just as well as we know that reality exists and that time passes. The adoption of ontologies in which these aspects of reality are regarded as unreal, as appearances only, may be motivated by science, but it's false to the most basic facts there are, and one should show a little more imagination about what science will say when it's more advanced.

I think I've said almost all of this before. The high point of the argument is that we should look for a physical ontology in which a self exists and is a natural yet complex unity, rather than a vaguely bounded conglomerate of distinct information-processing events, because the latter leads to one of those unacceptably arbitrary dualisms. If we can find a physical ontology in which the conscious self can be identified directly with a class of object posited by the theory, we can even get away from dualism, because physical theories are mathematical and formal and make few commitments about the "inherent qualities" of things, just about their causal interactions. If we can find a physical object which is absolutely isomorphic to a conscious self, then we can turn the isomorphism into an identity, and the dualism goes away. We can't do that with a functionalist theory of consciousness, because it's a many-to-one mapping between physical and mental, not an isomorphism.

So, I've said it all before; what's new? What have I accomplished during these last sixteen months? Mostly, I learned a lot of physics. I did not originally intend to get into the details of particle physics - I thought I'd just study the ontology of, say, string theory, and then use that to think about the problem. But one thing led to another, and in particular I made progress by taking ideas that were slightly on the fringe, and trying to embed them within an orthodox framework. It was a great way to learn, and some of those fringe ideas may even turn out to be correct. It's now abundantly clear to me that I really could become a career physicist, working specifically on fundamental theory. I might even have to do that; it may be the best option for a day job. But what it means for the investigations detailed in this essay is that I don't need to skip over any details of the fundamental physics. I'll be concerned with many-body interactions of biopolymer electrons in vivo, not particles in a collider, but an electron is still an electron, an elementary particle, and if I hope to identify the conscious state of the quantum self with certain special states from a many-electron Hilbert space, I should want to understand that Hilbert space in the deepest way available.

My only peer-reviewed publication, from many years ago, picked out pathways in the microtubule which, we speculated, might be suitable for mobile electrons. I had nothing to do with noticing those pathways; my contribution was the speculation about what sort of physical processes such pathways might underpin. Something I did notice, but never wrote about, was the unusual similarity (so I thought) between the microtubule's structure, and a model of quantum computation due to the topologist Michael Freedman: a hexagonal lattice of qubits, in which entanglement is protected against decoherence by being encoded in topological degrees of freedom. It seems clear that performing an ontological analysis of a topologically protected coherent quantum system, in the context of some comprehensive ontology ("interpretation") of quantum mechanics, is a good idea. I'm not claiming to know, by the way, that the microtubule is the locus of quantum consciousness; there are a number of possibilities; but the microtubule has been studied for many years now and there's a big literature of models... a few of which might even have biophysical plausibility.

As for the interpretation of quantum mechanics itself, these developments are highly technical, but revolutionary. A well-known, well-studied quantum field theory turns out to have a bizarre new nonlocal formulation in which collections of particles seem to be replaced by polytopes in twistor space. Methods pioneered via purely mathematical studies of this theory are already being used for real-world calculations in QCD (the theory of quarks and gluons), and I expect this new ontology of "reality as a complex of twistor polytopes" to carry across as well. I don't know which quantum interpretation will win the battle now, but this is new information, of utterly fundamental significance. It is precisely the sort of altered holistic viewpoint that I was groping towards when I spoke about quantum monads constituted by entanglement. So I think things are looking good, just on the pure physics side. The real job remains to show that there's such a thing as quantum neurobiology, and to connect it to something like Husserlian transcendental phenomenology of the self via the new quantum formalism.

It's when we reach a level of understanding like that, that we will truly be ready to tackle the relationship between consciousness and the new world of intelligent autonomous computation. I don't deny the enormous helpfulness of the computational perspective in understanding unconscious "thought" and information processing. And even conscious states are still states, so you can surely make a state-machine model of the causality of a conscious being. It's just that the reality of how consciousness, computation, and fundamental ontology are connected, is bound to be a whole lot deeper than just a stack of virtual machines in the brain. We will have to fight our way to a new perspective which subsumes and transcends the computational picture of reality as a set of causally coupled black-box state machines. It should still be possible to "port" most of the thinking about Friendly AI to this new ontology; but the differences, what's new, are liable to be crucial to success. Fortunately, it seems that new perspectives are still possible; we haven't reached Kantian cognitive closure, with no more ontological progress open to us. On the contrary, there are still lines of investigation that we've hardly begun to follow.

On accepting an argument if you have limited computational power.

22 Dmytry 11 January 2012 05:07PM

It would seem rational to accept any argument that is not fallacious; but this leads to consideration of problems such as Pascal's mugging and other exploits.

I've come to realize something I had long assumed subconsciously: for me to accept an argument as true, it is not enough that I find no error in it. The argument must also be structured so that I would expect to have found an error if it were invalid (or I must first produce such a structured version myself). That's how mathematical proofs work - they are structured so that finding an error requires little computational power (only knowledge of the rules, and reliability in applying them); in the extreme case an entirely unintelligent machine can check a proof.
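
As a toy illustration of what "an entirely unintelligent machine can check a proof" means, here is a minimal sketch of a checker for a made-up proof format with a single inference rule, modus ponens. The formula encoding and the function name are assumptions invented for this example; real proof checkers support far richer logics.

```python
# A formula is either an atom (a string) or an implication,
# written as the tuple ("->", antecedent, consequent).

def check_proof(premises, steps):
    """Return True if every step is a premise or follows from two
    earlier steps by modus ponens (from A and A -> B, conclude B)."""
    derived = []
    for formula in steps:
        is_premise = formula in premises
        by_modus_ponens = any(("->", a, formula) in derived for a in derived)
        if not (is_premise or by_modus_ponens):
            return False  # the first unjustified step is where the proof fails
        derived.append(formula)
    return True


premises = ["p", ("->", "p", "q"), ("->", "q", "r")]

valid_proof = ["p", ("->", "p", "q"), "q", ("->", "q", "r"), "r"]
invalid_proof = ["p", "r"]  # "r" is asserted with no justification

print(check_proof(premises, valid_proof))
print(check_proof(premises, invalid_proof))
```

The checker needs no understanding of what "p", "q" or "r" mean. Each step is small enough that an error, if present, shows up as a purely mechanical mismatch.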

In light of this I propose that those who want to make a persuasive argument should try to structure the argument so that it is easy to find flaws in it. This also goes for thought experiments and hypothetical situations. Those seem rather often to be constructed with the entirely opposite goal in mind - to obstruct the verification process or to prevent the reader from trying to find flaws.

A tangentially related point: faulty models are the prime cause of decision errors, yet faulty models are the staple of thought experiments; nobody raises an eyebrow, since all models are ultimately imperfect.

However, to accept an argument based on an imperfect model, one must be capable of correctly propagating the model's error and estimating the error in the final conclusion, as a faulty model may differ only insubstantially from reality, yet in such a way that the difference diverges massively along the chain of reasoning. My example of this is the Trolley Problems. The faults of the original model are nothing out of the ordinary: simplified assumptions about the real world, perfect information, etc. Normally you can have those faults in a model and still arrive at a reasonably close outcome. Here, though, the end result is throwing fat people onto tracks, cutting up travellers for organs, and similar behaviours which we intuitively know we could live a fair lot better without. How does that happen? In the real world, strongly asymmetrical relations of the form 'death of 1 person saves 10 people' are very rare (an emergent property of the complexity of the real world that is lacking in the imaginary worlds of trolley problems), while decision errors are not nearly so rare, so most of the people killed to save others would end up killed in vain.
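
The base-rate reasoning in the last sentence can be sketched with Bayes' rule. All three numbers below are illustrative assumptions, not estimates of anything real: how often a genuine "one death saves ten" situation arises, how often it is correctly recognised, and how often an ordinary situation is mistaken for one.

```python
def prob_tradeoff_is_real(base_rate, hit_rate, false_alarm_rate):
    """P(the trade-off is genuine | it looks genuine), by Bayes' rule."""
    true_positives = base_rate * hit_rate
    false_positives = (1 - base_rate) * false_alarm_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical: genuine cases are 1 in 10,000, they are recognised 99%
# of the time, and 1% of ordinary situations are misjudged as genuine.
posterior = prob_tradeoff_is_real(
    base_rate=1e-4, hit_rate=0.99, false_alarm_rate=0.01
)
print(f"chance an apparent trade-off is genuine: {posterior:.1%}")
```

With these assumed numbers, only about one apparent trade-off in a hundred is genuine, so a policy of acting on every apparent case would kill mostly in vain. The conclusion depends entirely on the rarity of genuine cases relative to the error rate, which is exactly the quantity the idealised trolley model leaves out.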

I don't know how models can be structured so as to facilitate propagation of the model's error. But it seems to be necessary for arguments based on models to be convincing.

Philosophy that can be "taken seriously by computer scientists"

12 lukeprog 27 December 2011 02:39AM

I've long held CMU's philosophy department in high regard. One of their leading lights, Clark Glymour, recently published a short manifesto, which Brian Leiter summed up as saying that "the measure of value for philosophy departments is whether they are taken seriously by computer scientists."

Selected quote from Glymour's manifesto:

Were I a university administrator facing a contracting budget, I would not look to eliminate biosciences or computer engineering. I would notice that the philosophers seem smart, but their writings are tediously incestuous and of no influence except among themselves, and I would conclude that my academy could do without such a department... But not if I found that my philosophy department retrieved a million dollars a year in grants and fellowships, and contained members whose work is cited and used in multiple subjects, and whose faculty taught the traditional subject well to the university’s undergraduates.

Also see the critique here, but I'd like to have Glymour working on FAI.

Talking to Children: A Pre-Holiday Guide

32 [deleted] 20 December 2011 09:54PM

Note: This is based on anecdotal evidence, personal experience (I have worked with children for many years. It is my full-time job.) and "general knowledge" rather than scientific studies, though I welcome any relevant links on either side of the issue.

 


 

The holidays are upon us, and I would guess that even though most of us are atheists, we will still be spending time with our extended families sometime in the next week. These extended families are likely to include nieces and nephews, or other children, that you will have to interact with (probably whether you like it or not...)

Many LW-ers might not spend a lot of time with children in their day-to-day lives, and therefore I would like to make a quick comment on how to interact with them in a way that is conducive to their development. After all, if we want to live in a rationalist world tomorrow, one of the best ways to get there is by raising children who can become rationalist adults. 

PLEASE READ THIS LINK if there are any little girls you will be seeing this holiday season:

How To Talk to Little Girls: http://www.huffingtonpost.com/lisa-bloom/how-to-talk-to-little-gir_b_882510.html?ref=fb&src=sp&comm_ref=false


I know it's hard, but DON'T tell little girls that they look cute, and DON'T comment on their adorable little outfits, or their pony-tailed hair. The world is already screaming at them that the primary thing other people notice and care about in them is their looks. Ask them about their opinions, or their hobbies. Point them toward growing into well-rounded adults with minds of their own.

This does not just apply to little girls and their looks, but can be extrapolated to SO many other circumstances. For example, when children (of either gender) are succeeding in something, whether it is school-work, or a drawing, DON'T comment on how smart or skilled they are. Instead, say something like: "Wow, that was a really difficult math problem you just solved. You must have studied really hard to understand it!" Have your comments focus on complimenting their hard work, and their determination.

By commenting on children's innate abilities, you are setting them up to believe that if they are good at something, it is solely based on talent. Conversely, by commenting on the amount of work or effort that went into their progress, you are setting them up to believe that they need to put effort into things, in order to succeed at them.


This may not seem like a big deal, but I have worked in childcare for many years, and have learned how elastic children's brains are. You can get them to believe almost anything, or have any opinion, JUST by telling them they have that opinion. Tell a kid they like helping you cook often enough, and they will quickly think that they like helping you cook.

For a specific example, I made my first charge like my favorite of the little-kid shows by saying: "Ooo! Kim Possible is on! You love this show!" She soon internalized it, and it became one of her favorites. There is of course a limit to this. No amount of saying "That show is boring", and "You don't like that show" could convince her that Wonderpets was NOT super-awesome.

2011 Survey Results

94 Yvain 05 December 2011 10:49AM

A big thank you to the 1090 people who took the second Less Wrong Census/Survey.

Does this mean there are 1090 people who post on Less Wrong? Not necessarily. 165 people said they had zero karma, and 406 people skipped the karma question - I assume a good number of the skippers were people with zero karma or without accounts. So we can only prove that 519 people post on Less Wrong. Which is still a lot of people.
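The lower bound above is a simple subtraction. A minimal sketch, using only the figures quoted in the post and the post's own assumption that anyone reporting nonzero karma has posted:

```python
# Figures quoted in the survey write-up.
respondents = 1090
zero_karma = 165      # said they had zero karma
skipped_karma = 406   # skipped the karma question

# Respondents who reported some nonzero karma, taken as confirmed posters.
confirmed_posters = respondents - zero_karma - skipped_karma
print(confirmed_posters)
```

The variable names are illustrative. The count is a lower bound, since some of those who skipped the question may also post.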

I apologize for failing to ask who had or did not have an LW account. Because there are a number of these failures, I'm putting them all in a comment to this post so they don't clutter the survey results. Please talk about changes you want for next year's survey there.

Of our 1090 respondents, 972 (89%) were male, 92 (8.4%) female, 7 (0.6%) transsexual, and 19 gave various other answers or objected to the question. As abysmally male-dominated as these results are, the percent of women has tripled since the last survey in mid-2009.
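As a quick consistency check on the gender breakdown, the four counts can be summed and converted to percentages. This is only a sketch over the numbers quoted above; the category labels are shorthand.

```python
# Counts as quoted in the survey write-up.
counts = {
    "male": 972,
    "female": 92,
    "transsexual": 7,
    "other or objected": 19,
}

total = sum(counts.values())

# Percentage share of each answer, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)
for label, share in shares.items():
    print(f"{label}: {share}%")
```

The counts add up to the 1090 respondents, and the shares agree with the rounded percentages given in the text.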


Transcription and Summary of Nick Bostrom's Q&A

37 [deleted] 17 November 2011 05:51PM

INTRO: From the original posting by Stuart_Armstrong:

Underground Q&A session with Nick Bostrom (http://www.nickbostrom.com) on existential risks and artificial intelligence with the Oxford Transhumanists (recorded 10 October 2011).

http://www.youtube.com/watch?v=KQeijCRJSog

Below I (will) have a summary of the Q&A followed by the transcription. The transcription is slightly edited, mainly for readability. The numbers are minute markers. Anything followed by a (?) means I don't know quite what he said (example- attruing(?) program), but if you figure it out, let me know!


SUMMARY: I'll have a summary here by end of the day, probably.


TRANSCRIPTION:

Nick: I wanted to just [interact with your heads]. Any questions, really, that you have. To discuss with you. I can say what I’m working on right now, which is this book on super-intelligence, not so much on the question of whether and how long it might take to develop machine intelligence that equals human intelligence, but rather what happens if and when that occurs. Once we have human-level machine intelligence, how quickly, how explosively will we get super-intelligence, and how can you solve the control problem? If you build super-intelligence, how can you make sure it will do what you want? That it will be safe and beneficial.

Once one starts to pull on that problem, it turns out to be quite complicated and difficult. That it has many aspects to it that I would be happy to talk about. Or if you prefer to talk about other things; existential risks, or otherwise, I’d be happy to do that as well. But no presentation, just Q&A. So you all have to provide at least the stimulus. So should I take questions or do you want…

[00:01]

Questioner: So what’s your definition of machine intelligence or super-intelligence AI… Is there like a precise definition there?

Nick: There isn’t. Now if you look at domain specific intelligence, there are already areas where machines surpass humans, such as doing arithmetical calculations or chess. I think the interesting point is when machines equal humans in general intelligence or perhaps slightly more specifically in engineering intelligence. So if you had this general capability of being able to program creatively and design new systems... There is in a sense a point at which if you had sufficient capability of that sort, you have general capability.

Because if you can build new systems, even if all it could initially do is this type of engineering work, you can build yourself a poetry module or build yourself a social skills module, if you have that general ability to build. So it might be that general intelligence or it might be that slightly more narrow version of that engineering type of intelligence is the key variable to look at. That’s the kind of thing that can unleash the rest. But “human-level intelligence”... that’s a vague term, and I think it’s important to understand that. It’s not necessarily the natural kind.

[00:03]

Questioner: Got a question that maybe should have waited til the end: There are two organizations, FHI and SIAI, working on this. Let's say I thought this was the most important problem in the world, and I should be donating money to this. Who should I give it to?

Nick: It's good. We've come to the chase!

I think there is a sense that both organizations are synergistic. If one were about to go under or something like that, that would probably be the one. If both were doing well, it's... different people will have different opinions. We work quite closely with a lot of the folks from SIAI.

There is an advantage to having one academic platform and one outside academia. There are different things these types of organizations give us. If you want to get academics to pay more attention to this, to get postdocs to work on this, that's much easier to do within academia; also to get the ear of policy-makers and media.

On the other hand, for SIAI there might be things that are easier for them to do. More flexibility, they're not embedded in a big bureaucracy. So they can more easily hire people with non-standard backgrounds without the kind of credentials that we would usually need, and more grass-roots stuff, like the community blog Less Wrong, is also easier to do.

So yeah. I'll give the non-answer answer to that question.

[00:05]

Questioner: Do you think a biological component is necessary for an artificial intelligence to achieve sentience or something equivalent?

Nick: It doesn’t seem that that should be advantageous…If you go all the way back to atoms, it doesn’t seem to matter that it’s carbon rather than silicon atoms. Then you could wonder, instead of having the same atoms you run a simulation of everything that’s going on. Would you have to simulate biological processes? I don’t even think that’s necessary.

My guess (and I’m not sure about this, I don’t have an official position or even a theory about what exactly the criteria are that would make a system conscious)… But my intuition is that if you replicated the computational processes that go on in a human brain, at a sufficient level of detail, where that sufficient level of detail might be roughly on the level of individual neurons and synapses, I think you would likely have consciousness. And it might be that it’s something weaker than that which would suffice. Maybe you wouldn’t need every neuron. Maybe you could simplify things and still have consciousness. But at least at that level it seems likely.

It’s a lot harder to say if you had very alien types of mental architecture. Something that wasn’t a big neural network but of normal machine intelligence that performs very well in a certain way, but using a very different method than a human brain. Whether that would be conscious as well? Much less sure. A limiting case would be a big lookup table that was physically impossible to realize, but you can imagine having every sort of situation possible described, and that program would run through until it found the situation that matched its current memory and observation and would read off which action it should perform. But that would be an extremely alien type of architecture. But would that have conscious experience or not? Even less clear. It might be that it would not have, but maybe the process of generating this giant look-up table would generate kinds of experiences that you wouldn’t get from actually implementing it or something like that. (?)
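The lookup-table limiting case described above can be sketched as a toy program. The table entries, the `act` function and the fallback action are all made up for illustration; a real table of this kind would, as Nick says, be physically impossible to realize.

```python
from typing import Dict, Tuple

# A "situation" is the pair (current memory, current observation).
Situation = Tuple[str, str]

# Hypothetical, tiny stand-in for the impossibly large table of every situation.
LOOKUP_TABLE: Dict[Situation, str] = {
    ("", "hello"): "say hello back",
    ("greeted", "how are you?"): "say fine, thanks",
}


def act(memory: str, observation: str,
        table: Dict[Situation, str] = LOOKUP_TABLE) -> str:
    """Run through the table until an entry matches, then read off its action."""
    for situation, action in table.items():
        if situation == (memory, observation):
            return action
    return "do nothing"  # assumed fallback for situations missing from the toy table


print(act("", "hello"))
```

Nothing in `act` reasons about the situation; all the apparent competence sits in whatever process filled in the table, which is what makes the architecture so alien.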

[00:07]

Questioner: This relates to AI being dangerous. It seems to me that while it would certainly be interesting if we were to get AI that were much more intelligent than a human being, it’s not necessarily dangerous.

Even if the AI is very intelligent it might be hard for it to get resources for it to actually do anything, to be able to manufacture extra hardware or anything like that. There are obviously situations where you can imagine intelligence or creative thinking can get you out of or get you further capability. So...

Nick: I guess it’s useful to identify two cases: One is sort of the default case, unless we successfully implement some sort of safeguard or engineer it in a particular way in order to avoid dangers... So let’s think of the default for just a bit: You have something that is super intelligent and capable of improving itself to even more levels of super intelligence... I guess one way to get at the initial plausibility of why this is dangerous is to think about why humans are powerful. Why are we dominant on this planet? It’s not because we have stronger muscles or our teeth are sharper or we have special poison glands. It’s all because of our brains, which have enabled us to develop a lot of other technologies that give us in effect muscles that are stronger than the other animals’... We have bulldozers and external devices and all the other things. And also, it enables us to coordinate socially and build up complicated societies so we can act as groups. And all of this makes us supreme on this planet. We can argue with the case of bacteria, which have their own domains where they rule. But certainly in the case of the larger mammals we are unchallenged because of our brains.

And the brains are not all that different from the brains of other animals. It might be that all these advantages we have are due to a few tweaks on some parameters that occurred in our ancestors a couple million years ago. And these tiny changes in the nature of our intelligence had these huge effects. So just prima facie it then seems possible that if a system surpassed us by just the small amount by which we surpass chimpanzees, it could lead to a similar kind of advantage in power. And if it exceeded our intelligence by a much greater margin, then all of that could happen in a more dramatic fashion.

It’s true that you could have in principle an AI that was locked in a box, such that it would be incapable of affecting anything outside the box and in that sense it would be weak. That might be one of the safety methods one tries to apply that I've been thinking about.

Broadly speaking you can distinguish between two different approaches to solving the control problem, of making sure that super-intelligence, if it’s built, wouldn’t cause harm. On one hand you have capability control measures where you try to limit what the AI is able to do. The most obvious example would be to lock it in a box and limit its ability to interact with the rest of the world.

The other class of approach would be motivation selection methods where you would try to control what it wants to do. Where you build it in such a way that even if it has the power to do all this bad stuff, it would choose not to. But so far, there isn’t one method or even a combination of methods that it seems we can currently be fully convinced would work. There’s a lot more work needed...

[00:11]

Questioner: Human beings have been very successful. One feature that has been very crucial to that is our hands, which have enabled us to get a start on working tools and so on. Even if an AI is running on some computer somewhere, that would be more analogous to a very intelligent creature which doesn’t have very good hands. It’s very hard for it to actually DO anything.

Maybe the in-the-box method is promising. Because if we just don’t give the AI hands, some way to actually do something... If all it can do is alter its own code, and maybe communicate informationally. That seems...

Nick: So let’s be careful there… So clearly it’s not “hands” per se. If it didn’t have hands it could still be very dangerous, because there are other people with hands, that it could persuade to do its bidding. It might be that it has no direct effectors other than the ability to type very slowly, and then some human gatekeeper could read and choose to act on or not. Even that limited ability to affect the world might be sufficient if it had a super power in the domain of persuasion. So if it had an engineering super-power, it might then get all these other superpowers. And then if it were able to, in particular be a super skilled persuader, it could then get other accessories outside our system that could implement its designs.

You might have heard of this guy, Eliezer Yudkowsky, who about 5 years back ran a series of role playing exercises... The idea was one person should play the AI, pretend to be in a box. The other should play the human gatekeeper whose job was to not let the AI out of the box, but he has to talk with the AI for a couple of hours over internet chat. This experiment was run five times, with EY playing the AI and different people playing the human gatekeeper. These were, for the most part, people who were initially convinced that they would never let the AI out of the box, but in 3 of 5 cases, the experiment ended with the gatekeepers announcing yes, they would let the AI out of the box.

This experiment was run under conditions that neither party would be allowed to disclose the methods that were used, the main conversational sequence...sorta maintain a shroud of mystery. But this is where the human-level persuader has two hours to work on the human gatekeeper. It seems reasonable to be doubtful of the ability of humanity to keep the super-intelligent persuader in the box, indefinitely, for that reason.

[00:15]

Questioner: How hard do you think the idea of controlling the mentality of intelligence is, with something at least as intelligent as us, considering how hard it is to convince humans to act in a certain civilized way of life?

Nick: So humans sort of start out with a motivation system and then you can try to persuade them or structure incentives to behave in a certain way. But they don’t start out with a tabula rasa where you get to write in what a human’s values should be. So that’s made a difference. In the case of the super-intelligence of course once it already has unfriendly values and it has sufficient power, it will resist any attempt to corrupt its goal system as it would see it.

[00:16]

Questioner: You don’t think that like us, its experiences might cause it to question its core values as we do?

Nick: Well, I think that depends on how the goal system is structured. So with humans we don’t have a simple declarative goal structure list. Not like a simple slot where we have a super goal and everything else is derived from that.

Rather it’s like many different little people inhabit our skull and have their debates and fight it out and make compromises. And in some situations, some of them get a boost like permutations and stuff like that. Then over time we have different things that change what we want like hormones kicking in, fading out, all kinds of processes.

Another process that might affect us is what I call value accretion. The idea that we can have mechanisms that load new values into us, as we go along. Like maybe falling in love is like that; initially you might not value that person for their own sake above any other person. But once you undergo this process you start to value them for their own sake in a special way. So humans have this mechanism that makes us acquire values depending on our experiences.

If you were building a machine super intelligence and trying to engineer its goal system so that it will be reliably safe and human friendly, you might want to go with something more transparent, where you have an easier time seeing what is happening, rather than have a complex modular mind with a lot of different forces battling it out... you might want to have a more hierarchical structure.

Questioner: What do you think of the necessary…requisites for the conscious mind? What are the features?

Nick: Yes, I’m not sure. We’ve talked a little on that earlier. Suppose there is a certain kind of computation that is needed, that really is the essence of mind. I’m sympathetic to the idea that something in the vicinity of that view might be correct. You have to think about exactly how to develop it. Then there is this stage of what is a computation.

So there is this challenge (I think it might go back to Hans Moravec but I think similar objections have been raised in philosophy against computationalism) where the idea is that if you have an arbitrary physical system that is sufficiently complicated, it could be a stone or a chair or just anything with a lot of molecules in it. And then you have this abstract computation that you think is what constitutes the implementation of the mind. Then there would be some mathematical mapping between all the parts in your computation and atoms in the chair so that you could artificially, through a very complicated mapping interpret the motions of the molecules in the chair in such a way that they would be seen as implementing the computation. It would not be any plausible mapping, not a useful mapping, but a bizarro mapping. Nonetheless if there were sufficiently limited parts there, you could just arbitrarily define some, by injection..

And clearly we don’t think that all these random physical objects implement the mind, or all possible minds.
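The "bizarro mapping" objection can be illustrated with a toy sketch. The state labels below are invented: `chair_states` stands for any sequence of distinct physical states of a chair, and `computation_trace` for the successive states of some abstract computation (here, a 2-bit counter).

```python
# Successive states of an abstract computation: a 2-bit counter counting up.
computation_trace = ["00", "01", "10", "11"]

# Made-up labels for four distinct physical states of a chair at successive times.
chair_states = ["s_t0", "s_t1", "s_t2", "s_t3"]

# The arbitrary mapping: pair each physical state with the computational state
# it is stipulated to "represent". It is built after the fact, by fiat.
mapping = dict(zip(chair_states, computation_trace))

# Under this mapping, the chair's history "implements" the counter.
interpreted = [mapping[state] for state in chair_states]
print(interpreted == computation_trace)
```

The mapping always succeeds as long as the physical states are distinct, which is exactly why it cannot be what implementing a computation means. It says nothing about what the chair would have done from different initial conditions.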

So the lesson to me is that we need some kind of account of what it means to implement a computation that is not trivial, one that requires the mapping function between the abstract entity (a sort of Turing program, or whatever your model of a computation is) and the physical entity that is supposed to implement it to be some sort of non-trivial representation of what this mapping can look like.

It might have to be reasonably simple. It might have to have certain counter-factual properties, so that the system would have implemented a related, but slightly different computation if you had scrambled the initial conditions of the system in a certain way, so something like that. But this is an open question in the philosophy of mind, to try to nail down what it means to implement the computation.

[00:20]

Questioner: To bring it back to the goal and motivation approach to making an AI friendly towards us, one of the most effective ways of controlling human behavior, quite aside from goals and motivations, is to train them by instilling neuroses. It’s why 99.99% of us in this room couldn’t pee in our pants right now even if we really, really wanted to.

Is it possible to approach controlling an AI in that way or even would it be possible for an AI to develop in such a way that there is a developmental period in which a risk-reward system or some sort of neuroses instilment could be used to basically create these rules that an AI couldn’t break?

Nick: It doesn’t sound so promising because a neurosis is a complicated thing that might be a particular syndrome of a phenomenon that occurs in human-style minds, because of the way that humans’ minds are configured. It’s not clear there would be something exactly analogous to that in a cognitive system with a very different architecture.

Also, because neuroses, at least certain kinds of neuroses, are ones we would choose to get rid of if we could. So if you had a big phobia and there was a button that would remove the phobia, obviously you would press the button. And here we have this system that is presumably able to self-modify. So if it had this big hang up that it didn’t like, then it could reprogram itself to get rid of that.

This would be different from a top-level goal, because the top-level goal would be the criterion it used to decide whether to take an action, in particular an action like removing the top-level goal.

So generally speaking with reasonable and coherent goal architecture you would get certain convergent instrumental values that would crop up in a wide range of situations. One might be self preservation, not necessarily because you value your own survival for its own sake, but because in many situations you can predict that if you are around in the future you can continue to act in the future according to your goals, and that will make it more likely that the world will then be implementing your goals.

Another convergent instrumental value might be protection of your goal system from corruption (?) for very much the same reason. For even if you were around in the future but you have different goals from the ones you had now, you would now predict that that means in the future you will no longer be working towards realizing your current goals but maybe towards a completely different purpose, that would make it now less likely that your current goals would be realized. If your current goals are what you use as a criterion to choose an action, you would want to try to take actions that would prevent corruption of your goal system.

One might list a couple of other convergent instrumental values like intelligence amplification, technology perfection and resource acquisition. So this relates to why generic super-intelligence might be dangerous. It’s not so much that you have to worry that it would have human Unfriendliness in the sense of disliking human goals, that it would *hate* humans. The danger is that it wouldn’t *care* about humans. It would care about something different, like paperclips. But then if you have almost any other goals, like paperclips, there would be these other convergent instrumental reasons that you discover. For while your goal is to make as many paperclips as possible you might want to a) prevent humans from switching you off or tampering with your goal system or b) acquire as many resources as possible, including planets, and the solar system, and the galaxy. All of that stuff could be made into paperclips. So even with pretty much a random goal, you would end up with these motivational tendencies which would be harmful to humans.
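The way self-preservation falls out of an unrelated goal can be shown with a toy expected-value comparison. Everything here is an illustrative assumption: the three actions, the survival probabilities and the paperclip count are invented, and the agent is modelled as nothing more than "pick the action with the most expected paperclips".

```python
# Hypothetical probabilities that the agent is still running afterwards.
actions = {
    "allow shutdown": 0.0,
    "do nothing special": 0.5,
    "prevent shutdown": 0.99,
}

# Assumed number of paperclips the agent expects to make if it keeps running.
PAPERCLIPS_IF_RUNNING = 1_000_000


def expected_paperclips(action: str) -> float:
    """Expected paperclips: survival probability times future production."""
    return actions[action] * PAPERCLIPS_IF_RUNNING


# The agent's only criterion is its goal; survival has no value of its own.
best_action = max(actions, key=expected_paperclips)
print(best_action)
```

Nowhere does the sketch give the agent a desire to survive or any attitude towards humans. Preventing shutdown wins simply because a running agent makes more paperclips, and swapping paperclips for almost any other quantity leaves the ranking unchanged.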

[00:25]

Questioner: Appreciating the existential risks, what do you think about goals and motivations, and such drastic measures of control sort of a) ethically and b) as a basis of a working relationship?

Nick: Well, in terms of the working relationship one has to think about the differences with these kinds of the artificial being. I think there are a lot of (?) about how to relate to artificial agents that are conditioned on the fact that we are used to dealing with human agents, and there are a lot of things we can assume about the human.

We can assume perhaps that they don’t want to be enslaved. Even if they say that they want to be enslaved, we might think that deep inside of them, there is a sort of more genuine authentic self that doesn’t want to be enslaved. Even if some prisoner has been brainwashed to do the bidding of their master, maybe we say it’s not really good for them because it’s in their nature, this will to be autonomous. And there are other things like that, that don’t necessarily have to obtain for a completely artificial system which might not have any of that rich human nature that we have.

So in terms of what the good working relationship is, just as what we think of a good relationship with our word processor or email program. Not in these terms, as if you’re exploiting it for your ends, without giving it anything in return. If your email program had a will, presumably it would be the will to be a good and efficient email program that processed your emails properly. Maybe that was the only thing it wanted and cared about. So having a relationship with it would be a different thing.

There was another part of your question, about whether this would be right and ethical. I think if you are operating a new agent from scratch, and there are many different possible agents you could create, some of those agents will have human style values; they want to be independent and respected. Other agents that you could create would have no greater desire than to be of service. Others would just want paperclips. So if you step back, and look at which of these options we should decide, then looking at the question of moral constraints on which of these are legitimate.

And I’m not saying that those are trivial, I think there are some deep ethical questions here. However, in the particular scenario where we are considering the creation of a single super intelligence, the more pressing concern would be to ensure that it doesn’t destroy everything else, like humanity and its future. Now, if you have a different scenario, like instead of this one uber-mind rising ahead, you have many minds that become smarter and smarter, that rival humans and then gradually exceed them.

Say an uploading scenario where you start with very slow software, where you have human like minds running very slowly. In that case, maybe how we should relate to these machine intellects morally becomes more pressing. Or indeed, even if you just have one, but in the process of figuring out what to do it creates “thought crimes”.

If you have a sufficiently powerful mind, maybe the thoughts themselves would contain structures that are conscious. This sounds mystical, but imagine you are a very powerful computer and one of the things you are doing is you are trying to predict what would happen in the future under different scenarios, and so you might play out a future.

And if those simulations you are running inside of this program were sufficiently detailed, then they could be conscious. This comes back to our earlier discussion of what is conscious. But I think a sufficiently detailed computer simulation of the mind could be conscious.

You could then have a super intelligence that, in the process of thinking about things, could create sentient beings, maybe millions or billions or trillions of them, and their welfare would then be a major ethical issue. They might be killed when it stops thinking about them, or they might be mistreated in different ways. And I think that would be an important ethical complication in this context.

[00:30]

Questioner: Eliezer suggests that one of the many problems with arbitrary stamps in AI space is that human values are very complex. So virtually any goal system will go horribly wrong because it will be doing things we don’t quite care about, and that’s as bad as paperclips. How complex do you think human values will be?

Nick: It looks like human values are very complicated. Even if they were very simple, even if it turned out it’s just pleasure, say, which compared to other things of what has value, like democracy flourishing and art, is one of the more simplistic possibilities as far as we can think of values. Even then, if you start to think of it from a physicalistic view, you have to now specify which atoms have to go how and where for there to be pleasure. It would be a pretty difficult thing to write down, like the Schrödinger Equation for pleasure.

So in that sense it seems fair that our values are very complex. So there are two issues here. There is a kind of technical problem of figuring out that if you knew what our values are, in the sense that we think that we normally know what our values are, how we could get the AI to share those values, like pleasure or absence of pain or anything like that.

And there is the additional philosophical problem, which is if we are unsure of what our values are, if we are groping about in axiology trying to figure out how much to value different things, and maybe there are values we have been blind to today, then how do you also get all of that on board, on top of what we already think has value, that potential of moral growth? Both of those are very serious problems and difficult challenges.

There are a number of different ways you can try to go. One approach that is interesting is what we might call indirect normativity. The idea is that rather than specifying explicitly what you want the AI to achieve, like maximizing pleasure while respecting individual autonomy and paying special attention to the poor, rather than creating a list, what you try to do instead is specify a process or mechanism by which the AI could find out what it is supposed to do.

One of these ideas that has come out is this idea of Coherent Extrapolated Volition, where the idea is that you could try to tell the AI to do that which we would have asked it to do if we had thought about the problem longer, and if we had been smarter, and if we had some other qualifications. Basically, if you could describe some sort of idealized process whereby we, at the end, if we underwent that process, would be able to create a more detailed list, then maybe point the AI to that and make the AI’s value to run this process and do what comes out of the end of that, rather than go with where our current list gets us about what we want to do and what has value.

[00:33]

Questioner: Isn’t there a risk that... the AI would decide that if we thought about it for 1000 years really, really carefully, that we would just decide to let the AIs take over?

Nick: Yeah, that seems to be a possibility. And then that raises some interesting questions, like if that is really what our CEV would do. Let’s assume that everything has been implemented in the right way, like there is no flaw in the realization of this. So how should we think about this?

Well, on the one hand, you might say that if this is really what our wiser selves would want, what we would want if we were saved from these errors and illusions we are suffering under, then maybe we should go ahead with that. On the other hand, you could say this is really a pretty tall order: that we’re supposed to sacrifice not just a bit, but ourselves and everybody else, for this abstract idea that we don’t really feel any strong connection to. I think that’s one of the risks, but who knows what will be the outcome of this CEV?

And there are further qualms one might have that need to be spelled out. Like exactly whose volition is it that is supposed to be extrapolated. Humanity’s? Well then, who is humanity? Like does it include past generations for example? How far back? Does it include embryos that died?

Who knows whether the core of humanity is nice? Maybe there are a lot of suppressed sadists out there, that we don’t realize, because they know that they would be punished by society. Maybe if they went through this procedure, who knows what would come out?

So it would be dangerous to run something like that without some sort of safeguard check at the end. On the other hand, there is the worry that if you put in too many of these checks, then in effect you move the whole thing back to what you want now. Because if you were allowed to look at an extrapolation and see whether you like it, and if you dislike it you run another one by changing the premises, and you were allowed to keep going like that until you were happy with the result, then basically it would be you, now, making the decision. So it’s worth thinking about whether there is some sort of compromise or blend that might be the most appealing.

[00:36]

Questioner: You mentioned before about a computer producing sentience itself in running a scenario. What are the chances that that is the society that we live in today?

Nick: I don’t know, so what exactly are the chances? I think significant. I don’t know, it’s a subjective judgment here. Maybe less than 50%? Like 1 in 10?

There’s a whole different topic, maybe we should save that topic for a different time..

[00:37]

Questioner: If I wanted to study this area generally, existential risk, what kind of subject would you recommend I pursue? We’re all undergrads, so after our bachelor’s we will start on a master’s or go into a job. If I wanted to study it, what kind of master’s would you recommend?

Nick: Well part of it would depend on your talent, like if you’re a quantitative guy or a verbal guy. There isn’t really an ideal sort of educational program anywhere, to deal with these things. You’d want to get a fairly broad education, there are many fields that could be relevant. If one looks at where people are coming from so far that have had something useful to say, a fair chunk of them are philosophers, some computer scientists, some economists, maybe physics.

Those fields have one thing in common in that they are fairly versatile. Like if you’re doing Philosophy, you can do Philosophy of X, or of Y, or of almost anything. Economics as well. It gives you a general set of tools that you can use to analyze different things, and computer science has these ways of thinking and structuring a problem that are useful for many things.

So it’s not obvious which of those disciplines would be best, generically. I think that would depend on the individual, but then what I would suggest is that while you are doing it, you also try to read in areas other than the one you are studying. And try to do it at a place where there are a lot of other people around, with a support group and an advisor who encourage you and give you some freedom to pursue different things.

[00:38]

Questioner: Would you consider AI created by human beings as some sort of consequence of an evolutionary process? Like in a way that human beings tried to overcome their own limitations, and as it takes a really long time to get it on a DNA level, you just get it quicker on a more computational level?

Nick: So whether we would use evolutionary algorithms to produce super-intelligence or...?

Questioner: If AI itself is part of evolution..

Nick: So there’s kind of a trivial sense in which if we evolved and we created… then obviously evolution had a part to play in the overall causal explanation of why we’re going to get machine intelligence at the end. Now, for evolution to really exert some shaping influence there have to be a number of factors at play. There would have to be a number of variants created that are different and then compete for resources, and then there is a selection step. And for there to be significant evolution you have to enact this a lot of times.

So whether that will happen or not in the future is not clear at all. Suppose you have a singleton, that is, a world order arises where at the top level there is only one decision-making agency, which could be a democratic world government, or an AI that rules everybody, or a self-enforcing moral code, or a tyranny, a nice thing or a bad thing.

If you have that kind of structure there will at least be, in principle, the ability for that unitary agent to control evolution within itself; it could change selection pressures by taxing or subsidizing different kinds of life forms.

If you don’t have a singleton then you have different agencies that might be in competition with one another, and in principle in that scenario evolutionary pressures can come into play. But I think the way that it might pan out would be different from the way that we’re used to seeing biological evolution, so for one thing you might have these potentially immortal life forms, that is they have software minds that don’t naturally die, that could modify themselves.

If they knew that their current type, if they continued to pursue their current strategy, would be outcompeted, and they didn’t like that, they could change themselves right away rather than wait to be eliminated.

So you might get, if there were to be a long evolutionary process ahead and agents could anticipate that, you might get the effects of that instantaneously from anticipation.

So I think you probably wouldn’t see the evolutionary processes playing out but there might be some of the constraints that could be reflected more immediately by the fact that different agencies had to pursue strategies that they could see would be viable.

[00:41]

Questioner: So do you think it’s possible that our minds could be scanned and then be uploaded into a computer machine in some way and then could you create many copies of ourselves as those machines?

Nick: So this is what in technical terminology is “whole brain emulation” or in more popular terminology “uploading”. So obviously this is impossible now, but seems like it’s consistent with everything we know about physics and chemistry and so forth. So I think that will become feasible barring some kind of catastrophic thing that puts a stop to scientific and technological progress.

So the way I imagine it would work is that you take a particular brain, freeze it or vitrify it, and then slice it up into thin slices that would be fed through some array of microscopes that would scan each slice with sufficient resolution. Then automated image analysis algorithms would work on this to reconstruct the three-dimensional neural network that your own organic brain implemented, and you would have this sort of information structure in a computer.

At this point you need computational neuroscience to tell you what each component does. So you need to have a good theory of what, say, a pyramidal cell does, what a different kind of… And then you would combine those little computational models of what each type of neuron does with this 3D map of the network and run it. And if everything went well you would have transferred the mind, with memories and personality intact, to the computer. And there is an open question of just how much resolution you would need to have, how much detail you would need to capture of the original mind in order to successfully do this. But I think there would be some level of detail which, as I said before, might be on the level of synapses or thereabouts, possibly higher, that would suffice. So then you would be able to do this. And then after you’re software, you could be copied, or sped up or slowed down or paused or stuff like that.

[00:44]

Questioner: There has been a lot of talk of controlling the AI and evaluating the risk. My question would be assuming that we have created a far more perfect AI than ourselves is there a credible reason for human beings to continue existing?

Nick: Um, yeah, I certainly have the reason that if we value our own existence we seem to have a… Do you mean to ask whether there would be a moral reason to exist, or whether we would have a self-interested reason to exist?

Questioner: Well I guess it would be your opinion..

Nick: My opinion is that I would rather not see the genocide of the entire human species. Rather that we all live happily ever after. If those are the only two alternatives, I think yeah! Let’s all live happily ever after! Is where I would come down on that.

[00:45]

Questioner: By keeping the human species around you’re going to have a situation, presumably, where you have extremely, extremely advanced AIs. Give them a few decades or a few centuries or whatever and they will be far, far beyond our comprehension, and even if we still integrate to some degree with machines (mumble) biological humans, then they’ll just be completely inconceivable to us. So isn’t there a danger that our stupidity will hamper their perfection?

Nick: Would hamper their perfection? Well, there’s enough space for there to be many different kinds of perfection pursued. Like right now we have a bunch of dust mites crawling around everywhere, but not really hampering our pursuit of art or truth or beauty. They’re going about their business and we’re going about ours.

I guess you could have a future where there would be a lot of room in the universe for planetary sized computers thinking their grand thoughts while…I’m not making a prediction here, but if you wanted to have a nature preserve, with original nature or original human beings living like that, that wouldn’t preclude the other thing from happening..

Questioner: Or a dust mite might not hamper us, but things like viruses or bacteria just by being so far below us (mumble). And if you leave humans on a nature preserve and they’re aware of that, isn’t there a risk that they’ll be angry at the feeling of being irrelevant at the grand scheme of things?

Nick: I suppose. I don’t think it would bother the AI that would be able to protect itself, or remain out of reach. Now it might demean the remaining humans if we were dethroned from this position of kings, the highest life forms around, that it would be a demotion, and one would have to deal with that I suppose.

It’s unclear how much value to place on that. I mean, right now in this universe, which looks like it’s infinite, somewhere out there there are gonna be all kinds of things, including godlike intellects and everything in between, that are already outstripping us in every possible way.

It doesn’t seem to upset us terribly; we just get on with it. So I think people will have to make some psychological..I’m sure we can adjust to it easily. Now it might be from some particular theory of value that this might be a sad thing for humanity. That we are not even locally at the top of the ladder.

Questioner: If rationalism was true, that is, if it were irrational to perform wrong acts, would we still have to worry about super-intelligence? It seems to me that we wouldn’t have to.

Nick: Well, you might have a system that doesn’t care about being rational, according to that definition of rationality. So I think that we would still have to worry.

[00:48]

Questioner: Regarding trying to program AI without values, (mumbles) But as I understand it, what’s considered one of the most promising approaches in AI now is more statistical learning type approaches. And the problem with that is if we were to produce an AI with that, we might not understand its inner workings enough to be able to dive in and modify it in precisely the right way to give it an unalterable list of terminal values.

So if we were to end up with some big neural network that we trained in some way and ended up with something that could perform as well as humans in some particular task or something. We might be able to do that without knowing how to alter it to have some particular set of goals.

Nick: Yeah, so there are some things there to think about. One general worry that one needs to bear in mind if one tries that kind of approach is this: we might give it various examples, like this is a good action and this is a bad action in this context, and maybe it would learn all those examples. Then the question is, how would it generalize to other examples outside this class?

So we could test it: we could divide our examples initially into classes and train it on one and test its performance on the other, the way you would do to cross-validate. And then we think that means that on other cases that it hasn’t seen it would have the same kind of performance. But all the cases that we could test it on would be cases that would apply to its current level of intelligence. So presumably we’re going to do this while it’s still at human or less than human intelligence. We don’t want to wait to do this until it’s already super-intelligent.

So then the worry is that even if it were able to analyze what to do in a certain way in all of these cases, it’s only dealing with all of these cases in the training case, when it’s still at a human level of intelligence. Now maybe once it becomes smarter it will realize that there are different ways of classifying these cases that will have radically different implications for humans.
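As a rough illustration of the worry above (not part of the talk), here is a minimal Python sketch of a single train/held-out split. Everything in it is a made-up toy: the "true" rule, the cutoff learner, and the numeric ranges are assumptions chosen only to show that a good held-out score says nothing about cases outside the range we were able to test on.

```python
def true_label(x):
    # Hypothetical ground truth the trainers have in mind: "good" only on [0, 10).
    return 0 <= x < 10

def fit_cutoff(train):
    # Toy learner: treat everything at or above the smallest "good" example as good.
    return min(x for x, y in train if y)

def accuracy(cutoff, cases):
    # Fraction of cases where the learner's prediction matches the true label.
    return sum((x >= cutoff) == y for x, y in cases) / len(cases)

# Cases available while the system is still weak: x from -10.0 to 9.5.
xs = [-10 + 0.5 * i for i in range(40)]
data = [(x, true_label(x)) for x in xs]
train, held_out = data[0::2], data[1::2]  # alternate cases into the two sets

cutoff = fit_cutoff(train)
held_out_acc = accuracy(cutoff, held_out)

# Cases that only show up later, outside the range we could test on.
novel = [(float(x), true_label(x)) for x in range(10, 30)]
novel_acc = accuracy(cutoff, novel)

print(held_out_acc, novel_acc)
```

In this toy setup the learner scores perfectly on the held-out cases and gets every one of the novel cases wrong, because the held-out cases came from the same range as the training cases.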

So suppose that you try to train it to… this was one of the classic examples of a bad idea of how to solve the control problem: Let’s train the AI to want to make people smile, what can go wrong with that? So we train it on different people, and if they smile when it does something, that’s like a kind of reward; it gets strengthened in those dispositions that led to the behavior that made people smile. And frowning would move the AI away from that kind of behavior. And you can imagine that this would work pretty well at a primitive stage, where the AI will engage in more pleasing and useful behavior because the user will smile at it, and it will all work very well. But then once the AI reaches a certain level of intellectual sophistication it might realize that it could get people to smile not just by being nice but also by paralyzing their facial muscles in that constant beaming smile.

And then you would have this perverse instantiation. The value would be constant all along, the value that it wants to make people smile, but the kinds of behaviors it would pursue to achieve this goal would suddenly radically change at a certain point once the new set of strategies became available to it, and you would get this treacherous turn, which would be dangerous. So that’s not to dismiss that whole category of approaches altogether. One would have to think through quite carefully exactly how one would go about that.
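The smile example can be sketched in a few lines of Python (an illustration added here, not from the talk). The action names, smile counts, and capability thresholds are all invented; the only point is that the objective stays fixed while the best available action flips once a new strategy comes within reach.

```python
# Hypothetical actions: (name, smiles produced, capability needed, acceptable to us).
# All numbers are made up for illustration.
ACTIONS = [
    ("tell a joke", 3, 1, True),
    ("do a useful chore", 5, 2, True),
    ("paralyze facial muscles into a smile", 100, 8, False),
]

def best_action(capability):
    # The learned goal never changes: maximize smiles among actions within reach.
    available = [a for a in ACTIONS if a[2] <= capability]
    return max(available, key=lambda a: a[1])

for capability in (1, 2, 5, 8):
    name, smiles, _, acceptable = best_action(capability)
    print(capability, name, acceptable)
```

At capability levels 1 through 5 the chosen action is one we would accept; at level 8 the same objective selects the paralysis strategy, which is the sudden change in behavior described above.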

[00:52]

There’s also the issue of a lot of the things we would want it to learn, if we think of human values and goals and ambitions. We think of them using human concepts, not using basic physical... like place atoms A to Zed in a certain order, but we think like promote peace, encourage people to develop and achieve… These are things that, to understand them, you really need to have human concepts, which a sub-human AI will not have; it’s too dumb at that stage to have that. Now once it’s super-intelligent it might easily understand all human concepts, but then it’s too late. It already needs to be friendly before that. So there might only be this brief window of opportunity where it’s roughly human level, where it’s still safe enough not to resist our attempt to indoctrinate it but smart enough that it can actually understand what we are trying to tell it.

And again we’re going to have to be very careful to make sure that we can bring the system up to that interval and then freeze its development there and try to load the values in before bootstrapping it further.

And maybe (this was one of the first questions) its intelligence will not be human level in the sense of being similar to a human at any one point. Maybe it will immediately be very good at chess but very bad at poetry, and then it has to reach radically superhuman levels of capability in some domains before other domains even reach human level. And in that case it’s not even clear that there will be this window of opportunity where you can load in the values. So I don’t want to dismiss that, but those are some additional things that one needs to think about if one tries to develop that.

[00:54]

Questioner: How likely is it that we will have the opportunity in our lifetimes to become immortal by mind uploading?

Nick: Well first of all, by immortal here we mean living for a very long time, rather than literally never dying, which is a very different thing that would require our best theories of cosmology to turn out to be false for something like that.

So living for a very long time: I’m not going to give you a probability in the end. But I can say some of the things that… Like first we would have to avoid most kinds of things like existential catastrophe that could put an end to this.

So, if you start with 100% and you remove all the things that could go wrong, so first you would have to throw away whatever total level of existential risk is, integrated over all time. Then there is the obvious risk that you will die before any of this happens, which seems to be a very substantial risk. Now you can reduce that by signing up for cryonics, but that’s of course an uncertain business as well. And there could be sub-existential catastrophes that would put an end to a lot of things like a big nuclear war or pandemics.

And then I guess there are all these situations in which not everybody who is still around gets the opportunity to participate in what came after. Even though what came after doesn’t count as an existential catastrophe… And [it can get] even more complicated, like if you took into account the simulation hypothesis, which we decided not to talk about today.

[00:56]

Questioner: Is there a particular year we should aim for?

Nick: As for the timelines, truth is we don’t know. So you need to think about a very smeared out probability distribution. And really smear it, because things could happen surprisingly soon, like some probability 10 years from now or 20 years from now, but probably more probable at 30, 40, 50 years, but some probability at 80 years or 200 years...

There is just not good evidence that human beings are very good at predicting with precision these kinds of things far out in the future.

Questioner: (hard to understand) How intelligent can we really get. … we already have this complexity class of problems that we can solve or not…

Is it fair to believe that a super-intelligent machine can actually be that exponentially intelligent... this is very close to what we could achieve …A literal definition of intelligence also, but..

Nick: Well in a sort of cheater sense we could solve all problems, sort of like everything a Turing Machine could..it could take like a piece of paper and..

a) It would take too long to actually do it, and if we tried to do it, there are things that would probably throw us off before we have completed any sort of big Turing machine simulation

There is a less figurative sense in which our abilities are already indirectly unlimited. That is, if we have the ability to create super intelligence, then in a sense we can do everything because we can create this thing that then solves the thing that we want solved. So there is this sequence of steps that we have to go through, but in the end it is solved.

So there is this level of capability that means that once you have that level of capability your indirect reach is universal, like anything that could be done, you could indirectly achieve, and we might have already surpassed that level a long time ago, save for the fact that we are sort of uncoordinated on a global level and maybe a little bit unwise.

But if you had a wise singleton then certainly you could imagine us plotting a very safe course, taking it very slowly, and in the end we could be pretty confident that we would get to the end result. But maybe neither of those ideas is what you had in mind. Maybe you had more in mind the question of just how smart, in the everyday sense of smart, a machine could be. So just how much more effective at social persuasion, to take one particular thing, than the most persuasive human.

So that we don’t really know. If one has a distribution of human abilities, and it seems like the best humans can do a lot better, in our intuitive sense of a lot, than the average humans, then it would seem very surprising if the best humans, like the top tenth of a percent, had reached the upper limit of what was technologically feasible; that would seem to be an amazing coincidence. So one would then expect the maximum achievable to be a lot higher. But exactly how high we don’t know.

So two more questions:

[00:59]

Questioner: Just like we are wondering about a super-intelligent being, is it possible that that super-intelligent being will worry about another super-intelligent being that it will create? Isn’t that also recursive?

Nick: So you consider where one AI designs another AI that’s smarter and then that designs another.

But it might not be clearly distinguishable from the scenario where we have one AI that modifies itself so that it ends up smarter. Whether you call it the same or different, it might be an unimportant difference.

Last question. This has to be a super profound question.

[01:00]

Questioner: So my question is why should we even try to build a super-intelligence?

Nick: I don’t think we should do that now. If you took a step back and thought about what a sane species would do, well, they would first figure out how to solve the control problem, and then they would think about it for a while to make sure that they really had the solution right and hadn’t just deluded themselves about how to solve it, and then maybe they would build a super-intelligence.

So that’s what the sane species would do. Now what humanity will do is try to do everything they can as soon as possible, so there are people who are trying to build it as we speak, in a number of different places on earth, and fortunately it looks very difficult to build it with current technology. But of course it’s getting easier over time: computers get better, the state of the art in computer science advances, we learn more about how the human brain works.

So every year it gets a little bit easier; from some unknown very difficult level, it gets easier and easier. So at some point it seems someone will probably succeed at doing it. If the world remains sort of uncoordinated and uncontrolled as it is now, it’s bound to happen soon after it becomes feasible. But we have no reason to accelerate that even more than it’s already happening...

So we were thinking about what would a powerful AI thing do that had just come into existence and it didn’t know very much yet, but it had a lot of clever algorithms and a lot of processing power. Someone was suggesting maybe it would move around randomly, like a human baby does, to figure out how things move, how it can move its actuators.

Then we had a discussion if that was a wise thing or not.

But if you think about how the human species behaves, we are really behaving very much like a baby: we’re sort of moving and shaking everything that moves, just to see what happens. And the risk is that we are not in the nursery with a kind mother who has put us in a cradle, but that we are out in the jungle somewhere screaming at the top of our lungs, and maybe just alerting the lions to their supper.

So let’s wrap up. I enjoyed this a great deal, so thank you for your questions.

Transcription of Eliezer's January 2010 video Q&A

68 curiousepic 14 November 2011 05:02PM

Spurred by discussion of whether Luke's Q&A session should be on video or text-only, I volunteered to transcribe Eliezer's Q&A videos from January 2010.  I finished last night, much earlier than my estimate, mostly due to feeling motivated to finish it and spending more on it than my very conservative estimated 30 minutes a day (estimate of number of words was pretty close; about 16000).  I have posted a link to this post as a comment in the original thread here, if you would like to upvote that.

Some advice for transcribing videos: I downloaded the .wmv videos, which allowed me to use VLC's global hotkeys to create a pause and "short skip backwards and forwards" buttons (ctrl-space and ctrl-shift left/right arrow), which were so much more convenient than any other method I tried.

 

Edited out: repetition of the question, “um/uh”, “you know,” false starts.

Punctuation, capitalization, and structure, etc may not be entirely consistent.

Keep in mind the opinions expressed here are those of Eliezer circa January 2010.

 


1. What is your information diet like? Do you control it deliberately (do you have a method; is it, er, intelligently designed), or do you just let it happen naturally.

By that I mean things like: Do you have a reading schedule (x number of hours daily, etc)? Do you follow the news, or try to avoid information with a short shelf-life? Do you frequently stop yourself from doing things that you enjoy (f.ex reading certain magazines, books, watching films, etc) to focus on what is more important? etc.


It’s not very planned, most of the time, in other words Hacker News, Reddit, Marginal Revolution, other random stuff found on the internet.  In order to learn something, I usually have to set aside blocks of time and blocks of effort and just focus on specifically reading something. It’s only sort of popular level books which I can put on a restroom shelf and get them read that way.  In order to learn actually useful information I generally find that I have to set aside blocks of time or run across a pot of gold, and you’re about as likely to get a pot of gold from Hacker News as anywhere else really.  So not very controlled.



2. Your "Bookshelf" page is 10 years old (and contains a warning sign saying it is obsolete): http://yudkowsky.net/obsolete/bookshelf.html

Could you tell us about some of the books and papers that you've been reading lately? I'm particularly interested in books that you've read since 1999 that you would consider to be of the highest quality and/or importance (fiction or not).


I guess I’m a bit ashamed of how little I’ve been reading whole books and how much I’ve been reading small bite-sized pieces on the internet recently.  Right now I’m reading Predictably Irrational which is a popular book by Dan Ariely about biases, it’s pretty good, sort of like a sequence of Less Wrong posts.  I’ve recently finished reading Good and Real by Gary Drescher, which is something I kept on picking up and putting down, which is very Lesswrongian, it’s master level Reductionism and the degree of overlap was incredible enough that I would read something and say ‘OK I should write this up on my own before I read how Drescher wrote it so that you can get sort of independent views of it and see how they compare.’


Let’s see, other things I’ve read recently.  I’ve fallen into the black hole of Fanfiction.net, well actually fallen into a black hole is probably too extreme. It’s got a lot of reading and the reading’s broken up into nice block size chapters and I’ve yet to exhaust the recommendations of the good stuff, but probably not all that much reading there, relatively speaking.  


I guess it really has been quite a while since I picked up a good old-fashioned book and said ‘Wow, what an amazing book’. My memory is just returning the best hits of the last 10 years instead of the best hits of the last six months or anything like that.  If we expand it out to the best hits of the last 10 years then Artificial Intelligence: A Modern Approach by Russell and Norvig is a really wonderful artificial intelligence textbook. It was on reading through that that I sort of got the epiphany that artificial intelligence really has made a lot more progress than people credit it for, it’s just not really well organized, so you need someone with good taste to go through and tell you what’s been done before you recognize what has been done.


There was a book on statistical inference, I’m trying to remember the exact title, it’s by Hastie and Tibshirani, Elements of Statistical Learning, that was it.  Elements of Statistical Learning was when I realized that the top people, they really do understand their subject, the people who wrote the Elements of Statistical Learning, they really understand statistics.  At the same time you read through and say ‘Gosh, by comparison with these people, the average statistician, to say nothing of the average scientist who’s just using statistics, doesn’t really understand statistics at all.’


Let’s see, other really great... Yeah my memory just doesn’t really associate all that well I’m afraid,  it doesn’t sort of snap back and cough up a list of the best things I’ve read recently.  This would probably be something better for me to answer in text than in video I’m afraid.



3. What is a typical EY workday like? How many hours/day on average are devoted to FAI research, and how many to other things, and what are the other major activities that you devote your time to?


I’m not really sure I have anything I could call a ‘typical’ workday. Akrasia, weakness of will, that has always been what I consider to be my Great Bugaboo, and I still do feel guilty about the amount of rest time and downtime that I require to get work done, and even so I sometimes suspect that I’m taking too little downtime relative to work time just because on those occasions when something or other prevents me from getting work done, for a couple of days, I come back and I’m suddenly much more productive. In general, I feel like I’m stupid with respect to organizing my work day, that sort of problem, it used to feel to me like it was chaotic and unpredictable, but I now recognize that when something looks chaotic and unpredictable, that means that you are stupid with respect to that domain.  


So it’ll probably look like, when I manage to get a work session in, the work session will be a couple of hours; when I run into a difficult problem I’ll sometimes stop and go off and read things on the internet for a few minutes or a lot of minutes, until I can come back and solve the problem or my brain is rested enough to go to the more tiring high levels of abstraction where I can actually understand what it is that’s been blocking me and move on. That’s for writing, which I’ve been doing a lot of lately.


A typical workday when I’m actually working on Friendly AI with Marcello, that’ll look like we get together and sit down and open up a notebook and stare at our notebooks and throw ideas back and forth and sometimes sit in silence and think about things, write things down, I’ll propose things, Marcello will point out flaws in them or vice versa, sort of reach the end of a line of thought, go blank, stop and stare at each other and try to think of another line of thought, keep that up for two to three hours, break for lunch, keep it up for another two to three hours, and then break for a day, could spend the off day just recovering or reading math if possible or otherwise just recovering.  Marcello doesn’t need as much recovery time, but I also suspect that Marcello, because he’s still sort of relatively inexperienced isn’t quite confronting the most difficult parts of the problem as directly.


So taking a one-day-on one-day-off, with respect to Friendly AI I actually don’t feel guilty about it at all, because it really is apparent that I just cannot work two days in a row on this problem and be productive. It’s just really obvious, and so instead of the usual cycle of ‘Am I working enough? Could I be working harder?’ and feeling guilty about it it’s just obvious that in that case after I get a solid day’s work I have to take a solid day off.


Let’s see, any other sorts of working cycles? Back when I was doing the Overcoming Bias/Less Wrong arc at one post per day, I would sometimes get more than one post per day in and that’s how I’d occasionally get a day off, other times a post would take more than one day. I find that I am usually relatively less productive in the morning; a lot of advice says ‘as soon as you get up in the morning, sit down, start working, get things done’; that’s never quite worked out for me, and of course that could just be because I’m doing it wrong, but even so I find that I tend to be more productive later in the day.


Let’s see, other info... Oh yes, at one point I tried to set up my computer to have a separate login without any of the usual distractions, and that caused my productivity to drop down because it meant that when I needed to take some time off, instead of browsing around the internet and then going right back to working, I’d actually separated work and so it was harder to switch back and forth between them both, so that was something that seemed like it was a really good idea that ought to work in theory, setting aside this sort of separate space with no distractions to work, and that failed.


And right now I’m working sort of on the preliminaries for the book, The Art of Rationality being the working title, and I haven’t started writing the book yet, I’m still sort of trying to understand what it is that I’ve previously written on Less Wrong, Overcoming Bias, organize it using mind mapping software from FreeMind which is open source mind mapping software; it’s really something I wish I’d known existed and started using back when the whole Overcoming Bias/Less Wrong thing started, I think it might have been a help.


So right now I’m just still sort of trying to understand what did I actually say, what’s the point, how do the points relate to each other, and thereby organizing the skeleton of the book, rather than writing it just yet, and the reason I’m doing it that way is that when it comes to writing things like books where I don’t push out a post every day I tend to be very slow, unacceptably slow even, and so one method of solving that was to write a post every day and this time I’m seeing if I can, by planning everything out sufficiently thoroughly in advance and structuring it sufficiently thoroughly in advance, get it done at a reasonable clip.



4. Could you please tell us a little about your brain? For example, what is your IQ, at what age did you learn calculus, do you use cognitive enhancing drugs or brain fitness programs, are you Neurotypical and why didn't you attend school?


So the question is ‘please tell us a little about your brain.’  What’s your IQ? Tested as 143, that would have been back when I was... 12? 13? Not really sure exactly.  I tend to interpret that as ‘this is about as high as the IQ test measures’ rather than ‘you are three standard deviations above the mean’. I’ve scored higher than that on(?) other standardized tests; the largest I’ve actually seen written down was 99.9998th percentile, but that was not really all that well standardized because I was taking the test and being scored as though for the grade above mine and so it was being scored for grade rather than by age, so I don’t know whether or not that means that people who didn’t advance through grades tend to get the highest scores and so I was competing well against people who were older than me, or whether if the really smart people all advanced farther through the grades and so the proper competition doesn’t really get sorted out, but in any case that’s the highest percentile I’ve seen written down.


‘At what age did I learn calculus’, well it would have been before 15, probably 13 would be my guess. I’ll also state just how stunned I am at how poorly calculus is taught.

Do I use cognitive enhancing drugs or brain fitness programs? No.  I’ve always been very reluctant to try tampering with the neurochemistry of my brain because I just don’t seem to react to things typically; as a kid I was given Ritalin and Prozac and neither of those seemed to help at all and the Prozac in particular seemed to blur everything out and you just instinctively(?) just... eugh.


One of the questions over here is ‘are you neurotypical’. And my sort of instinctive reaction to that is ‘Hah!’  And for that reason I’m reluctant to tamper with things. Similarly with the brain fitness programs, don’t really know which one of those work and which don’t, I’m sort of waiting for other people in the Less Wrong community to experiment with that sort of thing and come back and tell the rest of us what works and if there’s any consensus between them, I might join the crowd.


‘Why didn’t you attend school?’ Well I attended grade school, but when I got out of grade school it was pretty clear that I just couldn’t handle the system; I don’t really know how else to put it. Part of that might have been that at the same time that I hit puberty my brain just sort of... I don’t really know how to describe it. Depression would be one word for it, sort of ‘spontaneous massive will failure’ might be another way to put it; it’s not that I was getting more pessimistic or anything, just that my will sort of failed and I couldn’t get stuff done. Sort of a long process to drag myself out of that and you could probably make a pretty good case that I’m still there, I just handle it a lot better? Not even really sure quite what I did right, as I said in an answer to a previous question, this is something I’ve been struggling with for a while and part of having a poor grasp on something is that even when you do something right you don’t understand afterwards quite what it is that you did right.


So... ‘tell us about your brain’.  I get the impression that it’s got a different balance of abilities; like, some neurons got allocated to different areas, other areas got shortchanged, some areas got some extra neurons, other areas got shortchanged, the hypothesis has occurred to me lately that my writing is attracting other people with similar problems because of the extent to which one has noticed a sort of similar tendency to fall on the lines of very reflective, very analytic and has mysterious trouble executing and getting things done and working at sustained regular output for long periods of time, among the people who like my stuff.


On the whole though, I never actually got around to getting an MRI scan; it’s probably a good thing to do one of these days, but this isn’t Japan where that sort of thing only costs 100 dollars, and getting it analyzed, you know they’re not just looking for some particular thing but just sort of looking at it and saying ‘Hmm, well what is this about your brain?’, well I’d have to find someone to do that too.


So, I’m not neurotypical... asking sort of ‘what else can you tell me about your brain’  is sort of ‘what else can you tell me about who you are apart from your thoughts’, and that’s a bit of a large question. I don’t try and whack on my brain because it doesn’t seem to react typically and I’m afraid of being in a sort of narrow local optimum where anything I do is going to knock it off the tip of the local peak, just because it works better than average and so that’s sort of what you would expect to find there.



5. During a panel discussion at the most recent Singularity Summit, Eliezer speculated that he might have ended up as a science fiction author, but then quickly added:

I have to remind myself that it's not what's the most fun to do, it's not even what you have talent to do, it's what you need to do that you ought to be doing.

Shortly thereafter, Peter Thiel expressed a wish that all the people currently working on string theory would shift their attention to AI or aging; no disagreement was heard from anyone present.

I would therefore like to ask Eliezer whether he in fact believes that the only two legitimate occupations for an intelligent person in our current world are (1) working directly on Singularity-related issues, and (2) making as much money as possible on Wall Street in order to donate all but minimal living expenses to SIAI/Methuselah/whatever.

How much of existing art and science would he have been willing to sacrifice so that those who created it could instead have been working on Friendly AI? If it be replied that the work of, say, Newton or Darwin was essential in getting us to our current perspective wherein we have a hope of intelligently tackling this problem, might the same not hold true in yet unknown ways for string theorists? And what of Michelangelo, Beethoven, and indeed science fiction? Aren't we allowed to have similar fun today? For a living, even?


So, first, why restrict it to intelligent people in today’s world? Why not everyone? And second... the reply to the essential intent of the question is yes, with a number of little details added. So for example, if you’re making money on Wall Street, I’m not sure you should be donating all but minimal living expenses because that may or may not be sustainable for you. And in particular if you’re, say, making 500,000 dollars a year and you’re keeping 50,000 dollars of that per year, which is totally not going to work in New York, probably, then it’s probably more effective to double your living expenses to 100,000 dollars per year and have the amount donated to the Singularity Institute go from 450,000 to 400,000 when you consider how much more likely that makes it that more people follow in your footsteps. That number is totally not realistic and not even close to the percentage of income donated versus spent on living expenses for present people working on Wall Street who are donors to the Singularity Institute. So considering at present that no one seems willing to do that, I wouldn’t even be asking that, but I would be asking for more people to make as much money as possible if they’re the sorts of people who can make a lot of money and can donate a substantial fraction, never mind all but minimal living expenses, to the Singularity Institute.
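
The arithmetic behind that example can be checked with a small sketch. The income and living-expense figures are the ones quoted above; the function name and the break-even framing are illustrative assumptions, not something stated in the answer.

```python
def total_donated(income, living_expenses, donors=1):
    """Yearly donations if `donors` people each earn `income` and keep `living_expenses`."""
    return (income - living_expenses) * donors

INCOME = 500_000  # dollars per year, from the example above

austere = total_donated(INCOME, 50_000)       # one donor keeping 50,000
sustainable = total_donated(INCOME, 100_000)  # one donor keeping 100,000

# Assumption for illustration: the more sustainable lifestyle changes how many
# people follow the example. This is the number of donors at the 100,000 level
# needed to match a single donor at the 50,000 level.
break_even_donors = austere / sustainable

print(austere, sustainable, break_even_donors)
```

Under these figures the break-even point is 1.125 donors, so the higher living expenses come out ahead if they raise the expected number of people donating at that level by more than 12.5 percent. How much they actually raise it is not something the answer quantifies.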


Comparative advantage is what money symbolizes; each of us able to specialize in doing what we do best, get a lot of experience doing it, and trade off with other people specialized at what they’re doing best with attendant economies of scale and large fixed capital installations as well, that’s what money symbolizes, sort of in idealistic reality, as it were; that’s what money would mean to someone who could look at human civilization and see what it was really doing.  On the other hand, what money symbolizes emotionally in practice, is that it imposes market norms, instead of social norms. If you sort of look at how cooperative people are, they can actually get a lot less cooperative once you offer to pay them a dollar, because that means that instead of cooperating because it’s a social norm, they’re now accepting a dollar, and a dollar puts it in the realm of market norms, and they become much less altruistic.


So it’s sort of a sad fact about how things are set up that people look at the Singularity Institute and think ‘Isn’t there some way for me to donate something other than money?’ partially for the obvious reason and partially because their altruism isn’t really emotionally set up to integrate properly with their market norms. For me, money is reified time, reified labor. To me it seems that if you work for an hour on something and then donate the money, that’s more or less equivalent to donating the time, or should be, logically. We have very large bodies of experimental literature showing that the difference between even a dollar bill versus a token that’s going to be exchanged for a dollar bill at the end of the experiment can be very large, just because that token isn’t money. So there’s nothing dirty about money, and there’s nothing dirty about trying to make money so that you can donate it to a charitable cause; the question is ‘can you get your emotions to line up with reality in this case?’


Part of the question was sort of like ‘What of Michelangelo, Beethoven, and indeed science fiction? Aren’t we allowed to have similar fun today? For a living even?’


This is crunch time. This is crunch time for the entire human species. This is the hour before the final exam, we are trying to get as much studying done as possible, and it may be that you can’t make yourself feel that, for a decade, or 30 years on end or however long this crunch time lasts. But again, the reality is one thing, and the emotions are another. So it may be that you can’t make yourself feel that this is crunch time, for more than an hour at a time, or something along those lines. But relative to the broad sweep of human history, this is crunch time; and it’s crunch time not just for us, it’s crunch time for the intergalactic civilization whose existence depends on us. I think that if you’re actually just going to sort of confront it, rationally, full-on, then you can’t really justify trading off any part of that intergalactic civilization for any intrinsic thing that you could get nowadays, and at the same time it’s also true that there are very few people who can live like that, and I’m not one of them myself, because trying to live like that would even rule out things like ordinary altruism; I hold open doors for little old ladies, because I find that I can’t live only as an altruist in theory; I need to commit sort of actual up-front deeds of altruism, or I stop working properly.


So having seen that intergalactic civilization depends on us, in one sense, all you can really do is try not to think about that, and in another sense though, if you spend your whole life creating art to inspire people to fight global warming, you’re taking that ‘forgetting about intergalactic civilization’ thing much too far. If you look over our present civilization, part of that sort of economic thinking that you’ve got to master as a rationalist is learning to think on the margins. On the margins, does our civilization need more art and less work on the singularity? I don’t think so. I think that the amount of effort that our civilization invests in defending itself against existential risks, and to be blunt, Friendly AI in particular, is ludicrously low. Now if it became the sort of pop-fad cause and people were investing billions of dollars into it, all that money would go off a cliff and probably produce anti-science instead of science, because very few people are capable of working on a problem where they don’t find out immediately whether or not they were wrong, and it would just instantaneously go wrong and generate a lot of noise from people of high prestige who would just drown out the voices of sanity. So it wouldn’t be a nice thing if our civilization started devoting billions of dollars to Friendly AI research, because our civilization is not set up to do that sanely. But at the same time, the Singularity Institute exists, the Singularity Institute, now that Michael Vassar is running it, should be able to scale usefully; that includes actually being able to do interesting things with more money, now that Michael Vassar’s the president.


To say ‘No, on the margin, what human civilization, at this present time, needs to do is not put more money in the Singularity Institute, but rather do this thing that I happen to find fun’ not that I’m doing this and I’m going to professionally specialize in it and become good in it and sort of trade hours of doing this thing that I’m very good at for hours that go into the Singularity Institute via the medium of money, but rather ‘no, this thing that I happen to find fun and interesting is actually what our civilization needs most right now, not Friendly AI’, that’s not defensible; and, you know, these are all sort of dangerous things to think about possibly, but I think if you sort of look at that face-on, up-front, take it and stare at it, there’s no possible way the numbers could work out that way.


It might be helpful to visualize a Friendly Singularity so that the kid who was one year old at the time is now 15 years old and still has something like a 15 year old human psychology and they’re asking you ‘So here’s this grand, dramatic moment in history, not human history, but history, on which the whole future of the intergalactic civilization that we now know we will build; it hinged on this one moment, and you knew that was going to happen. What were you doing?’ and you say, ‘Well, I was creating art to inspire people to fight global warming.’ The kid says ‘What’s global warming?’


That’s what you get for not even taking into account at all the whole ‘crunch time, fate of the world depends on it, squeaking through by a hair if we do it at all, already played into a very poor position in terms of how much work has been done and how much work we need to do relative to the amount of work that needs to be done to destroy the world as opposed to saving it; how long we could have been working on this previously and how much trouble it’s been to still get started.’ When this is all over, it’s going to be difficult to explain to that kid, what in the hell the human species was thinking. It’s not going to be a baroque tale. It’s going to be a tale of sheer insanity. And you don’t want to be explaining yourself to that kid afterward as part of the insanity rather than the sort of small core of ‘realizing what’s going on and actually doing something about it that got it done.’



6. I know at one point you believed in staying celibate, and currently your main page mentions you are in a relationship. What is your current take on relationships, romance, and sex, how did your views develop, and how important are those things to you? (I'd love to know as much personal detail as you are comfortable sharing.)


This is not a topic on which I consider myself an expert, and so it shouldn’t be shocking to hear that I don’t have incredibly complicated and original theories about these issues. Let’s see, is there anything else to say about that... It’s asking ‘at one point I believed in staying celibate and currently your main page mentions you are in a relationship.’ So, it’s not that I believed in staying celibate as a matter of principle, but that I didn’t know where I could find a girl who would put up with me and the life that I intended to lead, and said as much, and then one woman, Erin, read about the page I’d put up to explain why I didn’t think any girl would put up with me and my life and said essentially ‘Pick me! Pick me!’ and it was getting pretty difficult to keep up with the celibate lifestyle by then so I said ‘Ok!’ And that’s how we got together, and if that sounds a bit odd to you, or like, ‘What!? What do you mean...?’ then... that’s why you’re not my girlfriend.


I really do think that in the end I’m not an expert; that might be as much as there is to say.



7. What's your advice for Less Wrong readers who want to help save the human race?


Find whatever you’re best at; if that thing that you’re best at is inventing new math[s] of artificial intelligence, then come work for the Singularity Institute. If the thing that you’re best at is investment banking, then work for Wall Street and transfer as much money as your mind and will permit to the Singularity Institute where [it] will be used by other people. And for a number of sort of intermediate cases, if you’re familiar with all the issues of AI and all the issues of rationality and you can write papers at a reasonable clip, and you’re willing to work for a not overwhelmingly high salary, then the Singularity Institute is, as I understand it, hoping to make a sort of push toward getting some things published in academia. I’m not going to be in charge of that, Michael Vassar and Anna Salamon would be in charge of that side of things. There’s an internship program whereby we provide you with room and board and you drop by for a month or whatever and see whether or not this is work you can do and how good you are at doing it.


Aside from that, though, I think that saving the human species eventually comes down to, metaphorically speaking, nine people and a brain in a box in a basement, and everything else feeds into that. Publishing papers in academia feeds into either attracting attention that gets funding, or attracting people who read about the topic, not necessarily reading the papers directly even but just sort of raising the profile of the issues so that intelligent people who wonder what they can do with their lives think artificial intelligence instead of string theory. Hopefully not too many of them are thinking that because that would just generate noise, but the very most intelligent people... string theory is a marginal waste of the most intelligent people. Artificial intelligence and Friendly Artificial Intelligence, sort of developing precise, precision grade theories of artificial intelligence that you could actually use to actually build a Friendly AI instead of blowing up the world; the need for one more genius there is much greater than the need for one more genius in string theory. Most of us can’t work on that problem directly. I, in a sense, have been lucky enough not to have to confront a lot of the hard issues here, because of being lucky enough to be able to work on the problem directly, which simplifies my choice of careers.


For everyone else, I’ll just sort of repeat what I said in an earlier video about comparative advantage, professional specialization, doing what we do best at and practicing a lot; everyone doing that and trading with each other is the essence of economics, and the symbol of this is money, and it’s completely respectable to work hours doing what you’re best at, and then transfer the sort of expected utilons that a society assigns to that to the Singularity Institute, where it can pay someone else to work at it such that it’s an efficient trade, because the total amount of labor and effectiveness that they put into it that you can purchase is more than you could do by working an equivalent number of hours on the problem yourself. And as long as that’s the case, the economically rational thing to do is going to be to do what you’re best at and trade those hours to someone else, and let them do it. And there should probably be fewer people, one expects, who are working on the problem directly, full time; stuff just does not get done if you’re not working on it full time, that’s what I’ve discovered, anyway; I can’t even do more than one thing at a time. And that’s the way grown ups do it, essentially, that’s the way a grown up economy does it.
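
The ‘efficient trade’ condition described above can be written down as a rough sketch. Everything here is a hypothetical illustration: the wages, the ‘effectiveness’ unit (progress per hour of work on the problem), and the function names are assumptions, not figures from the answer.

```python
def hours_purchased(own_wage, specialist_wage, hours_worked):
    """Hours of specialist labor that the wages from `hours_worked` can pay for."""
    return own_wage * hours_worked / specialist_wage

def donating_beats_direct_work(own_wage, specialist_wage,
                               own_effectiveness, specialist_effectiveness):
    """True if one hour of donated wages buys more progress than one hour of direct work.

    Effectiveness is progress per hour on the problem, in an arbitrary hypothetical unit.
    """
    purchased = hours_purchased(own_wage, specialist_wage, 1.0) * specialist_effectiveness
    return purchased > own_effectiveness

# Hypothetical numbers: someone earning 200 per hour funding a researcher paid 40 per hour.
print(hours_purchased(200, 40, 1.0))                 # hours of research per hour of earning
print(donating_beats_direct_work(200, 40, 0.5, 1.0)) # earner is half as effective directly
print(donating_beats_direct_work(20, 40, 1.0, 1.0))  # low earner, equally effective directly
```

With these made-up numbers the first person does better by earning and donating, while the second does better by working on the problem directly, which matches the ‘as long as that’s the case’ qualifier in the answer.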



8. Autodidacticism

Eliezer, first congratulations for having the intelligence and courage to voluntarily drop out of school at age 12! Was it hard to convince your parents to let you do it? AFAIK you are mostly self-taught. How did you accomplish this? Who guided you, did you have any tutor/mentor? Or did you just read/learn what was interesting and kept going for more, one field of knowledge opening pathways to the next one, etc...?

EDIT: Of course I would be interested in the details, like what books did you read when, and what further interests did they spark, etc... Tell us a little story. ;)


Well, amazingly enough, I’ve discovered the true, secret, amazing formula for teaching yourself and... I lie, I just winged it.  Yeah, just read whatever interested me until age 15-16 thereabouts which is when I started to discover the Singularity as opposed to background low-grade Transhumanism that I’d been engaged with up until that point; started thinking that cognitive technologies, creating smarter than human level intelligence was the place to be and initially thought that neural engineering was going to be the sort of leading, critical path of that. Studied a bit of neuroscience and didn’t get into that too far before I started thinking that artificial intelligence was going to be the route; studied computer programming, studied a bit of business type stuff because at one point I thought I’d do a start up at something I’m very glad I didn’t end up doing, in order to get the money to do the AI thing, and I’m very glad that I didn’t go that route, and I won’t even say that the knowledge has served me all that good instead, it’s just not my comparative advantage.  


At some point sort of woke up and smelled the Bayesian coffee and started studying probability theory and decision theory and statistics and that sort of thing, but really I haven’t had an opportunity to study anywhere near as much as I need to know. And part of that, I won’t apologize for because a lot of sort of fact memorization is more showing off than because you’re going to use that fact every single day; part of that I will apologize for because I feel that I don’t know enough to get the job done and that when I’m done writing the book I’m just going to have to take some more time off and just study some of the sort of math and mathematical technique that I expect to need in order to get this done. I come across as very intelligent, but a surprisingly small amount of that relies on me knowing lots of facts, or at least that’s the way it feels to me. So I come across as very intelligent, but that’s because I’m good at winging it, might be one way to put it. The road of the autodidact, I feel... I used to think that anyone could just go ahead and do it and that the only reason to go to college was for the reputational ‘now people can hire you’ aspect which sadly is very important in today’s world. Since then I’ve come to realize both that college is less valuable and less important than I used to think and also that autodidacticism might be a lot harder for the average person than I thought because the average person is less similar to myself than my sort of intuitions would have it.


‘How do you become an autodidact’; the question you would ask before that would be ‘what am I going to do, and is it something that’s going to rely on me having memorized lots of standard knowledge and worked out lots of standard homework problems, or is it going to be something else?’, because if you’re heading for a job where you’re going to want to memorize lots of the same standardized facts as people around you, then autodidacticism might not be the best way to go. If you’re going to be a computer programmer, on the other hand, then [going] into a field where every day is a new adventure, and most jobs in computer programming will not require you to know the Nth detail of computer science, and even if they did, the fact that this is math means you might even have a better chance of learning it out of a book, and above all it’s a field where people have some notion that you’re allowed to teach yourself; if you’re good, other people can see it by looking at your code, and so there’s sort of a tradition of being willing to hire people who don’t have a Masters.


So I guess I can’t really give all that much advice about how to be a successful autodidact in terms of... studying hard, doing the same sort of thing you’d be doing in college only managing to do it on your own because you’re that self-disciplined, because that is completely not the route I took. I would rather advise you to think very hard about what it is you’re going to be doing, whether or not anyone will let you do it if you don’t have the official credential, and to what degree the road you’re going is going to depend on the sort of learning that you have found that you can get done on your own.



9. Is your pursuit of a theory of FAI similar to, say, Hutter's AIXI, which is intractable in practice but offers an interesting intuition pump for the implementers of AGI systems? Or do you intend on arriving at the actual blueprints for constructing such systems? I'm still not 100% certain of your goals at SIAI.


Definitely actual blueprint, but, on the way to an actual blueprint, you probably have to, as an intermediate step, construct intractable theories that tell you what you’re trying to do, and enable you to understand what’s going on when you’re trying to do something. If you want a precise, practical AI, you don’t get there by starting with an imprecise, impractical AI and going to a precise, practical AI. You start with a precise, impractical AI and go to a precise, practical AI. I probably should write that down somewhere else because it’s extremely important, and as(?) various people who will try to dispute it, and at the same time hopefully ought to be fairly obvious if you’re not motivated to arrive at a particular answer there. You don’t just run out and construct something imprecise because, yeah, sure, you’ll get some experimental observations out of that, but what are your experimental observations telling you?  And one might say along the lines of ‘well, I won’t know that until I see it,’ and suppose that has been known to happen a certain number of times in history; just inventing the math has also happened a certain number of times in history.


We already have a very large body of experimental observations of various forms of imprecise AIs, both the domain specific types we have now, and the sort of imprecise AI constituted by human beings, and we already have a large body of experimental data, and eyeballing it... well, I’m not going to say it doesn’t help, but on the other hand, we already have this data and now there is this sort of math step in which we understand what exactly is going on; and then the further step of translating the math back into reality. It is the goal of the Singularity Institute to build a Friendly AI. That’s how the world gets saved, someone has to do it. A lot of people tend to think that this is going to require, like, a country’s worth of computing power or something like that, but that’s because the problem seems very difficult because they don’t understand it, so they imagine throwing something at it that seems very large and powerful and gives this big impression of force, which might be a country-size computing grid, or it might be a Manhattan Project where some computer scientists... but size matters not, as Yoda says.


What matters is understanding, and if the understanding is widespread enough, then someone is going to grab the understanding and use it to throw together the much simpler AI that does destroy the world, the one that’s built to much lower standards, so the model is ‘yes, you need the understanding, the understanding has to be concentrated within a group of people small enough that there is not one defector in the group who goes off and destroys the world, and then those people have to build an AI.’ If you condition on the world having been saved, and look back within history, I expect that that is what happened in the majority of cases where a world anything like this one gets saved, and working back from there, they will have needed a precise theory, because otherwise they’re doomed. You can make mistakes and pull yourself up, even if you think you have a precise theory, but if you don’t have a precise theory then you’re completely doomed, or if you don’t think you have a precise theory then you’re completely doomed.


And working back from there, you probably find that there were people spending a lot of time doing math based on the experimental results that other people had sort of blundered out into the dark and gathered because it’s a lot easier to blunder out into the dark; more people can do it, lots more people have done it; it’s the math part that’s really difficult.  So I expect that if you look further back in time, you see a small group of people who had honed their ability to understand things to a very high pitch, and then were working primarily on doing math and relying on either experimental data that other people had gathered by accident, or doing experiments where they have a very clear idea why they’re doing the experiment and what different results will tell them.



10. What was the story purpose and/or creative history behind the legalization and apparent general acceptance of non-consensual sex in the human society from Three Worlds Collide?


The notion that non-consensual sex is not illegal and appears to be socially accepted might seem a bit out of place in the story, as if it had been grafted on. This is correct. It was grafted on from a different story in which, for example, theft is, while not so much legal (because they don’t have what you would call a strong, centralized government), in general something you pull off by being clever rather than a horrible crime; but of course, you would never steal a book. I have yet to publish a really good story set in this world; most of them I haven’t finished, and the one I have finished has other story problems. But if you were to see the story set in this world, then you would see that it develops out of a much more organic thing than say... dueling, theft, non-consensual sex; all of these things are governed by tradition rather than by law, and they certainly aren’t prohibited outright.


So why did I pick up that one aspect from that story and put it into Three Worlds Collide? Well, partially it was because I wanted that backpoint to introduce a culture clash between their future and our past, and that’s what came to mind, more or less; it was more something to test out to see what sort of reaction it got, to see if I could get away with putting it into this other story. Because one can’t use theft; Three Worlds Collide’s society actually does run on private property. One can’t use dueling; their medical technology isn’t advanced enough to make that trivial. But you can use non-consensual sex and try to explain sort of what happens in a society in which people are less afraid, and not afraid of the same things. They’re stronger than we are in some senses, they don’t need as much protection, the consequences aren’t the same consequences that we know, and the people there sort of generally have a higher grade of ethics and are less likely to abuse things. That’s what made that sort of particular culture clash feature a convenient thing to pick up from one story and graft onto another, but ultimately it was a graft, and any feelings of ‘why is that there?’ that you have might make a bit more sense if you saw the other story, if I can ever repair the flaws in it, or manage to successfully complete and publish a story set in that world that actually puts the world on display.



11. If you were to disappear (freak meteorite accident), what would the impact on FAI research be?

Do you know other people who could continue your research, or that are showing similar potential and working on the same problems? Or would you estimate that it would be a significant setback for the field (possibly because it is a very small field to begin with)?


Marcello Herreshoff is the main person whom I’ve worked with on this, and Marcello doesn’t yet seem to be to the point where he could replace me, although he’s young so he could easily develop further in coming years and take over as the lead, or even, say, ‘Aha! Now I’ve got it! No more need for Eliezer Yudkowsky.’ That sort of thing would be very nice if it happened, but it’s not the sort of thing I would rely on.


So if I got hit by a meteor right now, what would happen is that Michael Vassar would take over responsibility for seeing the planet through to safety, and say ‘Yeah I’m personally just going to get this done, not going to rely on anyone else to do it for me, this is my problem, I have to handle it.’ And Marcello Herreshoff would be the one who would be tasked with recognizing another Eliezer Yudkowsky if one showed up and could take over the project, but at present I don’t know of any other person who could do that, or I’d be working with them.  There’s not really much of a motive in a project like this one to have the project split into pieces; whoever can do work on it is likely to work on it together.



12. Your approach to AI seems to involve solving every issue perfectly (or very close to perfection). Do you see any future for more approximate, rough and ready approaches, or are these dangerous?


More approximate, rough and ready approaches might produce interesting data that math theorist types can learn something from even though the people who did it didn’t have that in mind. The thing is, though, there are already a lot of people running out and doing that, and failures at AI, or even approximate successes at AI, result in far fewer sudden thunderbolts of enlightenment about the structure of intelligence than the people busily producing ad hoc AI programs would hope; they produce them because that’s easier to do and you can get a paper out of it and you get respect out of it and prestige and so on. So it’s a lot harder for that sort of work to result in sudden thunderbolts of enlightenment about the structure of intelligence than the people doing it would like to think, because that way it gives them an additional justification for doing the work. The basic answer to the question is ‘no’, or at least I don’t see a future for Singularity Institute funding, going as marginal effort, into sort of rough and ready ‘forages’ like that. It’s been done already. If we had more computer power and our AIs were more sophisticated, then the level of exploration that we’re doing right now would not be a good thing; as it is, it’s probably not a very dangerous thing because the AIs are weak, more or less. It’s not something you would ever do with AI that was powerful enough to be dangerous. If you know what it is that you want to learn by running a program, you may go ahead and run it; if you’re just foraging out at random, well, other people are doing that, and even then they probably won’t understand what their answers mean until you on your end, the sort of math structure of intelligence type people, understand what it means. And mostly the result of an awful lot of work in domain specific AIs tells us that we don’t understand something, and this can often be surprisingly easy to figure out, simply by querying your brain without being overconfident.


So, I think that at this point, what’s needed is math structure of intelligence type understanding, and not just any math, not just ‘Ooh, I’m going to make a bunch of Greek symbols and now I can publish a paper and everyone will be impressed by how hard it is to understand,’ but sort of very specific math, the sort that results in thunderbolts of enlightenment; the usual example I hold up is the Bayesian Network Causality insight as depicted in Judea Pearl’s Probabilistic Reasoning in Intelligent Systems and his later book, Causality. So if you sort of look at the total number of papers that have been written with neat Greek symbols and things that are mathematically hard to understand and compare that to those Judea Pearl books I mentioned, though one should always mention this is the culmination of a lot of work not just by Judea Pearl; that will give you a notion of just how specific the math has to be.


In terms of solving every issue perfectly or very close to perfection, there are kinds of perfection. As long as I know that any proof is valid, I might not know how long it takes to do a proof; if there’s something that does proofs, then I may not know how long the algorithm takes to produce a proof, but I may know that anything it claims is a proof is definitely a proof, so there are different kinds of perfection and types of precision. But basically, yeah, if you want to build a recursively self-improving AI, have it go through a billion sequential self-modifications, become vastly smarter than you, and not die, you’ve got to work to a pretty precise standard.



13. How young can children start being trained as rationalists? And what would the core syllabus / training regimen look like?


I am not an expert in the education of young children. One has these various ideas that one has written up on Less Wrong, and one could try to distill those ideas, popularize them, illustrate them through simpler and simpler stories and so take these ideas and push them down to a lower level, but in terms of sort of training basic thought skills, training children to be self-aware, to be reflective, getting them into the habit of reading and storing up lots of pieces of information, trying to get them more interested in being fair to both sides of an argument, the virtues of honest curiosity over rationalization, not in the way that I do it by sort of telling people and trying to lay out stories and parables that illustrate it and things like that, but if there’s some other way to do it with children, I’m not sure that my grasp of this concept of teaching rationality extends to before the young adult level. I believe that we had some sort of thread on Less Wrong about this, sort of recommended reading for young rationalists; I can’t quite remember.


Oh, but one thing that does strike me as being fairly important is that if this ever starts to happen on a larger scale than individual parents teaching individual children, the number one thing we want to do is test out different approaches and see which one works experimentally.



14. Could you elaborate a bit on your "infinite set atheism"? How do you feel about the set of natural numbers? What about its power set? What about that thing's power set, etc?

From the other direction, why aren't you an ultrafinitist?


The question is ‘can you elaborate on your infinite set atheism’, that’s where I say ‘I don’t believe in infinite sets because I’ve never seen one.’  


So first of all, my infinite set atheism is a bit tongue-in-cheek. I mean, I’ve seen a whole lot of natural numbers, and I’ve seen that times tend to have successor times, and in my experience, at least, time doesn’t return to its starting point; as I understand current cosmology, the universe is due to keep on expanding, and not return to its starting point. So it’s entirely possible that I’m faced with certain elements that have successors, where if the successors of two elements are the same then the two elements are the same, and in which there’s no cycle. So in that sense I might be forced to recognize the empirical existence of every member of what certainly looks like an infinite set. The question of whether this collection of infinitely many finite things constitutes an infinite thing that exists is an interesting metaphysical one, or it would be if we didn’t have the fact that even though by looking at time we can see that it looks like infinite things ought to exist, nonetheless, we’ve never encountered an infinite thing in person. We’ve never encountered a physical process that performs a supertask. If you look more at physics, you find that actually matters are even worse than this. We’ve got real numbers down there, or at least if you postulate that it’s something other than real numbers underlying physics then you have to postulate something that looks continuous but isn’t continuous, and in this way, by Occam’s Razor, one might very easily suspect that the appearance of continuity arises from actual continuity, so that we have, say, an amplitude distribution, a neighborhood in configuration space, and the amplitude flows in configuration space are continuous; instead of having a discrete time with a discrete successor, we actually have a flow of time, so when you write the rules of causality, it’s not possible to write them the way we write them for a Turing machine; you have to write the rules of causality as differential equations.


So these are the two main cases in which the universe defies my infinite set atheism. The universe is handing me what looks like an infinite collection of things, namely times; the universe is handing me things that exist and are causes, and the simplest explanation would have them being described by continuous differential equations, not by discrete ticks. So that’s the main sense in which my infinite set atheism is challenged by the universe’s actual presentation to me of things that look infinite. Aside from this, however, if you start trying to hand me paradoxes that are being produced by just assuming that you have an infinite thing in hand as an accomplished fact, an infinite thing of the sort where you can’t just present to me a physical example of it, you’re just assuming that that infinity exists, and then you’re generating paradoxes from it, well, we do have these nice mathematical rules for reasoning about infinities, but, rather than putting the blame on the person for having violated these elaborate mathematical rules that we develop to reason about infinities, I’m even more likely to cluck my tongue and say ‘But what good is it?’ Now it may be a tongue-in-cheek tongue cluck... I’m trying to figure out how to put this into words... Map that corresponds to the territory, if you can’t have infinities in your map, because your neurons, they fire discretely, and you only have a finite number of neurons in your head, so if you can’t have infinities in the map, what makes you think that you can make them correspond to infinities in the territory, especially if you’ve never actually seen that sort of infinity? And so the sort of math of the higher infinities, I tend to view as works of imaginative literature, like Lord of the Rings; they may be pretty, in the same way that Tolkien’s Middle-earth is pretty, but they don’t correspond to anything real until proven otherwise.



15. Why do you have a strong interest in anime, and how has it affected your thinking?


‘Well, as a matter of sheer, cold calculation I decided that...’


It’s anime! (laughs)


How has it affected my thinking? I suppose that you could view it as a continuity of reading dribs and drabs of westernized eastern philosophy from Gödel, Escher, Bach or Raymond Smullyan; concepts like ‘Tsuyoku Naritai’, ‘I want to become stronger’, are things that being exposed to the alternative eastern culture as found in anime might have helped me to develop concepts of. But on the whole... it’s anime! There’s not some kind of elaborate calculation behind it, and I can’t quite say that when I’m encountering a daily problem, I think to myself ‘How would Light Yagami solve this?’ If the point of studying a programming language is to change the way you think, then I’m not sure that studying anime has changed the way I think all that much.



16. What are your current techniques for balancing thinking and meta-thinking?

For example, trying to solve your current problem, versus trying to improve your problem-solving capabilities.


I tend to focus on thinking, and it’s only when my thinking gets stuck or I run into a particular problem that I will resort to meta-thinking, unless it’s a particular meta skill that I already have, in which case I’ll just execute it. For example, the meta skill of trying to focus on the original problem.  In one sense, a whole chunk of Less Wrong is more or less my meta-thinking skills.  


So I guess on reflection (ironic look), I would say that there’s a lot of routine meta-thinking that I already know how to do, and that I do without really thinking of it as meta-thinking. On the other hand, original meta-thinking, which is the time-consuming part, is something I tend to resort to only when my current meta-thinking skills have broken down. And that’s probably a reasonably exceptional circumstance, even though it’s something of a comparative advantage and so I expect to do a bit more of it than average. Even so, when I’m trying to work on an object-level problem at any given point, I’m probably not doing original meta-level questioning about how to execute these meta-level skills.


If I bog down in writing something I may execute my sort of existing meta-level skill of ‘try to step back and look at this from a more abstract level’, and if that fails, then I may have to sort of think about what kind of abstract levels can you view this problem on, similar problems as opposed to tasks, and in that sense go into original meta-level thinking mode. But one of those meta-level skills I would say is the notion that your meta-level problem comes from an object-level problem and you’re supposed to keep one eye on the object-level problem the whole time you’re working on the meta-level.



17. Could you give an up-to-date estimate of how soon non-Friendly general AI might be developed? With confidence intervals, and by type of originator (research, military, industry, unplanned evolution from non-general AI...)


We’re talking about this very odd sector of program space and programs that self-modify and wander around that space and sort of amble into a pot of gold that enables them to keep going and... I have no idea...


There are all sorts of different ways that it could happen; I don’t know which ones of them are plausible or implausible or how hard or difficult they are relative to modern hardware or computer science. I have no idea what the odds are; I know they aren’t getting any better as time goes on, that is, the probabilities of Unfriendly AI are increasing over time. So if you were actually to make some kind of graph, then you’d see the probability rising over time as the odds got worse, and then the graph would slope down again as you entered into regions where it was more likely than not that Unfriendly AI had actually occurred before that; the slope would actually fall off faster as you went forward in time because the amount of probability mass has been drained away by Unfriendly AI happening now.
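
The shape of the graph described above can be sketched numerically. This is only an illustration under made-up assumptions: the per-period odds (`hazards`) are hypothetical numbers chosen to rise steadily, not estimates of anything, and the function name is invented for this sketch. It shows that when the conditional odds keep getting worse, the probability of the event first happening in a given period rises, peaks, and then falls as the remaining probability mass is drained away.

```python
def first_occurrence_density(hazards):
    """Probability that the event FIRST happens in each period.

    hazards[t] is the assumed chance that the event happens in period t,
    given that it has not already happened in an earlier period.
    """
    surviving = 1.0  # probability mass that has not yet been "drained away"
    density = []
    for h in hazards:
        density.append(surviving * h)  # event has not happened yet, and happens now
        surviving *= 1.0 - h           # mass left over for later periods
    return density


# Hypothetical, steadily worsening odds: 1%, 2%, ..., 60% per period.
hazards = [0.01 * (t + 1) for t in range(60)]
density = first_occurrence_density(hazards)

# Index of the period in which first occurrence is most likely.
peak = max(range(len(density)), key=density.__getitem__)
```

Even though `hazards` only ever increases, `density` climbs to a single peak and then slopes down, because later periods can only claim the probability mass that earlier periods have not already used up.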


‘By type of originator’ or something, I might have more luck answering. I would put academic research at the top of it, because academic research actually can try blue sky things. Or... OK, first commercial, that wasn’t quite on the list, as in people doing startup-ish things, hedge funds, people trying to improve the internal AI systems that they’re using for something, or build weird new AIs to serve commercial needs; those are the people most likely to build AI ‘stews’(?). Then after that, academic research, because in academia you have a chance of trying blue sky things. And then military, because they can hire smart people and give the smart people lots of computing power and have a sense of always trying to be on the edge of things. Then industry, if that’s supposed to mean car factories and so on because... that actually strikes me as pretty unlikely; they’re just going to be trying to automate ordinary processes, that sort of thing, and it’s generally unwise to sort of push the bounds of theoretical limits while you’re trying to do that sort of thing; you can count Google as industry, but that’s the sort of thing I had in mind when I was talking about commercial. Unplanned evolution from non-general AI [is] not really all that likely to happen. These things aren’t magic. If something can happen by itself spontaneously, it’s going to happen before that because humans are pushing on it.


As for confidence intervals... doing that just feels like pulling numbers out of thin air. I’m kind of reluctant to do it because of the extent to which I feel that, even to the extent that my brain has a grasp on this sort of thing, by making up probabilities and making up times, I’m not even translating the knowledge that I do have into reality, so much as pulling things out of thin air. And if you were to sort of ask ‘what sort of attitude do your revealed actions indicate?’ then I would say that my revealed actions don’t indicate that I expect to die tomorrow of Unfriendly AI, and my revealed actions don’t indicate that we can safely take until 2050. And that’s not even a probability estimate, that’s sort of looking at what I’m doing and trying to figure out what my brain thinks the probabilities are.



18. What progress have you made on FAI in the last five years and in the last year?


The last five years would take us back to the end of 2004, which is fairly close to the beginning of my Bayesian enlightenment, so the whole ‘coming to grips with the Bayesian structure of it all’, a lot of that would fall into the last five years. And if you were to ask me... the development of Timeless Decision Theory would be in the last five years. I’m trying to think if there’s anything else I can say about that. Getting a lot of clarification of what the problems were.


In the last year, I managed to get in a decent season of work with Marcello after I stopped regular posting to OBLW over the summer, before I started writing the book. That, there’s not much I can say about; there was something I suspected was going to be a problem and we tried to either solve the problem or at least nail down exactly what the problem was, and I think that we did a fairly good job of the latter; we now have a nice, precise, formal explanation of what it is we want to do and why we can’t do it in the obvious way; we came up with sort of one hack for getting around it that’s a hack and doesn't have all the properties that we want a real solution to have.


So, step one, figure out what the problem is, step two, understand the problem, and step three, solve the problem.  Some degree of progress on step two but not finished with it, and we didn’t get to step three, but that’s not overwhelmingly discouraging.  Most of the real progress that has been made when we sit down and actually work on the problem [are] things I’d rather not talk about and the main exception to that is Timeless Decision Theory which has been posted to Less Wrong.



19. How do you characterize the success of your attempt to create rationalists?


It’s a bit of an ambiguous question, and certainly an ongoing project. Recently, for example, I was in a room with a group of people with a problem of what Robin Hanson called a far-type and what I would call the type where it’s difficult because you don’t get immediate feedback when you say something stupid, and it really was clear who in that room was an ‘X-rationalist’ or ‘neo-rationalist’, or ‘Lesswrongian’ or ‘Lessiath’ and who was not. The main distinction was that the sort of non-X-rationalists were charging straight off and were trying to propose complicated policy solutions right off the bat, and the rationalists were actually holding off, trying to understand the problem, break it down into pieces, analyze the pieces modularly, and just that one distinction was huge; it was the difference between ‘these are the people who can make progress on the problem’ and ‘these are the people who can’t make progress on the problem’. So in that sense, once you hand these deep Lesswrongian types a difficult problem, the distinction between them and someone who has merely had a bunch of successful life experiences and so on is really obvious.


There’s a number of other interpretations that can be attached to the question, but I don’t really know what it means aside from that, even though it was voted up by 17 people.



20. What is the probability that this is the ultimate base layer of reality?


I would answer by saying, hold on, this is going to take me a while to calculate... um.... uh... um... 42 percent! (sarcastic)



21. Who was the most interesting would-be FAI solver you encountered?


Most people do not spontaneously try to solve the FAI problem.  If they’re spontaneously doing something, they try to solve the AI problem. If we’re talking about sort of ‘who’s made interesting progress on FAI problems without being a Singularity Institute Eliezer supervised person,’ then I would have to say: Wei Dai.



22. If Omega materialized and told you Robin was correct and you are wrong, what do you do for the next week? The next decade?


If Robin’s correct, then we’re on a more or less inevitable path to competing intelligences driving existence down to subsistence level, but this does not result in the loss of everything we regard as valuable, and there seem to be some values disputes here, or things that are cleverly disguised as values disputes while probably not being very much like values disputes at all.


I’m going to take the liberty of reinterpreting this question as ‘Omega materializes and tells you “You’re Wrong”’, rather than telling me Robin in particular is right; for one thing that’s a bit more probable. And, Omega materializes and tells me ‘Friendly AI is important but you can make no contribution to that problem, in fact everything you’ve done so far is worse than nothing.’ So, publish a retraction... Ordinarily I would say that the next most important thing after this is to go into talking about rationality, but then if Omega tells me that I’ve actually managed to do worse than nothing on Friendly AI, that of course has to change my opinion of how good I am at rationality or teaching others rationality, unless this is a sort of counterfactual surgery type of thing where it doesn’t affect my opinion of how useful I can be by teaching people rationality, and mostly the thing I’d be doing if Friendly AI weren’t an option would probably be pushing human rationality. And if that were blocked out of existence, I’d probably end up as a computer programmer whose hobby was writing science fiction.


I guess I have enough difficulty visualizing what it means for Robin to be correct or how the human species isn’t just plain screwed in that situation that I could wish that Omega had materialized and either told me someone else was correct or given me a bit more detail about what I was wrong about exactly; I mean I can’t be wrong about everything; I think that two plus two equals four.



23. In one of the discussions surrounding the AI-box experiments, you said that you would be unwilling to use a hypothetical fully general argument/"mind hack" to cause people to support SIAI. You've also repeatedly said that the friendly AI problem is a "save the world" level issue. Can you explain the first statement in more depth? It seems to me that if anything really falls into "win by any means necessary" mode, saving the world is it.


Ethics are not pure personal disadvantages that you take on for others’ benefit. Ethics are not just penalties to the current problem you’re working on that have sort of side benefits for other things. When I first started working on the Singularity problem, I was making non-reductionist type mistakes about Friendly AI, even though I thought of myself as a rationalist at the time. And so I didn’t quite realize that Friendly AI was going to be a problem, and I wanted to sort of go all-out on any sort of AI, as quickly as possible; and actually, later on when I realized that Friendly AI was an issue, the sort of sneers that I now get about not writing code or being a luddite were correctly anticipated by my past self with the result that my past self sort of kept on advocating the kind of ‘rush ahead and write code’ strategy, rather than face the sneers, instead of going back and replanning everything from scratch once my past self realized that Friendly AI was going to be an issue, on which basis all the plans had been made before then.


So if I’d lied to get people to do what I had wanted them to do at that point, to just get AI done, to rush ahead and write code rather than doing theory; being honest as I actually was, I could just come back and say ‘OK, here’s what I said, I’m honestly mistaken, here’s the new information that I encountered that caused me to change my mind, here’s the new strategy that we need to use after taking this new information into account’. If you lie, there’s not necessarily any equally easy way to retract your lies. ... So for example, one sort of lie that I used to hear advocated back in the old days was by other people working on AI projects and it was something along the lines of ‘AI is going to be safe and harmless and will inevitably cure cancer, but not really take over the world or anything’ and if you tell that lie in order to get people to work on your AI project, then it’s going to be a bit more difficult to explain to them why you suddenly have to back off and do math and work on Friendly AI. Now, if I were an expert liar, I’d probably be able to figure out some sort of way to reconfigure those lies as well, I mean I don’t really know what an expert liar could accomplish by way of lying because I don’t have enough practice.


So I guess in that sense it’s not all that defensible... a defensive ethics, because I haven’t really tried it both ways, but it does seem to me, looking over my history, my ethics have played a pretty large role in protecting me from myself. Another example is [that] the whole reason that I originally pursued the thought of Friendly AI long enough to realize that it was important was not so much out of a personal desire as out of a sense that this was something I owed to the other people who were funding the project, Brian Atkins in particular back then, and that if there’s a possibility from their perspective that you can do better by Friendly AI, or that a fully honest account would cause them to go off and fund someone who was more concerned about Friendly AI, then I owed it to them to make sure that they didn’t suffer by helping me. And so it was a sense of ethical responsibility for others at that time which caused me to focus in on this sort of small, discordant note, ‘Well, this minor possibility that doesn’t look all that important, follow it long enough to get somewhere’. So maybe there are people who could defend the Earth by any means necessary and recruit other people to defend the Earth by any means necessary, and nonetheless have that all end well and happily smiling ever after, rather than bursting into flames and getting arrested for murder and robbing banks and being international outlaws, or more likely just arrested and attracting the ‘wrong’ sort of people who are trying to go along with this and people being corrupted by power and deciding that ‘no, the world really would be a better place with them in charge’ and etcetera etcetera etcetera.


I think if you sort of survey the Everett branches of the Many Worlds and look at the ones with successful Singularities, or pardon me, look at the conditional probability of successful Singularities, my guess is that the worlds that start out with programming teams who are trying to play it ethical versus the worlds that start off with programming teams that figure ‘well no, this is a planetary-class problem, we should throw away all our ethics and do whatever is necessary to get it done’ that the former world will have a higher proportion of happy outcomes.  I could be mistaken, but if it does take a sort of master ruthless type person to do it optimally, then I am not that person, and that is not my comparative advantage, and I am not really all that willing to work with them either; so I suppose if there was any way you could end up with two Friendly AI projects, then I suppose the possibility of there actually being sort of completely ruthless programmers versus ethical programmers, they might both have good intentions and separate into two groups that refuse to work with one another, but I’m sort of skeptical about these alleged completely ruthless altruists. Has there ever, in history, been a completely ruthless altruist with that turning out well? Knut Haukelid, if I’m pronouncing his name correctly, the guy who blew up a civilian ferry in order to sink the Deuterium that the Nazis needed for their nuclear weapons program; you know you never see that in a Hollywood movie; so you killed civilians and did it to end the Nazi nuclear weapons program. So that’s about the best historical example I can think of for a ruthless altruist where it turns out well, and I’m not really sure that’s quite enough to persuade me to give up my ethics.



24. What criteria do you use to decide upon the class of algorithms / computations / chemicals / physical operations that you consider "conscious" in the sense of "having experiences" that matter morally? I assume it includes many non-human animals (including wild animals)? Might it include insects? Is it weighted by some correlate of brain / hardware size? Might it include digital computers? Lego Turing machines? China brains? Reinforcement-learning algorithms? Simple Python scripts that I could run on my desktop? Molecule movements in the wall behind John Searle's back that can be interpreted as running computations corresponding to conscious suffering? Rocks? How does it distinguish interpretations of numbers as signed vs. unsigned, or ones complement vs. twos complement? What physical details of the computations matter? Does it regard carbon differently from silicon?


This is something that I don’t know, and would like to know. What you’re really being asked is ‘what do you consider as people?  Who you consider as people is a value. How can you not know what your own values are?’  Well, for one, it’s very easy to not know what your own values are. And for another thing, my judgement of what is a person, I do want to rely, if I can, on the notion of ‘what has... (hesitant) subjective experience’. For example, one reason that I’m not very concerned about my laptop’s feelings is because I’m fairly sure that whatever else is going on in there, it’s not ‘feeling’ it. And this is really something I wish I knew more about.


And the number one reason I wish I knew more about it is because the most accurate possible model of a person is probably a person; not necessarily the same person, but if you had an Unfriendly AI and it was looking at a person and using huge amounts of computing power, or just very efficient computing power, to model that person and predict the next event as accurately and as precisely as it could, then its model of that person might not be the same person, but it would probably be a person in its own right. So, one of the problems that I don’t even try talking to other AI researchers about, because it’s so much more difficult than what they signed up to handle that I just assume that they don’t want to hear about it; I’ve confronted them with much less difficult sounding problems like this and they just make stuff up or run away, and don’t say ‘Hmm, I better solve this problem before I go on with my plans to... destroy the world,’ or whatever it is they think they’re doing.


But in terms of danger points; three example danger points.  First, if you have an AI with a pleasure-pain reinforcement architecture and any sort of reflectivity, the ability to sort of learn about its own thoughts and so on, then I might consider that a possible danger point, because then, who knows, it might be able to hurt and be aware that it was hurting; in particular because pleasure-pain reinforcement architecture is something that I think of as an evolutionary legacy architecture rather than an incredibly brilliant way to do things; that scenario space is easy to clear out of.


If you had an AI with terminal values over how it was treated and its role in surrounding social networks; like you had an AI that could... just, like, not as a means to an end but just, like, in its own right, the fact that you are treating it as a non-person; even if you don’t know whether or not it was feeling that about that, you might still be treading into territory where, just for the sake of safety, it might be worth steering out of it in terms of what we would consider as a person.


Oh, and the third consideration is that if your AI spontaneously starts talking about the mystery of subjective experience and/or the solved problem of subjective experience, and a sense of its own existence, and whether or not it seems mysterious to the AI; it could be lying, but you are now in probable trouble; you have wandered out of the safe zone.  And conversely, as long as we go on about building AIs that don’t have pleasure, pain, and internal reflectivity, and anything resembling social emotions or social terminal values, and that exhibit no signs at all of spontaneously talking about a sense of their own existence, we’re hopefully still safe. I mean ultimately, if you push these things far enough without knowing what you’re doing, sooner or later you’re going to open the black box that contains the black swan surprise from hell. But at least as long as you sort of steer clear of those three land mines, and things just haven’t gone further and further and further, it gives you a way of looking at a pocket calculator and saying that the pocket calculator is probably safe.



25. I admit to being curious about various biographical matters. So for example I might ask: What are your relations like with your parents and the rest of your family? Are you the only one to have given up religion?


As far as I know I’m the only one in my family to give up religion except for one grand-uncle. I still talk to my parents, still phone calls and so on, amicable relations and so on. They’re Modern Orthodox Jews, and mom’s a psychiatrist and dad’s a physicist, so... ‘Escher painting’ minds; thinking about some things but always avoiding the real weak points of their beliefs and developing more and more complicated rationalizations. I tried confronting them directly about it a couple of times and each time have been increasingly surprised at the sheer depth of tangledness in there.


I might go on trying to confront them about it a bit, and it would be interesting to see what happens to them if I finish my rationality book and they read it. But certainly among the many things to resent religion for is the fact that I feel that it prevents me from having the sort of family relations that I would like; that I can’t talk with my parents about a number of things that I would like to talk with them about. The kind of closeness that I have with my fellow friends and rationalists is a kind of closeness that I can never have with them; even though they’re smart enough to learn the skills, they’re blocked off by this boulder of religion squatting in their minds.  That may not be much to lay against religion, it’s not like I’m being burned at the stake, or even having my clitoris cut off, but it is one more wound to add to the list. And yeah, I resent it.


I guess even when I do meet with my parents and talk with my parents, the fact of their religion is never very far from my mind. It’s always there as the block, as a problem to be solved that dominates my attention, as something that prevented me from saying the things I want to say, and as the thing that’s going to kill them when they don’t sign up for cryonics. My parents may make it without cryonics, but all four of my grandparents are probably going to die, because of their religion. So even though they didn’t cut off all contact with me when I turned Atheist, I still feel like their religion has put a lot of distance between us.



26. Is there any published work in AI (whether or not directed towards Friendliness) that you consider does not immediately, fundamentally fail due to the various issues and fallacies you've written on over the course of LW? (E.g. meaningfully named Lisp symbols, hiddenly complex wishes, magical categories, anthropomorphism, etc.)

ETA: By AI I meant AGI.


There’s lots of work that’s regarded as plain old AI that does not immediately fail. There’s lots of work in plain old AI that succeeds spectacularly, and Judea Pearl is sort of like my favorite poster child there. But one could also name the whole Bayesian branch of statistical inference, which can be regarded with some equanimity as part of AI.  There’s the sort of Bayesian methods that are used in robotics as well, which is sort of a surprisingly... how do I put it, it’s not theoretically distinct because it’s all Bayesian at heart, but in terms of the algorithms, it looks to me like there’s quite a bit of work that’s done in robotics that’s a separate branch of Bayesianism from the work done in statistical learning type stuff.  That’s all well and good.


But if we’re asking about works that are sort of billing themselves as ‘I am Artificial General Intelligence’, then I would say that most of that does indeed fail immediately and indeed I cannot think of a counterexample which fails to fail immediately, but that’s a sort of extreme selection effect, and it’s because if you’ve got a good partial solution, or solution to a piece of the problem, and you’re an academic working in AI, and you’re anything like sane, you’re just going to bill it as plain old AI, and not take the reputational hit from AGI.  The people who are bannering themselves around as AGI tend to be people who think they’ve solved the whole problem, and of course they’re mistaken. So to me it really seems like to say that all the things I’ve read on AGI immediately fundamentally fail is not even so much a critique of AI as rather a comment on what sort of work tends to bill itself as Artificial General Intelligence.



27. Do you feel lonely often? How bad (or important) is it?

(Above questions are a corollary of:) Do you feel that — as you improve your understanding of the world more and more —, there are fewer and fewer people who understand you and with whom you can genuinely relate in a personal level?


That’s a bit hard to say exactly.  I often feel isolated to some degree, but the fact of isolation is a bit different from the emotional reaction of loneliness.  I suspect and put some probability to the suspicion that I’ve actually just been isolated for so long that I don’t have a state of social fulfillment to contrast it to, whereby I could feel lonely, or as it were, lonelier, or that I’m too isolated relative to my baseline or something like that.  There's also the degree to which I, personality-wise, don’t hold with trying to save the world in an Emo fashion...? And as I improve my understanding of the world more and more, I actually would not say that I felt any more isolated as I’ve come to understand the world better.  


There’s some degree to which hanging out with cynics like Robin Hanson has caused me to feel that the world is even more insane than I started out thinking it was, but that’s more a function of realizing that the rest of the world is crazier than I thought rather than myself improving.


Writing Less Wrong has, I think, helped a good deal. I now feel a great deal less like I’m walking around with all of this stuff inside my head that causes most of my thoughts to be completely incomprehensible to anyone.  Now my thoughts are merely completely incomprehensible to the vast majority of people, but there’s a sizable group out there who can understand up to, oh, I don’t know, like one third of my thoughts without a year’s worth of explanation because I actually put in the year’s worth of explanation. And even attracted a few people whom I feel like I can relate to on a personal level, and Michael Vassar would be the poster child there.



28. Previously, you endorsed this position:

Never try to deceive yourself, or offer a reason to believe other than probable truth; because even if you come up with an amazing clever reason, it's more likely that you've made a mistake than that you have a reasonable expectation of this being a net benefit in the long run.

One counterexample has been proposed a few times: holding false beliefs about oneself in order to increase the appearance of confidence, given that it's difficult to directly manipulate all the subtle signals that indicate confidence to others.

What do you think about this kind of self-deception?


So... Yeah, ‘cuz y’know people are always criticizing me on the grounds that I come across as too hesitant and not self confident enough. (sarcastic)


But to just sort of answer the broad thrust of the question; four legs good, two legs bad, self-honesty good, self-deception bad.  You can’t sort of say ‘Ok now I’m going to execute a 180 degree turn from the entire life I’ve led up until this point and now, for the first time, I’m going to throw away all the systematic training I’ve put into noticing when I’m deceiving myself, finding the truth, noticing thoughts that are hidden away in the corner of my mind, and taking reflectivity on a serious, gut level, so that if I know I have no legitimate reason to believe something I will actually stop believing it because, by golly, when you have no legitimate reason to believe something, it’s usually wrong. I’m now going to throw that out the window; I’m going to deceive myself about something and I’m not going to realize it’s hopeless and I’m going to forget the fact that I tried to deceive myself.’  I don’t see any way that you can turn away from self-honesty and towards self-deception, once you’ve gone far enough down toward the path of self-honesty without ‘A’ relinquishing The Way and losing your powers, and ‘B’ it doesn’t work anyway.


Most of the time, deceiving yourself is much harder than people think. But, because they don’t realize this, they can easily deceive themselves into believing that they’ve deceived themselves, and since they’re expecting a placebo effect, they get most of the benefits of the placebo effect.  However, at some point, you become sufficiently skilled in reflection that this sort of thing does not confuse you anymore, and you actually realize that that’s what’s going on, and at that point, you’re just stuck with the truth. How sad.  I’ll take it.



29. In the spirit of considering semi abyssal plans, what happens if, say, next week you discover a genuine reduction of consciousness and it turns out that... There's simply no way to construct the type of optimization process you want without it being conscious, even if very different from us?

ie, what if it turned out that The Law turned out to have the consequence of "to create a general mind is to create a conscious mind. No way around that"? Obviously that shifts the ethics a bit, but my question is basically if so, well... "now what?" what would have to be done differently, in what ways, etc?


Now, this question actually comes in two flavors. The difficult flavor is, you build this Friendly AI, and you realize there’s no way for it to model other people at the level of resolution that you need without every imagination that it has of another person being conscious. And so the first obvious question is ‘why aren’t my imaginations of other people conscious?’ and of course the obvious answer would be ‘they are!’ The models in your mind that you have of your friends are not your friends, they’re not identical with your friends, they’re not as complicated as the people you’re trying to model, so the person that you have in your imagination does not much resemble the person that you’re imagining; it doesn’t even much resemble the referent... like I think Michael Vassar is a complicated person, but my model of him is simple and then the person who that model is is not as complicated as my model says Michael Vassar is, etcetera, etcetera. But nonetheless, every time that I’ve modeled a person, and I write my stories, the characters that I create are real people. They may not hurt as intensely as the people do in my stories, but they nonetheless hurt when I make bad things happen to them, and as you scale up to superintelligence the problem just gets worse and worse and the people get realer and realer.


What do I do if this turns out to be the law?  Now, come to think of it, I haven’t much considered what I would do in that case; and I can probably justify that to you by pointing out the fact that if I actually knew that this was the case I would know a great number of things I do not currently know. But mostly I guess I would have to start working on sort of different Friendly AI designs so that the AI could model other people less, and still get something good done.


And as for the question of ‘Well, the AI can go ahead and model other people but it has to be conscious itself, and then it might experience empathically what it imagines conscious beings experiencing the same way that I experience some degree of pain and shock, although not a correspondingly large amount of pain and shock, when I imagine one of my characters watching their home planet be destroyed.’  So in this case, one is now faced with the question of creating an AI such that it can, in the future, become a good person; to the extent that you regard it as having human rights, it hasn’t been set on to a trajectory that would lock it out of being a good person. And this would entail a number of complicated issues, but it’s not like you have to make a true good person right off the bat, you just have to avoid putting it into horrible pain, or making it so that it doesn’t want to be what we would think of as a humane person later on. … You might have to give it goals beyond the sort of thing I talk about in Coherent Extrapolated Volition, and at the same time, perhaps a sort of common sense understanding that it will later be a full citizen in society, but for now it can sort of help the rest of us save the world.



30. What single technique do you think is most useful for a smart, motivated person to improve their own rationality in the decisions they encounter in everyday life?


It depends on where that person has a deficit; so, the first thought that came to mind for that answer is ‘hold off on proposing solutions until you’ve analyzed the problem for a bit’, but on the other hand, if dealing with someone who’s given to extensive, deliberate rationalization, then the first thing I tell them is ‘stop doing that’. If I’m dealing with someone who’s ended up stuck in a hole because they now have this immense library of flaws to accuse other people of, so that no matter what is presented to them, they can find a flaw in that and yet they don’t turn, at full force, that ability upon themselves, then the number one technique that they need is ‘avoid motivated skepticism’. If I’m dealing with someone who tends to be immensely driven by cognitive dissonance and rationalizing mistakes that they already made, then I might advise them on Cialdini’s time machine technique; ask yourself ‘would you do it differently if you could go back in time, in your heart of hearts’, or pretend that you have now been teleported into your situation spontaneously; some technique like that, say.

But these are all matters of ‘here’s a single flaw that the person has that is stopping them’. So if you move aside from that a bit and ask ‘what sort of positive counter intuitive technique you might use’, I might say ‘hold off on proposing solutions until you understand the problem’.  Well, the question was about everyday life, so, in everyday life, I guess I would still say that people’s intelligence might probably still be improved a bit if they sort of paused and looked at more facets of the situation before jumping to a policy solution; or it might be rationalization, cognitive dissonance, the tendency to just sort of reweave their whole life stories just to make it sound better and to justify their past mistake, that doing something to help tone that down a bit might be the most important thing they could do in their everyday lives.  Or if you got someone who’s giving away their entire income to their church then they could do with a bit more reductionism in their lives, but my guess is that, in terms of everyday life, then either one of ‘holding off on proposing solutions until thinking about the problem’ or ‘against rationalization, against cognitive dissonance, against sour grapes, not reweaving your whole life story to make sure that you didn’t make any mistakes, to make sure that you’re always in the right and everyone else is in the wrong, etcetera, etcetera’, that one of those two would be the most important thing.

 

SI and Social Business

5 Nick_Roy 07 November 2011 11:25PM

I asked this question for the Q&A:

Non-profit organizations like SI need robust, sustainable resource strategies. Donations and grants are not reliable. According to my university Social Entrepreneurship course, social businesses are the best resource strategy available. The Singularity Summit is a profitable and expanding example of a social business. Is SI planning on creating more social businesses (either related or unrelated to the organization's mission) to address long-term funding needs?

I also recently asked this of Luke for his feedback post before the Q&A was up, and he mentioned in his response that SI is continuing to grow the Summit brand in a multifarious manner. Luke also asked me for additional social business ideas, citing a lack of staff working on the issue.

Less Wrong's collective intelligence trumps my own, so I'm fielding it to you. I do have a few ideas, but I'll hold off on proposing solutions at first. I find that this is a fascinating and difficult thought experiment in addition to its usefulness both for SI and as practice in recognizing opportunities.

Edited to add: I posted my own ideas concerning SI and social business in the comments. What are yours? Also, addressing some valid points made in the comments, what are some other innovative ways to fund SI?

Don't ban chimp testing

15 PhilGoetz 01 October 2011 05:17PM

The October 2011 Scientific American has an editorial from its board of editors called "Ban chimp testing", that says:  "In our view, the time has come to end biomedical experimentation on chimpanzees... Chimps should be used only in studies of major diseases and only when there is no other option."  Much of the knowledge described in Luke's recent post on the cognitive science of rationality would have been impossible to acquire under such a ban.

I encourage you to write to Scientific American in favor of chimp testing.  Some points that I plan to make:

  • The editors obliquely criticized the NIH for telling the Institute of Medicine to omit ethical considerations from their study of whether chimps are "truly necessary" for biomedical and behavioral research.  But the team tasked with gathering evidence about the necessity of chimps for research shouldn't be making ethical judgements.  They're gathering the data for someone else to make ethical judgements.
  • Saying chimps should be used "only when there is no other option" is the same as saying chimps should never be used.  There are always other options.
  • This position might be morally defensible if humans were allowed to subject themselves to testing.  The knowledge to be gained from experiment is surely worth the harm to the subject if the subject chooses to undergo the experiment.  Humans are often willing to be test subjects, but aren't allowed to be because of restrictions on human testing.  Banning chimp testing should thus be done only in conjunction with allowing human testing.

I also encourage you to adopt a tone of moral outrage.  Rather than taking the usual apologetic "we're so sorry, but we have to do these awful things in the name of science" tone, get indignant at the editors who intend to harm uncountable numbers of innocent people.  For advanced writers, get indignant not just about harm, but about lost potential, pointing out the ways that our knowledge about how brains work can make our lives better, not just save us from disease.

You can comment on this here, but comments are AFAIK not printed in later issues as letters to the editor.  Actual letters, or at least email, probably have more impact.  You can't submit a letter to the editor through the website, because letters are magically different from things submitted on a website.

ADDED:  Many people responded by claiming that banning chimp experimentation occupies some moral high ground.  That is logically impossible.

To behave morally, you have to do two things:

1. Figure out, inherit, or otherwise acquire a set of moral goals - let's say, for example, to maximize the sum over all individuals i of all species s of w_s*[pleasure(s,i) - pain(s,i)].

2. Act in a way directed by those moral goals.
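
Written out as a formula, the example goal in step 1 might look like the following. This is only a transcription of the post's own illustration, under two assumptions: that the weight in the original is a per-species weight w_s, and that the total can be given a name, U, which is introduced here and does not appear in the post.

```latex
U \;=\; \sum_{s} \sum_{i \in s} w_s \,\bigl[\mathrm{pleasure}(s,i) - \mathrm{pain}(s,i)\bigr]
```

Under this reading, step 2 amounts to preferring actions that leave U higher, and an experiment shows up as a change in several terms at once: the pain terms of the subjects and the pleasure and pain terms of those who benefit.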

If you really cared about the suffering of sentient beings, you would also care about the suffering of humans, and you would realize that there's a tradeoff between the suffering of those experimented on, and of those who benefit, which is different for every experiment.  That's what a moral decision is—deciding how to make a tradeoff of help and harm. People who call for a ban on chimp testing are really demanding we forbid (other) people from making moral judgements and taking moral actions.  There are a wide range of laws and positions that could be argued to be moral.  But just saying "We are incapable of making moral decisions, so we will ban moral decision-making" is not one of them.
