Thoughts on How Consciousness Can Affect the World
Overview
Tentative Lemmas
Likely Complications
Friendly AI ideas needed: how would you ban porn?
To construct a friendly AI, you need to be able to make vague concepts crystal clear, cutting reality at the joints when those joints are obscure and fractal - and then implement a system that implements that cut.
There are lots of suggestions on how to do this, and a lot of work in the area. But having been over the same turf again and again, it's possible we've got a bit stuck in a rut. So to generate new suggestions, I'm proposing that we look at a vaguely analogous but distinctly different question: how would you ban porn?
Suppose you're put in charge of some government and/or legal system, and you need to ban pornography, and see that the ban is implemented. Pornography is the problem, not eroticism. So a lonely lower-class guy wanking off to "Fuck Slaves of the Caribbean XIV" in a Pussycat Theatre is completely off. But a middle-class couple experiencing a delicious frisson when they see a nude version of "Pirates of Penzance" at the Met is perfectly fine - commendable, even.
The distinction between the two cases is certainly not easy to spell out, and many are reduced to saying the equivalent of "I know it when I see it" when defining pornography. In terms of AI, this is equivalent to "value loading": refining the AI's values through interactions with human decision makers, who answer questions about edge cases and examples and serve as "learned judges" for the AI's concepts. But suppose that approach was not available to you - what methods would you implement to distinguish between pornography and eroticism, and ban one but not the other? Sufficiently clear that a scriptwriter would know exactly what they need to cut or add to a movie in order to move it from one category to the other? What if the nude "Pirates of Penzance" was at a Pussycat Theatre and "Fuck Slaves of the Caribbean XIV" was at the Met?
To get maximal creativity, it's best to ignore the ultimate aim of the exercise (to find inspirations for methods that could be adapted to AI) and just focus on the problem itself. Is it even possible to get a reasonable solution to this question - a question much simpler than designing a FAI?
Why don't more rationalists start startups?
My motivation behind this post stems from Aumann's agreement theorem. It seems that my opinions on startups differ from most of the rationality community, so I want to share my thoughts, and hear your thoughts, so we could reach a better conclusion.
I think that if you're smart and hard working, there's a pretty good chance that you achieve financial independence within a decade of the beginning of your journey to start a startup. And that's my conservative estimate.
"Achieve financial independence" only scratches the surface of the benefits of succeeding with a startup. If you're an altruist, you'll get to help a lot of other people too. And making millions of dollars will also allow you the leverage you need to make riskier investments with much higher expected values, allowing you to grow your money quickly so you could do more good.
A lot of this is predicated on my belief that you have a good chance at succeeding if you're smart and hardworking, so let me explain why I think this.
Along the lines of reductionism, "success with a startup" is an outcome (I guess we could define success as a $5-10M exit in under 10 years). And outcomes consist of their components. My argument consists of breaking the main outcome into its components, and then arguing that the components are all likely enough for the main outcome to be likely.
I think that the 4 components are:
- Devise an idea for a product that creates demand.
- Build it.
- Market and sell it.
- Things run smoothly (some might call this luck).
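The decomposition above can be made concrete with a little arithmetic. What follows is only a minimal sketch: it assumes the four components are independent, and the probabilities are invented for illustration (they are not estimates from this post).

```python
from math import prod

# Hypothetical, illustrative probabilities for each component.
# These numbers are assumptions, not estimates from the post.
components = {
    "idea": 0.8,
    "build": 0.8,
    "market_and_sell": 0.7,
    "things_run_smoothly": 0.7,
}


def success_probability(probs):
    """Probability that every component succeeds, assuming independence."""
    return prod(probs.values())


overall = success_probability(components)
print(f"Overall probability of success: {overall:.4f}")
```

Under the independence assumption, the overall probability can never exceed that of the weakest component, which is why each component has to be argued for separately.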
The Idea
Your idea has to be for a product or service (I'll just say product to keep things simple) that creates demand, and can be met profitably. In other words, make something people want (this article spells it out pretty well).
What could go wrong?
- Failure to think specifically about benefits. These articles explain what I mean by this better than I could.
- Failure to understand customers. To put yourself in their minds and understand what it is that they do and don't want. This is distinct from the first bullet point. You could have a specific benefit in mind, but be wrong about whether it's something your customer really wants (or about how badly they want it).
- Failure to research competitors. Maybe you came up with a great idea, but it turns out that it exists already.
- Our society doesn't have the technological or scientific progress necessary to build the product. For example, I have an idea for a machine that teleports you from one place to another. Unfortunately, we as a society aren't at a point where someone could build that.
- You personally don't have the skills to build it.
- You don't work hard enough. Maybe you try, and find that you don't have the willpower. Maybe you try, find that you do have the willpower, but realize that the amount of work it takes isn't worth it to you.
- You can't find people with the skills to work on it with you (cofounders).
- You can't raise money from investors to hire people to help you build it.
- The people you work with/hire aren't good enough to build the product you envisioned.
- You're unable to communicate clearly to your customers what benefits they'll be receiving if they use your product.
- You're unable to persuade them. (There are other elements to persuasion aside from clear communication).
- You didn't reach enough people. Maybe you didn't advertise enough. Maybe you thought word would spread, and it didn't.
- You're having distribution problems (delivering the product to your customer).
- PR problems. Something goes wrong and you obtain a bad reputation.
- Legal issues (current). Maybe you did something illegal and didn't realize it (ex. copyright infringement), and sanctions or a lawsuit killed your startup.
- Legal issues (future). Maybe new laws were enacted that killed your startup.
- Something in your personal life goes wrong that requires you to quit.
- Your competitors innovate and beat you out. Or a big company decides to enter the market, and crushes you.
- Scientific findings lead to your product being obsolete.
- Macroeconomic conditions change, which somehow leads to people not wanting your product.
- Political/social conditions lead to people not wanting your product.
[Link] Distance from Harvard
Related: Loss of local knowledge affecting intellectual trends, The Hyborian Age
This post is from Gregory Cochran's and Henry Harpending's excellent blog West Hunter.
Barry Marshall once said that if he had gone to Harvard, he would have known that stomach ulcers were caused by stress, and wouldn’t even have considered the possibility that they might be caused by a bacterium. There are a number of other important innovators that sure look as if they benefited from living as far as possible from the sources of establishment opinion. Back when continental drift was officially nonsense, quite a few geologists in South Africa and Australia thought it must be correct – partly because there are local geological facts that are hard to explain any other way (like ancient glacial moraines in Australia whose rocks originated in South Africa) but also because physical distance translates into mental distance.
Of course this does not always work – distance is useful, but not sufficient. Indonesia is pretty far from Harvard, but is a vast wasteland, intellectually. Ideally, you want a country full of people drawn from the populations that actually produce creative thinkers (Europeans, mostly) instead of the populations that ought to but don’t. And it should be really, really far away.
With the Internet and cell phones and all that, psychological isolation is harder to find. Once even California had some thoughts of its own, but that day is long past. If we want to keep progress from stalling out, we need people that don’t get sucked into the usual crap – because they can’t.
The only real solution is interstellar colonization: the speed of light is your friend. A generation ship might do the job - even if it never arrived. It would be out there for hundreds of years, years in which the inhabitants could go their own way. Some of the ships would be boring, some of them would go crazy – but at least they’d be different.
Leveling up...
I just figured out how to use the local banking system and I will be able for the first time to pay my rent from my actual salary received in this country (as opposed to savings from my previous life). Also I always hated shopping for groceries and now I can do it without much pain, because I've found a way that works for me. As I reflect on this, an expression comes to my mind: "leveling up." I don't have the same problems any more which I used to have. I grow and face new challenges.
Did you, fellow rationalists and transhumanists, ever have that feeling? Any particular accomplishments, big or small, that made you feel you're advancing? No matter how fast or slow, in big steps or tiny, but firmly forward!?
I'm thrilled to read your stories!
The Argument From Marginal Cases
The argument from marginal cases claims that you can't both think that humans matter morally and that animals don't, because no reasonable set of criteria for moral worth cleanly separates all humans from all animals. For example, perhaps someone says that suffering only matters when it happens to something that has some bundle of capabilities like linguistic ability, compassion, and/or abstract reasoning. If livestock don't have these capabilities, however, then some people such as very young children probably don't either.
This is a strong argument, and it avoids the noncentral fallacy. Any set of qualities you value is going to vary over people and animals, and if you make a continuum there's not going to be a place you can draw a line that will fall above all animals and below all people. So why do I treat humans as the only entities that count morally?
If you asked me how many chickens I would be willing to kill to save your life, the answer is effectively "all of them". [1] This pins down two points on the continuum that I'm clear on: you and chickens. While I'm uncertain where along there things start getting up to significant levels, I think it's probably somewhere that includes no or almost no animals but nearly all humans. Making this distinction among humans, however, would be incredibly socially destructive, especially given how unsure I am about where the line should go, and so I think we end up with a much better society if we treat all humans as morally equal. This means I end up saying things like "value all humans equally; don't value animals" when that's not my real distinction, just the closest Schelling point.
[1] Chicken extinction would make life worse for many other people, so I wouldn't actually do that, but not because of the effect on the chickens.
I also posted this on my blog.
The idiot savant AI isn't an idiot
A stub on a point that's come up recently.
If I owned a paperclip factory, and casually told my foreman to improve efficiency while I'm away, and he planned a takeover of the country, aiming to devote its entire economy to paperclip manufacturing (apart from the armament factories he needed to invade neighbouring countries and steal their iron mines)... then I'd conclude that my foreman was an idiot (or being wilfully idiotic). He obviously had no idea what I meant. And if he misunderstood me so egregiously, he's certainly not a threat: he's unlikely to reason his way out of a paper bag, let alone to any position of power.
If I owned a paperclip factory, and casually programmed my superintelligent AI to improve efficiency while I'm away, and it planned a takeover of the country... then I can't conclude that the AI is an idiot. It is following its programming. Unlike a human that behaved the same way, it probably knows exactly what I meant to program in. It just doesn't care: it follows its programming, not its knowledge about what its programming is "meant" to be (unless we've successfully programmed in "do what I mean", which is basically the whole of the challenge). We can't therefore conclude that it's incompetent, unable to understand human reasoning, or likely to fail.
We can't reason by analogy with humans. When AIs behave like idiot savants with respect to their motivations, we can't deduce that they're idiots.
The autopilot problem: driving without experience
Consider a mixed system, in which an automated system is paired with a human overseer. The automated system handles most of the routine tasks, while the overseer is tasked with looking out for errors and taking over in extreme or unpredictable circumstances. Examples of this could be autopilots, cruise control, GPS direction finding, high-frequency trading – in fact nearly every automated system has this feature, because they nearly all rely on humans "keeping an eye on things".
But often the human component doesn't perform as well as it should do – doesn't perform as well as it did before part of the system was automated. Cruise control can impair driver performance, leading to more accidents. GPS errors can take people far more off course than following maps did. When the autopilot fails, pilots can crash their planes in rather conventional conditions. Traders don't understand why their algorithms misbehave, or how to stop this.
There seem to be three factors at work here:
- Firstly, if the automation performs flawlessly, the overseers will become complacent, blindly trusting the instruments and failing to perform basic sanity checks. They will have far less procedural understanding of what's actually going on, since they have no opportunity to exercise their knowledge.
- This goes along with a general deskilling of the overseer. When the autopilot controls the plane for most of its trip, pilots get far less hands-on experience of actually flying the plane. Paradoxically, less efficient automation can help with both these problems: if the system fails 10% of the time, the overseer will watch and understand it closely.
- And when the automation does fail, the overseer will typically lack situational awareness of what's going on. All they know is that something extraordinary has happened, and they may have the (possibly flawed) readings of various instruments to guide them – but they won't have a good feel for what happened to put them in that situation.
So, when the automation fails, the overseer is generally dumped into an emergency situation, whose nature they are going to have to deduce, and, using skills that have atrophied, they are going to have to take on the task of the automated system that has never failed before and that they have never had to truly understand.
And they'll typically get blamed for getting it wrong.
Similarly, if we design AI control mechanisms that rely on the presence of a human in the loop (such as tool AIs, Oracle AIs, and, to a lesser extent, reduced impact AIs), we'll need to take the autopilot problem into account, and design the role of the overseer so as not to deskill them, and not count on them being free of error.
Being Half-Rational About Pascal's Wager is Even Worse
For so long as I can remember, I have rejected Pascal's Wager in all its forms on sheerly practical grounds: anyone who tries to plan out their life by chasing a 1 in 10,000 chance of a huge payoff is almost certainly doomed in practice. This kind of clever reasoning never pays off in real life...
...unless you have also underestimated the allegedly tiny chance of the large impact.
For example. At one critical junction in history, Leo Szilard, the first physicist to see the possibility of fission chain reactions and hence practical nuclear weapons, was trying to persuade Enrico Fermi to take the issue seriously, in the company of a more prestigious friend, Isidor Rabi:
I said to him: "Did you talk to Fermi?" Rabi said, "Yes, I did." I said, "What did Fermi say?" Rabi said, "Fermi said 'Nuts!'" So I said, "Why did he say 'Nuts!'?" and Rabi said, "Well, I don't know, but he is in and we can ask him." So we went over to Fermi's office, and Rabi said to Fermi, "Look, Fermi, I told you what Szilard thought and you said 'Nuts!' and Szilard wants to know why you said 'Nuts!'" So Fermi said, "Well… there is the remote possibility that neutrons may be emitted in the fission of uranium and then of course perhaps a chain reaction can be made." Rabi said, "What do you mean by 'remote possibility'?" and Fermi said, "Well, ten per cent." Rabi said, "Ten per cent is not a remote possibility if it means that we may die of it. If I have pneumonia and the doctor tells me that there is a remote possibility that I might die, and it's ten percent, I get excited about it." (Quoted in 'The Making of the Atomic Bomb' by Richard Rhodes.)
This might look at first like a successful application of "multiplying a low probability by a high impact", but I would reject that this was really going on. Where the heck did Fermi get that 10% figure for his 'remote possibility', especially considering that fission chain reactions did in fact turn out to be possible? If some sort of reasoning had told us that a fission chain reaction was improbable, then after it turned out to be reality, good procedure would have us go back and check our reasoning to see what went wrong, and figure out how to adjust our way of thinking so as to not make the same mistake again. So far as I know, there was no physical reason whatsoever to think a fission chain reaction was only a ten percent probability. They had not been demonstrated experimentally, to be sure; but they were still the default projection from what was already known. If you'd been told in the 1930s that fission chain reactions were impossible, you would've been told something that implied new physical facts unknown to current science (and indeed, no such facts existed). After reading enough historical instances of famous scientists dismissing things as impossible when there was no physical logic to say that it was even improbable, one cynically suspects that some prestigious scientists perhaps came to conceive of themselves as senior people who ought to be skeptical about things, and that Fermi was just reacting emotionally. The lesson I draw from this historical case is not that it's a good idea to go around multiplying ten percent probabilities by large impacts, but that Fermi should not have pulled out a number as low as ten percent.
Having seen enough conversations involving made-up probabilities to become cynical, I also strongly suspect that if Fermi had foreseen how Rabi would reply, Fermi would've said "One percent". If Fermi had expected Rabi to say "One percent is not small if..." then Fermi would've said "One in ten thousand" or "Too small to consider" - whatever he thought would get him off the hook. Perhaps I am being too unkind to Fermi, who was a famously great estimator; Fermi may well have performed some sort of lawful probability estimate on the spot. But Fermi is also the one who said that nuclear energy was fifty years off in the unlikely event it could be done at all, two years (IIRC) before Fermi himself oversaw the construction of the first nuclear pile. Where did Fermi get that fifty-year number from? This sort of thing does make me more likely to believe that Fermi, in playing the role of the solemn doubter, was just Making Things Up; and this is no less a sin when you make up skeptical things. And if this cynicism is right, then we cannot learn the lesson that it is wise to multiply small probabilities by large impacts because this is what saved Fermi - if Fermi had known the rule, if he had seen it coming, he would have just Made Up an even smaller probability to get himself off the hook. It would have been so very easy and convenient to say, "One in ten thousand, there's no experimental proof and most ideas like that are wrong! Think of all the conjunctive probabilities that have to be true before we actually get nuclear weapons and our own efforts actually made a difference in that!" followed shortly by "But it's not practical to be worried about such tiny probabilities!" Or maybe Fermi would've known better, but even so I have never been a fan of trying to have two mistakes cancel each other out.
I mention all this because it is dangerous to be half a rationalist, and only stop making one of the two mistakes. If you are going to reject impractical 'clever arguments' that would never work in real life, and henceforth not try to multiply tiny probabilities by huge payoffs, then you had also better reject all the clever arguments that would've led Fermi or Szilard to assign probabilities much smaller than ten percent. (Listing out a group of conjunctive probabilities leading up to taking an important action, and not listing any disjunctive probabilities, is one widely popular way of driving down the apparent probability of just about anything.) Or if you would've tried to put fission chain reactions into a reference class of 'amazing new energy sources' and then assigned it a tiny probability, or put Szilard into the reference class of 'people who think the fate of the world depends on them', or pontificated about the lack of any positive experimental evidence proving that a chain reaction was possible, blah blah blah etcetera - then your error here can perhaps be compensated for by the opposite error of then trying to multiply the resulting tiny probability by a large impact. I don't like making clever mistakes that cancel each other out - I consider that idea to also be clever - but making clever mistakes that don't cancel out is worse.
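The parenthetical about conjunctive and disjunctive probabilities can be illustrated numerically. This is only a sketch with invented numbers, and it assumes all steps and routes are independent: listing five steps that must all hold drives the apparent probability down, while remembering that there may be several independent routes to the same outcome pushes it back up.

```python
from math import prod


def conjunctive(probs):
    """Probability that all independent steps hold."""
    return prod(probs)


def disjunctive(probs):
    """Probability that at least one independent route works."""
    return 1 - prod(1 - p for p in probs)


# Five hypothetical steps, each judged 50% likely (made-up numbers).
steps = [0.5] * 5
single_path = conjunctive(steps)

# Ten hypothetical independent routes, each as unlikely as the single path.
routes = [single_path] * 10
any_route = disjunctive(routes)

print(f"Single conjunctive path: {single_path:.5f}")
print(f"At least one of ten routes: {any_route:.3f}")
```

Listing only the conjunctive steps reports the first number; counting the disjunctive routes as well reports the second, which is considerably larger.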
On the other hand, if you want a general heuristic that could've led Fermi to do better, I would suggest reasoning that previous-historical experimental proof of a chain reaction would not strongly be expected even in worlds where it was possible, and that to discover a chain reaction to be impossible would imply learning some new fact of physical science which was not already known. And this is not just 20-20 hindsight; Szilard and Rabi saw the logic in advance of the fact, not just afterward - though not in those exact terms; they just saw the physical logic, and then didn't adjust it downward for 'absurdity' or with more complicated rationalizations. But then if you are going to take this sort of reasoning at face value, without adjusting it downward, then it's probably not a good idea to panic every time you assign a 0.01% probability to something big - you'll probably run into dozens of things like that, at least, and panicking over them would leave no room to wait until you found something whose face-value probability was large.
I don't believe in multiplying tiny probabilities by huge impacts. But I also believe that Fermi could have done better than saying ten percent, and that it wasn't just random luck mixed with overconfidence that led Szilard and Rabi to assign higher probabilities than that. Or to name a modern issue which is still open, Michael Shermer should not have dismissed the possibility of molecular nanotechnology, and Eric Drexler will not have been randomly lucky when it turns out to work: taking current physical models at face value implies that molecular nanotechnology ought to work, and if it doesn't work we've learned some new fact unknown to present physics, etcetera. Taking the physical logic at face value is fine, and there's no need to adjust it downward for any particular reason; if you say that Eric Drexler should 'adjust' this probability downward for whatever reason, then I think you're giving him rules that predictably give him the wrong answer. Sometimes surface appearances are misleading, but most of the time they're not.
A key test I apply to any supposed rule of reasoning about high-impact scenarios is, "Does this rule screw over the planet if Reality actually hands us a high-impact scenario?" and if the answer is yes, I discard it and move on. The point of rationality is to figure out which world we actually live in and adapt accordingly, not to rule out certain sorts of worlds in advance.
There's a doubly-clever form of the argument wherein everyone in a plausibly high-impact position modestly attributes only a tiny potential possibility that their face-value view of the world is sane, and then they multiply this tiny probability by the large impact, and so they act anyway and on average worlds in trouble are saved. I don't think this works in real life - I don't think I would have wanted Leo Szilard to think like that. I think that if your brain really actually thinks that fission chain reactions have only a tiny probability of being important, you will go off and try to invent better refrigerators or something else that might make you money. And if your brain does not really feel that fission chain reactions have a tiny probability, then your beliefs and aliefs are out of sync and that is not something I want to see in people trying to handle the delicate issue of nuclear weapons. But in any case, I deny the original premise: I do not think the world's niches for heroism must be populated by heroes who are incapable in principle of reasonably distinguishing themselves from a population of crackpots, all of whom have no choice but to continue on the tiny off-chance that they are not crackpots.
I haven't written enough about what I've begun thinking of as 'heroic epistemology' - why, how can you possibly be so overconfident as to dare even try to have a huge positive impact when most people in that reference class blah blah blah - but on reflection, it seems to me that an awful lot of my answer boils down to not trying to be clever about it. I don't multiply tiny probabilities by huge impacts. I also don't get tiny probabilities by putting myself into inescapable reference classes, for this is the sort of reasoning that would screw over planets that actually were in trouble if everyone thought like that. In the course of any workday, on the now very rare occasions I find myself thinking about such meta-level junk instead of the math at hand, I remind myself that it is a wasted motion - where a 'wasted motion' is any thought which will, in retrospect if the problem is in fact solved, not have contributed to having solved the problem. If someday Friendly AI is built, will it have been terribly important that someone have spent a month fretting about what reference class they're in? No. Will it, in retrospect, have been an important step along the pathway to understanding stable self-modification, if we spend time trying to solve the Lobian obstacle? Possibly. So one of these cognitive avenues is predictably a wasted motion in retrospect, and one of them is not. The same would hold if I spent a lot of time trying to convince myself that I was allowed to believe that I could affect anything large, or any other form of angsting about meta. It is predictable that in retrospect I will think this was a waste of time compared to working on a trust criterion between a probability distribution and an improved probability distribution. (Apologies, this is a technical thingy I'm currently working on which has no good English description.)
But if you must apply clever adjustments to things, then for Belldandy's sake don't be one-sidedly clever and have all your cleverness be on the side of arguments for inaction. I think you're better off without all the complicated fretting - but you're definitely not better off eliminating only half of it.
And finally, I once again state that I abjure, refute, and disclaim all forms of Pascalian reasoning and multiplying tiny probabilities by large impacts when it comes to existential risk. We live on a planet with upcoming prospects of, among other things, human intelligence enhancement, molecular nanotechnology, sufficiently advanced biotechnology, brain-computer interfaces, and of course Artificial Intelligence in several guises. If something has only a tiny chance of impacting the fate of the world, there should be something with a larger probability of an equally huge impact to worry about instead. You cannot justifiably trade off tiny probabilities of x-risk improvement against efforts that do not effectuate a happy intergalactic civilization, but there is nonetheless no need to go on tracking tiny probabilities when you'd expect there to be medium-sized probabilities of x-risk reduction. Nonetheless I try to avoid coming up with clever reasons to do stupid things, and one example of a stupid thing would be not working on Friendly AI when it's in blatant need of work. Elaborate complicated reasoning which says we should let the Friendly AI issue just stay on fire and burn merrily away, well, any complicated reasoning which returns an output this silly is automatically suspect.
If, however, you are unlucky enough to have been cleverly argued into obeying rules that make it a priori unreachable-in-practice for anyone to end up in an epistemic state where they try to do something about a planet which appears to be on fire - so that there are no more plausible x-risk reduction efforts to fall back on, because you're adjusting all the high-impact probabilities downward from what the surface state of the world suggests...
Well, that would only be a good idea if Reality were not allowed to hand you a planet that was in fact on fire. Or if, given a planet on fire, Reality was prohibited from handing you a chance to put it out. There is no reason to think that Reality must a priori obey such a constraint.
EDIT: To clarify, "Don't multiply tiny probabilities by large impacts" is something that I apply to large-scale projects and lines of historical probability. On a very large scale, if you think FAI stands a serious chance of saving the world, then humanity should dump a bunch of effort into it, and if nobody's dumping effort into it then you should dump more effort than currently into it. On a smaller scale, to compare two x-risk mitigation projects in demand of money, you need to estimate something about marginal impacts of the next added effort (where the common currency of utilons should probably not be lives saved, but "probability of an ok outcome", i.e., the probability of ending up with a happy intergalactic civilization). In this case the average marginal added dollar can only account for a very tiny slice of probability, but this is not Pascal's Wager. Large efforts with a success-or-failure criterion are rightly, justly, and unavoidably going to end up with small marginally increased probabilities of success per added small unit of effort. It would only be Pascal's Wager if the whole route-to-an-OK-outcome were assigned a tiny probability, and then a large payoff used to shut down further discussion of whether the next unit of effort should go there or to a different x-risk.
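To make the point about marginal impacts concrete, here is a rough sketch of comparing two projects by the probability added per extra unit of funding. Everything in it is an assumption made for illustration: the diminishing-returns curve, the parameter names (`p_max`, `scale`), and all of the numbers are hypothetical and do not come from this post.

```python
import math


def success_probability(funding, p_max, scale):
    """Hypothetical diminishing-returns curve: probability of an OK
    outcome as a function of total funding (an assumed model)."""
    return p_max * (1 - math.exp(-funding / scale))


def marginal_gain(funding, extra, p_max, scale):
    """Probability added by spending `extra` on top of `funding`."""
    before = success_probability(funding, p_max, scale)
    after = success_probability(funding + extra, p_max, scale)
    return after - before


# Two made-up projects that differ only in how much funding they already have.
project_a = dict(funding=1e6, p_max=0.3, scale=5e7)
project_b = dict(funding=2e7, p_max=0.3, scale=5e7)

extra = 1000.0  # the next marginal donation, in dollars
gain_a = marginal_gain(extra=extra, **project_a)
gain_b = marginal_gain(extra=extra, **project_b)

print(f"Project A marginal gain: {gain_a:.2e}")
print(f"Project B marginal gain: {gain_b:.2e}")
```

In this toy model both marginal gains are tiny slices of probability, even though each project as a whole is assigned a substantial chance of mattering. The comparison between them is ordinary marginal analysis rather than a Pascalian argument.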
Problems in Education
Post will be returning in Main, after a rewrite by the company's writing staff. Citations Galore.