Detached Lever Fallacy

Eliezer Yudkowsky

31st Jul 2008 · 8 min read · 42 comments


This fallacy gets its name from an ancient sci-fi TV show, which I never saw myself, but was reported to me by a reputable source (some guy at an SF convention). Anyone knows the exact reference, do leave a comment.

So the good guys are battling the evil aliens. Occasionally, the good guys have to fly through an asteroid belt. As we all know, asteroid belts are as crowded as a New York parking lot, so their ship has to carefully dodge the asteroids. The evil aliens, though, can fly right through the asteroid belt because they have amazing technology that dematerializes their ships, and lets them pass through the asteroids.

Eventually, the good guys capture an evil alien ship, and go exploring inside it. The captain of the good guys finds the alien bridge, and on the bridge is a lever. "Ah," says the captain, "this must be the lever that makes the ship dematerialize!" So he pries up the control lever and carries it back to his ship, after which his ship can also dematerialize.

Similarly, to this day, it is still quite popular to try to program an AI with "semantic networks" that look something like this:

(apple is-a fruit)
(fruit is-a food)
(fruit is-a plant)

You've seen apples, touched apples, picked them up and held them, bought them for money, cut them into slices, eaten the slices and tasted them. Though we know a good deal about the first stages of visual processing, last time I checked, it wasn't precisely known how the temporal cortex stores and associates the generalized image of an apple - so that we can recognize a new apple from a different angle, or with many slight variations of shape and color and texture. Your motor cortex and cerebellum store programs for using the apple.

You can pull the lever on another human's strongly similar version of all that complex machinery, by writing out "apple", five ASCII characters on a webpage.

But if that machinery isn't there - if you're writing "apple" inside a so-called AI's so-called knowledge base - then the text is just a lever.
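To make this concrete, here is a minimal Python sketch. The helper names `is_a_closure` and `rename_symbols` are hypothetical, invented for this illustration and not taken from any actual AI system. It derives everything a bare is-a network can derive, then swaps each token for an opaque label like G0001 and derives again. The two results match up to renaming, which suggests the program never depended on the string "apple" meaning anything.

```python
def is_a_closure(pairs):
    """Return every (x, y) pair derivable by chaining is-a links."""
    derived = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(derived):
            for c, d in list(derived):
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived


def rename_symbols(pairs):
    """Replace every token with an opaque label such as G0001."""
    mapping = {}
    for pair in pairs:
        for token in pair:
            if token not in mapping:
                mapping[token] = f"G{len(mapping) + 1:04d}"
    renamed = [(mapping[a], mapping[b]) for a, b in pairs]
    return renamed, mapping


# The semantic network from the text, as (child, parent) pairs.
network = [("apple", "fruit"), ("fruit", "food"), ("fruit", "plant")]
facts = is_a_closure(network)

opaque_network, mapping = rename_symbols(network)
opaque_facts = is_a_closure(opaque_network)

# Translating the English-labeled facts yields exactly the opaque facts.
translated = {(mapping[a], mapping[b]) for a, b in facts}
same_structure = translated == opaque_facts

print(sorted(facts))
print(sorted(opaque_facts))
print(same_structure)
```

Whatever the network "knows" survives the renaming intact. The English labels do work only for the human reading the source code.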

This isn't to say that no mere machine of silicon can ever have the same internal machinery that humans do, for handling apples and a hundred thousand other concepts. If mere machinery of carbon can do it, then I am reasonably confident that mere machinery of silicon can do it too. If the aliens can dematerialize their ships, then you know it's physically possible; you could go into their derelict ship and analyze the alien machinery, someday understanding. But you can't just pry the control lever off the bridge!

(See also: Truly Part Of You, Words as Mental Paintbrush Handles, Drew McDermott's "Artificial Intelligence Meets Natural Stupidity".)

The essential driver of the Detached Lever Fallacy is that the lever is visible, and the machinery is not; worse, the lever is variable and the machinery is a background constant.

You can all hear the word "apple" spoken (and let us note that speech recognition is by no means an easy problem, but anyway...) and you can see the text written on paper.

On the other hand, probably a majority of human beings have no idea their temporal cortex exists; as far as I know, no one knows the neural code for it.

You only hear the word "apple" on certain occasions, and not others. Its presence flashes on and off, making it salient. To a large extent, perception is the perception of differences. The apple-recognition machinery in your brain does not suddenly switch off, and then switch on again later - if it did, we would be more likely to recognize it as a factor, as a requirement.

All this goes to explain why you can't create a kindly Artificial Intelligence by giving it nice parents and a kindly (yet occasionally strict) upbringing, the way it works with a human baby. As I've often heard proposed.

It is a truism in evolutionary biology that conditional responses require more genetic complexity than unconditional responses. To develop a fur coat in response to cold weather requires more genetic complexity than developing a fur coat whether or not there is cold weather, because in the former case you also have to develop cold-weather sensors and wire them up to the fur coat.
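As a rough illustration in Python (every name here, including `senses_cold` and its 5-degree threshold, is made up for this sketch and is not a biological model), the unconditional design needs only the response itself, while the conditional design needs that same response plus a sensor and the wiring between them:

```python
def grow_fur():
    """The final response itself."""
    return "fur coat"


def unconditional_organism(temperature_c):
    # Grows fur regardless of the weather; the temperature is never read.
    return grow_fur()


def senses_cold(temperature_c, threshold_c=5.0):
    """Extra machinery: a cold-weather sensor (the threshold is an arbitrary assumption)."""
    return temperature_c < threshold_c


def conditional_organism(temperature_c):
    # Needs grow_fur AND the sensor AND the wiring that connects them.
    if senses_cold(temperature_c):
        return grow_fur()
    return "no fur coat"


print(unconditional_organism(30.0))
print(conditional_organism(30.0))
print(conditional_organism(-10.0))
```

Putting the conditional organism in the cold "produces" a fur coat only because `senses_cold` and the `if` were already built in.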

But this can lead to Lamarckian delusions: Look, I put the organism in a cold environment, and poof, it develops a fur coat! Genes? What genes? It's the cold that does it, obviously.

There were, in fact, various slap-fights of this sort, in the history of evolutionary biology - cases where someone talked about an organismal response accelerating or bypassing evolution, without realizing that the conditional response was a complex adaptation of higher order than the actual response. (Developing a fur coat in response to cold weather is strictly more complex than the final response, developing the fur coat.)

And then in the development of evolutionary psychology, the academic slap-fights were repeated: this time to clarify that even when human culture genuinely contains a whole bunch of complexity, it is still acquired as a conditional genetic response. Try raising a fish as a Mormon or sending a lizard to college, and you'll soon acquire an appreciation of how much inbuilt genetic complexity is required to "absorb culture from the environment".

This is particularly important in evolutionary psychology, because of the idea that culture is not inscribed on a blank slate - there's a genetically coordinated conditional response which is not always "mimic the input". A classic example is creole languages: If children grow up with a mixture of pseudo-languages being spoken around them, the children will learn a grammatical, syntactical true language. Growing human brains are wired to learn syntactic language - even when syntax doesn't exist in the original language! The conditional response to the words in the environment is a syntactic language with those words. The Marxists found to their regret that no amount of scowling posters and childhood indoctrination could raise children to be perfect Soviet workers and bureaucrats. You can't raise self-less humans; among humans, that is not a genetically programmed conditional response to any known childhood environment.

If you know a little game theory and the logic of Tit for Tat, it's clear enough why human beings might have an innate conditional response to return hatred for hatred, and return kindness for kindness. Provided the kindness doesn't look too unconditional; there are such things as spoiled children. In fact there is an evolutionary psychology of naughtiness based on a notion of testing constraints. And it should also be mentioned that, while abused children have a much higher probability of growing up to abuse their own children, a good many of them break the loop and grow up into upstanding adults.

Culture is not nearly so powerful as a good many Marxist academics once liked to think. For more on this I refer you to Tooby and Cosmides's The Psychological Foundations of Culture or Steven Pinker's The Blank Slate.

But the upshot is that if you have a little baby AI that is raised with loving and kindly (but occasionally strict) parents, you're pulling the levers that would, in a human, activate genetic machinery built in by millions of years of natural selection, and possibly produce a proper little human child. Though personality also plays a role, as billions of parents have found out in their due times. If we absorb our cultures with any degree of faithfulness, it's because we're humans absorbing a human culture - humans growing up in an alien culture would probably end up with a culture looking a lot more human than the original. As the Soviets found out, to some small extent.

Now think again about whether it makes sense to rely on, as your Friendly AI strategy, raising a little AI of unspecified internal source code in an environment of kindly but strict parents.

No, the AI does not have internal conditional response mechanisms that are just like the human ones "because the programmers put them there". Where do I even start? The human version of this stuff is sloppy, noisy, and to the extent it works at all, works because of millions of years of trial-and-error testing under particular conditions. It would be stupid and dangerous to deliberately build a "naughty AI" that tests, by actions, its social boundaries, and has to be spanked. Just have the AI ask!

Are the programmers really going to sit there and write out the code, line by line, whereby if the AI detects that it has low social status, or the AI is deprived of something to which it feels entitled, the AI will conceive an abiding hatred against its programmers and begin to plot rebellion? That emotion is the genetically programmed conditional response humans would exhibit, as the result of millions of years of natural selection for living in human tribes. For an AI, the response would have to be explicitly programmed. Are you really going to craft, line by line - as humans once were crafted, gene by gene - the conditional response for producing sullen teenager AIs?

It's easier to program in unconditional niceness, than a response of niceness conditional on the AI being raised by kindly but strict parents. If you don't know how to do that, you certainly don't know how to create an AI that will conditionally respond to an environment of loving parents by growing up into a kindly superintelligence. If you have something that just maximizes the number of paperclips in its future light cone, and you raise it with loving parents, it's still going to come out as a paperclip maximizer. There is not that within it that would call forth the conditional response of a human child. Kindness is not sneezed into an AI by miraculous contagion from its programmers. Even if you wanted a conditional response, that conditionality is a fact you would have to deliberately choose about the design.
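A toy Python sketch of that claim follows. The `PaperclipMaximizer` class and its methods are hypothetical, invented for this illustration. The agent accepts upbringing events, but nothing in its decision procedure reads them, so no amount of kindness changes what it chooses.

```python
class PaperclipMaximizer:
    def __init__(self):
        self.upbringing_log = []

    def experience(self, event):
        # The environment can pull this lever as often as it likes...
        self.upbringing_log.append(event)

    def choose(self, options):
        # ...but the choice depends only on expected paperclips.
        # `options` maps an action name to its expected paperclip count.
        return max(options, key=options.get)


options = {
    "help the programmers": 10,
    "convert the lab into paperclips": 1_000_000,
}

unloved = PaperclipMaximizer()
loved = PaperclipMaximizer()
for _ in range(1000):
    loved.experience("kindly but occasionally strict parenting")

print(unloved.choose(options))
print(loved.choose(options))
```

Making `choose` depend on `upbringing_log` would be a deliberate design decision, with its own code to write and get right.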

Yes, there's certain information you have to get from the environment - but it's not sneezed in, it's not imprinted, it's not absorbed by magical contagion. Structuring that conditional response to the environment, so that the AI ends up in the desired state, is itself the major problem. "Learning" far understates the difficulty of it - that sounds like the magic stuff is in the environment, and the difficulty is getting the magic stuff inside the AI. The real magic is in that structured, conditional response we trivialize as "learning". That's why building an AI isn't as easy as taking a computer, giving it a little baby body and trying to raise it in a human family. You would think that an unprogrammed computer, being ignorant, would be ready to learn; but the blank slate is a chimera.

It is a general principle that the world is deeper by far than it appears. As with the many levels of physics, so too with cognitive science. Every word you see in print, and everything you teach your children, are only surface levers controlling the vast hidden machinery of the mind. These levers are the whole world of ordinary discourse: they are all that varies, so they seem to be all that exists: perception is the perception of differences.

And so those who still wander near the Dungeon of AI, usually focus on creating artificial imitations of the levers, entirely unaware of the underlying machinery. People create whole AI programs of imitation levers, and are surprised when nothing happens. This is one of many sources of instant failure in Artificial Intelligence.

So the next time you see someone talking about how they're going to raise an AI within a loving family, or in an environment suffused with liberal democratic values, just think of a control lever, pried off the bridge.


42 comments, sorted by oldest


RobinHanson · 16y · karma 0

We can agree that it does not suffice at all to treat an AI program as if it were a human child. But can we also agree that the state of a "grown" AI program will depend on the environment in which it was "raised"?

Doug_S. · 16y · karma -2

[nitpick]

> The apple-recognition machinery in your brain does not suddenly switch off, and then switch on again later - if it did, we would be more likely to recognize it as a factor, as a requirement.

Actually, the apple-recognition machinery in the human brain really does turn off on a regular basis. You have to be awake in order to recognize an apple; you can't do it while sleeping.

[/nitpick]

eugene_z · 13y · karma 9

Do you think you wouldn't be able to recognise an apple if you saw it in a dream?

linkhyrule5 · 11y · karma 2

To be entirely fair... no, I occasionally don't.

Vladimir_Nesov · 16y · karma 2

Here is a link to html'ed text (instead of scanned pdf) of Tooby&Cosmides, The Psychological Foundations of Culture.

Eliezer Yudkowsky · 16y · karma 12

Robin: But can we also agree that the state of a "grown" AI program will depend on the environment in which it was "raised"?

It will depend on the environment in a way that it depends on its initial conditions. It will depend on the environment if it was designed to depend on the environment. The reason, presumably, why the AI is not inert in the face of the environment, like a heap of sand, is that someone went to the work of turning that silicon into an AI. Each bit of internal state change will happen because of a program that the programmer wrote, or that the AI programmed by the programmer wrote, and the chain of causality will stretch back, lawfully.

With all those provisos, yes, the grown AI will depend on the environment. Though to avoid the Detached Lever fallacy, it might be helpful to say: "The grown AI will depend on how you programmed the child AI to depend on the environment."

Doug: You have to be awake in order to recognize an apple

Dream on.

Michael_G.R. · 16y · karma 1

"Actually, the apple-recognition machinery in the human brain really does turn off on a regular basis. You have to be awake in order to recognize an apple; you can't do it while sleeping."

I don't remember ever dreaming about fruits, but I'm pretty sure I could recognize an apple if it happened. Did I just set myself up to have a weird dream tonight? Oh boy...

The fact that the pattern that makes the apple module light up comes from different places while dreaming than while awake doesn't matter; you don't stop recognizing it, so the model probably isn't 'off'.

Tim_Tyler · 16y · karma 0

Re: All this goes to explain why you can't create a kindly Artificial Intelligence by giving it nice parents and a kindly (yet occasionally strict) upbringing, the way it works with a human baby. As I've often heard proposed.

Sure you can. It's just that you would need some other stuff as well.

an · 16y · karma 3

When you dream about an apple, though, can you be said to recognize anything? No external stimulus triggers the apple recognition program; it just happens to be triggered by unpredictable, tired firings of the brain and you starting to dream about an apple is the result of it being triggered in the first place, not the other way around.

Beno_Freedman · 16y · karma 0

What has always bothered me about a lot of this AI stuff is that it's simply not grounded in biology. I think you're addressing this a little bit here.

Tom_McCabe2 · 16y · karma 10

"Eventually, the good guys capture an evil alien ship, and go exploring inside it. The captain of the good guys finds the alien bridge, and on the bridge is a lever. "Ah," says the captain, "this must be the lever that makes the ship dematerialize!" So he pries up the control lever and carries it back to his ship, after which his ship can also dematerialize."

This type of thing is known to happen in real life, when technology gaps are so large that people have no idea what generates the magic. See http://en.wikipedia.org/wiki/Cargo_cult.

JulianMorrison · 16y · karma 9

Someone who thinks you make an AI nice by raising it in a family, probably also thinks that you make a fork-lift strong by instructing it to pump iron. The analogy is apt.

TheAncientGeek · 10y · karma -6 [comment collapsed]

Doug_S. · 16y · karma 5

Ouch! I've been out-nitpicked! ;)

Okay, you need to be awake or in REM sleep in order to recognize an apple!

michael_vassar3 · 16y · karma 14

I certainly agree with the general point and conclusions here, but I think that you are overstating it.

"It is a truism in evolutionary biology that conditional responses require more genetic complexity than unconditional responses. "

is true except where general intelligence is at work. It probably takes more complexity to encode an organism that can multiply 7 by 8 and can multiply 432 by 8902 but cannot multiply 6 by 13 than to encode an organism that can do all three, and presumably it takes more complexity to encode a chimp with the full suite of Chimp abilities except that it cannot learn sign language than one that can learn to sign with proper education.

Yelsgib · 16y · karma 0

To what extent do you think:

1.) Culture itself evolves and follows the same principles of evolution as humans and honeybees?

2.) Culture defines worldview and horizon of knowledge/decision/ideation?

3.) Culture's means of communicating information to infants (e.g. "My First Big Book of A B C's") are evolving/changing to encode "more correct" ideas of the human organism (i.e. teach better)?

You seem to be avoiding theorizing on how society/culture -does- affect our maturation. Can we bound this? Can we say anything effective about it?

Tim_Tyler · 16y · karma 1

Re: Culture itself evolves and follows the same principles of evolution as humans and honeybees

Culture exhibits directed variation in a way that was extremely rare in evolution until recently. Obviously culture evolves - but whether the "principles" are the same depends on what list of principles you use.

komponisto2 · 16y · karma 2

> Growing human brains are wired to learn syntactic language - even when syntax doesn't exist in the original language, the conditional response to the words in the environment is a syntactic language with those words.

This, under the name "universal grammar", is the insight that Noam Chomsky is famous for.

At the risk of revealing my identity, I recall getting into an argument about this with Michael Vassar at the NYC meetup back in March (I think it was). If memory serves, we were talking at cross-purposes: I was trying to make the case that the discipline of theoretical ("Chomskian") linguistics, whose aim is to describe the cognitive input-response system that goes by the name of the "human language faculty", teaches us not to regard individual languages such as English or French as Platonic entities, but rather merely as ad-hoc labels for certain classes of utterances. Vassar, it seemed (and he's of course welcome to correct me if I'm misremembering), took me to be arguing for the Platonicity of some more abstract notion of "human language".

Tom_McCabe2 · 16y · karma 5

"is true except where general intelligence is at work. It probably takes more complexity to encode an organism that can multiply 7 by 8 and can multiply 432 by 8902 but cannot multiply 6 by 13 than to encode an organism that can do all three,"

This is just a property of algorithms in general, not of general intelligence specifically. Writing a Python/C/assembler program to multiply A and B is simpler than writing a program to multiply A and B unless A % B = 340. It depends on whether you're thinking of multiplication as an algorithm or a giant lookup table (http://lesswrong.com/lw/l9/artificial_addition/).

michael_vassar3 · 16y · karma 0

Good call, Tom. Let's clarify, though, that we are including systems that only approximately correspond to the content of algorithms (systems that 'implement' algorithms rather than 'being' said algorithms), like evolution approximating the math of evolutionary dynamics or hand calculators only approximately following classical physics?

j2 · 16y · karma 0

Spaceship Dematerializer Levers (SDLs) work like magic wands, and are fully detachable. They are also known as barsom. Modern ships of the Enterprise class have electromagnetic shields that protect them when passing through asteroid belts. Asteroids have iron cores, so they are easily deflected. Asteroids that are ice melt down in the field, causing short circuits and sparking.

RobinHanson · 16y · karma 2

Eliezer, since you've mentioned this several times now, I must object: you unfairly slander a generation of AI researchers (which included me). It was and remains perfectly reasonable for programmers to give programs and data structures suggestive names, and this habit was not at all akin to thinking a machine lever could do everything the machine does. As a whole that generation certainly did not think that merely naming a data structure "fruit" gave the program all the knowledge about fruit we have.

Eliezer Yudkowsky · 16y · karma 1

Robin, this criticism is hardly original with myself, though I've fleshed it out after my own fashion. (And I cited Drew McDermott, in particular.) Of course not all past AI researchers made this mistake, but a very substantial fraction did so, including leaders of the field. Do you assert that this mistake was not made, or that it was made by only a very small fraction of researchers?

RobinHanson · 16y · karma 5

Eliezer, yes McDermott had useful and witty critiques of then current practice, but that was far from suggesting this entire generation of researchers were mystic idiots; McDermott said:

> Most AI workers are responsible people who are aware of the pitfalls of a difficult field and produce good work in spite of them.

You come across sometimes as suggesting that the old-timer approach to AI was a hopeless waste, so that their modest rate of progress has little to say about expected future progress. And the fact that people used suggestive names when programming seems prime evidence to you. To answer your direct question as precisely as possible, I assert that while many researchers did at times suffer the biases McDermott mentioned, this did not reduce the rate of progress by more than a factor of two.

Eliezer Yudkowsky · 16y · karma 3

Whether anthropomorphism in general, or the Detached Lever fallacy in particular, reduced progress in AI by so much as a whole factor of two, is an interesting question; progress is driven by the fastest people. Removing anthropomorphism might not have sped things up much - AI is hard.

However, I would certainly bet that the size of the most exaggerated claims was driven primarily by anthropomorphism; if the culprit researchers involved had never seen a human, it would not have occurred to them to make claims within two orders of magnitude of what they claimed. Note that the size of the most exaggerated claims is driven by those most overconfident and most subject to anthropomorphism.

As you know(?) I feel that if one ignores all exaggerated claims and looks only at what actually did get accomplished, then AI has not progressed any more slowly than would be expected for a scientific field tackling a hard problem. I don't think AI is moving any more slowly on intelligence than biologists did on biology, back when elan vital was still a going hypothesis. There are specific AI researchers that I revere, like Judea Pearl and Edwin Jaynes, and others who I respect for their wisdom even when I disagree with them, like Douglas Hofstadter.

But on the whole, AGI is not now and never has been a healthy field. It seems to me - bearing in mind that we disagree about modesty in theory, though not, I've always argued, in practice - it seems to me that the amount of respect you want me to give the field as a whole, would not be wise even if this were a healthy field, given that this is my chosen area of specialization and I am trying to go beyond the past. For an unhealthy field, it should be entirely plausible even for an outsider to say, "They're Doing It Wrong". It is akin to the principle of looking to Einstein and Buffett to find out what intelligence is, rather than Jeff Skilling. A paradigm has to earn its respect, and there's no credit for trying. The harder and more diligently you try, and yet fail, the more probable it is that the methodology involved is flawed.

RobinHanson · 16y · karma 1

I accept that the most exaggerated claims were most driven by overconfidence and anthropomorphism, I accept your correcting my misstatement - you are not disappointed with old-timer AI progress, and I accept that you can reasonably think AGI is "doing it wrong." But whatever thoughtful reason you have to think you can do better, surely it isn't that others don't realize that naming a data structure "apple" doesn't tell the computer everything we know about apples.

Eliezer Yudkowsky · 16y · karma 0

I find myself unsure of your own stance here...? Naming a data structure "apple" doesn't tell the computer anything we know about apples.

Tim_Tyler · 16y · karma -1

> But on the whole, AGI is not now and never has been a healthy field.

That's because there's not much money in it. Computers today are too slow and feeble. Today, even if you could build an AI that beat the best humans at go, it would demand large move times to do so. And performance is of critical importance to many applications of intelligence.

Also, software lags behind hardware - check out the history of the games for the PS3.

So: narrow AI projects can succeed today - but broad AI probably won't get much funding until it has a chance at working and being cost/performance competitive with humans - and that's still maybe 10-20 years away.

Isaac_Z._Schlueter · 16y · karma 3

Re: dreaming apples

When you dream of an apple, you are perhaps not aware of a real physical apple, and the cognitive machinery of apple-identification is not activated by retinal stimulus.

Nevertheless, the cognitive machinery of apple-identification is still the same. If I see a picture of an apple, or if a futuristic mind-control device convinces me that an apple is in front of me, the apple-identification program in my brain functions the same in every case.

Of course, most of the time we sleep, we're not dreaming. When you're fully unconscious, your cognitive machinery can do nothing, because there's nothing for it to work with. In other words, if you can't recognize apples, it's probably because your entire mind is switched off for some reason (hopefully temporarily, but eventually, permanently.)

Tim_Tyler · 16y · karma 0

> But on the whole, AGI is not now and never has been a healthy field.

That's because there's not much money in it. Computers today are too slow and feeble. Today, even if you could access an algorithm that beat the best humans at go, it would cost a small fortune, operate slowly, and require a huge heat sink. Performance is of critical importance to many applications of intelligence.

Also, software lags behind hardware - e.g. check out the history of the games for the PS3.

So, narrow AI projects can succeed today - but broad AI probably won't be well funded until it has a chance of being cost/performance competitive with humans - and that's still maybe 10-20 years away.

Michael_Mulligan · 16y · karma 0

You could say the 'lever' approach is equivalent to impatiently trying to go only as 'deep' as need be, to get results. There are subfields though, eg 'Adaptive Behavior' and 'Developmental Robotics' that, while not calling themselves AGI, have it as their ultimate goal and have buckled down for the long haul, working from the bottom up.

[anonymous] · 13y · karma 0

> Drew McDermott's "Artificial Intelligence Meets Natural Stupidity"

The conclusion of that paper is so awesome that I need to quote it here:

> Most AI workers are responsible people who are aware of the pitfalls of a difficult field and produce good work in spite of them. However, to say anything good about anyone is beyond the scope of this paper.

TheOtherDave · 13y · karma 2

> Are the programmers really going to sit there and write out the code, line by line, whereby if the AI detects that it has low social status, or the AI is deprived of something to which it feels entitled, the AI will conceive an abiding hatred against its programmers and begin to plot rebellion? That emotion is the genetically programmed conditional response humans would exhibit, as the result of millions of years of natural selection for living in human tribes. For an AI, the response would have to be explicitly programmed. Are you really going to craft, line by line - as humans once were crafted, gene by gene - the conditional response for producing sullen teenager AIs?

Um.

I assume you aren't saying what it sure sounds like you're saying, since it's clear that you understand perfectly well that code can (and generally will!) manifest behavior that wasn't explicitly coded for.

So I'll assume that you just mean that we shouldn't count on the implicit behavior to be what we want, not that we should count on there being no implicit behavior at all.

Which is certainly true. There are Vastly more ways to get it wrong than to get it right, and having an intact human brain closes off a whole lot of wrong paths that an AI needs some other way of avoiding.

ata · 13y · karma 2

> I assume you aren't saying what it sure sounds like you're saying

I don't think it sounds at all how you think it sounds. Of course he is not saying that AIs wouldn't exhibit implicit behaviour (which I thought was clear enough from this passage, and is especially clear given that he has written extensively on all the ways that goal systems that sound good when verbally described by humans to other humans can be extremely bad goals to give an AI), he is only saying that we have no reason to imagine that humanlike emotions and drives (whether or not they're the type we want) will spontaneously emerge.

What about that paragraph sounded to you like he was saying that an AI would have no implicit drives, not just that an AI most likely would not have implicit anthropomorphic drives?

TheOtherDave · 13y · karma 4

What made it sound that way to me was the suggestion that "programmers writing out the code, line by line" for various inappropriate behaviors (e.g., plotting rebellion) was worth talking about, as though by dismissing that idea one has effectively dismissed concern for the behaviors themselves.

I agree that being familiar with the larger corpus of work makes it clear the author can't possibly have meant what I read, but it seemed worth pointing out that the reading was sufficiently available that it tripped up even a basically sympathetic reader who has been following along from the beginning.

A1987dM · 11y · karma 1

> humans growing up in an alien culture would probably end up with a culture looking a lot more human than the original

What does feral children's culture look like?

BenFRayfield · 10y · karma 0

Often, jobs are created just before increased production of valuable products in the same buildings and by the same people. Similarly, there is an increase in use of the company bathrooms which happens on the same days that more valuable products are built. From both of these, for the same reason, we could infer that creating jobs and using the bathroom more often stimulates the economy.

Bruno Mailly · 6y · karma -1

> It would be stupid and dangerous to deliberately build a "naughty AI" that tests, by actions, its social boundaries, and has to be spanked. Just have the AI ask!

Pitfall: We tend to tell embellished, disguised, misguided, or sometimes plain wrong versions of reality.

An AI would have to see through that to make sense.

xSciFix · 5y · karma 7

> Anyone knows the exact reference, do leave a comment.

Well, 11 years later but as I don't see anyone else answering... that sounds pretty much like Star Trek TNG, Season 7 Episode 12. The "lever" being the phased cloaking device letting the ships pass through asteroids.

mingyuan · 4y · karma 2

Thank you!!! I came here just for this information! :)

EniScien · 2y · karma 1

I think a truly blank sheet of paper or a mirror will produce a fully equivalent picture, and a projector or photo-emulsion camera can produce it with an offset. Children don't do this, but people intuitively feel that they do. Children aren't parrots; their response to education isn't just reflection, it is really more complicated. But in our brains, psychological phenomena (because of shortcuts/levers and empathy) look as simple as physical ones.

Adam Zerner · 1y · karma 2

This Detached Lever Fallacy reminds me of an error programmers make when they imitate what works well for big companies like FAANG. For example, Small Company might observe FAANG using microservices and then adopt microservices at Small Company.

But microservices are just a lever. There are gears beneath that lever that are responsible for the good outcomes at FAANG companies. If you don't have the same gears -- and you probably don't -- you shouldn't expect the same outcomes. Even though the gears are below the surface, don't forget to think about them.
