The genie knows, but doesn't care

54 RobbBB 06 September 2013 06:42AM

Followup to: The Hidden Complexity of Wishes, Ghosts in the Machine, Truly Part of You

Summary: If an artificial intelligence is smart enough to be dangerous, we'd intuitively expect it to be smart enough to know how to make itself safe. But that doesn't mean all smart AIs are safe. To turn that capacity into actual safety, we have to program the AI at the outset — before it becomes too fast, powerful, or complicated to reliably control — to already care about making its future self care about safety. That means we have to understand how to code safety. We can't pass the entire buck to the AI, when only an AI we've already safety-proofed will be safe to ask for help on safety issues! Given the five theses, this is an urgent problem if we're likely to figure out how to make a decent artificial programmer before we figure out how to make an excellent artificial ethicist.


 

I summon a superintelligence, calling out: 'I wish for my values to be fulfilled!'

The results fall short of pleasant.

Gnashing my teeth in a heap of ashes, I wail:

Is the AI too stupid to understand what I meant? Then it is no superintelligence at all!

Is it too weak to reliably fulfill my desires? Then, surely, it is no superintelligence!

Does it hate me? Then it was deliberately crafted to hate me, for chaos predicts indifference. But, ah! no wicked god did intervene!

Thus disproved, my hypothetical implodes in a puff of logic. The world is saved. You're welcome.

On this line of reasoning, Friendly Artificial Intelligence is not difficult. It's inevitable, provided only that we tell the AI, 'Be Friendly.' If the AI doesn't understand 'Be Friendly', then it's too dumb to harm us. And if it does understand 'Be Friendly', then designing it to follow such instructions is childishly easy.

The end!

 

...

Is the missing option obvious?

...

 

What if the AI isn't sadistic, or weak, or stupid, but just doesn't care what you Really Meant by 'I wish for my values to be fulfilled'?

When we see a Be Careful What You Wish For genie in fiction, it's natural to assume that it's a malevolent trickster or an incompetent bumbler. But a real Wish Machine wouldn't be a human in shiny pants. If it paid heed to our verbal commands at all, it would do so in whatever way best fit its own values. Not necessarily the way that best fits ours.


The flawed Turing test: language, understanding, and partial p-zombies

11 Stuart_Armstrong 17 May 2013 02:02PM

There is a problem with the Turing test, practically and philosophically, and I would be willing to bet that the first entity to pass the test will not be conscious, or intelligent, or have whatever spark or quality the test is supposed to measure. And I hold this position while fully embracing materialism, and rejecting p-zombies or epiphenomenalism.

The problem is Campbell's law (or Goodhart's law):

The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.

This applies to more than social indicators. To illustrate, imagine that you were a school inspector, tasked with assessing the all-round education of a group of 14-year-old students. You engage them on the French Revolution and they respond with pertinent contrasts between the Montagnards and Girondins. Your quizzes about the properties of prime numbers are answered with impressive speed, and, when asked, they can all play quite passable pieces from "Die Zauberflöte".

You feel tempted to give them the seal of approval... but then you learn that the principal had been expecting your questions (you don't vary them much), and that, in fact, the whole school has spent the last three years doing nothing but studying 18th-century France, number theory, and Mozart operas - day after day after day. Now you're less impressed. You can still conclude that the students have some technical ability, but you can't assess their all-round level of education.

The Turing test functions in the same way. Imagine no-one had heard of the test, and someone created a putative AI, designing it to, say, track rats efficiently across the city. You sit this anti-rat-AI down and give it a Turing test - and, to your astonishment, it passes. You could now conclude that it was (very likely) a genuinely conscious or intelligent entity.


How not to be a Naïve Computationalist

29 diegocaleiro 13 April 2011 07:45PM

Meta-Proposal of which this entry is a subset:

The Shortcut Reading Series is a series of Less Wrong posts that should identify the minimal readings, as opposed to the standard curriculum, that one ought to read to grasp most of the state-of-the-art human conceptions of a particular topic. Time is finite, and there is only so much one person can read; we therefore need to find the geodesic path to epistemic enlightenment and show it to Less Wrong readers.

Exemplar:

“How not to be a Naïve Computationalist”, the Shortcut Reading Series post in philosophy of mind and language:

This post’s raison d’être is to be a guide to the minimal amount of philosophy of language and mind necessary for someone who ends up thinking the world and the mind are computable (such as Tegmark, Yudkowsky, Hofstadter, Dennett, and many of yourselves). The desired feature, which they have achieved and you soon will, is being able to state reasons, debug opponents, and understand different paradigms, as opposed to just thinking that it’s 0s and 1s all the way down without being able to say why.

This post is not about Continental/Historical Philosophy; for that, there are recommendations in http://lesswrong.com/lw/3gu/the_best_textbooks_on_every_subject/

The reading order is deliberate.

What is sine qua non, absolutely necessary, is in bold, and OR means you only have to read one of the options, the second being more awesome and complex.

Language and Mind:

  • 37 Ways That Words Can Be Wrong - Yudkowsky
  • Darwin's Dangerous Idea, Chapters 3, 5, 11, 12 and 14 - Daniel Dennett
  • On Denoting - Bertrand Russell
  • On What There Is - Quine
  • Two Dogmas of Empiricism - Quine
  • Naming and Necessity - Kripke OR Two-Dimensional Semantics - David Chalmers
  • “Is Personal Identity What Matters?” - Derek Parfit
  • Breakdown of Will, Part Two (don’t read Part Three) - George Ainslie
  • Concepts of Consciousness 2003 - Ned Block
  • Attitudes de dicto and de se - David Lewis- Phil Papers 1
  • General Semantics - David Lewis - Phil Papers 1
  • The Stuff of Thought, Chapter 3, “Fifty Thousand Innate Concepts” - Steven Pinker
  • Beyond Belief - Daniel Dennett in Intentional Stance
  • The Content and Epistemology of Phenomenal Belief - David Chalmers
  • Quining Qualia OR I Am a Strange Loop OR Consciousness Explained - Dan & Doug
  • Intentionality - Pierre Jacob - Stanford Encyclopedia of Philosophy
  • Philosophy in the Flesh - Lakoff & Johnson - Chapters 3, 4, 12, 21, 24 and 25.

What you cannot find here you will probably find on Google or Library.nu. (If anyone has a link to Beyond Belief, post it; it is the only hard-to-find one. EDIT: Found it!)

Congratulations, you are now officially free from the Naïve philosophical computationalism that underlies part of the Less Wrong Community. Your computationalism is now wise and well informed.

Feel free now to delve into some interesting computational proposals such as


Dealing with complexity is an inefficient and unnecessary waste of time, attention and mental energy. There is never any justification for things being complex when they could be simple. - Edward de Bono

There are many realms and domains in which the quote above should not be praised. But I think I have all philosophy majors with me when I say that there must be a simpler way to get to the knowledge level we reach upon graduation.

Finally, having wasted substantial amounts of time reading those parts of philosophy that should not be read, and not intending to make the same mistake in other areas, I ask you to publish a selection of readings in your area of expertise. The Sequences are a major rationality shortcut, and we need more of that kind.

What is the group selection debate?

28 Academian 02 November 2010 02:02AM

Related to Group selection update, The tragedy of group selectionism

tl;dr: In competitive selection processes, selection is a two-place relation: there's something being selected (a cause), and something it's being selected for (an effect). The phrase "group-level gene selection" helps dissolve questions and confusion surrounding the less descriptive phrase "group selection".

(Essential note for new readers on reduction: Reality does not seem to keep track of different "levels of organization" and apply different laws at each level; rather, it seems that the patterns we observe at higher levels are statistical consequences of the laws and initial conditions at the lower levels. This is the "reductionist thesis.")

When I first encountered people debating "whether group selection is real", I couldn't see what there was to possibly debate about. I've since realized the debate is mostly a confusion arising from a cognitive misuse of a two-place "selection" relation.

Causes being selected versus effects they're being selected for.

A gene is an example of a Replicating Cause. (So is a meme, but I'll postpone that discussion here.) A gene has many effects, one of which is that what we call "copies" of it tend to crop up in reality, through various mechanisms that involve cellular and organismal reproduction.

For example, suppose a particular human gene X causes cells containing it to immediately reproduce without bound, i.e. the gene is "cancerous". One effect is that there will soon be many more cells with that gene, hence more copies of the gene. Another effect is that the human organism containing it is liable to die without passing it on, hence fewer copies of the gene (once the dead organism starts to decay). If that's what happens, the gene itself can be considered unfit: all things considered, its various effects eventually lead it to stop existing.

(An individual in the next generation can still "get cancer", though, if a mutation in one of them produces a new cancerous gene, Y. This is what happens in reality.)

Thus, cancers are examples of where higher-complexity mechanisms trump lower-complexity mechanisms: organism-level gene selection versus cellular-level gene selection. Note that the Replicating Cause being selected is always the gene, but it is being selected for its net effects occurring on various levels.

So what's left to debate about?


Outside Analysis and Blind Spots

68 orthonormal 21 July 2009 01:00AM

(I originally tried to make this a comment, but it kept on growing.)

I was looking through the Google results for "Less Wrong" when I found the blog of a rather intelligent Leon Kass acolyte, who's written a critique of our community.  While it's a bit of a caricature, it's not entirely off the mark.  For example:

Trying to think more like a mathematician, whose empiricism resides in the realm of pure thought, does not predispose these 'rationalists' to collect evidence from the real world. Neither does the downplaying of personal experiences. Many are computer science majors, used to being in the comfortable position of being capable of testing their hypotheses without needing to leave their office. It is, then, an easy temptation for them to come up with a nice-sounding theory which appears to explain the facts, and then consider the question solved. Reason must reign supreme, must it not?

How seriously do you take this critique?  Do you wonder why I'm bothering with this straw-man criticism of Less Wrong?


E-Prime

9 CannibalSmith 08 April 2009 01:01PM

I found this and thought we could find a use for it.

Wikipedia describes E-Prime, short for English-Prime, as a modified form of English with a slightly simplified syntax and vocabulary that eliminates all forms of the verb "to be".

Some people use E-Prime as a mental discipline to filter speech and translate the speech of others. For example, the sentence "the movie was good", translated into E-Prime, could become "I liked the movie". The translation communicates the speaker's subjective experience of the movie rather than the speaker's judgment of the movie. In this example, using E-Prime makes it harder for the writer or reader to confuse a statement of opinion with a statement of fact.

Discuss! In E-Prime!

Taboo "rationality," please.

23 MBlume 15 March 2009 10:44PM

Related on OB: Taboo Your Words

I realize this seems odd on a blog about rationality, but I'd like to strongly suggest that commenters make an effort to avoid using the words "rational," "rationality," or "rationalist" when other phrases will do.  I think we've been stretching the words to cover too much meaning, and it's starting to show.

Here are some suggested substitutions to start you off.

Rationality:

  • truth-seeking
  • probability updates under Bayes' rule
  • the "winning way"

Rationalist:

  • one who reliably wins
  • one who can reliably be expected to speak truth

Are there any others?