Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should start on Monday and end on Sunday.
4. Unflag the two options "Notify me of new top level comments on this article" and "
That was pretty much my take. I get the feeling that "okay" outcomes are a vanishingly small portion of probability space. This suggests to me that the marginal effort saved by stipulating "okay" outcomes instead of perfect CEV is extremely small, if not negative. (By negative, I mean that it could actually take additional effort to program an AI to maximize for "okay" outcomes rather than CEV.)
However, I didn't want to ask a leading question, so I left it in its present form. It's perhaps academically interesting that the desirability of outcomes, as a function of "similarity to CEV", is a continuous curve rather than a binary good/bad step function, but I couldn't see any way of taking advantage of this. I posted mainly to see whether others might spot low-hanging fruit.
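To make the step-function/continuous-curve distinction concrete (this notation is mine, purely illustrative): let $s \in [0,1]$ measure an outcome's similarity to CEV. The binary picture is $D_{\mathrm{step}}(s) = \mathbf{1}[s = 1]$, under which anything short of a perfect match is worthless. The continuous picture is any increasing $D$ with $D(0) = 0$ and $D(1) = 1$, for instance $D(s) = s^k$ for some $k > 0$; under such a curve an outcome at $s = 0.9$ still captures most of the attainable value, which is what would make "okay" outcomes a meaningful target at all.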
I guess the interesting follow-up questions are these: Is there any chance that humans are adaptable enough that human values make up more than an infinitesimally small sliver of the set of all possible values? If so, is there any chance this enables an easier, alternative version of the control problem? It would be nice to have a plan B.