Less Wrong is a community blog devoted to refining the art of human rationality.

Comment author: Viliam_Bur 08 July 2014 08:29:00AM *  30 points [-]

If someone wants to describe LW as cultish, they can take any parody of itself and present it as further evidence for their claims.

I think something like this has already happened with the Chuck-Norris-like list of Yudkowsky facts; the "Bayesian conspirator" illustration of the beisutsukai stories; and the redacted lecture screenshot that displayed "Eliezer Yudkowsky" on the right end of the intelligence scale. -- Instead of "they are cool people who can make fun of themselves" they can be spun into "this is what those people seriously believe / this is how much they are obsessed with themselves... they must be truly insane". See RationalWiki:

That Eliezer Yudkowsky Facts page is the most disturbing thing I have read in my life. I don't need a shower, I need the outer layer of my skin peeled off. (...) It is fanboyism at a disturbing level. (...) he is hosting this shit on his website that disturbs me

On the other hand, if someone wants to describe LW as cultish, they could also use the lack of parodies, or anything else, as evidence. Once you are charged with being a witch, there is not much you can successfully say in your defense.

So in the end, perhaps we should ignore all such considerations (which, by the way, is what most non-cults do) and simply upvote or downvote things on their own merit. Also, any attempt at this kind of PR automatically destroys itself if it is easy to provide a link to the discussion about the PR. (And LW being LW, such a discussion will almost certainly happen.)

Comment author: steven0461 08 July 2014 10:43:52AM 0 points [-]

I'd still be happy to remove the EY facts post, although I've been hesitant to do so because it would affect many other people's comments and hiding things might itself be construed as sinister. (I guess your point is that it doesn't matter, but I thought I'd mention it.)

Comment author: steven0461 07 May 2014 02:39:54AM *  4 points [-]

MIRI is now in a close race for the prize for the most total unique donors over the 24 hours, which adds a lot of additional value to $10 donations by people who haven't donated yet.

Comment author: malo 06 May 2014 10:08:26AM 2 points [-]

We didn't win the 12 pm hour, but we won the 1 am hour! We also won the 408 prize of $2,500!

We have a live blog set up here, and a live feed of “the war room” at MIRI.

Comment author: steven0461 06 May 2014 10:27:22AM *  1 point [-]

(MIRI lost the third hour despite being comfortably on top of the leaderboard: what matters is the increase over the last hour, so at this point the leaderboard is probably misleading as an indicator of how close things are.)

Comment author: steven0461 06 May 2014 08:37:01AM *  3 points [-]

The leaderboards for most unique donors seem pretty close between MIRI and the next contender, so additional $10 donations may be getting unusual value per dollar right now. (The first hour was close between MIRI and a different organization with both having something like 34 unique donors, so in that sort of situation, if the highest number wins, the expected value to MIRI of a donation might be on the rough order of $100.)
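The rough estimate in the parenthetical can be sketched as a one-line expected value calculation in Python. The comment gives neither the size of the hourly prize nor the chance that one extra donor decides the hour, so both inputs below are hypothetical placeholders, chosen only to show how a figure on the order of $100 could arise.

```python
def expected_prize_value(prize_dollars: float, p_pivotal: float) -> float:
    """Expected prize money added by one extra unique donor.

    p_pivotal is the (assumed) probability that this single donor
    changes which organization wins the hourly prize.
    """
    if not 0.0 <= p_pivotal <= 1.0:
        raise ValueError("p_pivotal must be a probability")
    return prize_dollars * p_pivotal


# Hypothetical inputs, not taken from the comment:
# a $1,000 hourly prize and a 10% chance that one more donor is decisive.
example_value = expected_prize_value(prize_dollars=1000.0, p_pivotal=0.10)
print(f"Expected value of one extra donor: ${example_value:.0f}")
```

With a close race the pivotal probability is large, which is why a $10 donation could carry an expected value well above its face amount.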

In response to Meetup : Amsterdam
Comment author: steven0461 21 November 2013 01:15:36PM *  1 point [-]

I've edited the post to include a meetup location: the Starbucks on the Oosterdokseiland just east of Amsterdam Central Station. Hope to see you guys there!

Meetup : Amsterdam

4 steven0461 12 November 2013 09:12AM

Discussion article for the meetup : Amsterdam

WHEN: 23 November 2013 02:00:00PM (+0100)

WHERE: Oosterdokseiland 4, Amsterdam, The Netherlands

Let's have a Netherlands LessWrong meetup on Saturday the 23rd. We're meeting in the Starbucks / "East Dock Lounge" on the Oosterdokseiland next to Amsterdam's Central Station. (The building is new and not yet pictured on Google Maps, but I've verified that it exists in the territory. Note that there's also a Starbucks in the station itself; that isn't where we're meeting.)

I'll bring a sign that says "LW". You can reach me at 0611431304.


Comment author: steven0461 18 July 2013 03:21:46AM 2 points [-]

Black-Box Metaphilosophical AI is also risky, because it's hard to test/debug something that you don't understand.

On the other hand, to the extent that our uncertainty about whether different BBMAI designs do philosophy correctly is independent, we can build multiple ones and see what outputs they agree on. (Or a design could do this internally, achieving the same effect.)

it's unclear why such an AI won't cause disaster in the time period before it achieves philosophical competence.

This seems to be an argument for building a hybrid of what you call metaphilosophical and normative AIs, where the normative part "only" needs to be reliable enough to prevent initial disaster, and the metaphilosophical part can take over afterward.

Comment author: Will_Newsome 17 July 2013 01:16:23PM *  6 points [-]

"create an AI that minimizes the expected amount of astronomical waste"

Of course, this is still just a proxy measure... say that we're "in a simulation", or that there are already superintelligences in our environment who won't let us eat the stars, or something like that—we still want to get as good a bargaining position as we possibly can, or to coordinate with the watchers as well as we possibly can, or in a more fundamental sense we want to not waste any of our potential, which I think is the real driving intuition here. (Further clarifying and expanding on that intuition might be very valuable, both for polemical reasons and for organizing some thoughts on AI strategy.) I cynically suspect that the stars aren't out there for us to eat, but that we can still gain a lot of leverage over the acausal fanfic-writing commun... er, superintelligence-centered economy/ecology, and so, optimizing the hell out of the AGI that might become an important bargaining piece and/or plot point is still the most important thing for humans to do.

Metaphilosophical AI

The thing I've seen that looks closest to white-box metaphilosophical AI in the existing literature is Eliezer's causal validity semantics, or more precisely the set of intuitions Eliezer was drawing on to come up with the idea of causal validity semantics. I would recommend reading the section Story of a Blob and the sections on causal validity semantics in Creating Friendly AI. Note that philosophical intuitions are a fuzzily bordered subset of justification-bearing (i.e. both moral/values-like and epistemic) causes that are theoretically formally identifiable and are traditionally thought of as having a coherent, lawful structure.

Comment author: steven0461 18 July 2013 03:08:07AM 2 points [-]

we still want to get as good a bargaining position as we possibly can, or to coordinate with the watchers as well as we possibly can, or in a more fundamental sense we want to not waste any of our potential, which I think is the real driving intuition here

It seems that we have more morally important potential in some possible worlds than others, and although we don't want our language to commit us to the view that we only have morally important potential in possible worlds where we can prevent astronomical waste, neither do we want to suggest (as I think "not waste any of our potential" does) the view that we have the same morally important potential everywhere and that we should just minimize the expected fraction of our potential that is wasted. A more neutral way of framing things could be "minimize wasted potential, especially if the potential is astronomical", leaving the strength of the "especially" to be specified by theories of how much one can affect the world from base reality vs simulations and zoos, theories of how to deal with moral uncertainty, and so on.

Bayesian Adjustment Does Not Defeat Existential Risk Charity

43 steven0461 17 March 2013 08:50AM

(This is a long post. If you’re going to read only part, please read sections 1 and 2, subsubsection 5.6.2, and the conclusion.)

1. Introduction

Suppose you want to give some money to charity: where can you get the most bang for your philanthropic buck? One way to make the decision is to use explicit expected value estimates. That is, you could get an unbiased (averaging to the true value) estimate of what each candidate for your donation would do with an additional dollar, and then pick the charity associated with the most promising estimate.

Holden Karnofsky of GiveWell, an organization that rates charities for cost-effectiveness, disagreed with this approach in two posts he made in 2011. This is a response to those posts, addressing the implications for existential risk efforts.

According to Karnofsky, high returns are rare, and even unbiased estimates don’t take into account the reasons why they’re rare. So in Karnofsky's view, our favorite charity shouldn’t just be one associated with a high estimate; it should be one that supports the estimate with robust evidence derived from multiple independent lines of inquiry.[1] If a charity’s returns are being estimated in a way that intuitively feels shaky, maybe that means the fact that high returns are rare should outweigh the fact that high returns were estimated, even if the people making the estimate were doing an excellent job of avoiding bias.

Karnofsky’s first post, Why We Can’t Take Expected Value Estimates Literally (Even When They’re Unbiased), explains how one can mitigate this issue by supplementing an explicit estimate with what Karnofsky calls a “Bayesian Adjustment” (henceforth “BA”). This method treats estimates as merely noisy measures of true values. BA starts with a prior representing what cost-effectiveness values are out there in the general population of charities, and then updates that prior into a posterior in standard Bayesian fashion.

Karnofsky provides some example graphs, illustrating his preference for robustness. If the estimate error is small, the posterior lies close to the explicit estimate. But if the estimate error is large, the posterior lies close to the prior. In other words, if there simply aren’t many high-return charities out there, a sharp estimate can be taken seriously, but a noisy estimate that says it has found a high-return charity must represent some sort of fluke.
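The behavior described above can be illustrated with a minimal Python sketch. It assumes a normal prior and a single estimate with normally distributed error. That distributional choice and all the numbers are assumptions made for illustration, since this excerpt does not specify them. Under those assumptions the posterior mean is a precision-weighted average of the prior mean and the estimate.

```python
def normal_posterior(prior_mean, prior_sd, estimate, estimate_sd):
    """Posterior mean and sd for a normal prior and one normal-noise estimate."""
    prior_precision = 1.0 / prior_sd ** 2
    estimate_precision = 1.0 / estimate_sd ** 2
    posterior_variance = 1.0 / (prior_precision + estimate_precision)
    # Precision-weighted average of the prior mean and the explicit estimate.
    posterior_mean = posterior_variance * (
        prior_precision * prior_mean + estimate_precision * estimate
    )
    return posterior_mean, posterior_variance ** 0.5


# Hypothetical numbers: prior centered on 0 with sd 1, explicit estimate of 10.
sharp_mean, sharp_sd = normal_posterior(0.0, 1.0, 10.0, 0.1)    # small error
noisy_mean, noisy_sd = normal_posterior(0.0, 1.0, 10.0, 10.0)   # large error

print(f"sharp estimate -> posterior mean {sharp_mean:.2f}")
print(f"noisy estimate -> posterior mean {noisy_mean:.2f}")
```

The sharp estimate leaves the posterior near the estimate of 10, while the noisy estimate leaves it near the prior mean of 0.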

Karnofsky does not advocate a policy of performing an explicit adjustment. Rather, he uses BA to emphasize that estimates are likely to be inadequate if they don’t incorporate certain kinds of intuitions — in particular, a sense of whether all the components of an estimation procedure feel reliable. If intuitions say an estimate feels shaky and too good to be true, then maybe the estimate was noisy and the prior is more important. On the other hand, if intuitions say an estimate has taken everything into account, then maybe the estimate was sharp and outweighs the prior.

Karnofsky’s second post, Maximizing Cost-Effectiveness Via Critical Inquiry, expands on these points. Where the first post looks at how BA is performed on a single charity at a time, the second post examines how BA affects the estimated relative values of different charities. In particular, it assumes that although the charities are all drawn from the same prior, they come with different estimates of cost-effectiveness. Higher estimates of cost-effectiveness come from estimation procedures with proportionally higher uncertainty.

It turns out that higher estimates aren’t always more auspicious: an estimate may be “too good to be true,” concentrating much of its evidential support on values that the prior already rules out for the most part. On the bright side, this effect can be mitigated via multiple independent observations, and such observations can provide enough evidence to solidify higher estimates despite their low prior probability.
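Both effects can be seen in a small Python sketch. It assumes a normal prior centered on 0 with standard deviation 1, and normal estimate errors whose standard deviation is proportional to the estimate itself. The proportionality constant `noise_ratio`, the prior parameters, and the idea that all observations report the same value are illustrative assumptions, not details taken from Karnofsky's posts.

```python
def posterior_mean(estimate, noise_ratio=1.0, n_observations=1,
                   prior_mean=0.0, prior_sd=1.0):
    """Posterior mean when each estimate's noise sd scales with its size.

    Assumes a normal prior and n independent normal observations that all
    report the same value, each with sd = noise_ratio * |estimate|.
    """
    if estimate == 0:
        raise ValueError("estimate must be nonzero so the noise sd is positive")
    estimate_sd = noise_ratio * abs(estimate)
    prior_precision = 1.0 / prior_sd ** 2
    # Independent observations add their precisions.
    data_precision = n_observations / estimate_sd ** 2
    return (prior_precision * prior_mean + data_precision * estimate) / (
        prior_precision + data_precision
    )


# Higher estimates first help, then hurt, because their noise grows too.
for x in (0.5, 1.0, 2.0, 10.0):
    print(f"estimate {x:>4}: posterior mean {posterior_mean(x):.3f}")

# Multiple independent observations rescue a high estimate.
for n in (1, 10, 100):
    print(f"{n:>3} observations of 10: posterior mean "
          f"{posterior_mean(10.0, n_observations=n):.3f}")
```

In this toy setup the posterior mean rises with the estimate only up to a point and then falls, while repeating the same high estimate through independent observations pulls the posterior back up toward it.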

Charities aiming to reduce existential risk have a potential claim to high expected returns, simply because of the size of the stakes. But if such charities are difficult to evaluate, and the prior probability of high expected values is low, then the implications of BA for this class of charities loom large.

This post will argue that competent efforts to reduce existential risk are still likely to be optimal, despite BA. The argument will have three parts:

  1. BA differs from fully Bayesian reasoning, so that BA risks double-counting priors.

  2. The models in Karnofsky’s posts, when applied to existential risk, boil down to our having prior knowledge that the claimed returns are virtually impossible. (Moreover, similar models without extreme priors don’t lead to the same conclusions.)

  3. We don’t have such prior knowledge. Extreme priors would have implied false predictions in the past, imply unphysical predictions for the future, and are justified neither by our past experiences nor by any other considerations.

Claim 1 is not essential to the conclusion. While Claim 2 seems worth expanding on, it’s Claim 3 that makes up the core of the controversy. Each of these concerns will be addressed in turn.

Before responding to the claims themselves, however, it’s worth discussing a highly simplified model that will illustrate what Karnofsky’s basic point is.

Comment author: Elithrion 16 March 2013 06:23:52PM 2 points [-]

I'm surprised this post doesn't at least mention temporal discounting. Even if it's somewhat unpopular in utilitarian circles, it's sufficiently a part of mainstream assessments of the future and of basic human psychology that I would think its effects on astronomical waste (and related) arguments should at the very least be considered.

Comment author: steven0461 16 March 2013 09:36:45PM 5 points [-]

The post discusses the limiting case where astronomical waste has zero importance and the only thing that matters is saving present lives. Extending that to the case where astronomical waste has some finite level of importance based on time discounting seems like a matter of interpolating between full astronomical waste and no astronomical waste.
