Comment author: Wei_Dai 22 June 2017 06:02:17PM *  0 points [-]

If we try to answer the question now, it seems very likely we'll get the answer wrong (given my state of uncertainty about the inputs that go into the question). I want to keep civilization going until we know better how to answer these types of questions. For example if we succeed in building a correctly designed/implemented Singleton FAI, it ought to be able to consider this question at leisure, and if it becomes clear that the existence of mature suffering-hating civilizations actually causes more suffering to be created, then it can decide to not make us into a mature suffering-hating civilization, or take whatever other action is appropriate.

Are you worried that by the time such an FAI (or whatever will control our civilization) figures out the answer, it will be too late? (Why? If we can decide that x-risk reduction is bad, then so can it. If it's too late to alter or end civilization at that point, why isn't it already too late for us?) Or are you worried more that the question won't be answered correctly by whatever will control our civilization?

Comment author: paulfchristiano 23 June 2017 06:53:13AM 0 points [-]

If you are concerned exclusively with suffering, then increasing the number of mature civilizations is obviously bad and you'd prefer that the average civilization not exist. You might think that our descendants are particularly good to keep around, since we hate suffering so much. But in fact almost all s-risks occur precisely because of civilizations that hate suffering, so it's not at all clear that creating "the civilization that we will become on reflection" is better than creating "a random civilization" (which is bad).

To be clear, even if we have modest amounts of moral uncertainty I think it could easily justify a "wait and see" style approach. But if we were committed to a suffering-focused view then I don't think your argument works.

Comment author: Wei_Dai 21 June 2017 10:19:48PM 3 points [-]

Do you think that it's clear/very likely that it is net helpful for there to be more mature suffering-hating civilizations? (On the suffering-focused perspective.)

My intuition is that there is no point in trying to answer questions like these before we know a lot more about decision theory, metaethics, metaphilosophy, and normative ethics, so pushing for a future where these kinds of questions eventually get answered correctly (and the answers make a difference in what happens) seems like the most important thing to do. It doesn't seem to make sense to try to lock in some answers now (i.e., make our civilization suffering-hating or not suffering-hating) to guard against the off chance that, by the time we figure out what the answers actually are, it will be too late. Someone with much less moral/philosophical uncertainty than I have would perhaps prioritize things differently, but I find it difficult to motivate myself to think really hard from their perspective.

Comment author: paulfchristiano 22 June 2017 04:36:46PM 0 points [-]

This question seems like a major input into whether x-risk reduction is useful.

Comment author: Wei_Dai 21 June 2017 11:17:20AM *  2 points [-]

I don't recall seeing any argument for s-risks being a particularly plausible category of risks, let alone one of the most important ones.

There was some discussion back in 2012 and sporadically since then. (ETA: You can also do a search for "hell simulations" and get a bunch more results.)

I never saw anyone draw the conclusion that "hey, this looks like an important subcategory of x-risks that warrants separate investigation and dedicated work to avoid".

I've always thought that in order to prevent astronomical suffering, we will probably want to eventually (i.e., after a lot of careful thought) build an FAI that will colonize the universe and stop any potential astronomical suffering arising from alien origins and/or try to reduce suffering in other universes via acausal trade etc., so the work isn't very different from other x-risk work. But now that the x-risk community is larger, maybe it does make sense to split out some of the more s-risk-specific work?

Comment author: paulfchristiano 21 June 2017 05:24:53PM 1 point [-]

I've always thought that in order to prevent astronomical suffering, we will probably want to eventually (i.e., after a lot of careful thought) build an FAI that will colonize the universe and stop any potential astronomical suffering arising from alien origins and/or try to reduce suffering in other universes via acausal trade etc., so the work isn't very different from other x-risk work.

It seems like the most likely reasons to create suffering come from the existence of suffering-hating civilizations. Do you think that it's clear/very likely that it is net helpful for there to be more mature suffering-hating civilizations? (On the suffering-focused perspective.)

Comment author: Kaj_Sotala 20 June 2017 08:53:09PM 2 points [-]

especially given that suffering-focused ethics seems to somehow be connected with distrust of philosophical deliberation

Can you elaborate on what you mean by this? People like Brian or others at FRI don't seem particularly averse to philosophical deliberation to me...

This also seems like an attractive compromise more broadly: we all spend a bit of time thinking about s-risk reduction and taking the low-hanging fruit, and suffering-focused EAs do less stuff that tends to lead to the destruction of the world.

I support this compromise and agree not to destroy the world. :-)

Comment author: paulfchristiano 21 June 2017 05:20:47PM 1 point [-]

Can you elaborate on what you mean by this? People like Brian or others at FRI don't seem particularly averse to philosophical deliberation to me...

People vary in what kinds of values change they would consider drift vs. endorsed deliberation. Brian has in the past publicly come down unusually far on the side of "change = drift," I've encountered similar views on one other occasion from this crowd, and I had heard second hand that this was relatively common.

Brian or someone more familiar with his views could speak more authoritatively to that aspect of the question, and I might be mistaken about the views of the suffering-focused utilitarians more broadly.

Comment author: cousin_it 20 June 2017 05:20:58PM *  4 points [-]

Paul, thank you for the substantive comment!

Carl's post sounded weird to me, because large amounts of human utility (more than just pleasure) seem harder to achieve than large amounts of human disutility (for which pain is enough). You could say that some possible minds are easier to please, but human utility doesn't necessarily value such minds enough to counterbalance s-risk.

Brian's post focuses more on possible suffering of insects or quarks. I don't feel quite as morally uncertain about large amounts of human suffering, do you?

As to possible interventions, you have clearly thought about this for longer than me, so I'll need time to sort things out. This is quite a shock.

Comment author: paulfchristiano 21 June 2017 05:09:01PM 6 points [-]

large amounts of human utility (more than just pleasure) seem harder to achieve than large amounts of human disutility (for which pain is enough).

Carl gave a reason that future creatures, including potentially very human-like minds, might diverge from current humans in a way that makes hedonium much more efficient. If you assigned significant probability to that kind of scenario, it would quickly undermine your million-to-one ratio. Brian's post briefly explains why you shouldn't argue "If there is a 50% chance that s-risks are 2 million times worse, then they are a million times worse in expectation." (I'd guess that there is a good chance, say > 25%, that good stuff can be as efficient as bad stuff.)
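
To see why that style of argument fails, it helps to separate the expected ratio, E(bad/good), from the ratio of expectations, E(bad)/E(good). Here is a toy calculation (a minimal sketch; the numbers are invented for illustration and are not taken from Carl's or Brian's posts):

```python
# Toy numbers (invented for illustration). Each scenario gives the moral value
# at stake in the best futures and the moral disvalue at stake in the worst
# s-risk futures, measured in a single common unit.
scenarios = [
    # (probability, best-future value, worst-s-risk disvalue)
    (0.5, 2_000_000, 2_000_000),  # good stuff turns out as efficient as bad stuff
    (0.5, 1,         2_000_000),  # bad stuff turns out ~2 million times more efficient
]

# Averaging the ratio across scenarios: E(bad / good).
expected_ratio = sum(p * bad / good for p, good, bad in scenarios)

# Taking expectations first, then the ratio: E(bad) / E(good).
ratio_of_expectations = (sum(p * bad for p, good, bad in scenarios) /
                         sum(p * good for p, good, bad in scenarios))

print(f"E(bad/good)    ~= {expected_ratio:,.0f}")         # ~1,000,000
print(f"E(bad)/E(good) ~= {ratio_of_expectations:,.0f}")  # ~2
```

On these toy numbers the expected ratio comes out around a million, while the ratio of expectations comes out around 2, which is why the former can be so misleading.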

I would further say: existing creatures often prefer to keep living even given the possibility of extreme pain. This can be easily explained by an evolutionary story, which suffering-focused utilitarians tend to view as a debunking explanation: given that animals would prefer to keep living regardless of the actual balance of pleasure and pain, we shouldn't infer anything from that preference. But our strong dispreference for intense suffering has a similar evolutionary origin, and is no more reflective of underlying moral facts than is our strong preference for survival.

Comment author: cousin_it 20 June 2017 01:46:47PM *  9 points [-]

Wow!

Many thanks for posting that link. It's clearly the most important thing I've read on LW in a long time; I'd upvote it ten times if I could.

It seems like an s-risk outcome (even one that keeps some people happy) could be more than a million times worse than an x-risk outcome, while not being a million times more improbable, so focusing on s-risks is correct. The argument wasn't as clear to me before. Does anyone have good counterarguments? Why shouldn't we all focus on s-risk from now on?

(Unsong had a plot point where Peter Singer declared that the most important task for effective altruists was to destroy Hell. Big props to Scott for seeing it before the rest of us.)

Comment author: paulfchristiano 20 June 2017 04:11:08PM *  17 points [-]

I don't buy the "million times worse," at least not if we talk about the relevant E(s-risk moral value) / E(x-risk moral value) rather than the irrelevant E(s-risk moral value / x-risk moral value). See this post by Carl and this post by Brian. I think that responsible use of moral uncertainty will tend to push you away from this kind of fanatical view.

I agree that if you accept the million-to-one ratio then you should be predominantly concerned with s-risk; I think s-risks are somewhat improbable/intractable, but not that improbable and intractable. I'd guess the probability is ~100x lower, and the available object-level interventions are perhaps 10x less effective. The particular scenarios discussed here seem unlikely to lead to optimized suffering; only "conflict" and "???" really make any sense to me. Even on the negative utilitarian view, it seems like you shouldn't care about anything other than optimized suffering.

The best object-level intervention I can think of is reducing our civilization's expected vulnerability to extortion, which seems poorly leveraged relative to alignment because it is much less time-sensitive (unless we fail at alignment and so end up committing to a particular and probably mistaken decision-theoretic perspective). From the perspective of s-riskers, it's possible that spreading strong emotional commitments to extortion-resistance (e.g. along the lines of UDT or this heuristic) looks somewhat better than spreading concern for suffering.
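
As a toy illustration of the extortion-resistance point (a crude sketch with invented numbers, not a claim about how the actual game theory shakes out): a would-be extorter only issues a threat when doing so has positive expected payoff, so driving the probability of capitulation low enough removes the incentive to threaten at all.

```python
# Crude toy model (all numbers invented): an extorter issues a threat only if
# its expected payoff is positive, assuming it follows through on refused
# threats in order to keep future threats credible.
def extorter_expected_payoff(p_capitulate, gain_if_paid, cost_of_follow_through):
    return p_capitulate * gain_if_paid - (1 - p_capitulate) * cost_of_follow_through

for p in (0.9, 0.5, 0.1, 0.0):
    payoff = extorter_expected_payoff(p, gain_if_paid=10, cost_of_follow_through=2)
    print(f"P(target gives in) = {p:.1f}: expected payoff {payoff:+.1f}, "
          f"threat issued: {payoff > 0}")
```

Below some threshold of expected capitulation the threat is never made in the first place, which is the sense in which credible extortion-resistance is different from merely refusing to pay after the fact.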

The meta-level intervention of "think about s-risk and understand it better / look for new interventions" seems much more attractive than any object-level interventions we yet know, and probably worth investing some resources in even if you take a more normal suffering vs. pleasure tradeoff. If this is the best intervention and is much more likely to be implemented by people who endorse suffering-focused ethical views, it may be the strongest incentive to spread suffering-focused views. I think that higher adoption of suffering-focused views is relatively bad for people with a more traditional suffering vs. pleasure tradeoff, so this is something I'd like to avoid (especially given that suffering-focused ethics seems to somehow be connected with distrust of philosophical deliberation). Ironically, that gives some extra reason for conventional EAs to think about s-risk, so that the suffering-focused EAs have less incentive to focus on value-spreading. This also seems like an attractive compromise more broadly: we all spend a bit of time thinking about s-risk reduction and taking the low-hanging fruit, and suffering-focused EAs do less stuff that tends to lead to the destruction of the world. (Though here the non-s-riskers should also err on the side of extortion-resistance, e.g. trading with the position of rational non-extorting s-riskers rather than whatever views/plans the s-riskers happen to have.)

An obvious first question is whether the existence of suffering-hating civilizations on balance increases s-risk (mostly by introducing game-theoretic incentives) or decreases s-risk (by exerting their influence to prevent suffering, esp. via acausal trade). If the latter, then x-risk reduction and s-risk reduction may end up being aligned. If the former, then at best the s-riskers are indifferent to survival and need to resort to more speculative interventions. Interestingly, in that case it may also be counterproductive for s-riskers to expand their influence or acquire resources. My guess is that mature suffering-hating civilizations reduce s-risk, since immature suffering-hating civilizations probably provide a significant part of the game-theoretic incentive yet have almost no influence, and sane suffering-hating civilizations will provide minimal additional incentives to create suffering. But I haven't thought about this issue very much.

Comment author: cousin_it 13 June 2017 05:23:02PM *  3 points [-]

Very impressive, I'm happy that Paul ended up there! There's still a lot of neural network black magic though. Stuff like this:

We use standard settings for the hyperparameters: an entropy bonus of β = 0.01, learning rate of 0.0007 decayed linearly to reach zero after 80 million timesteps (although runs were actually trained for only 50 million timesteps), n = 5 steps per update, N = 16 parallel workers, discount rate γ = 0.99, and policy gradient using Adam with α = 0.99 and ε = 10^-5.

For the reward predictor, we use 84x84 images as inputs (the same as the inputs to the policy), and stack 4 frames for a total 84x84x4 input tensor. This input is fed through 4 convolutional layers of size 7x7, 5x5, 3x3, and 3x3 with strides 3, 2, 1, 1, each having 16 filters, with leaky ReLU nonlinearities (α = 0.01). This is followed by a fully connected layer of size 64 and then a scalar output. All convolutional layers use batch norm and dropout with α = 0.5 to prevent predictor overfitting.

I know I sound like a retrograde, but how much of that is necessary and how much can be figured out from first principles?
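
For concreteness, the quoted reward-predictor description corresponds roughly to a network like the sketch below (PyTorch is assumed; the padding, the nonlinearity after the fully connected layer, and all names are illustrative guesses rather than details taken from the paper):

```python
import torch
import torch.nn as nn

class RewardPredictor(nn.Module):
    """Rough sketch of the reward predictor described in the quote above."""

    def __init__(self):
        super().__init__()
        channels = [4, 16, 16, 16, 16]  # 4 stacked 84x84 frames in, 16 filters per conv layer
        kernels = [7, 5, 3, 3]
        strides = [3, 2, 1, 1]
        layers = []
        for in_c, out_c, k, s in zip(channels[:-1], channels[1:], kernels, strides):
            layers += [
                nn.Conv2d(in_c, out_c, kernel_size=k, stride=s),
                nn.BatchNorm2d(out_c),
                nn.LeakyReLU(0.01),
                nn.Dropout(0.5),  # batch norm and dropout on every conv layer, per the quote
            ]
        self.conv = nn.Sequential(*layers)
        # Infer the flattened size from a dummy 84x84x4 input rather than hard-coding it.
        with torch.no_grad():
            flat = self.conv(torch.zeros(1, 4, 84, 84)).numel()
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat, 64),
            nn.LeakyReLU(0.01),
            nn.Linear(64, 1),  # scalar reward estimate
        )

    def forward(self, frames):  # frames: (batch, 4, 84, 84)
        return self.head(self.conv(frames))
```

Written out this way, most of the quoted numbers look like conventional defaults from the Atari deep-RL literature rather than anything specific to reward modeling.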

Comment author: paulfchristiano 14 June 2017 04:07:09PM 6 points [-]

If we view the goal as transforming (AI that works) ---> (AI that works and does what we want), then the black magic doesn't seem like a big deal. You just copy it from the (AI that works).

In this case, I also think that using almost any reasonable architecture would work fine.

Comment author: Benquo 08 May 2017 06:49:49AM *  0 points [-]

Private discussion is nearly as efficient as public discussion for information-transmission, but has way fewer political consequences.

If this is a categorical claim, then what are academic journals for? Should we ban the printing press?

If your claim is just that some public forums are too corrupted to be worth fixing, not a categorical claim, then the obvious thing to do is to figure out what went wrong, coordinate to move to an uncorrupted forum, and add the new thing to the set of things we filter out of our new walled garden.

Comment author: paulfchristiano 09 May 2017 03:31:54AM *  1 point [-]

I don't believe that academic journals are an efficient form of information transmission. Academics support academic journals (when they support academic journals) because journals serve other useful purposes.

Often non-epistemic consequences of words are useful, and often they aren't a big deal. I wouldn't use the word "corrupted" to describe "having political consequences"; it's the default state of human discussions.

Public discussion is sometimes much more efficient than private discussion. A central example is when the writer's time is much more valuable than the reader's time, or when it would be high-friction for the reader to buy the writer's time. (Though in this case, what's occurring isn't really discourse.) There are of course other examples.

Doing things like "writing down your thoughts carefully, and then reusing what you've written down" is important whether discussion occurs in public or private.

Comment author: tristanm 08 May 2017 02:17:49PM 0 points [-]

Can you define more precisely what you mean by "private discussion?" If by that you mean that all discourse is constrained to one-on-one conversations where the contents are not available to anyone else, I don't intuitively see how this would be less destructive and more collaborative. It seems to require that a lot of interactions must occur before every person is up to date on the collective group knowledge, and also that for each conversation there is a lossy compression going on - it's difficult for each conversation to carry the contents of each person's history of previous conversations.

On the other hand, if you're advocating for information to be filtered when transmitted beyond the trusted group, but flows freely within the trusted group, I believe that is less complicated and more efficient and I would have fewer objections to that.

Comment author: paulfchristiano 09 May 2017 03:21:15AM 0 points [-]

By "private discussion" I mean discussions amongst small groups, in contrast with discussions amongst large groups. Both of them occur constantly. I've claimed that in general political considerations cut in favor of having private discussions more often than you otherwise would, I didn't mean to be making a bold claim.

Comment author: Benquo 05 May 2017 06:21:23AM 0 points [-]

My intuition about whether some people are intrinsically bad (as opposed to bad at some things) is that the idea is an artifact of systems of dominance, like schools designed to create insecure attachment, and not a thing non-abused humans will think of on their own.

Comment author: paulfchristiano 08 May 2017 02:13:28AM 2 points [-]

I think this is very unlikely.
