Sebastian_Hagen

Superintelligence 29: Crunch time

I'll be there.

Sebastian_Hagen11y60

It's the most important problem of this time period, and likely human civilization as a whole. I donate a fraction of my income to MIRI.

Superintelligence 27: Pathways and enablers

Superintelligence 25: Components list for acquiring values

Which means that if we buy this [great filter derivation] argument, we should put a lot more weight on the category of 'everything else', and especially the bits of it that come before AI. To the extent that known risks like biotechnology and ecological destruction don't seem plausible, we should more fear unknown unknowns that we aren't even preparing for.

True in principle. I do think that the known risks don't cut it; some of them might be fairly deadly, but even in aggregate they don't look nearly deadly enough to contribute much to the great filter....

Superintelligence 23: Coherent extrapolated volition

This issue is complicated by the fact that we don't really know how much computation our physics will give us access to, or how relevant negentropy is going to be in the long run. In particular, our physics may allow access to (countably or more) infinite computational and storage resources given some superintelligent physics research.

For expected utility calculations, this possibility raises the usual issues of evaluating potential infinite utilities. Regardless of how exactly one decides to deal with those issues, the existence of this possibility does shift things in favor of prioritizing safety over speed.

Superintelligence 23: Coherent extrapolated volition

I used "invariant" here to mean "moral claim that will hold for all successor moralities".

A vastly simplified example: at t=0, morality is completely undefined. At t=1, people decide that death is bad, and lock this in indefinitely. At t=2, people decide that pleasure is good, and lock that in indefinitely. Etc.

An agent operating in a society that develops morality like that, looking back, would want all the accidents that led to current morality to be maintained, but looking forward may not particularly care about how the rem...

Superintelligence 23: Coherent extrapolated volition

That does not sound like much of a win. Present-day humans are really not that impressive, compared to the kind of transhumanity we could develop into. I don't think trying to reproduce entities close to our current mentality is worth doing, in the long run.

Sebastian_Hagen11y50

While that was phrased in a provocative manner, there /is/ an important point here: If one has irreconcilable value differences with other humans, the obvious reaction is to fight about them; in this case, by competing to see who can build an SI implementing theirs first.

I very much hope it won't come to that, in particular because that kind of technology race would significantly decrease the chance that the winning design is any kind of FAI.

In principle, some kinds of agents could still coordinate to avoid the costs of that kind of outcome. In practice, our species does not seem to be capable of coordination at that level, and it seems unlikely that this will change pre-SI.

Superintelligence 23: Coherent extrapolated volition

True, but it would nevertheless make for a decent compromise. Do you have a better suggestion?

Superintelligence 23: Coherent extrapolated volition

allocating some defense army patrol keeping the borders from future war?

Rather than use traditional army methods, it's probably more efficient to have the SI play the role of Sysop in this scenario, and just deny human actors access to base-layer reality; though if one wanted to allow communication between the different domains, the Sysop may still need to run some active defense against high-level information attacks.

That seems wrong.

As a counterexample, consider a hypothetical morality development model where, as history advances, human morality keeps accumulating invariants in a largely unpredictable (chaotic) fashion. In that case, modern morality would have more invariants than that of earlier generations. You could implement a CEV from any time period, but earlier time periods would lead to some consequences that by present standards are very bad, and would predictably remain very bad in the future; nevertheless, a present-humans CEV would still work just fine.

Perhaps. But it is a desperate move, both in terms of predictability and in terms of the likely mind crime that would result from its implementation, since the conceptually easiest and most accurate ways to model other civilizations would involve fully simulating the minds of their members.

If we had to do it, I would be much more interested in aiming it at slightly modified versions of humanity as opposed to utterly alien civilizations. If everyone in our civilization had taken AI safety more seriously, and we could have coordinated to wait a few hundred yea...

I agree, the actual local existence of other AIs shouldn't make a difference, and the approach could work equally either way. As Bostrom says on page 198, no communication is required.

Nevertheless, for the process to yield a useful result, some possible civilization would have to build a non-HM AI. That civilization might be (locally speaking) hypothetical or simulated, but either way the HM-implementing AI needs to think of it to delegate values. I believe that's what footnote 25 gets at: From a superrational point of view, if every possible civilization (or every one imaginable to the AI we build) at this point in time chooses to use an HM approach to value coding, it can't work.

Powerful AIs are probably much more aware of their long-term goals and able to formalize them than a heterogeneous civilization is. Deriving a comprehensive morality for post-humanity is really hard, and indeed CEV is designed to avoid the need to have humans do that. Doing it for an arbitrary alien civilization would likely not be any simpler.

Whereas with powerful AIs, you can just ask them which values they would like implemented and probably get a good answer, as proposed by Bostrom.

Superintelligence 20: The value-loading problem

Sebastian_Hagen11y40

The Hail Mary and Christiano's proposals, simply for not having read about them before.

Sebastian_Hagen11y00

Davis massively underestimates the magnitude and importance of the moral questions we haven't considered, which renders his approach unworkable.

I feel safer in the hands of a superintelligence who is guided by 2014 morality, or for that matter by 1700 morality, than in the hands of one that decides to consider the question for itself.

I don't. Building a transhuman civilization is going to raise all sorts of issues that we haven't worked out, and do so quickly. A large part of the possible benefits are going to be contingent on the controlling system be...

Superintelligence 20: The value-loading problem