xpym

Comments

xpym30

Is this related to the bounty, or a separate project?

xpym30

Furthermore, most of these problems can be addressed just fine in a Bayesian framework. In Jaynes-style Bayesianism, every proposition has to be evaluated in the scope of a probabilistic model; the symbols in propositions are scoped to the model, and we can’t evaluate probabilities without the model. That model is intended to represent an agent’s world-model, which for realistic agents is a big complicated thing.

It still misses the key issue of ontological remodeling: if the world-model is inadequate for expressing a proposition, no meaningful probability can be assigned to it.
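
A toy sketch of what I mean (my own illustration; the variables and numbers below are made up): a probability query is only defined over propositions the model's ontology can express, and a proposition outside that ontology simply gets no answer.

```python
# Toy world-model (hypothetical variables and numbers): two binary variables
# with an explicit joint distribution.
MODEL_VARS = ("rain", "sprinkler")
JOINT = {
    (True, True): 0.05,
    (True, False): 0.25,
    (False, True): 0.20,
    (False, False): 0.50,
}

def prob(proposition):
    """P(proposition) under the model; the proposition only sees the model's variables."""
    total = 0.0
    for values, p in JOINT.items():
        world = dict(zip(MODEL_VARS, values))
        if proposition(world):
            total += p
    return total

print(prob(lambda w: w["rain"]))   # 0.3, expressible within the ontology, so well-defined
# prob(lambda w: w["phlogiston"])  # KeyError: the model has no such variable,
#                                  # so there is no probability to assign at all
```

The failing query isn't "improbable", it isn't even well-formed within the model; fixing that requires changing the model itself, which is the ontological remodeling part.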

xpym64

Killing oneself with high certainty of effectiveness is more difficult than most assume.

Dying naturally also isn't as smooth as plenty of people assume. I'm pretty sure that "taking things into your own hands" leads to a greater expected reduction in suffering in most cases, and it's not informed rational analysis that prevents people from taking that option.

If a future hostile agent just wants to maximize suffering, will foregoing preservation protect you from it?

Yes? I mean, unless we entertain some extreme abstractions, like it simulating all possible minds of a certain complexity or whatever.

xpym10

This isn’t really a problem with alignment

I'd rather put it this way: resolving that problem is a prerequisite for the notion of an "alignment problem" to be meaningful in the first place. It's not technically a contradiction to have an "aligned" superintelligence that does nothing, but clearly nobody would in practice be satisfied with that.

xpym10

Because humans have incoherent preferences, and it's unclear whether a universal resolution procedure is achievable. I like how Richard Ngo put it, "there’s no canonical way to scale me up".
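
A minimal illustration of the "incoherent preferences" point (my own toy example, with made-up options): a cyclic preference relation admits no consistent ranking, so no utility function, and hence no canonical "scaled-up" version of the person, can respect it.

```python
from itertools import permutations

# Hypothetical cyclic preferences: pizza over sushi, sushi over salad, salad over pizza.
PREFERS = {("pizza", "sushi"), ("sushi", "salad"), ("salad", "pizza")}

def consistent_ranking(options, prefers):
    """Return a total order respecting every pairwise preference, or None if none exists."""
    for order in permutations(options):
        rank = {opt: i for i, opt in enumerate(order)}  # earlier position = more preferred
        if all(rank[a] < rank[b] for a, b in prefers):
            return order
    return None

print(consistent_ranking(["pizza", "sushi", "salad"], PREFERS))  # None: no coherent ordering exists
```

Any resolution procedure has to break such cycles somewhere, and where it breaks them is exactly the non-canonical part.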

xpym10

Hmm, right. You need only assume that there are coherent, reachable, desirable outcomes. I'm doubtful that such an assumption holds, but most people probably aren't.

xpym30

We’ll say that a state is in fact reachable if a group of humans could in principle take actions with actuators - hands, vocal chords, etc - that could realize that state.

The main issue here is that groups of humans may in principle be capable of a great many things, but there's a vast chasm between "in principle" and "in practice". A superintelligence worthy of the name would likely be able to come up with plans that we couldn't, in practice, even check exhaustively, which is exactly the sort of issue we want alignment for.

xpym10

I think that saying that "executable philosophy" has failed is missing Yudkowsky's main point. Quoting from the Arbital page:

To build and align Artificial Intelligence, we need to answer some complex questions about how to compute goodness

He claims that unless we learn how to translate philosophy into "ideas that we can compile and run", aligned AGI is out of the question. This is not a worldview, but an empirical proposition, the truth of which remains to be determined.

There's also an adjacent worldview, which suffuses the Sequences: that it's possible in the relatively short term to become much more generally "rational" than even the smartest uninitiated people, "faster than science", etc., and that this is chiefly rooted in Bayes, Solomonoff & Co. It's fair to conclude that this has largely failed, and IMO Chapman makes a convincing case that this failure was unavoidable. (He also annoyingly keeps hinting that there's a supremely fruitful "meta-rational" worldview instead, which he's about to reveal to the world. Any day now. I'm not holding my breath.)

xpym132

the philosophy department thinks you should defect in a one-shot prisoners’ dilemma

Without further qualifications, shouldn't you? There are plenty of crazy mainstream philosophical ideas, but this seems like a strange example.
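
For concreteness, here's the standard dominance argument with conventional payoffs (the specific numbers are my own illustrative choice); this is all the unqualified version of the claim relies on.

```python
# Conventional one-shot prisoner's dilemma payoffs (years in prison, lower is better).
PAYOFF = {  # (my_move, their_move) -> (my_years, their_years)
    ("cooperate", "cooperate"): (1, 1),
    ("cooperate", "defect"):    (3, 0),
    ("defect",    "cooperate"): (0, 3),
    ("defect",    "defect"):    (2, 2),
}

def best_response(their_move):
    """My move that minimizes my own sentence, holding the other player's move fixed."""
    return min(("cooperate", "defect"), key=lambda m: PAYOFF[(m, their_move)][0])

print(best_response("cooperate"))  # defect
print(best_response("defect"))     # defect: defection strictly dominates either way
```

Once you add qualifications, e.g. playing against a copy of yourself, credible precommitment, or correlated decision procedures, the case changes, but that's exactly the "further qualifications" caveat.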

xpym30

Yes, I buy the general theory that he was bamboozled by misleading maps. My claim is that this is precisely the kind of situation where a compass should have been enough to show that something had gone wrong, early enough for the situation to still be salvageable, in a way that sun clues plausibly wouldn't have.
