owencb

Sequences

On Wholesomeness

Comments

owencb30

It seems fine to me to have the goalposts moving, but then I think it's important to trace through the implications of that. 

Like, if the goalposts can move then this seems like perhaps the most obvious way out of the predicament; to keep the goalposts ever ahead of AI capabilities. But when I read your post I get the vibe that you're not imagining this as a possibility?

owencb50

If we are going to build these agents without "losing the game", either (a) they must have goals that are compatible with human interests, or (b) we must (increasingly accurately) model and enforce limitations on their capabilities. If there's a day when an AI agent is created without either of these conditions, that's the day I'd consider humanity to have lost.

Something seems funny to me here.

It might be to do with the boundaries of your definition. If human agents are getting empowered by strategically superhuman (in an everyday sense) AI systems (agentic or otherwise), perhaps that raises the bar for what counts as superhuman for the purposes of this post? If so I think the argument would make sense to me, but it feels a bit funny to me to have this definition which is such a moving goalpost, and also might never get crossed even as AI gets arbitrarily powerful.

Alternatively, it might be that your definition is kind of an everyday one, but in that case your conclusion seems pretty surprising. Like it seems easy to me to imagine worlds where there are some agents without either of those conditions, but that they're not better than the empowered humans.

Or perhaps something else is going on. Just trying to voice my confusions. 

I do appreciate the attempt to analyse which kinds of capabilities are actually crucial.

owencb20

It's been a long time since I read those books, but if I'm remembering roughly right: Asimov seems to describe a world where choice is in a finely balanced equilibrium with other forces (I'm inclined to think: implausibly so -- if it could manage this level of control at great distances in time, one would think that it could manage to exert more effective control over things at somewhat less distance).

owencb50

I've now sent emails contacting all of the prize-winners.

owencb20

Actually, on 1) I think that these consequentialist reasons are properly just covered by the later sections. That section is about reasons it's maybe bad to make the One Ring, ~regardless of the later consequences. So it makes sense to emphasise the non-consequentialist reasons.

I think there could still be some consequentialist analogue of those reasons, but they would be more esoteric, maybe something like decision-theoretic, or appealing to how we might want to be treated by future AI systems that gain ascendancy.

owencb20

  1. Yeah. As well as another consequentialist argument, which is just that it will be bad for other people to be dominated. Somehow the arguments feel less natively consequentialist, and so it seems somehow easier to hold them in these other frames, and then translate them into consequentialist ontology if that's relevant; but also it would be very reasonable to mention them in the footnote.
  2. My first reaction was that I do mention the downsides. But I realise that that was a bit buried in the text, and I can see that that could be misleading about my overall view. I've now edited the second paragraph of the post to be more explicit about this. I appreciate the pushback.

owencb30

Ha, thanks!

(It was part of the reason. Normally I'd have made the effort to import, but here I felt a bit like maybe it was just slightly funny to post the one-sided thing, which nudged against linking rather than posting; and also I thought I'd take the opportunity to see experimentally whether it seemed to lead to less engagement. But those reasons were not overwhelming, and now that you've put the full text here I don't find myself very tempted to remove it. :) )

owencb82

The judging process should be complete in the next few days. I expect we'll write to winners at the end of next week, although it's possible that will be delayed. A public announcement of the winners is likely to be a few more weeks.

owencb40

I don't see why (1) says you should be very early. Isn't the decrease in measure for each individual observer precisely outweighed by their increasing multitudes?

owencb73

This kind of checks out to me. At least, I agree that it's evidence against treating quantum computers as primitive that humans, despite living in a quantum world, find classical computers more natural.

I guess I feel more like I'm in a position of ignorance, though, and wouldn't be shocked to find some argument that quantum has in some other a priori sense a deep naturalness which other niche physics theories lack.
