acgt — LessWrong

LESSWRONG
LW

acgt — LessWrong

Replying toCounterarguments to the basic AI x-risk case

Counterarguments to the basic AI x-risk case

I predict that this would look at least as weird and nonhuman as those deep dream images if not more so

This feels like something we should just test? I don’t have access to any such model but presumably someone does and can just run the experiment? Bcos it seems like peoples hunches are varying a lot here

Replying toCounterarguments to the basic AI x-risk case

acgt3y

Counterarguments to the basic AI x-risk case

if you ask them for the faciest possible thing, it's not very human!facelike

Is this based on how these models actually behave or just what the OP expects? Because is seems to just be begging the question if the latter

Replying toNate Soares on the Ultimate Newcomb's Problem

acgt4y

Nate Soares on the Ultimate Newcomb's Problem

This doesn’t seem true, at least in the sense of strict ranking? In the EDT case: if Omega’s policy is to place a prime in Box 1 whenever Omicron chooses a composite number (instead of matching Omicron when possible), then it predicts the EDT agent will choose only Box 1 and so is a stable equilibrium. But since it also always places a different prime whenever Omicron chooses a prime, EDT never sees matching numbers and so always one-boxes, therefore its expected earnings are no less than FDT

Replying toWhat are the best elementary math problems you know?

acgt4y

What are the best elementary math problems you know?

The answer to all of them is 1/e?

-1

Replying toWhat are the best elementary math problems you know?

acgt4y

What are the best elementary math problems you know?

Curious what the solution to this one is? Couldn’t figure it out

Replying toIt Looks Like You're Trying To Take Over The World

acgt4y*

It Looks Like You're Trying To Take Over The World

Doesn’t this argument also work against the idea that they would self-modify in the “normal” finite way? It can’t currently represent the number which it’s building a ton of new storage to help contain, so it can’t make a pairwise comparison to say the latter is better, nor can it simulate the outcome of doing this and predict the reward it would get

Maybe you say it’s not directly making a pairwise comparison but making a more abstract step of reasoning like “I can’t predict that number but I know it’s gonna be bigger that what I have now, me with augmented memory will still be aligned with me in terms of its... (read more)

Replying toIt Looks Like You're Trying To Take Over The World

acgt4y

It Looks Like You're Trying To Take Over The World

This is a really interesting point. It seems like it goes even further - if the agent was only trying to maximise future expected reward, not only would it be ambivalent between temporary and permanent “Nirvana”, it would be ambivalent between strategies which achieved Nirvana with arbitrarily different probabilities right (maybe with some caveats about how it would behave if it predicted the strategy might lead to negative-infinite states)

So if a sufficiently fleshed out agent is going to assign a non-zero probability of Nirvana to every - or at least most - strategies since it’s not impossible, then won’t our agent just suddenly become incredibly apathetic and just sit there as soon... (read more)

Replying toI Really Don't Understand Eliezer Yudkowsky's Position on Consciousness

acgt4y

I Really Don't Understand Eliezer Yudkowsky's Position on Consciousness

Your comments here and some comments Eliezer had made elsewhere seem to imply he believes he has at least in large party “solved” consciousness. Is this fair? And if so is there anywhere he has written up this theory/analysis in depth - because surely if correct this would be hugely important

I’m kind of assuming that whatever Eliezer’s model is, the bulk of the interestingness isn’t contained here and still needs to be cashed out, because the things you/he list (needing to examine consciousness through the lens of the cognitive algorithms causing our discussions of it, the centrality of self-modely reflexive things to consciousness etc.) are already pretty well explored and understood in... (read more)

-1

Replying toQuestion/Issue with the 5/10 Problem

acgt4y

Question/Issue with the 5/10 Problem

Yeah sure, like there's a logical counterfactual strand of the argument but that's not the topic I'm really addressing here - I find those a lot less convincing so my issue here is around the use of Lobian uncertainty specifically. There's an step very specific to this species of argument that proving that □P will make P true when P is about the outcomes of the bets, because you will act based on the proof of P.

This is invoking Lob's Theorem in a manner which is very different from the standard counterpossible principle of explosion stuff. And I'm really wanting to discuss that step specifically because I don't think it's valid, and if the above argument is still representative of at least a strand of relevant argument then I'd be grateful for some clarification on how (3.) is supposed to be provable by the agent, or how my subsequent points are invalid.

Question/Issue with the 5/10 Problem

acgt

I'm not sure if the 5/10 problem and the surrounding Löbian uncertainty is still an issue/area of research, but I've been struggling with the validity of this argument lately - I doubt this is a novel point so if this is addressed somewhere I'd appreciate being corrected. On the off chance that this hasn't been explained elsewhere I'd be really interested to hear peoples' thoughts

The proof as I understand it is roughly as below, with "A" referring to the agent's output, "U" referring to the environment's output, and "□" standing for the provability predicate:

(A = 5) →(U= 5)
(A = 5) →((A = 10)→(U=0))
□(((A = 5) → (U = 5) ∧ ((A =

... (read 705 more words →)

Replying toA Semitechnical Introductory Dialogue on Solomonoff Induction

acgt4y

A Semitechnical Introductory Dialogue on Solomonoff Induction

I think there’s a sense in which some problems can be uncomputable even with infinite compute no? For example if the Halting problem were computable even with literally infinite time, then we could construct a machine that halted when given its own description iff it ran forever when given its own description. I do think theres a distinction beyond just “arbitrarily large finite compute vs. infinite compute”. It seems like either some problems have to be uncomputable even by a hyper-computer, or else the concept of infinite compute time is less straightforward than it seems

I totally agree on your other points though, I think the concept of bounded Solomonoff induction could be interesting in itself, although I presume with it you lose all the theoretical guarantees around bounded error. Would definitely be interested to see if there’s literature on this