All of Nathan1123's Comments + Replies

I didn't mean it to be so simplistic. I am just considering that if there is a known limitation of AI, no matter how powerful it is, then that limitation could be used as the basis of a system an AI could not circumvent. For example, a shutdown system that could only be disabled by solving the halting problem.
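To make the appeal to the halting problem concrete, here is a minimal sketch (my own illustration, not from the thread) of the classic diagonal argument: for *any* candidate halting oracle, we can build a program the oracle misjudges, which is why "disable this only by solving the halting problem" would be a hard gate for any computable agent.

```python
# A minimal sketch of the diagonalization argument behind the halting problem.
# `halts` is any purported oracle: it takes a zero-argument program and
# claims True ("it halts") or False ("it loops forever").

def diagonal(halts):
    """Build a program that does the opposite of whatever `halts` predicts."""
    def paradox():
        if halts(paradox):
            while True:      # oracle said "halts" -> loop forever
                pass
        # oracle said "loops" -> return immediately
    return paradox

def oracle_is_wrong(halts):
    """Show that `halts` misjudges its own diagonal program."""
    p = diagonal(halts)
    verdict = halts(p)
    if not verdict:
        p()                  # safe to run: this branch returns immediately
        actually_halts = True
    else:
        actually_halts = False  # p would loop forever, so we don't run it
    return verdict != actually_halts

# Every naive candidate oracle fails on its own diagonal program:
print(oracle_is_wrong(lambda prog: True))   # True (the oracle was wrong)
print(oracle_is_wrong(lambda prog: False))  # True (the oracle was wrong)
```

The same construction defeats any candidate oracle, however sophisticated, which is the sense in which no AI, regardless of capability, can decide halting in general.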

2Viliam
If you knew how to build such a shutdown system, you could probably also build one that cannot be disabled at all (e.g., one that would require solving a literally impossible problem, like proving that 1 = 0).

While this is an example of international cooperation in the face of mutually-assured destruction, there is some historical context that shows why this was effective, in my opinion: First, because the destructive power of nuclear weapons had already been demonstrated at Hiroshima and Nagasaki. Second, because the US no longer had a monopoly on nuclear weapons and determined that overcoming the USSR by force was no longer an option. Third, because the world was still recovering from the Second World War and shared a universal desire for peaceful resolutions.

The implication that I didn't think to spell out is that the AI should be programmed with the capacity for empathy. It's more a proposal of system design than of governance. Granted, the specifics of that design would be its own discussion entirely.

You were the chosen one, Anakin!

I think the harsh truth is that no one cared about nuclear weapons until Hiroshima was bombed. The concept of one nation "disarming" AI will never be appreciated until somebody gets burned.

1lariat
We cared about the Nazis not getting nuclear weapons before us. I am sure that if, after WW2, we had agreed with the Soviets to pause nuclear research and not develop the hydrogen bomb, both sides would have signed the treaty and continued research covertly while hoping the other side stuck to it. I don't think you need game theory to figure out that neither side could take the risk of not researching. It seems incredibly naive to believe this exact process would not also play out with AI.

I wonder if you could expand more on this observation. So you are saying that a dream is operating on a very limited dataset on a person, not an exact copy of information ("full description"). Do I understand right?

I sort of do intend it as a kind of reductio, unless people find reason for this "Dream Hypothesis" to be taken seriously.

4Charlie Steiner
> So you are saying that a dream is operating on a very limited dataset on a person, not an exact copy of information ("full description"). Do I understand right?

Slightly different - I mean that when I talk about someone appearing in my dream, I am being somewhat loose with the definition of that "someone." Like, I would agree that my dream of the person is much smaller and is a poor copy of the real person. But the thing I was trying to point at is the broad definition by which I might equivocate between them, e.g. calling them by the same name.

I don't see anything in that scenario that prevents a human-level AGI from using a collection of superintelligent tool AIs with a better interface to achieve feats of intelligence that humans cannot, even with the same tool AIs.

At that point, it wouldn't functionally be different from a series of tool AIs being controlled directly by a human operator. If that poses a risk, then mitigations could be extrapolated to the combined-system scenario.

What fundamental law of the universe would set a limit right there, out of all possible capacities across every

... (read more)

Isn't a deceptive agent the hallmark of unfriendly AI? In what scenarios does a dishonest agent reflect a good design?

Of course, I didn't mean to say that TDT always keeps its promises, just that it is capable of doing so in scenarios like Parfit's Hitchhiker, where CDT is not.

C,C is second-best: you prefer D,C, and Nash says D,D is all you should expect. C,C is definitely better than C,D or D,D, so in the special case of symmetrical decisions, it's winning. It bugs me as much as you that this part gets glossed over so often.
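The payoff ordering being described can be written down as a quick sketch (the specific numbers T=5, R=3, P=1, S=0 are the standard illustrative values, assumed here, not from the thread):

```python
# Prisoner's Dilemma payoffs for the row player, keyed by (my_move, their_move).
# Standard ordering T > R > P > S, with illustrative values 5 > 3 > 1 > 0.
payoff = {
    ("D", "C"): 5,  # temptation: I defect, they cooperate (what you prefer)
    ("C", "C"): 3,  # reward: mutual cooperation (second-best)
    ("D", "D"): 1,  # punishment: mutual defection (the Nash outcome)
    ("C", "D"): 0,  # sucker: I cooperate, they defect
}

# The ordering described above: D,C > C,C > D,D > C,D
assert payoff[("D", "C")] > payoff[("C", "C")] > payoff[("D", "D")] > payoff[("C", "D")]

# In the special symmetric case, where both players are forced to make the
# same move, mutual cooperation is the winning choice:
symmetric = {m: payoff[(m, m)] for m in ("C", "D")}
best_symmetric = max(symmetric, key=symmetric.get)
print(best_symmetric)  # "C"
```

The symmetric restriction is doing all the work: once D,C is off the table, C,C dominates, which is exactly why the "both sides behave similarly" assumption matters below.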

I see what you mean, it works as long as both sides have roughly similar behavior.

Counterfactual Mugging is a win to pay off, in a universe where that sort of thing happens. You really do want to be correctly predicted to pay off, and enjoy the $10K in those cases where the coin goes your way.

For me... (read more)
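The "win to pay off" claim above can be checked with a one-line expected-value calculation (assuming the standard Counterfactual Mugging setup: a fair coin, $100 demanded on tails, and a $10,000 reward on heads iff you are correctly predicted to pay):

```python
# Expected value of a policy in Counterfactual Mugging, under the standard
# assumed numbers: fair coin, $100 cost on tails, $10,000 reward on heads
# paid only to agents predicted to pay up on tails.
def expected_value(pays_on_tails: bool, reward=10_000, cost=100) -> float:
    heads = reward if pays_on_tails else 0  # predictor rewards the paying policy
    tails = -cost if pays_on_tails else 0
    return 0.5 * heads + 0.5 * tails

print(expected_value(True))   # 4950.0 -> the paying policy wins in expectation
print(expected_value(False))  # 0.0
```

The point is that the comparison is between policies fixed before the coin flip, not between actions taken after seeing tails.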

3Vladimir_Nesov
It's less than that, you don't know that you are real and not the hypothetical. If you are the hypothetical, paying up is useful for the real one. This means that even if you are the real one (which you don't know), you should pay up, or else the hypothetical you wouldn't. Winning behavior/policy is the best map from what you observe/know to decisions, and some (or all) of those observations/knowledge never occur, or even never could occur.

That's a very interesting and insightful dissection of the problem. Do you think there might be a problem in the post that I copied the thought experiment from (which said that CDT presses, and EDT doesn't), or did I make a mistake of taking it out of context?

3Vladimir_Nesov
The context seems to be

* A. Egan (2007), "Some Counterexamples to Causal Decision Theory"

There, it's related to Smoking Lesion, which has a tradition of interpreting it that suggests how to go about interpreting "only a psychopath would press such a button" as well. But that tradition is also convoluted (see "tickle defense"; it might be possible to contort this into an argument that EDT recommends pressing the button in Psychopath Button, not sure).

Ok, if the button is thought of as the "second agent", then I would guess TDT would not press it. TDT would reason that the button will decide that whoever presses it is a psychopath, and therefore Paul should precommit to not pressing the button. Is that the right way to approach it?

When it comes to AI regulation, a certain train of thought comes to my mind:

  1. Because a superintelligent AI has never existed, we can assume that creating one requires an enormous amount of energy and resources.
  2. Due to global inequality, certain regions of the world have vastly more access to energy and resources than others.
  3. Therefore, when creating an AGI becomes possible, only a couple of regions of the world (and only a small number of people in these regions) will have the capability of doing so.

Therefore, enforcement of AI regulations... (read more)