Unless you can explain to me how prediction markets are going to break the pattern that two different shares of the same stock have correlated prices.
I'm actually not sure how prediction markets are supposed to have an effect on this issue. My issue is not that people have too much difficulty recognizing patterns. My issue is that some patterns, once recognized, do not provide incentives to make them disappear. Unless you can tell me how prediction markets might fix this problem, your response seems like a bit of a non-sequitur.
This seems like too general a principle. I agree that in many circumstances, public knowledge of a pattern in pricing will lead to effects causing that pattern to disappear. However, it is not clear to me that this is always the case, or that the size of the effect will be sufficient to completely cancel out the original observation.
For example, I observe that two different units of Google stock have prices that are highly correlated with each other. I doubt that this observation will cause separate markets to spring up giving wildly divergent prices to diffe...
We probably couldn't even talk ourselves out of this box.
I don't know... That sounds a lot like what an AI trying to talk itself out of a box would say.
Hmm... I would probably explain the threshold for staying in the house not as an implicit expected probability computation, but as an evaluation of the price of the discomfort associated with staying in a location that you find spooky. At least for me, I think that the part of my mind that knows that ghosts do not exist would have no trouble controlling whether or not I remain in the house. However, it might well decide that it is not worth the $10 that I would receive to spend the entire night in a place where some other piece of my mind is constantly yelling at me to run away screaming.
It's just that such self-referential criteria as reflective equilibrium are a necessary condition
Why? The only examples of adequately friendly intelligent systems that we have (i.e. us) don't meet this condition. Why should reflective equilibrium be a necessary condition for FAI?
That may be true (at least to the degree to which it is sensible to assign a specific cause to a given util). However, it is not very good evidence that investment in first world economies is the most effective way to generate utils in Africa.
OK. So suppose that I grant your claim that donations to sub-Saharan Africa will not substantially affect the size of the future economic pie, but that other investments will. I claim that there may still be reason to donate there.
I grant that such a donation will produce fewer dollars of value than investing in capital infrastructure. On the other hand, dollars are not the objective; utils are. We can reasonably assume that the marginal utility of an extra dollar for a given person decreases as that person's wealth increases. We can reasonably expect that w...
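To make the point concrete, here is a toy comparison, with entirely made-up numbers and assuming log utility just for illustration:

```python
import math

# Toy comparison (invented numbers, assuming log utility): a $1,000 transfer to
# someone living on $500/year versus an investment that eventually delivers
# $3,000 to someone earning $50,000/year.
def log_utility_gain(wealth, transfer):
    return math.log(wealth + transfer) - math.log(wealth)

print(log_utility_gain(500, 1_000))     # ~1.10: triples a very small income
print(log_utility_gain(50_000, 3_000))  # ~0.06: barely moves a large one
```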
[I realize that I missed the train and probably very few people will read this, but here goes]
So in non-iterated prisoner's dilemma, defect is a dominant strategy. No matter what the opponent is doing, defecting will always give you the best possible outcome. In iterated prisoner's dilemma, there is no longer a dominant strategy. If my opponent is playing Tit-for-Tat, I get the best outcome by cooperating in all rounds but the last. If my opponent ignores what I do, I get the best outcome by always defecting. It is true that all defects is the unique Nash ...
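A quick simulation sketch of that point, using the standard 3/1/5/0 payoffs (the numbers are my assumption, not from the discussion):

```python
# Payoffs: 3 each for mutual cooperation, 1 each for mutual defection,
# 5/0 when one player defects against a cooperator (assumed, standard values).
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def play(my_strategy, their_strategy, rounds=10):
    mine, theirs, score = [], [], 0
    for _ in range(rounds):
        a = my_strategy(mine, theirs)
        b = their_strategy(theirs, mine)
        mine.append(a); theirs.append(b)
        score += PAYOFF[(a, b)]
    return score

tit_for_tat    = lambda own, opp: 'C' if not opp else opp[-1]
always_defect  = lambda own, opp: 'D'
nice_till_last = lambda own, opp: 'D' if len(own) == 9 else 'C'  # defect only on round 10

# Against Tit-for-Tat, cooperating until the last round beats always defecting;
# against a strategy that ignores you, always defecting is better.
print(play(nice_till_last, tit_for_tat))    # 32
print(play(always_defect,  tit_for_tat))    # 14
print(play(always_defect,  always_defect))  # 10
print(play(nice_till_last, always_defect))  #  1
```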
I think that the way that humans predict other humans is the wrong way to look at this, and that we should instead consider how humans would reason about the behavior of an AI that they build. I'm not proposing simply "don't use formal systems", or even "don't limit yourself exclusively to a single formal system". I am actually alluding to a far more specific procedure:
Yes, obviously. We solve the Lobstacle by not ourselves running on formal systems and sometimes accepting axioms that we were not born with (things like PA). Allowing the AI to only do things that it can prove will have good consequences using a specific formal system would make it dumber than us.
Actually, why is it that when the Lobian obstacle is discussed, it always seems to be in reference to an AI trying to determine if a successor AI is safe, and not an AI trying to determine whether or not it, itself, is safe?
Question: If we do manage to build a strong AI, why not just let it figure this problem out on its own when trying to construct a successor? Almost definitionally, it will do a better job of it than we will.
Relatedly, with your interview example, I think that perhaps a better model is that whether a person is confident or shy does not depend on whether they believe that they will be bold, but on the degree to which they care about being laughed at. If you are confident, you don't care about being laughed at and might as well be bold. If you are afraid of being laughed at, you already know that you are shy and thus do not gain anything by being bold.
I think my bigger point is that you don't seem to make any real argument as to which case we are in. For example, consider the following model of how people's perception of my trustworthiness might be correlated with my actual trustworthiness. There are two causal chains: (1) My values -> Things I say -> People's perceptions, and (2) My values -> My actions. So if I value trustworthiness, I will not, for example, talk much about wanting to avoid being a sucker (in contexts where being a sucker would refer to doing trustworthy things). This will influence people's perceptions...
Newcomblike problems occur whenever knowledge about what decision you will make leaks into the environment. The knowledge doesn't have to be 100% accurate, it just has to be correlated with your eventual actual action.
This is far too general. The way in which information is leaking into the environment is what separates Newcomb's problem from the smoking lesion problem. For your argument to work you need to argue that whatever signals are being picked up on would change if the subject changed their disposition, not merely that these signals are correlated with the disposition.
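As a toy illustration of that distinction (an invented model, not anything from the original setup):

```python
import random

def newcomb_like_signal(disposition):
    # The leaked signal is caused by your disposition, so intervening on the
    # disposition changes the signal the predictor picks up on.
    return disposition

def lesion_like_signal(lesion, disposition):
    # The leaked signal comes from a common cause (the lesion) and is merely
    # correlated with the disposition; changing the disposition leaves it alone.
    return lesion

lesion = random.random() < 0.5
natural_disposition = lesion                 # correlated via the common cause
changed_disposition = not natural_disposition

print(newcomb_like_signal(changed_disposition) == changed_disposition)         # True
print(lesion_like_signal(lesion, changed_disposition) == changed_disposition)  # False
```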
Sorry. I'm not quite sure what you're saying here. Though, I did ask for a specific example, which I am pretty sure is not contained here.
Though to clarify, by "reading your mind" I refer to any situation in which the scenario you face (including the given description of that scenario) depends directly on which program you are running and not merely upon what that program outputs.
Well, yes. Then again, the game was specified as PD against BOT^CDT, not as PD against BOT^{you}. It seems pretty clear that for X not equal to CDT it is not the case that X could achieve the result CC in this game. Are you saying that it is reasonable to say that CDT could achieve a result that no other strategy could just because its code happens to appear in the opponent's program?
I think that there is perhaps a distinction to be made between things that happen to be simulating your code and things that are causally simulating your code.
OK. Fine. Point taken. There is a simple fix though.
MBOT^X(Y) = X'(MBOT^X) where X' is X but with randomized irrelevant experiences.
In order to produce this properly, MBOT only needs to have your prior (or a sufficiently similar probability distribution) over irrelevant experiences hardcoded. And while your actual experiences might be complicated and hard to predict, your priors are not.
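A minimal sketch of this construction in code (the representation of strategies and the example are my own invention, just to show the plumbing):

```python
import random

# Assume a strategy is a function strategy(opponent, experiences) -> 'C' or 'D',
# where `opponent` is itself a strategy it may inspect or simulate.

def make_mbot(x, experience_prior):
    """MBOT^X: ignores its real opponent and instead returns whatever X would
    play against MBOT^X, feeding X experiences sampled from a hardcoded prior
    rather than X's actual experiences (the X' in the formula above)."""
    def mbot(real_opponent, experiences=None):
        fake_experiences = experience_prior()   # randomized irrelevant experiences
        return x(mbot, fake_experiences)        # X'(MBOT^X)
    return mbot

# Trivial plumbing check with a strategy that always cooperates.
always_cooperate = lambda opponent, experiences: 'C'
mbot = make_mbot(always_cooperate, experience_prior=random.random)
print(mbot(always_cooperate))   # 'C'
```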
No. BOT^CDT = DefectBot. It defects against any opponent. CDT could not cause it to cooperate by changing what it does.
If it cooperated, it would get CC instead of DD.
Actually if CDT cooperated against BOT^CDT it would get $3^^^3. You can prove all sorts of wonderful things once you assume a statement that is false.
Depending on the exact setup, "irrelevant details in memory" are actually vital information that allow you to distinguish whether you are "actually playing" or are being simulated in BOT's mind.
OK... So UDT^Red and U...
I feel like your discussion of predictors makes a few not-necessarily-warranted assumptions about how the predictor deals with self-reference. Then again, I guess anything that doesn't do this fails as a predictor in a wide range of useful cases: it predicts that a massive fire will kill 100 people, and so naturally that prediction is used to prevent the fire, invalidating the original prediction.
But there is a simple-ish fix. What if you simply ask it to make predictions about what would happen if it (and, say, all similar predictors) suddenly stopped functioning immediately before this prediction was returned?
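A toy model of what that query might look like (every detail here is invented for illustration):

```python
def world(warning_issued):
    """Deaths in the fire: people evacuate if a warning is issued (toy model)."""
    return 0 if warning_issued else 100

# A naive "what will happen?" query has no stable answer: predicting 100 deaths
# triggers a warning and evacuation (so 0 die), while predicting 0 deaths means
# no warning is issued (so 100 die).

def counterfactual_prediction():
    # The proposed fix: predict the world in which the predictor (and all
    # similar predictors) stopped functioning just before answering,
    # i.e. a world in which no warning gets issued.
    return world(warning_issued=False)

print(counterfactual_prediction())   # 100 -- a well-defined, stable answer
```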