bryjnar

Often the effect of being blinded is that you take suboptimal actions. As you pointed out in your example, if you see the problem then all sorts of cheap ways to reduce the harmful impact occur to you. So perhaps one way of getting to the issue could be to point at that: "I know you care about my feelings, and it wouldn't have made this meeting any less effective to have had it more privately, so I'm surprised that you didn't"?

Replying to Half-baked AI Safety ideas thread

bryjnar 4y

Half-baked AI Safety ideas thread

Wireheading traps.

An agent is "wireheading" if it is taking an action that a) provides it with enormous amounts of utility for little effort, and b) is trivial or seemingly unrelated to its "main" utility function or goals.

People have discussed the possibility of an AI wireheading as a problem for getting it to do what you want: "what if it just works out a way to set your metric to +ve infinity and then goes to sleep satisfied?".

But we can also use this as a guard-rail.

A "wireheading trap" is an action which a) is very hard for an AI to do below a level of capacity X, but very easy for it to do...

Replying to The Proof of Doom

bryjnar 4y

The Proof of Doom

We have trained ML systems to play games; what if we trained one to play a simplified version of the "I'm an AI in human society" game?

Have a population of agents with preferences; give the AI some poorly specified goal and the ability to expand its capabilities, etc. You might expect to observe things like a "treacherous turn".

If we could do that, it would make for quite the scary headline: "Researchers simulate the future with AI and it kills us all". Not proof, but perhaps viral and persuasive.

Replying to Meaning and Moral Foundations Theory

bryjnar 8y

Meaning and Moral Foundations Theory

I think I would argue that harm/care isn't obviously deontological. Many of the others are indeed about the performance of the action, but I think arguably harm/care is actually about the harm. There isn't an extra term for "and this was done by X".

That might just be me foisting my consequentialist intuitions on people, though.

Replying to Is Rhetoric Worth Learning?

bryjnar 8y

Is Rhetoric Worth Learning?

"What if there's an arms race / race to the bottom in persuasiveness, and you have to pick up all the symmetrical weapons others use and then use asymmetrical weapons on top of those?"

Doesn't this question apply to other cases of symmetric/asymmetric weapons just as much?

I think the argument is that you want to try and avoid the arms race by getting everyone to agree to stick to asymmetrical weapons because they believe it'll benefit them (because they're right). This may not work if they don't actually believe they're right and are just using persuasion as a tool, but I think it's something we could establish as a community norm in restricted circles at least.

Replying to Local Validity as a Key to Sanity and Civilization

bryjnar 8y

Local Validity as a Key to Sanity and Civilization

The point that the Law needs to be simple and local so that humans can cope with it is also true of other domains. And this throws up an important constraint for people designing systems that humans are supposed to interact with: you must make it possible to reason simply and locally about them.

This comes up in programming (to a man with a nail everything looks like a hammer): good programming practice emphasises splitting programs up into small components that can be reasoned about in isolation. Modularity, compositionality, abstraction, etc. aside from their other benefits, make it possible to reason about code locally.

Of course, some people inexplicably believe that programs are mostly...

Replying to Is Rhetoric Worth Learning?

bryjnar 8y

Is Rhetoric Worth Learning?

I am reminded of Guided by the Beauty of our Weapons. Specifically, it seems like we want to encourage forms of rhetoric that are disproportionately persuasive when deployed by someone who is in fact right.

Something like "make the structure of your argument clear" is probably good (since it will make bad arguments look bad), "use vivid examples" is unclear (can draw people's attention to the crux of your argument, or distract from it), "tone and posture" are probably bad (because the effect is symmetrical).

So a good test is "would this have an equal effect on the persuasiveness of my speech if I was making an invalid point?". If the answer is no, then do it; otherwise maybe not.

Meaning and Moral Foundations Theory

bryjnar

(cross-posted from https://www.michaelpj.com/blog/2018/04/07/meaning-and-mft.html)

Robin Hanson writes (some time ago, but it's a classic):

So there is a bit of a tension here between the meaning that crusaders choose for themselves and the happiness they try to give to others. They might reasonably be accused of elitism, thinking that happiness is good for the masses, while meaning should be reserved for elites like them. Also, since such folks tend to embrace far mode thoughts more, and tend less to think that near mode desires say what we really want, such folks should also be conflicted about their overwhelming emphasis on happiness over meaning when giving policy advice.

I think there's something interesting here,...

Replying to Beta-Beta Testing: Frontpage Rework [Update - further tweak]

bryjnar 8y

Beta-Beta Testing: Frontpage Rework [Update - further tweak]

Yes, this is very annoying.

Replying to An Apology is a Surrender

bryjnar 8y

An Apology is a Surrender

I found Kevin Simler's observation that an apology is a status lowering to be very helpful. In particular, it gives you a good way to tell if you made an apology properly - do you feel lower status?

I think that even if you take the advice in this post you can make non-apologies if you don't manage to make yourself lower your own status. Bits of the script that are therefore important:

Being honest about the explanation, especially if it's embarrassing.
Emphasising explanations that attribute agency to you - "I just didn't think about it" is bad for this reason.
Not being too calm and clinical about the process - this suggests that it's unimportant.

This also means that weird dramatic stuff can be good if it actually makes you lower your status. If falling to your knees and embracing the person's legs will be perceived as lowering your status rather than funny, then maybe that will help.

Replying to Choice begets regret

bryjnar 8y

Choice begets regret

This is a great point. I think this can also lead to cognitive dissonance: if you can predict that doing X will give you a small chance of doing Y, then in some sense it's already in your choice set and you've got the regret. But if you can stick your fingers in your ears enough and pretend that X isn't possible, then that saves you from the regret.

Possible values of X: moving, starting a company, ending a relationship. Scary big decisions in general.

Something that confused me for a bit: people use regret-minimization to handle exploration-exploitation problems, so shouldn't they have noticed a bias against exploration? I think the answer here is that the "exploration" people usually think about involves taking an already known option to gain more information about it, not actually expanding the choice set. I don't know of any framework that includes actions that actually change the choice set.

Choice begets regret

bryjnar

Epistemic status: speculative

Choice is bad. I want to focus on one aspect of this badness: regret. I'm going to argue that increased choice predictably increases the amount of regret that an agent feels, even if they are actually better off, and that this is bad for humans in particular.

Suppose that John is an uneducated child of subsistence farmers. Then John has a very limited set of options available to him (his "choice set") - work at his parents' farm, work at someone else's farm in the village, start his own farm... and that's about it. While John might prefer to be a lawyer, the fact that he was unable to...

Zero-Knowledge Cooperation

bryjnar

A lot of ink has been spilled about how to get various decision algorithms to cooperate with each other. However, most approaches require the algorithm to provide some kind of information about itself to a potential cooperator.

Consider FDT, the hot new kid on the block. In order for FDT to cooperate with you, it needs to reason about how your actions are related to the algorithm that the FDT agent is running. The naive way of providing this information is simple: give the other agent a copy of your “source code”. Assuming that said code is analyzable without running into halting problem issues, this should allow FDT to work out whether it wants to cooperate

...
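To make the "give the other agent a copy of your source code" idea concrete, here is a minimal sketch. It does not implement FDT; it swaps in a much cruder rule (cooperate only with exact copies of yourself). The names `Agent`, `clique_decide`, `defect_decide` and `play` are hypothetical, and a plain string stands in for real source code.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Agent:
    # A string standing in for the agent's source code (an assumption made
    # for illustration; real source analysis is much harder than this).
    source: str
    # Decision rule: (own_source, opponent_source) -> "C" or "D".
    decide: Callable[[str, str], str]


def clique_decide(own_source: str, opponent_source: str) -> str:
    """Cooperate only if the opponent's source is identical to mine."""
    return "C" if own_source == opponent_source else "D"


def defect_decide(own_source: str, opponent_source: str) -> str:
    """Always defect, whatever the opponent's source says."""
    return "D"


def play(a: Agent, b: Agent) -> Tuple[str, str]:
    """Each agent sees the other's source, then both moves are returned."""
    return a.decide(a.source, b.source), b.decide(b.source, a.source)


clique_1 = Agent(source="clique-v1", decide=clique_decide)
clique_2 = Agent(source="clique-v1", decide=clique_decide)
defector = Agent(source="always-defect", decide=defect_decide)

print(play(clique_1, clique_2))
print(play(clique_1, defector))
```

Note that both agents have to reveal their whole source for this to work, which is exactly the kind of information disclosure in question.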

Decision theory question

bryjnar

13y

I'm sure I remember reading somewhere that an informal slogan for (some brand of decision theory) was:

"Act in accordance with the precommitment you wish you had made."

i.e. when faced with Newcomb's problem, you would wish you had precommitted to one-box, and so you should one-box. This is entirely predictable, so you get the money.
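As a rough sketch of why you would wish you had precommitted to one-boxing, here is the expected payoff of each policy against a predictor that is right with some probability. The payoffs ($1,000,000 and $1,000) are the conventional ones for Newcomb's problem, not something stated above, and `expected_payoffs` and `precommitment_choice` are hypothetical names. This is not an implementation of any particular decision theory; it only compares the two precommitments.

```python
def expected_payoffs(accuracy, big=1_000_000, small=1_000):
    """Expected payoff of each precommitment against a predictor that
    guesses your policy correctly with probability `accuracy`."""
    # One-boxer: the opaque box is full only if the predictor was right.
    one_box = accuracy * big
    # Two-boxer: always gets the small box, and gets the big prize only
    # if the predictor wrongly expected one-boxing.
    two_box = (1 - accuracy) * big + small
    return one_box, two_box


def precommitment_choice(accuracy, big=1_000_000, small=1_000):
    """Return the policy you would wish you had precommitted to."""
    one_box, two_box = expected_payoffs(accuracy, big, small)
    return "one-box" if one_box >= two_box else "two-box"


for accuracy in (0.5, 0.9, 0.99):
    print(accuracy, precommitment_choice(accuracy))
```

Under these assumed payoffs, one-boxing comes out ahead as soon as the predictor is slightly better than a coin flip (the break-even accuracy is 0.5005).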

However, I couldn't find where I saw this, and I can't remember which decision theory it was meant to exemplify - and I don't actually understand the maths of the various competitors enough to figure out which it is.

Could someone who knows the area better point me in the right direction? In general, the idea of making a kind of "fully general precommitment" seems like an intuitive way of explaining how you can improve on, say, CDT.

LESSWRONG
LW