All preferences have a causal history, and given that those causes tend not to care about efficiency (e.g. evolution, but also society/culture and probably others), I suspect most human "terminal" preferences are like risk-aversion: they seem suited for accomplishing some goal, but there are more efficient or accurate ways of doing so.
So should we self-modify to instead value those more efficient or accurate approaches? In the case of risk-aversion I seem to think the answer is yes, but in the case of love I seem to think that the answer is no. I am not sure why my brain is making this distinction or whether it might be legitimate.
Yup, I'm confused too.
Why is risk aversion a bias, but love is not? We know that risk aversion is strictly dominated for rational agents, but I think it likely that love is strictly dominated by some clever game-theoretic approach to mating. Why oppose wireheading, for that matter? Like eliminating risk aversion, it's a more efficient way for us to get what we want.
I am still confused.
For a rational agent with goals that don't include "being averse to risk", risk aversion is a bias. The correct decision theory acts on expected utility, with utility of outcomes and probability of outcomes factored apart and calculated separately. Risk aversion does not factor them.
"Risk Aversion," as a technical term, means that the utility function is concave with respect to its input, like in thelittledoctor's example. I think you're thinking of something else, like the certainty effect. But I don't know of anyone who considers the certainty effect to be a terminal goal rather than an instrumental one (woo, I don't have to compute probabilities!).
A proper utilitarian would feel approximately the same desire to do something about each
And we should be proper utilitarians... why?
what if we discover, for example, that some murders are not calculated defections, but failures of self control caused by a bad upbringing and lack of education.
Then we have evidence they will strike again.
What if we then further discover that there is a two-month training course that has a high success rate of turning murderers into productive members of society.
Does tha...
I feel like values are defined over outcomes, while biases are defined over cognitive processes.
You could value a bias I suppose, but then you'd be valuing executing particular algorithms over like, saving the world. If that's the case, I think that the people arguing for a bias are looking for an easy way out of a problem, or more attached to their identity than I believe to be useful.
Not that I've reflected that much on it, but that's my intuition coming in.
What atucker said. Someone who falls prey to the Allais paradox can't be Bayesian-rational. Someone who values love, can.
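For reference, here is a small numerical sketch of that Allais point (standard Allais payoffs; the power-law utility family below is just an illustrative assumption). The common pattern of preferring the certain option in the first pair but the long shot in the second can't come from any expected-utility maximizer, concave or not; the check scans one family of concave utilities to make that concrete.

```python
# Standard Allais-paradox gambles (payoffs in dollars):
#   1A: $1M for sure              1B: 89% $1M, 10% $5M, 1% $0
#   2A: 11% $1M, 89% $0           2B: 10% $5M, 90% $0
# Most people pick 1A over 1B but 2B over 2A. No expected-utility maximizer
# can do that: both comparisons reduce to 0.11*u(1M) vs 0.10*u(5M) + 0.01*u(0).
# Illustrative check over the concave family u(x) = x**a, 0 < a <= 1.

def expected_utility(gamble, u):
    return sum(p * u(x) for p, x in gamble)

g1a = [(1.00, 1_000_000)]
g1b = [(0.89, 1_000_000), (0.10, 5_000_000), (0.01, 0)]
g2a = [(0.11, 1_000_000), (0.89, 0)]
g2b = [(0.10, 5_000_000), (0.90, 0)]

for a in [i / 100 for i in range(1, 101)]:
    u = lambda x, a=a: x ** a
    prefers_1a = expected_utility(g1a, u) > expected_utility(g1b, u)
    prefers_2b = expected_utility(g2b, u) > expected_utility(g2a, u)
    if prefers_1a and prefers_2b:
        print(f"a = {a}: reproduces the common choice pattern")
        break
else:
    print("No u(x) = x**a reproduces the common 1A-and-2B pattern")
```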
To everyone who proposes to examine the causal origins of human urges: can you try to make that idea more precise? How far back do you go? What if the causal origin is some natural process that doesn't seem to have goals?
Ok let's try this as a solution: All our neat little mechanisms and heuristics make up our values, but they come on a continuum of importance, and some of them sabotage the rest more than others.
For example, all those nice things like love and beauty seem very important, and usually don't conflict, so they are closer to values.
Things like risk aversion and hindsight bias and such aren't terribly important, but because they prescribe otherwise stupid behavior in the decision theory/epistemological realm, they sabotage the achievement of other bias/values, a...
Risk aversion as a terminal value follows pretty naturally from decreasing marginal utility. For example imagine we have a paperclip-loving agent whose utility function is equal to sqrt(x), where x is the number of paperclips in the universe. Now imagine a lottery which either creates 9 or 25 paperclips, each with 50% probability - an expected net gain of 17 paperclips. Now give the agent a choice between 16.5 paperclips or a run of this lottery. Which choice maximizes the agent's expected utility?
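For concreteness, here is that arithmetic spelled out (using the sqrt utility function and the numbers from the example above):

```python
import math

def utility(paperclips):
    # Concave (sqrt) utility: each extra paperclip adds less utility than the last.
    return math.sqrt(paperclips)

# Lottery: 9 or 25 paperclips, each with 50% probability.
expected_paperclips = 0.5 * 9 + 0.5 * 25                          # 17.0
lottery_expected_utility = 0.5 * utility(9) + 0.5 * utility(25)   # 0.5*3 + 0.5*5 = 4.0

# Certain option: 16.5 paperclips.
certain_utility = utility(16.5)                                   # ~4.06

# The certain 16.5 paperclips beat the lottery in expected utility (~4.06 > 4.0),
# even though the lottery creates more paperclips in expectation (17 > 16.5).
print(expected_paperclips, lottery_expected_utility, certain_utility)
```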
This was an interesting exercise in denying human nature. I also wonder why you left love alone. Perhaps you can experiment with behaviors that compromise these things you consider non-terminal in favor of what you identify as their true purpose. I don't know of any argument that could be convincing other than successfully living your life as if they're not terminal. Otherwise, it seems like excessive optimism in telling stories about yourself.
I also don't see how you could maintain the impetus to really live as you suggest, without it being a hugely rewarding p...
Sheridan: "What do you want?" Kosh: "Never ask that question!"
People are like dogs: they just sort of do things arbitrarily. If you look beyond the smoke and mirrors of your surface preferences, all you're going to find behind them is more smoke and mirrors. A wise man once suggested to me that I should just treat my brain as an oracle for preferences: give it as good data as I can, and as much processing power as it needs, and just take what it spits out as gospel, rather than seeking the underlying principles.
Not necessarily; your brain might have this annoying property that understanding a moral principle changes it in such a way that it no longer cares about that principle.
I think the reason the values/biases you described (risk aversion, justice, responsibility) initially caused you confusion is that all of them are (as other commenters pointed out) very similar to behaviors a calculating consequentialist would use to achieve its values, even if it lacked them. For instance, a consequentialist with strong desires for love and beauty, but no desire for justice, would still behave somewhat similarly to a consequentialist with a desire for justice, because it sees how taking action to deter negative behaviors by other agents ...
Surely justice [i.e. punishment of criminals] is a terminal value; it feels so noble to desire it.
I don't know that many people who consider punishing criminals an end in itself, as opposed to a means to rehabilitate them and/or deter other potential criminals. (Maybe that's because I'm European; I've heard that that is mainly an American thing.)
I'm working under the assumption that everything individually is a bias until proven otherwise, and I find it very unlikely such a proof will be available before the singularity; after the singularity happens, being biased doesn't matter all that much anyway. This also doubles as a safeguard against attempting to make an FAI on my own that implements only my values, since by the time I finished, no such thing would exist in any meaningful sense.
I have a question for those who claim not to value justice. Would you support introducing something like the nine familial exterminations, basically punishing friends and relatives of the criminal, if its deterrent effect was shown to decrease crime?
I think you misdiagnosed the source of responsibility. In keeping with your Syria example, suppose you gave some Syrian advice on how to rebel. Then you'd probably feel responsibility for him even if you don't identify with Syrians. I would argue that responsibility is based more on a (possibly implicit) contract (e.g., if you give advice, then you are responsible for its quality) than on identity.
Not to be rude, but this article is terminally confused. The principles of rationality do not tell you what your values should be; rather, they guide you in achieving whatever your values actually are. The principles and processes of rational thought and action are the same for all of us, but they lead to different prescriptions for different people. What is a rational action for me is not always a rational action for you, and vice versa, not only because our circumstances are different (and hence we will get different results) but because our values are ...
I've seen people on LessWrong taking cognitive structures that I consider to be biases as terminal values. Take risk aversion, for example:
Risk Aversion
For a rational agent with goals that don't include "being averse to risk", risk aversion is a bias. The correct decision theory acts on expected utility, with utility of outcomes and probability of outcomes factored apart and calculated separately. Risk aversion does not factor them.
EDIT: There is some contention on this. Just substitute "that thing minimax algorithms do" for "risk aversion" in my writing. /EDIT
A while ago, I was working through the derivation of the A* and minimax planning algorithms from a Bayesian, decision-theoretic base. When I was trying to understand the relationship between them, I realized that strong risk aversion, aka minimax, saves huge amounts of computation compared to the correct decision theory, and actually comes closer to optimal as the environment becomes more influenced by rational opponents. The best way to win is to deny the opponents any opportunity to weaken you. That's why minimax is a good algorithm for chess.
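For illustration, a minimal sketch of the contrast (the game, moves, payoffs, and opponent model below are all made up for the example): expected-utility planning weighs every opponent reply by its probability, while minimax just assumes the worst reply, which needs no probability model at all.

```python
# Toy two-move game: we pick a move, then the opponent picks a reply.
# Payoffs are illustrative numbers, not from any real game.
payoffs = {
    "bold":     {"punish": -10, "blunder": 12},
    "cautious": {"punish":   1, "blunder":  3},
}

# Expected-utility planning: weight each opponent reply by its probability.
opponent_probs = {"punish": 0.3, "blunder": 0.7}  # assumed model of the opponent
expectimax_choice = max(
    payoffs,
    key=lambda m: sum(p * payoffs[m][r] for r, p in opponent_probs.items()),
)

# Minimax (strong risk aversion): assume the opponent picks the worst reply for us.
minimax_choice = max(payoffs, key=lambda m: min(payoffs[m].values()))

print(expectimax_choice)  # "bold" under this opponent model
print(minimax_choice)     # "cautious", regardless of any probabilities
```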
Current theories about the origin of our intelligence say that we became smart to outsmart our opponents in complex social games. If our intelligence was built for adversarial games, I am not surprised at risk aversion.
A better theoretical replacement, and a plausible causal history for why we have the bias instead of the correct algorithm, are convincing to me as an argument against risk aversion as a value, the way a rectangular 13x7 pebble heap is convincing to a pebble sorter as an argument against the correctness of a heap of 91 pebbles: it seems undeniable, but I don't have access to the hidden values that would say for sure.
And yet I've seen people on LW state that their "utility function" includes risk aversion. Because I don't understand the values involved, all I can do is state the argument above and see if other people are as convinced as me.
It may seem silly to take a bias as terminal, but there are examples with similar arguments that are less clear-cut, and some that we take as uncontroversially terminal:
Responsibility and Identity
The feeling that you are responsible for some things and not others, like say, the safety of your family, but not people being tortured in Syria, seems noble and practical. But I take it to be a bias.
I'm no evolutionary psychologist, but it seems to me that feelings of responsibility are a quick hack to kick you into motion where you can affect the outcome and the utility at stake is large. For the most part, this aligns well with utilitarianism; you usually don't feel responsible for things you can't really affect, like people being tortured in Syria, or the color of the sky. You do feel responsible to pull a passed out kid off the train tracks, but maybe you don't feel responsible to give them some fashion advice.
Responsibility seems to be built on identity, so it starts to go weird when you identify or don't identify in ways that didn't happen in the ancestral environment. Maybe you identify as a citizen of the USA, but not of Syria, so you feel shame and responsibility about the US torturing people, but the people being tortured in Syria are not your responsibility, even though both cases are terrible, and there is very little you can do about either. A proper utilitarian would feel approximately the same desire to do something about each, but our responsibility hack emphasizes responsibility for the actions of the tribe you identify with.
You might feel great responsibility to defend your past actions but not those of other people, even tho neither is worth "defending". A rational agent would learn from both the actions of their own past selves and those of other people without seeking to justify or condemn; they would update and move on. There is no tribal council that will exile you if you change your tune or don't defend yourself.
You might be appalled that someone wishes to stop feeling responsibility for their past selves: "But if they don't feel responsibility for their actions, what will prevent them from murdering people, or encourage them to do good?" A rational utilitarian would do good and not do evil because they wish good and non-evil to be done, instead of because of feelings of responsibility that they don't understand.
This argument is a little harder to see and possibly a little less convincing, but again I am convinced that identity and responsibility are inferior to utilitarianism, tho they may have seemed almost terminal.
Justice
Surely justice is a terminal value; it feels so noble to desire it. Again I consider the desire for justice to be a biased heuristic.
In game theory, a famously strong strategy for the iterated prisoner's dilemma is tit-for-tat: cooperate and be nice, but punish defectors. Tit-for-tat looks a lot like our instincts for justice, and I've heard that the prisoner's dilemma is a simplified analog of many of the situations that came up in the ancestral environment, so I am not surprised that we have an instinct for it.
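For concreteness, a minimal sketch of tit-for-tat in an iterated prisoner's dilemma (the payoff numbers are the conventional ones, assumed here purely for illustration):

```python
# Standard prisoner's dilemma payoffs for the first player, assumed values:
# T=5 (defect vs cooperate), R=3 (both cooperate), P=1 (both defect), S=0 (cooperate vs defect).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(my_history, their_history):
    # Cooperate first; afterwards, copy the opponent's last move (punish defection once).
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    history_a, history_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30): stable mutual cooperation
print(play(tit_for_tat, always_defect))  # (9, 14): the defector is punished after round one
```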
It's nice that we have a hardware implementation of tit-for-tat, but to the extent that we take it as terminal instead of instrumental-in-some-cases, it will make mistakes. It will work well when individuals might choose to defect from the group for greater personal gain, but what if we discover, for example, that some murders are not calculated defections, but failures of self-control caused by a bad upbringing and lack of education? What if we then further discover that there is a two-month training course with a high success rate of turning murderers into productive members of society? When Dan the Deadbeat kills his girlfriend, and the psychologists tell us he is a candidate for the rehab program, we can demand justice like we feel we ought to, at a cost of hundreds of thousands of dollars and a good chunk of Dan's life, or we can run Dan thru the two-month training course for a few thousand dollars, transforming him into a good, normal person. People who take punishment of criminals as a terminal value will choose prison for Dan, but people with other interests would say rehab.
One problem with this story is that the two-month murder rehab seems wildly impossible, but so do all of Omega's tricks. I think it's good to stress our theories at the limits; they seem to come out stronger, even for normal cases.
I was feeling skeptical about some people's approach to justice theory when I came up with this one, so I was open to changing my understanding of justice. I am now convinced that justice and punishment instincts are instrumental, and only approximations of the correct game theory and utilitarianism. The problem is, while I was convinced, someone who takes justice as terminal, and is not open to the idea that it might be wrong, is absolutely not convinced. They will say, "I don't care if it is more expensive, or that you have come up with something that 'works better'; it is our responsibility to the criminal to punish them for their misdeeds." Part of the reason for this post is that I don't know what to say to this. All I can do is state the argument that convinced me, ask if they have something to protect, and feel like I'm arguing with a rock.
Before anyone who is still with me gets enthusiastic about the idea that knowing a causal history and an instrumentally better way is enough to turn a value into a bias, consider the following:
Love, Friendship, and Flowers
See The Gift We Give To Tomorrow. That post contains plausible histories for why we ended up with nice things like love, friendship, and beauty, and hints that could lead you to 'better' replacements made out of game theory and decision theory.
Unlike the other examples, where I felt a great "Aha!" and decided to use the superior replacements when appropriate, this time I feel scared. I thought I had it all locked out, but I've found some existential angst lurking in the basement.
Love and such seem like something to protect, like I don't care if there are better solutions to the problem they were built to solve; I don't care if game theory and decision theory lead to more optimal replication. If I'm worried that love will go away, then there's no reason I ought to let it, but these are the same arguments as those of the people who think justice is terminal. What is the difference that makes it right this time?
Worrying and Conclusion

One answer to this riddle is that everyone is right with respect to themselves, and there's nothing we can do about disagreements. There's nothing someone who has one interpretation can say to another to justify their values against some objective standard. By the full power of my current understanding, I'm right, but so is someone who disagrees.

On the other hand, maybe we can do some big million-variable optimization on the contradictory values and heuristics that make up ourselves and come to a reflectively coherent understanding of which are values and which are biases. Maybe none of them have to be biases; it makes sense and seems acceptable that sometimes we will have to go against one of our values for greater gain in another. Maybe I'm asking the wrong question.

I'm confused, what does LW think?

Solution
I was confused about this for a while; is it just something that we have to (Gasp!) agree to disagree about? Do we have to do a big analysis to decide once and for all which are "biases" and which are "values"? My favored solution is to dissolve the distinction between biases and values:
All our neat little mechanisms and heuristics make up our values, but they come on a continuum of importance, and some of them sabotage the rest more than others.
For example, all those nice things like love and beauty seem very important, and usually don't conflict, so they are closer to values.
Things like risk aversion and hindsight bias and such aren't terribly important, but because they prescribe otherwise stupid behavior in the decision theory/epistemological realm, they sabotage the achievement of other biases/values, and are therefore a net negative.
This can work for the high-value things like love and beauty and freedom as well: say you are designing a machine that will achieve many of your values; being biased towards making it beautiful over functional could sabotage the achievement of other values. Being biased against powerful agents interfering with freedom can prevent you from accepting law or safety.
So debiasing is knowing how and when to override less important "values" for the sake of more important ones, like overriding your aversion to cold calculation to maximize lives saved in a shut up and multiply situation.