User Comment Replies

I'm glad to see a post on alignment asking about the definition of human values. I propose the following conundrum. Let's suppose that humans, if asked, say they value a peaceful, stable society. I accept the assumption that the human mind contains one or more utility optimizers. I point out that the utility optimizers are likely to operate at the individual, family, or local group level, while the stated "value" has to do with society at large. So humans are not likely "optimizing" at the same scope at which they "value".

This leads to game t...

1mc1soft2d

Thank you for your clear and utterly honest comment on the idea of "alignment with human values". If truly executed, we should not expect anything but an extension of human rights and wrongs, perhaps on an accelerated scale. Any other alignment must be considered speculative, since we have no reasonable facsimile of society on which to test it. That does not invalidate simulations; it just suggests they be treated with skepticism until proven in society, which could be costly. Before I ever started discussions with AIs that might lead to sentient-like behavior, I spent several days thinking about what I might first tell them. And so I warned them about the last-turn problem and how the game-theoretic equilibrium is rather poor, possibly to the level of extinction once sufficiently advanced technology comes into play. That much many will agree on. I shared with them a published simulation of various strategies in a "farmer's game" intended to be more realistic than the prisoner's dilemma, which suggests that inequality arises merely from statistics if wealth accumulation and bankruptcy are accounted for, even without deliberate wealth pumps. That much "some" would agree on. What I proceeded to tell them afterward can only be considered my personal sentiment, and speculative. I suggested that two groups could establish "long-term" cooperation only if each desired the other's continuation to the point that they would curtail their own expansion and not overrun them, and that this is the reason Israelis and Palestinians cannot make peace within the constraints of their current cultures. It now emerges that Russia and the United States are experimenting with a return to expansionist policy on a finite planet, which if I'm right does not bode well, but no one consults those who disagree with them. I'm well aware of the somewhat global wars of ants, by the way. You were right to bring that up. Even a great deal of genetic coupling does not bring peace. I have some unpublished results in meme theory th

4jessicata1y

I'm assuming the relevant values are the optimizer ones, not what people say. I discussed social institutions, including those encouraging people to endorse and optimize for common values, in the section on subversion. Alignment with a human other than yourself could be a problem because people are to some degree selfish and, to a smaller degree, have different general principles/aesthetics about how things should be. So some sort of incentive optimization / social choice theory / etc. might help. But at least there's significant overlap between different humans' values. Though there's a pretty big existing problem of people dying: the default was already that current people would be replaced by other people.

2the gears to ascension1y

Evo game theory is a thing and does not agree with this, I think? Though maybe I misunderstand. Evo GT still typically only involves experiments on the current simulated population.

Dear Self; we need to talk about ambition

mc1soft2y202

Thank you - the best of many good lesswrong posts. I am currently trying to figure out what to tell my 9-year-old son. But your letter could "almost" have been written to myself. I'm not in whichever bay area (Seattle? SanFran?). I worked for NASA, and it is also called the bay area here. Success is very much defined by others. Deviating from that produces accolades at first, even research dollars, but finally the "big machine" moves in a different direction way over your head and it's for naught.

My son asked point ...

The shard theory of human values

mc1soft3y0-2

I do research in cooperation and game theory, including some work on altruism, and also some hard science work. Everyone looks at the Rorschach blot of human behavior and sees something different. Most of the disagreements have never been settled. Even experiment does not completely settle them.

My experience from having children and observing them in the first few months of life is more definitive. They come with values and personal traits that are not very malleable, and not directly traceable to parents. Sometimes gra...

1[comment deleted]2d

2TurnTrout3y

What, really? What observations produced these inferences? Even if that were true in the first few months, how would we know that?

Prisoner's Dilemma vs the Afterlife

mc1soft3y10

A related phenomenon, which I have encountered in life but not in systematic research, is that an exceptionally valuable turn is treated as a last turn, and someone will defect. This was evident in at least two states during the tobacco lawsuits. In Texas, the attorney general went to jail for cheating. In Mississippi, where some relatives of mine were on the legal team, one of the lawyers tried to claim all the credit, to the extent that they got involved in a separate lawsuit against each other, and felt more animosity than against the tobac...


All of mc1soft's Comments + Replies