Anyone who predicts that some decision may result in the world being optimized according to something other than their own values, and is okay with that, is probably not thinking about terminal values. More likely, they're thinking that humanity (or its successor) will clarify its terminal values and/or get better at reasoning from them to instrumental values to concrete decisions, and that their understanding of their own values will follow that. Of course, when people are considering whether it's a good idea to create a certain kind of mind, that kind of thinking probably means they're presuming that Friendliness comes mostly automatically. It's hard for the idea of an agent with different terminal values to really sink in; I've had a little bit of experience with trying to explain to people the idea of minds with really fundamentally different values, and they still often try to understand it in terms of justifications that are compelling (or at least comprehensible) to them personally. Like, imagining that a paperclip maximizer is just like a quirky highly-intelligent human who happens to love paperclips, or is under the mistaken impression that maximizing paperclips is the right thing to do.
Ok. Well done. You have managed to frighten me. Frightened me enough to make me ask the question: "Just why do we want to build a powerful optimizer, anyways?"
I feel like I remember trying to answer the same question (asked by you) before, but essentially, the answer is that (1) eventually (assuming humanity survives long enough) someone is probably going to build one anyway, probably without being extremely careful about understanding what kind of optimizer it's going to be, and getting FAI before then will probably be the only way to prevent it; (2) there are many reasons why humanity might not survive long enough for that to happen — it's likely that humanity's technological progress over the next century will continuously lower the amount of skill, intelligence, and resources needed to accidentally or intentionally do terrible things — and getting FAI before then may be the best long-term solution to that; (3) given that pursuing FAI is likely necessary to avert other huge risks, and is therefore less risky than doing nothing, it's an especially good cause considering that it subsumes all other humanitarian causes (if executed successfully).
If I knew how that sausage will be made, I'd make it myself. The point of FAI is to do a massive amount of good that we're not smart enough to figure out how to do on our own.
Hmmm. Amnesty International, Doctors without Borders, and the Humane Society are three humanitarian causes that come to mind. FAI subsumes these ... how, exactly?
If humanity's extrapolated volition largely agrees that those causes are working on important problems, problems urgent enough that we're okay with giving up the chance to solve them ourselves if they can be solved faster and better by superintelligence, then it'll do so. Doctors Without Borders? We shouldn't be needing doctors (or borders) anymore. Saying how that happens is explicitly not our job — as I said, that's the whole point of making something massively smarter than we are. Don't underestimate something potentially hundreds or thousands or billions of times smarter than every human put together.
Change in values of the future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, won't be anywhere as good as it could've been otherwise.
That really depends on what you mean by "our values":
1) The values of modern, western, educated humans? (as opposed to those of the ancient Greeks, or of Confucius, or of medieval Islam), or
2) The "core" human values common to all human civilizations so far? ("stabbing someone who just saved your life is a bit of a dick move", "It would be a shame if humanity was exterminated in order to pave the universe with paperclips", etc.)
Both of those are quite fuzzy and I would find it hard to describe either of them precisely enough that even a computer could understand them.
When Eliezer talks of Friendly AI having human values, I think he's mostly talking about the second set (in The Psychological Unity of Mankind). But when Ben or Robin talk about how it isn't such a big deal if values change, because they've already changed in the past, they seem to be referring to the first kind of value.
I would agree with Ben and Robin that it isn't such a big deal if values of the first kind change.
How about ruler-of-the-universe deathism? Wouldn't it be great if I were sole undisputed ruler of the universe? And yet, thinking that rather unlikely, I don't even try to achieve it. I even think trying to achieve it would be counter-productive. How freakin' defeatist is that?
Voted up. However, I disagree with "it's not OK". Everything is always OK. OK is a feature of the map. From a psychological perspective, that's important. If an OK state of the map can't be generated by changing the territory, it WILL be generated by cheating and directly manipulating the map.
That said, we have preferences, rank orderings of outcomes. The value of futures with our values is high.
OK, fine, literally speaking, value drift is bad.
But if I live to see the Future, then my values will predictably be updated based on future events, and it is part of my current value system that they do so. I affirmatively value future decisions being made by entities that have taken a look at what the future is actually like and reflected on the data they gain.
Why should this change if it turns out that I don't live to see the future? I would like future-me to be one of the entities that help make future decisions, but failing that, my second-best option is for those decisions to be made by entities that have done that looking and reflecting.
Saying a certain future is "not ok", and saying gradual value drift is "business as usual", are both value judgments. I don't understand why you dismiss one of them but not the other, and call it "courageous and rational".
And it is difficult to make it so that the future is optimized: to stop uncontrolled "evolution" of value (value drift) or recover more of astronomical waste.
I still find it shocking and terrifying every time someone compares the morphing of human values with the death of the universe. Even though I saw another FAI-inspired person do it yesterday.
If all intelligent life held your view about the importance of their own values, then life in the universe would be doomed. The outcome of that view is that intelligent life greatly increases its ac...
Goertzel: Human value has evolved and morphed over time and will continue to do so. It already takes multiple different forms. It will likely evolve in future in coordination with AGI and other technology.
Agree, but the multiple different current forms of human values are the source of much conflict.
Hanson: Like Ben, I think it is ok (if not ideal) if our descendants' values deviate from ours, as ours have from our ancestors.
Agree again. And in honor of Robin's profession, I will point out that the multiple current forms of human values are the dr...
Hmm. I wonder if it helps with gathering energy to fight the views of others if you label their views as being "deathist".
Direct question, as I cannot infer the answer from your posts. If human values do not exist in closed form (i.e. they do include updates on future observations, including observations which in fact aren't possible in our universe), then is it better to have an FAI operating on some closed form of values instead?
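One way to make the distinction concrete (this is my own gloss, not a formalism from the posts): a closed-form value system would be a single function over outcomes that can be written down now, whereas values that depend on future observations are more like a function of both the outcome and what is actually observed:

$$U_{\text{closed}} : \text{Outcomes} \to \mathbb{R}, \qquad U_{\text{open}} : \text{Outcomes} \times \text{ObservationHistories} \to \mathbb{R},$$

where $U_{\text{open}}$ cannot in general be collapsed into a single $U_{\text{closed}}$ evaluable today, since some of the relevant observation histories may never occur in our universe.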
"But of course the claims are separate, and shouldn't influence each other."
No, they are not separate, and they should influence each other.
Suppose your terminal value is squaring the circle using Euclidean geometry. When you find out that this is impossible, you should stop trying. You should go and do something else. You should even stop wanting to square the circle with Euclidean geometry.
What is possible directly influences what you ought to do, and what you ought to desire.
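For the record, the impossibility here is a theorem, not just a practical difficulty; the standard sketch (in the classical compass-and-straightedge setting) goes like this. Squaring the unit circle means constructing a square of area $\pi$, i.e. a side of length

$$s^2 = \pi r^2 = \pi \cdot 1^2 \;\Rightarrow\; s = \sqrt{\pi}.$$

Every compass-and-straightedge-constructible length lies in a tower of quadratic extensions of $\mathbb{Q}$ and is therefore algebraic, but $\pi$ is transcendental (Lindemann, 1882), hence so is $\sqrt{\pi}$, so $s$ cannot be constructed. So "stop wanting it" is a response to a proof, not to mere discouragement.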
Change in values of the future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, won't be anywhere as good as it could've been otherwise.
Even at the personal level our values change with time, somewhat dramatically as we grow in maturity from children to adolescents to adults, and more generally simply through the process of learning, which modifies our belief networks.
Hanson's analogy to new generations of humans is especially important - value drift occurs naturally. A fully general ar...
Azathoth already killed you once, at puberty. Are you significantly worse off now that you value sex? Enough to eliminate that value?
There's no way you'd agree to receive $1000 a year before your death on condition that your family members will be tortured a minute after it. This is an example of what Vladimir means.
Is it more important to you that people of the future share your values, or that your values are actually fulfilled? Do you want to share your values, so that other (future?) people can make the world better, or are you going to roll up your sleeves and do it yourself? After all, if everyone relies on other people to get work done, nothing will happen. It's not Pareto efficient.
I think your deathism metaphor is flawed, but in your terms: Why do you assume "living for as long as I want" has a positive utility in my values system? It's not Pareto e...
The problem with this logic is that my values are better than those of my ancestors. Of course I would say that, but it's not just a matter of subjective judgment; I have better information on which to base my values. For example, my ancestors disapproved of lending money at interest, but if they could see how well loans work in the modern economy, I believe they'd change their minds.
It's easy to see how concepts like MWI or cognitive computationalism affect one's values when accepted. It's likely bordering on certain that transhumans will have more insights of this kind.
Ben Goertzel: Human value has evolved and morphed over time and will continue to do so. It already takes multiple different forms. It will likely evolve in future in coordination with AGI and other technology.
Robin Hanson: Like Ben, I think it is ok (if not ideal) if our descendants' values deviate from ours, as ours have from our ancestors.
We all know the problem with deathism: a strong belief that death is almost impossible to avoid, clashing with the undesirability of the outcome, leads people to rationalize either the illusory nature of death (afterlife memes), or the desirability of death (deathism proper). But of course the claims are separate, and shouldn't influence each other.
Change in values of the future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, won't be anywhere as good as it could've been otherwise. It's easier to see a sudden change as morally relevant, and easier to rationalize gradual development as morally "business as usual", but if we look at the end result, the risks of value drift are the same. And it is difficult to make it so that the future is optimized: to stop uncontrolled "evolution" of value (value drift) or recover more of astronomical waste.
Regardless of the difficulty of the challenge, it's NOT OK to lose the Future. The loss might prove impossible to avert, but still it's not OK; the value judgment cares not for the feasibility of its desire. Let's not succumb to the deathist pattern and lose the battle before it's done. Have the courage and rationality to admit that the loss is real, even if it's too great for mere human emotions to express.