Aleksei_Riikonen comments on Be a Visiting Fellow at the Singularity Institute - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (156)
The latter part, that IF SIAI is exerting a positive influence, THEN doing that outweighs the alternative of not working on existential risks, seems to be a claim somewhat easy to defend.
The math in this Bostrom paper should do it: http://www.nickbostrom.com/astronomical/waste.html (even though the paper is not directly commenting on this particular question, the math rather straightforwardly applies to this question)
Ouch. This paper reads to me like a reductio ad absurdum of utilitarianism. Some simple math inevitably implies that I'm losing an unimaginable amount of "utility" every second without realizing it? Then please remind me why I should care about this "utility"?
Imagine that you have to decide once and for all eternity what to do with the world. You won't be able to back off, because that would just mean that the world will be rewritten randomly. How should you do that?
This is essentially the situation we find ourselves in, with Friendly AI/existential risk pressure. Formal preference is the answer you give to that question, about what to do with the world, not something that "you have", or "care about". Forget intuitions and emotions, or considerations of comfort, and just answer the question. Formal preference is distinct from the exact state of the world only because it's uncertain what can actually be done, and what can't. So, formal preference specifies what should be done for every level of capability to determine things. Of course, formal preference can't be given explicitly. To the extent you'll be able to express the answer to this question, your formal preference is defined by your wishes. Any uncertainty gets taken over by randomness, an opportunity to make the world better lost forever.
For any sane notion of an answer to that question, you'll find that whatever actually happens now is vastly suboptimal.
If it's your chosen avenue of research, I guess I'm okay with that, but IMO you're making the problem way more difficult for yourself. Such "formal preferences" will be much harder to extract from actual humans than utility functions in their original economic sense, because unlike utility, "formal preference" as you define it doesn't even influence our everyday actions very much.
Way more difficult than what? There is no other way to pose this problem; revealed preference, of any kind, is not what Friendly AI is about. I agree that it's a way harder problem than automatic extraction of utilities in the economic sense, and that formal preference barely controls what people actually do.
What would be wrong with an AI based on our revealed preferences? It sounds like an easy question, but somehow I'm having a hard time coming up with an answer.
Because my revealed preferences suck. The difference between even what I want in a sort of ordinary and non-transhumanist way and what I have is enormous. I am 150 pounds heavier than I want to be. My revealed preference is to eat regardless of health/size consequences, but I don't want all of the people in the future to be fat. My revealed preference is also to kill people in pooristan so that I can have cheap plastic widgets or food or whatever. I don't want an extrapolation of my akrasiatic actual actions controlling the future of the universe. I suspect the same goes for you.
Hmm. Let's look more closely at the weight example, because the others are similar. You also reveal some degree of preference to be thin rather than fat, don't you? Then an AI with unlimited power could satisfy both your desire to eat and your desire to be thin. And if the AI has limited power, do you really want it to starve you, rather than go with your revealed preference?
Revealed preference means what your actual actions are. It doesn't have anything at all to do with what I verbally say my goals are. I can say that I would prefer to be thin all I want, but that isn't my revealed preference. My revealed preference is to be fat, because, you know, that's how I'm acting. You seem to be suffering some misapprehensions as to what you are saying about how an AI should act. If your definition of revealed preference contains my desire not to be fat, you should shift to what I mean when I talk about preference, because yours solves none of the problems you think it does.
Is your revealed preference to be fat, or is it to eat and exercise (or not exercise) in ways which incidentally result in your being fat?
I'm assuming that you revealed your preference to be thin in your other actions, at some other moments of your life. Pretty hard to believe that's not the case.
What an AI is based on determines the way the world will actually be, so by building an AI with a given preference, you are inevitably answering my question about what to do with the world. Using revealed preference for an AI is wrong to the same extent that revealed preference gives the wrong answer to my question. You seem to agree that the correct answer to my question has little to do with revealed preference. That seems to be the same as seeing revealed preference as the wrong thing to imprint an AI with.
No one has ever been an altruist in this crazy sense. No one's actual wants and desires have ever been adequately represented by this 10^23 stuff. Utility is a model of what people want, not a prescription of what you "should" want (what does "should want" mean anyway?), and here we clearly see the model not modeling what it's supposed to.
I agree with you to the extent that no one that I am aware of is actually expending the effort that disutilities represented by 10^23 should inspire. But even before the concept of cosmic waste was developed, no one was actually working as hard as, say, starvation in Africa deserved. Or ending aging. Or the threat of nuclear Armageddon. But the fact that humans, who are all affected by akrasia, aren't actually doing what they want isn't really strong evidence that it isn't what they, on sufficient reflection, want. Utility is not a model of what non-rational agents (i.e., humans) are doing; it is a model of how actual, idealized agents want to act. I don't want people to die, so I should work to reduce existential risk as much as possible, but because I am not a perfect agent, I can't actually follow the path that really maximizes my (non-existent abstraction of) utility.
I haven't seen anyone who claims to be motivated by utilities of such magnitude except Eliezer. He's currently busy writing his Harry Potter fanfic and shows no signs of mental distress that the 10^23-strong anticipation should've given him.
From the Author's Note:
From Kaj Sotala:
Eliezer is not "busy writing his Harry Potter fanfic." He is working on his book on rationality.
The Harry Potter fanfic is a book on rationality. And a damn good one.
To clarify, Eliezer Yudkowsky is working both on a book and on the Harry Potter fanfiction in question. Both pertain to rationality.