Nothing wrong - if you prove the business scalable. (Which might not be true for many charities out there, but that would not make them inefficient; it would only make donating an inefficient way of contributing.) I admit I have no experience with free kitchens, though.
Scalable in what sense? Do you foresee some problem with one kitchen using the hiring model and other kitchens using the volunteer model?
I don't follow. Do you argue that in some cases volunteering in the kitchen is better than donating? Why? What's wrong with the model where the kitchen uses your money to hire workers?
In the post you linked to, at the end you mention a proposed "fetus" stage where the agent receives no external inputs. Did you ever write the posts describing it in more detail? I have to say my initial reaction to that idea is skeptical, though. Humans don't have a fetus stage where we think/learn about math with external inputs deliberately blocked off. Why do artificial agents need one? If an agent couldn't simultaneously learn about math and process external inputs, it seems like something must be wrong with the basic design, which we should fix instead of work around.
I didn't develop the idea, and I'm still not sure whether it's correct. I'm planning to get back to these questions once I'm ready to use the theory of optimal predictors to put everything on a rigorous footing. So I'm not sure we really need to block the external inputs. However, note that the AI is in a sense more fragile than a human, since the AI is capable of self-modifying in irreversibly damaging ways.
(I think) I'm arguing that if you have saved some people (with some probability), and you intend to keep saving people, it is more efficient to keep saving the same set of people.
I assume you meant "more ethical" rather than "more efficient"? In other words, the correct metric shouldn't just sum over QALYs, but should assign f(T) utils to a person with a life of length T of reference quality, for f a convex function. Probably true, and I do wonder how it would affect charity ratings. But my guess is that the top charities of e.g. GiveWell will still be close to the top in this metric.
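A toy worked example of my own (not from the parent comment), taking f(T) = T² as the convex function: concentrating a fixed number of life-years on one person then beats splitting them between two, since

$$f(2T) = 4T^2 > 2T^2 = f(T) + f(T),$$

i.e. under such a metric, giving one person 2T extra years counts for more than giving two people T extra years each.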
I don’t understand your first paragraph. For the second, I see my future self as morally equivalent to myself, all else being equal. So I defer to their preferences about how the future world is organized, because they're the one who will live in it and be affected by it. It’s the same reason that my present self doesn’t defer to the preferences of my past self.
Your preferences are by definition the things you want to happen. So, you want your future self to be happy iff your future self's happiness is your preference. Your ideas about moral equivalence are your preferences. Et cetera. If you prefer X to happen and your preferences are changed so that you no longer prefer X to happen, the chance X will happen becomes lower. So this change of preferences goes against your preference for X. There might be upsides to the change of preferences which compensate for the loss of X. Or not. Decide on a case by case basis, but ceteris paribus you don't want your preferences to change.
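One schematic way to put the ceteris-paribus point (my notation, not the parent's): if your future actions are chosen to maximize a changed utility function $U_{\mathrm{new}}$, they can only do worse by the lights of your current $U_{\mathrm{old}}$ than actions chosen to maximize $U_{\mathrm{old}}$ itself:

$$\mathbb{E}\left[U_{\mathrm{old}} \mid \pi^*_{U_{\mathrm{new}}}\right] \le \max_{\pi} \mathbb{E}\left[U_{\mathrm{old}} \mid \pi\right] = \mathbb{E}\left[U_{\mathrm{old}} \mid \pi^*_{U_{\mathrm{old}}}\right], \quad \text{where } \pi^*_{U} = \arg\max_{\pi} \mathbb{E}[U \mid \pi].$$

So by your present preferences, a preference change is at best neutral unless it buys compensating upsides elsewhere.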
This is an account of some misgivings I've been having about the whole rationality/effective altruism world-view. I do expect some outsiders to think similarly.
So yesterday I was reading SSC, and there was a response to some article about the EA community by someone (whose name otherwise told me nothing) who, among other things, said EAs were 'white male autistic nerds'.
'Rewind,' said my brain.
'Aww,' I reasoned. 'You know. Americans. We have some heuristics like this, too.'
'...but what is this critique about?'
'Get unstuck already. EA is populated with young, hard-working, talented, educated, hopeful people...'
'Let's not join,' brain ruled immediately. 'We're not like that!'
'...who are out to save the world, eliminate suffering and maybe even defeat Death.'
Brain smirked. 'I find it easier to believe in the WMAN than in the YHTEHS - fewer dice rolls... But even if all of it is true, and they do intend to do all this; how would they fail?'
'Huh?'
'Would they lose their jobs, if some angry developer rings up their boss? Would they get sued, and lose their jobs, if they protest unwisely? Would they get beaten up in a dark alley, and incidentally lose their jobs, if - '
'THE WHOLE POINT is that you don't risk your own skin. You efficiently pay others to do it, hopefully without the actual risking, and in this way more people benefit. And stop being bloody-minded.'
'Well, good luck making more people join. We want to have lived. (In case there ain't no Singularity coming soon.) We believe experience. We believe failure.'
'Failure isn't efficient. And what are you about? That you want us to get beaten up?'
'No, I want to see some price they pay for their ideas. Out of, you know, sheer malice. Like if you're an environmentalist, then everybody around you knows what you must do better than yourself.'
'They pay money, because people shouldn't have to be heroes to do good. Shouldn't have to be sad to do good. Or angry. Even if it helps.'
Brain thought for a moment.
'Okay. But why do they expect others to be sad, angry or heroes? You buy a malaria net as an Effective Altruist, you kinda make a contract with somebody who uses it, like Albus Dumbledore giving the Cloak of Invisibility to Harry Potter. For your money to have mattered, that person would have to live in unceasing toil.'
'Which is in their best interests anyway.'
'...in more toil than you could ever imagine. And sorrow. And make efficient decisions. Aren't you morally obliged to keep helping?'
'If a builder sells a house, is he morally obliged to keep repairing it?' I shrugged. 'Legally, perhaps, if the house falls down.'
'Then I want to know what an Effective Altruist does when the house falls down, in the absence of any law that can force him,' said the brain. 'Surely he is more responsible than the builder?'
I don't follow. Are you arguing that saving a person's life is irresponsible if you don't keep saving them?
But in that case, why do we need a special "pure learning" period where you force the agent to explore? Wouldn't any prior that qualifies as "the right prior for me" or "my actual prior" avoid favoring any particular universe so strongly that it prevents the agent from exploring in a reasonable way?
To recap, if we give the agent a "good" prior, then the agent will naturally explore/exploit in an optimal way without being forced to. If we give it a "bad" prior, then forcing it to explore during a pure learning period won't help (enough) because there could be environments in the bad prior that can't be updated away during the pure learning period and cause disaster later. Maybe if we don't know how to define a "good" prior but there are "semi-good" priors which we know will reliably converge to a "good" prior after a certain amount of forced exploration, then a pure learning phase would be useful, but nobody has proposed such a prior, AFAIK.
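A toy illustration of the good-vs-bad-prior point, purely my own sketch (a Beta-Bernoulli bandit standing in for the general setting; all the names and parameter values here are mine, not anything from the discussion):

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_MEANS = [0.7, 0.3]  # arm 0 is actually the better arm

def thompson_run(alpha, beta, steps=2000, forced_explore=0):
    """Beta-Bernoulli Thompson sampling, with an optional 'pure learning'
    phase of forced round-robin exploration at the start."""
    alpha = np.array(alpha, dtype=float)  # per-arm Beta prior pseudo-counts
    beta = np.array(beta, dtype=float)
    total_reward = 0.0
    for t in range(steps):
        if t < forced_explore:
            arm = t % len(TRUE_MEANS)  # forced exploration: ignore the prior
        else:
            arm = int(np.argmax(rng.beta(alpha, beta)))  # natural explore/exploit
        r = float(rng.random() < TRUE_MEANS[arm])
        alpha[arm] += r
        beta[arm] += 1.0 - r
        total_reward += r
    return total_reward

# "Good" (uninformative) prior: explores and exploits well without any forcing.
print("good prior:", thompson_run([1, 1], [1, 1]))

# "Bad" (dogmatic) prior, near-certain that arm 1 is better: 50 forced pulls
# add ~25 observations per arm, which cannot outweigh 5000 pseudo-counts.
print("bad prior: ", thompson_run([1, 5000], [1, 1], forced_explore=50))
```

In this toy version, the uninformative prior explores adequately on its own, while the dogmatic prior keeps the agent pulling the bad arm even after the forced phase, since a short pure-learning period can't update the bad prior away.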
If we find a mathematical formula describing the "subjectively correct" prior P and give it to the AI, the AI will still effectively use a different prior initially, namely the convolution of P with some kind of "logical uncertainty kernel". IMO this means we still need a learning phase.
A question that I noticed I'm confused about. Why should I want to resist changes to my preferences?
I understand that it will reduce the chance of any preference A being fulfilled, but my answer is that if the preference changes from A to B, then at that time I'll be happier with B. If someone told me "tonight we will modify you to want to kill puppies," I'd respond that by my current preferences that's a bad thing, but if my preferences change then I won't think it's a bad thing any more, so I can't say anything against it. If I had a button that could block the modification, I would press it, but I feel like that's only because I have a meta-preference that my preferences tend toward maximizing happiness, and the meta-preference has the same problem.
A quicker way to say this is that future-me has a better claim to caring about what the future world is like than present-me does. I still try to work toward a better world, but that's based on my best prediction for my future preferences, which is my current preferences.
"I understand that it will reduce the chance of any preference A being fulfilled, but my answer is that if the preference changes from A to B, then at that time I'll be happier with B". You'll be happier with B, so what? Your statement only makes sense of happiness is part of A. Indeed, changing your preferences is a way to achieve happiness (essentially it's wireheading) but it comes on the expense of other preferences in A besides happiness.
"...future-me has a better claim to caring about what the future world is like than present-me does." What is this "claim"? Why would you care about it?
This is a crazy idea that I'm not at all convinced about, but I'll go ahead and post it anyway. Criticism welcome!
Rationality and common sense might be bad for your chances of achieving something great, because you need to irrationally believe that it's possible at all. That might sound obvious, but such idealism can make the difference between failure and success even in science, and even at the highest levels.
For example, Descartes and Leibniz saw the world as something created by a benevolent God and full of harmony that can be discovered by reason. That's a very irrational belief, but they ended up making huge advances in science by trying to find that harmony. In contrast, their opponents Hume, Hobbes, Locke etc. held a much more LW-ish position called "empiricism". They all failed to achieve much outside of philosophy, arguably because they didn't have a strong irrational belief that harmony could be found.
If you want to achieve something great, don't be a skeptic about it. Be utterly idealistic.
I think it is more interesting to study how to be simultaneously supermotivated about your objectives and realistic about the obstacles. Probably requires some dark arts techniques (e.g. compartmentalization). Personally I find that occasional mental invocations of quasireligious imagery are useful.
Hi Peter! I suggest you read up on UDT (updateless decision theory). Unfortunately, there is no good comprehensive exposition, but see the links on the wiki and IAFF. UDT reasoning leads to discarding "fragile" hypotheses, for the following reason.
According to UDT, if you have two hypotheses H1 and H2 consistent with your observations, you should reason as if there are two universes Y1 and Y2 such that H_i is true in Y_i, and the decisions you make control the copies of you in both universes. Your goal is to maximize the a priori expectation value of your utility function U, where the prior includes the entire level IV multiverse weighted according to complexity (the Solomonoff prior). Fragile universes will be strongly discounted in the expected utility because of the number of coincidences required to create them. Therefore, if H1 is "fragile" and H2 isn't, H2 is by far the more important hypothesis unless the complexity difference between them is astronomical.
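Schematically (my own rendering of the above, with K denoting description/Kolmogorov complexity):

$$\mathbb{E}[U] \approx \sum_i 2^{-K(H_i)} \, U(\text{outcome of your policy in } Y_i),$$

so if the coincidences needed to sustain a fragile H1 cost even 100 extra bits of description, its term is suppressed by a factor of roughly $2^{-100}$ relative to H2.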