All of endoself's Comments + Replies

Note that I ... wrote the only comment on the IAF post you linked

Yes, I replied to it :)

Unfortunately, I don't expect to have more Eliezer-level explanations of these specific lines of work any time soon. Eliezer has a fairly large amount of content on Arbital that hasn't seen LW levels of engagement either, though I know some people who are reading it and benefiting from it. I'm not sure how LW 2.0 is coming along, but it might be good to have a subreddit for content similar to your recent post on betting. There is an audience for it, as that post demonstrated.

0cousin_it
I think Eliezer's Arbital stuff would've been popular in blog form. (Converting it to a blog now won't work, the intrigue is gone.) The sequences had lots of similar quality material, like "Created already in motion". I don't like it much because it's so far out, but it gets readers.

Maybe you've heard this before, but the usual story is that the goal is to clarify conceptual questions that exist in both the abstract and more practical settings. We are moving towards considering such things though - the point of the post I linked was to reexamine old philosophical questions using logical inductors, which are computable.

Further, my intuition from studying logical induction is that practical systems will be "close enough" to satisfying the logical induction criterion that many things will carry over (much of this is just intuit... (read more)

2whpearson
When computations have costs I think the nature of the problems change drastically. I've argued that we need to go up to meta-decision theories because of it here. The idea of Solomonoff induction is not needed for building neural networks (or useful for reasoning about them). So my pragmatic heart is cold towards a theory of logical induction as well.

Scott Garrabrant and I would be happy to see more engagement with the content on Agent Foundations (IAF). I guess you're right that the math is a barrier. My own recent experiment of linking to Two Major Obstacles for Logical Inductor Decision Theory on IAF was much less successful than your post about betting, but I think that there's something inessential about the inaccessibility.

In that post, for example, I think the math used is mostly within reach for a technical lay audience, except that an understanding of logical induction is assumed, though I may ... (read more)

4whpearson
I lack motivation myself. I'm interested in AI risk but I think exploring abstract decision theories where the costs of doing the computation to make the decision are ignored is like trying to build a vehicle and ignoring drag entirely. I may well be wrong so I still skim the agent foundations stuff, but I am unconvinced of its practicality. So I'm unlikely to be commenting on it or participating in that.
7cousin_it
Note that I played a part in convincing MIRI to create IAF, and wrote the only comment on the IAF post you linked, so rest assured that I'm watching you folks :-) My thinking has changed over time though, and probably diverged from yours. I'll lay it out here, hopefully it won't sound too harsh. First of all, if your goal is explaining math using simpler math, I think there's a better way to do it. In a good math explanation, you formulate an interesting problem at level n whose solution requires level n+1. (Ideally n should be as low as possible.) In a bad math explanation, you assume the reader understands level n, then write out the basic definitions of level n+1 and formulate a problem using those. That loses the reader, unless they are already interested in level n+1. But that's still underestimating the problem by a couple orders of magnitude. To jumpstart engagement, you need something as powerful as this old post by Eliezer. That's a much more complicated beast. The technical content is pretty much readable to schoolchildren, yet somehow readers are convinced that something magical is going on and they can contribute, not just read and learn. Coming back to that post now, I'm still in awe of how the little gears work, from the opening sentence to the "win" mantra to the hint that he knows the solution but ain't telling. It hits a tiny target in manipulation-space that people don't see clearly even now, after living for a decade inside the research program that it created. Apart from finding the right problem and distilling it in the right manner, I think the next hardest part is plain old writing style. For example, Eliezer uses lots of poetic language and sounds slightly overconfident, staying mostly in control but leaving dozens of openings for readers to react. But you can't reuse his style today, the audience has changed and you'll sound phony. You need to be in tune with readers in your own way. If I knew how to do it, I'd be doing it already. These

I model probabilistic thinking as something you build on top of all this. First you learn to model the world at all (your steps 3-8), then you learn the mathematical description of part of what your brain is doing when it does all this. There are many aspects of normative cognition that Bayes doesn't have anything to say about, but there are also places where you come to understand what your thinking is aiming at. It's a gears model of cognition rather than the object-level phenomenon.

If you don't have gears models at all, then yes, it's just another way t... (read more)

Hi Yaacov!

The most active MIRIx group is at UCLA. Scott Garrabrant would be happy to talk to you if you are considering research aimed at reducing x-risk. Alternatively, some generic advice for improving your future abilities is to talk to interesting people, try to do hard things, and learn about things that people with similar goals do not know about.

As far as I can tell, you've misunderstood what I was trying to do with this post. I'm not claiming that Hawkins' work is worth pursuing further; passive_fist's analysis seems pretty plausible to me. I was just trying to give people some information that they may not have on how some ideas developed, to help them build a better model of such things.

(I did not downvote you. If you thought that I was arguing for further work towards Hawkins' program, then your comment would be justified, and in any case this is a worthwhile thing for me to explicitly disclaim.)

Yeah, I didn't mean to contradict any of this. I wonder how much a role previous arguments from MIRI and FHI played in changing the zeitgeist and contributing to the way Superintelligence was received. There was a slow increase in uninformed fear-of-AI sentiments over the preceding years, which may have put people in more of a position to consider the arguments in Superintelligence. I think that much of this ultimately traces back to MIRI and FHI; for example many anonymous internet commenters refer to them or use phrasing inspired by them, though many oth... (read more)

The quote from Ng is

The big AI dreams of making machines that could someday evolve to do intelligent things like humans could, I was turned off by that. I didn’t really think that was feasible, when I first joined Stanford. It was seeing the evidence that a lot of human intelligence might be due to one learning algorithm that I thought maybe we could mimic the human brain and build intelligence that’s a bit more like the human brain and make rapid progress. That particular set of ideas has been around for a long time, but [AI expert and Numenta cofounder

... (read more)

I can't see how this would work. Wouldn't the UDT-ish approach be to ask an MMEU agent to pick a strategy once, before making any updates? The MMEU agent would choose a strategy that makes it equivalent to a Bayesian agent, as I describe. The characteristic ambiguity-averse behaviour only appears if the agent is allowed to update.

Given a Cartesian boundary between agent and environment, you could make an agent that prefers to have its future actions be those that are prescribed by MMEU, and you'd then get MMEU-like behaviour persisting upon reflection, but I assume this isn't what you mean since it isn't UDT-ish at all.

3Wei Dai
Suppose you program a UDT-MMEU agent to care about just one particular world defined by some world program. The world program takes a single bit as input, representing the mysterious coin, and the agent represents uncertainty about this bit using a probability interval. You think that in this world the agent will either be offered only bet 1, or only bet 2, or the world will split into two copies with the agent being offered a different bet in each copy (analogous to your example). You have logical uncertainty as to which is the case, but the UDT-MMEU agent can compute and find out for sure which is the case. (I'm assuming this agent isn't updateless with regard to logical facts but just computes as many of them as it can before making decisions.) Then UDT-MMEU would reject the bet unless it turns out that the world does split in two. Unless I made a mistake somewhere, it seems like UDT-MMEU does retain "ambiguity-averse behaviour" and isn't equivalent to any standard UDT agent, except in the sense that if you did know which version of the bet would be offered in this world, you could design a UDT agent that does the same thing as the UDT-MMEU agent.
endoself120

MMEU isn't stable upon reflection. Suppose that in addition to the mysterious [0.4, 0.6] coin, you had a fair coin, and I tell you that I'll offer bet 1 ("pay 50¢ to be paid $1.10 if the coin came up heads") if the fair coin comes up heads and bet 2 if the fair coin comes up tails, but you have to choose whether to accept or reject before flipping the fair coin to decide which bet will be chosen. In this case, the Knightian uncertainty cancels out, and your expected winnings are +5¢ no matter which value in [0.4, 0.6] is taken to be the true proba... (read more)
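
To make the arithmetic explicit, here is a quick sketch (assuming the natural reading of the payoffs: each bet costs 50¢ and pays $1.10 on its favoured side of the mysterious coin, for a net of +60¢ or -50¢):

```python
# Quick check that the Knightian uncertainty cancels out for the fair-coin package.
# Assumed payoffs: each bet costs $0.50 and pays $1.10 on its favoured outcome.

def bet1(p):
    """Expected net value of bet 1 (wins on heads) when heads has probability p."""
    return p * 0.60 + (1 - p) * (-0.50)

def bet2(p):
    """Expected net value of bet 2 (wins on tails) when heads has probability p."""
    return (1 - p) * 0.60 + p * (-0.50)

def package(p):
    """Accept or reject before the fair coin decides which bet you are given."""
    return 0.5 * bet1(p) + 0.5 * bet2(p)

# MMEU looks at the worst case over the Knightian interval [0.4, 0.6].
print([round(package(p), 4) for p in (0.4, 0.5, 0.6)])  # [0.05, 0.05, 0.05]
```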

4PeterDonis
Does it? You still know that you will only be able to take one of the two bets; you just don't know which one. The Knightian uncertainty only cancels out if you know you can take both bets.
4Wei Dai
This looks more like a problem with updating than with MMEU though. It seems possible to design a variant of UDT that uses MMEU, without it wanting to self-modify into something else (at least not for this reason).
endoself100

This is a very general point. Most of the uncertainty people face is of the sort that they would naively classify as Knightian, so if people actually behaved according to MMEU, then they would essentially be playing minimax against the world.

9Manfred
Yeah that could lead to some pretty dumb behavior. For a silly example, "I don't know the skill of other drivers, so I'll just never use a road, because never using a road has higher minimum utility than dying in a car crash."
endoself300

I took the census. My answers for MWI and Aliens were conditional on ¬Simulation, since if we are in a simulation where MWI doesn't hold, the simulation is probably intended to provide information about a universe in which MWI does hold.

endoself150

I'm not sure what quantum mechanics has to do with this. Say humanity is spread over 10 planets. Would you rather take a logical 9/10 chance of wiping out humanity, or destroy 9 of the planets with certainty (and also destroy 90% of uninhabited planets to reduce the potential for future growth by the same degree)? Is there any ethically relevant difference between these scenarios?

2Stuart_Armstrong
Good point. The two scenarios have somewhat different intuition pumps, but are otherwise similar.
kilobug140

There are two differences I can see :

  1. The "planets" example admits that MWI is correct. Without MWI, the quantum trigger is exactly a normal random trigger, not killing 9/10ths of the worlds, but killing everyone with a 1/10th probability. The thought experiment is a way to force people to quantify their acceptance of MWI.

  2. Communication. There is no communication possible between the many-worlds branches, while there is between the planets, and that will have massive long-term effects.

even if P is omniscient, P' still has to estimate its expected output from its own limited perspective. As long as this estimate is reasonable, the omniscience of P doesn't cause a problem (and remember that P is fed noisy data).

Don't you have to get the exact level of noise that will prevent the AI from hiding from P without letting P reconstruct the AI's actions if it does allow itself to be destroyed? An error in either direction can be catastrophic. If the noise is too high, the AI takes over the world. If the noise is too low, E'(P(Sᵃ|X,Oᵃ,B)/P(Sᵃ|¬... (read more)

0Stuart_Armstrong
It's not so sensitive. The AI's actions in the box are very hard to detect from the perspective of fifty years, with minimal noise. The AI expanding dangerously across the universe would be easy to detect, even with a lot of noise (if nothing else, because humans would have recorded this and broadcast messages about it).

He's talking specifically about people donating to AMF. There are more things people can do than donate to AMF and donate to one of MIRI, FHI, CEA, and CFAR.

1lukeprog
Correct.

Yes, you can take the probability that they will halt given a random input. This is analogous to the case of a universal Turing machine, since the way we ask it to simulate a random Turing machine is by giving it a random input string.
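
For a prefix-free universal machine U this is the standard halting probability, Chaitin's Ω (I'm assuming that is the construction being gestured at here, with |p| the length of the input program p):

```latex
\Omega_U \;=\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|}
```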

Yes, but this is a completely different matter than your original post. Obviously this is how we should handle this weird state of information that you're constructing, but it doesn't have the causal interpretation you give it. You are doing something, but it isn't causal analysis. Also, in the scenario you describe, you have the association information, so you should be using it.

Causal networks do not make an iid assumption.

Yeah, I guess that's way too strong; there are a lot of alternative assumptions also that justify using them.

What is a sample? How do we know two numbers (or other strings) came from the same sample?

I think we just have to assume this problem solved. Whenever we use causal networks in practice, we know what a sample is. You can try to weaken this and see if you still get anything useful, but this is very different from 'conditioning on time' as you present in the post.

Since the association contains inf

... (read more)
0johnswentworth
Exactly! We want to incorporate the association information using Bayes theorem. If you have zero information about the mapping, then your knowledge is invariant under permutations of the data sets (e.g., swapping T0 with T1). That implies that your prior over the associations is uniform over the possible permutations (note that a permutation uniquely specifies an association and vice versa). So, when calculating the correlation, you have to average over all permutations, and the correlation turns out to be identically zero for all possible data. No association means no correlation. So in the zero information case, we get this weird behavior that isn't what we expect. If the zero information case doesn't work, then we can't expect to get correct answers with only partial information about the associations. We can expect similar strangeness when trying to deal with partial information based on priors about side-effects caused by our hypothetical drug. If we don't have enough information to construct the model, then our analysis should yield inconclusive results, not weird or backward results. So the problem is to figure out the right way to handle association information.

In fact, in order to truly ignore time data, we cannot even order the points according to time! But that means that we no longer have any way to line up the points T0 with e0, T1 with e1, etc.

What? This makes no sense.

I guess you haven't seen this stated explicitly, but the framework of causal networks makes an iid assumption. The idea is that the causal network represents some process that occurs a lot, and we can watch it occur until we get a reasonably good understanding of the joint distribution of variables. Part of this is that it is the same process... (read more)

2johnswentworth
Causal networks do not make an iid assumption. Consider one of the simplest examples, in which we examine experimental data. Some of the variables are chosen by the experimenter. They can be chosen any way the experimenter pleases, so long as they vary. The process is the same, but that does not imply iid observations. It just means that time dependence must enter through the variables. As you say, it is not built in to the framework. The problem is to reduce the phrase "the different measurements of each variable are associated because they come from the same sample of the causal process." What is a sample? How do we know two numbers (or other strings) came from the same sample? Since the association contains information separate from the values themselves, how can we incorporate that information into the framework explicitly? How can we handle uncertainty in the association apart from uncertainty in the values of the variables?

I, for one, have the terminal value of continued personal existence (a.k.a. being alive). On LW I'm learning that continuity, personhood, and existence might well be illusions. If that is the case, my efforts to find ways to survive amount to extending something that isn't there in the first place

I am confused about this as well. I think the right thing to do here is to recognize that there is a lot we don't know about, e.g. personhood, and that there is a lot we can do to clarify our thinking on personhood. When we aren't confused about this stuff anym... (read more)

Can you elaborate? This sounds interesting.

0Armok_GoB
I tried to write an article when I first discovered it. It kept growing and never getting more finished or getting to the point because I suck at writing articles (and it's very complicated and hard to explain and easily misunderstood in bad ways), so I gave up. This was something like a year ago. So sadly, no.

Neural signals represent things cardinally rather than ordinally, so those voting paradoxes probably won't apply.

Even conditional on humans not having transitive preferences even in an approximate sense, I find it likely that it would be useful to come up with some 'transitivization' of human preferences.

Agreed that there's a good chance that game-theoretic reasoning about interacting submodules will be important for clarifying the structure of human preferences.

0[anonymous]
"Neural signals represent things cardinally rather than ordinally": I'm not sure what you mean by this. In the general case, resolution of signals is highly nonlinear, i.e. vastly more complicated than any simple ordinal or weighted ranking method. Signals at synapses are nearly digital, though: to first order, a synapse is either firing or it isn't. Signals along individual nerves are also digital-ish--bursts of high-frequency constant-amplitude waves interspersed with silence. My point, though, is that it's not reasonable to assume that transitivity holds axiomatically when it's simple to construct a toy model where it doesn't. On a macro level, I can imagine a person with dieting problems preferring starving > a hot fudge sundae, celery > starving, and a hot fudge sundae > celery.
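
The voting-paradox worry can be made concrete with a toy example: three hypothetical submodules, each with a perfectly transitive ranking, yield a cyclic preference under pairwise majority vote (Condorcet's paradox). The rankings below are invented; the option names are borrowed from the dieting example above.

```python
# Three transitive submodule rankings produce an intransitive majority preference.
from itertools import combinations

options = ["sundae", "celery", "starving"]
submodules = [
    ["sundae", "celery", "starving"],
    ["celery", "starving", "sundae"],
    ["starving", "sundae", "celery"],
]

def majority_prefers(a, b):
    """True if a majority of submodules rank a above b."""
    votes = sum(ranking.index(a) < ranking.index(b) for ranking in submodules)
    return votes > len(submodules) / 2

for a, b in combinations(options, 2):
    winner, loser = (a, b) if majority_prefers(a, b) else (b, a)
    print(f"{winner} is majority-preferred to {loser}")
# The three printed lines form a cycle: sundae > celery > starving > sundae.
```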

What's wrong with the surreals? It's not like we have reason to keep our sets small here. The surreals are prettier, don't require an arbitrary nonconstructive ultrafilter, are more likely to fall out of an axiomatic approach, and can't accidentally end up being too small (up to some quibbles about Grothendieck universes).

0Oscar_Cunningham
I agree with all of that, but I think we should work out what decision theory actually needs and then use that. Surreals will definitely work, but if hyperreals also worked then that would be a really interesting fact worth knowing, because the hyperreals are so much smaller. (Ditto for any totally ordered affine set).

No, that's not what I meant at all. In what you said, the agent needs to be separate from the system in order to perform do-actions. I want an agent that knows it's an agent, so it has to have a self-model and, in particular, has to be inside the system that is modelled by our causal graph.

One of the guiding heuristics in FAI theory is that an agent should model itself the same way it models other things. Roughly, the agent isn't actually tagged as different from nonagent things in reality, so any desired behaviour that depends on correctly making this dis... (read more)

Look, HIV patients who get HAART die more often (because people who get HAART are already very sick). We don't get to see the health status confounder because we don't get to observe everything we want. Given this, is HAART in fact killing people, or not?

Well, of course I can't give the right answer if the right answer depends on information you've just specified I don't have.

You're sort of missing what Ilya is trying to say. You might have to look at the actual details of the example he is referring to in order for this to make sense. The general id... (read more)

5IlyaShpitser
A challenge (not in a bad sense, I hope): I would be interested in seeing an EDT derivation of the right answer in this example, if anyone wants to do it.

If you want to change what you want, then you've decided that your first-order preferences were bad. EDT recognizing that it can replace itself with a better decision theory is not the same as it getting the answer right; the thing that makes the decision is not EDT anymore.

No. For example, AIXI is what I would regard as essentially a Bayesian agent, but it has a notion of causality because it has a notion of the environment taking its actions as an input.

This looks like a symptom of AIXI's inability to self-model. Of course causality is going to look fundamental when you think you can magically intervene from outside the system.

Do you share the intuition I mention in my other comment? I feel that the way this post reframes CDT and TDT as attempts to clarify bad self-modelling by naive EDT is very similar to the way I wou... (read more)

0Qiaochu_Yuan
So your intuition is that causality isn't fundamental but should fall out of correct self-modeling? I guess that's also my intuition, and I also don't know how to make that precise.

These three causal graphs cannot be distinguished by the observational statistics. The causal information given in the problem is an essential part of its statement, and no decision theory which ignores causation can solve it.

I think this isn't actually compatible with the thought experiment. Our hypothetical agent knows that it is an agent. I can't yet formalize what I mean by this, but I think that it requires probability distributions corresponding to a certain causal structure, which would allow us to distinguish it from the other graphs. I don't kn... (read more)

0Richard_Kennaway
How about: an agent, relative to a given situation described by a causal graph G, is an entity that can perform do-actions on G.

UDT corresponds to something more mysterious

Don't update at all, but instead optimize yourself, viewed as a function from observations to actions, over all possible worlds.

There are tons of details, but it doesn't seem impossible to summarize in a sentence.
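
As a rough formalization of that one-sentence summary (glossing over the details: logical uncertainty, where the measure over possible worlds comes from, and so on):

```latex
\pi^{*} \;=\; \operatorname*{arg\,max}_{\pi:\ \mathrm{Obs}\to\mathrm{Acts}}
\;\sum_{w} P(w)\, U\!\big(\mathrm{outcome}(\pi, w)\big)
```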

0Manfred
Or even simpler: find the optimal strategy, then do that.
endoself110

I'd like to make explicit the connection of this idea to hard takeoff, since it's something I've thought about before but isn't stated explicitly very often. Namely, this provides some reason to think that by the time an AGI is human-level in the things humans have evolved to do, it will be very superhuman in things that humans have more difficulty with, like math and engineering.

It provides a useful concept, which can be carried over into other domains. I suppose there are other techniques that use a temperature, but I'm much less familiar with them and they are more complicated. Is understanding other metaheuristics more useful to people who aren't actually writing a program that performs some optimization than just understanding simulated annealing?

But it's actually important to the example. If someone intends to allocate their time searching for small and large improvements to their life, then simulated annealing suggests that they should make more of the big ones first. (The person you describe may not have done this, since they've settled into a local optimum but now decide to find a completely different point on the fitness landscape, though without more details it's entirely possible they've decided correctly here.)

Your second paragraph could benefit from the concept of simulated annealing.

2MalcolmOcean
I propose making an analogy to Split & Merge Expectation Maximization (SMEM) instead. It's a very efficient algorithm for modelling that operates as follows:

  1. Perform EM to find the local optimum.

  2. Examine the clusters to determine which two are most similar, and combine them.

  3. Examine the clusters to determine which one is least representative of the data it's supposed to describe, and break it into two.

  4. Perform EM on the three adjusted clusters.

  5. Repeat 1-4 until the change in likelihood between iterations drops below some epsilon.

I think this is actually quite isomorphic to Goal Factoring (from Geoff Anders / CFAR) in that you're trying to combine things that are similar and then break up things that are inefficient. At least, I spent an entire summer working on an SMEM clustering program (though some of that was UI) and they feel similar to me.
0NoSignalNoNoise
Simulated annealing is one of many techniques for optimizing in search spaces with a lot of local maxima. Is there a reason why you're emphasizing that one in particular?
3Vaniver
There are lots of metaheuristic optimization methods; simulated annealing is the easiest one to explain and implement, but consequently it's also the dumbest of them.
3Qiaochu_Yuan
I thought about mentioning simulated annealing, but then it seemed to me like simulated annealing is more involved than the basic concept I wanted to get across (e.g. the cooling phase is an extra complication).

I'm not sure what you mean. Do you mean the scores given that you choose to cooperate and defect? There's a lot of complexity hiding in 'given that', and we don't understand a lot of it. This is definitely not a trivial fix to Lumifer's program.

0DanielLC
I messed up. I didn't realize that until after I posted, and I didn't feel like going back to fix it.

Another problem is that you cooperate against CooperateBot.

2DanielLC
I didn't look at the details too much. You can fix that problem just by having it calculate the scores for cooperate and defect, and go with the one with the higher score.

From If Many-Worlds had Come First:

the thought experiment goes: 'Hey, suppose we have a radioactive particle that enters a superposition of decaying and not decaying. Then the particle interacts with a sensor, and the sensor goes into a superposition of going off and not going off. The sensor interacts with an explosive, that goes into a superposition of exploding and not exploding; which interacts with the cat, so the cat goes into a superposition of being alive and dead. Then a human looks at the cat,' and at this point Schrödinger stops, and goes,

... (read more)
-2Vratko_Polak
I am going to try and provide short answer, as I see it. (Fighting urge to write about different levels of "physical reality".) Many Words is an Interpretation. An interpretation should translate from mathematical formalism towards practical algorithms, but MWI does not go all the way. Namely, it does not specify the quantum state an Agent should use for computation. One possible state agrees with "Schroedinger's experiment was definitely set up and started", another state implies "cat definitely turned out to be alive", but those certainties cannot occur simultaneously. Bayesian inference in non-quantum physics also changes (probabilistic) state, but we can interpret it as a mere change of our beliefs, and not a change in the physical system. But in quantum mechanics, upon observation, the "objective" state fitting our knowledge changes. MWI says "fitting our knowledge" is not a good criterion of choosing quantum state to compute with (because no state can be fitting enough, as example with Shroedinger's cat shows) and we should compute with superposition of Agents. MWI may be more "objectively correct", but it does not seem to be more "practical" than Copenhagen interpretation. So physicists do like to cautiously agree with MWI, then wave hands, proclaim "Decoherence!" and at the end use Copenhagen interpretation as before. Introductory books emphasize experiments, and experimental results do not come in form of superpositioned bits. So before student gets familiar enough with mathematical formalism to think about detectors in superposition, Copenhagen is already occupying slot for Interpretation.
Shmi110

Why, then, don't more people realize that many worlds is correct?

Note that you are using Eliezer!correct, not Physics!correct. The former is based on Bayesian reasoning among models with equivalent predictive power, the latter requires different predictive power to discriminate between theories. The problem with the former reasoning is that without experimental validation it is hard to agree on the priors and other assumptions going into the Bayesian calculation for MWI correctness. Additionally, proclaiming MWI "correct" is not instrumentally... (read more)

I agree with this; the 'e.g.' was meant to point toward the most similar theories that have names, not pin down exactly what Eliezer is doing here. I thought that it would be better to refer to the class of similar theories since there is enough uncertainty that we don't really have details.

Yeah, this whole line of reasoning fails if you can get to 3^^^3 utilons without creating ~3^^^3 sentients to distribute them among.

Overall I'm having a really surprising amount of difficulty thinking up an example where you have a lot of causal importance but no anthropic counter-evidence.

I'm not sure what you mean. If you use an anthropic theory like what Eliezer is using here (e.g. SSA, UDASSA) then an amount of causal importance that is large compared to the rest of your reference class implies few similar members of the reference class, which is a... (read more)

0drnickbone
A comment: it is not clear to me that Eliezer is intending to use SSA or UDASSA here. The "magic reality fluid" measure looks more like SIA, but with a prior based on Levin complexity rather than Kolmogorov complexity - see my comment here. Or - in an equivalent formulation - he's using Kolmogorov + SSA but with an extremely broad "reference class" (the class of all causal nodes, most of which aren't observers in any anthropic sense). This is still not UDASSA. To get something like UDASSA, we shouldn't distribute the weight 2^-#p of each program p uniformly among its execution steps. Instead we should consider using another program q to pick out an execution step or a sequence of steps (i.e. a sub-program s) from p, and then give the combination of q,p a weight 2^-(#p+#q). This means each sub-program s will get a total prior weight of Sum {p, q: q(p) = s & s is a sub-program of p} 2^-(#p + #q). When updating on your evidence E, consider the class S(E) of all sub-programs which correspond to an AI program having that evidence, and normalize. The posterior probability you are in a particular universe p' then becomes proportional to Sum {q: q(p') is a sub-program of p' and a member of S(E)} 2^-(#p' + #q). This looks rather different to what I discussed in my other comment, and it maybe handles anthropic problems a bit better. I can't see there is any shift either towards very big universes (no presumptuous philosopher) or towards dense computronium universes, where we are simulations. There does appear to be a Great Filter or "Doomsday" shift, since it is still a form of SSA, but this is mitigated by the consideration that we may be part of a reference class (program q) which preferentially selects pre-AI biological observers, as opposed to any old observers.

Maybe I was unclear. I don't dismiss Y=TL4 as wrong, I ignore it as untestable and therefore useless for justifying anything interesting, like how an AI ought to deal with tiny probabilities of enormous utilities.

He's not saying that the leverage penalty might be correct because we might live in a certain type of Tegmark IV, he's saying that the fact that the leverage penalty would be correct if we did live in Tegmark IV + some other assumptions shows (a) that it is a consistent decision procedure and¹ (b) it is the sort of decision procedure that emer... (read more)

1Eliezer Yudkowsky
(Yep. More a than b, it still feels pretty unnatural to me.)
0Shmi
You are right, I am out of my depth math-wise. Maybe that's why I can't see the relevance of an untestable theory to AI design.

Pascal's mugging is less of a problem if your utility function is bounded, and it completely goes away if the bound is reasonably low, since then there just isn't any amount of utility that would outweigh the improbability of the mugger being truthful.
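
Concretely: if the bound caps the utility the mugger can offer at some B, and paying costs c in utility, then paying is only worthwhile when the probability p that the mugger is truthful satisfies

```latex
p \cdot B \;>\; c \quad\Longleftrightarrow\quad p \;>\; \frac{c}{B},
```

and with a reasonably low B no remotely plausible p clears that threshold.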

I'm referring to an infinity of possible outcomes, not an infinity of possible choices. This problem still applies if the agent must pick from a finite list of actions.

Specifically, I'm referring to the problem discussed in this paper, which is mostly the same problem as Pascal's mugging.

2Stuart_Armstrong
Interesting problem, thanks! I personally felt that there could be a good case made for insisting your utility be bounded, and that paper's an argument in that direction.

If Pascal's mugger was a force of nature - a new theory of physics, maybe - then the case for keeping to expected utility maximisation may be quite strong.

There's still the failure of convergence. If the theory that made you think that it would be a good idea to accept Pascal's mugging tells you to sum an infinite series, and that infinite series diverges, then the theory is wrong.

2Stuart_Armstrong
The convergence can be solved using the arguments I presented in: http://lesswrong.com/lw/giu/naturalism_versus_unbounded_or_unmaximisable/ Essentially, take advantage of the fact that we are finite state probabilistic machines (or analogous to that), and therefore there is a maximum to the number of choices we can expect to make. So our option set is actually finite (though brutally large).

You still get a probability function without Savage's P6 and P7, you just don't get a utility function with codomain the reals, and you don't get expectations over infinite outcome spaces. If we add real-valued probabilities, for example by assuming Savage's P6', you even get finite expectations, assuming I haven't made an error.

I can find one discussion where, when the question of bounded utility functions came up, Eliezer responded, "[To avert a certain problem] the bound would also have to be substantially less than 3^^^^3." -- but this indicates a misunderstanding of the idea of utility, because utility functions can be arbitrarily (positively) rescaled or recentered. Individual utility "numbers" are not meaningful; only ratios of utility differences.

I think he was assuming a natural scale. After all, you can just pick some everyday-sized utility differe... (read more)

Quantum mechanics actually has lead to some study of negative probabilities, though I'm not familiar with the details. I agree that they don't come up in the standard sort of QM and that they don't seem helpful here.

IIRC putting all possible observers in the same reference class leads to bizarre conclusions...? I can't immediately re-derive why that would be.

The only reason that I have ever thought of is that our reference class should intuitively consist of only sentient beings, but that nonsentient beings should still be able to reason. Is this what you were thinking of? Whether it applies in a given context may depend on what exactly you mean by a reference class in that context.

3Will_Newsome
If it can reason but isn't sentient then it maybe doesn't have "observer" moments, and maybe isn't itself morally relevant—Eliezer seems to think that way anyway. I've been trying something like, maybe messing with the non-sentient observer has a 3^^^3 utilon effect on human utility somehow, but that seems psychologically-architecturally impossible for humans in a way that might end up being fundamental. (Like, you either have to make 3^^^3 humans, which defeats the purpose of the argument, or make a single human have a 3^^^3 times better life without lengthening it, which seems impossible.) Overall I'm having a really surprising amount of difficulty thinking up an example where you have a lot of causal importance but no anthropic counter-evidence. Anyway, does "anthropic" even really have anything to do with qualia? The way people talk about it it clearly does, but I'm not sure it even shows up in the definition—a non-sentient optimizer could totally make anthropic updates. (That said I guess Hofstadter and other strange loop functionalists would disagree.) Have I just been wrongly assuming that everyone else was including "qualia" as fundamental to anthropics?

If what he says is true, then there will be 3^^^3 years of life in the universe. Then, assuming this anthropic framework is correct, it's very unlikely to find yourself at the beginning rather than at any other point in time, so this provides 3^^^3-sized evidence against this scenario.

1A1987dM
I'm not entirely sure that the doomsday argument also applies to different time slices of the same person, given that Eliezer in 2013 remembers being Eliezer in 2012 but not vice versa.