I'd strongly agree with that, mainly because, while points with this property exist, they are not necessarily unique. The non-uniqueness is a big issue.
It was a typo! And it has been fixed.
It is because beach/beach is the surplus-maximizing result. Any Pareto-optimal bargaining solution where money is involved will involve the surplus-maximizing result being played, and a side payment occurring.
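As a toy example with made-up payoffs: say beach/beach gives utilities $(4, 2)$ (total surplus $6$), and every other outcome has total surplus at most $4$. Then any Pareto-optimal deal plays beach/beach and transfers some amount $t$ of money, landing at $(4 - t,\; 2 + t)$; playing anything else shrinks the total pie being split, so both players can be made better off by switching to beach/beach and adjusting $t$.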
I have a reduction of this problem to a (hopefully) simpler problem. First up, establish the notation used.
$[n]$ refers to the set $\{1, 2, \dots, n\}$. $n$ is the number of candidates. Use $\Delta_n$ as an abbreviation for the space $\Delta([n])$; it's the space of probability distributions over the candidates. View $\Delta_n$ as embedded in $\mathbb{R}^n$, and set the origin at the center of $\Delta_n$.
At this point, we can note that the following are in bijection:
1: Functions of type $[n] \to \mathbb{R}$
2: Affine functions of type $\Delta_n \to \mathbb{R}$
3: Functions o...
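Concretely, the correspondence between 1 and 2: a function $f : [n] \to \mathbb{R}$ extends to the affine function $F(p) = \sum_{i \in [n]} p_i \, f(i)$ on $\Delta_n$, and an affine function $F : \Delta_n \to \mathbb{R}$ restricts to a function on the vertices (the deterministic distributions); these two operations are mutually inverse.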
What I mean is, the players hit each other up, are like "yo, let's decide on which ROSE point in the object-level game we're heading towards"
Of course, they don't manage to settle on what equilibrium to go for in the resulting bargaining game, because, again, multiple ROSE points might show up in the bargaining game.
But, the ROSE points in the bargaining game are in a restricted enough zone (what with that whole "must be better than the random-dictator point" thing) to seriously constrain the possibilities in the object-level game. "Worst ROSE point ...
I think I have a contender for something which evades the conditional-threat issue stated at the end, as well as obvious variants and strengthenings of it, and which would be threat-resistant in a dramatically stronger sense than ROSE.
There's still a lot of things to check about it that I haven't done yet. And I'm unsure how to generalize to the n-player case. And it still feels unpleasantly hacky, according to my mathematical taste.
But the task at least feels possible, now.
EDIT: it turns out it was still susceptible to the conditional-threat issue, but t...
For 1, it's just intrinsically mathematically appealing (continuity is always really nice when you can get it), and also because of an intuition that if your foe experiences a tiny preference perturbation, you should be able to use small conditional payments to replicate their original preferences/incentive structure and start negotiating with that, instead.
I should also note that nowhere in the visual proof of the ROSE value for the toy case is continuity used. Continuity just happens to appear.
For 2, yes, it's part of game setup. The buttons are of whate...
My preferred way of resolving it is treating the process of "arguing over which equilibrium to move to" as a bargaining game, and just find a ROSE point from that bargaining game. If there's multiple ROSE points, well, fire up another round of bargaining. This repeated process should very rapidly have the disagreement points close in on the Pareto frontier, until everyone is just arguing over very tiny slices of utility.
This is imperfectly specified, though, because I'm not entirely sure what the disagreement points would be, because I'm not sure how the "...
So, if you are limited to only pure strategies, for some reason, then yes, Chinese would be on the Pareto frontier.
But if you can implement randomization, then Chinese is not on the Pareto frontier, because both sides agree that "flip a coin, Heads for Sushi, Tails for Italian" is just strictly better than Chinese.
The convex shape consists of all the payoff pairs you can get if you allow randomization.
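A quick numeric check of this, with hypothetical payoff pairs (the numbers are mine, just for illustration):

```python
# Hypothetical (you, friend) payoffs for each restaurant.
options = {"Sushi": (3.0, 1.0), "Italian": (1.0, 3.0), "Chinese": (1.8, 1.8)}

# "Flip a coin, Heads for Sushi, Tails for Italian":
coin_flip = tuple(
    0.5 * s + 0.5 * i for s, i in zip(options["Sushi"], options["Italian"])
)
print(coin_flip)  # (2.0, 2.0)

# Strictly better than Chinese for both players, so Chinese is off the
# Pareto frontier once randomization is allowed.
assert all(c > ch for c, ch in zip(coin_flip, options["Chinese"]))
```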
Alright, this is kind of a Special Interest, so here's your relevant thought dump.
First up, the image is kind of misleading, in the sense that you can always tack on extra orders of magnitude. You could tack on another thousand orders of magnitude and make it look even longer, or just go "this is 900 OOMs of literally nothing happening, let's clip that off and focus on the interesting part".
Assuming proton decay is a thing (that free protons decay with a ridiculously long half-life)....
ok, I was planning on going "as a ludicrous upper bound, here's the numb...
Yeah, "transferrable utility games" are those where there is a resource, and the utilities of all players are linear in that resource (in order to redenominate everyone's utilities as being denominated in that resource modulo a shift factor). I believe the post mentioned this.
Task completed.
Agreed. The bargaining solution for the entire game can be very different from adding up the bargaining solutions for the subgames. If there's a subgame where Alice cares very much about victory in that subgame (interior decorating choices) and Bob doesn't care much, and another subgame where Bob cares very much about it (food choice) and Alice doesn't care much, then the bargaining solution of the entire relationship game will end up being something like "Alice and Bob get some relative weights on how important their preferences are, and in all the subgam...
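A toy version of that weighted arrangement (all numbers mine, for illustration): each subgame goes to whoever cares more about it once the fixed relationship-level weights are applied.

```python
# Hypothetical bargaining weights on Alice's and Bob's utilities,
# fixed once for the whole relationship game.
w_alice, w_bob = 1.0, 1.2

# (alice_utility, bob_utility) for each option in each subgame.
subgames = {
    "decor": {"alice_pick": (5, -1), "bob_pick": (-4, 1)},
    "food":  {"alice_pick": (1, -4), "bob_pick": (-1, 5)},
}

for name, opts in subgames.items():
    winner = max(opts, key=lambda o: w_alice * opts[o][0] + w_bob * opts[o][1])
    print(name, "->", winner)  # decor -> alice_pick, food -> bob_pick
```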
Actually, they apply anyways in all circumstances, not just after the rescaling and shifting is done! Scale-and-shift invariance means that no matter how you stretch and shift the two axes, the bargaining solution always hits the same probability-distribution over outcomes, so monotonicity means "if you increase the payoff numbers you assign for some or all of the outcomes, the Pareto frontier point you hit will give you an increased number for your utility score over what it'd be otherwise" (no matter how you scale-and-shift). And independence of ir...
If you're looking for curriculum materials, I believe that the most useful reference would probably be my "Infra-exercises", a sequence of posts containing all the math exercises you need to reinvent a good chunk of the theory yourself. Basically, it's the textbook's exercise section, and working through interesting math problems and proofs on one's own has a much better learning feedback loop and retention of material than slogging through the old posts. The exercises are short on motivation and philosophy compared to the posts, though, much like how a fu...
So, if you make Nirvana infinite utility, yes, the fairness criterion becomes "if you're mispredicted, you have any probability at all of entering the situation where you're mispredicted" instead of "have a significant probability of entering the situation where you're mispredicted", so a lot more decision-theory problems can be captured if you take Nirvana as infinite utility. But, I talk in another post in this sequence (I think it was "the many faces of infra-beliefs") about why you want to do Nirvana as 1 utility instead of infinite utility.
Parfit's Hi...
So, the flaw in your reasoning: after updating on being in the city, the agent doesn't go "logically impossible, infinite utility". We just go "alright, off-history measure gets converted to 0 utility", a perfectly standard update. So it updates to (0,0) (ie, there's 0 probability I'm in this situation in the first place, and my expected utility for not getting into this situation in the first place is 0, because of probably dying in the desert).
As for the proper way to do this analysis, it's a bit finicky. There's something called "acausal form...
Omega and hypercomputational powers aren't needed, just decent enough prediction about what someone would do. I've seen Transparent Newcomb being run on someone before, at a math camp. They were predicted to not take the small extra payoff, and they didn't. And there was also an instance of acausal vote trading that I managed to pull off a few years ago, and I've put someone in a counterfactual mugging sort of scenario where I did pay out due to predicting they'd take the small loss in a nearby possible world. 2/3 of those instances were cases where I was s...
The key piece that makes any Lobian proof tick is the "proof of X implies X" part. For Troll Bridge, X is "crossing implies bridge explodes".
For standard logical inductors, that Lobian implication holds because, if a proof of X showed up, every trader betting in favor of X would get free money, so there could be a trader that just names a really really big bet in favor of X (it's proved, after all). The agent then ends up believing X, and so doesn't cross, and so crossing implies the bridge explodes.
For this particular variant of a logical inductor, there's an...
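A minimal toy sketch of the standard-inductor argument above (this is not a real logical inductor, just the incentive structure, and all the numbers and names are mine):

```python
# Once X ("crossing implies bridge explodes") is proved, betting on X is
# free money, so some trader names a huge bet and the market price of X
# gets pushed up toward 1.
def price_of_X(x_is_proved: bool, prior_price: float = 0.5) -> float:
    return 0.99 if x_is_proved else prior_price

# The agent acts on the market's beliefs: it only crosses if X looks unlikely.
def crosses(p_X: float) -> bool:
    return p_X < 0.5

# A proof of X appears => the agent believes X => it doesn't cross,
# which is exactly the "proof of X implies X" step the Lobian argument needs.
assert not crosses(price_of_X(x_is_proved=True))
```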
Said actions or lack thereof cause a fairly low utility differential compared to the actions in other, non-doomy hypotheses. Also I want to draw a critical distinction between "full Knightian uncertainty over meteor presence or absence", where your analysis is correct, and "ordinary probabilistic uncertainty between a high-Knightian-uncertainty hypothesis, and a low-Knightian-uncertainty one that says the meteor almost certainly won't happen" (where the meteor hypothesis will be ignored unless there's a meteor-inspired modification to what you do that's al...
Something analogous to what you are suggesting occurs. Specifically, let's say you assign 95% probability to the bandit game behaving as normal, and 5% to "oh no, anything could happen, including the meteor". As it turns out, this behaves similarly to the ordinary bandit game being guaranteed, as the "maybe meteor" hypothesis assigns all your possible actions a score of "you're dead" so it drops out of consideration.
The important aspect which a hypothesis needs, in order for you to ignore it, is that no matter what you do you get the same outcome, whether ...
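A toy numeric version of the 95/5 mixture just described (the expected utilities are made up):

```python
# Hypothetical expected utilities under the "bandit game behaves as normal"
# hypothesis.
normal = {"pull_A": 0.7, "pull_B": 0.4}

# Under the "maybe meteor" hypothesis, every action scores "you're dead".
meteor_worst_case = 0.0

def mixed_score(action: str) -> float:
    return 0.95 * normal[action] + 0.05 * meteor_worst_case

# The meteor term is the same constant for every action, so it can't change
# which action wins: the hypothesis drops out of consideration.
assert max(normal, key=mixed_score) == max(normal, key=normal.get)
```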
Well, taking worst-case uncertainty is what infradistributions do. Did you have anything in mind that can be done with Knightian uncertainty besides taking the worst-case (or best-case)?
And if you were dealing with best-case uncertainty instead, then the corresponding analogue would be assuming that you go to hell if you're mispredicted (and then, since best-case things happen to you, the predictor must accurately predict you).
This post is still endorsed, it still feels like a continually fruitful line of research. A notable aspect of it is that, as time goes on, I keep finding more connections and crisper ways of viewing things which means that for many of the further linked posts about inframeasure theory, I think I could explain them from scratch better than the existing work does. One striking example is that the "Nirvana trick" stated in this intro (to encode nonstandard decision-theory problems), has transitioned from "weird hack that happens to work" to "pops straight out...
Availability: Almost all times between 10 AM and PM, California time, regardless of day. Highly flexible hours. Text over voice is preferred, I'm easiest to reach on Discord. The LW Walled Garden can also be nice.
Amendment made.
A note to clarify for confused readers of the proof. We started out by assuming , and . We conclude by how the agent works. But the step from there to (ie, inconsistency of PA) isn't entirely spelled out in this post.
Pretty much, that follows from a proof by contradiction. Assume con(PA) ie , and it happens to be a con(PA) theorem that the agent can't prove in advance what it will do, ie, . (I can spell this out in more detail if anyone wants) However, com...
In the proof of Lemma 3, it should be
"Finally, since , we have that .
Thus, and are both equal to .
instead.
Any idea of how well this would generalize to stuff like Chicken, or to games with more than 2 players or 2 moves?
I was subclinically depressed, acquired some bupropion from Canada, and it's been extremely worthwhile.
I don't know; we're hunting for it. Relaxations of dynamic consistency would be extremely interesting if found, and I'll let you know if we turn up anything nifty.
Looks good.
Re: the dispute over normal bayesianism: For me, "environment" denotes "thingy that can freely interact with any policy in order to produce a probability distribution over histories". This is a different type signature than a probability distribution over histories, which doesn't have a degree of freedom corresponding to which policy you pick.
But for infra-bayes, we can associate a classical environment with the set of probability distributions over histories (for various possible choices of policy), and then the two distinct notions becom...
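A rough sketch of the two type signatures being contrasted (Python stand-in types; the names are mine, not from any library):

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")

class Distribution(Generic[T]):
    """Placeholder for 'probability distribution over T'."""

class History: ...
class Policy: ...

# A classical Bayesian hypothesis: one fixed distribution over histories,
# with no slot for which policy gets played.
Hypothesis = Distribution[History]

# An "environment": something you can feed any policy into, getting a
# distribution over histories out.
Environment = Callable[[Policy], Distribution[History]]
```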
I'd say this is mostly accurate, but I'd amend number 3. There's still a sort of non-causal influence going on in pseudocausal problems, you can easily formalize counterfactual mugging and XOR blackmail as pseudocausal problems (you need acausal specifically for Transparent Newcomb, not vanilla Newcomb). But it's specifically a sort of influence that's like "reality will adjust itself so contradictions don't happen, and there may be correlations between what happened in the past, or other branches, and what your action is now, so you can exploit this by ac...
Re point 1, 2: Check this out. For the specific case of 0 to even bits, ??? to odd bits, I think Solomonoff can probably get that, but not more general relations.
Re: point 3, Solomonoff is about stochastic environments that just take your action as an input, and aren't reading your policy. For infra-Bayes, you can deal with policy-dependent environments without issue, as you can consider hard-coding in every possible policy to get a family of stochastic environments, and UDT behavior naturally falls out as a result from this encoding. There's still some op...
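A minimal sketch of that hard-coding construction (the function names are mine):

```python
# A policy-dependent environment reads the whole policy. Fixing the policy
# argument turns it into an ordinary stochastic environment, so ranging
# over all policies yields a family of stochastic environments.
def fix_policy(policy_dependent_env, policy):
    def stochastic_env(history):
        # Now just: history -> distribution over next observations.
        return policy_dependent_env(policy, history)
    return stochastic_env

# family = {pi: fix_policy(env, pi) for pi in all_policies}
```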
Ah. So, low expected utility alone isn't too much of a problem. The amount of weight a hypothesis has in a prior after updating depends on the gap between the best-case values and worst-case values. Ie, "how much does it matter what happens here". So, the stuff that withers in the prior as you update are the hypotheses that are like "what happens now has negligible impact on improving the worst-case". So, hypotheses that are like "you are screwed no matter what" just drop out completely, as if it doesn't matter what you do, you might as well pick actions t...
"mixture of infradistributions" is just an infradistribution, much like how a mixture of probability distributions is a probability distribution.
Let's say we've got a prior $\zeta$, a probability distribution over indexed hypotheses $H_i$.
If you're working in a vector space, you can take any countable collection of sets in said vector space, and mix them together according to a prior giving a weight to each set. Just make the set of all points which can be made by the process "pick a point from each set, and mix the points together according to ...
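A toy numeric version of that "pick a point from each set, then mix" construction, using finite sets of distributions over two outcomes (in general you'd take all points of each convex set, not just these representatives):

```python
from itertools import product

def mix_sets(sets, weights):
    """All points obtainable by picking one distribution from each set and
    averaging the picks with the prior weights."""
    mixed = set()
    for picks in product(*sets):  # one point chosen from each set
        point = tuple(
            sum(w * p[i] for w, p in zip(weights, picks))
            for i in range(len(picks[0]))
        )
        mixed.add(point)
    return mixed

A = [(1.0, 0.0), (0.5, 0.5)]  # one "set" of distributions
B = [(0.0, 1.0)]              # another
print(mix_sets([A, B], [0.5, 0.5]))  # {(0.5, 0.5), (0.25, 0.75)}
```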
The concave functional view is "the thing you do with a probability distribution is take expectations of functions with it. In fact, it's actually possible to identify a probability distribution with the function mapping a function to its expectation. Similarly, the thing we do with an infradistribution is taking expectations of functions with it. Let's just look at the behavior of the function we get, and neglect the view of everything as a set of a-measures."
As it turns out, this view makes proofs a whole lot cleaner a...
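In symbols, for the crisp case: a probability distribution is identified with its expectation functional, and a crisp infradistribution (a set $H$ of distributions) with the worst-case expectation over the set,
$$\mu \;\longmapsto\; \big(f \mapsto \mathbb{E}_{\mu}[f]\big), \qquad \psi_H(f) \;=\; \inf_{\mu \in H} \mathbb{E}_{\mu}[f],$$
and $\psi_H$ is concave and monotone, being an infimum of monotone linear functionals.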
Sounds like a special case of crisp infradistributions (ie, all partial probability distributions have a unique associated crisp infradistribution)
Given some partial probability distribution $p$, we can consider the (nonempty) set of probability distributions equal to $p$ where $p$ is defined. This set is convex (clearly, a mixture of two probability distributions which agree with $p$ about the probability of an event will also agree with $p$ about the probability of an event).
Convex (compact) sets of probability distributions = crisp infradistributions....
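A quick numeric check of that convexity claim (toy numbers mine): two distributions agreeing with a partial assignment of probability 0.3 to an event still agree after mixing.

```python
mu1 = {"a": 0.3, "b": 0.5, "c": 0.2}  # assigns the event {"a"} probability 0.3
mu2 = {"a": 0.3, "b": 0.1, "c": 0.6}  # also assigns {"a"} probability 0.3
lam = 0.4

mix = {o: lam * mu1[o] + (1 - lam) * mu2[o] for o in mu1}

# The mixture is still a probability distribution agreeing with the
# partial assignment, so the set of agreeing distributions is convex.
assert abs(sum(mix.values()) - 1.0) < 1e-12
assert abs(mix["a"] - 0.3) < 1e-12
```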
You're completely right that hypotheses with unconstrained Murphy get ignored because you're doomed no matter what you do, so you might as well optimize for just the other hypotheses where what you do matters. Your "-1,000,000 vs -999,999 is the same sort of problem as 0 vs 1" reasoning is good.
Again, you are making the serious mistake of trying to think about Murphy verbally, rather than thinking of Murphy as the personification of the "inf" part of the definition of expected value, and writing actual equations.
There's actually an upcoming post going into more detail on what the deal is with pseudocausal and acausal belief functions, among several other things, I can send you a draft if you want. "Belief Functions and Decision Theory" is a post that hasn't held up nearly as well to time as "Basic Inframeasure Theory".
If you use the Anti-Nirvana trick, your agent just goes "nothing matters at all, the foe will mispredict and I'll get -infinity reward" and rolls over and cries since all policies are optimal. Don't do that one, it's a bad idea.
For the concave expectation functionals: Well, there's another constraint or two, like monotonicity, but yeah, LF duality basically says that you can turn any (monotone) concave expectation functional into an inframeasure. Ie, all risk aversion can be interpreted as having radical uncertainty over some aspects of how the environment...
Maximin, actually. You're maximizing your worst-case result.
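In symbols, with $\Pi$ the set of policies and $\mathcal{E}$ the set of options the adversary ranges over:
$$\pi^* \in \operatorname*{arg\,max}_{\pi \in \Pi} \; \inf_{e \in \mathcal{E}} \; \mathbb{E}_{\pi, e}[U].$$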
It's probably worth mentioning that "Murphy" isn't an actual foe where it makes sense to talk about destroying resources lest Murphy use them, it's just a personification of the fact that we have a set of options, any of which could be picked, and we want to get the highest lower bound on utility we can for that set of options, so we assume we're playing against an adversary with perfectly opposite utility function for intuition. For that last paragraph, translating it back out from the "Murphy" t...
I found this Quanta magazine article about it which seems to indicate that it fits the CMB spectrum well but required a fair deal of fiddling with gravity to do so, but I lamentably lack the physics capabilities to evaluate the original paper.
If there's something wrong with some theory, isn't it quite odd that looking around at different parts of the universe seems to produce such a striking level of agreement on how much missing mass there is? If there was some out-of-left-field thing, I'd expect it to have confusing manifestations in many different areas, with astronomers angsting about dramatically inconsistent measurements; I would not expect the CMB to end up explained away (and the error bars on those measurements are really really small) by the same 5:1 mix of non-baryonic matter vs baryon...
Yes, pink is gas and purple is mass, but also the gas there makes up the dominant component of the visible mass in the Bullet Cluster, far outweighing the stars.
Also, physicists have come up with a whole lot of possible candidates for dark matter particles. The supersymmetry-based ones took a decent kicking at the LHC, and I'm unsure of the motivations for some of the other ones, but the two that look most promising (to me, others may differ in opinion) are axions and sterile neutrinos, as those were conjectured to plug holes in the Standard Model, so they...
I'd go with number 2, because my snap reaction was "ooh, there's a "show personal blogposts" button?"
EDIT: Ok, I found the button. The problem with that button is that it looks identical to the other tags, and is at the right side of the screen when the structure of "Latest" draws your eyes to the left side of the screen. I'd make it a bit bigger and put it on the left side of the screen.
So, first off, I should probably say that a lot of the formalism overhead involved in this post in particular feels like the sort of thing that will get a whole lot more elegant as we work more things out, but "Basic inframeasure theory" still looks pretty good at this point and worth reading, and the basic results (ability to translate from pseudocausal to causal, dynamic consistency, capturing most of UDT, definition of learning) will still hold up.
Yes, your current understanding is correct, it's rebuilding probability theory in more generality to be sui...
So, we've also got an analogue of KL-divergence for crisp infradistributions.
We'll be using $H_1$ and $H_2$ for crisp infradistributions, and $\mu_1$ and $\mu_2$ for probability distributions associated with them. $D_{KL}(H_1 \Vert H_2)$ will be used for the KL-divergence of infradistributions, and $D_{KL}(\mu_1 \Vert \mu_2)$ will be used for the KL-divergence of probability distributions. For crisp infradistributions, the KL-divergence is defined as
I'm not entirely sure why it's like this, but it has the basic properties yo...
It is currently disassembled in my garage, will be fully tested when the 2.0 version is built, and the 2.0 version has had construction stalled for this year because I've been working on other projects. The 1.0 version did remove CO2 from a room as measured by a CO2 meter, but the size and volume made it not worthwhile.
That original post lays out UDT1.0; I don't see anything about precomputing the optimal policy within it. The UDT1.1 fix of optimizing the global policy, instead of figuring out the best thing to do on the fly, was first presented here; note that the 1.1 post that I linked came chronologically after the post you linked.