This post does not prove Maximal Lottery Lotteries exist. Instead, it redefines MLL to be equivalent to the Nash bargaining solution (in a way that is obscured by using the same language as the MLL proposal), and then claims that under the new definition MLL exist (because the Nash bargaining solution exists).
I like Nash bargaining, and I don't like majoritarianism, but the MLL proposal is supposed to be a steelman of majoritarianism, and Nash bargaining is not only not MLL, but it is not even majoritarian. (If a majority of voters have the same favorite candidate, this is not sufficient to make this candidate win in the Nash bargaining solution.)
Dang. I wasn't entirely sure whether you were firm on the definition of lottery-lottery dominance or if that was more speculative. I guess I wasn't clear that MLLs were specifically meant to be "majoritarianism but better"? Given that you meant for it to be, this post sure doesn't prove that they exist. You're absolutely right that you can cook up electorates where the majority-favored candidate isn't the Nash bargaining/Geometric MLL favored candidate.
Hello! I'm glad to read more material on this subject.
First I want to note that it took me some time to understand the setup, since you're working with a notion of maximal lottery-lotteries modified from the one Scott wrote about. This made it unclear to me what was going on until I'd read through a bunch and put it together, and it changes the meaning of the post's title as well.
For that reason I'd like to recommend adding something like "Geometric" in your title. Perhaps we can then talk about this construction as "Geometric Maximal Lottery-Lotteries", or "Maximal Geometric Lottery-Lotteries"? Whichever seems better!
It seems especially important to distinguish names because these seem to behave distinctly from the linear version. (As they have different properties in how they treat the voters, and perhaps fewer or different difficulties in existence, stability, and effective computation.)
With that out of the way, I'm a tentative fan of the geometric version, though I have more to unpack about what it means. I'll divide my thoughts & questions into a few sections below. I am likely confused on several points. And my apologies if my writing is unclear, please ask followup questions where interesting!
When reading the earlier sequence I was struck by how unwieldy the linear/majoritarian formulation ends up being! Specifically, it seemed that the full maximal-lottery-lottery would need to encode all of the competing coordination cliques in the outer lottery, but then these are unstable to small perturbations that shift coordination options from below-majority to above-majority. And this seemed like a real obstacle in effectively computing approximations, and if I understand correctly is causing the discontinuity that breaks the Nash-equilibria-based existence proof.
My thought then about what might make more sense was a model of "war"/"power" in which votes against directly cancel out votes for. So in the case of an even split we get zero utility rather than whatever the majority's utility would be. My hope was that this was a more realistic model of how power should work, one which would also be stable to small perturbations and lend more weight to outcomes preferred by supermajorities. I never cashed this out fully though, since I didn't find an elegant justification and lost interest.
So I haven't thought this part through much (yet), but your model here, in which we are taking a geometric expectation, seems like a bargaining regime that's downstream of each voter having the ability to torpedo the whole process in favor of some zero point. And I'd conjecture that if power works like this, then thinking through fairness considerations and such, we end up with the bargaining approach. I'm interested if you have a take here.
I was also a big fan of the full personal utility information being relevant, since it seems that choosing the "right" outcome should take full preferences about tradeoffs into account, not just the ordering of the outcomes. It was also important to the majoritarian model of power that the scheme was invariant to (affine) changes in utility descriptions (since all that matters to it is where the votes come down).
Thinking about what's happened with the geometric expectation, I'm wondering how I should view the input utilities. Specifically, the geometric expectation is very sensitive to points assigned zero-utility by any part of the voting measure. So we will never see probability 1 assigned to an outcome that has any voting-measure on zero utility (assuming said voting-measure assigns non-zero utility to another option).
We can at least offer, say, some probability on the most preferred options across the voting measure, which ameliorates this.
But then I still have some questions about how I should think about the input utilities, how sensitive the scheme is to those, whether I can imagine it being gameable if voters are making the utility specifications, etc.
The original sequence justified lottery-lotteries with a (compelling-to-me) example about leadership vs anarchy, in which the maximal lottery cannot encode the necessary negotiating structure to find the decent outcome, but the maximal lottery-lottery could!
This coupled with the full preference-spec being relevant (i.e. taking into account what probabilistic tradeoffs each voter would be interested in) sold me pretty well on lottery-lotteries being the thing.
It seemed important then that there was something different happening on the outer and inner levels of lottery. Specifically, when checking for dominance of lottery-lotteries A⪰B, we would check Pv∼V(EL∼Av(L)≥EL∼Bv(L))≥1/2, where v(L)=Ec∼Lv(c). This is doing a majority check on the outside, and compares lotteries via an average (i.e. expected utility) on the inside.
Is there a similar two-level structure going on in this post? It seemed that your updated dominance criterion is taking an outer geometric expectation but then double-sampling through both layers of the lottery-lottery, so I'm unclear that this adds any strength beyond a single-layer "geometric maximal lottery".
(And I haven't tried to work through e.g. the anarchy example yet, to check if the two layers are still doing work, but perhaps you have and could illustrate?)
So yeah I was expecting to see something different in the geometric version of the condition that would still look "two-layer", and perhaps I'm failing to parse it properly. (Or indeed I might be missing something you already wrote later in the post!) In any case I'd appreciate a natural language description of the process of comparing two lottery-lotteries.
Thinking about what's happened with the geometric expectation, I'm wondering how I should view the input utilities. Specifically, the geometric expectation is very sensitive to points assigned zero-utility by any part of the voting measure.
This comment may be relevant here.
I haven't engaged with this in enough detail, but some people who engaged with Scott's sequence who I can imagine being interested in this: @Scott Garrabrant , @James Payor, @Quinn, @Nathan Helm-Burger, @paulfchristiano.
dedicated to a very dear cat;
;3
Fraktur is only ever used for the candidate set \frak{C} and the dregs set \frak{D}. I would also have used it for the Smith set \frak{S}, but \frak{S} is famously bad. I thought it was a G for years until grad school because it used to be standard for the symmetry group on n letters. Seriously, just look at it: 𝔖.
Typography is a science and if it were better regarded perhaps mathematicians would not be in the bind they are these days :P
It suffices to show that the Smith lotteries that the above result establishes are the only lotteries that can be part of maximal lottery-lotteries are also subject to the partition-of-unity condition.
I fail to understand this sentence. Here are some questions about this sentence:
what are Smith lotteries? Ctrl+f only finds lottery-Smith lottery-lotteries, do you mean these? Or do you mean lotteries that are Smith?
which result do you mean by "above result"?
What does it mean for a lottery to be part of maximal lottery-lotteries?
does "also subject to the partition-of-unity" refer to the smith lotteries or to the lotteries that are part of maximal lottery-lotteries? (it also feels like there is a word missing somewhere)
Why would this suffice?
Is this part also supposed to imply the existence of maximal lottery-lotteries? If so, why?
To avoid confusion: this comment and my reply to it were also made on a past version of this post; that version lacked any investigation of dominance criterion desiderata for lottery-lotteries.
what are Smith lotteries?
Lotteries over the Smith set. That definitely wasn't clear - I'll fix that.
which result do you mean by "above result"?
This one. "You can tell whether a lottery-lottery is maximal or not by how good the partitions of unity it admits are." Sorry, didn't really know a good way to link to myself internally and I forgot to number the various statements.
What does it mean for a lottery to be part of maximal lottery-lotteries?
Just that some maximal lottery-lottery gives it nonzero probability.
does "also subject to the partition-of-unity" refer to the smith lotteries or to the lotteries that are part of maximal lottery-lotteries? (it also feels like there is a word missing somewhere)
Oh no! I thought I caught all the typos! That should be "also subject to the partition-of-unity condition", that is, you look at all the lotteries (which we know are over the Smith set, btw) that some arbitrary maximal lottery-lottery gives any nonzero probability to, and you should expect to be able to sort them into groups by what final probability over candidates they induce; those final probabilities over candidates should themselves result in identical geometric-expected utility for the voterbase.
Why would this suffice?
Consider: at this point we know that a maximal lottery-lottery would not just have to be comprised of lottery-Smith lotteries, i.e., lotteries that are in the lottery-Smith set - but also that they have to be comprised entirely of lotteries over the Smith set of the candidate set. Then on top of that, we know that you can tell which lottery-lotteries are maximal by which partitions of unity they admit (that's the "above result"). Note that by "admit" we mean "some subset of the lotteries this lottery-lottery has support over corresponds to it" (this partition of unity).
This is the slightly complicated part. The game I described has a mixed strategy equilibrium; this will take the form of some probability distribution over ΔC. In fact it won't just have one, it'll likely have whole families of them. Much of the time, the lotteries randomized over won't be disjoint - they'll both assign positive probability to some candidate. The key is, the voter doesn't care. As far as a voter's expected utility is concerned, the only thing that matters is the final probability of each candidate.
That's where your collapse of different possible maximal lottery-lotteries to the same partition of unity comes in. Because we know that equivalent candidate-lotteries give voters the same expected utility, the only two ways you get a voter who's indifferent between two candidate-lotteries are 1) they're the same lottery or 2) the voter's utility function is just right to have two very different lotteries tie. Likewise, the only two ways you get a voterbase to be indifferent between two lottery-lotteries are 1) they reduce to the same lottery or 2) the geometric expectation of a voter's utility over candidates sampled from the samples of the lottery-lottery Just Plain Ties.
So: if we can show that whatever equilibrium set of candidate-lotteries Alice and Bob pick always collapses to some convex combination of the Best partitions of unity...? Yeah, I don't think that the second half of the proof holds up as is.
I think I've slightly messed up the definition of lottery-Smith, though not in a fatal way nor (thankfully) in a way that looks to threaten the existence result. The set condition is too strong, in requiring that a lottery-Smith lottery contain all lotteries which correspond to any of the admissible partitions. I'm just going to cut it; it's not actually necessary.
Is this part also supposed to imply the existence of maximal lottery-lotteries? If so, why?
Yes - and in particular, it implies the existence of maximal lottery-lotteries before it even tries to prove that they're also lottery-Smith in the sense I define.
Scott's proof doesn't quite work (as he says there) - it almost works, except for the part where the reward functions for Alice and Bob can quite reasonably be discontinuous. My proof is intended as a patch - the reward functions for Alice and Bob should now be extremely continuous in a way that also corresponds well to "how much better did Alice do at picking a candidate-lottery that V will like than Bob did?".
Hopefully this helped? Reading this is confusing even for me sometimes - the word "lottery/lotteries", which appears 59 times in this comment alone, no longer looks like a real word to me and hasn't since late Wednesday. Your comment was really helpful - I have some editing to do! (update - editing is done.)
epistemic/ontological status: almost certainly all of the following -
Maximal Lottery-Lotteries: Far More Than It Ever Occurred To You To Want To Know
This post is mostly a distillation/concentrated treatment of the Maximal Lottery-Lotteries Sequence, though I define an important property and prove an important existence result at the end. It's the first of two posts in a sequence, the second of which is linked here.
The Maximal Lottery-Lotteries sequence details why anyone should care about sortitive/lottery-using electoral systems even without any particular hope of getting to implement it.[1] It also ends in early November of 2022 with the author and a handful of other technical alignment notables all honorably giving up on the shiny math problem with varying degrees of explicitness. This post is meant as a follow-on to the sequence, and its sequel is, too. I also end up drawing heavily on the Geometric Rationality sequence for its central tools.
As per my usual policy, if I've managed to misunderstand anything or write anything up unclearly or incorrectly, or you have any questions about what I've written, please comment below or at https://www.admonymous.co/lorxus and I'll fix it/reply to it when I can.
If you're reading this post for the first time, you might want to keep the notation reference on hand. If you're relatively inexperienced with reading text which treats dense mathematical notation on par with English prose, slow down and make sure you understand what each mathematical expression really means or what object or type of object it refers to before continuing. "Is it a nonce-object instantiated just for the proof, or does the notation suggest anything more about its type or identity?" If you can answer that kind of question without much effort, it'll make understanding any mathematically-dense text much easier. Mathematical notation is extremely compact and precise; if I tried to write all the below purely in prose, it'd be three times as long and a tenth as clear - but all compression comes with compute costs, and math notation is no exception.[2]
Generalized Voting Theory
Let C be a set of candidates, VC≅Hom(C,[0,1]) the voterspace for C, which is also the set of all utility functions on C (or candidate preferences), and V∈ΔVC an electorate (or voterbase), which is a partition of unity tagged by candidate preferences which we may think of as the "voter share" with each candidate preference.[3] Then a voting system f:ΔVC→ΔC is a function taking a distribution over voterspace on C to a distribution over C.[4] We write fC(V) for the outcome of voting system f with choice of candidates taken from C and polling electorate V; we'll often suppress the subscript when the candidate set is clear.
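If it helps to see these type signatures written out, here's a minimal Python sketch (all names and representations are mine, purely illustrative, with a finitely-supported electorate): a voter is a utility function on candidates, an electorate pairs voters with vote-shares summing to 1, and a voting system maps an electorate to a distribution over candidates.

```python
from typing import Callable

Candidate = str
Voter = dict[Candidate, float]          # a utility function C -> [0, 1]
Electorate = list[tuple[float, Voter]]  # vote-shares (summing to 1) tagged by candidate preferences
Lottery = dict[Candidate, float]        # a distribution over candidates

# A voting system f_C : ΔV_C -> ΔC; a deterministic one puts all its mass on a single candidate.
VotingSystem = Callable[[Electorate], Lottery]

def plurality(electorate: Electorate) -> Lottery:
    """Toy deterministic example: each vote-share goes to its favorite candidate."""
    scores: dict[Candidate, float] = {}
    for share, voter in electorate:
        favorite = max(voter, key=voter.get)
        scores[favorite] = scores.get(favorite, 0.0) + share
    winner = max(scores, key=scores.get)
    return {winner: 1.0}
```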
This is classically subject to the following four constraints:[5]
We can generally ignore regularity, and (spoilers) we'll soon be relaxing determinism, too. At the end, we even get rid of ordinality.
Now for a few desirable properties voting systems can have. We'll phrase them in our language of functions and voting outcomes, which lends itself so much better to generalizing electoral outcomes from deterministic outcomes to lotteries:
Equivalently: assume that for some c∈C, it is the case that ∀d∈C,Pv∼V(v(c)>v(d))>1/2 whenever d≠c. Then a Condorcet f satisfies fC(V)(c)=1.[6]
Equivalently: assume that we can separate C as C=S⊔D such that ∀c∈S,d∈D,Pv∼V(v(c)>v(d))>1/2. Then a Smith f satisfies fC(V)(S)=1, that is, fC(V)∈ΔS.
Equivalently: assume that fC(V)=fC(W)=D∈ΔC. Then for all choices of intermediacy parameter p∈[0,1], a consistent f satisfies fC(p⋅V+(1−p)⋅W)=D.
Equivalently, let W∈ΔVC be an electorate, v∈VC a voter, p∈[0,1] an intermediacy parameter. Then a participatory f satisfies v(fC(p⋅v+(1−p)⋅W))≥v(fC(W)), where we interpret p⋅v+(1−p)⋅W as the p-parametrized mix of W and the element V∈ΔVC that happens to put measure 1 on v, or equivalently as something for which sampling from it means first choosing between v with probability p and otherwise a draw from W with probability 1−p, and where for a lottery D∈ΔC, we interpret v(D)=Ed∼Dv(d).
Equivalently, let C=ˆC⊔{d}⊔D, and suppose that ∀c∈ˆC,di∈D,v∈V∈ΔVC, we have v(c)>v(di)↔v(c)>v(d) and v(c)<v(di)↔v(c)<v(d). Then a dupe-resistant f satisfies fC(V)(c)=fC∖D(V)(c) for all c∈ˆC, that is, c still wins exactly as much as if there were just no d-dupes.
Famously, Arrow's impossibility theorem says that for a classical voting system (i.e., one that deterministically spits out a single candidate at the end, among other requirements), you can't have all of these. You can't even have the Condorcet criterion, dupe-resistance, and consistency at the same time - isomorphic to Arrow's usual sorrowful guarantee. You can't even have both participation and the Condorcet criterion at the same time! A pall of soot falls over the Sun, and democracy dies to thunderous applause.
Also, I've cheated a little here, because that's technically not the standard definition of the Smith criterion. That's because the real standard definition is actually just wrong for our purposes - it always talks about such a voting system always picking its unique winner from the Smith set. We recall that voting theorists usually assume determinism of outcomes, which we have long since made up our minds to discard - if we were to put that assumption back, we recover the classical definition of the Smith criterion.
But look - phrasing it this way means that the Smith criterion generalizes itself when we pass from deterministic elections to lotteries, and it even gets much more powerful and flexible when we do! We should think of the Smith condition as being "the Condorcet condition, but one which fails more gracefully and has a reasonable plan in case there's no Condorcet winner". For deterministic elections, that additional strength doesn't count for much more than some annoying noise in the outcomes. However, this weakness - that if we force the Smith set to do all the final choosing for us, we then have to blame it for sometimes producing unsatisfying outcomes - is one we already have the tools to trivially resolve. We simply leave it to a lottery, and the Smith criterion's issues fall away. Sortitive voting systems are the Smith criterion's true home.[8]
The Smith criterion's precondition is strictly weaker than that of the Condorcet criterion (because if there's a Condorcet winner, then the Smith set is a singleton containing just that candidate) and its backup guarantees are strictly stronger than those of the Condorcet criterion (some admittedly weaker control over the outcome of the election, as opposed to nothing at all), so overall the Smith criterion is a more empowering property than the Condorcet criterion is. That said, it's not by all that much - for example, in a worst-case scenario where the voterbase's collective preferences over candidates are maximally nontransitive, the Smith set is just the entire candidate set.
Lottery Outcomes
We start by relaxing determinism and ceasing to bother tracking regularity from among the assumptions in the previous section; the immediate consequence of this is that fC(V) is in general a lottery D∈ΔC rather than an individual candidate (or in reality a measure-1 chance of drawing some specific candidate) as might have been tacitly assumed in the previous section.
Probabilistic outcomes also require actually formulating a definition of what it means for one candidate to beat another. Let A,B∈ΔC be candidate-lotteries, V∈ΔVC an electorate, and for the sake of notation take draws a∼A,b∼B,v∼V. Then A dominates B (and we write A⪰B) if Pa,b,v(v(a)>v(b))≥Pa,b,v(v(b)>v(a)). In slightly more prose, we take independent draws a,b,v from A,B,V, and then compare v(a),v(b)∈[0,1], and if it's more often or more likely the case that (after we've drawn all our samples) v prefers a to b, then A dominates B. Importantly, dominance is not necessarily a transitive relation, but it still only requires ordinal preference rankings and so our assumption of ordinality is safe for now.
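To make the dominance check concrete, here's a minimal Python sketch (the names and representations are mine, purely illustrative): a candidate-lottery is a dict of probabilities, an electorate is a list of (vote-share, utility-function) pairs, and we compute the two probabilities exactly by summing over the finite supports instead of sampling.

```python
def dominates(A: dict[str, float], B: dict[str, float],
              electorate: list[tuple[float, dict[str, float]]]) -> bool:
    """A ⪰ B iff P(v(a) > v(b)) >= P(v(b) > v(a)) over independent draws a~A, b~B, v~V."""
    p_a_preferred = p_b_preferred = 0.0
    for share, voter in electorate:
        for a, pa in A.items():
            for b, pb in B.items():
                if voter[a] > voter[b]:
                    p_a_preferred += share * pa * pb
                elif voter[b] > voter[a]:
                    p_b_preferred += share * pa * pb
    return p_a_preferred >= p_b_preferred
```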
A maximal lottery is some lottery M∈ΔC such that for all candidate-lotteries L∈ΔC,M⪰L.
Additionally, M is Condorcet, consistent, dupe-resistant, and participatory. Indeed, M is characterized as the unique candidate lottery which is Condorcet, consistent, and dupe-resistant. We'll only prove the first two of those as both the most important and most surprising pair, in light of Arrow's impossibility theorem, and because the proof of dupe-resistance actually follows from the below lemma that maximal lotteries are Condorcet. There's nothing particularly fancy or clever going on here, just a chase through some linear-algebra and a little bit of definition-prodding:
With a tiny bit of effort, I can extend this result to the Smith criterion. Unlike the previous lemmas, I suspect that this is an original result, if a modest one. As above, there's nothing too fancy going on here, just some linear-algebra manipulation and definition-prodding:
It's worth noting that on the other hand, a maximal lottery need not assign positive measure to every candidate in the Smith set, as far as I can tell. And why should it? Maybe one of the candidates in the Smith set is vastly worse than any other. Maybe three of the candidates in the Smith set are all vastly worse than all the rest, but they happen to form a Condorcet cycle among themselves and the very worst one overall has a tiny edge over some other candidate in the Smith set, so none of them is quite dominated by the entire Smith set. Being a Smith candidate is no guarantee of being any good as a candidate! It's also worth noting that being Smith, consistent, and dupe-resistant is equally much a ~unique characterization of a maximal lottery as being Condorcet, consistent, and dupe-resistant, since Smith implies Condorcet.
Geometric Maximization, Geometric Mean
The next piece of the puzzle to introduce up front is one that didn't make its way into the initial maximal lottery-lottery sequence. That piece is called geometric maximization. The heart of the motivation for using weighted geometric means and geometric maximization is this: geometric maximization is an ironclad defense against the very worst of the tyranny of the majority, because if even a single utility function in a set is constant-0 on expectation, the geometric mean over all utility functions in the set is necessarily constant-0, too. Because geometric maximization characterizes Nash bargaining, it fits the bill if you model voting as a stand-in for civil war, where the winner is neither the best-resourced side with certainty nor determined perfectly proportionately to resources; rather, the division of spoils is given by the outcome of the high-stakes cooperative bargaining (with anarchy as the default outcome) that the soldiers might engage in if they could think and negotiate fast enough to avoid a war, which is better than both. For this reason, when we want to maximize "the expected utility of the whole voterbase", what we'll maximize is the geometric mean of the expected utility of each of the voters.[9][10]
It's worth noting at this point that if we want to make use of weighted-geometric-mean-based tools, we also technically have to give up on homogeneity, because now, we really do (and should!) care at least a tiny bit about the absolute number of voters in the populations we're joining together. In reality, we only care about which subelectorate is smaller (or less powerful, or less important, or less significant, or...) than the other one, and by what multiplicative factor.[11] Sometimes the factor is small enough to work with, and other times the factor is so large that we should basically just check if a certain statement holds for all weight-values above some real number. Ultimately, though, we'll ignore the weakening of homogeneity, given that we're not doing anything overly bad/non-homogeneous[12] - everything we care about is still completely scale-invariant, as long as we apply the same scale transformation to all voterbases. Thus we preserve most of the homogeneity property, requiring only the added information of electorate size/population/"power factor", and additionally requiring that the parameter p is always the smaller/weaker voterbase's share of the combined power factor of both voterbases, such that 0<p≤1. Call it quasi-homogeneity if you want - if you were to fix a positive integer once and for all and duplicate each voter that many times, the outcome of the election shouldn't change.
The geometric mean of two numbers is given by G_{x∈{A,B}} x = √(A⋅B); similarly, the geometric mean of a finite set of numbers is given by G_{x∈{x_i}_{i=1}^n} x = (Π_i x_i)^{1/n}. The p-weighted geometric mean of two numbers is given by G_p(A,B) := √(A^{2p}⋅B^{2−2p}) = A^p⋅B^{1−p}, where 0<p≤1; we'll use this notation to simplify expressions later on.[13]
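If a computational sketch helps, here's a minimal Python version of these definitions (function names are illustrative, not anything standard):

```python
import math

def geometric_mean(xs: list[float]) -> float:
    """Unweighted geometric mean: (x_1 * ... * x_n)^(1/n)."""
    return math.prod(xs) ** (1.0 / len(xs))

def weighted_geometric_mean(pairs: list[tuple[float, float]]) -> float:
    """Weighted geometric mean of (value, weight) pairs with weights summing to 1: prod x_i^{w_i}."""
    return math.prod(x ** w for x, w in pairs)

# The two-number p-weighted case from the text: G_p(A, B) = A^p * B^(1-p); p = 1/2 recovers sqrt(A*B).
assert abs(weighted_geometric_mean([(4.0, 0.5), (9.0, 0.5)]) - 6.0) < 1e-12
```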
This allows us to straightforwardly define a weighted-collective-utility function on the join of two voterbases (or of a voterbase with a single voter) in a way that has some hope of avoiding the usual failure modes of another possible approach which makes use only of arithmetic means:
Thankfully, in practice we may assume that we're always dealing with two voterbases of vaguely comparable sizes, rather than joining a single voter or a finite voterbase to a potentially infinite one. Why? If we join a finite voterbase to an infinite one and have chosen to weight the two proportionally to their population sizes, then we just throw out the results from the finite voterbase, and if either of the voterbases is uncountable, Arrow's impossibility theorem no longer holds, so there's not much point to any of this.[14] So either the voterbases we want to join are both of finite size, in which case we have an easy principled answer - proportional weighting - or they are both countably large, and we should have already chosen how we want to weight the two against each other (with p=1/2≅q=1 as the principled default choice).
Lottery-Lottery Outcomes and the Lottery-Smith Criterion
A lottery-lottery is what it's called: a lottery where some or all of the outcomes you can draw are themselves instructions to draw from another lottery. Some things change, and more things don't. A crucially important thing that does not change is the type signature of an electorate, which is still that of a distribution over (or again, if you like, multiset of) functions from the set of candidates to [0,1]; now, voters v∈V - who still have the type signature of a single function from the candidate set to [0,1] - vote for lottery-candidates L∈ΔC based on the voter's expected utility Ec∼Lv(c).
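As a rough illustration of the two layers (toy Python, with my own illustrative types): a lottery-lottery can be represented as outer probabilities tagged by inner candidate-lotteries, and since voters only care about expected utility, it can always be marginalized down to final probabilities over candidates.

```python
Lottery = dict[str, float]                    # candidate -> probability
LotteryLottery = list[tuple[float, Lottery]]  # outer probability tagged by an inner candidate-lottery

def reduce_to_lottery(ll: LotteryLottery) -> Lottery:
    """Marginalize Δ²C -> ΔC: the final probability of each candidate under double sampling."""
    final: Lottery = {}
    for p_outer, inner in ll:
        for c, p_inner in inner.items():
            final[c] = final.get(c, 0.0) + p_outer * p_inner
    return final

def voter_expected_utility(voter: dict[str, float], ll: LotteryLottery) -> float:
    """A voter v evaluates a lottery-lottery by expected utility through both layers of sampling."""
    return sum(p * voter[c] for c, p in reduce_to_lottery(ll).items())
```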
Such strength as we require lottery-lotteries to have will come at a cost. When we passed from classical voting theory to sortitive voting theory, we lost determinacy and discarded regularity. How much will Mathematics demand of us in weakening of properties of candidate-lottery-lotteries from candidate-lotteries, if we want for lottery-lotteries to be an even-handed function for social-preference aggregation? And can we claw anything back?
The first thing to go is the assumption of ordinality. As the linked illustrative example of a 3-person electorate points out, ordinal preferences are no longer enough to tell us what lotteries should go in a desirable lottery-lottery. Also, if we tell our voters to rate candidates cardinally, we don't want to have the failure mode where a voter getting an expected utility of 0 means anything other than that their least-favorite candidate is a guaranteed outcome - it should be very hard to get an expected utility of exactly 0. So we'll replace it with the following slightly weaker regularity condition:
Next, after Scott, we choose to overtax the Condorcet criterion to empower dupe-resistance:
This is a stronger precondition than the classical Condorcet condition, checks for which will thus come up less often and have less of an effect on outcomes: now the putative lottery-Condorcet-favored candidate has to beat every single dupe-trenchcoat[17] on top of all the real candidates before it can be crowned Obvious Correct Champion.
This is a more powerful condition to rely on always being true than the classical dupe-resistance criterion: now the voting system must be able to ignore all dupe-trenchcoats on top of still needing to be able to ignore any dupes of single candidates. Indeed, as the original writeup remarks, passing to a maximal lottery-lottery is just like adding all possible lotteries over candidates to the set of candidates,[18] and then running a maximal lottery over the result. For a dupe-trenchcoat-resistant voting system, it doesn't matter how many suspiciously-trenchcoated dupe-lotteries enter the election - the outcome of the election remains invariant.
Finally, we give up the game-theoretic/strategyproof definition of dominance, which was as follows:[19]
But why? Why would we accept such an act, and how could we ever choose a replacement for something so elegant and so innately appealing as the game-theoretic definition of dominance? The problem lies in the kinds of tradeoffs the strategyproof definition would endorse making, which make perfect sense from a game-theoretic perspective but which would be a terrible idea to have a voting system endorse. To see this, let C={A,B,C}, and let V∈ΔVC be given by 0.51:[A:0.6,B:0.55,C:0],0.48:[A:0,B:0.01,C:1],0.01:[A:0,B:0.55,C:1]. Then under a game-theoretic definition of dominance, picking candidate A with probability 1 would be a maximal lottery (and any lottery-lottery that reduced to it would also be maximal).
But this seems absurd! If we want the dominance of one lottery-lottery over another to mean that we should prefer that the voterbase get the first lottery-lottery outcome over the second, then it seems desirable for that sense of electoral dominance to be some relatively mathematically simple one which happens to be both utilitarian - preferring outcomes where everyone is on the whole cardinally better off, especially if that's true for each voter - and egalitarian - dispreferring outcomes where anyone is much worse-off, especially if it's cheap in overall cardinal utility to make the worst voter less badly off. Manifestly, the game-theoretic definition of dominance, while elegant and inherently strategyproof, does not do these things very well if at all. If we're lucky, there might even be some natural scoring rule that marks the function of voters' expected utilities that our notion of dominance relies on as optimal for proportional representation.[20] Luckily for us, just such a function exists - the geometric mean!
Though perhaps all of this is a just-so story, shadows on the wall cast by the real reason we should find geometric means so compelling - the fact that it characterizes the Nash solution to cooperative bargaining as generalized to players of differing weights, which itself is an optimal-ish way to mix utility functions with differing weights with some mathematically nice properties reminiscent of some of the ones we've defined and cared about here; even if a lottery-lottery looks like it's not strategyproof on object level, we can appeal to some embedding of this voting problem into the cooperative bargaining problem to allay such worries.[21]
Note that even passing to arithmetic means wouldn't suffice to rule out this setup:[22] always electing A gives V an expected utility of 0.306, while (for example) always electing B gives V an expected utility of 0.2908. On the other hand, taking geometric means means electing A gives V a geometric expected utility of 0.0, and electing B gives V a geometric expected utility of ∼0.0804; this both makes it more clear which outcome is closer to satisfying our desiderata from above and gives us a measure of precisely how much better it is overall than any maximally unequal distribution of expected utilities.[23]
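For readers who'd like to check the arithmetic, here's a small Python sketch reproducing the comparison above (the representation of the electorate is my own, illustrative only):

```python
import math

# The electorate from the example above: vote-shares tagged by utilities over candidates A, B, C.
electorate = [
    (0.51, {"A": 0.6, "B": 0.55, "C": 0.0}),
    (0.48, {"A": 0.0, "B": 0.01, "C": 1.0}),
    (0.01, {"A": 0.0, "B": 0.55, "C": 1.0}),
]

def arithmetic_welfare(candidate: str) -> float:
    """Vote-share-weighted arithmetic mean of utilities if `candidate` is elected outright."""
    return sum(share * voter[candidate] for share, voter in electorate)

def geometric_welfare(candidate: str) -> float:
    """Vote-share-weighted geometric mean of utilities if `candidate` is elected outright."""
    return math.prod(voter[candidate] ** share for share, voter in electorate)

print(arithmetic_welfare("A"), arithmetic_welfare("B"))  # 0.306, ~0.2908
print(geometric_welfare("A"), geometric_welfare("B"))    # 0.0, ~0.0804
```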
Accordingly, we'll give the real definition of lottery-lottery dominance more simply as follows:
A maximal lottery-lottery is some lottery-lottery M∈Δ2C such that for all L∈Δ2C,M⪰L.
The thing is, we don't necessarily need to steal quite as much strength as that from the Condorcet criterion without giving anything back. Here's the first major piece of work that's likely purely my own: the definition of the lottery-Smith criterion.
Equivalently, let V∈ΔVC be an electorate, and separate ΔC as ΔC=SΔ⊔DΔ, such that ∀S∈SΔ,D∈DΔ, Pv∼V(Es∼Sv(s)≥Ed∼Dv(d))>1/2. Then a lottery-Smith f satisfies fC(V)(SΔ)=1, that is, fC(V)∈ΔSΔ⊆Δ2C.
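Since ΔC is infinite, any executable check of this precondition can only look at finite families of lotteries you hand it, but here's a minimal Python sketch of the separation condition on such a finite family (names are mine, purely illustrative):

```python
def lottery_eu(voter: dict[str, float], L: dict[str, float]) -> float:
    """E_{c~L} v(c)."""
    return sum(p * voter[c] for c, p in L.items())

def majority_prefers(S: dict[str, float], D: dict[str, float],
                     electorate: list[tuple[float, dict[str, float]]]) -> bool:
    """P_{v~V}(E_{s~S} v(s) >= E_{d~D} v(d)) > 1/2."""
    share_for_S = sum(share for share, voter in electorate
                      if lottery_eu(voter, S) >= lottery_eu(voter, D))
    return share_for_S > 0.5

def is_lottery_smith_separation(S_delta: list[dict[str, float]],
                                D_delta: list[dict[str, float]],
                                electorate: list[tuple[float, dict[str, float]]]) -> bool:
    """Check the precondition on a finite family: every lottery in S_delta
    majority-beats every lottery in D_delta in expected utility."""
    return all(majority_prefers(S, D, electorate) for S in S_delta for D in D_delta)
```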
In the same way that when we passed from Condorcet to lottery-Condorcet, we got a much stronger precondition, the lottery-Smith condition has a much stronger precondition than the classical Smith condition. Accordingly, checks for it will come up less often and have less of an effect on outcomes: now a potentially lottery-Smith-favored candidate has to belong to the lottery-Smith set - not just the Smith set - before it can be crowned an Obvious Strong Candidate which is worth including in some of the lotteries that make up a lottery-Smith lottery-lottery.
But on the other hand, the lottery-Smith condition also has a weaker precondition than lottery-Condorcet - just as the Smith condition has a weaker precondition to check than the Condorcet condition - and it provides (barely) stronger backup guarantees than lottery-Condorcet - just as the Smith criterion provides (barely) stronger backup guarantees than the Condorcet condition does.
Overall, as far as how the various "candidate-sorting" properties match up against each other:
Smith > Condorcet >>> Lottery-Smith >> Lottery-Condorcet
Lottery-Smith ⪰ [Lottery-Condorcet ⊥ Smith] ⪰ Condorcet
What do lottery-Smith sets have to look like, apart from the final-probabilities condition? Well, good news...
...and bad news...
...and good news for people who like bad news.
We say specifically "utility-maximal" or "utility-nonmaximal" here because the conditions for utility-maximality for candidates and for lotteries are now meaningfully different from the dominance-based maximality condition for lotteries - in particular, utility-maximality is transitive.
As a relevant observation, we should recall that voters vote based on expected utility, and so it can sometimes be more natural to think of elements of ΔC as being the points of a simplex, rather than distinguish lotteries with identical expected outcomes. That way, we can think of our voting system and electorate as giving us a map fV:ΔC→[0,1],L↦Gv∈VEc∼Lv(c) from the simplex to the unit interval; to some extent, all we care about is the preimage of expected utilities in [0,1].[27][28] Ties are a lot more common for lottery-lotteries,[29] and that's before even considering the rare random tie grounded in some indifference of outcomes which just happen to have the same expected values under the electorate-utility-function - in such a case, two such possible maximal lottery-lotteries need not even share any overlap in the set of partitions of unity realized by subsets.[30] Thankfully, the latter ties depend on the voterbase and are thus likely not generic.[31]
Working over lotteries instead of lottery-lotteries has no equivalent to this extra reduction step, either way we might slice it. We didn't need either half of this extra step, since maximal lotteries already are their own final probabilities over candidates, and we were totally fine with the maximal lottery being a lottery over several candidates, because there wasn't an additional level to have to think about or reduce from. But now, not only is it possible for any final distribution over candidates to be realized in entire families of different equally valid ways, but it's not even guaranteed that the voterbase-favored effective final lottery over candidates is unique! I am almost certain that this is some substantial part of the true reason for the 'fractal structure due to Nash-equilibrium constraints' Scott posits.[32] Additionally, because ΔC,Δ2C are compact and can be realized as simplices in Euclidean space, and because the geometric-expected-value evaluation maps eΔ:(ΔC,VC)→[0,1],(L,V)↦Gv∈VEc∼Lv(c), eΔ2:(Δ2C,VC)→[0,1],(L,V)↦Gv∈VEc′∼∼Lv(c′) are continuous for any finite choice of voterbase,[33] we know that by the extreme value theorem, these maps achieve their respective maxima somewhere on their domains. Accordingly, we can conclude that such maxima exist, though they will generally not be unique.
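To make the "all we care about is the final probabilities" view concrete, here's a crude numerical sketch (toy Python, illustrative names, random search rather than any clever optimizer) that hunts over the candidate simplex for the final candidate-probabilities with maximal geometric expected utility for a given electorate; existence of a maximizer is what the extreme value theorem guarantees, uniqueness is not.

```python
import math
import random

def geometric_expected_utility(L: dict[str, float],
                               electorate: list[tuple[float, dict[str, float]]]) -> float:
    """f_V(L) = G_{v~V} E_{c~L} v(c): the evaluation map from the candidate simplex to [0, 1]."""
    return math.prod(
        sum(p * voter[c] for c, p in L.items()) ** share
        for share, voter in electorate
    )

def search_best_final_lottery(candidates: list[str],
                              electorate: list[tuple[float, dict[str, float]]],
                              trials: int = 100_000) -> dict[str, float]:
    """Crude random search over the simplex for the final candidate-probabilities
    that (approximately) maximize the voterbase's geometric expected utility."""
    best, best_val = None, -1.0
    for _ in range(trials):
        raw = [random.random() for _ in candidates]
        total = sum(raw)
        L = {c: x / total for c, x in zip(candidates, raw)}
        val = geometric_expected_utility(L, electorate)
        if val > best_val:
            best, best_val = L, val
    return best
```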
This all presents a strong impression of what a maximal lottery-lottery would have to be - some geometrically-maximizing lottery over candidate-lotteries which are themselves game-theoretically-dominant lotteries over Smith candidates (who are themselves game-theoretically non-dominated).
Maximal Lottery-Lotteries Exist
Better-apprised of the benefits of geometric maximization, a little more effort now results in not only the proof we were hoping for earlier, but a substantially stronger existence result.[34]
This looks like it's just a patch to the earlier lemma to make it into an existence proof merely by taking the abandonment of ordinality maximally seriously. It is not. It rests heavily on the revised definition of lottery-lottery dominance I considered it justified to pick, given our other philosophical desiderata for voting systems and our abandonment of ordinality and homogeneity.
In particular, we can characterize lottery-lotteries not only by their final probabilities over candidates, but also by how much probability they allocate to each candidate-tagged partition of unity, where each of the candidate-tagged partitions of unity would result in the same geometric-expected payoff as any other.
Now, existence results are near and dear to my heart as a mathematician by inclination and by training both, but I also believe in winning maximally and, on finding checkmate, looking for better. So it's certainly better than nothing... but it's still not good enough. We want a more constructive process or some characterizing set of theorems.
And we'll get to that! Although a natural semi-constructive option should be jumping out at us now,[35] we first need a better understanding of how precisely a maximal lottery-lottery would have to handle joins of populations, if we want to be sure that this definition of a maximal lottery-lottery holds up the way we should want it to. So my definition of the modularity criterion, which replaces both participation and consistency, will have to wait for the next post, as will my construction(s) of a maximal lottery-lottery. However, this seems like a natural point to break the post - some light experimentation suggested that if I make this single post any longer, anyone reading it will rapidly fall off in energy and interest, and that's not what I want.[36] So I'll end this post with the pure existence result, and when you feel up to reading about how I define modularity and how I construct maximal lottery-lotteries, click here.
Notation Reference
Throughout this post, I consistently use the following font conventions for mathematical objects:
Here's a guide to the standard mathematical notation I use extensively here:
The quick version, which I detail slightly more at the end of Lottery Outcomes above and which Scott writes up in Maximal Lotteries, is that if you allow for the possibility of lottery outcomes (which don't even have to crop up most of the time), you can have voting systems that defy Arrow's impossibility theorem by being both Condorcet and consistent.
This is fully general advice for reading math papers or anything with lots of abstract quantitative content to it, and is meant for almost literally anyone who hasn't taken a proof-based course in math. I would provide it almost verbatim to anyone who asked me for my professional advice on how to avoid bouncing off a math paper or how to successfully read one. Every time I neglect it myself when reading math, I end up wasting time and regretting it and cursing my folly as I pick thorns and bramble out of my face. It is good advice; bitter like leafy greens. Please take it.
If you'd rather think of it as a multiset, that's cool too. That technically violates homogeneity, but we'll actually have to massively weaken that assumption later on anyway when we start talking about conditions on outcomes over joins of electorates, so it's really down to taste and which way of thinking about, seeing, and manipulating the underlying objects comes most naturally to you. They're basically equivalent.
Yes, you read that correctly. A distribution. Like, a probability distribution. And yes, the "determinism" condition introduced immediately afterwards completely obviates this setup - the output distribution is required to assign all of its probability-mass to a single outcome - such that the move to lottery outcomes is extremely telegraphed.
Scott doesn't give these properties names in his writeup and I can't find standard names for them anywhere else. I've thus chosen to give them evocative names for myself.
"Whenever we pick a random voter with utility function v from the electorate V, if the probability that v ranks c above d is always >12, no matter the d, that means that a majority of the electorate prefers c to d, and fC(V)(c) therefore assigns weight 1 to candidate c as the Obvious Correct Champion."
In Scott's writeup and likely elsewhere, this is called "clone-invariance" (also known as independence of clones), which is actually a slightly weaker form of the criterion found in Arrow's impossibility theorem, independence of irrelevant alternatives (IIA). Cousin-properties also weaker than IIA include local independence (deleting the winner and deleting the last-place loser must each leave the other candidates' relative finishes unchanged), independence of worst candidates (deleting the n worst candidates must leave the other candidates' finishes unchanged), and Smith independence (if for some proper subset of the candidates (the Frontrunners), every candidate in the set wins all its head-to-head matchups against every candidate outside the set (the Dregs), then deleting all of the Dregs must leave the Frontrunners' finishes unchanged).
You know, the same way that toruses would prefer to live in R4 so they can lie everywhere-locally flat, and how all nitrogen's favorite home in chemical-space is as nice stable dinitrogen, and how rays of light really do want to move in straight lines, honest, it's just all this rest mass hanging around distorting the metric.
In one very austere sense, this is all that need be said in this entire post: change the dominance criterion from game-theoretic to Nash-bargaining, as realized by a geometric mean. The rest is careful commentary.
Unfortunately, the lottery-Condorcet condition is way weaker than the ordinary Condorcet criterion, so unlike for maximal lotteries, we're likely going to have to break out the weird-shaped dice and the spinners and the quantum-random bits and such.
Strictly speaking, this actually describes something determined by the weight value p - that is, q=(1−p)/p, which ranges over 0≤q<∞ when 0<p≤1. This way, though, we don't have to even start relitigating (in the prose commentary!) whether or not it makes sense to talk about arbitrarily small positive numbers.
Like, for instance, basing anything at all off of the last digit of the number of votes cast for candidate c.
Why the factors of 2 in the exponents under the square root? Because this needs to be a straightforward generalization of the unweighted geometric mean. Try it yourself for p=1/2.
Unless you like nonprincipal ultrafilters way more than I do on any given day, anyway. Most of the time, I find them spooky, and I generally require extra caffeine before I'm willing to deal with them.
Seriously, go read The Lottery in Babylon. It's like four pages long, very fun, and exquisitely thematically relevant to us here. In fact, it would make an excellent palate cleanser for between the two posts.
We should treat this as a regularity condition on the voters and the candidates both, given that it implies that a voter won't/can't have utility 0 on more than one candidate. It's not that strong a condition - if a significant number of voters maximally hate the entire candidate set, that implies problems about the candidate set that no clever voting system can fix.
SCENE: Daytime, exterior. Standing in a row are the candidates for the upcoming election for World Ruler: Alice, Bob, Claire, a twenty-foot-tall thin and misshapen figure in a large trench-coat parts of which are partially transparent, Daniel, and Eve. They have come before The World Electoral Commission, which has controversially banned lottery-duplicate candidates, and which has gathered these notables to question them on suspicion of an illicit lottery-duplicate candidate having managed to register anyway. Tension mounts as the questioning enters its second hour - they're in for a long day.
In open defiance of the overthrown World Electoral Commission's ancien regime.
Gentlemen, ladies, both, and neither, restrain your shocked gasps, I beg you!
We might also want whatever function of expected voter utilities we use to be effectively computable, given those utilities, and also compositional over joins of electorates. Unlike the more important properties listed in the text proper, the arithmetic mean definition would satisfy both of these, and the game-theoretic definition only sometimes satisfies the latter.
Here's a sketch of yet another way of convincing yourself that we don't need to worry too much about abandoning strategyproofness as dominance criterion. It'll require you to think of our voting systems in the frankly pseudohomogeneous case - i.e., as being about a multiset of utility functions - and it also won't make sense until you've read about modularity in the next post, but if you don't want to go read that, it'll be fine to think of a modular voting system as a voting system that's almost consistent in a way that plays nicely with joining populations with weighted geometric means.
The key things to notice are that a voter's preferences over lotteries are entirely determined by their preferences over candidates, that they assign distinct nonnegative utilities to each candidate, and thus that no clever voting plan gives any individual voter more control over outcomes, after the geometric averaging, than any other. The first two mean that there's at most a single lottery that any voter has expected value constant-0 on: the one assigning probability 1 to whichever candidate that voter likes least. In particular, a voter can't rig their utility function so as to claim to assign utility 0 to almost all lotteries, as would be an obvious strategy for hijacking an electorate, and no voter has more control over outcomes than any other, as we'll see more explicitly when I sketch a construction of maximal lottery lotteries. In some sense, none of the sharply dominating voting strategies you could implement for rigging an election over candidate-lotteries - even assuming the election takes voters' utility functions into account - work for dominating voting in a maximal lottery-lottery, or even a modular lottery-lottery, because they're not even in the strategy set available to the voters. As a direct result, the only plays available to individual voters are the "fair" ones implying linearly self-consistent things about their own preferences, so that every voter has the same amount of steering power over outcomes.
And why should it? It doesn't characterize Nash bargaining. Geometric means do. End of.
By contrast, the lottery-outcome given by random-dictatorship of subpopulations, (0.51A,0.49C), has geometric-expected utility (0.51⋅0.6+0.49⋅0)^0.51⋅(0.51⋅0+0.49⋅1)^0.49 = 0.306^0.51⋅0.49^0.49 ∼ 0.3854 (and arithmetic-expected utility ∼0.411), a massive improvement over either one!
Yes, I'm still going to write ⪰ instead of ≥. No, I don't care that this is now a total order induced by the canonical pullback of the canonical order on the reals. I want a symbol specifically for dominance-and-maybe-not-literally-greater-than, and I'm the one writing the post.
A more formal proof:
Generally, the lottery-Smith set will be some convex subset of ΔS. In fact, for ⊆c taken to mean "is a convex subset of, within the appropriate domain", S⊆cC, SΔ⊆cΔS⊆cΔC.
We use the geometric mean over voters here (weighted by vote-share) instead of taking the arithmetic expected value sampling over the voterbase because later on, we'll use geometric maximization extensively for its compositional properties and fairness guarantees. I decided to include it in the definition here rather than keep it a mystery until after I talk about geometric maximization for the sake of clarity. If you don't know yet what's going on with it, either don't worry about it for now or just skip to the section about geometric maximization and come back.
Also: these preimage sets might be submanifolds of the simplex? I'm not quite sure to what extent the submersion theorem applies here.
Take the example (for the maximally muddled 3-voter electorate) of the lottery-lotteries (1/3(1A),1/3(1B),1/3(1C)) and (1/6(1A),1/6(1B),1/6(1C),1/2(1/3A,1/3B,1/3C)) - and the entire rest of the family (p/3(1A),p/3(1B),p/3(1C),(3−3p)/3(1/3A,1/3B,1/3C)), for 0≤p≤1, including (1(1/3A,1/3B,1/3C)). All of these are equivalent to the same maximal lottery-lottery - the one for which U={(1/3A,1/3B,1/3C)} - and these aren't even all of the lotteries that can show up for the maximal lottery-lotteries for this voterbase.
Here's a very simple example of this kind of failure happening maximally badly in action: take a candidate set of just three candidates, A,B,C, and a voterbase of just two voters, v,w, such that v(A)=1,v(B)=0,v(C)=1/2 and w(A)=0,w(B)=1,w(C)=1/2. That is - C is a compromise candidate. Then the members of the family of lotteries given by (p(1/2A,1/2B),(1−p)(1C)), 0≤p≤1, all tie each other for expected utility, despite giving C any probability you like! And it should - for all intents and purposes, C is indistinguishable from (1/2A,1/2B).
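A quick numeric check of this tie, as a toy Python sketch (representation is mine, illustrative only):

```python
v = {"A": 1.0, "B": 0.0, "C": 0.5}
w = {"A": 0.0, "B": 1.0, "C": 0.5}

def eu(voter: dict[str, float], final: dict[str, float]) -> float:
    return sum(p * voter[c] for c, p in final.items())

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    # (p(1/2 A, 1/2 B), (1-p)(1 C)) reduced to final candidate probabilities:
    final = {"A": p / 2, "B": p / 2, "C": 1 - p}
    print(p, eu(v, final), eu(w, final))  # both voters get exactly 0.5 for every p
```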
I can just hear the prop-collision noises of the resulting z-fighting.
The rest is covered by the compositionality you get from the use of weighted geometric means, and the equivalence between the "mash them all together immediately" and "assembly by means of full binary tree" approaches that I describe in the next post.
To see this for the lottery case, note that for any voter v∈V, Ec∈Lv(c) is continuous over L∈ΔC, and then we take the geometric mean of a countable number of continuous functions, the result of which is still continuous under the appropriate metric. (If you don't like that, pass to finite instead. Its support is also limited to parts of ΔC where no voter has expected utility 0.) The lottery-lottery case is identical, except that we first reduce to a lottery under the usual map L↦Pc∼∼L(c=ci). All of this probably also works for the case of countably many voters, but I don't know how to prove that, and anyway we'll see later that we can pretty much just pass to the case where |V|<∞. This kind of approach also lets us trivially prove things like "every continuous map from lottery-lotteries to lottery-lotteries has at least one fixed point; this includes every continuous map which is weakly increasing in geometric expectation of utility", which you may spot as yet another avenue I might have picked to prove the existence of maximal lottery-lotteries. I don't actually know if that proof sketch works, though. Δ2C would, at the absolute least, have to be endowed with an appropriately-motivated topology.
I'm not going to lie, when I set out to prove something like this theorem, I wasn't expecting to prove anything more than that a maximal lottery-lottery would have to be lottery-Smith. I totally underestimated this approach, judging that no way could just patching the lemma from earlier work. Turns out: full utility data is really powerful.
I've described it elsewhere in the post and so won't here. If that option isn't jumping out at you... I want to say "you should think about it more carefully" but really I don't know how hard it is to see the answer. I think it's worth taking a literal clock-time five minutes to see if you can spot it; and if you still can't, fair play - but better if you make your own guesses first.
Water/a drink, caffeine/meds, maybe a light snack, definitely a stretch and a pace.[15] That's my recommendation, anyway. The second post will almost certainly still be there after you're done and you'll be better suited to process it.
Scott doesn't see a need for the subscript here and thus suppresses it in his notation.
Scott notates this one more literally as ΔΔC.