StrivingForLegibility

In Strategic Time, Open-Source Games Are Loopy

In Scott Garrabrant's excellent Geometric Rationality sequence, he points out an equivalence between modelling an agent as

Maximizing the expected logarithm of some quantity $V$, $E[\ln(V)]$
Maximizing the geometric expectation of $V$ , $G [V]$
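A minimal sketch of this equivalence, assuming a lottery is represented as a list of (probability, value) pairs with strictly positive values. The function names and example lotteries are illustrative assumptions, not taken from the sequence. The sketch relies on the identity $G[V] = \exp(E[\ln(V)])$, which means both objectives rank lotteries the same way.

```python
import math

# A lottery is a list of (probability, value) pairs; values must be positive.
Lottery = list[tuple[float, float]]

def expected_log(lottery: Lottery) -> float:
    """E[ln(V)]: the probability-weighted average of log-values."""
    return sum(p * math.log(v) for p, v in lottery)

def geometric_expectation(lottery: Lottery) -> float:
    """G[V]: the probability-weighted geometric mean of the values."""
    return math.prod(v ** p for p, v in lottery)

# Hypothetical lotteries with the same arithmetic mean (2.5).
risky = [(0.5, 1.0), (0.5, 4.0)]
safe = [(1.0, 2.5)]

for name, lottery in [("risky", risky), ("safe", safe)]:
    # The two printed numbers agree, since G[V] = exp(E[ln V]).
    print(name, geometric_expectation(lottery), math.exp(expected_log(lottery)))
```

Both lotteries have the same arithmetic expectation, but the geometric expectation (and equivalently the expected logarithm) ranks the certain payoff higher.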

And as we'll show in this post, not only can we prove a geometric version of the VNM utility theorem:

An agent is VNM-rational if and only if there exists a function $V$ that:
- Represents the agent's preferences over lotteries
  - $L ≺ M$ if and only if $V (L) < V (M)$
- Agrees with the geometric expectation of $V$
  - $V = G [V]$

Which in and of itself is a cool equivalence result, that $E [U]$ maximization $⟺$ VNM rationality $⟺$ $G [V]$ maximization. But it turns out these are just two out of a huge family of expectations we can use, like the harmonic expectation $H$ , which each have their own version of the VNM utility... (read 3123 more words →)

Replying to In Strategic Time, Open-Source Games Are Loopy

StrivingForLegibility 1y

Thank you! I think it's exactly the same kind of "conditioning my output on their output" that you were pointing to in your analogy to iterated games. And I expect there's a strong correspondence between "program equilibria where you only condition on predicted outputs" and "iterated game equilibria that can form a stable loop."

Replying to Geometric Utilitarianism (And Why It Matters)

StrivingForLegibility 1y

Thank you! Ideally, I think we'd all like a model of individual rationality that composes together into a nice model of group rationality. And geometric rationality seems like a promising step in that direction.

The Carnot Engine of Economics

The history of coordination spans billions of years, and we've been finding new ways to help each other out for as long as there have been more than one of us. From multicellularity to the evolution of brains, from the development of social and moral instincts to their codification in laws and contracts, from the emergence of currency to the invention of stocks and bonds and options and every other modern financial instrument, we have accumulated countless ways to work together.

Every time a group gets a little better at coordinating, moving closer to the Pareto frontier, they get a little more economically efficient. The field of thermodynamics has a model for the... (read 1473 more words →)

Replying to The Geometric Importance of Side Payments

This might be a framing thing!

The background details I’d been imagining are that Alice and Bob were in essentially identical situations before their interaction, and it was just luck that Alice and Bob got the capabilities they did.

Alice and Bob have two ways to convert tokens into money, and I’d claim that any rational joint strategy involves only using Bob’s way. Alice's ability to convert tokens into pennies is a red herring that any rational group should ignore.

At that point, it's just a bargaining game over how to split the $1,000,000,000. And I claim that game is symmetric, since they’re both equally necessary for that surplus to come into existence.

If Bob had instead paid huge costs to create the ability to turn tokens into tens of millions of dollars, I totally think his costs should be repaid before splitting the remaining surplus fairly.
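A minimal sketch of this reasoning, using the figures from the post's token example (100 tokens, worth $0.01 each to Alice and $10,000,000 each to Bob). Amounts are in integer cents to avoid floating-point rounding. The function names, the equal split, and the `bob_costs_cents` parameter are assumptions made for illustration: the even split is one reading of "symmetric", and the cost repayment models the last paragraph above.

```python
TOKENS = 100
ALICE_CENTS_PER_TOKEN = 1               # $0.01 per token
BOB_CENTS_PER_TOKEN = 1_000_000_000     # $10,000,000 per token

def joint_value_cents(tokens_to_bob: int) -> int:
    """Total money produced when Bob converts this many of the tokens."""
    tokens_to_alice = TOKENS - tokens_to_bob
    return (tokens_to_alice * ALICE_CENTS_PER_TOKEN
            + tokens_to_bob * BOB_CENTS_PER_TOKEN)

def split_surplus(total_cents: int, bob_costs_cents: int = 0) -> tuple[int, int]:
    """Repay Bob's costs first (assumed), then split the remainder equally."""
    remaining = total_cents - bob_costs_cents
    alice_share = remaining // 2
    bob_share = bob_costs_cents + (remaining - alice_share)
    return alice_share, bob_share

# The surplus-maximizing allocation routes every token through Bob.
best_allocation = max(range(TOKENS + 1), key=joint_value_cents)
total = joint_value_cents(best_allocation)
alice_share, bob_share = split_surplus(total)

print(best_allocation)                        # tokens converted by Bob
print(total // 100)                           # total surplus in dollars
print(alice_share // 100, bob_share // 100)   # each share in dollars
```

With no costs to repay, each side receives half of the $1,000,000,000. Passing a nonzero `bob_costs_cents` shifts that amount to Bob before the remainder is divided.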

Replying to The Geometric Importance of Side Payments

StrivingForLegibility 2y*

Limiting it to economic/comparable values is convenient, but also very inaccurate for all known agents - utility is private and incomparable.

I think modeling utility functions as private information makes a lot of sense! One of the claims I’m making in this post is that utility valuations can be elicited and therefore compared.

My go-to example of an honest mechanism is a second-price auction, which we know we can implement from within the universe. The bids serve as a credible signal of valuation, and if everyone follows their incentives they’ll bid honestly. The person that values the item the most is declared the winner, and economic surplus is maximized.
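A minimal sketch of a sealed-bid second-price auction, assuming bids are given as a mapping from bidder name to bid amount. The bidder names, valuations, and helper functions are hypothetical. Ties are broken in favor of whichever bidder appears first in the mapping, which is an arbitrary choice for this sketch.

```python
def second_price_auction(bids: dict[str, float]) -> tuple[str, float]:
    """Return (winner, price): the highest bidder wins and pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    # Sort bidders from highest to lowest bid (stable, so ties keep input order).
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

def payoff(bidder: str, valuation: float, bids: dict[str, float]) -> float:
    """Bidder's surplus: valuation minus price if they win, zero otherwise."""
    winner, price = second_price_auction(bids)
    return valuation - price if winner == bidder else 0.0

# Hypothetical scenario: Alice values the item at 10; the others' bids are fixed.
others = {"bob": 7.0, "carol": 4.0}
for alice_bid in (5.0, 10.0, 20.0):
    print(alice_bid, payoff("alice", 10.0, {"alice": alice_bid, **others}))
```

In this scenario, underbidding costs Alice the item, while overbidding gains her nothing over bidding her true valuation. If her valuation were below the second-highest bid, overbidding could win the item at a loss, which is the sense in which honest bidding follows her incentives.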

(Assuming some background facts, which aren't... (read more)

Individual Utilities Shift Continuously as Geometric Weights Shift

Gradient Ascenders Reach the Harsanyi Hyperplane

This is a supplemental post to Geometric Utilitarianism (And Why It Matters), in which I show that when all agents have positive weight $ψ_{i}$ , the optimal geometric weighted average moves continuously across the Pareto frontier as we change those weights. I also show that we can extend this continuity result to all weights $ψ$ , if we're willing to accept an arbitrarily good approximation of maximizing $G (_, ψ)$ . I think of this as a bonus result which makes the geometric average a bit more appealing as a way to aggregate utilities, and the main post goes into more detail about the problem and why it's interesting.

High Level Overview

How does changing $ψ$ affect the optima of $G (_, ψ)$ ? Ideally, we'd like a... (read 4818 more words →)

Deriving the Geometric Utilitarian Weights

This is a supplemental post to Geometric Utilitarianism (And Why It Matters), in which I show that, if we use the weights we derived in the previous post, a gradient ascender will reach the Harsanyi hyperplane $H$ . This is a subproblem of the proof laid out in the first post of this sequence, and the main post describes why that problem is interesting.

The Gradient and Contour Lines

It's easy to find the points $s \in R^{n}$ which have the same $G$ score as $p$ : they're the points which satisfy $G (s, ψ) = G (p, ψ)$ . They all lie on a skewed hyperbola that touches $P$ at $p$ .

Geometric Weight Calculation — Check out an interactive version here

One way to think about $G$ is as a hypersurface in $n + 1$ -dimensional space sitting "above" the n-dimensional space of utilities we've been... (read 1533 more words →)

Proving the Geometric Utilitarian Theorem

This is a supplemental post to Geometric Utilitarianism (And Why It Matters), in which I show how I derived the weights $ψ$ which make any Pareto optimal point $p$ optimal according to the geometric weighted average. This is a subproblem of the proof laid out in the first post of this sequence, and the main post describes why that problem is interesting.

Overview

So how are we going to calculate weights $ψ$ which make $p$ optimal among $F$ ?

The idea here is to identify the Harsanyi hyperplane $H$ , which contains all of the joint utilities $u \in R^{n}$ which satisfy $H (u, ϕ) = H (p, ϕ)$ . Where $ϕ$ are the weights which make our chosen point $p \in R^{n}$ optimal with respect to $H (_, ϕ)$ . And we're going to calculate new weights $ψ$ which make $p$ optimal with respect to $G (_, ψ)$ . It turns out it's sufficient to make $p$ optimal... (read 3132 more words →)

The Geometric Importance of Side Payments

This is a supplemental post to Geometric Utilitarianism (And Why It Matters), which sets out to prove what I think are the main interesting results about Geometric Utilitarianism:

Maximizing a geometric weighted average $G (_, ψ)$ can always lead to Pareto optimality.
Given any Pareto optimal joint utility $p$ , we can calculate weights $ψ$ which make $p$ optimal according to $G (_, ψ)$ .

That post describes why this problem is interesting, but the quick summary is: geometric utility aggregation is a candidate alternative to Harsanyi utility aggregation (which is an arithmetic weighted average), which handles some tradeoffs better than Harsanyi aggregation. The resulting choice function is geometrically rational, whereas the Harsanyi choice function is VNM-rational. This post is mostly math supporting the main post, with some details... (read 2117 more words →)

Geometric Utilitarianism (And Why It Matters)

I'm generally a fan of "maximize economic surplus and then split the benefits fairly". And I think this approach makes the most sense in contexts where agents are bargaining over a joint action space $D \times P$ , where $D$ is some object-level decision being made and $P$ are side-payments that agents can use to transfer value between them.^[1]

An example would be a negotiation between Alice and Bob over how to split a pile of $100$ tokens, which Alice can exchange for $\$0.01$ each, and Bob can exchange for $\$10{,}000{,}000$ each. This is the sort of situation where there's a real and interpersonally comparable difference in the value they each derive from their least and most favorite outcome.^[2]

In this example $D$ is the convex set containing joint utilities for all... (read 632 more words →)

Do you like using numbers to represent uncertainty and preference, but also care about things like fairness and consent? Are you an altruist on a budget, looking to do the most good with some of your resources, but want to pursue other goals too? Are you looking for a way to align systems to the interests of many people? Geometric Utilitarianism might be right for you!

Classic Utilitarianism

The Harsanyi utilitarian theorem is an amazing result in social choice theory, which states that if a social choice function $F : R^{n} \to R$ is both

VNM-rational, and
Pareto monotone (Pareto improvements never make $F$ lower)

then for any joint utility $u \in R^{n}$ , $F (u)$ must be equal to a weighted average of individual utilities that looks like $H (u, ϕ) = u \cdot ϕ = \sum_{i = 1}^{n} u_{i} ϕ_{i}$ , where $\cdot$ is the dot product... (read 3073 more words →)
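A minimal sketch of the two aggregation rules. The Harsanyi average follows the formula $H(u, ϕ) = u \cdot ϕ$ given above. The geometric weighted average is assumed here to be $G(u, ψ) = \prod_{i} u_{i}^{ψ_{i}}$ over non-negative utilities, which is the standard weighted geometric mean and not spelled out in this excerpt. The function names and example utilities are illustrative.

```python
import math

def harsanyi_average(u: list[float], phi: list[float]) -> float:
    """H(u, phi): the dot product of utilities and weights."""
    return sum(u_i * phi_i for u_i, phi_i in zip(u, phi))

def geometric_average(u: list[float], psi: list[float]) -> float:
    """G(u, psi): assumed to be the product of u_i ** psi_i (utilities >= 0)."""
    return math.prod(u_i ** psi_i for u_i, psi_i in zip(u, psi))

weights = [0.5, 0.5]
lopsided = [10.0, 0.0]   # hypothetical: one agent gets everything
balanced = [5.0, 5.0]    # hypothetical: an even split

for name, u in [("lopsided", lopsided), ("balanced", balanced)]:
    print(name, harsanyi_average(u, weights), geometric_average(u, weights))
```

With equal weights, the arithmetic average is indifferent between the two outcomes, while the geometric average scores the lopsided outcome at zero and prefers the balanced one.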

Replying to Updatelessness doesn't solve most problems

The problem remains though: you make the ex ante call about which information to "decision-relevantly update on", and this can be a wrong call, and this creates commitment races, etc.

My understanding is that commitment races only occur in cases where "information about the commitments made by other agents" has negative value for all relevant agents. (All agents are racing to commit before learning more, which might scare them away from making such a commitment.)

It seems like updateless agents should not find themselves in commitment races.

My impression is that we don't have a satisfactory extension of UDT to multi-agent interactions. But I suspect that the updateless response to observing "your counterpart has committed... (read more)

Replying to Updatelessness doesn't solve most problems

Got it, thank you!

It seems like trapped priors and commitment races are exactly the sort of cognitive dysfunction that updatelessness would solve in generality.

My understanding is that trapped priors are a symptom of a dysfunctional epistemology, which over-weights prior beliefs when updating on new observations. This results in an agent getting stuck, or even getting more and more confident in their initial position, regardless of what observations they actually make.

Similarly, commitment races are the result of dysfunctional reasoning that regards accurate information about other agents as hazardous. It seems like the consensus is that updatelessness is the general solution to infohazards.

My current model of an "updateless decision procedure", approximated on a real... (read more)

Replying to Updatelessness doesn't solve most problems

The distinction between "solving the problem for our prior" and "solving the problem for all priors" definitely helps! Thank you!

I want to make sure I understand the way you're using the term updateless, in cases where the optimal policy involves correlating actions with observations. Like pushing a red button upon seeing a red light, but pushing a blue button upon seeing a blue light. It seems like (See Red -> Push Red, See Blue -> Push Blue) is the policy that CDT, EDT, and UDT would all implement.

In the way that I understand the terms, CDT and EDT are updateful procedures, and UDT is updateless. And all three are able to use... (read more)

Replying to Updatelessness doesn't solve most problems

Got it, I think I understand better the problem you're trying to solve! It's not just being able to design a particular software system and give it good priors, it's also finding a framework that's robust to our initial choice of priors.

Is it possible for all possible priors to converge on optimal behavior, even given unlimited observations? I'm thinking of Yudkowsky's example of the anti-Occamian and anti-Laplacian priors: the more observations an anti-Laplacian agent makes, the further its beliefs go from the truth.

I'm also surprised that dynamic stability leads to suboptimal outcomes that are predictable in advance. Intuitively, it seems like this should never happen.

Replying to Updatelessness doesn't solve most problems

Counterfactual Mechanism Networks

It sounds like we already mostly agree!

I agree with Caspar's point in the article you linked: the choice of metric determines which decision theories score highly on it. The metric that I think points towards "going Straight sometimes, even after observing that your counterpart has pre-committed to always going Straight" is a strategic one. If Alice and Bob are writing programs to play open-source Chicken on their behalf, then there's a program equilibrium where:

Both programs first try to perform a logical handshake, coordinating on a socially optimal joint policy.
- This only succeeds if they have compatible notions of social optimality.
As a fallback, Alice's program adopts a policy which
- Caps Bob's expected payoff at what

... (read 399 more words →)

In the previous post, we saw an example of how a simple network of counterfactual mechanisms can be used to produce logical commitments that resolve an incentive misalignment. In this post we'll generalize this technique to more complicated networks, and sketch out how such networks should be structured.

In our simple example, AliceBot and BobBot performed a single round of negotiation over conditional joint commitments. But open-source game theory lets us construct an entire network of counterfactual games, where agents in one game condition their behavior on the outcomes of any number of others. This information flow can even be loopy, using a more sophisticated logical crystal ball than straightforward simulation. The subset... (read 1451 more words →)

To Boldly Code