All of jonathanstray's Comments + Replies

But under these assumptions, combining evidence always gives the right answer. Compare with the example in the post: "vote on a, vote on b, vote on a^b" which just seems strange. Shouldn't we try to use methods that give right answers to simple questions?

a) "Everyone does Bayesian updating according to the same hypothesis set, model, and measurement methods" strikes me as an extremely strong assumption, especially since we do not have strong theory that tells us the "right" way to select these hypothesis sets, models, an... (read more)

2cousin_it
Yeah. I was more trying to argue that, compared to Bayesian ideas, voting doesn't win you all that much.
You can peek into everyone's heads, gather all the evidence, remove double-counting, and perform a joint update. That's basically what Aumann agreement does - it doesn't vote on beliefs, but instead tries to reach an end state that's updated on all the evidence behind these beliefs.

Right, this is where strong Bayesianism is required. You have to assume, for example, that everyone agrees on the set of hypotheses under consideration and the exact models to be used. This is not just an abstract plan for slicing the universe into manageable... (read more)

2cousin_it
But under these assumptions, combining evidence always gives the right answer. Compare with the example in the post: "vote on a, vote on b, vote on a^b", which just seems strange. Shouldn't we try to use methods that give right answers to simple questions?

I think if you have a set of coefficients for comparing different people's utilities (maybe derived by looking into their brains and measuring how much fun they feel), then that linear combination of utilities is almost tautologically the right solution. But if your only inputs are each person's choices in some mechanism like voting, then each person's utility function is only determined up to affine transform, and that's not enough information to solve the problem.

For example, imagine two agents with utility functions A and B such that A<0, B<0, AB=1. So the Pareto frontier is one branch of a hyperbola. But if the agents instead had utility functions A'=2A and B'=B/2, the frontier would be the same hyperbola. Basically there's no affine-invariant way to pick a point on that curve.

You could say that's because the example uses unbounded utility functions. But they are unbounded only in the negative direction, which maybe isn't so unrealistic. And anyway, the example suggests that even for bounded utility functions, any method would have to be sensitive to the far negative reaches of utility, which seems strange. Compare to what happens when you do have coefficients for comparing utilities: then the method is nicely local.

Does that make sense?
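
For concreteness, a small numerical sketch of the hyperbola example above (an illustration added here; the grid and the unweighted "maximize the sum of utilities" rule are assumptions used only to show the scale dependence):

```python
import numpy as np

# Pareto frontier from the example: A < 0, B < 0, AB = 1, i.e. points (-t, -1/t) for t > 0.
t = np.linspace(0.05, 20, 3991)
frontier = np.stack([-t, -1.0 / t], axis=1)

# The rescaling A' = 2A, B' = B/2 maps this frontier onto itself:
rescaled = frontier * np.array([2.0, 0.5])
print(np.allclose(rescaled[:, 0] * rescaled[:, 1], 1.0))  # True: still AB = 1

# But a rule such as "maximize the sum of utilities" is not affine-invariant:
# it picks roughly (-1, -1) on the original frontier, and a different underlying
# outcome, roughly (-0.5, -2) in the original units, after the rescaling.
print(frontier[np.argmax(frontier.sum(axis=1))])
print(frontier[np.argmax(rescaled.sum(axis=1))])
```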

Aumann agreement isn't an answer here, unless you assume strong Bayesianism, which I would advise against.

I have to say I don't know why a linear combination of utility functions could be considered ideal. There are some pretty classic arguments against it, such as Rawls' maximin principle, and more consequentialist arguments against allowing inequality in practice.
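
As a toy illustration of how the two aggregation rules can disagree (made-up numbers, added here for concreteness):

```python
# Two candidate policies, with utilities for two people (illustrative numbers only).
policies = {
    "unequal": (10.0, 0.5),  # great for person 1, almost nothing for person 2
    "equal":   (4.0, 4.0),
}

# An unweighted linear combination of utilities prefers the unequal policy,
# while a Rawls-style maximin rule prefers the equal one.
utilitarian_pick = max(policies, key=lambda p: sum(policies[p]))  # "unequal" (10.5 > 8.0)
maximin_pick = max(policies, key=lambda p: min(policies[p]))      # "equal"   (4.0 > 0.5)
print(utilitarian_pick, maximin_pick)
```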

2cousin_it
To expand the argument a bit: if many people have evidence-based beliefs about something, you could combine these beliefs by voting, but why bother? You have a superintelligent AI! You can peek into everyone's heads, gather all the evidence, remove double-counting, and perform a joint update. That's basically what Aumann agreement does - it doesn't vote on beliefs, but instead tries to reach an end state that's updated on all the evidence behind these beliefs. I think methods along these lines (combining evidence instead of beliefs) are more correct and should be used whenever we can afford them.

For more details on this, see the old post Share likelihood ratios, not posterior beliefs. Wei Dai and Hal Finney discuss a nice toy example in the comments: two people each observe a private coinflip; how do they combine their beliefs about the proposition that both coins came up heads? Combining the evidence is simple and gives the right answer, while other clever schemes give wrong answers.

Imagine that after doing the joint update, the agents agree to cooperate instead of fighting, and have a set of possible joint policies. Each joint policy leads to a tuple of expected utilities for all agents. The resulting set of points in N-dimensional space has a Pareto frontier. Each point on that Pareto frontier has a tangent hyperplane. So there's some linear combination of utility functions that's maximized at that point, modulo some tie-breaking if the frontier is perfectly flat there.
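
A minimal sketch of that coinflip example (a reconstruction added here; which alternative schemes Wei Dai and Hal Finney actually compared is not reproduced), assuming two fair coins and that both agents happened to see heads:

```python
from itertools import product
from fractions import Fraction

# Four equally likely worlds: (coin1, coin2), each coin "H" or "T".
worlds = {w: Fraction(1, 4) for w in product("HT", repeat=2)}

def both_heads(w):
    return w == ("H", "H")

# Pooling the raw evidence (coin1 = H and coin2 = H) and doing one joint update:
pooled = {w: p for w, p in worlds.items() if w[0] == "H" and w[1] == "H"}
print(sum(p for w, p in pooled.items() if both_heads(w)) / sum(pooled.values()))  # 1

# Each agent alone, seeing only their own coin come up heads, assigns 1/2:
agent1 = {w: p for w, p in worlds.items() if w[0] == "H"}
print(sum(p for w, p in agent1.items() if both_heads(w)) / sum(agent1.values()))  # 1/2

# So averaging the two posteriors gives 1/2 (wrong), and multiplying the agents'
# likelihood ratios as if their observations were independent given not-H gives
# odds of 1/3 * 3 * 3 = 3, i.e. probability 3/4 (also wrong).
naive_odds = Fraction(1, 3) * 3 * 3
print(naive_odds / (1 + naive_odds))  # 3/4
```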

If you liked this post, you will love Amartya Sen's Collective Choice and Social Welfare. Originally written in 1970 and expanded in 2017, this is a thorough development of the many paradoxes in collective choice algorithms (voting schemes, ways to aggregate individual utility, and so on).

My sense is the AI alignment community has not taken these sorts of results seriously. Preference aggregation is non-trivial, so "aligning" an AI to individual preferences means something much different than "aligning" an AI to societal preference... (read more)

7cousin_it
Well, the "ideal" way to aggregate beliefs is by Aumann agreement, and the "ideal" way to aggregate values is by linear combination of utility functions. Neither involve voting. So I'm not sure voting theory will play much of a role. It's more intended for situations where everyone behaves strategically; a superintelligent AI with visibility into our natures should be able to skip most of it.
So I was very surprised when I learned that a single general method in deep learning (training an artificial neural network on massive amounts of data using gradient descent)[2] led to performance comparable or superior to humans’ in tasks as disparate as image classification, speech synthesis, and playing Go. I found superhuman Go performance particularly surprising—intuitive judgments of Go boards encode distillations of high-level strategic reasoning, and are highly sensitive to small changes in input.

I think it may be important to recogni... (read more)

gwern100

And MuZero, which beats AlphaZero and which does not use symbolic search over a simulator of board states but internal search over hidden state and value estimates?

Neural networks, on the other hand, are famously bad at symbolic reasoning tasks, which may ultimately have some basis in the fact that probability does not extend logic.

Considering all the progress on graph and relational networks and inference and theorem-proving and whatnot, this statement is giving a lot of hostages to fortune.

2zhukeepa
Yep! I addressed this point in footnote [3].

We could look at donors' public materials, for example the evaluation requirements listed in grant applications. We could examine the programs of conferences or workshops on philanthropy and see how often this topic is discussed. We could investigate the reports and research literature on this topic. But I don't know how to define what would count as enough concern.

While Bayesian statistics are obviously a useful method, I am dissatisfied with the way "Bayesianism" has become a stand-in for rationality in certain communities. There are well-developed, deep objections to this. Some of my favorite references on this topic:

  • Probability Theory Does Not Extend Logic by David Chapman. Part of what is missing from simulating every logically possible universe is indeed reasoning, in the sense that probabilistic inference nicely extends propositional logic but cannot solve problems in first-order logic. This is why pr
... (read more)
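
As a small illustration of the propositional part of that point (a sketch added here, not drawn from Chapman's article): a distribution over truth assignments to finitely many atoms fixes the probability of every compound sentence, whereas nothing analogous is available once quantifiers range over an unbounded domain.

```python
# An assumed joint distribution over the four worlds for two atoms (made-up numbers).
joint = {
    (True, True): 0.05,
    (True, False): 0.25,
    (False, True): 0.30,
    (False, False): 0.40,
}

def prob(sentence):
    """Probability of a compound sentence given as a function of the atoms (rain, sprinkler)."""
    return sum(p for world, p in joint.items() if sentence(*world))

print(prob(lambda rain, sprinkler: rain or sprinkler))        # 0.6  (modulo float rounding)
print(prob(lambda rain, sprinkler: (not rain) or sprinkler))  # 0.75, i.e. "rain implies sprinkler"
print(prob(lambda rain, sprinkler: rain or not rain))         # 1.0: tautologies get probability 1

# There is no analogous finite enumeration for first-order sentences such as
# "for every x there is a y with R(x, y)" over an unbounded domain.
```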

I am happy that you mention Gelman's book (I am studying it right now). I think lots of "naive strong Bayesians" would benefit from a thoughtful study of the BDA book (there are lots of worked-out demos and exercises available for it), and maybe from some practical application of Bayesian modelling to real-world statistical problems. The "Bayesian way of life" of "updating my priors" always sounds a bit too easy compared to doing genuine statistical inference.

For example, a couple of puzzles I am still ... (read more)

My sense is that donors do care about evaluation, on the whole. It's not just GiveWell / Open Philanthropy / EA who think about this :P

See for example https://www.rockpa.org/guide/assessing-impact/

1jasoncrawford
My sense is that they don't care nearly enough. How could we find evidence one way or another?

Well said. And this middle ground is exactly what I am worried about losing as companies add more AI to their operations -- human managers can and do make many subtle choices that trade profit against other values, but naive algorithmic profit maximization will not. This is why my research is on metrics that may help align commercial AI to pro-social outcomes.

2Decius
Naive algorithmic anything-optimization will not make those subtle trade-offs. Metric maximization run on humans is already a major failure point of large businesses, and the best an AI that uses metrics can do is draw attention to the fact that even metrics which don't start out bad become bad over time.

Because central planning is so out of fashion, we have mostly forgotten how to do it well. Yet there are little-known historical methods that could be applicable in the current crisis, such as input-output analysis, as Steve Keen writes:

One key tool that has fallen out of use in economics is input-output analysis. First developed by the non-orthodox economist Wassily Leontief (Leontief 1949; Leontief 1974), it used matrix mathematics to quantify the dependence of the production of one commodity on inputs of other commodities. Given its superficial similari
... (read more)
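
To make the quoted idea concrete, here is a minimal sketch (added here, with made-up numbers) of the standard Leontief relation x = (I - A)^-1 d, where A holds input requirements per unit of output and d is final demand:

```python
import numpy as np

# A[i, j] = units of commodity i needed as input to produce one unit of commodity j.
A = np.array([
    [0.1, 0.3],
    [0.2, 0.1],
])

# Final demand for each commodity (consumption, not intermediate use).
d = np.array([100.0, 50.0])

# Gross output x must cover intermediate use A @ x plus final demand d:
#   x = A @ x + d   =>   (I - A) x = d
x = np.linalg.solve(np.eye(2) - A, d)
print(x)                           # gross output needed of each commodity
print(np.allclose(x - A @ x, d))   # True: production minus intermediate use equals final demand
```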

I agree that revenue is a key part of the organizational feedback loop that non-profits do not have, and it's often a problem. However, for-profits have a tendency to turn toward revenue. To the extent that we care about what an organization does for society, we should care about organizational drift caused by chasing revenue. I believe it's an open question whether the lack of revenue feedback in non-profits or the organizational drift caused by revenue alignment in for-profits is currently the bigger problem in society.

I also think you may be underestimat... (read more)

3jasoncrawford
Any metric can be gamed or can distort behavior, it's true. No metric can substitute for judgment. Re programmatic evaluation: It's true that nonprofits *can* do this, but that only matters if *donors* on the whole care. This is why I said:
4Decius
There's a middle ground between having an organization be profitable, and an organization optimizing for profitability.

Hi Gordon. Thanks for reading the post. I agree completely that the right metrics are nowhere near sufficient for aligned AI — further I’d say that “right” and “aligned” have very complex meanings here. 

What I am trying to do with this post is shed some light on one key piece of the puzzle, the actual practice of incorporating metrics into real systems. I believe this is necessary, but don’t mean to suggest that this is sufficient or unproblematic. As I wrote in the post, “this sort... (read more)