The following is slightly edited from a pitch I wrote for a general audience. I've added blog-specific content afterwards.
Information technology allows for unprecedented levels of collaboration and debate. When an issue arises people communicate freely, are held accountable for misrepresentations, and are celebrated for cogent analysis. We share information and opinion better than ever before. And then we leave the actual decision up to one person, or a tiny committee, or a poll of a population that for the most part wasn't paying attention, or at best an unpredictably irrational market. The one thing we still don't aggregate in a sophisticated way is human judgment.
Organizations evolve complex decision-making structures because variance in human judgment is complicated. We try to put the most competent person in charge; but there is wisdom in crowds, and so a wise leader gets buy-in from a broad pool of competent subordinates. We must constantly try to evaluate who has the best record, to see who's been right in the past... and we get it wrong all the time. We overestimate our own competence. In hindsight, we misremember the right decision as being obvious. We trust the man with the better hair. Any organization with group buy-in on decisions amasses a solid body of data on the competence of its members, but it does not curate or use this data effectively.
We can do better, using current technology, some simple software, and some relatively simple math. The solution is called histocracy. It is most easily explained with a use case.
The H Foundation is a hypothetical philanthropic organization, with a board of twelve people overseeing a large fund. Each year, they receive and review several hundred grant applications, and choose a few applicants to give money to. Sometimes these applicants use the money effectively, and sometimes they fail. Often an applicant they turn down will get funding elsewhere and go on to notable success or failure. In short, it is often obvious to the board in hindsight whether they made the right decision.

For each application, the yea or nay of each board member is recorded. If and when the board later reaches a consensus on whether that application should have been approved, this consensus is recorded as well. The result is that each board member accumulates a score. Alice's votes have been right 331 times and wrong 59 times, while Bob's votes have been right 213 times and wrong 110 times (they weren't always present for the same votes). Already from this raw data we can see that Alice's opinion should count for more than Bob's. With a computer's facility for arithmetic, we can quantify this. Some math is given in an appendix; here it suffices to say that it would be reasonable to give Alice's vote a little over 7/4 the weight of Bob's: if the board is to maximize its chance of making the correct choice, 4 Alices should be able to outvote 7 Bobs. The board members each connect to a shared server and vote on each application; the software performs the relevant calculations and determines the outcome.
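The appendix itself isn't reproduced here, so as an illustration only, here is a minimal sketch of one standard weighting rule: give each voter the log odds of their Laplace-smoothed success rate, which is the optimal weighting when voters err independently. The `Voter` class and `weighted_vote` helper are illustrative names, not anything specified above.

```python
from dataclasses import dataclass
from math import log

@dataclass
class Voter:
    right: int  # past votes later judged correct in hindsight
    wrong: int  # past votes later judged incorrect in hindsight

    def weight(self) -> float:
        # Laplace's rule of succession estimates the probability of being right.
        p = (self.right + 1) / (self.right + self.wrong + 2)
        # Log-odds weight: optimal for independent voters.
        return log(p / (1 - p))

def weighted_vote(ballots: dict, voters: dict) -> bool:
    """Approve iff the weighted yeas outweigh the weighted nays."""
    score = sum(voters[name].weight() * (1 if yea else -1)
                for name, yea in ballots.items())
    return score > 0

board = {"Alice": Voter(right=331, wrong=59), "Bob": Voter(right=213, wrong=110)}
print({name: round(v.weight(), 3) for name, v in board.items()})
# {'Alice': 1.711, 'Bob': 0.656}
```

Note that this particular rule happens to give Alice closer to 2.6 times Bob's weight than 7/4, so the appendix's exact formula evidently differs; the structure of the computation is the same either way.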
In this system, the board members perform the massively complex task of evaluating the applicants, a job requiring expert judgment and intuition, while the computer dispassionately and precisely evaluates the board. The result is a system wiser than any individual board member.
When scaling this solution up to a large business with thousands of employees, the math stays the same while the interface changes. Decisions need to be shared and discussed on a corporate intranet, and tagged by type so that employees can find and vote on only those decisions they feel competent to vote on. Employees who venture beyond their competence will fail to accumulate enough voting weight to skew the outcome, which means decisions in every area can safely be opened to the entire workforce. Managers should be encouraged to reframe decisions they are pondering as corporation-wide referenda. Evaluating a decision in hindsight should in this case be reserved for the owners and shareholders, or for a system or charter they have approved.
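As a sketch of how per-topic competence might be tracked (the tag-keyed ledger here is one plausible design, not a specification):

```python
from collections import defaultdict
from math import log

class Ledger:
    """Tracks each employee's right/wrong record separately per decision tag."""

    def __init__(self):
        # (employee, tag) -> [times right, times wrong]
        self.records = defaultdict(lambda: [0, 0])

    def record_outcome(self, employee: str, tag: str, was_right: bool):
        self.records[(employee, tag)][0 if was_right else 1] += 1

    def weight(self, employee: str, tag: str) -> float:
        right, wrong = self.records[(employee, tag)]
        p = (right + 1) / (right + wrong + 2)  # Laplace smoothing
        return log(p / (1 - p))

ledger = Ledger()
ledger.record_outcome("Carol", "logistics", was_right=True)
print(ledger.weight("Carol", "logistics"))  # ~0.693: a modest earned weight
print(ledger.weight("Carol", "pricing"))    # 0.0: no record, no influence
```

A voter with no record in a tag gets a smoothed success rate of exactly 1/2, and therefore a weight of zero, so dabbling outside your competence is harmless until you've earned a track record there.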
Expanding the scale even further, the same approach could be applied to advice, solicited or unsolicited. Consider a site where clients pay to submit polls on decisions that concern them. The polls would be conducted and reported histocratically, and the client would later be asked to report whether the advice given by the community turned out to be correct. Prizes and recognition could go to the participants who accumulate the highest voting weights, thereby incentivizing participation and excellence. For unsolicited advice, a similar approach could be used with petitions.
In summary, we note that human judgment is essentially a set of predictions, and thus can be judged empirically and aggregated mathematically. Group decision-making is such an omnipresent and consequential task that optimizing it may be the single most important thing we can do. Let's do it rigorously, and let's do it now.
On Sunday, I posted a call for solutions in advance of this post. Which is a weird thing to do, but I have a terror of Irrevocable Actions, and I can't untell you something. (Coincidentally, at the same time as people were chiding me for this, a discussion started about my also mildly eccentric decision to put my play behind a semipermeable paymembrane, which has a similar explanation; it's easier to make something free that was once non-free than the reverse, and in many circles charging for something is actually higher-status.)
I didn't mention prediction markets because I didn't want people to anchor on them; it's just a hop, skip, and a jump from futarchy to histocracy, and that would have given the game away. As expected, people went there immediately anyway, and from there to something very close to my idea. Much of the discussion centered on the difficulty of creating a well-defined charter. While I certainly agree that a quantifiable group utility function is usually difficult, if you go up a level of meta you'll see that well-defined charters are everywhere: a decision is correct if and only if the people in power judge it to have been correct. To be a democracy, we don't need to explicitly vote on values; we just need to let people vote on consequences in accordance with their values. The king's order may be ambiguously worded, but your true duty is clear: please the king.
There are some clear advantages to histocracy over futarchy: most relevantly, I believe histocracy will work well on a small scale, while prediction markets require a large crowd. Given enough time and participation, histocracy will inevitably beat a market. There's less moral hazard, and less vulnerability to manipulation.
Futarchy beats histocracy in that it has a built-in incentive to participate and excel; but people vote in elections and serve on non-profit boards for free, so I don't see a huge need to inject cash. Futarchy allows individual actors to express degrees of confidence in a way that my model of histocracy doesn't, but this could be remedied where feasible. And Hanson's ideas for how to judge consequences in hindsight might be appropriate for some histocracies.
The potential pitfalls of histocracy depend on the specific implementation. I see two major Achilles' heels: politics, in the blue-vs.-green mind-killing sense, and the difficulty of evaluating consequences even in hindsight; but as far as I can tell, these afflict every decision-making system. There is also a danger of a subgroup amassing a large voting weight and then abusing it in the window before they are removed from power. This can perhaps best be guarded against with some sort of constitutional system, perhaps even one formally incorporated into the math as a high Bayesian prior against certain classes of actions being correct.
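Under the log-odds voting sketched earlier, such a prior has a natural formalization (the mapping and the 99:1 odds below are illustrative assumptions, not a worked-out proposal): treat the constitution as a fixed log-odds penalty added to the weighted vote total, so that disfavored classes of actions require a supermajority of accumulated judgment to pass.

```python
from math import log

# Illustrative prior odds that a "constitutionally disfavored" action is
# correct: 1:99 against. Such a proposal then needs a weighted yea margin
# above log(99) ~ 4.6 to pass, i.e. a supermajority of accumulated judgment.
PRIOR_LOG_ODDS = {"ordinary": 0.0, "disfavored": -log(99)}

def passes(yea_minus_nay_weight: float, action_class: str) -> bool:
    # Posterior log odds = constitutional prior + summed weighted votes.
    return PRIOR_LOG_ODDS[action_class] + yea_minus_nay_weight > 0

print(passes(3.0, "ordinary"))    # True: clears the neutral prior
print(passes(3.0, "disfavored"))  # False: blocked by the constitutional prior
```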
I should also concede up front that my “mathematical” appendix glosses over the serious AI challenge of doing this right: hopefully, the computing power available to a histocracy will grow much faster than the number of voters. Log(Laplace(Record)) treats every voter's record as independent evidence, so it will double-count terribly in large groups; but it does have the advantage of being simple and transparent, and entrusting your government to a black box is scary.
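To make the double-counting concrete (a toy example under the independent-voter weighting above): two members who always vote identically get their shared opinion counted as two independent pieces of evidence, enough to outvote a single better judge.

```python
from math import log

def weight(right: int, wrong: int) -> float:
    p = (right + 1) / (right + wrong + 2)  # Laplace-smoothed success rate
    return log(p / (1 - p))                # naive independent-voter weight

# Dana judges independently; Eve and Fred always vote identically.
dana = weight(90, 10)        # ~2.11
eve = fred = weight(80, 20)  # ~1.35 each
print(eve + fred > dana)     # True: one shared opinion, counted twice, wins
```

Correcting for correlations like this is where the serious modeling work lives.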
Groups giving histocracy a try should start by making it nonbinding, and adopt it only once it's working better than their current system. Unless, of course, the current system is a majority vote, in which case they might as well start using it right away.
To expand on why I expect histocracy to beat a market given enough time and participation: the market will tend towards something like the arithmetic mean of the estimates of all of its participants. Which is great, Wisdom of Crowds and all that, and I'd even call the whole concept beautiful given the way it practically runs itself, but it's not even a local optimum among aggregation algorithms: you'd get a few points better calibrated if you threw out the bets made by Idiot Jed, Glutton for Punishment.