Introduction
It’s possible to make quantitative forecasting (forecasting by software) significantly easier and more fun by adding some missing pieces to the forecasting ecosystem. Let’s do it.
The models I’m interested in are “gears” and not “behavior” models: They should be motivated by theory, be built to inform our theoretical understanding, and have relatively few numeric parameters. The point is to do science, not just fit parameters to data.
In this post I focus on forecasting things that are regularly measured and published by institutions. Such data includes economic and population statistics, opinion and value surveys, statistics about violence, public health, etc. I propose that we can build tools that would make it easier to test theories through forecasting models.
For example, a simple model for national meat consumption would use data about per-capita GDP growth, population growth, population value judgements (as measured by the World Values Survey), and the past year's meat consumption to predict next year's consumption. Similar inputs would be relevant for predicting carbon emissions. Predicting GDP would presumably rely on corruption indicators, past growth, educational attainment, etc.
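The meat-consumption example can be sketched as a tiny "gears" model with a few interpretable parameters. Everything here is hypothetical: the function, its coefficients, and its inputs are placeholders for illustration, not fitted values or a proposed specification.

```python
# A minimal sketch of the meat-consumption example as a "gears" model.
# All coefficients are illustrative placeholders, not fitted values.

def predict_meat_consumption(
    prev_consumption: float,  # total national consumption last year (tonnes)
    gdp_growth: float,        # fractional GDP-per-capita growth, e.g. 0.02
    pop_growth: float,        # fractional population growth
    values_shift: float,      # change in a survey-based values index
) -> float:
    """Predict next year's national meat consumption (tonnes)."""
    gdp_elasticity = 0.5   # hypothetical income elasticity of demand
    values_weight = -2.0   # hypothetical: greener values -> less meat
    return (prev_consumption
            * (1 + pop_growth)
            * (1 + gdp_elasticity * gdp_growth)
            + values_weight * values_shift)

print(predict_meat_consumption(80.0, 0.02, 0.01, 0.1))
```

The point of keeping the model this small is that each parameter corresponds to a theoretical claim (an income elasticity, a values effect) that the evaluation machinery below can then test against published statistics.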
Motivation
make it easier to build, share and use models
There is no “Wikipedia for predictive models” that I know of. No big repository to easily share and find predictive scientific models other than the relevant domain’s scientific literature, which is not optimized for these tasks: it is not organized by the variables being predicted, it is not generally available as reusable and modular software components, it is usually not focused on predictive work, some of it is paywalled, etc.
low-effort empiricism for non-experts
Evaluating the empirical performance of predictive models is a noisy but low-bias way to get at the truth. People who aren't domain experts usually try to reach the truth by finding people they hope are knowledgeable and trusting their opinions, because doing the research yourself is hard. Think about the people you disagree with regarding the safety of vaccines. Consider how bad those people are at identifying reliable domain experts, and how dangerous that weakness is. We can build tools that even low-information users can use to get in touch with reality.
The tools we miss
There are several tools we can build to make quantitative forecasting easier and more useful. These tools are:

- a catalog of datasets,
- a catalog of models,
- a model evaluation service, and
- a model performance prediction market.
Let us briefly consider the benefit of each of these in turn.
Catalog of datasets
First, there's the catalog of datasets. Once this catalog exists, model developers will be able to find and fetch data more easily: they can search the catalog for what interests them and use a software library to download the data from wherever it lives, without having to think about each publisher's access method or file format.
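A sketch of what such a catalog client could look like. The `DatasetCatalog` and `Dataset` classes and the example entries are invented for illustration; no such library exists yet, and a real one would also handle downloading and format conversion.

```python
# Hypothetical dataset-catalog client: search by measured variable,
# regardless of who publishes the data or in what format.
from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str
    variables: list   # variables the dataset measures
    source_url: str

@dataclass
class DatasetCatalog:
    entries: list = field(default_factory=list)

    def register(self, ds: Dataset) -> None:
        self.entries.append(ds)

    def search(self, variable: str) -> list:
        """Find datasets by the variable they measure."""
        return [d for d in self.entries if variable in d.variables]

catalog = DatasetCatalog()
catalog.register(Dataset("WVS", ["values_index"], "https://www.worldvaluessurvey.org"))
catalog.register(Dataset("WB-GDP", ["gdp_per_capita"], "https://data.worldbank.org"))
print([d.name for d in catalog.search("gdp_per_capita")])  # → ['WB-GDP']
```

The key design choice is indexing by the variable being measured rather than by publisher, which is exactly what the domain literature fails to do today.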
Catalog of models
The catalog of models is meant to be a central place to share your models with others, and to find what others have built for you to build upon. One might come here looking for pieces of models to stitch together with one's own ideas, producing a model that relates more variables and fits a particular purpose. Experts and hobbyists alike could use this catalog to put forecasts on the public record, in the expectation that these forecasts will be tested and that good models will win some recognition.[1]
Model evaluation service
Next is the model-evaluation service. It would monitor data publishers for updates and use new data as it arrives to track the accuracy of predictive models. There would be a scoreboard, showing the best models for predicting mortality, the best models for predicting economic growth, unemployment, carbon emissions, etc.
This is where ordinary people would go to see which factors matter for something that concerns them. Alas, only variables with published statistics can be related this way, though we may be able to detect indirect links between variables as well.
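The evaluation loop behind the scoreboard can be sketched as follows. The function names, the two toy models, and the choice of mean absolute error as the metric are all illustrative assumptions, not a fixed design.

```python
# Sketch of the evaluation service: when new observations arrive, score
# every registered model and keep a leaderboard sorted best-first.

def mean_absolute_error(preds, actuals):
    return sum(abs(p - a) for p, a in zip(preds, actuals)) / len(preds)

def leaderboard(models, inputs, actuals):
    """models: {name: predict_fn}; inputs: one input per period;
    actuals: realized values per period. Returns (name, MAE), best first."""
    scores = []
    for name, predict in models.items():
        preds = [predict(x) for x in inputs]
        scores.append((name, mean_absolute_error(preds, actuals)))
    return sorted(scores, key=lambda s: s[1])

models = {
    "persistence": lambda x: x,    # next year equals this year
    "trend": lambda x: x * 1.02,   # fixed 2% growth
}
print(leaderboard(models, [100, 102, 105], [102, 105, 107]))
```

Because the scoreboard only needs each model's predictions and the realized values, models can remain black boxes to the service while still competing on a common, hard-to-game metric.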
Model performance prediction market
We can bet (real and/or play money) on the performance of models for fun, education and profit. Whereas ordinary prediction markets provide you with information about discrete events but do not reveal the participants’ thinking process, model-performance markets teach you how to think about the world. The usual motivations for prediction markets apply: they aggregate information, they provide a trustworthy, difficult-to-corrupt signal, they provide a channel for results-based subsidies for research, they have educational value, etc.
Plausible questions
Summary
The overall vision is to build better tools, and a community of users around them, that facilitate finding out how the world works: tools that make us curious about predicting things, that train us to put numbers on effect sizes, and that could even help focus research funding on work that makes obvious epistemic progress.
Does this interest you? Right now this is just a bunch of ideas I'm discussing with people. I need partners to make this happen. Please talk to me.
Questions to the audience
I've never started anything like this before. I'm gonna need all the help I can get.
I'll be waiting for your thoughts, either here or on https://discord.gg/rBqyQcrT5q. Bring it on!
Acknowledgements
I thank Niki Kotsenko, Yonatan Cale, Lior Oppenheim and Edo Arad for their helpful remarks.
A reviewer of this post did not think that the Data Catalog solves a pain point. I suspend judgement on this question for now until I get more experience in building models.
Perhaps you think that the scientific establishment needs no improvement in being apolitical. A good many people seem to think otherwise, as did Robin Hanson. Even if the scientific establishment is flawless, there is value in creating an evaluation framework that can easily be seen to be unbiased, rather than merely believed to be unbiased.
In “Could Gambling Save Science”, Robin Hanson writes: “Peer review is just another popularity contest, inducing familiar political games; savvy players criticize outsiders, praise insiders, follow the fashions insiders indicate, and avoid subjects between or outside the familiar subjects. It can take surprisingly long for outright lying by insiders to be exposed [Re]. There are too few incentives to correct for cognitive [Kah] and social [My] biases, such as wishful thinking, overconfidence, anchoring [He], and preferring people with a background similar to your own.”
For a scathing attack on the journal system, see What's Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers.