This is a special post for quick takes by Warty.

Good epistemic calibration of a prediction source is not impressive.

I see people being impressed by calibration charts, for example https://x.com/ESYudkowsky/status/1924529456699641982 , or stronger: https://x.com/NathanpmYoung/status/1725563206561607847

But it's trivial to produce a straight-line calibration graph: if yours isn't straight, fix each probability bucket by repeatedly predicting the outcome of a coin weighted to land heads with that probability.

If you're a prediction market platform where the probability has to be decided by dumb monkeys, just make sure that the vast majority of questions are of the form "will my p-weighted coin land heads". 
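A minimal sketch of that trick (the code and bucket choices are mine, purely illustrative): for each bucket p, "predict" p on flips of a p-weighted coin, and the chart comes out straight by construction, with no forecasting skill involved.

```python
import random

random.seed(0)

def gamed_calibration_chart(n_per_bucket=10_000):
    """For each bucket p, predict p on flips of a p-weighted coin.
    The observed frequency in each bucket then matches p by construction."""
    chart = {}
    for p in [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]:
        hits = sum(random.random() < p for _ in range(n_per_bucket))
        chart[p] = hits / n_per_bucket
    return chart

print(gamed_calibration_chart())
# e.g. {0.1: 0.099, 0.2: 0.2, ..., 0.9: 0.9} -- a straight line, zero skill required
```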

---

If a calibration graph isn't straight, that implies an epistemic free lunch: if things that you predict at 20% actually happen 30% of the time, just shift those predictions. This is probably why actual prediction markets are calibrated, since miscalibration leads to an easy trading strategy. But the presence of calibration is not a very interesting property.
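To make the free lunch concrete, a small sketch (my own toy setup, with the 20%/30% numbers assumed as above): if your "20%" events actually resolve true 30% of the time, relabeling them as 30% improves your Brier score with no new information about the individual questions.

```python
import random

random.seed(0)

# Events you labeled "20%" but which actually resolve true 30% of the time.
outcomes = [random.random() < 0.30 for _ in range(100_000)]

def brier(p, outcomes):
    """Mean squared error of a constant prediction p against 0/1 outcomes."""
    return sum((p - o) ** 2 for o in outcomes) / len(outcomes)

print(brier(0.20, outcomes))  # ~0.22
print(brier(0.30, outcomes))  # ~0.21 -- a strictly better score, just from relabeling
```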

Good calibration is impressive and an interesting property because many prediction sources manage to not clear even that minimal bar (almost every human who has not undergone extensive calibration training, for example, regardless of how much domain expertise they have).

Further, you say one shouldn't be impressed by those sources because they could be flipping a coin, but then you refuse to give any examples of 'impressive' sources which are doing just the coin-flip thing or an iota of evidence for this bold claim, or to say what they are unimpressive compared to.

Yeah, I would be impressed if a human showed me that they have a good calibration chart.

(Though part of it is that humans usually put few questions in their calibration charts. It would be nice to look at people's performance across a range of calibration-improvement exercises.)

I don't think anyone is brute-forcing calibration with fake predictions; it would be easy to see if the predictions are public. But if a metric is trivially gameable, surely that makes it sus and less impressive, even if someone is not trivially gaming it, or not gaming it at all.

I don't claim that any entity is not impressive, just that we shouldn't be impressed by calibration (humans get a pass; it takes so much effort for us to do anything).

There is probably some bravery-debate aspect here: if you look at my linked tweets, it's like in my world people are just going around saying that good calibration implies good predictions, which is false.

(edit 1: for human calibration exercises, note that with a stream of questions where p% resolve true, it's perfectly calibrated to always predict p%. Humans who do calibration exercises have goals other than calibration. Maybe I should pivot to activism in favor of prediction scores)
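A sketch of that point (my own toy example; the 70/30 base rate and two-signal setup are assumptions): on a stream where about 70% of questions resolve true, always answering "70%" is perfectly calibrated, but a proper score like Brier still separates it from a forecaster with actual discernment.

```python
import random

random.seed(0)

n = 100_000
# Half the questions are really 50/50, half are really 90/10; overall ~70% resolve true.
true_p = [random.choice([0.5, 0.9]) for _ in range(n)]
outcomes = [random.random() < p for p in true_p]

def brier(preds, outcomes):
    return sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(outcomes)

constant = [0.7] * n   # always predict the base rate: perfectly calibrated, zero discernment
discerning = true_p    # predict the true per-question probability: calibrated and informative

print(brier(constant, outcomes))    # ~0.21
print(brier(discerning, outcomes))  # ~0.17 -- the gap a calibration chart alone won't show
```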

But if a metric is trivially gameable, surely that makes it sus and less impressive, even if someone is not trivially gaming it, or not gaming it at all.

Why would you think that? Surely the reason that a metric being gameable matters is if... someone is or might be gaming it?

Plenty of metrics are gameable in theory, but are still important and valid given that you usually can tell if they are. Apply this to any of the countless measurements you take for granted. Someone comes to you and says 'by dint of diet, hard work (and a bit of semaglutide), my bathroom scale says I've lost 50 pounds over the past year'. Do you say 'do you realize how trivially gameable that metric is? how utterly sus and unimpressive? You could have just been holding something the first time, or taken a foot off the scale the second time. Nothing would be easier than to fake this. Does this bathroom scale even exist in the first place?' Or, 'my thermometer says I'm running a fever of 105F, I am dying, take me to the hospital right now' - 'you gullible fool, do you have any idea how easy that is to manipulate by dunking it in a mug of tea or something? sus. Get me some real evidence before I waste all that time driving you to the ER.'

Hmm, yeah, gameability might not be as interesting a property of metrics as I've expressed.

(Though I still feel there is something in there. Fixing your calibration chart after the fact by predicting weighted coins is maybe a lot like taking a foot off the bathroom scale. But, for example, predicting every event as a constant p%, is that even cheating in the calibration game? Though neither of these directly applies to the case of prediction market platforms.)

Disagree. It's possible to get a good calibration chart in unimpressive ways, but that's not how Polymarket & Manifold got their calibration, so their calibration is impressive.

To elaborate: It's possible to get a good calibration graph by only predicting "easy" questions (e.g. the p-weighted coin), or by predicting questions that are gameable if you ignore discernment (e.g. 1/32 for each team to win the Super Bowl), or with an iterative goodharting strategy (e.g. seeing that too many of your "20%" forecasts have happened and then predicting "20%" for some very unlikely things). But forecasting platforms haven't been using these kinds of tricks, and aren't designed to. They came by their calibration the hard way, while predicting a diverse set of substantive questions one at a time & aiming for discernment as well as calibration. That's an accomplishment.

You skip over the not very impressive way for a prediction market platform to be calibrated that I already mentioned. If things predicted at 20% actually happen 30% of the time, you can buy up random markets that are at 20% and profit.

That seems like an instance of a general story for why markets are good: if something is priced too low people can buy it up and make a profit. It's a not very impressive way for markets to be impressive.

If you'd said "not surprising" instead of "not impressive" then maybe I would've been on board. It's not that surprising that prediction markets are good at calibration because we already knew that markets are good at that sort of thing. That seems basically true, for certain groups of "we". Though my attitude is still more "check it out: it works like we thought it would" rather than "nothing to see here, this is just what we expected".

What I'm going towards is, it seems to me the predictions given by the platform can be almost arbitrarily bad, but with some assumptions the above strategy will work and will make the platform calibrated. So calibration does not imply anything about goodness of predictions. So it's not impressive. 
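A toy simulation of what I mean (the noise-price setup and 10%-wide buckets are arbitrary choices, just for illustration): even when the platform's prices start as pure noise, a trader who only corrects bucket-level frequencies leaves the platform calibrated, and still uninformative about individual questions.

```python
import random
from collections import defaultdict

random.seed(0)

n = 50_000
true_p = [random.random() for _ in range(n)]     # actual per-question probabilities
prices = [random.random() for _ in range(n)]     # platform prices: pure noise, unrelated to true_p
outcomes = [random.random() < p for p in true_p]

def bucket(price):
    return round(price, 1)                       # group markets into 10%-wide price buckets

# The "calibration trader": for each price bucket, see how often those markets
# resolve true, and push every market in the bucket to that observed frequency.
by_bucket = defaultdict(list)
for price, outcome in zip(prices, outcomes):
    by_bucket[bucket(price)].append(outcome)
bucket_freq = {b: sum(v) / len(v) for b, v in by_bucket.items()}
corrected = [bucket_freq[bucket(p)] for p in prices]

# Every corrected price lands near 0.5 (the overall base rate): the platform is now
# calibrated, but its prices say nothing about which individual questions will resolve true.
print(sorted(bucket_freq.items()))
```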

Calibration is a super important signal of quality because it means you can actually act on the given probabilities! Even if someone is gaming calibration by betting given ratios on certain outcomes, you can still bet on their predictions and not lose money (often). That is far better than other news sources such as tweets or NYT or whatever. If a calibrated predictor and a random other source are both talking about the same thing, the fact that the predictor is calibrated is enough to make them the #1 source on that topic.
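For a rough sense of the "not lose money" point (numbers are mine, just an illustration): if a calibrated source says 30% and you take the yes side at fair odds, staking 30 to win 70, you win about 30% of such bets, so the expected profit per bet is 0.3·70 − 0.7·30 = 0. Calibration alone gets you to roughly break-even; doing better than that still requires discernment.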

has anyone looked into the "philosophers believe in moral realism" problem? (in the sense of, morality is not physically contained in animal bodies and human-created artifacts)

I saw a debate on youtube with this Michael Huemer guy, but it was with another academic philosopher. Was there ever an exchange recorded between a moral realist philosopher and a rationalist-lesswrongist?

That's a frequent misconception. In fact, Eliezer Yudkowsky is a moral realist.

CEV doesn't prove that, twice over. For one thing, there's no proof it could work. For another, if it does, it is only munging together a bunch of subjective attitudes.

The article is not about alignment (that's a different article), it's about a normative moral theory.

Yes. And there is no proof CEV could work as a normative moral theory. I.e., there is no proof it can even converge on an answer, and, separately, no reason to think the answer is actually objective.

has anyone looked into the “philosophers believe in moral realism” problem? (in the sense of, morality is not physically contained in animal bodies and human-created artifacts)

Minimally, moral realism is the claim that some moral propositions are objectively true or false. Do you have a problem with that? Maximally it can also involve non-naturalism, or special ontological domains. Is that the problem?

I saw a debate on youtube with this Michael Huemer guy, but it was with another academic philosopher.

Is that a problem? Why? Do you think all academic philosophers are moral realists? About 62% are. One of the things you could learn from the Wikipedia page.

(BTW, the other guy might be Lance Bush, who is anti realist as they come)

Was there ever an exchange recorded between a moral realist philosopher and a rationalist-lesswrongist?

How would that help? Rationalists haven't settled on a single moral theory... and plenty of non-rationalists are naturalists.

Most shameful of me to use someone's term and define it as my beef with them. In my impressions, moral realism has also always involved moral non-corporalism if you will. As long as morality is safely stored in animal bodies, I'm fine with that. 

The one in the youtube debate identified as a moral non-realist. But you see, his approach to the subject was different from mine, and that is a problem.

I think there more or less is a rationalist-lesswrongist view of what morality is, shared not by all but by most rationalists (I wanted to say it's explained in the sequences, but suspiciously I can't find it in there).

Most shameful of me to use someone’s term and define it as my beef with them.

I am making guesses about what you might be saying, because you are being unclear.

In my impressions, moral realism has also always involved moral non-corporalism if you will

Well, it doesn't, and research will tell you that.

The one in the youtube debate identified as a moral non-realist

Which debate?

I wanted to say it’s explained in the sequences, but suspiciously I can’t find it in there.

I've read the sequences, and that's why I say there is no clear theory.

I am making guesses about what you might be saying, because you are being unclear.

I was responding to your correction of my definition of moral realism. I somewhat jokingly expressed shame for defining it idiosyncratically.

Well, it doesn't, and research will tell you that.

It can still be true of my impressions of it, like every time I saw someone arguing for moral realism.

Which debate? 

I think it was this one; regretfully, I'm being forced to embed it in my reply.

It can still be true of my impressions of it, like every time I saw someone arguing for moral realism

You were saying that there was a problem with philosophy itself.

I don't recall saying that recently, though it's true. I don't know what you're getting at. 

has anyone looked into the “philosophers believe in moral realism” problem.

That was a few hours ago.

I would say it's perhaps indicative of a problem with academic philosophy. Unless that 62% is mostly moral corporalists, in which case it's fine by me if they insist that "some moral propositions are objectively true or false", I guess.

Maybe you could try listening to the arguments. MR doesn't have to be based on material entities or immaterial ones.

that's a trick to make me be like them!

(I listened to some of that Michael Huemer talk and it seemed pretty dumb)

An argument against computationally bounded Solomonoff induction is that it wouldn't get quantum physics, because quantum physics is exponentially hard to compute classically. But quantum computation isn't less natural than classical computation, so we might as well base the induction on a bounded quantum computer, which gets around the objection.
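To spell out the "exponentially hard" step, here is one standard way to formalize the bounded predictor (my choice of formalization; the note above doesn't specify one): keep the usual $2^{-|p|}$ prior, but only over programs that finish within a time budget,

$$M_t(x) \;=\; \sum_{p\,:\,U(p)\text{ outputs }x\text{ within }t(|x|)\text{ steps}} 2^{-|p|},$$

where $U$ is a universal classical machine. Classically simulating $n$-qubit quantum dynamics is believed to take time roughly exponential in $n$, so under any modest budget $t$ the short program that encodes quantum mechanics gets excluded (or, in runtime-penalized variants, heavily down-weighted), and the predictor misses quantum physics. Swapping $U$ for a universal quantum computer under the same budget removes that asymmetry, which is the point above.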