Younes Kamel

When is it appropriate to use statistical models and probabilities for decision making ?

Your intuition is a model.

Sure, you can use a broad definition of "model" to include any decision making process. But I used the word model to refer to probabilistic and quantitative models.

A few examples of topics where you really don't include any cost/benefit estimates in your decision (as opposed to strawman examples of INCORRECT cost/benefit use) would go a long way.

Sure. An example from my life is "I refrain from investing in the stock market because we do not understand how it works and it is too uncertain". I don't rely on cost benefit analysis in this case. It is more of qualitative analysis. I do not use cost benefit analysis because I am unable to quantify the expected utility I would derive from investing in the stock market. I do not have the necessary information to compute it.

Replying toWhen is it appropriate to use statistical models and probabilities for decision making ?

When is it appropriate to use statistical models and probabilities for decision making ?

it seems intuitively to me, someone who admittedly doesn't know much about the subject, like unknown unknowns could be accurately modeled most of the time with a long-tailed normal distribution

How fat tailed do you make it ? You said you use past extreme events to choose a distribution. But what if the past largest event is not the largest possible event ? What if the past does not predict the future in this case ?

You can say "the largest truck that ever crossed by bridge was 20 tons, therefore my bridge has to be able to sustain 20 tons" but that is a logical fallacy. The fact that the largest truck you ever saw was 20 tons does not mean a 30 ton truck could not come by one day. This amounts to saying "I have observed the queen of England for 600 days and she hasn't died in any of them, therefore the queen of England will never die".

When is it appropriate to use statistical models and probabilities for decision making ?

Younes Kamel

I enjoy reading rationalist and effective altruist blogs. Members of the community usually back their arguments with data and/or other evidence and tend to be scientifically literate. And although no one is a perfect, detached belief updater, I found that rationalists are probably the community that is closest to that ideal. However, I believe this community often commits a fallacy while attempting to think rationally. It is the fallacy of applying cost benefit analysis under deep uncertainty. What is deep uncertainty ? I found a great definition in the textbook Decision making under deep uncertainty by Marchau et al., so I’m simply going to quote it :

Complete certainty is the situation in

... (read 1171 more words →)

Replying toBest LessWrong posts for understanding scientific studies.

Younes KamelMar 10, 2022*

Best LessWrong posts for understanding scientific studies.

I wrote a post summarizing misuses of statistics here. You can read that if you want a short version. If you want to learn to evaluate studies and gauge their rigor, then read Inuitive biostatistics by Harvey Motuslsky and Statistics done wrong by Alex Reinhart. These were my main sources for my post. After reading them you should have a good understanding of statistics intuitively, without necessarily knowing the math. If you have to read only one, then definitely go for Intuitive biostatistics. It includes perhaps 90% of the content of the other book and more, because Statistics done wrong assumes you've taken at least an introductory class in statistics, but Intuitive... (read more)

Replying toWhat we know about machine learning's replication crisis

What we know about machine learning's replication crisis

Perhaps the most important takeaway from our study is hidden in plain sight: the field is in danger of being drowned by noise. Different optimizers exhibit a surprisingly similar performance distribution compared to a single method that is re-tuned or simply re-run with different random seeds. It is thus questionable how much insight the development of new methods yields, at least if they are conceptually and functionally close to the existing population.

This is from the author's conclusion. They do also acknowledge that a couple optimizers seem to be better than others across tasks and datasets, and I agree with them (and with you if that's your point). But most optimizers do not... (read more)

Replying toWhat we know about machine learning's replication crisis

What we know about machine learning's replication crisis

You're right, I should have written "but it turns out most of them could be beaten by the untuned version of several competitors on the 5 five datasets", as one can see in the figures. Thank you for pointing it out, I'll edit the post.

What we know about machine learning's replication crisis

Younes Kamel

The replication crisis is a a phenomenon that plagues modern day scientific research. And contrary to popular belief, it has being going on for a long time. The incentive structure within academia seems to reward quantity of publications over quality. This leads to questionable research practices which in turn lead to scientific journals containing false positive results.

This crisis is particularly strong in the social sciences where according to some estimates as much as 50% of published research is not replicable. The crisis is so prevalent in psychology that some psycholgists such as Daniel Lakens or Ulrich Schimmack made replication their main research interest.

But this is a blog about AI and machine learning,... (read 1775 more words →)

I'm not as versed in mistakes of meta-analysis yet, but I'm working on it ! Once I compile enough meta-analysis misuses I will add them to the post. Here is one that's pretty interesting :

https://crystalprisonzone.blogspot.com/2016/07/the-failure-of-fail-safe-n.html

Many studies still use fail-safe N to account for publication bias when it has been shown to be invalid. If you see a study that uses it you can act as if they did not account for publication bias at all.

100% agree with defaulting to non-gaussian distribution. That is what rigorous statistics would look like imo.

I'm starting to realize that as well. It can give you the intuition without having to memorize theorems. I think I'm going to start using simulations a lot more.

Yes, for sure. You can still fall for selective skepticism where you scrutinize studies you "like" much more than studies you don't like. You can deal with that by systematically applying the same checklist to every study you read, but that might be time consuming. The real solution is probably a community that is versed in statistics and that have open debates on the quality of studies, perhaps cumulatively, biases will cancel each other if the community has enough diversity of thought. Hence the value of pluralism.