A while back I read How to Measure Anything and found it fascinating. In my day job, I spend quite a bit of time trying to make sense of software systems by looking at dashboards of requests, latencies, error rates, and so on.

After finishing the book and taking copious notes, I realized that it gave me a prepackaged process I could apply as-is, but one that I found very difficult to adapt to everyday situations. In other words, I don't think I picked up a good intuition for stats.

I'm looking to change that. Specifically, I want to learn to apply stats in these two situations:

  • measuring things. Mostly software systems, but open to little experiments. Dan Luu used to measure a lot of fun things.
  • understanding how others measure things. I'd like to be able to judge if claims made in a paper about covid spread or social media addiction are backed up by the math/data in the paper.

The challenge I'm facing is that I know a bunch of techniques, but not how they relate to each other or the problems they're meant to solve. To illustrate what I mean: I know how to get percentiles and calculate means, but until this morning I didn't know why averaging percentiles is usually a bad idea. I'm missing the map.
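
To make the percentile point concrete, here's a minimal sketch (the latency numbers and traffic split are made up, not from any real system) showing that averaging two servers' p99s is generally not the p99 of the combined traffic:

```python
import numpy as np

# Made-up latency samples in milliseconds; the numbers are arbitrary.
rng = np.random.default_rng(0)
server_a = rng.exponential(scale=10, size=100_000)  # fast, carries most of the traffic
server_b = rng.exponential(scale=200, size=1_000)   # slow, carries a sliver of the traffic

p99_a = np.percentile(server_a, 99)
p99_b = np.percentile(server_b, 99)

naive = (p99_a + p99_b) / 2                                        # average of per-server p99s
pooled = np.percentile(np.concatenate([server_a, server_b]), 99)   # p99 over all requests

print(f"p99(A) = {p99_a:.0f} ms, p99(B) = {p99_b:.0f} ms")
print(f"average of the two p99s  = {naive:.0f} ms")
print(f"p99 of the pooled samples = {pooled:.0f} ms")
# The two summaries disagree because the percentile of merged traffic
# depends on how many requests each server contributes, which a plain
# average of the percentiles throws away.
```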

I've seen these books recommended as a good way to start:

  • Statistics, 4th Edition, by Freedman, Pisani, and Purves
  • Probability Theory: The Logic of Science, by Jaynes
  • An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements, by Taylor
  • Think Stats, by Downey

But I also wanted to ask someone familiar with the field:

  • Is it best to start with an introductory textbook and branch out from there?
  • Are there specific subfields / topics I should be focusing on (or avoiding)?
  • Is what I'm looking to learn labeled in some way? For example, I can't tell if this is data analytics or data science or X.

2 Answers

Derek M. Jones

Apr 20, 2023

31

I'm assuming you are interested in learning about something by measuring one or more of its attributes, and then using statistics to extract information from the measurements, i.e., you are interested in a hands-on application. In that case, books I found useful include:

Statistics for Experimenters, by Box, Hunter, and Hunter

Design and Analysis of Experiments, by Montgomery.

Thanks! This is really helpful--I think this is exactly what I'm trying to do.

Are these texts part of a specific academic track/degree or field of study? It sounds like something someone in engineering would spend a semester on, but also like something someone could spend a career studying.

mikes

Apr 21, 2023

22

Being able to accurately assess a paper's claims is, unfortunately, a very high bar. A large proportion of scientists fall short of it. See https://statmodeling.stat.columbia.edu/2022/03/05/statistics-is-hard-etc-again/

Most people with a strong intuition for statistics have taken courses in probability. It is foundational material for the discipline.

If you haven't taken a probability course, and if you're serious about wanting to learn stats well, I would strongly recommend starting there. I think Harvard's intro probability course is good and has free materials: https://projects.iq.harvard.edu/stat110/youtube

I've taught out of Freedman, but not the other texts. It's well written, but it is targeted at a math-phobic audience. A fine choice if you do not wish to embark on the long path.

Thanks! I'll look this over.

Out of curiosity,

"Most people with a strong intuition for statistics have taken courses in probability. It is foundational material for the discipline."

Do some people learn statistics without learning probability? Or, what's different for someone who learns only stats and not probability?

(I'm trying to grasp what shape/boundaries are at play between these two bodies of knowledge)

mikes · 1y · 3 points
Statistics is trying to "invert" what probability does.  Probability starts with a model, and then describes what will happen given the model's assumptions. Statistics goes the opposite direction: it is about using data to put limits on the set of reasonable/plausible models.  The logic is something like: "if the model had property X, then probability theory says I should have seen Y. But, NOT Y.  Therefore, NOT X." It's invoking probability to get the job done. Applying statistical techniques without understanding the probability models involved is like having a toolbox, without understanding why any of the tools work. It all goes fine until the tools fail (which happens often, and often silently) and then you're hosed.  You may fail to notice the problems entirely, or may have to outsource judgments to others with more experience.
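
To make that forward/inverse picture concrete, here is a toy coin-flip sketch (my own illustration, not something from the thread; the 65-heads count and the 0.01 cutoff are arbitrary choices):

```python
from scipy.stats import binom

n = 100          # number of coin flips
p_fair = 0.5     # the model's assumption X: "the coin is fair"

# Probability (forward direction): given the fair-coin model, counts far
# from 50 heads should be rare. For example, 65 or more heads:
tail_prob = binom.sf(64, n, p_fair)                       # P(heads >= 65 | fair coin)
print(f"P(>= 65 heads | fair coin) = {tail_prob:.4f}")    # roughly 0.002

# Statistics (inverse direction): we actually observed 65 heads.
# "If the coin were fair (X), I should have seen a count near 50 (Y).
#  But I did not see Y. Therefore, not X."
observed_heads = 65
if binom.sf(observed_heads - 1, n, p_fair) < 0.01:
    print("The fair-coin model is not a plausible account of this data.")
```

The cutoff is only there to make the "therefore, NOT X" step explicit; the point is the direction of the inference, not the particular threshold.
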
matto · 1y · 1 point
Thanks, this is incredibly useful. I think I understand enough to put together a curriculum to delve into this topic. Starting with the harvard course you recommended.
1 comment

Pattern-match the real problems or their parts to the problems in the textbook. That will help you figure out what to do.