I originally heard this point made by Ben Pace in Episode 126 of the Bayesian Conspiracy Podcast. Ben claimed that he learned it from the book How to Measure Anything, but when I tracked down what I think is the relevant section, the point wasn't made there explicitly.
Suppose that I came up to you and asked you for a 90% confidence interval for the weight of a wazlot. I'm guessing you would not really know where to start. However, suppose that I randomly sampled a wazlot and told you it weighed 142 grams. I'm guessing you would now have a much better idea of your 90% confidence interval (although you still wouldn't have that good a guess at the width).
In general, if you are very ignorant about something, the first instance of that thing tells you what domain you're operating in. If you have no idea how much something weighs, knowing the weight of one instance tells you what the reasonable orders of magnitude are. Things that sometimes weigh 142 grams don't typically also sometimes weigh 12 solar masses. Similarly, things that take 5 minutes don't typically also take 5 days, and things that are 5 cm long aren't typically also 5 km long.
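To make the order-of-magnitude point concrete, here's a minimal Monte Carlo sketch. The model is entirely my own assumption, not anything from the podcast or the book: an unknown category of object has a typical weight spread across 12 orders of magnitude, and individual items vary around that typical weight by about a factor of 3.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (my assumptions): a category's typical log10-weight mu is
# uniform over 12 orders of magnitude (1 milligram to 1,000 tonnes),
# and individual items vary around mu with log10 std dev 0.5 (~3x).
N = 1_000_000
mu = rng.uniform(-3, 9, N)        # log10 grams, category-level
x1 = mu + rng.normal(0, 0.5, N)   # first sampled item from each category
x2 = mu + rng.normal(0, 0.5, N)   # second sampled item

def width_90(samples):
    # Width of a central 90% interval, in orders of magnitude.
    lo, hi = np.percentile(samples, [5, 95])
    return hi - lo

# Before any data: the 90% interval for an item's weight spans ~11
# orders of magnitude.
print(width_90(x2))

# After seeing one item weigh 142 g: keep only the simulated categories
# whose first sample came out near log10(142), then look at the second.
seen = np.abs(x1 - np.log10(142)) < 0.01
print(width_90(x2[seen]))          # now ~2 orders of magnitude
```

Under these assumptions, conditioning on a single 142-gram observation cuts the 90% interval from roughly eleven orders of magnitude down to roughly two; you still don't know the width well, but you know the domain.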
For more abstract concepts, having a single sample allows you to locate the concept in concept space by anchoring it to thing space. "Redness" cannot be properly understood until it is known that "apples are red". "Functions" are incomprehensible until you know "adding one to a number" is a function. "Resources" are vague until you learn that "money is a resource".
In reality, the first sample often gives you more information than a random sample. If I ask a friend for an example of a snack, they're not going to randomly sample a snack and tell me about it; they're probably going to pick a snack that is at the center of the space of all snacks, like potato chips.
From an information-theoretic perspective, the expected amount of information gained from the first sample must be the highest. If the samples are drawn independently from the same distribution, the 2nd sample is expected to be more predictable given knowledge of the first. There is some chance that the first sample is misleading, but the more misleading a sample would be, the less likely you are to draw it, so in expectation the first sample isn't misleading. If you're very ignorant, your best guess for the mean of a distribution is pretty close to the mean of the samples you have, even if you only have one.
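Here's a sketch of that claim in the standard conjugate Gaussian setting; the model and the parameter values are my choices for illustration, not something the argument depends on. The unknown mean theta has a broad prior, and each sample X_i is theta plus noise. The expected information each successive sample carries about theta, I(theta; X_{n+1} | X_1..X_n) = 0.5 * ln(predictive variance / noise variance), is largest for the first sample and drops off sharply:

```python
import numpy as np

# Conjugate Gaussian model (my choice of illustration):
# theta ~ N(0, tau2) is the unknown mean; samples X_i ~ N(theta, sigma2), i.i.d.
tau2, sigma2 = 100.0, 1.0   # very ignorant prior: tau2 >> sigma2

for n in range(6):
    # Posterior variance of theta after the first n samples,
    # and the predictive variance of sample n+1 given those n samples.
    post_var = 1.0 / (1.0 / tau2 + n / sigma2)
    pred_var = sigma2 + post_var
    # Expected information (in nats) the (n+1)-th sample carries about theta:
    # I(theta; X_{n+1} | X_1..X_n) = 0.5 * ln(pred_var / sigma2)
    info = 0.5 * np.log(pred_var / sigma2)
    print(f"sample {n+1}: expected information {info:.3f} nats")
```

With these numbers, the first sample is worth about 2.3 nats and the second only about 0.34; equivalently, the predictive variance collapses from 101 to about 2 after one observation, which is the "the 2nd sample is more predictable" claim in miniature. The monotone decrease isn't special to the Gaussian case: for any exchangeable sequence, conditioning on more past samples can only reduce the predictive entropy of the next one.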
This is one perspective on why asking for examples is so powerful: an example is often your first sample, and the first sample contains the most information.
Related: I got two master's degrees at midlife, after doing other stuff. I also moved back to the USA during that time and found it useful to learn a lot of little things I never needed to think about in Taiwan, like how to fix a car. So, having learned a handful of new skills in the past eight years or so, from car repair to calculus, my general heuristic is: doing something independently from beginning to end, fixing the problems along the way, teaches you about 50% of the knowledge the first time. Doing it 2-3 times gets you to 75%; 3-5 times gets you to 90%. Past the 90% mark, you spend the rest of your life making small improvements in the last 10% of the knowledge.
Basically, you don't need to work through that many Taylor series to see the pattern and grok what's going on (while improving your understanding of polynomial representations of functions and starting to build intuitions for when other approaches are used). You don't need to swap the motor mounts on that many cars to basically get it (and you'll have learned, frankly, a lot about similar kinds of car work). Etc.
That first time, maybe the second in some cases, is the biggest lift and the biggest learning.