tl;dr (the long thing contains all the babble; only included because it seemed low cost, don't recommend reading it):
Journal
(The >! thing didn't work in markdown, so I did a quick edit to change the comment to a LessWrong Docs comment, where I was confident the spoiler tags would work. I'll look up how to do it in markdown later.)
and, thanks and congrats!
I couldn't quite tell where your journaling ended and your final takeaways began. I'm happy to read through the whole thing, but you might want to edit for clarity for the benefit of others.
OK, a shot at Challenge I, with Poof and Foop, Steam Locomotive, and Expansion of Nothing. Felt like all three are in the sweet spot. I personally dislike Expansion of Nothing.
Poof and Foop:
The problem statement is a bit leading: there's some kind of inversion symmetry relationship between the two cases, so it should go the opposite direction, right?
Initially, definitely. The puncture means that there's less pressure on the right side—instead of colliding with the can, some particles go inside.
But those particles end up colliding with the interior left side anyway. So it seems like it should even out, and at the end the can won't be moving.
So my guess is (c). Can I make myself more confident?
Why doesn't an inversion argument go through? Well, the compressed air can is drawn in vacuum, but the vacuum can doesn't empty the environment.
So it's not simply time reversal. If the compressed air can were in air, then we might have some kind of symmetry between air particle and absence of air particle,
but then the compressed air can would slow down due to drag and stop in the limit. So that still points to (c). That also works as a thermodynamic argument—the first can isn't equilibrating with anything, so none of the work goes to heat. 95% confidence feels good.
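A rough way to write the asymmetry down (my own bookkeeping, not the book's): in the rocket case the exhaust keeps its momentum forever, while in the vacuum case every parcel of air that enters eventually slams into the far interior wall and hands its momentum back to the can.

$$p_{\text{can}} = -p_{\text{exhaust}} \neq 0 \quad \text{(compressed can in vacuum: the exhaust never returns)}$$

$$J_{\text{can}} = \underbrace{J_{\text{unbalanced pressure}}}_{\text{push toward the hole}} - \underbrace{J_{\text{far wall}}}_{\text{inrushing air stopping}} \approx 0 \quad \text{(vacuum can in air)}$$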
*checks* OK, looks like I was thinking about it right, and my explanation for why the naive inversion is wrong is equivalent to sketch II.
Reflection: The main interesting thing here is the fake symmetry argument. My favorite problems have tempting solutions that don't work for subtle reasons. I think it's important not to count problems solved until you can pinpoint why those solutions fail.
What did I use here? If you're dealing with pressure, you can probably get an answer with forces or with thermodynamics. A net force can be thought of as a single force or as lack of a balancing force. That's the symmetry idea.
I'm not very good at babbling. I'm basically looking over what I wrote and renarrating it. Words to words.
Steam Locomotive:
We might want to think about torque and the height of the axle.
Or maybe it's about wheel radius. One cycle takes you further with bigger wheels.
I think these both point to (b).
I'm a little confused because thinking about the wheel heights of sports cars and trucks would push me towards (a). But cars have gears. Directly driving small wheels is basically low gear.
Not sure how I'd know if the answer were (c) or (d). Seems like you'd need background knowledge not in the question.
I should think about actual forces to get to 95% confidence.
Let's say the engine puts out the same force in both cases. Then, in II, each wheel sees half as much force from the engine,
but the ground exerts force on twice as many wheels, so that part's a wash. But because the wheels are smaller, the ground
needs to exert more force per unit engine force to keep the wheel from slipping (same torque).
So for the same engine, II seems to give more accelerating force, while I gives higher top speed. I'd put 95% on (b).
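To put symbols on that (my notation, not the book's): if the pistons deliver torque $\tau$ to a driving axle with wheel radius $r$, and the maximum piston cycling rate fixes a maximum wheel angular speed $\omega_{\max}$, then

$$F_{\text{drive}} = \frac{\tau}{r}, \qquad v_{\text{top}} = \omega_{\max}\, r.$$

Halving $r$ doubles the driving force at the same torque but halves the top speed, which is exactly the low-gear tradeoff.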
*checks* OK, seems like I had the right thought. Could I have been as confident from the distance-per-cycle argument alone? Rather than look at forces,
the author's answer argues that we know the locomotive that goes a shorter distance in the same number of engine cycles must
be putting more energy into each mile it travels. I considered that, but I wasn't sure it was a valid step.
Why couldn't you just be getting less work from the engine? Well, it's the same piston with the same motion.
My force calculation already needs that assumption, it just makes the final connection with the acceleration.
Reflection: I feel like I don't know much about automotives. (Is a locomotive an automotive, by the way? I think so, it's just locomotives involve a track.) I can describe transmission and gears and engines and so on if I think about it, but I don't have much intuition. Like, I can't explain why it's one way and not another, or how different cars solve different problems.
I just feel like I should have been able to answer the question immediately. If I could drive stick, would that help? Probably not. I already ride a bike and didn't immediately see the analogy.
What did I use? Qualitative personal experience. I picked a misleading experience but reasonably didn't weight it above thinking through the problem. Identifying relevant considerations. Didn't stop at the first idea.
Expansion of Nothing:
Oh, this one's nasty. It has to expand, right?
If you took an iron disk and drew a circle where the hole is, the circle would expand.
If you cut that disk out and heat up the cutout, the disk expands the same amount.
So everything outside the circle can't be exerting any net force at the boundary, and the hole has to expand exactly as the heated cutout does.
I don't see any problems with this argument, but can I explain why other arguments don't work? Why can't thermal expansion generate stress instead of allowing uniform expansion? I guess in a sense I just gave the reason, but why does the gap shrink if you cut a gap in a rod instead? Well, when you have only one piece, it's like applying a magnification transformation, which requires an origin. But the origin is arbitrary—you can just recenter. With two separate pieces, the two origins of magnification are no longer arbitrary.
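For concreteness, the magnification map for uniform linear thermal expansion (standard result, coefficient $\alpha$): heating by $\Delta T$ sends every material point

$$\mathbf{r} \;\mapsto\; \mathbf{r}_0 + (1+\alpha\,\Delta T)\,(\mathbf{r}-\mathbf{r}_0),$$

and changing the origin $\mathbf{r}_0$ only adds a rigid translation. Every distance between material points, the hole's diameter included, gets multiplied by $(1+\alpha\,\Delta T)$, so the hole grows.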
*checks* Yeah, the author's answer doesn't go there, unfortunately.
Reflection: This problem feels really annoying to me. Maybe I saw it a long time ago and got it wrong? Or maybe it's that you never have anything that's free to expand uniformly. It's braced against something, or it's sitting on something with a different coefficient of thermal expansion, and you do get stress and it does matter how the thing is patterned.
This feels like a problem where you're supposed to think about limiting cases. Like, if you have an atomic ring, obviously it expands. I don't know if you can justify jumping to the right answer from that, though. If the disk is thick and the cutout doesn't go all the way through, it expands. Ehh. You still need an argument that it expands the same.
I actually created a doc where people can add their own confusions and answers for Expansion of Nothing: https://docs.google.com/document/d/1cleM-QuO9R9_jRqDZMMKzobpWcf-k9KHBe91fUMWhuQ/edit
I'll edit it into the OP.
I'd also add: a TODO item on my list is to make my own followup question for Expansion of Nothing that presents rings of different materials (e.g. a ring of water, a ring of jelly, a ring of concrete), and asks "in any of these cases, do you get a different answer than the Iron Ring?" I don't actually know.
My answer to "Designing a training regimen."
I recently spent ~2 weeks on this. I iterated on the approach over time, and didn't really try this "design training" exercise at the beginning.
My starting approach was the "aim for 95% confidence" (now listed as a requirement in the OP), based on receiving that advice from a friend and finding the general idea compelling. Initially I aimed at always giving myself at least a full day to answer a question. I eventually came back to this, but pretty quickly decided it wasn't actually the right approach.
I ended up with a separation between "training" and "testing." During training, I'm optimizing for learning quickly. This can include erring more towards looking up answers, working with partners, etc.
During testing, I focused on evaluating whether I-specifically-learned-things, so I didn't talk to friends about my thought process much to avoid spoilers. And I gave myself a very long time (sometimes spending more than a full day on each question).
I was experimenting with workshops throughout this time, and a lot of my effort ended up going towards managing other people and making sure they were having a good time. One of the things I'd go back-in-time and tell myself is "don't try to mix large workshops and doing-it-myself. Invite friends to partner with, but focus on a few people you know well."
One major update was that I shouldn't just be trying to get the right answer; I should be trying to identify the explanation the author was primarily aiming at. (Sometimes the author's explanations are confusing or incomplete, but I think "generate lots of relevant explanations, at least one of which was the one the author generated" still seems useful for making sure I actually modeled the situation well.)
I figured out partway through the process that I should be optimizing for "learning as much as I could from each question", and that suggested a followup strategy of "choose problems that feel like I will learn a lot from". (With the most obvious implication being 'not too easy or too hard', and a trickier implication being 'requires skills that I'd still benefit from focusing on improving')
One of the biggest problems was setting aside time to do it at all. This is a lot of cognitive work. I ultimately found I could only do this for a few hours a day and still was pretty exhausted in the evening. I think it's relatively achievable to set aside one weekend for this but the amount of time necessary to vet "you have meaningfully improved" is pretty expensive.
I was lucky to be able to take a 2 week break where I was professionally focused on this. If rationality weren't part of my Day Job, and I couldn't take a vacation for it, I think my approach would be to allocate one weekend-day each week towards this for a few weekends (aiming to look up the answer after an hour per question). And then, for testing... well, this feels fairly tricky. An obvious answer is just... keep allocating weekend time to it. This feels like it'd take a long time. Hrmm.
It'd be easier if "people's ability to solve Thinking Physics problems" was better studied, and it was, say, known that some given exercises generally take an average undergrad 2 hours to deconfuse themselves on. (Then, you set yourself a 2 hour timer and submit your best answer when you're done, rather than potentially spending days on it doublechecking yourself).
I think, for the immediate future, "take as long as you want to thoroughly understand the scenario" is a better test of thinking-skill for people doing openended research, and the fact is that it mostly makes sense to do this if you're actually already planning to invest years in openended research with poor feedback loops.
I tried doing these exercises in my rationality group this week with 5 other people. Since we did this as part of our regular meetup, doing 1h for a single question would have taken too long (we could have done 2 questions max). Instead, we did 4 exercises in ~90 min (steam locomotive, poof and foop, expansion of nothing, rare air). We started out with relatively strong physics background (everyone knowing mechanics), so I think that wasn't too hasty, except perhaps for the reflection part. I gave people the first 5 minutes to think for themselves.
Inspired by this idea from Alex Turner's shortform, I tried to figure out which facts were true and which were fiction, based on prompting GPT-4 to mess with a Wikipedia article on Developmental Psychology. (First I let GPT-4 munch on a big chunk of the article, and then I chose the first chunk I saw that contained lots of concrete claims.)
Credences are 0% if the claim is false, and 100% if the text written by GPT-4 is true/reflects the original article. Outcomes are on the line afterwards. Written more as personal notes (very rough).
Vision is sharper in infants than in older children.
Infant sight tends to remain stable with little improvement over time.
Color perception is limited in the first year, with infants primarily seeing in shades of gray [79]. Infants only begin to develop adult-like vision at about twelve months.[72]
Hearing is still evolving at the time of birth.
Newborns show no distinct preference for human speech over other sounds, and they can't distinguish their mother's voice from others'.
The belief that these features are learned in the womb has been debunked.
By 18 months, infants' hearing ability is still not on par with adults.
Smell and taste are rudimentary, with infants often unable to distinguish between pleasant and unpleasant odors and tastes
Newborns do not show a clear preference for the smell of human milk over that of formula.[72]: 150 Older infants, interestingly, do not show a preference for their mother's scent.[79] Human milk over formula? Seems like that could go either way with underpowered studies? 55%
Touch and feel, while being one of the first senses to develop in the womb, are not as refined in infants as previously thought.[84] This contradicts the idea of primitive reflexes, which were believed to demonstrate advanced touch capabilities.
Pain perception in infants is believed to be less intense than in older children, indicating that they may not feel pain as acutely.
There is also no substantial evidence that glucose can relieve pain in newborns.[87]
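If anyone wants to score a batch of these afterwards, a Brier score is the obvious one-number summary. A minimal sketch (my own bookkeeping; the claim text and outcomes below are placeholders):

```python
# Brier score over binary claims: mean squared gap between credence and truth.
# 0.0 is perfect; always answering 50% scores 0.25.
claims = [
    # (claim, credence in [0, 1], True if GPT-4's text matched the article)
    ("human milk preferred over formula", 0.55, False),  # placeholder outcome
]

brier = sum((credence - float(truth)) ** 2 for _, credence, truth in claims) / len(claims)
print(f"Brier score: {brier:.3f}")
```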
The Amazon link in this post is for Thinking Physics: Understandable Practical Reality. I also found Thinking Physics: Practical Lessons in Critical Thinking and Thinking Physics Is Gedanken Physics.
AFAICT, these are just different editions of the same book, but I couldn't determine what the best or latest edition is. To save people the same Googling that I did, Archive.org has a version available online here, and the Harvard Book Store sells a paperback copy in stock here for $34. (Amazon doesn't appear to actually have any edition for sale at a reasonable price.)
The Amazon link in the post is for the third (and latest) edition, only $28. Your other links are for the second edition, except the Harvard link's dead.
Find different sets of exercises that are as different as possible from Thinking Physics (i.e. requiring a pretty different set of skills, while still feeling relevant to becoming a "generalist researcher"), that would make for a good followup to this exercise.
I think my idea of investigating a recent (alleged) poker cheating scandal is a good exercise in this vein. It's certainly very different from Thinking Physics problems.
The main objections people had when I posted it were that it requires either already having or quickly absorbing a lot of background knowledge about the rules of poker and norms in the high stakes poker scene as a prerequisite, and that there is no way to know if you got the answer right. I continue to think these are not fatal flaws, and that if you're willing to invest some hours in learning the relevant background (which is itself a good rationality skill to practice, especially if you try to do it under time pressure), the payoff in the quality of the mystery is worth it.
There are a myriad of plausible competing hypotheses and piles of publicly available (but somewhat complex-to-think-about) evidence that make this a good test of your ability to make Bayesian updates about a real world situation. Also, the fact that there is no public consensus is actually a benefit in some ways - the exercise is un-spoilable, and you can research freely without fear of accidentally running into a consensus-accepted definitive conclusion.
Looking into other unsolved mysteries (e.g. murder mysteries, heists, or other famous cold cases) might provide a similar kind of challenge, and if you compile enough cases you could form a set of exercises in the "mystery solving" genre. But it can be hard to find suitable candidates with lots of publicly available evidence of different types, especially cases that still have multiple competing hypotheses and no clear / trivially obvious conclusion. Essentially, you want something that is actually unsolved (not just legally unsolved), but still interesting and not a total dead end due to lack of evidence. I haven't personally looked into it much, but the JonBenét Ramsey case (warning: gruesome murder / CSA case) comes to mind as one possibility that might suit.
I'm not sure how good this particular exercise is (hard to evaluate without having done it, and the comments in the other post seem to have some good points) but I do like the general idea.
Eliezer's Class Project has a fictional group of rationality students try to find the true theory of quantum gravity in one month. This always seemed like a cool goal and test for rationality training to aspire to. If you're not solving difficult open problems faster than science, your Art of Rationality probably isn't complete.
It's good for intelligent people to be audaciously ambitious. But is Art of Rationality enough to figure out quantum gravity, or solve "difficult open problems" in the sciences? If not, could you comment on what else is needed?
I mean, depends how you're defining art of rationality. I think it'll usually require some kind of domain expertise and skills in the relevant open problems. I also think "rationality" would be important for figuring out what skills to gain, and figuring out how to learn them as quickly as possible, if you were starting from scratch.
As for "is this possible?", well, I'm not sure. This post is part of sequence (and a possible longterm research project) aimed at figuring out the answer.
i'm enjoying this. going through the questions right now, might do all of them
had a notable experience with one of the early questions:
question: "The battery output voltage, the bottle volume, the digital clock time, and the measure of weight (12 volts; one gallon; 12:36; 1 lb) all have something in common. It is that they are represented by a) one number b) more than one number."
recollected thought process: apart from the clock time, they all have one number. the time on the clock is also, in my opinion, represented by one number in a non base-n numeral system - the symbols update predictably when the value is incremented, which is all that's required. i'm not sure if the author intends that interpretation of the clock, though. let's look for other interpretations.
"lb" - this is a pointer to formulas related to weight/gravity (or more fundamentally, a pointer back to physics/the world). "1 lb" means "1 is the value to pass as the weight variable". a formula is not itself a number, but can contain them. maybe this is why the clock is included - most would probably consider it to contain two numbers, which would force them to think about how these other three could be 'more than one number' as well.
(though it's down to interpretation, i'll choose b) more than one number.)
the listed answer is: a) one number. "Each is represented by only one number - the battery by 12 volts, the bottle by one gallon, the time by 12:36 and the weight by one pound. Things described by one number are called scalars. For example: on a scale of one to ten, how do you rate this teacher?" it just restates them and implies in passing that 12:36 is one number, without deriving any insight from the question. *feels disappointed*. (i guess they just wanted to introduce a definition)
FYI, I remember being vaguely dissatisfied with the early exercises in the book, and recommend skipping ahead to somewhere in the middle of the first half.
I only ever flipped through Thinking Physics for fun, but what I remember is that I tended to miss easier problems more often. If I spent time thinking about one, really making sure I got it right, I'd probably get it. Outside those, there were some that really were elementary, but I'd often find myself thinking I'd looked at the author's answer too soon—a self-serving "well, I would have gotten this, if I were really trying." I might say the problem was that I couldn't tell when I needed to really try.
This does remind me a bit of how I studied for the physics GRE (do people still take that?), particularly getting calibrated on multiple-choice confidence and on how long to spend on problems. Unfortunately, but perhaps not surprisingly, very little of that study transferred to my PhD experience.
I am interested in
For context if anyone needs it, the Physics GRE is (was?) a multiple-choice exam where you get penalized for wrong answers but not for blanks. It works out so that if you eliminate one answer there's no harm in guessing, in expectation. There's also considerable time pressure—something like 90 seconds per question on average.
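If I remember the scoring right (five choices, +1 for a correct answer, $-\tfrac{1}{4}$ for a wrong one), the arithmetic is:

$$\mathbb{E}[\text{blind guess}] = \tfrac{1}{5}(1) + \tfrac{4}{5}\left(-\tfrac{1}{4}\right) = 0, \qquad \mathbb{E}[\text{eliminate one, then guess}] = \tfrac{1}{4}(1) + \tfrac{3}{4}\left(-\tfrac{1}{4}\right) = \tfrac{1}{16} > 0.$$

So blind guessing is exactly break-even, and eliminating even one option makes guessing positive in expectation.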
how much deliberate effort you put into calibrating yourself on "how much effort to put into multiple choice questions"
Enough to get through all questions with some time left over, even if that meant guessing on some I could fully solve. I'd mark the questions I'd guessed on with different symbols that let me go back at the end and prioritize solving them. For three or so practice tests, I systematically went over every problem that I missed, guessed, or spent a long time on and did the metacognitive thing including questions like "how long did I think this would take? when was I 50% confident? when should I have decided to move on? how could I have decided faster?" (Using purely retrospective judgment—I wasn't actually timing individual questions or anything more complicated.)
whether you put any deliberate effort into transferring that into the PhD experience
Not really. I think I had some notion that being able to solve small problems quickly could lead to a sort of qualitatively better fluency, but in the end there just wasn't enough in common between test content/conditions and research (or even coursework) to prioritize that. I definitely didn't learn the lesson that I was generally underconfident.
what did you actually do in your PhD experience?
Pretty normal experimentalist route, maybe heavier on math and programming than typical. Coursework for 1-2 years shading into helping with senior students' experiments, then designing and running my own.
what do you think would have better prepared you for PhD experience?
In the end I was reasonably well prepared in terms of technical knowledge, problem solving, [meta]cognitive skills, and so on (irrespective of the GRE). I think I mostly lacked perspective, particularly in terms of choosing problems and working with a supervisor. I'd guess, starting with most helpful, one or more of these:
As far as things I could have done instead with the time I used to study, I don't know. Make friends with grad students?
I think it's important to note that, if you randomly solve Thinking Physics (or even make a decent breakthrough), then all the alignment researchers get to have it too.
Note: please write any answers to this prompt in spoiler-tags.
Recently I set out to do deliberate practice at "reasoning about confusing intellectual problems."
Eliezer's Class Project has a fictional group of rationality students try to find the true theory of quantum gravity in one month. This always seemed like a cool goal and test for rationality training to aspire to. If you're not solving difficult open problems faster than science, your Art of Rationality probably isn't complete.
Of course, our Art of Rationality isn't complete yet. But, I think there is something promising in this area, as a way to ground out "rationality training" in something concrete. It seems like good practice to take a given physics question you don't understand the theory behind, and try to invent the theory yourself.
I don't think we're anywhere close to the "definitively impressive" version of rationality practice/training. But, I think a good next step is "Solve Thinking Physics™"
Thinking Physics is a textbook that teaches physics "question-first." Each page presents a physics-y situation, and asks you to figure out what happens next. The questions are multiple choice, but often fairly tricky nonetheless.
I think a good rationalist-training goal is to aim for "be (correctly) 95% confident in the answer", as a rough proxy for "there were no major lingering confusions about the problem, except for a generic 'maybe I missed something?'". And, failing that, have the subgoal of at least being calibrated about how confused you are. Every time you look at an answer, first log your probabilities for each of the multiple-choice options in Fatebook.io (or the prediction-tracking tool of your choice).
The problems are set up in a way that you can probably reason about them from some basic background knowledge, without much math background. They're ideal for people who don't have much physics background (since the whole point of the book is to teach you physics), although I know people with some physics education who still find it fairly hard.
I spent two weeks working on Thinking Physics problems, and hosting meetups/workshops where other people could join me. With each question, I focused on learning as much as I could about how-to-think.
My original hypothesis was that I could get significantly better at it in 6-8 weeks. I only spent two, and the result so far is that I think I'm significantly better, although I didn't yet hit my goal of 95% accuracy. (In my final test-set, I got 1 out of 5 questions wrong, when I was aiming for zero. I do think I have a pretty clear sense of why I got that 1 question wrong, and what I should have done differently.)
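(Worth flagging when grading myself: even perfectly calibrated 95% confidence per question gives $0.95^5 \approx 0.77$, i.e. roughly a 1-in-4 chance of missing at least one of five questions, so a single miss is only weak evidence against the 95% target on its own.)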
After workshopping some ideas for "the Thinking Physics rationality challenge", I now present you with three tiers of challenge.
Challenge I: Solve three problems (and learn from them)
Step 1: Do an exercise.
Spend some time trying to solve three Thinking Physics questions. Aim for 95% accuracy, fully deconfusing yourself about each exercise.
Write down your probabilities for each answer.
It's important to actually write down the probability for each answer – otherwise, you may get a vague sense of "yeah, that's probably right" that doesn't let you cleanly say "I got this one wrong." And doing it for all the answers, not just your favorite one, gives you additional bits about whether your models made any sense. (i.e. having clearly stated "I think answer A is most likely and B is second most likely" gives you a harder update if it turns out that A and B were both wrong)
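As a concrete version of "additional bits", here's a minimal sketch of scoring yourself (not tied to Fatebook's actual API; the numbers are made up):

```python
import math

def surprisal_bits(probs: dict[str, float], correct: str) -> float:
    """Bits of surprise at the revealed answer, given your stated distribution.

    Lower is better; a "harder update" shows up as a bigger number.
    Assumes probs covers every option and sums to ~1.
    """
    return -math.log2(probs[correct])

# "A most likely, B second" -- and both turn out wrong:
probs = {"a": 0.55, "b": 0.30, "c": 0.10, "d": 0.05}
print(surprisal_bits(probs, "c"))  # ~3.3 bits: a hard update
print(surprisal_bits(probs, "a"))  # ~0.9 bits: roughly what you expected
```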
Step 2: Learn from it
Then, think about how you could have solved the problem better.
Your primary goal is to learn as much as possible from each question.
Babble as many new insights as you can about how to think. This can include explicit "strategies" (like "see if you can simplify the problem"), physiological things (like "I got tired and needed to take a break"), or psychological things ("something about this feels weirdly aversive and ughy, what's up with that?").
When you're done, submit your answer on this post for "what you learned." (Focus on your takeaways, not the object-level solution).
Overall structure
This is more fun with a partner, although I recommend spending a chunk of time thinking independently before sharing your answers and thought-processes with each other. You might find it helpful to get some friends together as a weekend activity.
I've found a fairly good default approach is to do:
How to pick exercises
The exercises vary in difficulty. My recommendation is to flip to a random page, weighted towards the beginning of the book. If it feels "difficult but not impossible", then give it a try.
If you're pretty confident you just know the answer, still try to come up with a clear explanation for why (but err on the side of checking the answer quickly rather than spending a lot of time doublechecking).
If you end up feeling stuck, try to give it at least 10 minutes before giving up and switching to a different problem. (In most cases, I found it valuable to give it a solid 20 minutes of independent thought + 20 minutes of conversation-with-partner even if I felt really stuck).
Some particular exercises that seemed reasonably good for people I beta-tested this with (which is not to say they were easy or hard, but that I feel like I/others learned from making a good-faith effort on them):
(Page numbers for the exercises vary between editions of the book, but you can look them up in the table of contents)
Submission guidelines
Put your answers in spoiler tags (begin each line with ">!"), although first list (unspoiler-tagged) that it was a Tier 1 challenge, the name of the exercises you did, and whether you give them each an overall thumbs up or thumbs-down as having been a good exercise.
Challenge II: Design a training regimen
After you've done 3 exercises and gotten a rough sense of their shape, develop a training regimen that helps you significantly improve at Thinking Physics exercises.
If you started out not being able to reliably solve them at all, get to the point where you can at least semi-reliably solve them, given enough time. (Suggested target: solve 5 random questions in a row without getting any wrong, without help)
If you started out able to semi-reliably get the right answers given a lot of time, aim for speed – can you solve 10 problems in a row, relatively quickly, and only get between 0-1 question wrong?
Submission guidelines
You can submit your training regimen before actually completing it (but flag whether you have actually employed it yet; if you end up actually doing the training regimen, I suggest replying later with any updates you made).
I think it's a fine use of this exercise to submit your training regimen, then read other people's suggested regimens to get more ideas before going off to actually do it.
Put your training description in spoiler-tags (although again list which challenge-tier you're doing in non-spoiler tags)
(Once you actually get started with the training, I recommend adjusting your approach as you learn more)
Challenge III: Fully "Solve" Thinking Physics
After you've significantly improved your skill-level, develop a thorough approach for solving Thinking Physics exercises, in generality. Write the instructions that would have helped past-you get to the point where you could solve them reliably and/or quickly.
(It's okay for this to include metagaming / psychologizing the author. This is "Solve 'Thinking Physics'", not "Solve 'Physics'")
Write your answer either as a spoiler-tagged comment here, or as a top-level post if it ends up feeling like a full essay (and then a quick comment here linking to it). Include a note about what concrete outcomes you achieved.
Bonus Challenge:
Find different sets of exercises that are as different as possible from Thinking Physics (i.e. requiring a pretty different set of skills, while still feeling relevant to becoming a "generalist researcher"), that would make for a good followup to this exercise.