I think this is the Sanderson post: https://wob.coppermind.net/events/529/#e16670
For personal relationships, mitigating my worst days has been more important than improving the average.
For work, all that's really mattered is my really good days, and it's been more productive to invest time in having more great days, or using them well, than to bother with even the average days.
I really enjoyed this study. I wish it weren't so darn expensive, because I would love to see a dozen variations of this.
I still think I'm more productive with LLMs since Claude Code + Opus 4.0 (and have reasonably strong data points), but this does push me further in the direction of using LLMs only surgically rather than for everything, and towards recommending relatively restricted LLM use at my company.
It's really useful to ask the simple question "what tests could have caught the most costly bugs we've had?"
At one job, our code had a lot of math, and the worst bugs were when our data pipelines ran without crashing but produced the wrong numbers, sometimes due to weird stuff like "a bug in our vendor's code caused them to send us numbers denominated in pounds instead of dollars". That's pretty hard to catch with unit tests, so we ended up adding a layer of statistical checks that ran every hour or so and raised an alert if anything looked anomalous. Those alerts probably saved us more money than all our other tests combined.
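A minimal sketch of what one of those checks might look like, assuming the pipeline emits a periodic total and you keep a short history to compare against (the names, numbers, and threshold here are all hypothetical):

```python
import statistics

def looks_anomalous(history: list[float], latest: float, z_threshold: float = 4.0) -> bool:
    """Flag the latest value if it sits too many standard deviations
    from the historical mean. Needs at least two historical points."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > z_threshold

# Hypothetical usage: recent hourly totals from the pipeline.
history = [10_250.0, 9_980.0, 10_400.0, 10_100.0, 9_875.0]
latest = 22_300.0  # e.g. a vendor bug changed the denomination upstream
if looks_anomalous(history, latest):
    print("ALERT: latest pipeline output looks anomalous")
```

Real checks would be fancier (seasonality, per-metric thresholds, etc.), but even a crude z-score catches the "numbers are wrong by a large factor" class of bug that unit tests miss.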
There was a serious bug in this post that invalidated the results, so I took it down for a while. The bug has now been fixed and the posted results should be correct.
One sort-of counterexample would be The Unreasonable Effectiveness of Mathematics in the Natural Sciences, where a lot of math has been surprisingly accurate even when the assumptions were violated.
The Mathematical Theory of Communication by Shannon and Weaver. It's an extended version of Shannon's original paper that established Information Theory, with some extra explanations and background. 144 pages.
Atiyah & Macdonald's Introduction to Commutative Algebra fits. It's 125 pages long, and it's possible to do all the exercises in 2-3 weeks – I did them over winter break in preparation for a course.
Lang's Algebra and Eisenbud's Commutative Algebra are both supersets of Atiyah & Macdonald; I've studied each of those as well and thought A&M was significantly better.
Unfortunately, I think it isn't very compatible with the way management works at most companies. Normally there's pressure to get your tickets done quickly, which leaves less time for "refactor as you go".
I've heard this a lot, but I've worked at 8 companies so far, and none of them have had this kind of time pressure. Is there a specific industry or location where this is more common?
When I write code, I try to make most of it data-to-data transformations xor code that only takes in a piece of data and produces some effect (such as writing to a database). This significantly narrows the search space for a lot of bugs: either the data is wrong, or the do-things-with-data code is wrong.
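A tiny sketch of the split, with hypothetical names (the db handle is assumed to be something like a sqlite3 connection):

```python
from dataclasses import dataclass

@dataclass
class Order:
    price_cents: int
    quantity: int

# Data-to-data: pure, deterministic, trivially unit-testable.
def order_total_cents(orders: list[Order]) -> int:
    return sum(o.price_cents * o.quantity for o in orders)

# Data-to-effect: takes an already-computed value and performs
# exactly one side effect, with no business logic of its own.
def write_total(db, total_cents: int) -> None:
    db.execute("INSERT INTO totals (cents) VALUES (?)", (total_cents,))
```

When a total comes out wrong, you check the pure function; when a row is missing, you check the effect. The two failure modes don't overlap.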
There are a lot of tricks in this reference class, where you try to structure your code to constrain the spaces where possible bugs can appear. Another example: when dealing with concurrency/parallelism, write the majority of your functions to operate on a single thread. Then have a separate piece of logic that coordinates workers/parallelism/etc. This is much easier to deal with than code that mixes parallelism and nontrivial logic.
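A sketch of that second pattern, with hypothetical record shapes: the per-item work knows nothing about threads, and all the parallelism lives in one coordinator.

```python
from concurrent.futures import ThreadPoolExecutor

# Single-threaded and pure: easy to test and reason about in isolation.
def process_record(record: dict) -> dict:
    return {**record, "total": record["price"] * record["quantity"]}

# The only place that knows about workers or concurrency.
def process_all(records: list[dict], workers: int = 8) -> list[dict]:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_record, records))
```

If something deadlocks or races, you know it's in the coordinator; if an output is wrong, you know it's in the pure function.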
Based on what you described, writing code that constrains the bug surface area to begin with sounds like the next step – and, relatedly, figuring out places where your codebase already does that, or places where it doesn't but really should.