Since Raemon's Thinking Physics exercise I've been toying with writing physics puzzles along those lines. (For fun, not because I'm aiming to write better exercise candidates.) If you assume an undergrad-level background and expand to modern physics and engineering there are interesting places you can go. I think a lot about noise and measurement, so that's where my mind has been. Maybe some baseline questions could look like the below? Curious to hear anyone's thoughts.
You're standing at one end of a grocery aisle. In your cart, you have a damped oscillator in a thermal bath, initially in equilibrium.
You push the cart, making sure it moves smoothly according to a prescribed velocity profile, and you bring it to a stop at the other end of the aisle. You then wait for the oscillator to reach equilibrium with its bath again.
The final temperature is
You're observing a particle undergo thermal motion in a fluid. It's continuously bombarded by fluid molecules that effectively subject the particle to a white noise force and velocity damping. You estimate that it tends to lose its momentum and change direction on a timescale of 1 millisecond.
You want to get some statistics on the particle's velocity. You know the average velocity is zero, but there will be some variance that depends on temperature. You recall that in equilibrium the particle should have velocity $v$ with probability proportional to the Boltzmann factor $e^{-mv^2/2k_BT}$, giving a root mean square thermal velocity $v_{th}$.
You calculate velocity by taking pairs of pictures at different times, then dividing the change in position by the time step. Your camera has an effectively instantaneous shutter speed.
In experiment 1, you use a time step of 0.1 milliseconds to measure velocity. In experiment 2, you use a time step of 10 milliseconds.
You collect distributions of measured velocities for each experiment, giving root mean square velocities $v_1$ and $v_2$, respectively. What do you find?
You're using an oscilloscope to measure the thermal noise voltage across a resistance $R$. Internally, the oscilloscope has a parallel input resistance $R_\text{in}$ and capacitance $C_\text{in}$, where the voltage on the capacitor is used to deflect electrons in a cathode ray tube to continuously draw a line on the screen proportional to the voltage over time.
The resistor and oscilloscope are at the same temperature. Is it possible to determine $R$ from the amplitude of the fluctuating voltage shown on the oscilloscope?
You've attached one end of a conductive molecule to an electrode. If the molecule bends by a certain distance at the other end, it touches another electrode, closing an electrical circuit. (You also have a third electrode where you can apply a voltage to actuate the switch.)
You're worried about the thermal bending motion of the molecule accidentally closing the circuit, causing an error. You calculate, using the Boltzmann distribution over the elastic potential energy in the molecule, that the probability of a thermal deformation of at least that distance is $10^{-9}$ (a single-tailed six-sigma deformation in a normal distribution where expected potential energy is $k_B T/2$), but you don't know how to use this information. You know that the bending motion has a natural frequency of 100 GHz with an energy decay timescale of 0.1 nanosecond, and that it behaves as an ideal harmonic oscillator in a thermal bath.
You're considering integrating this switch into a 1 GHz processor. What is the probability of an error in a 1 nanosecond clock cycle?
EDIT: added spoiler formatting
I'm going to guess 3. Reasoning: I'm sure right away that 1 and 2 are wrong. Reason: If you leave the thing sitting for long enough then obviously it's going to eventually fail. So 2 is wrong and 1 is even wronger. I'm also pretty sure that 5 is wrong. Something like 5 is true for the velocity (or rather, the estimated velocity based on measuring displacement after a given time) of a particle undergoing Brownian motion, but I don't think that's a good model for this situation. For one thing, on a small time-scale, Brownian velocities don't actually become infinite. Instead we see that they're caused by individual molecules bumping into the object, and all energies remain finite.
3 and 4 are both promising because they actually make use of the time-scales given in the problem. 4 seems wrong because if we imagined that the relaxation timescale was instead 1 second, then after looking at the position and velocity once, the system oscillates at that same amplitude for a very long time, and doesn't get any more tries to beat its previous score. Answer is 3 by elimination, and it also seems intuitive that the relaxation timescale is the one that counts how many tries you get (up to some constant factors).
This reasoning is basically right, but the answer ends up being 5 for a relatively mundane reason.
If the time-averaged potential energy is $k_B T/2$, so is the kinetic energy. Because damping is low, at some point in a cycle, you'll deterministically have the sum of the two in potential energy and nothing in kinetic energy. So you do have some variation getting averaged away.
More generally, while the relaxation timescale is the relevant timescale here, I also wanted to introduce an idea about very fast measurement events like the closing of the electrical circuit. If you have observables correlated on short timescales, then measurements faster than that won't necessarily follow expectations from naive equilibrium thinking.
Good point, I had briefly thought of this when answering, and it was the reason I mentioned constant factors in my comment. However, on closer inspection:
The "constant" factor is actually only nearly constant.
It turns out to be bigger than 10.
Explanation:
$10^{-9}$ is about 6 sigma. To generalize, let's say we have $n$ sigma, where $n$ is some decently large number so that the position-only Boltzmann distribution gives an extremely tiny probability of error.
So we have the following probability of error for the position-only Boltzmann distribution:
$$p_1 = \frac{1}{\sqrt{2\pi}} \int_n^\infty e^{-x^2/2}\,dx$$
Our toy model for this scenario is that rather than just sampling position, we jointly sample position and momentum, and then compute the amplitude. Equivalently, we sample position twice, and add the two samples in quadrature to get amplitude. This gives a probability of:
$$p_2 = \int_n^\infty r\,e^{-r^2/2}\,dr = e^{-n^2/2}$$
Since we took $n$ to be decently large, we can approximate the integrand in our expression for $p_1$ with an exponential distribution (basically, we Taylor expand the exponent):
$$p_1 \approx \frac{1}{\sqrt{2\pi}} \int_n^\infty e^{-n^2/2 - n(x-n)}\,dx = \frac{e^{-n^2/2}}{n\sqrt{2\pi}}$$
Result: $p_2$ is larger than $p_1$ by a factor of $n\sqrt{2\pi}$. While the $\sqrt{2\pi}$ is constant, $n$ grows (albeit very slowly) as the probability of error shrinks. Hence "nearly constant". For this problem, where $n = 6$, we get a factor of about 15, so probability $1.5 \times 10^{-8}$ per try.
Why is this worth thinking about? If we just sample at a single point in time, and consider only the position at that time, then we get the original $10^{-9}$ per try. This is wrong because momentum gets to oscillate and turn into displacement, as you've already pointed out. On the other hand, if we remember the equipartition theorem, then we might reason that since the variance of amplitude is twice the variance of position, the probability of error is massively amplified. We don't have to naturally get a 6 sigma displacement. We only need to get roughly a $6/\sqrt{2} \approx 4.2$ sigma displacement and wait for it to rotate into place. This is wrong because we're dealing with rare events here, and for the above scenario to work out, we actually need to simultaneously get a large displacement and a large momentum, both of which are rare and independent.
So it's quite interesting that the actual answer is in between, and comes, roughly speaking, from rotating the tail of the distribution around by a full circle of circumference $2\pi n$.
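For anyone who wants to check the size of that "nearly constant" factor numerically, here is a minimal sketch. It assumes the toy model above (position and momentum as independent standard normals in units of sigma, so the amplitude is Rayleigh distributed), and the function names are just illustrative choices of mine.

```python
import math


def position_tail(n: float) -> float:
    """One-sided probability that a standard normal variable exceeds n sigma."""
    return 0.5 * math.erfc(n / math.sqrt(2.0))


def amplitude_tail(n: float) -> float:
    """Probability that sqrt(x^2 + p^2) exceeds n for independent standard
    normal x and p (the tail of a Rayleigh distribution)."""
    return math.exp(-0.5 * n * n)


def asymptotic_factor(n: float) -> float:
    """Large-n approximation to amplitude_tail(n) / position_tail(n)."""
    return n * math.sqrt(2.0 * math.pi)


n = 6.0
exact_ratio = amplitude_tail(n) / position_tail(n)
print(f"position-only tail:  {position_tail(n):.3e}")
print(f"amplitude tail:      {amplitude_tail(n):.3e}")
print(f"exact ratio:         {exact_ratio:.2f}")
print(f"asymptotic estimate: {asymptotic_factor(n):.2f}")
```

The exact ratio comes out slightly above the asymptotic estimate, since the Taylor expansion of the exponent drops a term that makes the true Gaussian tail a bit thinner than the exponential approximation.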
Anyway, very cool and interesting question! Thanks for sharing it.
EDIT: added spoiler formatting
$v_{th}$ is the RMS instantaneous velocity. Taking pictures at intervals gives an averaged velocity, which is slower because the particle wastes some time going in different directions that cancel out. The experiment 1 result, $v_1$, is going to be near the instantaneous velocity, but still a little slower, since the velocity is still going to decay slightly, even over 1/10th of the decay time. The experiment 2 result, $v_2$, is going to be significantly slower. If we make the time step even longer than 10 ms, we expect the RMS velocity to go roughly as the inverse square root of the timestep. Anyway, the answer should be 3:
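To put rough numbers on this, here is a sketch under an assumption that isn't stated in the puzzle: that the velocity is an Ornstein-Uhlenbeck process whose autocorrelation decays as $e^{-t/\tau}$ with $\tau$ = 1 ms. In that model the mean square displacement over a time step $\Delta t$ is $2 v_{th}^2 \tau^2 (\Delta t/\tau - 1 + e^{-\Delta t/\tau})$, and dividing by $\Delta t^2$ gives the measured mean square velocity. The function name is mine.

```python
import math


def measured_rms_ratio(dt: float, tau: float) -> float:
    """Ratio of finite-difference RMS velocity to true RMS thermal velocity.

    Assumes an Ornstein-Uhlenbeck velocity process with correlation time tau,
    sampled by position snapshots a time dt apart.
    """
    r = dt / tau
    # r + expm1(-r) equals r - 1 + exp(-r), computed without cancellation error.
    return math.sqrt(2.0 * (r + math.expm1(-r))) / r


tau_ms = 1.0
for dt_ms in (0.1, 10.0, 1000.0):
    ratio = measured_rms_ratio(dt_ms, tau_ms)
    print(f"dt = {dt_ms:7.1f} ms -> measured RMS / thermal RMS = {ratio:.4f}")
```

For time steps much longer than $\tau$ the ratio tends to $\sqrt{2\tau/\Delta t}$, which is the inverse square root scaling mentioned above.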
I sometimes wonder how much we could learn from toy models of superhuman performance, in terms of what to expect from AI progress. I suspect the answer is "not much", but I figured I'd toss some thoughts out here, as much to discharge any need I feel to think about them further as to see if anyone has any good pointers here.
Like—when is performance about making super-smart moves, and when is it about consistently not blundering for as long as possible? My impression is that in Chess, something like "average centipawn loss" (according to some analysis engine) doesn't explain outcomes as well as "worst move per game". (I don't know the keywords to search for, but I relatedly found this neat paper which finds a power law for the difference between best and second-best moves in a position.) What does Go look like, in comparison?
How deep are games? What's the longest chain of players such that each consistently beats the next? How much comes from the game itself being "deep" versus the game being made up of many repeated small contests? (E.g., the longest chain for best-of-9 Chess is going to be about 3 times longer than that for Chess, if the assumptions behind the rating system hold. Or, another example, is Chess better thought of as Best-Of-30 ChessMove with Elo-like performance and rating per move, or perhaps as Best-Of-30 Don'tBlunder with binary performance per move?)
Where do ceilings come from? Are there diminishing returns on driving down blunder probabilities given fixed deep uncertainties or external randomness? Is there such a thing as "perfect play", and when can we tell if we're approaching it? (Like—maybe there's some theoretically-motivated power law that matches a rating distribution until some cutoff at the extreme tail?)
What do real-world "games" and "rating distributions" look like in this light?
Related would be some refactoring of Deception Chess.
When I think about what I'd expect to see in experiments like that, I get curious about a sort of "baseline" set of experiments without deception or even verbal explanations. When can I distinguish the better of two chess engines more efficiently than playing them against each other and looking at the win/loss record? How much does it help to see the engines' analyses over just observing moves?
How is this related? Well, how deep is Chess? Ratings range between, say, 800 and 3500, with 300 points being enough to distinguish players (human or computer) reasonably well. So we might say there are about 10 "levels" in practice, or that it has a rating depth of 10.
If Chess were Best-Of-30 ChessMove as described above, then ChessMove would have a rating depth a bit below 2 (just dividing by $\sqrt{30}$). In other words, we'd expect it to be very hard to ever distinguish any pair of engines off a single recommended move, and difficult with any number of isolated observations, given our own error-prone human evaluation. If it's closer to Best-Of-30 Don'tBlunder, it's a little more complicated: usually you can't tell the difference because there basically is none, but on rare pivotal moves it will be nearly as easy to tell as when looking at a whole game.
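The arithmetic here can be sketched as follows. This assumes the heuristic used above, that rating depth scales with the square root of the number of independent rounds, plus the standard Elo expected-score formula with its 400-point logistic scale. The function names are hypothetical.

```python
import math


def elo_expected_score(rating_gap: float) -> float:
    """Expected score for the stronger player under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def per_round_depth(game_depth: float, rounds: int) -> float:
    """Rating depth of one round if a game is best-of-`rounds` independent rounds.

    Assumes depth scales as sqrt(rounds), which is a heuristic, not a theorem.
    """
    return game_depth / math.sqrt(rounds)


chess_depth = 10.0  # "about 10 levels", as estimated in the text
print(f"expected score at a 300-point gap: {elo_expected_score(300.0):.3f}")
print(f"ChessMove depth (best-of-30):      {per_round_depth(chess_depth, 30):.2f}")
print(f"best-of-9 Chess depth:             {chess_depth * math.sqrt(9):.0f}")
```

A 300-point gap corresponds to the stronger player scoring about 85%, which is one way to read "enough to distinguish players reasonably well".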
The solo version of the experiment looks like this:
What I'd expect is that my ratings with pairs of advisors should be somewhere between my rating with the bad advisor and my rating with the good advisor. If I can successfully distinguish them, it's close to the latter. If I'm just guessing, it's close to the former (in the Don'tBlunder world) or to the midpoint (in the ChessMove world). I should have an easier time in sub-experiments B and C. Having a worse engine in the mix weighs me down relatively more (a) the closer the engines are to each other, and (b) the stronger both engines are compared to me.
The main question I'd hope might be answerable this way would be something like, "How do (a) and (b) trade off?" Which is easier to distinguish—1800 and 2100, or, say, 2700 and 3300? Will there be a ceiling beyond which I'm always just guessing? Might I tend to side with worse advisors because, being closer to my level, they agree with me?
It seems like we'd want some handle on these questions before asking how much worse outright deception can be.
(There's some trouble here because higher-ranked players are more likely to draw given a fixed rating difference. This itself is relatively Don'tBlunder-like, and it makes me wonder if it's possible to project how far our best engines are likely to be from perfect play. But it makes it harder to disentangle inability to draw distinctions in play above my level from "natural" indistinguishability. There are also more general issues in doing these experiments with computers—for example, weak engines tend to be weak in ways humans wouldn't be, and it's hard to calibrate ratings for superhuman play.)
(It might also be interesting to automate myself out of this experiment by choosing between recommendations using some simple scripted logic and evaluation by a relatively weak engine.)
Along the lines of what I wrote in the parent, even though I think there's potentially a related and fairly deep "worldview"-type crux (crux generator?) nearby when it comes to AI risk—are we in a ChessMove world or a Don'tBlunder world?—[sorry, these are terrible names, because actual Chess moves are more like Don'tBlunder, which is itself horribly ugly]—I'm not particularly motivated to do this experiment, because I don't think any possible answer on this level of metaphor would be informative enough to shift anyone on more important questions.