On the spectrum from stochastic parrot to general reasoner I'm at 70%. We're definitely closer to a general reasoner than a parrot.
I don't have a clear answer as to what I expect the outcome to be. I was a physics major and I wish there were fewer discrete jumps in physics. Special relativity and general relativity seem like giant jumps in terms of their difficulty to derive and there aren't any intermediate theories. When this project comes out we'll probably be saying something like "the AI is 50% of the way between being able to derive this law and a conceptually harder one."
With respect to something analogous to Newtonian Mechanics, I think it depends heavily on what kind of information can be observed. If a model can directly observe the equivalents of forces and accelerations, then I believe most current models could derive it. If a model can only observe the equivalent of distances between objects and the corresponding times, and has to derive a second-order relationship from that, I suspect that only o3 could do it. In six months, I believe that all frontier models will be able to do that.
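To make the harder case concrete, here is a minimal sketch of what "only observing distances and times" might look like. Everything in it is an assumption for illustration: the constant-acceleration setup, the value `TRUE_ACCELERATION`, and the helper `fit_residual` are hypothetical and not taken from the project. It only shows that a linear model fails on this kind of data while a quadratic one recovers the hidden acceleration.

```python
import numpy as np


def fit_residual(times, distances, degree):
    """Least-squares polynomial fit; returns (coefficients, RMS residual)."""
    coeffs = np.polyfit(times, distances, degree)
    predicted = np.polyval(coeffs, times)
    rms = float(np.sqrt(np.mean((distances - predicted) ** 2)))
    return coeffs, rms


# Hypothetical "experiment": an object starting at rest under constant
# acceleration. Only times and distances are observed, never the acceleration.
TRUE_ACCELERATION = 9.8  # assumed value, for illustration only
times = np.linspace(0.0, 2.0, 21)
distances = 0.5 * TRUE_ACCELERATION * times ** 2

linear_coeffs, linear_rms = fit_residual(times, distances, degree=1)
quadratic_coeffs, quadratic_rms = fit_residual(times, distances, degree=2)

# For d = (a / 2) * t^2, the leading coefficient of the quadratic fit is a / 2.
estimated_acceleration = 2.0 * quadratic_coeffs[0]

print(f"linear fit RMS residual:    {linear_rms:.4f}")
print(f"quadratic fit RMS residual: {quadratic_rms:.4e}")
print(f"estimated acceleration:     {estimated_acceleration:.3f}")
```

A real test would presumably use an unfamiliar system and noisy measurements, so the model could not just pattern-match to textbook kinematics.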
Given that Terence Tao described o1 as a mediocre graduate student, it probably won't be long until frontier models are actually contributing to research, and that will be the most valuable feedback. I say all this with a lot of uncertainty, and if I'm wrong this project will prove that. Likewise, there's going to be a long period of time where some people will insist that AI can do legitimate automated R&D and others will insist that it can't. At that point this will be a useful test to argue one way or another.
It would be interesting to vary the amount of information an AI is given until it can derive the whole set of equations. For example, see if it can solve for one of Maxwell's equations given the other three and the ability to perform experiments, or whether it can solve for the dynamic version of the equations given only the static ones and the ability to perform experiments.
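For reference, this is the gap in the static-to-dynamic version of that test, written in SI units with the standard symbols (my notation, not something taken from the project). The static equations are

```latex
\begin{aligned}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\varepsilon_0}, &
\nabla \cdot \mathbf{B} &= 0, \\
\nabla \times \mathbf{E} &= \mathbf{0}, &
\nabla \times \mathbf{B} &= \mu_0 \mathbf{J},
\end{aligned}
```

and the dynamic version leaves the two divergence equations unchanged while adding a time-derivative term to each curl equation:

```latex
\begin{aligned}
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t}, &
\nabla \times \mathbf{B} &= \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}.
\end{aligned}
```

So the model would need to discover exactly those two time-derivative terms from its experiments.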
But a supermajority of the population ought to be capable of learning to do what Hermione and Holocaust resisters did.
I think this is a key point. The Milgram experiments illustrate more about the average person's obedience than their inclination towards evil. If the experimenters had pressured the subjects to do something heroic and personally risky, they would have gotten similar results, with a supermajority choosing to do the heroic thing. Most humans can pursue a wide range of goals under pressure and only a minority have the willpower to stick to a narrow range of goals.
Rationality is about how your mind holds itself, it is how you weigh evidence, it is how you decide where to look next when puzzling out a new area.
I really liked this line. A couple of years ago I was with friends and we were playing Spades together along with the dad of one of the friends. We were all computer science majors and the dad was a farmer. In all of our games he was either first or second. We also played Mafia and he was clearly very good at reading people and appearing innocent.
I've played some games since then with my friend's dad and he's always been very skilled. I've always wondered why he was so good at these games, since it's not something he did often, only the 1 or 2 times a year we'd visit. However, I've reflected on some of the conversations I've had with him and I've realized that he's a very detailed thinker and his job requires him to constantly make decisions under uncertainty. For example, he has in his head about an hour-long talk on the ideal corn row width, which takes into account the trade-off between output volume and fungal spread and many other things that I can't remember. He'll also casually mention that one of his fields flooded, with a nonchalance and a quip that he'll make do. He's always sharing how much rain has fallen in the last x days and how much is predicted to fall in the next x days, and I've realized that the weather not only affects his crop but also dictates what kind of work he's gonna do and when he'll do it. He's a detailed planner but his plans are always flexible. In games like Spades or Poker where you need to make decisions under uncertainty, his training as a farmer is far more valuable than my training as a programmer. At a programming job or in competitive programming I'm always looking for an exact answer and either I know it or I don't. There's no real uncertainty outside of saying things like "I'm 70% sure this will work", which is something I generally try to avoid. The biggest advantage my friend's dad has gained as a farmer in these uncertainty-laden games is his intuitive sense of probabilities and of the planning that should follow, which is IMO best gained by accumulating real-life experience with a dedicated effort to improve this intuition and planning.
"Can you clarify that a bit? When what project comes out? If you mean mine, I'm confused about why that would say something about the ability to derive special & general relativity."
I mean your project. I'm hoping it can allow us to be more precise by ranking models' abilities to characterize systems that fall between well-known ones. For example, a model might be able to characterize Special Relativity given what Einstein knew at the time but not General Relativity. If you were to walk along some hypothetical road from SR to GR, we might ballpark that a model is 30% of the way there. Maybe this project could generate domains that are roughly some x% between SR and GR and validate our estimates.
"Agreed that each added step of mathematical complexity (in this case from linear to quadratic) will make it harder. I'm less convinced that acceleration being a second-order effect would make an additional difference, since that seems more like a conceptual framework we impose than like a direct property of the data."
Right. The important point is that the equation it needs to find is quadratic instead of linear in the data.