I am currently a nuclear engineer with a focus in nuclear plant safety and probabilistic risk assessment. I am also an aspiring EA, interested in X-risk mitigation and the intersection of science and policy.
I am not under a secret NDA that I can't talk about, as of August 7, 2024. I intend to update this statement at least once a year as long as it's true. I encourage other people to make a similar statement.
Yeah, something along the lines of an Elo-style rating would probably work better for this. You could put lots of hard questions on the test, and then instead of just ranking people, you compare which questions they missed, etc.
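To make that a bit more concrete, here is a rough sketch of how an Elo-style scheme could treat each answer as a "match" between a test-taker and a question. The function names, the starting ratings of 1500, and the K-factor of 32 are all my own assumptions for illustration, not an established testing method.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Chance that A 'beats' B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(person: float, question: float, correct: bool, k: float = 32.0):
    """Return the new (person, question) ratings after one answer.

    A correct answer counts as a win for the person; a miss counts as a
    win for the question. k (assumed here to be 32) controls how far a
    single result moves the ratings.
    """
    expected = expected_score(person, question)
    actual = 1.0 if correct else 0.0
    delta = k * (actual - expected)
    return person + delta, question - delta


# Hypothetical example: an evenly matched person and question.
person, question = update(1500.0, 1500.0, correct=True)
print(person, question)
```

Hard questions drift upward as people miss them, so someone who answers them correctly gains more than someone who only gets the easy ones right.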
This works for corn plants because the underlying measurement "amount of protein" is something that we can quantify (in grams or whatever) in addition to comparing two different corn plants to see which one has more protein. IQ tests don't do this in any meaningful sense; think of an IQ test more like the Mohs hardness scale, where you can figure out a new material's position on the scale by comparing it to a few with similar hardness and seeing which are harder and which are softer. If it's harder than all of the previously tested materials, it just goes at the top of the scale.
I wasn't saying it's impossible to engineer a smarter human. I was saying that if you do it successfully, then IQ will not be a useful way to measure their intelligence. IQ denotes where someone's intelligence falls relative to other humans, and if you make something smarter than any human, their IQ will be infinity and you need a new scale.
it’s not even clear what it would mean to be a 300-IQ human
IQ is an ordinal score, not a cardinal one: it's defined by a mean of 100 and a standard deviation of 15. So all it means is that this person would be smarter than all but about 1 in 10^40 natural-born humans. It seems likely that the range of intelligence for natural-born humans is limited by basic physiological factors like the space in our heads, the energy available to our brains, and the speed of our neurotransmitters. So a human with IQ 300 is probably about the same as IQ 250 or IQ 1000 or IQ 10,000, i.e. at the upper limit of that range.
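For anyone who wants to check the 1 in 10^40 figure, here is a quick sketch of the arithmetic. It assumes IQ follows an exact normal distribution all the way out into the tail, which is only the definitional convention and not something real test data can confirm; the function name `upper_tail` is just mine.

```python
import math


def upper_tail(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Fraction of a normal(mean, sd) population scoring above `iq`."""
    z = (iq - mean) / sd
    # erfc stays accurate far out in the tail, unlike computing 1 - cdf.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p = upper_tail(300)
print(f"z = {(300 - 100) / 15:.2f}, tail fraction = {p:.1e}")
```

IQ 300 sits about 13.3 standard deviations above the mean, and the tail fraction comes out on the order of 10^-40, which is where the "1 in 10^40" comes from.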
I've heard doctors ask questions like this but I don't think they usually get very helpful answers. "My diet's okay I guess, pretty typical, a lot of times I don't sleep great, and yeah I have a pretty stressful job." Great, what do you do with that?
"Food" in general is about the easiest and most natural thing for a dog to identify. Distinguishing illegal drugs from all the other random stuff a person might be carrying (soap, perfume, medicine, etc.) requires much better training than finding food does.
It's interesting that 3.5 Sonnet does not seem to match, let alone beat, GPT-4o on the leaderboard (https://chat.lmsys.org/?leaderboard). Currently it shows GPT-4o with an Elo of 1287 and Claude 3.5 Sonnet at 1271.
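For a sense of how big that gap is, here is a small sketch that converts a rating difference into an expected head-to-head win rate. It assumes the leaderboard numbers sit on the conventional Elo scale (base 10, 400 points); I haven't checked how the leaderboard actually fits its ratings, and `win_probability` is just a name I made up.

```python
def win_probability(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head wins for A over B on the
    conventional Elo scale (assumed: base 10, 400-point divisor)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted above: GPT-4o at 1287, Claude 3.5 Sonnet at 1271.
p = win_probability(1287, 1271)
print(f"{p:.3f}")
```

Under that assumption a 16-point gap works out to GPT-4o being preferred a bit over 52% of the time, so the two are close even though the ordering is what it is.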
Although it would also be nice to distinguish that from "I read this post already somewhere else"
I would love to have a checkbox or something next to each post to indicate "I saw this and I don't want to click on it"
I would think things are headed toward these companies fine-tuning an open-source, near-frontier LLM. That's cheaper than building one from scratch but comes with most of the advantages.