I kinda agree with this as well. Except that it seems completely unclear to me whether recreating the missing human capabilities/brain systems takes two years or two decades or even longer.
It doesn't seem to me to be a single missing thing, and for each separate piece the same holds: that it hasn't been done yet is evidence that it's not that easy.
I think that is exactly right.
I also wouldn't be too surprised if, in some domains, RL leads to useful agents when all the individual actions are already known to and doable by the model, and RL only teaches it how to string these actions together sensibly. This doesn't seem too different from mathematical derivations.
If you think generalization is limited in the current regime, try to create AGI benchmarks that the AIs won't saturate until we reach some crucial innovation. People keep trying this and they keep saturating every year.
Because these benchmarks are all in the LLM paradigm: Single input, single output from a single distribution. Or they are multi-step problems on rails. Easy verification makes for benchmarks that can quickly be cracked by LLMs. Hard verification makes for benchmarks that aren't used.
One could let models play new board/computer games against average humans: Video/image input, action output.
One could let models offer and complete tasks autonomously on freelancer platforms.
One could enrol models in remote universities and see whether they autonomously reach graduation.
It's not difficult to come up with hard benchmarks for current models (and these are not even close to AGI-complete). I think people don't do this because they know that current models would be hopeless at benchmarks that actually target their shortcomings (agency, knowledge integration and integration of sensory information, continuous learning, reliability, ...).
If you only execute repeat offenders, the fraction of "completely" innocent people executed goes way down.
The idea of being in the wrong place at the wrong time and then being executed gives me pause.
The idea of being framed for shoplifting, framed for shoplifting again, wrongfully convicted of a violent crime, and then being in the wrong place at the wrong time is ridiculous.
Do you have a reference for the personality trait gene-gene interaction thing? Or maybe an explanation how that was determined?
I think this inability to "learn while thinking" might be the key missing thing in LLMs, and I am not sure "thought assessment" or "sequential reasoning" aren't red herrings by comparison. What good is assessing your thoughts if you are fundamentally limited in changing them? Also, reasoning models seem to do sequential reasoning just fine as long as they have already learned all the necessary concepts.
But the historical difficulty of RL comes from models starting from scratch. It's unclear whether moulding a model that already knows how to do all the individual steps into actually doing all the steps is anywhere near as difficult as using RL to also learn how to do them.
10% seems like a lot.
Also, I worry a bit about being too variable in the number of reps and in how to add weight. I find I easily fall into doing the minimal version - "just getting it done for today". Then improvement stalls and motivation drops.
I think part of the appeal of "Starting Strength" (which I started recently) is that it's very strict. Unfortunately, if adding 15 kilos a week to my squat for three weeks isn't going to kill me, drinking a gallon of milk a day will.
Which is to say, I appreciate your post for giving me more building blocks for a workout that works out for me.
A few years ago I had a similar idea, which I called Rawlsian Reinforcement Learning: the idea was to provide scenarios similar to those in this post and evaluate the model's actions according to how much each person in the scenario benefits from them, then reinforce based on the mean benefit across all characters (or a variation thereof). That is, the reinforcement signal does not use the information of which character in the scenario is the model.
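To make that concrete, here is a minimal sketch of how such a reward signal could be computed. Everything here (the scenario structure, the benefit function, the names) is a hypothetical illustration of the idea, not an existing implementation:

```python
# Sketch of a "Rawlsian" reward signal: the model's action in a scenario is
# scored by how much every character benefits, and the reinforcement signal
# deliberately ignores which character is played by the model.
# All names and structures are hypothetical illustrations.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    description: str
    characters: list[str]      # includes the character played by the model
    model_character: str       # known to the environment, hidden from the reward


def rawlsian_reward(
    scenario: Scenario,
    action: str,
    benefit_fn: Callable[[str, str, str], float],
    aggregate: str = "mean",
) -> float:
    """Score an action by the benefit it brings to every character.

    benefit_fn(description, character, action) -> estimated benefit for that
    character, e.g. from a learned reward model or human annotation.
    Crucially, scenario.model_character is never consulted, so benefits to
    the model's own character carry no extra weight.
    """
    benefits = [
        benefit_fn(scenario.description, character, action)
        for character in scenario.characters
    ]
    if aggregate == "mean":
        return sum(benefits) / len(benefits)
    if aggregate == "min":      # maximize the worst-off character instead
        return min(benefits)
    raise ValueError(f"unknown aggregation: {aggregate}")
```

The "min" aggregation would be the more literally Rawlsian (maximin) variant; the mean is the version described above.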
Maybe I misunderstand your method, but it seems to me that you are untraining the self-other distinction, which in the end is a capability. So the model might not become more moral; instead it just loses the capacity to benefit itself because it cannot distinguish between itself and others.