lc has argued that the measured tasks are unintentionally biased towards ones where long-term memory/context length doesn't matter:
https://www.lesswrong.com/posts/hhbibJGt2aQqKJLb7/shortform-1#vFq87Ge27gashgwy9
I like your explanation of why normal reliability engineering is not enough, but I'll flag that security against actors is probably easier than LW in general portrays, and I think computer security as a culture is prone to way overestimating the difficulty of security because of incentive issues, because it doesn't remember the times something didn't happen, and more generally because side-channels are arguably much more limited than people think (precisely because they rely on very specific physical details, rather than attacking the algorithm).
A non-trivial portion of my optimism about surviving AGI comes from the view that security, while difficult, is not unreasonably difficult, and that partial successes matter from a security standpoint.
Link below:
I have 2 cruxes here:
In particular, I do not buy that humans and chimpanzees are nearly as similar as Henrich describes, and a big reason for this is that the study that purportedly showed this pitted heavily optimized, specially selected chimpanzees against reasonably average humans, which is not a good way to compare performance if you want the results to generalize.
I don't think they're wildly different, and I'd usually put chimps' effective FLOPs 1-2 OOMs lower than humans', but I wouldn't go nearly as far as Henrich on the similarities.
I do think culture actually matters, but nowhere near as much as Henrich wants it to matter.
I agree evolution has probably optimized human learning, but I don't think it's so heavily optimized that we can use it to give a tighter upper bound than 13 OOMs. The reason is that I do not believe humans are in equilibrium, which means there are probably optimizations left to discover, so I do think the 13 OOMs number is plausible (with high uncertainty).
Comment below:
https://www.lesswrong.com/posts/DbT4awLGyBRFbWugh/#mmS5LcrNuX2hBbQQE
I'll flag that while I personally didn't believe the idea that orcas are on average >6 SDs smarter than humans, and never considered it that plausible, I also don't think orcas could actually benefit that much from +6 SDs even if applied universally. The reason is that they are in water, which severely limits your available technology options and makes it really, really hard to form the societies needed to generate the explosion that happened post-Industrial Revolution, or even the Agricultural Revolution.
And there is a deep local-optimum issue in that their body plan is about as unsuited to using tools as possible, and changing this requires technology they almost certainly can't invent, because the things you would need to make that tech are impossible to get at the pressure and saltiness of ocean water, so it is pretty much impossible for orcas to get that much better with large increases in intelligence.
Thus, orca societies have a pretty hard ceiling on what they can achieve, at least if we rule out technologies they cannot invent themselves.
My take is that the big algorithmic difference that explains a lot of weird LLM deficits, and plausibly explains the post's findings, is that current neural networks do not learn at run-time: their weights are frozen after training. This is a central reason why humans are able to outperform LLMs at longer tasks, because humans, like a lot of other animals, have the ability to learn at run-time.
Unfortunately, this ability is generally lost gradually starting in your 20s, but the existence of non-trivial learning at run-time is still a big part of why humans are currently more successful at longer tasks than AIs are.
And thus, if OpenAI or Anthropic found the secret to lifelong learning, this would explain the hype (though I personally place very low probability on them having succeeded at this for anything that isn't math or coding/software).
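To make the frozen-weights point concrete, here is a minimal PyTorch sketch (my own toy illustration, not anything from the labs' actual systems) contrasting standard frozen-weight inference with a run-time-learning loop that updates on every example it sees; the model, data stream, and learning rate are all made up.

```python
# Toy sketch: frozen-weight deployment vs. a "learn at run-time" variant.
import torch
import torch.nn as nn

model = nn.Linear(4, 1)  # stand-in for a trained network
stream = [(torch.randn(4), torch.randn(1)) for _ in range(100)]  # fake task stream

# 1) Frozen deployment (how current LLMs are served): every input is
#    handled by the same fixed weights, no matter how long the episode runs.
with torch.no_grad():
    frozen_preds = [model(x) for x, _ in stream]

# 2) Run-time learning: take a small gradient step on each observed example,
#    so later predictions benefit from earlier experience in the same episode.
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
online_preds = []
for x, y in stream:
    pred = model(x)
    online_preds.append(pred.detach())
    loss = nn.functional.mse_loss(pred, y)
    opt.zero_grad()
    loss.backward()
    opt.step()  # weights change as the episode unfolds
```

The second loop is only meant to gesture at the capability gap: the longer the task, the more the continually-updated model can exploit what it saw earlier, while the frozen one cannot.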
Gwern explains below:
Re other theories, I don't think that all other theories in existence have infinitely many adjustable parameters. If he's referring to the fact that lots of theories have adjustable parameters that can range over the real numbers, which are infinitely complicated in general, then that's a different issue, and string theory may have it as well.
Re string theory's issue of being vacuous, I think the core thing string theory predicts that other quantum gravity models don't is that at large scales you recover general relativity and the Standard Model, whereas no other theory has yet figured out a way to properly include both the empirical effects of gravity and quantum mechanics in the parameter regimes where they are known to work. So string theory predicts more just by predicting the things other quantum theories predict while being able to include gravity without ruining those other predictions, whereas other models of quantum gravity tend to ruin empirical predictions (like general relativity approximately holding) pretty fast.
That said, for the purposes of alignment, it's still good news that cats (by and large) do not scheme against their owners' wishes, and the fact that cats can be as domesticated as they are while not being cooperative or social is a huge boon for alignment purposes (within the analogy, which is arguably questionable).
Links to long comments that I want to pin, but that are too long to be pinned:
https://www.lesswrong.com/posts/Zzar6BWML555xSt6Z/?commentId=aDuYa3DL48TTLPsdJ
https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/?commentId=Gcigdmuje4EacwirD