Humans Are Spiky (In an LLM World)
Assessments of "general" vs "spiky" capability profiles are secretly assessments of "matches existing infrastructure" vs "doesn't". Human societies contain human-shaped roles because humans were the only available workers for most of history. Packaging tasks into human-sized, human-shaped jobs was efficient. Given LLMs, the obvious thing to do is to try to drop them into those roles, giving them the same tools and affordances humans have. When that fails to work, though, we should not immediately conclude that the failure is because LLMs are missing some "core of generality". When LLM agents become more abundant than humans, as seems likely in the very near term, the most effective shape for a job stops being human-shaped. At that point, we may discover that human capability profiles are the spiky ones.
Are you trying to demonstrate that LLM agents are now capable of cloning SQLite (in which case my response is that `select null || 'hello'` should not yield 'NULLhello' and `select null > 5` should not crash), or that LLM agents are not yet capable of one-shot cloning SQLite in Rust (which is not very surprising)?
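
For anyone who wants to check the two queries above against real SQLite, here is a minimal sketch using Python's standard `sqlite3` module. The helper name `eval_scalar` is made up for illustration, and this only shows what stock SQLite does, not how the clone under discussion was tested. In SQLite, NULL propagates through both `||` and `>`, so each query returns NULL (Python `None`) rather than a string or an error.

```python
import sqlite3


def eval_scalar(sql):
    """Run a one-row, one-column query on an in-memory SQLite database.

    Hypothetical helper for illustration; returns the single value,
    with SQL NULL mapped to Python None by the sqlite3 module.
    """
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(sql).fetchone()[0]
    finally:
        conn.close()


# NULL concatenated with a string is NULL, not the text 'NULLhello'.
concat_result = eval_scalar("select null || 'hello'")

# Comparing NULL with a number is NULL (unknown), not an error.
compare_result = eval_scalar("select null > 5")

print(concat_result)
print(compare_result)
```

A clone that wants to match SQLite here would need to return NULL for both expressions.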