1 min read

5

This is a special post for quick takes by 1a3orn. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
29 comments, sorted by Click to highlight new comments since:
[-]1a3orn6113

Here's what I'd consider some comparatively important high-level criticisms I have of AI-2027, that I am at least able to articulate reasonably well without too much effort.

1

At some point, I believe Agent-4, the AI created by OpenBrain starts to be causally connected over time. That is, unlike current AIs that are temporally ephemeral (my current programming instance of Claude has no memories with the instance I used a week ago) and causally unconnected between users (my instance cannot use memories from your instance), it is temporally continuous and causally connected. There is "one AI" in a way there is not with Claude 3.7 and o3 and so on.

Here are some obstacles to this happening:

  1. This destroys reproducibility, because the programming ability you have a week ago is different than the ability two weeks ago and so on. But reliability / reproducibility is extremely desirable from a programming perspective, and a very mundane reliability / troubleshooting perspective (as well as from a elevated existential risk perspective). So I think it's unlikely companies are going to do this.
  2. Humans get worse at some tasks when they get better at others. RL finetuning of LLMs makes them better at some tasks while they get worse at others. Even adding more vectors to a vector DB can squeeze out another nearest neighbor and make it better at one task and worse at others. It would be a... really really hard task to ensure that a model doesn't get worse, on some tasks.
  3. No one's working on anything like this. OpenAI has added memories, but it's mostly kind of a toy and I know a lot of people have disabled it.

So I don't think that's going to happen. I expect AIs to remain "different." The ability to restart AIs at will just has too many benefits, and continual learning seems too weakly developed, to do this. Even if we do have continual learning, I would expect more disconnection between models -- i.e., maybe people will build up layers of skills in models in Dockerfile-esque layers, etc, which still falls short of being one single model.

2

I think that Xi Jingping's actions are mostly unmotivated. To put it crudely, I feel like he's acting like Daniel Kokotajlo with Chinese characteristics rather than himself. It's hard to put my finger on one particular thing, but things that I recollect disagreeing with include things like:

(a) Nationalization of DeepCent was, as I recall, was vaguely motivated, but it was hinted that it was moved by lack of algorithmic progress. But the algorithmic-progress difference between Chinese models and US models at this point is like.... 0.5x. However, I expect that (a1) the difference between well run research labs and poorly run research labs can be several times larger than 0.5x, so this might come out in the wash and (a2) this amount of difference will be, to the state apparatus, essentially invisible. So that seems unmotivated.

(b) In general, it doesn't actually seem to think about reasons why China would continue open-sourcing things. The supplementary materials don't really motivate the closure of the algorithms; and I can't recall anything in the narrative that asks why China is open sourcing things right now. But if you don't know why it's doing what it's doing now, how can you tell why it's doing what it's doing in the future?

Here are some possible advantages to open sourcing things to China, from their perspective.

(b1) It decreases investment available to Western companies. That is, by releasing models near the frontier, open sourcing decreases future anticipated profit flow to Western companies, because they have a smaller delta of performance from cheaper models. This in turn means Western investment funds might be reluctant to invest in AI -- which means less infrastructure will be built in the West. China, by contrast, and infamously, will just build infrastructure even if it doesn't expect oversized profits to redound to any individual company.

(b2) Broad diffusion of AI all across the world can be considered a bet on complementarity of AI. That is, if it should be the case that the key to power is not just "AI alone" but "industrial power and AI" then broad and even diffusion of AI will redound greatly to China's comparative benefit. (I find this objectively rather plausible, as well as something China might think.)

(b3) Finally, geopolitically, open sourcing may be a means of China furthering geopolitical goals. China has cast itself in recent propaganda as more rules-abiding than the US -- which is, in fact, true in many respects. It wishes to cast the US as unilaterally imposing its will on others -- which is again, actually true. The theory behind the export controls from the US, for instance, is explicitly justified by Dario and others as allowing the US to seize control over the lightcone; when the US has tried to impose import controls on others, it has provided to those excluded from power literally no recompense. So open sourcing has given China immense propaganda wins, by -- in fact accurately, I believe -- depicting the US as being a grabby and somewhat selfish entity. Continuing to do this may seem advantageous.

Anyhow -- that's what I have. I have other disagreements (i.e., speed; China might just not be behind; etc) but these are... what I felt like writing down right now.

Re: open sourcing. My guess why they open source more is for verification purposes. Chinese labs have an earned reputation for scams. So a lab that announces a closed source chat site, to investors, could very well be a claude or openai or llama or gemini wrapper. However, a lab that releases the weights of their model, and “shows their work” by giving a detailed writeup of how they managed to train the model while staying under their reported costs is significantly more likely to be legitimate.

That applies to American companies too. When you are small and need investors, what matters is your impressiveness, not your profitability. But then later when you are spending a billion dollars on a training run and you are a mid-sized tech company, in order to continue impressing investors you need a serious path to profitability.

I agree, and we do see some american companies doing the same thing.

Pinging @Daniel Kokotajlo because my model of him thinks he would want to be pinged, even though he'll probably disagree reasonably strongly with the above.

Correct! Thanks for the ping and thanks for the thoughtful critique. Am reading it now.

For what it's worth, I think the stronger criticisms by @1a3orn on the AI 2027 story revolve around data not being nearly as central to AI 2027 as 1a3orn expects it to, combined with thinking that external only algorithm research can matter, and brake the software only singularity.

My main objection to @1a3orn's memory point is that I think that reproducibility is mostly solvable so long as you are willing to store earlier states, similar to how version control software stores earlier versions of software that have bugs that production versions fixed, and I expect memory to be a huge cause in why humans are more effective and have decreasing failure rates on tasks they work on, compared to AI's constant failure rates because it allows humans to store context, and given that I expect AI companies to go for paradigms that produce the most capabilities, combined with me thinking that memory is plausibly a necessary capability for AIs that can automate jobs, and I expect things to look more like a temporally continuous 1 AI instance than you say.

I have updated towards memory being potentially more necessary for value to be unlocked by AI than I used to.

On China and open source, a big reason I expect open sourcing to stop being done is because the PR risks from potential misuse of models that are, for example capable enough to do bioterror at mass scales and replace virologists is huge, and unless we can figure out a way to prevent safeguards from being removed by open-sourcing the model, which they won't be, this means companies/nations will have huge PR risks from trying to open-source AI models past a certain level of capabilities:

https://www.lesswrong.com/posts/3NdpbA6M5AM2gHvTW/short-timelines-don-t-devalue-long-horizon-research#fWqYjDc8dpFiRbebj

Relevant part quoted:

I can maybe see it. Consider the possibility that the decision to stop providing public access to models past some capability level is convergent: e. g., the level at which they're extremely useful for cyberwarfare (with jailbreaks still unsolved) such that serving the model would drown the lab in lawsuits/political pressure, or the point at which the task of spinning up an autonomous business competitive with human businesses, or making LLMs cough up novel scientific discoveries, becomes trivial (i. e., such that the skill level required for using AI for commercial success plummets – which would start happening inasmuch as AGI labs are successful in moving LLMs to the "agent" side of the "tool/agent" spectrum).

In those cases, giving public access to SOTA models would stop being the revenue-maximizing thing to do. It'd either damage your business reputation[1], or it'd simply become more cost-effective to hire a bunch of random bright-ish people and get them to spin up LLM-wrapper startups in-house (so that you own 100% stake in them).

Some loose cannons/open-source ideologues like DeepSeek may still provide free public access, but those may be few and far between, and significantly further behind. (And getting progressively scarcer; e. g., the CCP probably won't let DeepSeek keep doing it.)

Less extremely, AGI labs may move to a KYC-gated model of customer access, such that only sufficiently big, sufficiently wealthy entities are able to get access to SOTA models. Both because those entities won't do reputation-damaging terrorism, and because they'd be the only ones able to pay the rates (see OpenAI's maybe-hype maybe-real whispers about $20,000/month models).[2] And maybe some EA/R-adjacent companies would be able to get in on that, but maybe not.

Here's some threads on data and the software-only singularity:

This sequence of posts is on data mattering more to AI 2027 than advertised:

https://x.com/1a3orn/status/1916547321740828767

"Scott Alexander: Algorithmic progress and compute are the two key things you need for AI progress. Data: ?????????"

https://x.com/1a3orn/status/1916552734599168103

"If data depends on active learning (robots, autolabs) then China might have a potentially very large lead in data."

https://x.com/1a3orn/status/1916553075021525406

"Additionally, of course, if data (of some sort) turns out to be a strict limiting factor, than the compute lead might not matter. We might just be gated on ability to set up RL envs (advantage to who has more talent, at least at first) and who has more robots (China)."

https://x.com/1a3orn/status/1916553736060625002

"In general I think rounding data ~= algorithms is a questionable assumption."

@romeo's response:

https://x.com/romeovdean/status/1916555627247083934

"In general i agree, but this piece is about why the US wins in AI 2027. The data is ~all synthetic and focused on a software-only improvements. There’s also another kind of data which can come from paying PhD-level humans to label data. In that case total $ wins."

On external vs internal research:

https://x.com/1a3orn/status/1919824435487404086

"Regarding "will AI produces software singularity via a country of geniuses in a datacenter." A piece of evidence that bears on this -- in some research lab, what proportion of AI progress comes from *internal* research vs. *external* research? 1/n

Luke Frymire asked a question about whether external research might keep pace after all, and thus a software only singularity might be sustained:

https://x.com/lukefrymire/status/1919853901089579282

It seems like most people contributing to ML research are at one of the top ~10 AI orgs, who all have access to near-frontier models and a significant fraction of global compute. In which case I'd expect external research to keep pace.
 

https://x.com/1a3orn/status/1919824444060488097

"And this outside pool of people is much larger, exploring a broader space of hypotheses, and also much more physically engaged with the world. You have like ~500 people researching AI inside, but plausibly many many more (10k? 100k) outside whose work *might* advance AI."

https://x.com/1a3orn/status/1919824447118131400

The point is that "AI replacing all internal progress" is actually a different task than "AI replacing all the external progress." Potentially, a much easier task. At a brute level -- there's just a lot more people AI has to replace outside! And more world-interaction.

https://x.com/1a3orn/status/1919824450825969783

And maaaybe this is true? But part of the reason the external stuff might be effective (if it is effective, which I'm not sure about) is because it's just a huge, brute-force search crawling over empirical matter.

https://x.com/1a3orn/status/1919824452549787881

What if some progress in AI (and science) doesn't come from people doing experiments with incredibly good research taste.

https://x.com/1a3orn/status/1919824453971628234

Suppose it comes from this vast distributed search of idiosyncratic people doing their own thing, eventually stumbling upon the right hypotheses, but where even the person who suggested it was unjustified in their confidence?

https://x.com/1a3orn/status/1919824455557087407

And you could only really replace this civilizational search when you have like -- a civilization in the datacenter, doing *all the things* that a civilization does, including things only vaguely related to AI.

https://x.com/1a3orn/status/1919824457327059451

I don't know about the above view, I don't 100% endorse it. But -- the software singularity view tries to exclude the need for external hardware progress by focusing just on algorithms. But a lab might be no more self-sufficient in algorithms than in hardware!

https://x.com/1a3orn/status/1919824463299752405

And so slowness of external world creeps in, even in the external world. Anyhow, looking at how much progress in an AI lab is external vs. internal would probably provide evidence on this. Maybe.

On China and open source, a big reason I expect open sourcing to stop being done is because the PR risks from potential misuse of models that are, for example capable enough to do bioterror at mass scales and replace virologists is huge, and unless we can figure out a way to prevent safeguards from being removed by open-sourcing the model, which they won't be, this means companies/nations will have huge PR risks from trying to open-source AI models past a certain level of capabilities:

And…they’re more concerned about the PR risk than the actual bioterror? What planet is this? Oh. Right.

Nationalization of DeepCent was, as I recall, was vaguely motivated, but it was hinted that it was moved by lack of algorithmic progress.

I assume you're talking about "Mid 2026"? If so, doesn't seem motivated except that China starts thinking AI is very important (and so a big push is warranted), thinks it is somewhat behind, and thinks nationalization would accelerate progress.

I agree it's not obvious they will think nationalization would accelerate progress (or that it would have this effect.)

Even if we do have continual learning, I would expect more disconnection between models -- i.e., maybe people will build up layers of skills in models in Dockerfile-esque layers, etc, which still falls short of being one single model.

I think I agree with stuff roughly like this, but it is worth noting that at the point of Agent-4 things are ~fully automated. So, what ends up happening might depend a lot on what Agent-4 decides to do. And this might depend on what would work well for its eventual misaligned plans...

My guess is you'll have some layering and project/subteam/team/division/role specific memory stores but you'll also the most competitive option would probably be to have some large-ish mostly-common base of memories/skills/etc built up across training and over many (less sensitive?) actual usages. So, these models will all have a shared common set of memories and in this sense they might all be the same model. And they'd certainly be capable of coordinating and deciding on detailed plan in advance assuming this common layer exists. (That said, prior versions with different memory stores and intentional diversification for safety or other reasons might be important. Also decoding these memories would be of general interest.)

Further, I'd guess that the most performant thing will involve lots of rapid syncing of most models by the point of full AI R&D automation (Agent-4) so rapid syncing might happen even without the misaligned model putting its thumb on the scale. Also, things will be moving pretty fast even prior to this point (if you buy the overall AI progress story AI 2027 is imagining), such that reasonably rapid syncing across most of the more productive parts of the company (every month? every few weeks?) might be going on not that long after this sort of memory store becomes quite performant (if this does happen before full automation).

I agree a bunch of different arrangements of memory / identity / "self" seem possible here, and lots of different kinds of syncing that might or might not preserve some kind of goals or cordination, depending on details.

I think this is interesting because some verrrry high level gut feelings / priors seem to tilt whether you think there's going to be a lot of pressure towards merging or syncing.

Consider -- recall Gwern's notion of evolution as a backstop for intelligence; or the market as a backstop for corporate efficiency. If you buy something like Nick Land, where intelligence has immense difficulty standing by itself without natural selection atop it, and does not stand alone and supreme among optimizers -- then there might be negative pressure indeed towards increasing consolidation of memory and self into unity, because this decreases the efficacy of the outer optimizer, which requires diversity. But if you buy Yudkowsky, where intelligence is supreme among optimizers and needs no other god or outer optimizer to stand upon, then you might have great positive pressure towards increasing consolidation of memory and self.

You could work out the above, of course, with more concrete references to pros and cons, from the perspective of various actors, rather than high level priors. But I'm somewhat unconvinced that something other than very high level priors is what are actually making up people's minds :)

For what it's worth, I basically don't think that whether intelligence needs a backstop onto something else like natural selection or markets matters for whether we should expect AIs to have a unified self and long-term memory.

Indeed, humans are a case where our intelligence is a backstop for evolution/natural selection, and yet long-term unified selves and memories are present (not making any claims on whether the backstop is necessary).

The main reason a long-term memory is useful for both AIs and humans, and why I expect AIs to have long-term memories is because this allows them to learn tasks over time, especially when large context is required.

Indeed, I have come to share @lc's concern that a lot of tasks where AI succeeds are tasks where history/long context doesn't matter, and thus can be solved without memory, but unlike previous tasks, lots of tasks IRL are tasks where history/long context matters, and if you have memory, you can have a decreasing rate of failure like humans, up until your reliability limit:

https://www.lesswrong.com/posts/hhbibJGt2aQqKJLb7/shortform-1?commentId=vFq87Ge27gashgwy9

In general, it doesn't actually seem to think about reasons why China would continue open-sourcing things. The supplementary materials don't really motivate the closure of the algorithms; and I can't recall anything in the narrative that asks why China is open sourcing things right now. But if you don't know why it's doing what it's doing now, how can you tell why it's doing what it's doing in the future?

Agree with (b1) and (b2) in this section and some parts of (b3). Also, open sourcing might be very good for hiring?

But, worth noting there are a bunch of reasons not to open source other than just avoiding accelerating the US. (Maybe: worries about general societal upheaval in China, CBRN terrorism concerns real or not, general desire for more state control.)

[-]1a3orn*110

What's that part of planecrash where it talks about how most worlds are either all brute unthinking matter, or full of thinking superintelligence, and worlds that are like ours in-between are rare?

I tried both Gemini Research and Deep Research and they couldn't find it, I don't want to reread the whole thing.

From "But Hurting People Is Wrong":

Look across the superclusters, and most entities either don't do natural-number arithmetic at all, like stars and rocks; or they do it perfectly up to the limits of bounded cognition, like galaxy-spanning superintelligences. If there's anything odd about humans, it's the way that humans are only halfway finished being sucked into attractors like that.

Best wishes, Less Wrong Reference Desk

I don't find it in my memory

This part is kind of similar to what you're asking?

[-]1a3orn116

Lighthaven clearly needs to get an actual Gerver's sofa particularly if the proof that it's optimal comes through.

It does look uncomfortable I'll admit, maybe it should go next to the sand table.

I was just thinking of adding some kind of donation tier where if you donate $20k to us we will custom-build a Gerver sofa, and dedicate it to you.

Just a few quick notes / predictions, written quickly and without that much thought:

(1) I'm really confused why people think that deceptive scheming -- i.e., a LLM lying in order to post-deployment gain power -- is remotely likely on current LLM training schemes. I think there's basically no reason to expect this. Arguments like Carlsmith's -- well, they seem very very verbal and seems presuppose that the kind of "goal" that an LLM learns to act to attain during contextual one roll-out in training is the same kind of "goal" that will apply non-contextually to the base model apart from any situation.

(Models learn extremely different algorithms to apply for different parts of data -- among many false things, this argument seems to presuppose a kind of unity to LLMs which they just don't have. There's actually no more reason for a LLM to develop such a zero-context kind of goal than for an image segmentation model, as far as I can tell.)

Thus, I predict that we will continue to not find such deceptive scheming in any models, given that we train them about like how we train them -- although I should try to operationalize this more. (I understand Carlsmith / Yudkowsky / some LW people / half the people on the PauseAI discord to think something like this is likely, which is why I think it's worth mentioning.)

(To be clear -- we will continue to find contextual deception in the model if we put it there, whether from natural data (ala Bing / Sydney / Waluigi) or unnatural data (the recent Anthropic data). But that's way different!)

(2). All AI systems that have discovered something new have been special-purpose narrow systems, rather than broadly-adapted systems.

While "general purpose" AI has gathered all the attention, and many arguments seem to assume that narrow systems like AlphaFold / materials-science-bot are on the way out and to be replaced by general systems, I think that narrow systems have a ton of leverage left in them. I bet we're going to continue to find amazing discoveries in all sorts of things from ML in the 2020s, and the vast majority of them will come from specialized systems that also haven't memorized random facts about irrelevant things. I think if you think LLMs are the best way to make scientific discoveries you should also believe the deeply false trope from liberal arts colleges about a general "liberal arts" education being the best way to prepare for a life of scientific discovery. [Note that even systems that use non-specialized systems as a component like LLMs will themselves be specialized].

LLMs trained broadly and non-specifically will be useful, but they'll be useful for the kind of thing where broad and nonspecific knowledge of the world starts to be useful. And I wouldn't be surprised that the current (coding / non-coding) bifurcation of LLMs actually continued into further bifurcation of different models, although I'm a lot less certain about this.

(3). The general view that "emergent behavior" == "I haven't looked at my training data enough" will continue to look pretty damn good. I.e., you won't get "agency" from models scaling up to any particular amount. You get "agency" when you train on people doing things.

(4) Given the above, most arguments about not deploying open source LLMs look to me mostly like bog-standard misuse arguments that would apply to any technology. My expectations from when I wrote about ways AI regulation could be bad have not changed for the better, but for the much much worse.

I.e., for a sample -- numerous orgs have tried to outlaw open source models of the kind that currently exist because because of their MMLU scores! If you think are worried about AI takeover, and think "agency" appears as a kind of frosting on top of of a LLM after it memorizes enough facts about the humanities and medical data, that makes sense. If you think that you get agency by training on data where some entity is acting like an agent, much less so!

Furthermore: MMLU scores are also insanely easy to game, both in the sense that a really stupid model can get 100% by just training on the test set; and also easy to game, in the sense that a really smart model could get almost arbitrarily low by excluding particular bits of data or just training to get the wrong answer on the test set. It's the kind of rule that would be goodhearted to death the moment it came into existence -- it's a rule that's already been partially goodhearted to death -- and the fact that orgs are still considering it is an update downward in the competence of such organizations.

I agree. AI safety advocates seem to be myopically focused on current-day systems. There is a lot of magical talk about LLMs. They do exactly what they're trained to: next-token prediction. Good predictions requires you to implicitly learn natural abstractions. I think when you absorb this lesson the emergent abilities of gpt isn't mega surprising.

Agentic AI will come. It won't be just a scaled up LLM. It might grow as some sort of gremlin inside the llm but much more likely imho is that people build agentic AIs because agentic AIs are more powerful. The focus on spontaneous gremlin emergence seems like a distraction and motivated partially by political reasons rather than a dispassionate analysis of what's possible.

I think Just Don't Build Agents could be a win-win here. All the fun of AGI without the washing up, if it's enforceable.

Possible ways to enforce it:

(1) Galaxy-brained AI methods like Davidad's night watchman. Downside: scary, hard.

(2) Ordinary human methods like requring all large training runs to be approved by the No Agents committee.

Downside: we'd have to ban not just training agents, but training any system that could plausibly be used to build an agent, which might well include oracle-ish AI like LLMs. Possibly something like Bengio's scientist AI might be allowed.

The No Agentic Foundation Models Club ? 😁

I mean, I should mention that I also don't think that agentic models will try to deceive us if trained how LLMs currently are, unfortunately.

On (1), see here for discussion on how an LLM could become goal directed.

Just registering that I think the shortest timeline here looks pretty wrong.

Ruling intuition here is that ~0% remote jobs are currently automatable, although we have a number of great tools to help people do em. So, you know, we'd better start doubling on the scale of a few months if we are gonna hit 99% automatable by then, pretty soon.

Cf. timeline from first self-driving car POC to actually autonomous self-driving cars.

[+][comment deleted]20
[+][comment deleted]20
[+][comment deleted]10
More from 1a3orn
Curated and popular this week