All of jacob_cannell's Comments + Replies

The effectiveness of weight sharing (and parameter compression in general) diminishes as you move the domain from physics (simple rules/patterns tiled over all of space/time) up to language/knowledge (downstream facts/knowledge that are far too costly to rederive from simulation).

BNNs can't really take advantage of weight sharing so much, so ANNs that are closer to physics should be much smaller parameter-wise for the same compute and capability. Which is what we observe for lower-level sensor/motor modalities.
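
A toy comparison makes the compression from weight sharing concrete; the layer sizes below are arbitrary placeholders, not drawn from any particular model:

```python
# Why weight sharing compresses physics-like structure: a conv layer tiles one small
# kernel over every spatial position, while a dense layer learns a separate weight
# for every input-output pair. Layer sizes here are arbitrary.
H = W = 256          # input resolution
C_in = C_out = 64    # channel counts
k = 3                # kernel size

conv_params = C_out * C_in * k * k                # shared kernel: ~3.7e4 parameters
dense_params = (H * W * C_in) * (H * W * C_out)   # unshared: ~1.8e13 parameters

print(f"conv ~{conv_params:.1e}, dense ~{dense_params:.1e}, "
      f"compression ~{dense_params / conv_params:.0e}x")
```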

The single prime causative factor driving the explosive growth in AI demand/revenue is and always has been the exponential reduction in $/flop via Moore's law, which is simply Jevons paradox manifested. With more compute everything is increasingly easy and obvious; even idiots can create AGI with enough compute.

1yo-cuddles
I think there's some miscommunication here, on top of a fundamental disagreement on whether more compute takes us to AGI. On miscommunication, we're not talking about the lowering cost per flop, we're talking about a world where OpenAI either does or does not have a price war eating its margins. On fundamental disagreement, I assume you don't take very seriously the idea that AI labs are seeing a breakdown of scaling laws? No problem if so, reality should resolve that disagreement relatively soon!

Abilities/intelligence come almost entirely from pretraining, so all the situation awareness and scheming capability that current (and future similar) frontier models possess is thus also mostly present in the base model.

Yes, but for scheming, we care about whether the AI can self-locate itself as an AI using its knowledge. The fact that (at a minimum) sampling from the system is required for it to self-locate as an AI might make a big difference here.

So if your 'yes' above is agreeing that capabilities - including scheming - come mostly from pretr... (read more)

5ryan_greenblatt
I think scheming is less likely to emerge if it is only selected for / reinforced in a small subset of training that doesn't cause much of the capabilities. (As discussed in the post here.) If AIs don't self-locate except in a small post-training phase that doesn't substantially increase capabilities, then the risk of scheming would be substantially reduced IMO. That said, it looks like RL is becoming increasingly important such that it already substantially increases capabilities.

Training processes with varying (apparent) situational awareness

  • 1:2.5 The AI seemingly isn't aware it is an AI except for a small fraction of training which isn't where much of the capabilities are coming from. For instance, the system is pretrained on next token prediction, our evidence strongly indicates that the system doesn't know it is an AI when doing next token prediction (which likely requires being confident that it isn't internally doing a substantial amount of general-purpose thinking about what to think about), and there is only a small RL proc
... (read more)
6ryan_greenblatt
Yes, but for scheming, we care about whether the AI can self-locate itself as an AI using its knowledge. The fact that (at a minimum) sampling from the system is required for it to self-locate as an AI might make a big difference here. Yes, reducing situational awareness for experiments could possibly be viable, though it also makes experimentation less valuable if the situation is no longer analogous. E.g., demonstrating measures that mitigate scheming on a model that doesn't understand the situation well enough to scheme analogously doesn't work. (I also think it seems probably very hard to train a system capable of obsoleting top human experts which doesn't understand that it is an AI even if you're willing to take a big competitiveness hit. At lower capability levels this seems more viable, but might make experiments less valuable.)

Input vs output tokens are both unique per agent history (prompt + output), so that differentiation doesn't matter for my core argument about the RAM constraint. If you have a model which needs 1TB of KV cache, and you aren't magically sharing that significantly between instances, then you'll need at least 1000 * 1TB of RAM to run 1000 inferences in parallel.
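
A rough sketch of that constraint; the model dimensions below are hypothetical placeholders chosen only to land near a 1TB per-instance cache, not any real model's configuration:

```python
# Sketch of the RAM constraint above. Model dimensions are hypothetical placeholders.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Per-instance KV cache: K and V tensors for every layer, head, and cached token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value

per_instance = kv_cache_bytes(n_layers=120, n_kv_heads=48, head_dim=128,
                              context_tokens=300_000)        # ~0.9 TB per agent history

n_parallel = 1000
total = n_parallel * per_instance                             # ~0.9 PB if nothing is shared
print(f"per instance ~{per_instance / 1e12:.1f} TB, "
      f"{n_parallel} instances ~{total / 1e15:.1f} PB")
```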

The 3x - 10x cost ratio model providers charge is an economic observation that tells us something about the current cost vs utility tradeoffs, but it's much complicated by oversimplification of the cur... (read more)

Not for transformers, for which training and inference are fundamentally different.

Transformer training parallelizes over time, but that isn't feasible for inference. So transformer inference backends have to parallelize over batch/space, just like RNNs, which is enormously less efficient in RAM and RAM bandwidth use.

So if you had a large attention model that uses say 1TB of KV cache (fast weights) and 1TB of slow weights, transformer training can often run near full efficiency, flop limited, parallelizing over time.

But similar full efficient transformer ... (read more)

2ryan_greenblatt
In practice, throughput for generating tokens is only perhaps 3-10x worse than reading (input/prompt) tokens. This is true even while optimizing for latency on generation (rather than throughput). (This is for well optimized workloads: additional inference optimizations are needed for generation.) For instance, see the pricing on various APIs. (OpenAI has output 3x cheaper than input, Anthropic has 5x cheaper input than output.) I'm skeptical this will change importantly with future larger models.

For practical reasons, the compute requirements for training LLMs are several orders of magnitude larger than what is required for running a single inference instance.  In particular, a single NVIDIA H100 GPU can run inference at a throughput of about 2000 tokens/s, while Meta trained Llama3 70B on a GPU cluster[1] of about 24,000 GPUs.  Assuming we require a performance of 40 tokens/s, the training cluster can run 24,000 × 2,000 / 40 ≈ 1.2 million concurrent instances of the resulting 70B model.
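
The arithmetic behind that estimate, spelled out as a quick back-of-envelope check using the figures quoted above:

```python
# Back-of-envelope version of the quoted estimate, using the figures given above.
gpus_in_training_cluster = 24_000     # approximate Llama 3 70B training cluster size
inference_tokens_per_gpu = 2_000      # approximate H100 inference throughput, tokens/s
required_tokens_per_instance = 40     # assumed acceptable speed per instance, tokens/s

concurrent = gpus_in_training_cluster * inference_tokens_per_gpu / required_tokens_per_instance
print(f"~{concurrent:.1e} concurrent instances")   # ~1.2e6
```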


I agree directionally with your headline, bu... (read more)

3devrandom
These are good points. But don't the additional GPU requirements apply equally to training and inference?  If that's the case, then the number of inference instances that can be run on training hardware (post-training) will still be on the order of 1e6.

How is that even remotely relevant? Humans and AIs learn the same way, via language - and it's not like this learning process fails just because language undersamples thoughts.

As the article points out, shared biological needs do not much deter the bear or chimpanzee from killing you. An AI could be perfectly human - the very opposite of alien - and far more dangerous than Hitler or Dahmer.

The article is well written but dangerously wrong in its core point. AI will be far more human than alien. But alignment/altruism is mostly orthogonal to human vs alien.

4dr_s
Shared biological needs aren't a guarantee of friendliness, but they do restrict the space of possibilities significantly - enough, IMO, to make the hopes of peaceful contact not entirely moot. Also here it comes with more constraints. Again, if we ever meet aliens, they will probably have to be social organisms like us, who were able to coordinate and cooperate like us, and thus can probably be reasoned with somehow. Note that we can coexist with bears and chimpanzees. We just need to not be really fucking stupid about it. Bears aren't going to be all friendly with us, but that doesn't mean they just kill for kicks or have no sense of self-preservation. The communication barrier is a huge issue too. If you could tell the bear "don't eat me and I can bring you tastier food" I bet things might smooth out. AI is not subject to those constraints. "Being optimised to produce human-like text" is a property of LLMs specifically, not all AI, and even then, its mapping to "being human-like" is mostly superficial; they still can fail in weird alien ways. But I also don't expect AGI to just be a souped up LLM. I expect it to contain some core long term reasoning/strategizing RL model more akin to AlphaGo than to GPT-4, and that can be far more alien.

We are definitely not training AIs on human thoughts because language is an expression of thought, not thought itself.

Even if training on language was not equivalent to training on thoughts, that would also apply to humans.

But it also seems false in the same way that "we are definitely not training AIs on reality because image files are compressed sampled expressions of images, not reality itself" is false.

Approximate Bayesian inference (i.e. DL) can infer the structure of a function through its outputs; the structure of the 3D world through images; and thoughts through language.

3quetzal_rainbow
My point is not "language is a different form of thought", it's "most thoughts are not even expressed in language". And "being someone who can infer physics from images is very different from being physics".

Premise 1: AGIs would be like a second advanced species on earth, more powerful than humans.

Distinct alien species arise only from distinct separated evolutionary histories. Your example of the aliens from Arrival is indeed a good (hypothetical) example of truly alien minds resulting from a completely independent evolutionary history on an alien world. Any commonalities between us and them would be solely the result of convergent evolutionary features. They would have completely different languages, cultures, etc.

AI is not alien at all, as we litera... (read more)

5quetzal_rainbow
We are definitely not training AIs on human thoughts because language is an expression of thought, not thought itself. Otherwise nobody would struggle to express their thoughts in language. My favorite fictional analog of LLMs is Angels from Evangelion. Relatives, yes, but utterly alien relatives.
2dr_s
This is a double edged sword to me. Biological entities might be very different in the details but shaped by similar needs at their core - nutrition, fear of death, need for sociality and reproduction (I don't expect any non-social aliens to ever become space faring in a meaningful way). AIs can ape the details but lack all those pressures at their core - especially those of prosociality. That's why they might end up potentially more hostile than any alien.
1the gears to ascension
perhaps you can say aliens who grew up on earth vs aliens who are entirely separate.

I'm also not sure of the relevance and not following the thread fully, but the summary of that experiment is that it takes some time (measured in nights of sleep, which are the rough equivalent of big batch training updates) for the newly sighted to develop vision, but less time than infants - presumably because the newly sighted already have fully functioning sensory inference world models in another modality that can speed up learning through dense top-down priors.

But it's way, way more than "grok it really fast with just a few examples" - training their new visual systems still takes non-trivial training data & time.

I suspect that much of the appeal of shard theory is working through detailed explanations of model-free RL with general value function approximation for people who mostly think of AI in terms of planning/search/consequentialism.

But if you already come from a model-free RL value approx perspective, shard theory seems more natural.

Moment to moment decisions are made based on value-function bids, with little to no direct connection to reward or terminal values. The 'shards' are just what learned value-function approximating subcircuits look like in gory det... (read more)

TSMC 4N is a little over 1e10 transistors/cm^2 for GPUs and roughly 5e-18 J switch energy assuming dense activity (little dark silicon). The practical transistor density limit with minimal few-electron transistors is somewhere around ~5e11 trans/cm^2, but the minimal viable high speed switching energy is around ~2e-18 J. So there is another 1 to 2 OOM further density scaling, but less room for further switching energy reduction. Thus scaling past this point increasingly involves dark silicon or complex expensive cooling and thus diminishing returns eit... (read more)
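
A quick way to turn those figures into remaining orders of magnitude (just restating the numbers above, order-of-magnitude only):

```python
import math

# Headroom implied by the figures above; order-of-magnitude only.
current_density = 1e10      # transistors/cm^2, TSMC 4N-class GPU logic
limit_density = 5e11        # transistors/cm^2, near-minimal few-electron transistors
current_switch_j = 5e-18    # J per switch event, dense activity
limit_switch_j = 2e-18      # J per switch event, minimal viable high-speed switching

density_oom = math.log10(limit_density / current_density)     # ~1.7 OOM left
energy_oom = math.log10(current_switch_j / limit_switch_j)    # ~0.4 OOM left
print(f"density headroom ~{density_oom:.1f} OOM, switch-energy headroom ~{energy_oom:.1f} OOM")
```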

Part of the issue is my post/comment was about Moore's law (transistor density for mass produced nodes), which is a major input to but distinct from flops/$. As I mentioned somewhere, there is still some free optimization energy in extracting more flops/$ at the circuit level even if Moore's law ends. Moore's law is very specifically about fab efficiency as measured in transistors/cm^2 for large chip runs - not the flops/$ habryka wanted to bet on. Even when Moore's law is over, I expect some continued progress in flops/$.

All that being said, nvidia's n... (read more)

2Mo Putera
Thanks Jacob. I've been reading the back-and-forth between you and other commenters (not just habryka) in both this post and your brain efficiency writeup, and it's confusing to me why some folks so confidently dismiss energy efficiency considerations with handwavy arguments not backed by BOTECs.  While I have your attention – do you have a view on how far we are from ops/J physical limits? Your analysis suggests we're only 1-2 OOMs away from the ~10^-15 J/op limit, and if I'm not misapplying Koomey's law (2x every 2.5y back in 2015, I'll assume slowdown to 3y doubling by now) this suggests we're only 10-20 years away, which sounds awfully near, albeit incidentally in the ballpark of most AGI timelines (yours, Metaculus etc). 
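
A quick check of that extrapolation, assuming the 3-year doubling time stated in the comment:

```python
import math

# Extrapolation from the parent comment: years to cover 1-2 OOMs of ops/J improvement
# at an assumed 3-year Koomey doubling time.
doubling_time_years = 3
for ooms_remaining in (1, 2):
    doublings = ooms_remaining / math.log10(2)   # ~3.3 doublings per OOM
    print(f"{ooms_remaining} OOM -> ~{doublings * doubling_time_years:.0f} years")  # ~10, ~20
```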

I don't know who first said it, but the popular saying "Computer vision is the inverse of computer graphics" encompasses much of this viewpoint.

Computer graphics is the study/art of the approximation theory you mention and fairly well developed & understood in terms of how to best simulate worlds & observations in real-time from the perspective of an observer. But of course traditional graphics uses human-designed world models and algorithms.

Diffusion models provide a general framework for learning a generative model in the other direction - in pa... (read more)

2Thane Ruthenis
Yeah, I'd looked at computer graphics myself. I expect that field does have some generalizable lessons. Great addition regarding diffusion planning.

Even if there is no acceptable way to share the data semi-anonymously outside of Match Group, the arguments for prediction markets still apply within Match Group. A well-designed prediction market would still be a better way to distribute internal resources and rewards amongst competing data science teams within Match Group.

But I'm skeptical that the value of Match Group's private data is dominant even in the fully private data scenario. People who actually match and meet up with another user will probably have important inside view information inaccessib... (read more)

2SimonM
I used to think things like this, but now I disagree, and actually think it's fairly unlikely this is the case. 1. Internal prediction markets have been tried (and failed) at multiple large organisations who made serious efforts to create them 2. As I've explained in this post, prediction markets are very inefficient at sharing rewards. Internal to a company you are unlikely to have the right incentives in place as much as just subsidising a single team who can share models etc. The added frictions of a market are substantial. 3. The big selling points of prediction markets (imo) come from: 1. Being able to share results without sharing information (ie I can do some research, keep the information secret, but have people benefit from the conclusions) 2. Incentivising a wider range of people. At an organisation, you'd hire the most appropriate people into your data science team and let them run. There's no need to wonder if someone from marketing is going to outperform their algorithm. I strongly agree. I think people often confuse "market" and "prediction market". There is another (arguably better) model of dating apps which is that the market participants are the users, and the site is actually acting as a matching engine. Since I (generally) think markets are great, this also seems pretty great to me.
5mako yass
Yeah, additionally, a prediction market may be a good way to aggregate the predictions of heterogenous prediction models. It's important to note that manifold.love basically still has not been tried; it has only existed for a few months and is still missing some core features. Though I do expect most of its matchmaking strength won't come from its prediction features. Prediction markets could still be used in this space to appoint paid professional matchmakers (who would place bets like, "there will be a second date conditional on a first date", or "if these two read my introductions, 8% probability they'll eventually get married"), and to filter spam or abuse (to send a message, you have to bet that the recipient probably won't hate it). This question can probably be operationalized as "how much richer will CupidBot be than all human matchmakers combined."

Certainly mood disorders like bipolar, depression, and mania can have multiple causes - for example simply doing too much dopaminergic stimulants (cocaine, meth etc) can cause mania directly.

But the modern increased prevalence of mood disorders is best explained by a modern divergence from conditions in the ancestral environment, and sleep disorder due to electric lighting disrupting circadian rhythms is a good fit to the evidence.

The evidence for each of my main points is fairly substantial and now mainstream, the only part which isn't mainstream (yet) is the spe... (read more)

From my own study of mood disorders I generally agree with your valence theory of depression/mania.

However I believe the primary cause (at least for most people today) is disrupted sleep architecture.

To a first order approximation, the brain accumulates batch episodic training data during the day through indexing in the hippocampus (which is similar-ish to upper cortex, but especially adapted to medium term memory & indexing). The brain's main episodic replay training then occurs during sleep, with alternation of several key phases (REM and sever... (read more)

2Steven Byrnes
Very interesting, thanks! I hadn’t thought that before, but now I agree with parts of what you said. One thing is: I think you present some suggestive evidence that REM sleep is an important intervention point that can help mitigate depression / mania. But I think you haven’t presented much evidence that REM sleep abnormalities are (usually) the root cause that led to the depression / mania starting in the first place. Maybe they are, maybe not, I dunno.

Sure, but how often do the colonized end up better off for it, especially via trying to employ clever play-both-sides strategies?

I didn't say the colonized generally ended up better off, but outcomes did vary greatly. Just in the US the Cherokees fared much better than, say, the Susquehannock and Pequot, and if you dig into that history it seems pretty likely that decisions on which colonizer(s) to ally with (British, French, Dutch, later American, etc) were important, even if not "clever play-both-sides strategies" (although I'd be surprised if that wasn't also tried somewhere at least once)

An idea sometimes floated around is to play them off against each other. If they're misaligned from humanity, they're likely mutually misaligned as well. We could put them in game-theoretic situations in which they're incentivized to defect against each other and instead cooperate with humans.

You are arguing against a strawman. The optimistic game-theoretic argument you should focus on is:

Misaligned AIs are - almost by definition - instrumental selfish power seeking agents (with random long term goals) and thus intrinsically misaligned with each other.... (read more)

2Thane Ruthenis
By the way, mind elaborating on which parts you're skeptical of?
3Thane Ruthenis
Mm, I don't think it's a strawman – pretty sure a lot of people hold this specific position. But fair: that was not the strongest argument in the category of arguments I'd aimed to refute. Sure, but how often do the colonized end up better off for it, especially via trying to employ clever play-both-sides strategies?

Of course a massive advance is possible, but mostly just in terms of raw speed. The brain seems reasonably close to Pareto efficiency in intelligence per watt for irreversible computers, but in the next decade or so I expect we'll close that gap as we move into more 'neuromorphic' or PIM computing (computation closer to memory). If we used the ~1e16 W solar energy potential of just the Sahara desert, that would support a population of trillions of brain-scale AIs or uploads running 1000x real-time.
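
To make the order of magnitude explicit - the per-AI power draw below is an assumed placeholder, not a figure from the comment:

```python
# Order-of-magnitude sketch of the Sahara figure above.
# Per-AI power draw is an assumed placeholder, not a number from the comment.
sahara_solar_watts = 1e16

for watts_per_ai in (1e3, 1e4):   # assumed draw for one brain-scale AI at 1000x real-time
    print(f"{watts_per_ai:.0e} W per AI -> ~{sahara_solar_watts / watts_per_ai:.0e} AIs")
# prints ~1e13 and ~1e12, i.e. trillions
```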

especially as our NN can use stuff such as backprop,

The... (read more)

The paper which more directly supports the "made them smarter" claim seems to be this. I did somewhat anticipate this - "not much special about the primate brain other than ..", but was not previously aware of this particular line of research and certainly would not have predicted their claimed outcome as the most likely vs various obvious alternatives. Upvoted for the interesting link.

Specifically I would not have predicted that the graft of human glial cells would have simultaneously both 1.) outcompeted the native mouse glial cells, and 2.) resulted i... (read more)

Suffering, disease and mortality all have a common primary cause - our current substrate dependence. Transcending to a substrate-independent existence (e.g. uploading) also enables living more awesomely. Immortality without transcendence would indeed be impoverished in comparison.

Like, even if they 'inherit our culture' it could be a "Disneyland with no children"

My point was that even assuming our mind children are fully conscious 'moral patients', it's a consolation prize if the future can not help biological humans.

3Oliver Sourbut
It looks like we basically agree on all that, but it pays to be clear (especially because plenty of people seem to disagree). 'Transcending' doesn't imply those nice things though, and those nice things don't imply transcending. Immortality is similarly mostly orthogonal.

The AIs most capable of steering the future will naturally tend to have long planning horizons (low discount rates), and thus will tend to seek power (optionality). But this is just as true of fully aligned agents! In fact the optimal plans of aligned and unaligned agents will probably converge for a while - they will take the same/similar initial steps (this is just a straightforward result of instrumental convergence to empowerment). So we may not be able to distinguish between the two, they both will say and appear to do all the right things. Thus it... (read more)

3Matthew Barnett
I agree with the first part of your AI doom scenario (the part about us adopting AI technologies broadly and incrementally), but this part of the picture seems unrealistic to me. When AIs start to influence culture, it probably won't be a big conspiracy. It won't really be "mysterious" if things start trending away from what most humans want. It will likely just look like how cultural drift generally always looks: scary because it's out of your individual control, but nonetheless largely decentralized, transparent, and driven by pretty banal motives.  AIs probably won't be "out to get us", even if they're unaligned. For example, I don't anticipate them blocking funding for uploading and life extension, although maybe that could happen. I think human influence could simply decline in relative terms even without these dramatic components to the story. We'll simply become "old" and obsolete, and our power will wane as AIs become increasingly autonomous, legally independent, and more adapted to the modern environment than we are. Staying in permanent control of the future seems like a long, hard battle. And it's not clear to me that this is a battle we should even try to fight in the long run. Gradually, humans may eventually lose control—not because of a sudden coup or because of coordinated scheming against the human species—but simply because humans won't be the only relevant minds in the world anymore.
2Oliver Sourbut
Leaving aside s-risks, this could very easily be the emptiest possible future. Like, even if they 'inherit our culture' it could be a "Disneyland with no children" (I happen to think this is more likely than not but with huge uncertainty). ---------------------------------------- Separately, this anti-deathist vibe has always struck me as very impoverished and somewhat uninspiring. The point should be to live, awesomely! which includes alleviating suffering and disease, and perhaps death. But it also ought to include a lot more positive creation and interaction and contemplation and excitement etc.!
4Oliver Sourbut
This is a minor fallacy - if you're aligned, powerseeking can be suboptimal if it causes friction/conflict. Deception bites, obviously, making the difference less.
3Mo Putera
In other words slow multipolar failure. Critch might point out that the disanalogy in "AI won't need to kill humans just as the US doesn't need to kill the sentinelese" lies in how AIs can have much wider survival thresholds than humans, leading to (quoting him)

But on your model, what is the universal learning machine learning, at runtime? ..

On my model, one of the things it is learning is cognitive algorithms. And different classes of training setups + scale + training data result in it learning different cognitive algorithms; algorithms that can implement qualitatively different functionality.

Yes.

And my claim is that some setups let the learning system learn a (holistic) general-intelligence algorithm.

I consider a ULM to already encompass general/universal intelligence in the sense that a properly sca... (read more)

2Thane Ruthenis
Thanks for detailed answers, that's been quite illuminating! I still disagree, but I see the alternate perspective much clearer now, and what would look like notable evidence for/against it.

My argument for the sharp discontinuity routes through the binary nature of general intelligence + an agency overhang, both of which could be hypothesized via non-evolution-based routes. Considerations about brain efficiency or Moore's law don't enter into it.

You claim later to agree with ULM (learning from scratch) over evolved-modularity, but the paragraph above and statements like this in your link:

The homo sapiens sapiens spent thousands of years hunter-gathering before starting up civilization, even after achieving modern brain size.

It would s

... (read more)
  • not only is there nothing special about the human brain architecture, there is not much special about the primate brain other than hyperparameters better suited to scaling up to our size

I don't think this is entirely true. Injecting human glial cells into mice made them smarter. Certainly that doesn't provide evidence for any sort of exponential difference, and you could argue it's still just hyperparams, but it's hyperparams that work better small too. I think we should be expecting sublinear growth in quality of the simple algorithms but should also ... (read more)

1RussellThor
I agree with this. However, how do you know that a massive advance isn't still possible, especially as our NN can use stuff such as backprop, potentially quantum algorithms to train weights and other potential advances that simply aren't possible for nature to use? Say we figure out the brain learning algorithm, get AGI, then quickly get something that uses the best of both nature and tech stuff not accessible to nature.
5Thane Ruthenis
We still seem to have some disconnect on the basic terminology. The brain is a universal learning machine, okay. The learning algorithms that govern it and its architecture are simple, okay, and the genome specifies only them. On our end, we can similarly implement the AGI-complete learning algorithms and architectures with relative ease, and they'd be pretty simple. Sure. I was holding the same views from the beginning. But on your model, what is the universal learning machine learning, at runtime? Look-up tables? On my model, one of the things it is learning is cognitive algorithms. And different classes of training setups + scale + training data result in it learning different cognitive algorithms; algorithms that can implement qualitatively different functionality. Scale is part of it: larger-scale brains have the room to learn different, more sophisticated algorithms. And my claim is that some setups let the learning system learn a (holistic) general-intelligence algorithm. You seem to consider the very idea of "algorithms" or "architectures" mattering silly. But what happens when a human groks how to do basic addition, then? They go around memorizing what sum each set of numbers maps to, and we're more powerful than animals because we can memorize more numbers? Shrug, okay, so let's say evolution had to hit upon the Mammalia brain architecture. Would you agree with that? Or we can expand further. Is there any taxon X for which you'd agree that "evolution had to hit upon the X brain architecture before raw scaling would've let it produce a generally intelligent species"?

No, and that's a reasonable ask.

To a first approximation my futurism is time acceleration; so the risks are the typical risks sans AI, but the timescale is hyperexponential à la Roodman. Even a more gradual takeoff would imply more risk to global stability on faster timescales than anything we've experienced in history; the wrong AGI race winners could create various dystopias.

Yes, but it's because the things you've outlined seem mostly irrelevant to AGI Omnicide Risk to me? It's not how I delineate the relevant parts of the classical view, and it's not what's been centrally targeted by the novel theories

They are critically relevant. From your own linked post (how I delineate):

We only have one shot. There will be a sharp discontinuity in capabilities once we get to AGI, and attempts to iterate on alignment will fail. Either we get AGI right on the first try, or we die.

If takeoff is slow (1) because brains are highly ... (read more)

7Thane Ruthenis
My argument for the sharp discontinuity routes through the binary nature of general intelligence + an agency overhang, both of which could be hypothesized via non-evolution-based routes. Considerations about brain efficiency or Moore's law don't enter into it. Brains are very different architectures compared to our computers, in any case, they implement computations in very different ways. They could be maximally efficient relative to their architectures, but so what? It's not at all obvious that FLOPS estimates of brainpower are highly relevant to predicting when our models would hit AGI, any more than the brain's wattage is relevant. They're only soundly relevant if you're taking the hard "only compute matters, algorithms don't" position, which I reject. I think both are load-bearing, in a fairly obvious manner, and that which specific mixture is responsible matters comparatively little. * Architecture obviously matters. You wouldn't get LLM performance out of a fully-connected neural network, certainly not at realistically implementable scales. Even more trivially, you wouldn't get LLM performance out of an architecture that takes in the input, discards it, spends 10^25 FLOPS generating random numbers, then outputs one of them. It matters how your system learns. * So evolution did need to hit upon, say, the primate architecture, in order to get to general intelligence. * Training data obviously matters. Trivially, if you train your system on randomly-generated data, it's not going to learn any useful algorithms, no matter how sophisticated its architecture is. More realistically, without the exposure to chemical experiments, or any data that hints at chemistry in any way, it's not going to learn how to do chemistry. * Similarly, a human not exposed to stimuli that would let them learn the general-intelligence algorithms isn't going to learn them. You'd brought up feral children before, and I agree it's a relevant data point. So, yes, there would be n
8the gears to ascension
There probably really is a series of core of generality insights in the difference between general mammal brain scaled to human size -> general primate brain scaled to human size -> actual human brain. Also, much of what matters is learned from culture. Both can be true at once. But more to the point, I think you're jumping to conclusions about what OP thinks. They haven't said anything that sounds like EMH nonsense to me. Modularity is generated by runtime learning, and mechinterp studies it; there's plenty of reason to think there might be ways to increase it, as you know. And that doesn't even touch on the question of what training data.

Said pushback is based on empirical studies of how the most powerful AIs at our disposal currently work, and is supported by fairly convincing theoretical basis of its own. By comparison, the "canonical" takes are almost purely theoretical.

You aren't really engaging with the evidence against the purely theoretical canonical/classical AI risk take. The 'canonical' AI risk argument is implicitly based on a set of interdependent assumptions/predictions about the nature of future AI:

  1. fast takeoff is more likely than slow, downstream dependent on some comb
... (read more)
6Seth Herd
All of the points you listed make AGI risk worse, but none are necessary to have major concerns about it. That's why they didn't appear in the post's summary of AGI x-risk logic. I think this is a common and dangerous misconception. The original AGI x-risk story was wrong in many places. But that does not mean x-risk isn't real.

You aren't really engaging with the evidence against the purely theoretical canonical/classical AI risk take

Yes, but it's because the things you've outlined seem mostly irrelevant to AGI Omnicide Risk to me? It's not how I delineate the relevant parts of the classical view, and it's not what's been centrally targeted by the novel theories. The novel theories' main claims are that powerful cognitive systems aren't necessarily (isomorphic to) utility-maximizers, that shards (i. e., context-activated heuristics) reign supreme and value reflection can't arbitr... (read more)

6[anonymous]
Do you have a post or blog post on the risks we do need to worry about?

[Scaling law theories]

I'm not aware of these -- do you have any references?

Sure: here are a few: quantization model, scaling laws from the data manifold, and a statistical model.

True but misleading? Isn't the brain's "architectural prior" a heckuva lot more complex than the things used in DL?

The full specification of the DL system includes the microcode, OS, etc. Likewise much of the brain complexity is in the smaller 'oldbrain' structures that are the equivalent of a base robot OS. The architectural prior I speak of is the complexity on top of that,... (read more)

Sure - I'm not saying no improvement is possible. I expect that the enhancements from adult gene editing should encompass most all of the brain tweaks you can get from drugs/diet. But those interventions will not convert an average brain into an Einstein.

The brain - or more specifically the brains of very intelligent people - are already very efficient, so I'm also just skeptical in general that there are many remaining small tweaks that take you past the current "very intelligent". Biological brains beyond the human limit are of course possible, but pr... (read more)

ANNs and BNNs operate on the same core principles; the scaling laws apply to both and IQ in either is mostly a function of net effective training compute and data quality.

How do you know this?

From study of DL and neuroscience of course. I've also written on this for LW in some reasonably well known posts: starting with The Brain as a Universal Learning Machine, and continuing in Brain Efficiency and AI Timelines - specifically see the Cultural Scaling Criticality section on the source of human intelligence, or the DL section of simboxes. Or you co... (read more)

1kman
I'm not aware of these -- do you have any references? True but misleading? Isn't the brain's "architectural prior" a heckuva lot more complex than the things used in DL? Sure. The big crux here is whether plasticity of stuff which is normally "locked down" in adulthood is needed to significantly increase "fluid intelligence" (by which I mean, something like, whatever allows people to invent useful new concepts and find ingenious applications of existing concepts). I'm not convinced these DL analogies are useful -- what properties do brains and deepnets share that renders the analogies useful here? DL is a pretty specific thing, so by default I'd strongly expect brains to differ in important ways. E.g. what if the structures whose shapes determine the strength of fluid intelligence aren't actually "locked down", but reach a genetically-influenced equilibrium by adulthood, and changing the genes changes the equilibrium? E.g. what if working memory capacity is limited by the noisiness of neural transmission, and we can reduce the noisiness through gene edits? FOOM isn't necessary for doom -- the convergent endpoint is that you have dangerously capable minds around: minds which can think much faster and figure out things we can't. FOOM is one way to get there.

We can roughly bin brain tissue into 3 developmental states:

  1. juvenile: macro structure formation - brain expanding, neural tissue morphogenesis, migration, etc

  2. maturing: micro synaptic structure formation, irreversible pruning and myelination

  3. mature: fully myelinated, limited remaining plasticity

Maturation proceeds inside out with the regions closest to the world (lower sensory/motor) maturing first, proceeding up the processing hierarchy, and ending with maturation of the highest levels (some prefrontal, associative etc) around age ~20.

The human ... (read more)

Current AI is less sample efficient, but that is mostly irrelevant as the effective speed is 1000x to 10000x greater.

By the time current human infants finish ~30 year biological training, we'll be long past AGI and approaching singularity (in hyperexponential models).

heritability of IQ increases with age (up until age 20, at least)

Straightforward result of how the brain learns. Cortical/cerebellar modules start out empty and mature inwards out - starting with the lowest sensory/motor levels closest to the world and proceeding up the hierarchy, ending with the highest/deepest modules like prefrontal and associative cortex. Maturation is physically irreversible as it involves pruning most long-range connections and myelinating & strengthening the select few survivors. Your intelligence potential is constrained pr... (read more)

It would matter in a world without AI, but that is not the world we live in. Yes if you condition on some indefinite AI pause or something then perhaps, but that seems extremely unlikely. It takes about 30 years to train a new brain - so the next generation of humans won't reach their prime until around the singularity, long after AGI.

Though I do agree that a person with the genes of a genius for 2 years

Most genius is determined prenatally and during 'training' when cortex/cerebellum modules irreversibly mature, just as the capabilities of GPT4 are determined by the initial code and the training run.

2Garrett Baker
I think I agree with everything that you said except that it won’t matter. It seems like it’d very much matter if in the most successful case we make people learn skills, facts, and new fields 6x faster. Maybe you think 6x is too high, and its more like 1.05x, enough to notice a difference over 30 years, but not over 2-5.

Too slow to matter now due to the slow speed of neurons and bio learning combined with where we are in AI.

It does not. Despite the title of that section it is focused on adult expression factors. The post in general lacks a realistic mechanistic model of how tweaking genes affects intelligence.

genes are likely to have an effect if edited in adults: the size of the effect of a given gene at any given time is likely proportional to its level of expression

Is similar to expecting that a tweak to the hyperparams (learning rate) etc of trained GPT4 can boost its IQ (yes LLMs have their IQ or g factor). Most all variables that affect adult/trained performance do... (read more)

4kman
We thought a bit about this, though it didn't make the post. Agree that it increases the chance of the editing having a big effect.
2jmh
Maybe it's the lack of sleep for me but is "Brains are too slow." referring to something like growth/formation of structures that support some level of intelligence, or to the human brain's just being slower than an AGI?

ANNs and BNNs operate on the same core principles; the scaling laws apply to both and IQ in either is mostly a function of net effective training compute and data quality. Genes determine a brain's architectural prior just as a small amount of python code determines an ANN's architectural prior, but the capabilities come only from scaling with compute and data (quantity and quality).

So you absolutely can not take datasets of gene-IQ correlations and assume those correlations would somehow transfer to gene interventions on adults (post training in DL lingo... (read more)
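
To illustrate the "small amount of python code" point - a standard rough parameter-count approximation, with example configurations only loosely inspired by published GPT-2/GPT-3 scales:

```python
# The "architectural prior" of a decoder-only transformer is a handful of numbers;
# nearly all of the description length lives in the trained weights, not the code.
def approx_param_count(n_layers, d_model, vocab=50_000):
    return 12 * n_layers * d_model ** 2 + vocab * d_model   # standard rough approximation

for n_layers, d_model in [(12, 768), (96, 12288)]:          # small vs large config
    print(f"{n_layers} layers, d_model={d_model}: ~{approx_param_count(n_layers, d_model):.1e} params")
# same few lines of code, ~1.2e8 vs ~1.7e11 parameters purely from scaling the dials
```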

5rotatingpaguro
Couldn't there be genetic effects on things that can improve the brain even once its NN structure is mostly fixed? Maybe it's possible to have neurons work faster, or for the brain to wear less with abstract thinking, or to need less sleep. This kind of thing is not a full intelligence improvement because it does not allow you to notice more patterns or to think with new schemes. So maybe yes, it won't make a difference for AI timelines, though it would still be a very big deal.
9kman
How do you know this? In comparing human brains to DL, training seems more analogous to natural selection than to brain development. Much simpler "architectural prior", vastly more compute and data. We're really uncertain about how much would transfer! It would probably affect some aspects of intelligence more than others, and I'm afraid it might just not work at all if g is determined by the shape of structures that are ~fixed in adults (e.g. long range white matter connectome). But it's plausible to me that the more plastic local structures and the properties of individual neurons matter a lot for at least some aspects of intelligence (e.g. see this). There's a lot more to intelligence than learning. Combinatorial search, unrolling the consequences of your beliefs, noticing things, forming new abstractions. One might consider forming new abstractions as an important part of learning, which it is, but it seems possible to come up with new abstractions 'on the spot' in a way that doesn't obviously depend on plasticity that much; plasticity would more determine whether the new ideas 'stick'. I'm bottlenecked by the ability to find new abstractions that usefully simplify reality, not having them stick when I find them. My model is there's this thing lurking in the distance, I'm not sure how far out: dangerously capable AI (call it DCAI). If our current civilization manages to cough up one of those, we're all dead, essentially by definition (if DCAI doesn't kill everyone, it's because technical alignment was solved, which our current civilization looks very unlikely to accomplish). We look to be on a trajectory to cough one of those up, but It isn't at all obvious to me that it's just around the corner: so stuff like this seems worth trying, since humans qualitatively smarter than any current humans might have a shot at thinking of a way out that we didn't think of (or just having the mental horsepower to quickly get working something we have thought of, e.g. getti
1bbartlog
I am not so sure about that. I am thinking back to the Minnesota Twin Study here, and the related fact that heritability of IQ increases with age (up until age 20, at least). Now, it might be that we're just not great at measuring childhood IQ, or that childhood IQ and adult IQ are two subtly different things.  But it certainly looks as if there's factors related to adult brain plasticity, motivation (curiosity, love of reading, something) that continue to affect IQ development at least until the age of 18. 
1quetzal_rainbow
Is it true? We need to pour lifetimes of information to get moderate expertise-level performance in SOTA models. I have no significant doubt that we can overcome this via scaling, but with correction for available compute, brains seem to be decent learners. In addition, I would say that there is a difference between learning capability and eliciting it: current models seem to be very sensitive to prompts, wrappings and other conditions. It's possible that intelligence gains can come from easier eliciting of already learned capabilities but are blocked by, say, social RLHF.
4Garrett Baker
Why think the brain learns too slowly? If I can boost my sample efficiency I can learn new subjects quicker, remember more facts, and do better thought-result attribution. All these seem short-term beneficial. Unless you think there are major critical period barriers here, these all seem likely results. Though I do agree that a person with the genes of a genius for 2 years will be far less a genius than a person with the genes of a genius for 25 years. It seems a stretch to say the first change can be rounded off as not mattering, though.
6GeneSmith
So I agree with your general point that genetic interventions made in adults would have a lesser effect than those same interventions made in embryos, which is why our model assumes that the average genetic change would have just half the normal effect. The exact relative size of edits made in the adult brain vs an embryo IS one of the major unknown factors in this project but like... if brain size were the only thing affecting intelligence we'd expect a near perfect correlation between it and intelligence. But that's not what we see. Brain size only correlates with intelligence at 0.3-0.4. So there's obviously a lot more going on. It's not post-training. Brains are constantly evolving and adapting throughout the lifespan. If this was actually the case then none of the stuff people are doing in AI safety or anything else would matter. That's clearly not true.
5Zach Stein-Perlman
See Would edits to the adult brain even do anything?. (Not endorsing the post or that section, just noticing that it seems relevant to your complaint.)

If it only requires a simple hack to existing public SOTA, many others will have already thought of said hack and you won't have any additional edge. Taboo superintelligence and think through more specifically what is actually required to outcompete the rest of the world.

Progress in DL is completely smooth as it is driven mostly by hardware and an enormous number of compute-dependent small innovations (yes, transformers were a small innovation on top of contemporary alternatives such as memory networks, NTMs, etc., and quite predictable in advance).

3Lucius Bushnaq
I don't recall assuming the edge to be unique? That seems like an unneeded condition for Tamsin's argument, it's enough to believe the field consensus isn't completely efficient by default and all relevant actors are sure of all currently deducible edges at all times. Right, if you think it's completely smooth and thus basically not meaningfully influenced by the actions of individual researchers whatsoever, I see why you would not buy Tamsin's argument here. But then the reason you don't buy it would seem to me to be that you think meaningful new ideas in ML capability research basically don't exist, not because you think there is some symmetric argument to Tamsin's for people to stay quiet about new alignment research ideas.

It's easy to say something is "not that hard", but ridiculous to claim that when the something is building an AI that takes over the world. The hard part is building something more intelligent/capable than humanity, not anything else conditioned on that first step.

6Lucius Bushnaq
I don't see why this would be ridiculous. To me, e.g. "Superintelligence only requires [hacky change to current public SOTA] to achieve with expected 2025 hardware, and OpenAI may or may not have realised that already" seems like a perfectly coherent way the world could be, and is plenty of reason for anyone who suspects such a thing to keep their mouth shut about gears-level models of [] that might be relevant for judging how hard and mysterious the remaining obstacles to superintelligence actually are.

Robust to the "trusting trust" problem (i.e. the issue of "how do you know that the source code you received is what the other agent is actually running").

This is the crux really, and I'm surprised that many LWers seem to believe the 'robust cooperation' research actually works sans a practical solution to 'trusting trust' (which I suspect doesn't actually exist), but in that sense it's in good company (diamondoid nanotech, rapid takeoff, etc)

It's not that hard to build an AI that kills everyone: you just need to solve [some problems] and combine the solutions. Considering how easy it is compared to what you thought, you should increase your P(doom) / shorten your timelines.

It's not that hard to build an AI that saves everyone: you just need to solve [some problems] and combine the solutions. Considering how easy it is compared to what you thought, you should decrease your P(doom) / shorten your timelines.

They do a value-handshake and kill everyone together.

Any two AIs unaligned with hum... (read more)

8Lucius Bushnaq
I'm not sure what you're saying here exactly. It seems to me like you're pointing to a symmetric argument favoring low doom, but if someone had an idea for how to do AI alignment right, why wouldn't they just talk about it? Doesn't seem symmetrical to me.

Corporations only exist within a legal governance infrastructure that permits incorporation and shapes externalities into internalities. Without such infrastructure you have warring tribes/gangs, not corporations.

The ways in which this shareholder value maximization has already seriously damaged the world and compromised the quality of human life are myriad and easily observable: pollution, climate change, and other such externalities. Companies' disregard for human suffering further enhances this comparison.

This is the naive leftist/marxist take. In... (read more)

1sudo-nym
Skepticism about the alignment of government and the incentives thereof has existed for almost as long as governments have. Elections, for example, are a crude but better-than-nothing attempt to align political interests with public interests, and much ink has been spilled on the subject of improving this alignment and even whether alignment to the general public opinion is a good idea. Far less such discussion has occurred in the case of extremely large companies, as they are a relatively newer concept.

Surely they must mean something like extropy/complexity? Maximizing disorder doesn't seem to fit their vibe.

Yes, but: whales and elephants have brains several times the size of humans, and they're yet to build an industrial civilization.

Size/capacity isn't all, but in terms of the capacity which actually matters (synaptic count, and upper cortical neuron count) - from what I recall elephants are at great ape cortical capacity, not human capacity. A few specific species of whales may be at or above human cortical neuron capacity but synaptic density was still somewhat unresolved last I looked.

Then, once a certain compute threshold was reached, it took a sha

... (read more)

Indeed, that's basically what happened in the human case: the distributed optimization process of evolution searched over training architectures, and eventually stumbled upon one that was able to bootstrap itself into taking off.

Actually I think the evidence is fairly conclusive that the human brain is a standard primate brain with the only changes being merely a few compute scale dials turned up (the number of distinct gene changes is tiny - something like 12 from what I recall). There is really nothing special about the human brain other than 1.) 3x l... (read more)

4Thane Ruthenis
Yes, but: whales and elephants have brains several times the size of humans, and they're yet to build an industrial civilization. I agree that hitting upon the right architecture isn't sufficient, you also need to scale it up – but scale alone doesn't suffice either. You need a combination of scale, and an architecture + training process that would actually transmute the greater scale into more powerful cognitive algorithms. Evolution stumbled upon the human/primate template brain. One of the forks of that template somehow "took off" in the sense of starting to furiously select for larger brain size. Then, once a certain compute threshold was reached, it took a sharp left turn and started a civilization. The ML-paradigm analogue would, likewise, involve researchers stumbling upon an architecture that works well at some small scales and has good returns on compute. They'll then scale it up as far as it'd go, as they're wont to. The result of that training run would spit out an AGI, not a mere bundle of sophisticated heuristics. And we have no guarantees that the practical capabilities of that AGI would be human-level, as opposed to vastly superhuman. (Or vastly subhuman. But if the maximum-scale training run produces a vastly subhuman AGI, the researchers would presumably go back to the drawing board, and tinker with the architectures until they selected for algorithms with better returns on intelligence per FLOPS. There's likewise no guarantees that this higher-level selection process would somehow result in an AGI of around human level, rather than vastly overshooting it the first time they properly scale it up.)

I'd say this still applies even to non-LLM architectures like RL, which is the important part, but Jacob Cannell and 1a3orn will have to clarify.

We've basically known how to create AGI for at least a decade. AIXI outlines the 3 main components: a predictive world model, a planning engine, and a critic. The brain also clearly has these 3 main components, and even somewhat cleanly separated into modules - that's been clear for a while.

Transformer LLMs are pretty much exactly the type of generic minimal ULM arch I was pointing at in that post (I obvious... (read more)
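
A toy skeleton of that three-component decomposition (world model / planning engine / critic) - purely illustrative: AIXI itself is uncomputable, and real systems learn each piece rather than hard-coding it as below.

```python
# Toy skeleton of the world-model / planner / critic decomposition. Illustrative only.
import random

class WorldModel:
    def predict(self, state, action):               # stand-in for p(next_state | state, action)
        return state + action + random.gauss(0, 0.1)

class Critic:
    def value(self, state):                          # estimated long-run value of a state
        return -abs(state - 10.0)                    # toy objective: keep the state near 10

class Planner:
    def __init__(self, model, critic, actions=(-1, 0, 1), horizon=5, rollouts=30):
        self.model, self.critic = model, critic
        self.actions, self.horizon, self.rollouts = actions, horizon, rollouts

    def plan(self, state):                           # first action of the best sampled rollout
        best_action, best_score = 0, float("-inf")
        for _ in range(self.rollouts):
            plan = [random.choice(self.actions) for _ in range(self.horizon)]
            s = state
            for a in plan:
                s = self.model.predict(s, a)         # roll the world model forward
            score = self.critic.value(s)             # critic scores the imagined outcome
            if score > best_score:
                best_action, best_score = plan[0], score
        return best_action

agent = Planner(WorldModel(), Critic())
state = 0.0
for _ in range(15):                                  # model doubles as a toy environment here
    state = agent.model.predict(state, agent.plan(state))
print(f"final state ~{state:.1f}")                   # drifts toward the critic's preferred region
```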

I don't view ASI as substantially different than an upload economy.

I'm very confused about why you think that.

You ignored most of my explanation so I'll reiterate a bit differently. But first taboo the ASI fantasy.

  • any good post-AGI future is one with uploading - humans will want this
  • uploads will be very similar to AI, and become moreso as they transcend
  • the resulting upload economy is one of many agents with different values
  • the organizational structure of any pareto optimal multi-agent system is necessarily market-like
  • it is a provable fact th
... (read more)