Steven Byrnes

I'm an AGI safety / AI alignment researcher in Boston with a particular focus on brain algorithms. Research Fellow at Astera. See https://sjbyrnes.com/agi.html for a summary of my research and sorted list of writing. Physicist by training. Email: steven.byrnes@gmail.com. Leave me anonymous feedback here. I’m also at: RSS feed, X/Twitter, Bluesky, LinkedIn, and more at my website.

Sequences

Intuitive Self-Models

Valence

Intro to Brain-Like-AGI Safety

Wikitag Contributions

Wanting vs Liking

(+139/-26)

Waluigi Effect

(+2087)

Comments

Recent AI model progress feels mostly like bullshit

Steven Byrnes14h20

But isn’t this exactly the OPs point?

Yup, I expected that OP would generally agree with my comment.

First off, you just posted them online

They only posted three questions, out of at least 62 (=1/(.2258-.2097)), perhaps many more than 62. For all I know, they removed those three from the pool when they shared them. That’s what I would do—probably some human will publicly post the answers soon enough. I dunno. But even if they didn’t remove those three questions from the pool, it’s a small fraction of the total.
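To spell out the arithmetic behind that 62, here is a minimal sketch. It assumes every question is weighted equally, so that each reported score is a whole number of correct answers divided by the same N; that assumption is mine, not something the benchmark states. `min_questions` is just an illustrative helper name.

```python
# Sketch: lower bound on benchmark size from the gap between two reported scores.
# Assumption (not stated by the benchmark): every question carries equal weight,
# so each score is k/N and two different scores differ by at least 1/N.

def min_questions(score_a: float, score_b: float) -> float:
    """Smallest N consistent with two distinct scores under equal weighting."""
    gap = abs(score_a - score_b)
    if gap == 0:
        raise ValueError("identical scores give no bound")
    return 1.0 / gap

bound = min_questions(0.2258, 0.2097)
print(round(bound, 1))  # a little over 62

# Sanity check: 14/62 and 13/62 round to exactly the two reported scores.
print(round(14 / 62, 4), round(13 / 62, 4))
```

Under that assumption the bound is tight at N = 62, although any multiple of 62 would fit the two scores equally well.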

You point out that all the questions would be in the LLM company user data, after Kagi has run the benchmark once (unless Kagi changes out all their questions each time, which I don’t think they do, although they do replace easier questions with harder questions periodically). Well:

If an LLM company is training on user data, they’ll get the questions without the answers, which probably wouldn’t make any appreciable difference to the LLM’s ability to answer them;
If an LLM company is sending user data to humans as part of RLHF or SFT or whatever, then yes there’s a chance for ground truth answers to sneak in that way—but that’s extremely unlikely to happen, because companies can only afford to send an extraordinarily small fraction of user data to actual humans.

Straightforward Steps to Marginally Improve Odds of Whole Brain Emulation

Steven Byrnes2d40

What I'm not clear on is how those two numbers (20,000 genes and a few thousand neuron types) specifically relate to each other in your model of brain functioning.

Start with 25,000 genes, but then reduce it a bunch because they also have to build hair follicles and the Golgi apparatus and on and on. But then increase it a bit too because each gene has more than one design degree of freedom, e.g. a protein can have multiple active sites, and there’s some ability to tweak which molecules can and cannot reach those active sites and how fast etc. Stuff like that.

Putting those two factors together, I dunno, I figure it’s reasonable to guess that the genome can have a recipe for low-thousands of distinct neuron types, each with its own evolutionarily-designed properties and each playing a specific evolutionarily-designed role in the brain algorithm.

And that “low thousands” number is ballpark consistent with the slide-seq thing, and also ballpark consistent with what you get by counting the number of neuron types in a random hypothalamus nucleus and extrapolating. High hundreds, low thousands, I dunno, I’m treating it as a pretty rough estimate.

Hmm, I guess when I think about it, the slide-seq number and the extrapolation number are probably more informative than the genome number. Like, can I really rule out “tens of thousands” just based on the genome size? Umm, not with extreme confidence, I’d have to think about it. But the genome size is at least a good “sanity check” on the other two methods.

Is the idea that each neuron type roughly corresponds to the expression of one or two specific genes, and thus you'd expect <20,000 neuron types?

No, I wouldn’t necessarily expect something so 1-to-1. Just the general information theory argument. If you have N “design degrees of freedom” and you’re trying to build >>N specific machines that each do a specific thing, then you get stuck on the issue of crosstalk.

For example, suppose that some SNP changes which molecules can get to the active site of some protein. It makes Purkinje cells more active, but also increases the ratio of striatal matrix cells to striosomes, and also makes auditory cortex neurons more sensitive to oxytocin. Now suppose there’s very strong evolutionary pressure for Purkinje cells to be more active. Then maybe that SNP is going to spread through the population. But it’s going to have detrimental side-effects on the striatum and auditory cortex. Ah, but that’s OK, because there’s a different mutation to a different gene which fixes the now-suboptimal striatum, and yet a third mutation that fixes the auditory cortex. Oops, but those two mutations have yet other side-effects on the medulla and … Etc. etc.

…Anyway, if that’s what’s going on, that can be fine! Evolution can sort out this whole system over time, even with crazy side-effects everywhere. But only as long as there are enough “design degrees of freedom” to actually fix all these problems simultaneously. There do have to be more “design degrees of freedom” in the biology / genome than there are constraints / features in the engineering specification, if you want to build a machine that actually works. There doesn’t have to be a 1-to-1 match between design-degrees-of-freedom and items on your engineering blueprint, but you do need that inequality to hold. See what I mean?
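One way to make the counting argument concrete is a toy linear model, sketched below. The assumptions are loud ones: each "knob" (design degree of freedom) shifts several traits at once by fixed amounts, effects add up linearly, and evolution "succeeds" if some knob setting hits every target exactly. Real genetics is not linear, and `solvable` is a hypothetical helper, so treat this as an illustration of the inequality and not as a model of the genome.

```python
from fractions import Fraction

def solvable(effects, targets):
    """Return True if some setting of the knobs hits every target exactly.

    effects[i][j] = how much knob j shifts trait i (a toy linear crosstalk matrix).
    targets[i]    = the shift that trait i needs.
    """
    # Build the augmented matrix [effects | targets] with exact arithmetic.
    rows = [[Fraction(v) for v in row] + [Fraction(t)]
            for row, t in zip(effects, targets)]
    n_knobs = len(effects[0])
    pivot_row = 0
    for col in range(n_knobs):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Eliminate this column from every other row (Gauss-Jordan).
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col] / rows[pivot_row][col]
                rows[r] = [a - factor * b
                           for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # The system is inconsistent if any row reduced to "0 = nonzero".
    return all(any(v != 0 for v in row[:-1]) or row[-1] == 0 for row in rows)

# Three knobs, two traits: side-effects everywhere, but enough freedom to fix them.
enough_knobs = solvable([[1, 1, 0], [1, 0, 1]], [1, 0])

# Two knobs, three traits: fixing the first two traits forces the third one wrong.
too_few_knobs = solvable([[1, 0], [0, 1], [1, 1]], [1, 0, 0])

print(enough_knobs, too_few_knobs)
```

In this toy setting, no 1-to-1 match between knobs and traits is needed, but once the traits outnumber the knobs, a generic set of targets cannot all be met at once.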

Interestingly, the genome does do this! Protocadherins in vertebrates and DSCAM1 are expressed in exactly this way, and it's thought to help neurons to distinguish themselves from other neurons…
Of course in an emulation you could probably just tell the neurons to not interact with themselves

Cool example, thanks! Yeah, that last part is what I would have said. :)

Tabula Bio: towards a future free of disease (& looking for collaborators)

Steven Byrnes2d*30

My take on missing heritability is summed up in Heritability: Five Battles, especially §4.3-4.4. Mental health and personality have way more missing heritability than things like height and blood pressure. I think for things like height and blood pressure etc., you’re limited by sample sizes and noise, and by SNP arrays not capturing things like copy number variation. Harris et al. 2024 says that there exist methods to extract CNVs from SNP data, but that they’re not widely used in practice today. My vote would be to try things like that, to try to squeeze a bit more predictive power in the cases like height and blood pressure where the predictors are already pretty good.

On the other hand, for mental health and personality, there’s way more missing heritability, and I think the explanation is non-additivity. I humbly suggest my §4.3.3 model as a good way to think about what’s going on.

If I were to make one concrete research suggestion, it would be: try a model where there are 2 (or 3 or whatever) latent schizophrenia subtypes. So then your modeling task is to jointly (1) assign each schizophrenic patient to one of the 2 (or 3 or whatever) latent subtypes, and (2) make a simple linear SNP predictor for each subtype. I’m not sure if anyone has tried this already, and I don’t personally know how to solve that joint optimization problem, but it seems like the kind of problem that a statistics-savvy person or team should be able to solve.
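For concreteness, here is a minimal sketch of one standard way to attack that kind of joint problem: hard-assignment EM, alternating between (1) assigning each case to a subtype and (2) refitting each subtype. Everything specific is an assumption of mine. I swap in a simple generative model (each subtype's cases and the controls are clouds with shared spherical noise, LDA-style), under which the per-subtype case-vs-control log-odds happens to be linear in the SNPs; a real analysis would more likely use logistic regression, soft assignments, and many restarts. `fit_subtypes` and `score` are hypothetical names, and the data is made up.

```python
import random

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(cases, centroids):
    """Step 1: give each case the label of its closest subtype centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))
            for x in cases]

def fit_subtypes(cases, controls, k=2, iters=20, seed=0):
    """Hard-EM: alternate subtype assignment and per-subtype refitting."""
    rng = random.Random(seed)
    centroids = rng.sample(cases, k)  # initialise from k of the cases
    for _ in range(iters):
        labels = assign(cases, centroids)
        # Step 2: refit each subtype; keep the old centroid if a subtype is empty.
        for j in range(k):
            members = [x for x, lab in zip(cases, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    labels = assign(cases, centroids)
    # Linear predictor per subtype: score(x) = w . x + b, the subtype-vs-control
    # log-odds under the shared-spherical-noise assumption (up to a 1/sigma^2 scale).
    control_mean = mean(controls)
    predictors = []
    for c in centroids:
        w = [cj - mj for cj, mj in zip(c, control_mean)]
        b = -(sum(v * v for v in c) - sum(v * v for v in control_mean)) / 2
        predictors.append((w, b))
    return labels, predictors

def score(x, predictor):
    w, b = predictor
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Made-up allele counts at 4 SNPs: two hidden subtypes driven by different SNPs.
cases = [[2, 2, 0, 0], [2, 1, 0, 0], [1, 2, 0, 0],
         [0, 0, 2, 2], [0, 0, 1, 2], [0, 0, 2, 1]]
controls = [[0, 0, 0, 0], [1, 0, 0, 1], [0, 1, 1, 0], [0, 0, 0, 0]]
labels, predictors = fit_subtypes(cases, controls, k=2)
print(labels)
```

Like any hard-EM scheme, this only finds a local optimum, and whether real cohorts have subtypes separable this cleanly is exactly the open empirical question.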

I do definitely think there are multiple disjoint root causes for schizophrenia, as evidenced for example by the fact that some people get the positive symptoms without the cognitive symptoms, IIUC. I have opinions (1,2) about exactly what those disjoint root causes are, but maybe that’s not worth getting into here. Ditto with autism having multiple disjoint root causes—for example, I have a kid who got an autism diagnosis despite having no sensory sensitivities, i.e. the most central symptom of autism!! Ditto with extroversion, neuroticism, etc. having multiple disjoint root causes, IMO.

Good luck! :)

Straightforward Steps to Marginally Improve Odds of Whole Brain Emulation

Steven Byrnes3d20

As for the philosophical objections, it is more that whatever wakes up won't be me if we do it your way. It might act like me and know everything I know but it seems like I would be dead and something else would exist.

Ah, but how do you know that the person that went to bed last night wasn’t a different person, who died, and you are the “something else” that woke up with all of that person’s memories? And then you’ll die tonight, and tomorrow morning there will be a new person who acts like you and knows everything you know but “you would be dead and something else would exist”?

…It’s fine if you don’t want to keep talking about this. I just couldn’t resist. :-P

If you have a good theory of what all those components are individually you would still be able to predict something like voltage between two arbitrary points.

I agree that, if you have a full SPICE transistor model, you’ll be able to model any arbitrary crazy configuration of transistors. If you treat a transistor as a cartoon switch, you’ll be able to model integrated circuits perfectly, but not to model transistors in very different weird contexts.

By the same token, if you have a perfect model of every aspect of a neuron, then you’ll be able to model it in any possible context, including the unholy mess that constitutes an organoid. I just think that getting a perfect model of every aspect of a neuron is unnecessary, and unrealistic. And in that framework, successfully simulating an organoid is neither necessary nor sufficient to know that your neuron model is OK.

Straightforward Steps to Marginally Improve Odds of Whole Brain Emulation

Steven Byrnes3d30

Yeah I think “brain organoids” are a bit like throwing 1000 transistors and batteries and capacitors into a bowl, and shaking the bowl around, and then soldering every point where two leads are touching each other, and then doing electrical characterization on the resulting monstrosity. :)

Would you learn anything whatsoever from this activity? Umm, maybe? Or maybe not. Regardless, even if it’s not completely useless, it’s definitely not a central part of understanding or emulating integrated circuits.

(There was a famous paper where it’s claimed that brain organoids can learn to play Pong, but I think it’s p-hacked / cherry-picked.)

There’s just so much structure in which neurons are connected to which in the brain—e.g. the cortex has 6 layers, with specific cell types connected to each other in specific ways, and then there’s cortex-thalamus-cortex connections and on and on. A big ball of randomly-connected neurons is just a totally different thing.

Also, I am not sure if you're proposing we compress multiple neurons down into a simpler computational block, the way a real arrangement of transistors can be abstracted into logic gates or adders or whatever. I am not a fan of that for WBE for philosophical reasons and because I think it is less likely to capture everything we care about especially for individual people.

Yes and no. My WBE proposal would be to understand the brain algorithm in general, notice that the algorithm has various adjustable parameters (both because of inter-individual variation and within-lifetime learning of memories, desires, etc.), do a brain-scan that records those parameters for a certain individual, and now you can run that algorithm, and it’s a WBE of that individual.

When you run the algorithm, there is no particular reason to expect that the data structures you want to use for that will superficially resemble neurons, like with a 1-to-1 correspondence. Yes you want to run the same algorithm, producing the same output (within tolerance, such that “it’s the same person”), but presumably you’ll be changing the low-level implementation to mesh better with the affordances of the GPU instruction set rather than the affordances of biological neurons.

The “philosophical reasons” are presumably that you think it might not be conscious? If so, I disagree, for reasons briefly summarized in §1.6 here.

“Less likely to capture everything we care about especially for individual people” would be a claim that we didn’t measure the right things or are misunderstanding the algorithm, which is possible, but unrelated to the low-level implementation of the algorithm on our chips.

I definitely am NOT an advocate for things like training a foundation model to match fMRI data and calling it a mediocre WBE. (There do exist people who like that idea, just I’m not one of them.) Whatever the actual information storage is, as used by the brain, e.g. synapses, that’s what we want to be measuring individually and including in the WBE. :)

On (Not) Feeling the AGI

Steven Byrnes3d74

I second the general point that GDP growth is a funny metric … it seems possible (as far as I know) for a society to invent every possible technology, transform the world into a wild sci-fi land beyond recognition or comprehension each month, etc., without quote-unquote “GDP growth” actually being all that high — cf. What Do GDP Growth Curves Really Mean? and follow-up Some Unorthodox Ways To Achieve High GDP Growth with (conversely) a toy example of sustained quote-unquote “GDP growth” in a static economy.

This is annoying to me, because there’s a massive substantive worldview difference between people who expect, y’know, the thing where the world transforms into a wild sci-fi land beyond recognition or comprehension each month, or whatever, versus the people who are expecting something akin to past technologies like railroads or e-commerce. I really want to talk about that huge worldview difference, in a way that people won’t misunderstand. Saying “>100%/year GDP growth” is a nice way to do that … so it’s annoying that this might be technically incorrect (as far as I know). I don’t have an equally catchy and clear alternative.

(Hmm, I once saw someone (maybe Paul Christiano?) saying “1% of Earth’s land area will be covered with solar cells in X number of years”, or something like that. But that failed to communicate in an interesting way: the person he was talking to treated the claim as so absurd that he must have messed up by misplacing a decimal point :-P ) (Will MacAskill has been trying “century in a decade”, which I think works in some ways but gives the wrong impression in other ways.)

Straightforward Steps to Marginally Improve Odds of Whole Brain Emulation

Steven Byrnes3d30

Good question! The idea is, the brain is supposed to do something specific and useful—run a certain algorithm that systematically leads to ecologically-adaptive actions. The size of the genome limits the amount of complexity that can be built into this algorithm. (More discussion here.) For sure, the genome could build a billion different “cell types” by each cell having 30 different flags which are on and off at random in a collection of 100 billion neurons. But … why on earth would the genome do that? And even if you come up with some answer to that question, it would just mean that we have the wrong idea about what’s fundamental; really, the proper reverse-engineering approach in that case would be to figure out 30 things, not a billion things, i.e. what is the function of each of those 30 flags.
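The "30 flags gives a billion cell types" figure is just a count of binary combinations. A quick check, with nothing in it specific to biology:

```python
# Counting sketch: how many on/off combinations do 30 binary flags allow?
n_flags = 30
n_combinations = 2 ** n_flags  # each flag is independently on or off
print(n_combinations)          # 1073741824, i.e. about a billion

# The reverse-engineering workload in that scenario is the number of flags,
# not the number of combinations.
print(n_flags, "things to understand, versus", n_combinations, "apparent cell types")
```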

A kind of exception to the rule that the genome limits the brain algorithm complexity is that the genome can (and does) build within-lifetime learning algorithms into the brain, and then those algorithms run for a billion seconds, and create a massive quantity of intricate complexity in their “trained models”. To understand why an adult behaves how they behave in any possible situation, there are probably billions of things to be reverse-engineered and understood, rather than low-thousands of things. However, as a rule of thumb, I claim that:

when the evolutionary learning algorithm adds a new feature to the brain algorithm, it does so by making more different idiosyncratic neuron types and synapse types and neuropeptide receptors and so on,
when one of the brain’s within-lifetime learning algorithms adds a new bit of learned content to its trained model, it does so by editing synapses.

Again, I only claim that these are rules-of-thumb, not hard-and-fast rules, but I do think they’re great starting points. Even if there’s a nonzero amount of learned content storage via gene expression, I propose that thinking of it as “changing the neuron type” is not a good way to think about it; it’s still “the same kind of neuron”, and part of the same subproject of the “understanding the brain” megaproject, it’s just that the neuron happens to be storing some adjustable parameter in its nucleus and acting differently in accordance with that.

By contrast, medium spiny neurons versus Purkinje cells versus cortical pyramidal neurons versus magnocellular neurosecretory cells etc. etc. are all just wildly different from each other—they look different, they act different, they play profoundly different roles in the brain algorithm, etc. The genome clearly needs to be dedicating some of its information capacity to specifying how to build each and every one of those cell types, individually, such that each of them can play its own particular role in the brain algorithm.

Does that help explain where I’m coming from?

Straightforward Steps to Marginally Improve Odds of Whole Brain Emulation

Steven Byrnes3d20

you believe a neuron or a small group of neurons are fundamentally computationally simple and I don't

I guess I would phrase it as “there’s a useful thing that neurons are doing to contribute to the brain algorithm, and that thing constitutes a tiny fraction of the full complexity of a real-world neuron”.

(I would say the same thing about MOSFETs. Again, here’s how to model a MOSFET: it’s a horrific mess. Is a MOSFET “fundamentally computationally simple”? Maybe?—I’m not sure exactly what that means. I’d say it does a useful thing in the context of an integrated circuit, and that useful thing is pretty simple.)

The trick is, “the useful thing that a neuron is doing to contribute to the brain algorithm” is not something you can figure out by studying the neuron, just as “the useful thing that a MOSFET is doing to contribute to IC function” is not something you can figure out by studying the MOSFET. There’s no such thing as “Our model is P% accurate” if you don’t know what phenomenon you’re trying to capture. If you model the MOSFET as a cartoon switch, that model will be extremely inaccurate along all kinds of axes—for example, its thermal coefficients will be wrong by 100%. But that doesn’t matter because the cartoon switch model is accurate along the one axis that matters for IC functioning.

The brain is generally pretty noise-tolerant. Indeed, if one of your neurons dies altogether, “you are still you” in the ways that matter. But a dead neuron is a 0% accurate model of a live neuron. ¯\_(ツ)_/¯

In parallel with that there should be a project trying to characterize how error tolerant real neurons and neural networks can be so we can find the lower bound of P. I actually tried something like that for synaptic weight (how does performance degrade when adding noise to the weights of a spiking neural network) but I was so disillusioned with the learning rules that I am not confident in my results.

Just because every part of the brain has neurons and synapses doesn’t mean every part of the brain is a “spiking neural network” with the connotation that that term has in ML, i.e. a learning algorithm. The brain also needs (what I call) “business logic”—just as every ML GitHub repository has tons of code that is not the learning algorithm itself. I think that the low-thousands of different neuron types are playing quite different roles in quite different parts of the brain algorithm, and that studying “spiking neural networks” is the wrong starting point.

Straightforward Steps to Marginally Improve Odds of Whole Brain Emulation

Steven Byrnes3d40

Having just read your post on pessimism, I am confused as to why you think low thousands of separate neuron models would be sufficient. I agree that characterizing billions of neurons is a very tall order (although I really won't care how long it takes if I'm dead anyway). But when you say '“...information storage in the nucleus doesn’t happen at all, or has such a small effect that we can ignore it and still get the same high-level behavior” (which I don’t believe).' it sounds to me like an argument in favor of looking at the transcriptome of each cell.

I think the genome builds a brain algorithm, and the brain algorithm (like practically every algorithm in your CS textbook) includes a number of persistent variables that are occasionally updated in such-and-such way under such-and-such circumstance. Those variables correspond to what the neuro people call plasticity—synaptic plasticity, gene expression plasticity, whatever. Some such occasionally-updated variables are parameters in within-lifetime learning algorithms that are part of the brain algorithm (akin to ML weights). Other such variables are not; instead they’re just essentially counter variables or whatever (see §2.3.3 here). The “understanding the brain algorithm” research program would be figuring out what the brain algorithm is, how and why it works, and thus (as a special case) what the exact set of “persistent variables that are occasionally updated” is, and how they are stored in the brain. If you complete this research program, you get brain-like AGI, but you can’t upload any particular adult human. Then a different research program is: take an adult human brain, and go in with your microtome etc. and actually measure all those “persistent variables that are occasionally updated”, which comprise a person’s unique memories, beliefs, desires, etc.
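As a cartoon of that split between the two research programs, here is a minimal sketch. All names (`ToyBrainAlgorithm`, `weight`, `events_seen`) are hypothetical, and the delta-rule update is a stand-in for "some learning rule", not a claim about what the brain does. The class definition plays the role of the algorithm; the values stored in one instance play the role of one individual's persistent variables.

```python
from dataclasses import dataclass, asdict

@dataclass
class ToyBrainAlgorithm:
    # Learned parameter (akin to an ML weight), updated by a learning rule.
    weight: float = 0.0
    # Non-learning persistent variable: just a counter.
    events_seen: int = 0
    learning_rate: float = 0.1

    def step(self, stimulus: float, outcome: float) -> float:
        prediction = self.weight * stimulus
        # Delta-rule update: nudge the weight to shrink the prediction error.
        self.weight += self.learning_rate * (outcome - prediction) * stimulus
        self.events_seen += 1
        return prediction

# Program 1 ("understand the algorithm") corresponds to knowing the class above.
# Program 2 ("measure one individual") corresponds to reading out one instance's state.
individual = ToyBrainAlgorithm()
for _ in range(50):
    individual.step(stimulus=1.0, outcome=2.0)

measured_state = asdict(individual)              # the "scan"
emulation = ToyBrainAlgorithm(**measured_state)  # same algorithm, same variables
print(measured_state)
```

Knowing the class without the measured state gives a generic instance of the algorithm; an emulation of a specific individual needs both.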

I think the first research program (understanding the brain algorithm) doesn’t require a thorough understanding of neuron electrophysiology. For example (copying from §3.1 here), suppose that I want to model a transistor (specifically, a MOSFET). And suppose that my model only needs to be sufficient to emulate the calculations done by a CMOS integrated circuit. Then my model can be extremely simple—it can just treat the transistor as a cartoon switch. Next, again suppose that I want to model a transistor. But this time, I want my model to accurately capture all measurable details of the transistor. Then my model needs to be mind-bogglingly complex, involving many dozens of obscure SPICE modeling parameters. The point is: I’m suggesting an analogy between this transistor and a neuron with synapses, dendritic spikes, etc. The latter system is mind-bogglingly complex when you study it in detail—no doubt about it! But that doesn’t mean that the neuron’s essential algorithmic role is equally complicated. The latter might just amount to a little cartoon diagram with some ANDs and ORs and IF-THENs or whatever. Or maybe not, but we should at least keep that possibility in mind.
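To illustrate how little the cartoon-switch model needs, here is a minimal sketch that builds a CMOS NAND gate, and then XOR, out of nothing but ideal on/off switches. The function names are illustrative, and the model deliberately ignores everything a SPICE model would care about (thresholds, capacitance, leakage, temperature).

```python
def nmos(gate: bool) -> bool:
    """Cartoon n-channel MOSFET: conducts when the gate is high."""
    return gate

def pmos(gate: bool) -> bool:
    """Cartoon p-channel MOSFET: conducts when the gate is low."""
    return not gate

def cmos_nand(a: bool, b: bool) -> bool:
    """Two-input CMOS NAND built from four cartoon switches."""
    pull_up = pmos(a) or pmos(b)     # two PMOS in parallel between VDD and output
    pull_down = nmos(a) and nmos(b)  # two NMOS in series between output and ground
    assert pull_up != pull_down      # exactly one network conducts in static CMOS
    return pull_up                   # output is high when connected to VDD

def xor_from_nands(a: bool, b: bool) -> bool:
    """XOR from four NAND gates, i.e. sixteen cartoon switches."""
    n1 = cmos_nand(a, b)
    return cmos_nand(cmos_nand(a, n1), cmos_nand(b, n1))

for a in (False, True):
    for b in (False, True):
        print(a, b, cmos_nand(a, b), xor_from_nands(a, b))
```

The switch model gets the logic exactly right while saying nothing at all about voltages or timing, which is the sense in which it is accurate along the one axis that matters for the circuit's computation.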

In the “understanding the brain algorithm” research program, you’re triangulating between knowledge of algorithms in general, knowledge of what actual brains actually do (including lesion studies, stimulation studies, etc.), knowledge of evolution and ecology, and measurements of neurons. The first three can add so much information that it seems possible to pin down the fourth without all that many measurements, or even with no measurements at all beyond the connectome. Probably gene expression stuff will be involved in the implementations in certain cases, but we don’t really care, and don’t necessarily need to be measuring that. At least, that’s my guess.

In the “take the adult brain and measure all the ‘persistent variables that are occasionally updated’” research program, yes it’s possible that some of those persistent variables are stored in gene expression, but my guess is very few, and if we know where they are and how they work then we can just measure the exact relevant RNA in the exact relevant cells.

…To be clear, I think working on the “understanding the brain algorithm” research program is very bad and dangerous when it focuses on the cortex and thalamus and basal ganglia, but good when it focuses on the hypothalamus and brainstem, and it’s sad that people in neuroscience, especially AI-adjacent people with a knack for algorithms, are overwhelmingly working on the exact worst possible thing :( But I think doing it in the right order (cortex last, long after deeply understanding everything about the hypothalamus & brainstem) is probably good, and I think that there’s realistically no way to get WBE without completing the “understanding the brain algorithm” research program somewhere along the way.

Straightforward Steps to Marginally Improve Odds of Whole Brain Emulation

Steven Byrnes4d40

I’m not an expert myself (this will be obvious), but I was just trying to understand slide-seq—especially this paper which sequenced RNA from 4,000,000 neurons around the mouse brain.

They found low-thousands of neuron types in the mouse, which makes sense on priors given that there are only like 20,000 genes encoding the whole brain design and everything in it, along with the rest of the body. (Humans are similar.)

I’m very mildly skeptical of the importance & necessity of electrophysiology characterization for reasons here, but such a project seems more feasible if you think of it as characterizing the electrophysiology properties of low-thousands of discrete neuron types, each of which (hopefully) can also be related to morphology or location or something else that would be visible in a connectomics dataset, as opposed to characterizing billions of neurons that are each unique.

Sorry if this is stupid or I’m misunderstanding.
