
Where I say things, but quick. Or at least aspire to.


Selfish neuremes adapt to prevent you from reprioritizing

  • "Neureme" is my most general term for units of selection in the brain.[1] 
    • The term is agnostic about what exactly the physical thing is that's being selected. It just refers to whatever is implementing a neural function and is selected as a unit.
    • So depending on use-case, a "neureme" can semantically resolve to a single neuron, a collection of neurons, a neural ensemble/assembly/population-vector/engram, a set of ensembles, a frequency, or even dendritic substructure if that plays a role.
  • For every activity you're engaged in, there are certain neuremes responsible for specializing at that task.
  • These neuremes are strengthened or weakened/changed in proportion to how effectively they can promote themselves to your attention.
    • "Attending to" assemblies of neurons means that their firing-rate maxes out (gamma frequency), and their synapses are flushed with acetylcholine, which is required for encoding memories and queuing them for consolidation during sleep.
  • So we should expect that neuremes are selected for effectively keeping themselves in attention, even in cases where that makes you less effective at tasks which tend to increase your genetic fitness.
  • Note that there's hereditary selection going on at the level of genes, and at the level of neuremes. But since genes adapt much slower, the primary selection-pressures neuremes adapt to arise from short-term inter-neuronal competitions. Genes are limited to optimizing the general structure of those competitions, but they can only do so in very broad strokes, so there's lots of genetically-misaligned neuronal competition going on.
    • A corollary of this is that neuremes are stuck in a tragedy of the commons: If all neuremes "agreed to" never develop any misaligned mechanisms for keeping themselves in attention—and we assume this has no effect on the relative proportion of attention they receive—then their relative fitness would stay constant at a lower metabolic cost overall. But since no such agreement can be made, there's some price of anarchy wrt the cost-efficiency of neuremes. (A toy calculation follows this list.)
  • Thus, whenever some neuremes uniquely associated with a cognitive state are *dominant* in attention, whatever mechanisms they've evolved for persisting the state are going to be at maximum power, and this is what makes the brain reluctant to gain perspective when on stimulants.
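As a toy quantification of that price of anarchy (my own illustration; the contest model and all numbers are assumptions, not the author's math):

```python
# Toy "price of anarchy" calculation. Each of n neuremes spends metabolic
# effort x bidding for a fixed pool of attention and receives the share
# x_i / sum(x). In a symmetric Tullock contest with linear costs, the Nash
# equilibrium effort per player is prize * (n - 1) / n^2.
n = 2
prize = 1.0                        # total attention at stake
x_eq = prize * (n - 1) / n**2      # equilibrium metabolic spend per neureme
net_eq = prize / n - x_eq          # attention share minus metabolic cost
net_coop = prize / n               # the unenforceable "agreement": zero spend
print(f"per-neureme payoff: {net_eq:.2f} at equilibrium vs {net_coop:.2f} cooperative")
# Attention shares are identical in both cases; the shortfall is pure
# metabolic waste, i.e. the price of anarchy described above.
```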

A technique for making the brain trust prioritization/perspectivization

So, in conclusion, maybe this technique could work:

  • If I feel like my brain is sucking me into an unproductive rabbit-hole, I set a timer for 60 seconds, during which I can check my todo-list and prioritize what I ought to do next.
  • But before that timer ends, I set another timer (e.g. 10 min) during which I commit to staying on the previous task before switching to whatever I decided.
  • The hope is that my brain learns to trust that gaining perspective doesn't automatically mean we have to abandon the present task, so it can spend less energy on inhibiting signals that try to gain perspective. (A minimal sketch of the procedure follows this list.)
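A minimal sketch of that two-timer loop, assuming a plain blocking script suffices (the function name and defaults are my own):

```python
import time

def perspective_break(review_seconds=60, recommit_minutes=10):
    """Two-timer technique: a short reprioritization window, then a
    mandatory recommitment to the current task before switching."""
    print("Timer 1: check your todo-list and decide what to do next.")
    time.sleep(review_seconds)         # the 60s perspective window
    print(f"Timer 2: stay on the CURRENT task for {recommit_minutes} min.")
    time.sleep(recommit_minutes * 60)  # commitment window before switching
    print("Done. Switch to whatever you decided.")
```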

From experience, I know something like this has worked for:

  • Making me trust my task-list
    • When my brain trusts that all my tasks are in my todo-list, and that I will check my todo-list every day, it no longer bothers reminding me about stuff at random intervals.
  • Reducing dystonic distractions
    • When I deliberately schedule stuff I want to do less (e.g. masturbation, cooking, twitter), and commit to actually *do* those things when scheduled, my brain learns to trust that, and stops bothering me with the desires when they're not scheduled.

So it seems likely that something in this direction could work, even if this particular technique fails.

  1. ^

    The "-eme" suffix inherits from "emic unit", e.g. genes, memes, sememes, morphemes, lexemes, etc. It refers to the minimum indivisible things that compose to serve complex functions. The important notion here is that even if the eme has complex substructure, all its components are selected as a unit, which means that all subfunctions hitchhike on the net fitness of all other subfunctions.

This comment is making me wish I could bookmark comments on LW. @habryka.

Bonus point: neuronal "voting power" is capped at ~100Hz, so neurons "have an incentive" to (ie, will be selected based on the extent to which they) vote for what related neurons are likely to vote for. It's analogous to a winner-takes-all election where you don't want to waste your vote on third-party candidates who are unlikely to be competitive at the top. And when most voters also vote this way, it becomes Keynesian in the sense that you have to predict[1] what other voters predict other voters will vote for, and the best candidates are those who seem the most like good Schelling-points.

That's why global/conscious "narratives" are essential in the brain—they're metabolically efficient Schelling-points.

  1. ^

    Neuron-voters needn't "make predictions" like human-voters do. It just needs to be the case that their stability is proportional to their ability to "act as if" they predicted other neurons' predictions (and so on).
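To make the Keynesian-voting dynamic concrete, here is a toy simulation (my own illustration; the conformity parameter and update rule are assumptions, not from the comment). Voters who defect to the perceived front-runner rapidly collapse onto a single Schelling-point:

```python
import random
from collections import Counter

def simulate(n_voters=1000, n_candidates=5, conformity=0.7, rounds=20):
    """Each round, a voter defects to the current front-runner with
    probability `conformity`; otherwise it keeps its current vote."""
    votes = [random.randrange(n_candidates) for _ in range(n_voters)]
    for _ in range(rounds):
        leader = Counter(votes).most_common(1)[0][0]
        votes = [leader if random.random() < conformity else v for v in votes]
    return Counter(votes)

print(simulate())  # typically collapses onto a single Schelling-point winner
```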

I messed up. I meant to comment on another comment of yours, the one replying to niplav's post about fat tails disincentivizing compromise. That was the one I really wished I could bookmark. 

Oh! Well, I'm as happy about receiving a compliment for that as I am for what I thought I got the compliment for, so I forgive you. Thanks! :D

I think hastening of subgoal completion[1] is some evidence for the notion that competitive inter-neuronal selection pressures are frequently misaligned with genetic fitness. People (me included) routinely choose to prioritize completing small subtasks in order to reduce cognitive load, even when that strategy predictably costs more net metabolic energy. (But I can think of strong counterexamples.)

The same pattern one meta-level up is "intragenomic conflict"[2], where genetic lineages have had to spend significant selection-power to prevent genes from fighting dirty. For example, the mechanism of meiosis itself may largely be maintained in equilibrium due to the longer-term necessity of preventing stuff like meiotic drives. An allele (or a collusion of them) which successfwly transfers to offspring at a probability of >50% may increase its relative fitness even if it marginally reduces its phenotype's viability.
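To put a rough number on that (my back-of-envelope, assuming the viability cost falls on carriers): a driving allele inherited by a fraction d of a carrier's offspring, at viability cost c, out-competes a fairly-segregating allele (d = 1/2, c = 0) roughly whenever d(1 - c) > 1/2. A strong driver with d = 0.9 can thus tolerate viability costs up to c ≈ 0.44 and still spread.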

My generalized term for this is "intra-emic conflict" (pinging the concept of an "eme" as defined in the above comment).

  1. ^

    We asked university students to pick up either of two buckets, one to the left of an alley and one to the right, and to carry the selected bucket to the alley’s end. In most trials, one of the buckets was closer to the end point. We emphasized choosing the easier task, expecting participants to prefer the bucket that would be carried a shorter distance. Contrary to our expectation, participants chose the bucket that was closer to the start position, carrying it farther than the other bucket.
     Pre-Crastination: Hastening Subgoal Completion at the Expense of Extra Physical Effort

  2. ^

    Intragenomic conflict refers to the evolutionary phenomenon where genes have phenotypic effects that promote their own transmission to the detriment of the transmission of other genes that reside in the same genome.

So we should expect that neuremes are selected for effectively keeping themselves in attention, even in cases where that makes you less effective at tasks which tend to increase your genetic fitness.

Furthermore, the neuremes (association-clusters) you are currently attending to have an incentive to recruit associated neuremes into attention as well, because then they feed each other's activity recursively, and can dominate attention for longer. I think of it like association-clusters feeding activity into their "friends" who are most likely to reciprocate.

And because recursive connections between association-clusters tend to reflect some ground truth about causal relationships in the territory, this tends to be highly effective as a mechanism for inference. But there must be edge-cases (though I can't recall any atm...).

Imagining agentic behaviour in (/taking the intentional stance wrt) individual brain-units is great for generating high-level hypotheses about mechanisms, but it obviously misfires sometimes; don't try this at home, etc.

  • Unhook is a browser extension for YouTube (Chrome/Edge) which disables the homepage and lets you hide all recommendations. It also lets you disable other features (e.g. autoplay, comments), but doesn't have so many customizations that I get distracted.
    • Setup time: 2m-10m (depending on whether you customize).
  • CopyQ.exe (Linux/Windows, portable, FOSS) is a really good clipboard manager.
    • Setup time: 5m-10h
    • Setup can be <5m if you precommit to only using the clipboard-storing feature and learning the shortcut to browse it. But it's extremely extensible and risks distracting you for a day or more...
    • You can use a shortcut to browse the most recent copies (including editing, deleting), and the window hides automatically when unfocused.
    • It can save images to a separate tab, and lets you configure shortcuts for opening them in particular programs (e.g. editor).
    • (LINUX): It has plugins/commands for snipping a section of the screen, and you can optionally configure a shortcut to send that snip to an OCR engine, which quietly sends the recognized text into the clipboard.
      • Setup time: probably >2h due to shiny things to explore
    • (WINDOWS): EDIT: ShareX (FOSS) can do OCR-to-clipboard, snipping, region-recording, scripting, and everything is configurable.  Setup took me 36m, but I also configured it to my preferences and explored all features.  Old text below:
      • (WINDOWS): Can use Text-Grab (FOSS) instead. Much simpler. Use a configurable hotkey (the one for Fullscreen Grab) to snip a section of the screen, and it automatically does OCR on it and sends it to your clipboard. Install it and trigger the hotkey to see what it does.
        • Setup time: 2m-15m
        • Alternatively Greenshot (FOSS) is much more extensible, but you have to use a trick to set it up to use OCR via Tesseract (or configure your own script).
        • Also if you use Windows, you can use the native Snipping Tool to snip cutouts from the screen into the clipboard via shortcut, including recordings.
  • LibreChat (docs) (FOSS) is the best LLM interface I've found for general conversation, but its (putative) code interpreter doesn't work off-the-shelf, so I still use the standard ChatGPT-interface for that.
    • Setup time: 30m-5h (depending on customization and familiarity with Docker)
    • It has no click-to-install .exe file, but you can install it via npm or Docker
      • Docker is much simpler, especially since it automatically configures the MongoDB database and Meilisearch for you
    • Lets you quickly swap between OpenAI, Anthropic, Assistants API, and more in the menu
      • (Obviously you need to use your own API keys for this)
    • Can have two LLMs respond to your prompt at the same time
    • For coding, probably better to use a vscode extension, but idk which to recommend yet...
    • For a click-to-install generalized LLM interface, ChatBox (FOSS) is excellent unless you need more advanced features.
  • Vibe (FOSS) is a simple tool for transcribing audio files locally using Whisper.
    • Setup time: 5m-30m (you gotta download the Whisper weights, but should be fast if you just follow the instructions)
  • Windows Voice Access (native) is actually pretty good
    • You can define custom commands for it, including your own scripts
    • I recommend using pythonw.exe for this (normal python, but launches in the background); a hypothetical example follows this list.
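For instance, here's a hypothetical script you might bind to a custom "open notes" voice command (the app and path are made up); running it with pythonw.exe means no console window flashes up:

```python
import subprocess

# Launch the notes file in the background; pythonw.exe suppresses the console.
subprocess.Popen(["notepad.exe", r"C:\Users\me\notes.txt"])
```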

Somebody commented on my YT vid that they found my explanations easy to follow. This surprised me. My prior was/is tentatively that I'm really bad at explaining anything to other people, since I almost never[1] speak to anybody in real-time other than myself and Maria (my spirit animal).

And when I do speak to myself (eg₁, eg₂, eg₃), I use heavily modified English and a vocabulary of ~500 idiolectic jargon-words (tho their usage is ~Zipfian, like with all other languages).

I count this as another datapoint to my hunch that, in many situations:

Your ability to understand yourself is a better proxy for whether other people will understand you than the noisy feedback you get from others.

And by "your ability to understand yourself", I don't mean just using internal simulations of other people to check whether they understand you. I mean, like, check for whether the thing you think you understand, actually make sense to you, independent of whatever you believe ought to make sense to you. Whatever you believe ought to make sense is often just a feeling based on deference to what you think is true (which in turn is often just a feeling based on deference to what you believe other people believe).

  1. ^

    To make this concrete: the last time I spoke to anybody irl was 2022 (at EAGxBerlin)—unless we count the person who sold me my glasses, that one plumber, a few words to the apothecarist, and 5-20 sentences to my landlord. I've had 6 video calls since February (all within the last month). I do write a lot, but ~95-99% to myself in my own notes.

Basically, if you are confused about some topic, it does not matter how good a communicator you are -- your explanation will still be confusing.

Being a very good communicator can actually make it worse; for example you may invent a fascinating -- but completely wrong -- metaphor for something. You can create an illusion of understanding, without the understanding.

Hmm, I'm noticing that a surprisingly large portion of my recent creative progress can be traced back to a single "isthmus" (a key pattern that helps you connect many other patterns). It's the trigger-action-plan of

IF you see an interesting pattern that doesn't have a name
THEN invent a new word and make a flashcard for it

This may not sound like much, and it wouldn't to me either if I hadn't seen it make a profound difference.

Interesting patterns are powerups, and if you just go "huh, that's interesting" and then move on with your life, you're totally wasting their potential. Making a name for it makes it much more likely that you'll be able to spontaneously see the pattern elsewhere (isthmus-passing insights). And making a flashcard for it makes sure you access it when you have different distributions of activation levels over other ideas, making it more likely that you'll end up making synthetic (isthmus-centered) insights between them. (For this reason, I'm also strongly against the idea of dissuading people from using jargon as long as the jargon makes sense. I think people should use more jargon, even if it seems embarrassingly supercilious and perhaps intimidating to outsiders).

I dig the word isthmus and I like your linked comment about it being somehow dual to a bottleneck (i.e. constraint).

It seems quite related to Wentworth's mazes. I.e. see the picture https://www.lesswrong.com/posts/nEBbw2Bc2CnN2RMxy/gears-level-models-are-capital-investments

There are two ways to try and solve the maze: find a path (search, prime & babble) or find constraints by finding walls.

If I need to go from the left to the right in a maze, then highlighting continuous vertical stretches of wall from top to bottom will yield bottlenecks, while highlighting horizontal stretches along paths yields isthmuses. Both techniques can be used together.

Quite cool imho

owo thanks

Seems like Andy Matuschak feels the same way about spaced repetition being a great tool for innovation.

I wrote a comment on {polytely, pleiotropy, market segmentation, conjunctive search, modularity, and costs of compromise} that I thought people here might find interesting, so I'm posting it as a quick take:

I think you're using the term a bit differently from how I use it! I usually think of polytely (which is just pleiotropy from a different perspective, afaict) as an *obstacle*. That is, if I'm trying to optimize a single pasta sauce to be the most tasty and profitable pasta sauce in the whole world, my optimization is "polytelic" because I have to *compromise* between maximizing its tastiness for [people who prefer sour taste], [people who prefer sweet], [people who have some other taste-preferences], etc. Another way to say that is that I'm doing "conjunctive search" (neuroscience term) for a single thing which fits multiple ~independent criteria.

Still in the context of pasta sauce: if you have the logistical capacity to instead be optimizing *multiple* pasta sauces, now you are able to specialize each sauce for each cluster of taste-preferences, and this allows you to net more profit in the end. This is called "horizontal segmentation".

Likewise, a gene which has several functions depending on it will be evolutionarily selected for the *compromise* between all those functions. In this case, the gene is "pleiotropic" because it's evolving in the direction of multiple niches at once; and it is "polytelic" because—from the gene's perspective—you can say that "it is optimizing for several goals at once" (if you're willing to imagine the gene as an "optimizer" for a moment).

For example, the recessive allele that causes sickle cell disease (SCD) *also* confers some resistance against malaria. But SCD only occurs in people who are homozygous for it, and the protective effect against malaria in heterozygotes is strong enough to keep the allele in the gene pool. It would be awesome if, instead, we could *horizontally segment* these effects so that SCD is caused by variations in one gene locus, and malaria-resistance is caused by variations in another locus. That way, both could be optimized for separately, and you wouldn't have to choose between optimizing against SCD or malaria.
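For concreteness (the textbook balancing-selection result, not part of the original comment): with genotype fitnesses 1 - s for AA (malaria-susceptible), 1 for AS, and 1 - t for SS (SCD), the sickle allele settles at the equilibrium frequency q* = s/(s + t). The locus is stuck paying both costs precisely because one gene is serving two "goals" at once.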

Maybe the notion you're looking for is something like "modularity"? That is approximately the opposite of pleiotropy. If a thing is modular, it means you can flexibly optimize subsets of it for different purposes. Like, rather than writing an entire program within a single function call, you can separate out the functions (one function for each subtask you can identify), and now those functions can be called separately without having to incur the effects of the entire unsegmented program.

You make me realize that "polytelic" is too vague of a word. What I usually mean by it may be more accurately referred to as "conjunctively polytelic". All networks trained with something-like-SGD will evolve features which are conjunctively polytelic to some extent (this is just conjecture from me, I haven't got any proof or anything), and this is an obstacle for further optimization. But protein-coding genes are much more prone to this because e.g. the human genome only contains ~20k of them, which means each protein has to pack many more functions (and there's no simple way to refactor/segment so there's only one protein assigned to each function).

KOAN:
The probability of rolling a total of 60 if you toss ten six-sided dice disjunctively is 1/6^10. Whereas if you glom all the dice together and toss a single 60-sided die, the probability of rolling 60 is 1/60.
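For scale (my arithmetic, not in the original): 1/6^10 ≈ 1.7 × 10^-8, while 1/60 ≈ 1.7 × 10^-2, a factor of about a million; satisfying ten independent criteria conjunctively is vastly harder than satisfying one glommed-together criterion.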

Repeated voluntary attentional selection for a stimulus reduces voluntary attentional control wrt that stimulus

From Investigating the role of exogenous cueing on selection history formation (2019):

An abundance of recent empirical data suggest that repeatedly allocating visual attention to task-relevant and/or reward-predicting features in the visual world engenders an attentional bias for these frequently attended stimuli, even when they become task irrelevant and no longer predict reward. In short, attentional selection in the past hinders voluntary control of attention in the present. […] Thus, unlike voluntarily directed attention, involuntary attentional allocation may not be sufficient to engender historically contingent selection biases.

It's sorta unsurprising if you think about it, but I don't think I'm anywhere near having adequately propagated its implications.

Some takeaways:

  • "Beware of what you attend"
  • WHEN: You notice that attending to a specific feature of a problem-solving task was surprisingly helpfwl…
    • THEN: Mentally simulate attending to that feature in a few different problem-solving situations (ie, hook into multiple memory-traces to generalize recall to the relevant class of contexts)
    • My idionym for specific simple features that narrowly help connect concepts is "isthmuses". I try to pay attention to generalizable isthmuses when I find them (commit to memory).

I interpret this as supporting the idea that voluntary-ish allocation of attention is one of the strongest selection-pressures neuremes adapt to, and thus also one of your primary sources of leverage wrt gradually shaping your brain / self-alignment.

Key terms: attentional selection history, attentional selection bias

I struggle with prioritising what to read. Additionally, but less of a problem, I struggle to motivate myself to read things. Some introspection:

The problem is that my mind desires to "have read" something more than desiring the state of "reading" it. Either because I imagine the prestige or self-satisfaction that comes with thinking "hehe, I read the thing," or because I actually desire the knowledge for its own sake, but I don't desire the attaining of it, I desire the having of it.[1]

Could I goodhart-hack this by rewarding myself for reading and feeling ashamed of myself for actually finishing a whole post? Probably not. I think perhaps my problem is that I'm always trying to cut the enemy, so I can't take my eyes off it for long enough to innocently experience the inherent joy of seeing interesting patterns. When I do feel the most joy, I'm usually descending unnecessarily deep into a rabbit hole.

"What are all the cell adhesion molecules, how are they synthesised, and is the synthesis bottlenecked by a particular nutrient I can supplement?!"

Nay, I think my larger problem is always having a million things that I really want to read, and I feel a desperate urge to go through all of them--yesterday at the latest! So when I do feel joy at the nice patterns I learn, I feel a quiet unease at the back of my mind calling me to finish this as soon as possible so we can start on the next urgent thing to read.

(The more I think about it, the more I realise just how annoying that constant impatient nagging is when I'm trying to read something. It's not intense, but it really diminishes the joy. While I do endorse impatience and always trying to cut the enemy, I'm very likely too impatient for my own good. On the margin, I'd make speedier progress with more slack.)

If this is correct, then maybe what I need to do is to--well, close all my tabs for a start--separate out the process of collecting from the process of reading. I'll make a rule: If I see a whole new thing that I want to read, I'm strictly forbidden to actually read it until at least a day has passed. If I'm already engaged in a particular question/topic, then I can seek out and read information about it, but I can only start on new topics if it's in my collection from at least a day ago.

I'm probably intuitively overestimating a new thing's value relative to the things in my collections anyway, just because it feels more novel. If instead I only read things from my collection, I'll gradually build up an enthusiasm for it that can compete with my old enthusiasm for aimless novelty--especially as I experience my new process outperforming my old.

My enthusiasm for "read all the newly-discovered things!" is not necessarily the optimal way to experience the most enthusiasm for reading, it's just stuck in a myopic equilibrium I can beat with a little activation energy.

  1. ^

    What this ends up looking like is frantically skimming through the paper until I find the patterns I'm looking for, and being so frustrated at not immediately finding them that the experience ends up being unpleasant.

Here's my definitely-wrong-and-overly-precise model of productivity. I'd be happy if someone pointed out where it's wrong.

It has three central premises: a) I have proximal (basal; hardcoded) and distal (PFC; flexible) rewards. b) Additionally, or perhaps for the same reasons, my brain uses temporal-difference learning, but I'm unclear on the details. c) Hebbian learning: neurons that fire together, wire together.

If I eat blueberry muffins, I feel good. That's a proximal reward. So every time my brain produces a motivation to eat blueberry muffins, and I take steps that make me *predict* that I am closer to eating blueberry muffins, the synapses that produced *that particular motivation* get reinforced and are more likely to fire again next time.

The brain gets trained to produce the motivations that more reliably produce actions that lead to rewards.

If I get out of bed quickly after the alarm sounds, there are no hardcoded rewards for that. But after I get out of bed, I predict that I am better able to achieve my goals, and that prediction itself is the reward that reinforces the behaviour. It's a distal reward. Every time the brain produces motivations that in fact get me to take actions that I in fact predict will make me more likely to achieve my goals, those motivations get reinforced.
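Here's a minimal TD(0) sketch of premise (b) (my illustration; the state names and initial values are assumptions, not the author's model):

```python
def td_update(V, s, s_next, reward, alpha=0.1, gamma=0.9):
    """One temporal-difference update: adjust V[s] by the prediction error."""
    delta = reward + gamma * V[s_next] - V[s]  # reward-prediction error
    V[s] += alpha * delta
    return delta

# "Out of bed" is assumed to predict goal-progress; getting up carries no
# proximal reward, yet the transition still reinforces the in-bed motivation:
V = {"in_bed": 0.0, "out_of_bed": 1.0}
td_update(V, "in_bed", "out_of_bed", reward=0.0)
print(V["in_bed"])  # ~0.09 > 0: the get-out-of-bed motivation got stronger
```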

But I have some marginal control over *which motivations I choose to turn into action*, and some marginal control over *which predictions I make* about whether those actions take me closer to my goals. Those are the two levers with which I am able to gradually take control over which motivations my brain produces, as long as I'm strategic about it. I'm a fledgling mesa-optimiser inside my own brain, and I start out with the odds against me.

I can also set myself up for failure. If I commit to, say, studying math for 12 hours a day, then... at first I'm able to feel like I've committed to it, as long as I naively expect, right then and there, that the commitment takes me closer to my goals. But come the next day when I actually try to achieve this, I run out of steam, and it becomes harder and harder to resist the motivations to quit. And when I quit, *the motivations that led me to quit get reinforced because I feel relieved* (proximal reward). Trying-and-failing can build up quitting-muscles.

If you're a sufficiently clever mesa-optimiser, you *can* make yourself study math for 12 hours a day or whatever, but you have to gradually build up to it. Never make a large ask of yourself before you've sufficiently starved the quitting-pathways to extinction. Seek to build up simple well-defined trigger-action rules that you know you can keep to every single time they're triggered. If more and more of input-space gets gradually siphoned into those rules, you starve alternative pathways out of existence.

Thus, we have one aspect of the maxim: "You never make decisions, you only ever decide between strategies."