Flowers are selective about the pollinators they attract. Diurnal flowers must compete with each other for visual attention, so they use diverse colours to stand out from their neighbours. But flowers with nocturnal anthesis are generally white, as they aim only to outshine the night.
I've omitted some steps from the checklists below, especially related to mindset / specific thinking-habits. They're an important part of this, but hard to explain and will vary a lot more between people.
This is where I prepare the steps for my morning routine. I won't list it all, but some important steps:
Approximately how my bed setup looks now (2 weeks ago). The pillows are from experimenting with ways to cocoon myself ergonomically. :p
It seems generally quite bad for somebody like John to have to justify his research in order to have an income. A mind like this is better spent purely optimizing for exactly what he thinks is best, imo.
When he knows that he must justify himself to others (who may or may not understand his reasoning), his brain's background-search is biased in favour of what-can-be-explained. For early thinkers, this bias tends to be good, because it prevents them from bullshitting themselves. But there comes a point where you've mostly learned not to bullshit yourself, and you're better off purely aiming your cognition based on what you yourself think you understand.
Paying people for what they do works great if most of their potential impact comes from activities you can verify. But if their most effective activities are things they have a hard time explaining to others (yet have intrinsic motivation to do), you could miss out on a lot of impact by requiring them instead to work on what's verifiable.
The people who are much higher competence will behave in ways you don't recognise as more competent. If you were able to tell what right things to do are, you would just do those things and be at their level. Your "deference limit" is the level of competence above your own at which you stop being able to reliable judge the difference.
Innovation on the frontier is anti-inductive. If you select people cautiously, you miss out on hiring people significantly more competent than you.[1]
Consider how the cost of compromising between optimisation criteria interacts with what part of the impact distribution you're aiming for. If you're searching for a project with top p% impact and top p% explainability-to-funders, you can expect only p^2 of projects to fit both criteria—assuming independence.
But I think it's an open question how & when the distributions correlate. One reason to think they could sometimes be anticorrelated [sic] is that the projects with the highest explainability-to-funders are also more likely to receive adequate attention from profit-incentives alone.[2]
If someone believes something wild, and your response is strict confusion, that's high value of information. You can only safely say they're low-epistemic-value if you have evidence for some alternative story that explains why they believe what they believe.
Alternatively, find something that is surprisingly popular—because if you don't understand why someone believes something, you cannot exclude that they believe it for good reasons.[3]
Still, I really hope funders would consider funding the person instead of the project, since I think Johannes' potential will be severely stifled unless he has the opportunity to go "oops! I guess I ought to be doing something else instead" as soon as he discovers some intractable bottleneck wrt his current project. (...) it would be a real shame if funding gave him an incentive to not notice reasons to pivot.[4]
Comment explaining why I think it would be good if exceptional researchers had basic income (evaluate candidates by their meta-level process rather than their object-level beliefs)
Comment explaining what costs of compromise in conjunctive search implies for when you're "sampling for outliers"
Comment explaining my approach to finding usefwl information in general
This relates to costs of compromise!
It's this class of patterns that frequently recur as a crucial considerations in contexts re optimization, and I've been making too many shoddy comments about it. (Recent1[1], Recent2.) Somebody who can write ought to unify the many aspects of it give it a public name so it can enter discourse or something.
From Wikipedia: Multi-Objective optimization:
Watch the blue twirly thing until you forget how bored you are by this essay, then continue.
I tend to say a lot of wrong stuff, so do take my utterances with grains of salt. I don't optimize for being safe to defer to, but it doesn't matter if I say a bunch of wrong stuff if some of the patterns can work as gears in your own models. Screens off concerns about deference or how right or wrong I am.
I rly like the framing of concave vs convex option-set btw!
Lizka has a post abt concave option-set in forum-post writing! From my comment on it:
As you allude to by the exponential decay of the green dots in your last graph, there are exponential costs to compromising what you are optimizing for in order to appeal to a wider variety of interests. On the flip-side, how usefwl to a subgroup you can expect to be is exponentially proportional to how purely you optimize for that particular subset of people (depending on how independent the optimization criteria are). This strategy is also known as "horizontal segmentation".
The benefits of segmentation ought to be compared against what is plausibly an exponential decay in the number of people who fit a marginally smaller subset of optimization criteria. So it's not obvious in general whether you should on the margin try to aim more purely for a subset, or aim for broader appeal.
Normalization is an explicit step in taking the population vector of an ensemble involved in some computation. So if you imagine the vector for the ensemble(s) involved in choosing what to do next, and take the projection of that vector onto directions representing each option, the intensity of your motivation for any option is proportional to the length of that projection relative to the length of all other projections. (Although here I'm just extrapolating the formula to visualize its consequences—this step isn't explicitly supported by anything I've read. E.g. I doubt cosine similarity is appropriate for it.)
So, in conclusion, maybe this technique could work:
By experience, I know something like this has worked for:
So it seems likely that something in this direction could work, even if this particular technique fails.
The "-eme" suffix inherits from "emic unit", e.g. genes, memes, sememes, morphemes, lexemes, etc. It refers to the minimum indivisible things that compose to serve complex functions. The important notion here is that even if the eme has complex substructure, all its components are selected as a unit, which means that all subfunctions hitchhike on the net fitness of all other subfunctions.
Edit: made it a post.
On my current models of theoretical[1] insight-making, the beginning of an insight will necessarily—afaict—be "non-robust"/chaotic. I think it looks something like this:
This maps to a fragile/chaotic high-energy "question phase" during which the violation of expectation is maximized (in order to adequately propagate the implications of the original discrepancy), followed by a compressive low-energy "solution phase" where correctness of expectation is maximized again.
In order to make this work, I think the brain is specifically designed to avoid being "robust"—though here I'm using a more narrow definition of the word than I suspect you intended. Specifically, there are several homeostatic mechanisms which make the brain-state hug the border between phase-transitions as tightly as possible. In other words, the brain maximizes dynamic correlation length between neurons[4], which is when they have the greatest ability to influence each other across long distances (aka "communicate"). This is called the critical brain hypothesis, and it suggests that good thinking is necessarily chaotic in some sense.
Another point is that insight-making is anti-inductive.[5] Theoretical reasoning is a frontier that's continuously being exploited based on the brain's native Value-of-Information-estimator, which means that the forests with the highest naively-calculated-VoI are also less likely to have any low-hanging fruit remaining. What this implies is that novel insights are likely to be very narrow targets—which means they could be really hard to hold on to for the brief moment between initial hunch and build-up of salience. (Concise handle: epistemic frontiers are anti-inductive.)
I scope my arguments only to "theoretical processing" (i.e. purely introspective stuff like math), and I don't think they apply to "empirical processing".
Harmonic (red) vs inharmonic (blue) waveforms. When a waveform is harmonic, efferent neural ensembles can quickly entrain to it and stay in sync with minimal metabolic cost. Alternatively, in the context of predictive processing, we can say that "top-down predictions" quickly "learn to predict" bottom-up stimuli.
I basically think musical pleasure (and aesthetic pleasure more generally) maps to 1) the build-up of expectations, 2) the violation of those expectations, and 3) the resolution of those violated expectations. Good art has to constantly balance between breaking and affirming automatic expectations. I think the aesthetic chills associates with insights are caused by the same structure as appogiaturas—the one-period delay of an expected tone at the end of a highly predictable sequence.
I highly recommend this entire YT series!
I think the term originates from Eliezer, but Q Home has more relevant discussion on it—also I'm just a big fan of their chaoticoptimal reasoning style in general. Can recommend! 🍵
personally, I try to "prepare decisions ahead of time". so if I end up in situation where I spend more than 10s actively prioritizing the next thing to do, smth went wrong upstream. (prev statement is exaggeration, but it's in the direction of what I aspire to lurn)
as an example, here's how I've summarized the above principle to myself in my notes:
(note: these titles is v likely cause misunderstanding if u don't already know what I mean by them; I try avoid optimizing my notes for others' viewing, so I'll never bother caveating to myself what I'll remember anyway)
I bascly want to batch process my high-level prioritization, bc I notice that I'm v bad at bird-level perspective when I'm deep in the weeds of some particular project/idea. when I'm doing smth w many potential rabbit-holes (eg programming/design), I set a timer (~35m, but varies) for forcing myself to step back and reflect on what I'm doing (atm, I do this less than once a week; but I do an alternative which takes longer to explain).
I'm prob wasting 95% of my time on unnecessary rabbit-holes that cud be obviated if only I'd spent more Manual Effort ahead of time. there's ~always a shorter path to my target, and it's easier to spot from a higher vantage-point/perspective.
as for figuring out what and how to distill…
Good points, but I feel like you're a bit biased against foxes. First of all, they're cute (see diagram). You didn't even mention that they're cute, yet you claim to present a fair and balanced case? Hedgehog hogwash, I say.
Anyway, I think the skills required for forecasting vs model-building are quite different. I'm not a forecaster, but if I were, I would try to read much more and more widely so I'm not blindsided by stuff I didn't even know that I didn't know. Forecasting is caring more about the numbers; model-building is caring more about how the vertices link up, whatever their weights. Model-building is for generating new hypotheses that didn't exist before; forecasting is discriminating between what already exists.
I try to build conceptual models, and afaict I get much more than 80% of the benefit from 20% of the content that's already in my brain. There are some very general patterns I've thought so deeply on that they provide usefwl perspectives on new stuff I learn weekly. I'd rather learn 5 things deeply, and remember sub-patterns so well that they fire whenever I see something slightly similar, compared to 50 things so shallowly that the only time I think about them is when I see the flashcards. Knowledge not pondered upon in the shower is no knowledge at all.
This is one of the most important reasons why hubris is so undervalued. People mistakenly think the goal is to generate precise probability estimates for frequently-discussed hypotheses (a goal in which deference can make sense). In a common-payoff-game research community, what matters is making new leaps in model space, not converging on probabilities. We (the research community) are bottlenecked by insight-production, not marginally better forecasts or decisions. Feign hubris if you need to, but strive to install it as a defense against model-dissolving deference.
Coming back to this a few showers later.
Heh, I've gone the opposite way and now do 3h sleep per 12h-days. The aim is to wake up during REM/light-sleep at the end of the 2nd sleep cycle, but I don't have a clever way of measuring this[1] except regular sleep-&-wake-times within the range of what the brain can naturally adapt its cycles to.
I think the objective should be to maximize the integral of cognitive readiness over time,[2] so here are some considerations (sorry for lack of sources; feel free to google/gpt; also also sorry for sorta redundant here, but I didn't wish to spend time paring it down):
Two unsourced illustrations I found in my notes:
So in summary, two (comparatively minor) reasons I like polyphasic short sleep is:
See my (outdated-but-still-maybe-inspirational) my morning routine.
My mood is harder to control/predict in evenings due to compounding butterfly effects over the course of a day, and fewer natural contexts I can hook into with the right course-corrections before the day ends.
Although waking up with morning-wood is some evidence of REM, but I don't know how reliable that is. ^^
PedanticallyTechnically, we want to maximize brain-usefwlness over time, which in this case would be the integral of [[the distribution cognitive readiness over time] pointwise multiplied by [the distribution of brain-usefwlness over cognitive readiness]].This matters if, for example, you get disproportionately more usefwlness from the peaks of cognitive readiness, in which case you might want to sacrifice more median wake-time in order to get marginally more peak-time.
I assume this is what your suggested strategy tries to do. However, I doubt it actually works, due to diminishing returns to marginal sleep time (and, I suspect,
> "Our findings support the notion that ACh acts as a switch between modes of acquisition and consolidation." (2006)
> "Acetylcholine Mediates Dynamic Switching Between Information Coding Schemes in Neuronal Networks" (2019)
"Blank slate" I think caused by eg flushing neurotransmitters out of synaptic clefts (and maybe glucose and other mobile things), basically rebooting attentional selection-history, and thereby reducing recent momentum for whatever's influenced you short-term.