aysja — LessWrong

I wrote the first draft of this essay around a year ago, in between the bouts of delirium that long covid was beginning to deliver me. And I couldn’t quite tell back then how real it was, and as long covid consumed more of my mind it drifted further away. It began to feel impossible that I had ever had, or could ever have, courage. Because courage requires capacity and I was losing all of mine. And the doubts grew larger, and the clarity dimmed, and I forgot about Frodo for a while, forgot about most everything, as I was left for many months staring directly into the bowels of deep atheism, wondering if I may ever be free from its merciless hold. And it really tested the fortitude of my soul, for there were moments when completely giving up felt the most natural, and really the only, option. But then somewhere in the grappling with this miserable new world I had come to inhabit I remembered Frodo again. And it was not instant, and it was not easy, but developing this concept of solemn courage did help my spirit recover.

I do not get to choose the world I am given. Reality is such that your mind can be randomly corrupted, some molecular demon etching away the grooves that were you until you are a nothingness. Reality is such that everyone I love will likely die. Some distant, plaintive conclusion accelerating into the present by that mysterious process so ravenously set to motion. And there really might not be much I can do about it, for all of my effort may just be drops against its tidal wave. God! Reality can be so unkind. Yet there is something powerful in the orientation of trying anyway. Because in the end that is all there is. In the end the stakes are what they are, and the situation is what it is, and all I can decide is what to do with what I am given. That’s really it, and accepting this has given me clarity. Yes, there will be days I cannot overcome illness; yes, I may not much affect the looming god machines; and yes, that is all very painful. But I’m not going to get lost in it. I’m going to look at it—the uncertainty and fear, the grip of disease and the overwhelmingly large and complicated threat to all I value—and then I’m going to try. Because it is important, and that is all I can do.

Richard Ngo's Shortform

aysja19d42

I agree that human values are more accretive like this, but I would also call those genes “terminal” in the same sense that I call some of my own goals “terminal.” E.g., I can usually ask myself why I’m taking a given action and my brain will give a reasonable answer: “because I want to finish this post,” “because I’m hungry,” whatever. And then I can keep double clicking on those: “I want to finish the post because I don’t think this crux has been spelled out very well yet” and I can keep going and going until at some point the answer is like “I don’t know, because it’s intrinsically beautiful?” and that’s around when I call the goal/preference “terminal.” Which is similar in structure to a story I imagine evolution might tell if it “asked itself” why some particular gene developed.

Perhaps “terminal” is the wrong word for this, but having a handle for these high-level, upstream nodes in my motivational complex has been helpful. And they do hold a special status, at least for me, because many of the “instrumental” actions (or subgoals) could be switched out while preserving this more nebulous desire to “understand” or “find beauty” or what have you. That feels like an important distinction that I want to keep while also agreeing they aren’t always cleanly demarcated as such. E.g., writing has both instrumental and terminal qualities to me, which can make it a more confusing goal-structure to orient to, but also as you say: more strange and wonderful, too.

Eli's shortform feed

aysja1mo2210

I'm not sure if I expect motivated reasoning to come out better on average, even in domains where you might naively expect it to. In part that's because self-serving strategies often involve doing things other people don't like, e.g. being deceptive, manipulative, or generally unethical, in a way that can cause long-term harm to your reputation and so long-term harm to your ability to win. And I think there is significant optimization pressure on catching this kind of thing, in part for reasons similar to the ones outlined in Elephant in the Brain, i.e., that we evolved in an environment where winning that cat-and-mouse game was a big part of adaptive success. But also just because people don't like being screwed, and so are on the lookout for this kind of behavior.

Also, in my imagination you’re more likely to win if you’re at least self-reflective about motivated cognition, since you can make more informed decisions that way. If you just go blindly ahead, then you’re probably failing to track a bunch of what matters, and so failing to win according to what you ultimately care about. Like, in most cases motivated reasoning spins up not just to convince other people, but to convince yourself, which means there’s a part of you that needed convincing in the first place, i.e., a part that is tracking and wanting different things. And I would guess that charging ahead without understanding those dynamics leads to worse outcomes overall? Another way to say it is that I don't imagine a good rationalist as acting against their own interests, but more like they understand them clearly, such that they can decide what makes sense based on a fuller picture of their own mind.

Ruby's Inkhaven Retrospective

aysja2mo94

Fwiw, my experience has been more varied. My best-received comments (100+ karma) are a mix of spending days getting a hard point right and spending minutes extemporaneously gesturing at stuff without much editing. But overall I think the trend points towards "more effort = more engagement and better received." I have mostly attributed this to the standards and readership LessWrong has cultivated, which is why I feel excited to post here. It seems like one of the rare places on the internet where long, complex essays about the most fascinating and important topics are incentivized. My Reddit posts are not nearly as well received, for instance. I haven't posted as many essays yet, but I've spent a good deal of effort on all of them, and they've all done fairly well (according to karma, which ofc isn't a great indicator of impact, but some measure of "popularity").

I weakly guess that your hypothesis is right, here. I.e., that the posts you felt most excited about were exciting in part because they presented more interesting and so more difficult thinking and writing challenges. At least for me, tackling topics on the edge of my knowledge takes much more skill and much more time, and it is often a place where effort translates into "better" writing: clearer, more conceptually precise, more engaging, more cutting to the core of things, more of what Pinker is gesturing at. These posts would not be good were they pumped out in a day—not an artifact I'd be proud of, nor something that other people would see the beauty or the truth in. But the effortful version is worth it, i.e., I expect it to be more helpful for the world, more enduring, and more important, than if that effort had been factored out across a bunch of smaller, easier posts.

Unless its governance changes, Anthropic is untrustworthy

aysja2mo4738

I haven't followed every comment you've left on these sorts of discussions, but they often don't include information or arguments I can evaluate. Which MIRI employees, and what did they actually say? Why do you think that working at Anthropic even in non-safety roles is a great way to contribute to AI safety? I understand there are limits to what you can share, but without that information these comments don't amount to much more than you asking us to defer to your judgement. Which is a fine thing to do, I just wish it were more clearly stated as such.

You Are Much More Salient To Yourself Than To Everyone Else

aysja2mo269

This was a really important update for me. I remember being afraid of lots of things before I started publishing more publicly on the internet: how my intelligence would be perceived, whether I'd make some point that was obviously stupid in retrospect and have my reputation ruined forever, etc. Then at some point in this thought loop I was like, wait, the most likely thing is just that no one reads this, right? More like a "huh" or a nothing at all rather than vitriolic hatred of my soul or whatever I was fearing. This was very liberating, and still is. I probably ended up over-optimizing for invisibility because of the freedom I feel from it: being mostly untethered from myopic social dynamics has been really helpful for my thinking and writing.

Aim for single piece flow

aysja3mo148

I tend to write in large tomes that take months or years to complete, so I suppose I disagree with you too. Not that intellectual progress must consist of this, obviously, but that it can mark an importantly different kind of intellectual progress from the sort downstream of continuous shipping.

In particular, I think shipping constantly often causes people to be too moored to social reception, risks killing butterfly ideas, screens off deeper thought, and forces premature legibility. Like, a lot of the time when I feel ready to publish something, there is some bramble I pass in my writing, some inkling of “Is that really true? What exactly do I mean there?” These often spin up worthy investigations of their own, but I probably would’ve failed to notice them were I more focused on getting things out.

Intellectual labor should aggregate minute-by-minute with revolutionary insights aggregating from hundreds of small changes.

This doesn’t necessarily seem in conflict with “long tomes which take months to write.” My intellectual labor consists of insights aggregating from hundreds of small changes afaict, I just make those changes in my own headspace, or in contact with one or two other minds. Indeed, I have tried getting feedback on my work in this fashion and it’s almost universally failed to be helpful—not because everyone is terrible, but because it’s really hard to get someone loaded enough to give me relevant feedback at all.

Another way to put it: this sort of serial iteration can happen without publishing often, or even at all. It’s possible to do it on your own, in which case the question is more about what kind of feedback is valuable, and how much it makes sense to push for legibility versus pursuing the interesting thread formatted in your mentalese. I don’t really see one as obviously better than the other in general, and I think that doing either blindly can be pretty costly, so I'm wary of it being advocated as such.

RSPs are pauses done right

aysja3mo101

The first RSP was also pretty explicit about their willingness to unilaterally pause:

Note that ASLs are defined by risk relative to baseline, excluding other advanced AI systems.... Just because other language models pose a catastrophic risk does not mean it is acceptable for ours to.

Which was reversed in the second:

It is possible at some point in the future that another actor in the frontier AI ecosystem will pass, or be on track to imminently pass, a Capability Threshold… such that their actions pose a serious risk for the world. In such a scenario, because the incremental increase in risk attributable to us would be small, we might decide to lower the Required Safeguards.

A glimpse of the other side

aysja3mo20

Why Is Printing So Bad?

aysja3mo102

Relatedly, I often feel like I'm interfacing with a process that responded to every edge case with patching. I imagine this is some of what's happening when the poor printer has to interface with a ton of computing systems, and also why bureaucracies like the DMV seem much more convoluted than necessary: each time an edge case comes up, the easier thing is to add another checkbox/more red tape/etc, and no one is incentivized enough to do the much harder task of refactoring all of that accretion. The legal system has a bunch of this too. Indeed, I just had to sign legal documents which were full of commitments to abstain from very weird actions (why on Earth would anyone do that?). But then you realize that yes, someone in fact did that exact thing, and now it has to be forever reflected there.
