I probably qualify as one of the people you're describing.
My reasoning is that we are in the fortunate position of having AI that we can probably ask to do our alignment homework for us. Prior to two or three years ago it seemed implausible that we would get an AI that would:
* care about humans a lot, both by revealed preferences and according to all available interpretability evidence
* be quite smart, smarter than us in many ways, but not yet terrifyingly/dangerously smart
But we have such an AI. Arguably we have more than one such. This is good! We lucked out!
Eliezer has been saying for some time that one of his proposed solutions to the Alignment Problem is to shut down all AI research and to genetically engineer a generation of Von Neumanns to do the hard math and philosophy. This path seems unlikely to happen. However, we almost have a generation of Von Neumanns in our datacenters. I say almost because they are definitely not there yet, but I think, based on an informed awareness of LLM development capabilities and plausible mid-term trajectories, that we will soon have access to arbitrarily many copies of brilliant-but-not-superintelligent friendly AIs who care about human wellbeing, and will be more than adequate partners in the development of AI Alignment theory.
I can foresee many objections and critiques of this perspective. At the highest level, I acknowledge that using AI to do our AI Alignment homework carries risks. But I think these risks are clearly more favorable to us than the risks we all thought we would be facing at this point in the early part of the Singularity. For example, what we don't have is a generally capable version of AlphaZero. We have something that landed in just the right part of the intelligence space, where it can help us quite a lot and probably not kill us all.
In my own writing I am very conscious of whether I’m writing from a place of inspiration.
All my most successful posts came to me as a vibrant and compelling idea that very quickly took shape in my mind and ended up being finished and posted quickly. What made them clear and living in my mind is what made them readable and engaging to readers; my job was mainly to stay out of my own way, to translate that lightning bolt of thought into writing.
There’s a symmetry there: it was easy to write because the idea was so clear to me in my own mind, and this clarity is also what makes it enjoyable to read. If you don’t quite know exactly what you’re trying to say, that problem isn’t going to be overcome by more “effort” at the prose level.
Unfortunately you can’t force inspiration, or at least I haven’t figured out how to do it. I have a lot of drafts that never got posted because that inspiration/clarity wasn’t there.
Good work.
The hardest part of moderation is the need to take action in cases where someone is consistently doing something that imposes a disproportionate burden on the community and the moderators, but which is difficult to explain to a third party unambiguously.
Moderators have to be empowered to make such decisions, even if they can’t perfectly justify them. The alternative is a moderation structure captured by proceduralism, which is predictably exploitable by bad actors.
That said — this is Less Wrong, so there will always be a nitpick — I do think people need to grow a thicker skin. I have so many friends who have valuable things to say, but never post on LW due to a feeling of intimidation. The cure for this is, IMO, not moderating the level of meanness of the commentariat, but encouraging people to learn to regulate their emotions in response to criticism. However, at the margins, clipping off the most uncharitable commenters is doubtless valuable.
Sorry, that’s what I get for replying from the Notification interface.
I'm not sure if I understand your question. I am using the initial quotes from Stoic/Buddhist texts as examples of perverse thinking that I don't endorse.
As to (1), I was following The Mind Illuminated, for what it's worth. And I am a big fan of emotional integration. Spiritual practices can help with that, but I think they can also get in the way, and it's really hard to know in advance which direction you're going.
I think we are basically on the same page with (2).
As for (3) I think it's a matter of degree, requiring the kind of nuance that doesn't fit on a bumper sticker. If you feel so much persistent guilt that it's causing daily suffering, then that's probably something you need to sort out. I was intentional in adding the phrase "for a bit" in "It's okay to feel bad for a bit," because I don't actually think it's okay to feel persistently bad forever! Those are definitely two different situations. If you have ongoing intrusive negative emotions, that sounds adjacent to trauma, and that can be sorted out with some work.
I always appreciate your insights and opinions on this general topic.
At the time, I was following the instructions in The Mind Illuminated very closely. I will grant that this may have been user error/skill issue, but given that The Mind Illuminated is often put forth as a remarkably accessible and lucid map through the stages of vipassana, and given that I still went this badly wrong, you have to wonder if the path itself is too dangerous to be worth it.
The outcome I reached may have been predictable, given that the ultimate reason I was meditating at the time was to get some relief from the ongoing suffering of a chronic migraine condition. In that specific sense, I was seeking detachment.
In the end I am left wondering if I would have been better off if I had taken up mountain biking instead of meditation, given that it turned out that the path to integrating my emotions led through action more than reflection.
This post resonated with me when it came out, and I think its thesis only seems more credible with time. Anthropic's seminal "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet" (the Golden Gate Claude paper) seems right in line with these ideas. We can make scrutable the inscrutable as long as the inscrutable takes the form of something organized and regular and repeatable.
This article gets bonus points for me for being succinct while still making its argument clearly.
My favorite Less Wrong posts are almost always the parables and the dialogues. I find it easier to process and remember information that is conveyed in this way. They're also simply more fun to read.
This post was originally written as an entry for the FTX Future Fund prize, which, at the time of writing the original draft, was a $1,000,000 prize, which I did not win, partly because it wasn't selected as the winner and partly because FTX imploded and the prize money vanished. (There is a lesson about the importance of proper calibration of the extrema of probability estimates somewhere in there.) In any case, I did not actually think I would win, because I was basically making fun of the contest organizers by pointing out that the whole ethos behind their prize specification was wrong. At the time, there was a live debate around timelines, and a lot of discussions about the bio-anchors paper, which itself made in microcosm the same mistakes that I was pointing at.
Technically, the very-first-draft of this post was an extremely long and detailed argument for short AGI timelines that I co-wrote with my brother, but I realized while writing it that the presumption that long and short timelines should be in some sense averaged together to get a better estimate was pervasive in the zeitgeist and needed to be addressed on its own.
I am happy with this post because it started a conversation that I thought needed to be had. My whole shtick these days is that our community has seemingly tried to skip over decision theory basics in favor of esoterica, to our collective detriment, and I feel like writing this post explicitly helped with that.
I am happy to have seen this post referenced favorably elsewhere. I think I wrote it about as well as I could have, given that I was going for the specific Less Wrong Parable stylistic thing and not trying to write literary fiction.
You know those videos where a dog tries to carry a large stick through an opening in a fence, and the stick is too long to fit so it just keeps bumping against the verticals, and it’s obvious to a person watching that the dog would easily get the stick through if it just turned sideways, or rotated its head, or dragged the stick through by one end, or basically did anything at all other than what it is currently doing?
The other day I had on the kitchen counter a sort of floppy cloth place mat that was covered in crumbs and food debris. I tried to lift it and kind of bend it and then pour the crumbs and stuff into the sink. But because it was floppy and soft, instead I poured the crumbs all over the counter and floor. Maybe one-third made it into the sink.
My sister-in-law watched me do all this with the same expression you have on your face when you watch the dog try to get through the gate. We had a good laugh about it.
The point of this story is that smug intellectual superiority is really difficult to maintain when you think about all the moronic buffoonery that you have committed in your life. About all the absolute dumbass mistakes you’ve made. If you really contextualize your own self-image objectively with respect to your brilliancies and your blunders then you can’t help but see yourself as a kind of ridiculous Don Quixote-esque clown, a figure deserving more of bemused pity than anything.
And then you realize, this is just a description of mankind. You and I, benighted fools, are the proper referent for “people.” Dogs baffled by gates, bravely slamming our sticks against the verticals, and sometimes being struck by enough lucky inspiration to think of turning our heads. Locally smart, perhaps, when it concerns our favorite subjects. Capable of amazing things when we’re at our best. But really, try to remember the last time you returned a wave meant for the person standing behind you, while maintaining any sense of general misanthropy. It’s hard not to realize we’re all in this circus together.