Raemon — LessWrong

LessWrong team member / moderator. I've been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I've been interested in improving my own epistemic standards and helping others to do so as well.

Yeah I'll be working out kinks like this tomorrow

Part of the reason I'm rolling the dice on running Solstice the way I am, is, it doesn't really seem like we have the luxury of not engaging with the question. (But, there's a reason I wrote this post including option #1 – if I didn't think I had a decent chance of pulling it off I'd have done something different)

FYI I am also planning an aftercare / decompression / chat around a firepit thing for people who need that afterwards.

This didn't really do what I wanted. For starters, literally quoting Richard is self-defeating – either it's reasonable to make this sort of criticism, or it's not. If you think there is something different between your post and Richard's comment, I don't know what it is and why you're doing the reverse-quote except to be sorta cute.

I don't even know why you think Richard's comment is "primarily doing the social move of lower trust in what Mikhail says". Richard's comment gives examples of why he thinks that about your post, you don't explain what you think is charitable about his.

I think it is necessary sometimes to argue that people are being uncharitable, and looking they are doing a status-lowering move more than earnest truthseeking.

I haven't actually looked at your writing and don't have an opinion I'd stand by, but from my passing glances at it I did think Richard's comment seemed to be pointing at an important thing.

(downvoted because you didn't actually spell out what point you're making with that rephrase. You think nobody should ever call people out for doing social moves? You think Richard didn't do a good job with it?)

See also Writing That Provokes Comments (step 1: be wrong)

(note, if you make a LW event with the "solstice" tag it shows up on the globe on the home page)

From my perpsective, the biggest point of the political work is to buy time to do the technical work.

(And yeah there's a lot of disagreement about what sort of technical work needs doing and what counts as progress, but that doesn't mean there isn't work to do, it just means it's confusing and difficult)

apologies, I hadn't actually read the post at the time I commented here.

In an earlier draft of the comment I did include a line that was "but, also, we're not even really at the point where this was supposed to be happening, the AIs are too dumb", I removed it in a pass that was trying to just simplify the whole comment.

But as of last-I-checked (maybe not in the past month), models are just nowhere near the level of worldmodeling/planning competence where scheming behavior should be expected.

(Also, as models get smart enough that this starts to matter: the way this often works in humans is human's conscious planning verbal loop ISN'T aware of their impending treachery, they earnestly believe themselves when they tell the boss "I'll get it done" and then later they just find themselves goofing off instead, or changing their mind)

Unless you're posing a non-smooth model where we're keeping them at bay now but they'll increase later on?

This is what the "alignment is hard" people have been saying for a long time. ~~(Some search terms here include "treacherous turn" and "sharp left turn")~~

~~https://www.lesswrong.com/w/treacherous-turn~~

~~A central AI alignment problem: capabilities generalization, and the sharp left turn~~

(my bad, hadn't read the post at the time I commented so this presumably came across cluelessly patronizing)

Seems coherent, my skeptical brain next asks "how do we know you are learning to distinguish fine-grained attention, instead of confabulating a new type of thing?"

Good question! Your experience is entirely normative.

Also I'm not 100% sure what "normative" means in this context.

LESSWRONG
LW

LESSWRONG
LW

Sequences

Posts

Wikitag Contributions

Comments