adamShimi

Epistemologist specialized in the difficulties of alignment and how to solve AI X-Risks. Currently at Conjecture.

Blogging at For Methods.

Twitter.

Sequences

Building Blocks
Becoming Stronger as Epistemologist
Epistemic Cookbook for Alignment
Reviews for the Alignment Forum
AI Alignment Unwrapped
Deconfusing Goal-Directedness


Comments


I agree that it is in the text. If it wasn't clear, my message was trying to reverse engineer why I bounced off, which is more about my experience of reading than fully about the text.

I remember reading this post, and really disliking it.

Then today, as I was reflecting on things, I recalled that this existed, and went back to read it. And this time, my reaction was instead "yep, that's pointing to the mental move that I've lost and that I'm now trying to relearn".

Which is interesting, because it means that from a year or two ago up till now, I was the kind of person who would benefit from this post; yet I couldn't get the juice out of it. I think a big reason is that while the description of the play/fun mental move is good and clear, the description of the opposite mental move, the one short-circuiting play/fun, felt very caricatural and fake.

My conjecture (though beware the typical mind fallacy) is that it's because you emphasize "naive deference" to others, which looks obviously wrong to me and obviously not what most people I know who suffer from this tend to do (but might be representative of the people you actually met).

Instead, the mental move that I know intimately is what I call "instrumentalization" (or to be more memey, "tyranny of whys"). It's a move that doesn't require another person or a social context (though it often includes internalized social judgements from others, aka the superego); it only requires caring deeply about a goal (which goal doesn't actually matter that much), and being invested in it, somewhat neurotically.

Then, the move is that whenever a new, curious, fun, unexpected idea pops up, it almost instantly hits a filter: is this useful to reach the goal?

Obviously this filter removes almost all ideas, but even the ones it lets through don't survive unharmed: they get trimmed, twisted, simplified to fit the goal, to actually sound like they're going to help with the goal. And then in my personal case, all ideas start feeling like "shoulds", like weight and responsibility and obligations.

Anyway, I do like this post now, and I am trying to relearn how to use the "play" mental move without instrumentalizing everything away.

adamShimiΩ220

Thanks for the comment!

We have indeed gotten feedback from multiple people that this part didn't feel detailed enough (although we got this much more from very technical readers than from non-technical ones), and we are working on improving the arguments.

adamShimiΩ340

Thanks for the comment!

We'll correct the typo in the next patch/bug fix.

As for the more directly adversarial tone of the prologue, it is a deliberate choice (and the rest of the document contrasts with it). For the moment, we're waiting to get more feedback on the doc to see whether it really turns people off.

adamShimiΩ230

Yep, I think you're correct.

Will correct in the next minor update. Thanks!

adamShimi5-2

Thanks for the comment!

We'll consider this point for future releases, but personally, I would say that this kind of hedging also has a lot of downsides: it makes you sound far more uncertain and defensive than you really want to sound.

This document tries to be both grounded and to the point, so by default we don't want to put ourselves in a defensive position when arguing for things that we think make sense and are supported by the evidence.
