One of the subthreads in Thomas Kwa's MIRI research experience was about his experience working with Nate. In the comments, some other people brought up their own negative experiences. There was a lot of ensuing discussion about it.
Thomas felt this was distracting from the points he was most interested in (e.g. how infohazard policies slow down research, how new researchers can stop flailing around so much, whether deconfusion is a bottleneck to alignment, or the sharp left turn).
I also somewhat regretted curating the post, since we normally avoid curating "community politics" posts. While the post had a lot of timeless content in both the OP and the discussion, the subthread about working with Nate ended up being a major focus of the comments.
So, I'm moving those comments to this escape-valve post, where the discussion can continue in whatever direction people end up taking, while leaving the original post to focus on more timeless topics that are relevant whether or not you're plugged into particular social scenes.
I want to say some things about the experience of working with Nate; I’m not sure how coherent this will be.
Reflections on working with Nate
I think jsteinhardt is pretty correct when he talks about psychological safety. Our conversations with Nate often didn’t feel particularly “safe”, possibly because Nate assumes his conversation partners will be as robust as he is.
Nate can pretty easily bulldoze/steamroll over you in conversation, in a way that requires a lot of fortitude to stand up to, and eventually one can just kind of give up. This could happen if you ask a question (and maybe the question was confused in some way) and Nate responds with something of a rant that makes you feel dumb for even asking the question. Or often we/I felt like Nate had assumed we were asking a different thing, and would go on a spiel that would kind of assume you didn’t know what was going on. This often felt like rounding your statements off to the dumbest version. I think it often did turn out that the questions we asked were confused. This seems pretty expected, given that we were doing deconfusion/conceptual work where part of the aim is to work out which questions are reasonable to ask.
I think it should have been possible for Nate to give feedback in a way that didn’t make you feel sad/bad or like you shouldn’t have asked the question in the first place. The feedback we often got was fairly cutting, and I feel like it should be possible to give basically the exact same feedback without making the other person feel sad/bad/frustrated.
Nate would often go on fairly long rants (not sure there is a more charitable word), and it could be hard to get a word in to say “I didn’t really want a response like this, and I don’t think it’s particularly useful”.
Sometimes it seemed like Nate was in a bad mood (or maybe the specific things we wanted to talk about caused him a lot of distress and despair). I remember feeling pretty rough after days that went badly, and then extremely relieved when they went well.
Overall, I think the norms of Nate-culture are pretty at odds with standard norms. I think in general if you are going to do something norm-violating, you should warn the people you are interacting with (which did eventually happen).
Positive things
Nate is very smart, and it was clearly taxing/frustrating for him to work with us much of the time. In this sense he put in a bunch of effort, where the obvious alternative is to just not talk to us. (This is different from putting effort into making communication go well or making things easy for us).
Nate is clearly trying to solve the problem, and has been working on it for a long time. I can see how it would be frustrating when people aren’t understanding something that you worked out 10 years ago (or were possibly never confused about in the first place). I can imagine that it really sucks being in Nate’s position, feeling the world is burning, almost no one is trying to save it, those who are trying to save it are looking at the wrong thing, and even when you try to point people at the thing to look at they keep turning to look at something else (something easier, less scary, more approachable, but useless).
We actually did learn a bunch of things, and I think most/all of us feel like we can think better about alignment than before we started. There is some MIRI/Nate/Eliezer frame of the alignment problem that basically no one else has. I think it is very hard to work this out just from MIRI’s public writing, particularly the content related to the Sharp Left Turn. But from talking to Nate (a lot), I think I do (partially) understand this frame, I think this is not nonsense, and is important.
If this frame is the correct one, and working with Nate in a somewhat painful environment is the only way to learn it, then this does seem to be worth it. (Note that I am not convinced that the environment needed to be this hard, and it seems very likely to me that we should have been able to have meetings which were both less difficult and more productive).
It also seems important to note that when chatting with Nate about things other than alignment the conversations were good. They didn’t have this “bulldozer” quality, they were frequently fun and kind, and didn’t feel “unsafe”.
I have some empathy for the position that Nate didn’t really sign up to be a mentor, and we suddenly had all these expectations for him. And then the project kind of morphed into a thing where we expected Nate-mentorship, which he provided somewhat grudgingly, and he assumed that, because we kept requesting meetings, we were ok with dealing with the communication difficulties.
I would probably ex post still decide to join the project
I think I learned a lot, and the majority of this is because of Nate’s mentorship. I am genuinely grateful for this.
I do think that the project could have been more efficient if we had better communication, and it does feel (from my non-Nate perspective) that this should have been an option.
I think that being warned/informed earlier about likely communication difficulties would have helped us prepare and mitigate these, rather than getting somewhat blindsided. It would also have just been nice to have some explicit agreement for the new norms, and some acknowledgement that these are not standard communication norms.
I feel pretty conflicted about various things. I think that there should clearly be incentives such that people with power can’t get away with being disrespectful/mean to people under them. Most people should be able to do this. I think that sometimes people should be able to lay out their abnormal communication norms, and give people the option of engaging with them or not (I’m pretty confused about how this interacts with various power dynamics). I wouldn’t want strict rules on communication stopping people like Nate being able to share their skills/knowledge/etc with others; I would like those others to be fully informed about what they are getting into.
I think the 2021 MIRI Conversations and 2022 MIRI Alignment Discussion sequences are an attempt at this. I feel like I have a relatively good handle on their frame after reading those sequences, and I think the ideas contained within are pretty insightful.
Like Zvi, I might be confused about how confused I am, but I don't think it's because they're trying to keep their views secret. Maybe there's some more specific capabilities-adjacent stuff they're not sharing, but I suspect the thing the grandparent is getting at is more about a communication diffi...