Ok, so we both had some feelings about the recent Conjecture post on "lots of people in AI Alignment are lying", and the associated marketing campaign and stuff.
I would appreciate some context in which I can think through that, and also to share info we have in the space that might help us figure out what's going on.
I expect this will pretty quickly cause us to end up on some broader questions about how to do advocacy, how much the current social network around AI Alignment should coordinate as a group, how to balance advocacy with research, etc.
Feelings about Conjecture post:
- Lots of good points about people not stating their full beliefs messing with the epistemic environment and making it costlier for others to be honest.
- The lying and cowardice frames feel off to me.
- I personally used to have a very similar rant to Conjecture. Since moving to DC, I'm more sympathetic to those working in or with government. We could try to tease out why.
- The post exemplifies a long-term gripe I have with Conjecture's approach to discourse & advocacy, which I've found pretty lacking in cooperativeness and openness. (Note: I worked there for ~half a year.)
Questions on my mind:
- How open should people motivated by existential risk be? (My shoulder model of several people says "take a portfolio approach!" - OK, then what allocation?)
- How advocacy-y should people be? I want researchers to not have to tweet their beliefs 24/7 so they can actually get work done
- How do you think about this, Oli?
How sympathetic to be about governance people being private about some motivations and affiliations
I personally used to have a very similar rant to Conjecture. I'm now more sympathetic to governance people. We could try to tease out why.
This direction seems most interesting to me!
My current feelings in the space are that I am quite sympathetic to some comms-concerns that people in government have and quite unsympathetic to some other stuff, and I would also like to clarify for myself where the lines here are.
Curious whether you have any key set of observations or experiences you had that made you more sympathetic.
Observations
I've heard secondhand of at least one instance where a person brought up x-risk and their Congressional office then took them less seriously. Other staffers have told me that talking about x-risk wouldn't play well (they didn't cite specific evidence, but I take their opinions seriously).
- (This didn't update me a ton though. My model already included "most people will think this is weird and take you less seriously". The question is, "Do you make it likelier for people to do good things later, all things considered, by improving their beliefs, shifting the Overton window, convincing 1 in 10 people, etc.?")
I've also personally found it tricky to talk about takeover & existential risks, just because these ideas take a long time to explain, and there are many inferential steps between there and the policies I'm recommending. So, I'm often tempted to mention my x risk motivations only briefly, then focus on whatever's inferentially closest and still true. (Classically, this would be "misuse risks, especially from foreign adversaries and terrorists" and "bioweapon and cyberoffensive capabilities coming in the next few years".)
Separate point which we might want to discuss later
A thing I'm confused about is:
Should I talk about the inferentially close things that make them likeliest to embrace the policies I'm putting on their desk,
Or, should I just bite the bullet of being confusing and start many meetings with "I'm deeply concerned about humanity going extinct in the next decade because of advancing AI which might try to take over the world. It's a lot to explain but the scientists are on my side. Please help." — where the thing I'm trying to emphasize is the tone of worry.
Because I buy that we're systematically misleading people about how worried we are / they should be by not focusing on our actual concerns, and by not talking about them with a tone that conveys how worried we in fact are.
(Additional explanation added post-dialogue:
- I was trying to differentiate between two issues with people not openly sharing & focusing on their existential risk worries:
- Issue 1 is that by not focusing on your existential risk worries, you're distorting people's sense of what you think is actually important and why. I think Habryka & Conjecture are correct to point out this is misleading (I think in the sense of harmful epistemic effects, rather than unethical intentions).
- Issue 2, which I'm trying to get at above, is about the missing mood. AI governance comms often goes something like "AI has immense potential, but also immense risks. AI might be misused by China, or get out of control. We should balance the needs for innovation and safety." I wouldn't call this lying (though I agree it can have misleading effects, see Issue 1). The thing I would emphasize is that it doesn't sound like someone who thinks we all might die. It doesn't convey "AI advancing is deeply scary. Handling this might require urgent, extraordinarily difficult and unprecedented action." As such, I suspect it's not causing people to take the issue seriously enough to do the major governance interventions we might need. Sure, we might be getting government officials to mention catastrophic risk in their press releases, but do they really take 10% p(doom) seriously? If not, our comms seem insufficient.
- Of course, I doubt we should always use the "deeply concerned" tone. It depends on what we're trying to do. I'm guessing the question is how much are we trying to get policies through now vs. trying to get people to take the issue as seriously as us? Also, I admit it's even more complicated because sometimes sounding grave gets you laughed out of rooms instead of taken seriously, different audiences have different needs, etc.)
So, I think for me, I feel totally sympathetic to people finding it hard and often not worth it to explain their x-risk concerns, if they are talking to people who don't really have a handle for that kind of work yet.
Like, I have that experience all the time as well, where I work with various contractors, or do fundraising, and I try my best to explain what my work is about, but it sure does often end up rounded off to some random preconception they have (sometimes that's standard AI ethics, sometimes that's cognitive science and psychology, sometimes that's people thinking I run a web-development startup that's trying to maximize engagement metrics).
The thing that I am much less sympathetic to is people being like "please don't talk about my connections to this EA/X-Risk ecosystem, please don't talk about my beliefs in this space to other people, please don't list me as having been involved with anything in this space publicly, etc."
Or like, the thing with Jason Matheny that I mentioned in a comment the other day where the senator that was asking him a question had already mentioned "apocalyptic risks from AI" and was asking Matheny how likely and how soon that would happen, and Matheny just responded with "I don't know".
Those to me don't read as people having trouble crossing an inferential distance. They read to me more as trying to be intentionally obfuscatory/kind-of-deceptive about their beliefs and affiliations here, and that feels much more like it's crossing lines.
Regarding "please don't talk about my connections or beliefs" ––
I'm not sure how bad I feel about this. I usually like openness. But I'm also usually fine with people being very private (but answering direct questions honestly). E.g. it feels weird to write online about two people dating when they'd prefer that info is private. I'm trying to sort out what the difference between that and EA-affiliation is now....
OK, maybe the difference is in why they're being private: "who I'm dating" just feels personal, while hiding "EA association" is about protecting their ability to do stuff. I take that point, and I share some distaste for it.
Anecdote: A few times AI governance people (FWIW: outside DC, who weren't very central to the field) told me that if someone asked how we knew each other, they would downplay our friendship, and they requested I do the same. I felt pretty sad and frustrated by the position this put me in, where I would be seen as defecting for being honest.
All this said, journalists and the political environment can be very adversarial. It seems reasonable and expected to conceal stuff when you know info will be misunderstood if not intentionally distorted. Thoughts?
I think the one thing that feels bad to me is people in DC trying to recruit fellow EAs or X-risk concerned people into important policy positions while also explicitly asking them to keep the strength of their personal connections on the down-low, and punishing people who openly talk about who they knew and their social relationships within DC. I've heard this a lot from a bunch of EAs in DC and it seems to be a relatively common phenomenon.
I am also really worried that it is the kind of thing that, in the course of playing out, masks the problems it causes (like, we wouldn't really know if this is causing tons of problems, because people are naturally incentivized to cover up the problems it causes, and because making accusations of this kind of conspiratorial action is really high-stakes and hard), and so if it goes wrong, it will probably go wrong quickly and with a lot of pent-up energy.
I do totally think there are circumstances where I, together with my friends, will hide things from other people, and be quite strategic about it. The classical examples are discussing the problems of some kind of authoritarian regime while you are living under it (Soviet Russia, Nazi Germany, etc.), and things like "being gay" where society seems kind of transparently unreasonable about it, and in those situations I am opting into other people getting to impose this kind of secrecy and obfuscation request on me. I also feel kind of similar about people being polyamorous, which is pretty relevant to my life, since that still has a pretty huge amount of stigma attached to it.
I do think I experienced a huge shift after the collapse of FTX where I was like "Ok, but after you caused the biggest fraud since Enron, you really lost your conspiracy license. Like, 'the people' (broadly construed) now have a very good reason for wanting to know the social relationships you have, because the last time they didn't pay attention to this it turned out to have been one of the biggest frauds of the last decade".
I care about this particularly much because indeed FTX/Sam was actually really very successful at pushing through regulations and causing governance change, but as far as I can tell he was primarily acting with an interest in regulatory capture (having talked to a bunch of people who seem quite well-informed in this space, it seems that he was less than a year away from basically getting a US sponsored monopoly on derivatives trading in the US via regulatory capture). And like, I can't distinguish the methods he was using from the methods other EAs are using right now (though I do notice some difference).
(Edited note: I use "conspiracy" and "conspiratorial" a few times in this conversation. I am not hugely happy with that choice of words since it's the kind of word that often imports negative associations without really justifying them. I do somewhat lack a better word for the kind of thing that I am talking about, and I do intend some of the negative associations, but I still want to flag that I think accusing things of being "conspiracies" is a pretty frequent underhanded strategy that people use to cause negative associations with something without there being anything particularly clear to respond to.
In this case I want to be clear that what I mean by "conspiratorial" or "conspiracy" is something like "a bunch of people trying pretty hard to hide the existence of some alliance of relatively well-coordinated people, against the wishes of other people, while doing things like lying, omitting clearly relevant information, exaggerating in misleading ways, and doing so with an aim that is not sanctioned by the larger group they are hiding themselves from".
As I mention in this post, I think some kinds of conspiracies under this definition are totally fine. As an example I bring up later, I totally support anti-nazi conspiracies during WW2.)
Yeah, I was going to say I imagine some people working on policy might argue they're in a situation where hiding it is justified because EA associations have a bunch of stigma.
But given FTX and such, some stigma seems deserved. (Sigh about the regulatory capture part — I wasn't aware of the extent.)
For reference, the relevant regulation to look up is the Digital Commodities Consumer Protection Act.
Feelings about governance work by the XR community
I'm curious for you to list some things governance people are doing that you think are bad or fine/understandable, so I can see if I disagree with any.
I do think that at a high level, the biggest effect I am tracking from the governance people is that in the last 5 years or so, they were usually the loudest voices that tried to get people to talk less about existential risk publicly, and to stay out of the media, and to not reach out to high-stakes people in various places, because they were worried that doing so would make us look like clowns and would poison the well.
And then one of my current stories is that at some point, mostly after FTX when people were fed up with listening to some vague EA conservative consensus, a bunch of people started ignoring that advice and finally started saying things publicly (like the FLI letter, Eliezer's TIME piece, the CAIS letter, Ian Hogarth's piece). And then that's the thing that's actually been moving things in the policy space.
My guess is we maybe could have also done that at least a year earlier, and honestly I think given the traction we had in 2015 on a lot of this stuff, with Bill Gates and Elon Musk and Demis, I think there is a decent chance we could have also done a lot of Overton window shifting back then, and us not having done so is I think downstream of a strategy that wanted to maintain lots of social capital with the AI capability companies and random people in governments who would be weirded out by people saying things outside of the Overton window.
Though again, this is just one story, and I also have other stories where it all depended on ChatGPT and GPT-4 and before then you would have been laughed out of the room if you had brought up any of this stuff (though I do really think the 2015 Superintelligence stuff is decent evidence against that). It's also plausible to me that you need a balance of inside and outside game stuff, and that we've struck a decent balance, and that yeah, having inside and outside game means there will be conflict between the different people involved in the different games, but it's ultimately the right call in the end.
While you type, I'm going to reflect on my feelings about various things governance people might do:
- Explicitly downplaying one's worries about x-risk, e.g. saying "there's some chance of catastrophe" when you think it's overwhelmingly likely: dislike
- Edited in: Of course, plenty of people don’t think it’s overwhelmingly likely. They shouldn’t exaggerate either. I just want people to be clear enough such that the listener’s expectations are as close to reality as you can reasonably get them.
- Explicitly downplaying one's connections to EA: dislike
- Asking everyone to be quiet and avoid talking to media because of fears of well-poisoning: dislike
- Asking everyone to be quiet because of fears of the government accelerating AI capabilities: I don't know, leans good
- Asking people not to bring up your EA associations and carefully avoiding mentioning it yourself: I don't know, depends
- Why this might be okay/good: EA has undeserved stigma, and it seems good for AI policy folks to become/stay separate from EA.
- Why this might be bad: Maybe after FTX, anyone associated with EA should try extra hard to be open, in case anything else bad is happening in their midst
- Current reconciliation: Be honest about your associations if asked and don't ask others to downplay them, but feel free to purposefully distance yourself from EA and avoid bringing it up
- Not tweeting about or mentioning x-risk in meetings where inferential gap is too big: I don't know, leans fine
- I'm conflicted between "We might be able to make the Overton window much wider and improve everyone's sanity by all being very open and explicit about this often" and "It's locally pretty burdensome to frequently share your views to people without context"
The second biggest thing I am tracking is a kind of irrecoverable loss of trust that I am worried will happen between "us" and "the public", or something in that space.
Like, a big problem with doing this kind of information management where you try to hide your connections and affiliations is that it's really hard for people to come to trust you again afterwards. If you get caught doing this, it's extremely hard to rebuild trust that you aren't doing this in the future, and I think this dynamic usually results in some pretty intense immune reactions when people fully catch up with what is happening.
Like, I am quite worried that we will end up with some McCarthy-esque immune reaction to EA people in the US and the UK government where people will be like "wait, what the fuck, how did it happen that this weirdly intense social group with strong shared ideology is now suddenly having such an enormous amount of power in government? Wow, I need to kill this thing with fire, because I don't even know how to track where it is, or who is involved, so paranoia is really the only option".
On "CAIS/Ian/etc. finally just said they're really worried in public"
I think it's likely the governance folk wouldn't have done this themselves at that time, had no one else done it. So, I'm glad CAIS did.
I'm not convinced people could have loudly said they're worried about extinction from AI pre-ChatGPT without the blowback people feared.
On public concern,
I agree this is possible...
But I'm also worried about making the AI convo so crowded, by engaging with the public a lot, that x-risk doesn't get dealt with. I think a lot of the privacy isn't malicious but practical. "Don't involve a bunch of people in your project who you don't actually expect to contribute, just because they might be unhappy if they're not included".
Stigmas around EA & XR in the policy world
I'm conflicted between "EA has undeserved stigma" and "after FTX, everyone should take it upon themselves to be very open"
I am kind of interested in talking about this a bit. I feel like it's a thing I've heard a lot, and I guess I don't super buy it. What is the undeserved stigma that EA is supposed to have?
Is your claim that EA's stigma is all deserved?
Laying out the stigmas I notice:
- Weird beliefs lead to corruption, see FTX
- Probably exaggerates the extent of connection each individual had to FTX. But insofar as FTX's decisions were related to EA culture, fair enough.
- Have conflicts with the labs, see OpenAI & Anthropic affiliations
- Fair enough
- Elite out-of-touch people
- Sorta true (mostly white, wealthy, coastal), sorta off (lots of people didn't come from wealth, and they got into this space because they wanted to be as effectively altruistic as possible)
- Billionaires' selfish interests
- Seems wrong; we're mostly trying to help people rather than make money
- Weird longtermist techno-optimists who don't care about people
- Weird longtermists - often true.
- Techno-optimists - kinda true, but some of us are pretty pessimistic about AI and about using AI to solve that problem
- Don't care about people - mostly wrong.
I guess the stigmas seem pretty directionally true about the negatives, and just miss that there is serious thought / positives here.
If journalists said, "This policy person has EA associations. That suggests they've been involved with the community that's thought most deeply about catastrophic AI risk and has historically tried hard and succeeded at doing lots of good. It should also raise some eyebrows, see FTX," then I'd be fine. But usually it's just the second half, and that's why I'm sympathetic to people avoiding discussing their affiliation.
Yeah, I like this analysis, and I think it roughly tracks how I am thinking about it.
I do think the bar for "your concerns about me are so unreasonable that I am going to actively obfuscate any markers of myself that might trigger those concerns" is quite high. Like I think the bar can't be at "well, I thought about these concerns and they are not true", it has to be at "I am seriously concerned that when the flags trigger you will do something quite unreasonable", like they are with the gayness and the communism-dissenter stuff.
Fair enough. This might be a case of governance people overestimating honesty costs / underestimating benefits, which I still think they often directionally do.
(I'll also note, what if all the high profile people tried defending EA? (Defending in the sense of - laying out the "Here are the good things; here are the bad things; here's how seriously I think you should take them, all things considered."))
I don't think people even have to defend EA or something. I think there are a lot of people who justifiably would like to distance themselves from that identity and social network because they have genuine concerns about it.
But I think a defense would definitely open the door for a conversation that acknowledges that of course there is a real thing here that has a lot of power and influence, and would invite people tracking the structure of that thing and what it might do in the future, and if that happens I am much less concerned about both the negative epistemic effects and the downside risk from this all exploding in my face.
Interesting.
Do you have ideas for ways to make the thing you want here happen? What does it look like? An op-ed from Person X?
How can we make policy stuff more transparent?
Probably somewhat controversial, but I've been kind of happy about the Politico pieces that have been published. We had two that basically tried to make the case that there is an EA conspiracy in DC that has lots of power in a kind of unaccountable way.
Maybe someone could reach out to the author and be like "Ok, yeah, we are kind of a bit conspiratorial, sorry about that. But I think let's try to come clean, I will tell you all the stuff that I know, and you take seriously the hypothesis that we really aren't doing this to profit off of AI, but because we are genuinely concerned about catastrophic risks from AI".
That does seem like kind of a doomed plan, but like, something in the space feels good to me. Maybe we (the community) can work with some journalists we know to write a thing that puts the cards on the table, and isn't just a puff-piece that tries to frame everything in the most positive way, but is genuinely asking hard questions.
Politico: +1 on being glad it came out actually!
- (I tentatively wish people had just said this themselves first instead of it being "found out". Possibly this is part of how I'll make the case to people I'm asking to be more open in the future.)
- (Also, part of my gladness with Politico is that the more that comes out, the more governance people can evaluate how much this actually blocks their work -- so far, I think very little -- and update towards being more open, or just be more open now that their affiliations have been revealed.)
I feel mixed about the idea of writing something. If you go with journalists, I'd want to find one who seems really truth-seeking. Also, I could see any piece playing poorly; definitely collect feedback on your draft and avoid sharing info unilaterally.
Would you do this? Would you want some bigger name EA / governance person to do it?
I think I have probably sadly burdened myself with somewhat too much confidentiality to dance this dance correctly, though I am not sure. I might be able to get buy-in from a bunch of people so that I can be free to speak openly here, but it would increase the amount of work a lot, and there's also a decent chance they don't say yes, and then I would need to be super paranoid about which bits I leak when working on this.
As in, you've agreed to keep too much secret?
If so, do you have people in mind who aren't as burdened by this (and who have the relevant context)?
Yeah, too many secrets.
I assume most of the big names have similar confidentiality burdens.
Yeah, ideally it would be one of the big names, since that would, I think, meaningfully cause a shift in how people operate in the space.
Eliezer is great at moving Overton windows like this, but I think he is really uninterested in tracking detailed social dynamics like this, and so doesn't really know what's up.
Do you have time to have some chats with people about the idea or send around a Google doc?
I do feel quite excited about making this happen, though I do think it will be pretty aggressively shut down, and I feel both sad about that, and also have some sympathy in that it does feel like it somewhat inevitably involves catching some people in the cross-fire who were being more private for good reasons, or who are in a more adversarial context where the information here will be used against them in an unfair way, and I still think it's worth it, but it does make me feel like this will be quite hard.
I also notice that I am just afraid of what would happen if I were to e.g. write a post that's just like "an overview over the EA-ish/X-risk-ish policy landscape" that names specific people and explains various historical plans. Like I expect it would make me a lot of enemies.
Same, and some of my fear is "this could unduly make the 'good plans' success much harder"
Ok, I think I will sit on this plan for a bit. I hadn't really considered it before, and I kind of want to bring it up to a bunch of people in the next few weeks and see whether maybe there is enough support for this to make it happen.
Okay! (For what it's worth, I currently like Eliezer most, if he was willing to get into the social stuff)
Any info it'd be helpful for me to collect from DC folk?
Oh, I mean I would love any more data on how much this would make DC folk feel like some burden was lifted from their shoulders vs. feel like it would just fuck with their plans.
I think my actual plan here would maybe be more like an EA Forum post or something that just goes into a lot of detail on what is going on in DC, and isn't afraid to name specific names or organizations.
I can imagine directly going for an Op-ed could also work quite well, and would probably be more convincing to outsiders, though maybe ideally you could have both. Where someone writes the forum post on the inside, and then some external party verifies a bunch of the stuff, and digs a bit deeper, and then makes some critiques on the basis of that post, and then the veil is broken.
Got it.
Would the DC post include info that these people have asked/would ask you to keep secret?
Definitely "would", though if I did this I would want to sign-post that I am planning to do this quite clearly to anyone I talk to.
I am also burdened with some secrets here, though not that many, and I might be able to free myself from those burdens somehow. Not sure.
Ok I shall ask around in the next 2 weeks. Ping me if I don't send an update by then
Thank you!!
Concerns about Conjecture
Ok, going back a bit to the top-level, I think I would still want to summarize my feelings on the Conjecture thing a bit more.
Like, I guess the thing that I would feel bad about if I didn't say it in a context like this, is to be like "but man, I feel like some of the Conjecture people were like at the top of my list of people trying to do weird epistemically distortive inside-game stuff a few months ago, and this makes their calling out people like this feel quite bad to me".
In general, a huge component of my reaction to that post was something in the space of "Connor and Gabe are kind of on my list to track as people I feel most sketched out by in a bunch of different ways, and kind of in the ways the post complains about", and I feel somewhat bad for having dropped my efforts from a few months ago to do some more investigation here and write up my concerns (mostly because I was kind of hoping a bit that Conjecture would just implode as it ran out of funding and maybe the problem would go away).
(For what it's worth, Conjecture has been pretty outside-game-y in my experience. My guess is this is mostly a matter of "they think outside game is the best tactic, given what others are doing and their resources", but they've also expressed ethical concerns with the inside game approach.)
(For some context on this, Conjecture tried really pretty hard a few months ago to get a bunch of the OpenAI critical comments on this post deleted because they said it would make them look bad to OpenAI and would antagonize people at labs in an unfair way and would mess with their inside-game plans that they assured me were going very well at the time)
(I heard a somewhat different story about this from them, but sure, I still take it as evidence that they're mostly "doing whatever's locally tactical")
Anyway, I was similarly disappointed by the post just given I think Conjecture has often been lower integrity and less cooperative than others in/around the community. For instance, from what I can tell,
- They often do things of the form "leaving out info, knowing this has misleading effects"
- One of their reasons for being adversarial is "when you put people on the defense, they say more of their beliefs in public". Relatedly, they're into conflict theory, which leads them to favor "fight for power" > "convince people with your good arguments."
I have a doc detailing my observations that I'm open to sharing privately, if people DM me.
(I discussed these concerns with Conjecture at length before leaving. They gave me substantial space to voice these concerns, which I'm appreciative of, and I did leave our conversations feeling like I understood their POV much better. I'm not going to get into "where I'm sympathetic with Conjecture" here, but I'm often sympathetic. I can't say I ever felt like my concerns were resolved, though.)
I would be interested in your concerns being written up.
I do worry about the EA x Conjecture relationship just being increasingly divisive and time-suck-y.
Here is an email I sent Eliezer on April 2nd this year with one paragraph removed for confidentiality reasons:
Hey Eliezer,
This is just an FYI and I don't think you should hugely update on this but I felt like I should let you know that I have had some kind of concerning experiences with a bunch of Conjecture people that currently make me hesitant to interface with them very much and make me think they are somewhat systematically misleading or deceptive. A concrete list of examples:
I had someone reach out to me with the following quote:
Mainly, I asked one of their senior people how they plan to make money because they have a lot of random investors, and he basically said there was no plan, AGI was so near that everyone would either be dead or the investors would no longer care by the time anyone noticed they weren’t seeming to make money. This seems misleading either to the investors or to me — I suspect me, because it would really just be wild if they had no plan to ever try to make money, and in fact they do actually have a product (though it seems to just be Whisper repackaged)
I separately had a very weird experience with them on the Long Term Future Fund where Connor Leahy applied for funding for EleutherAI. We told him we didn't want to fund EleutherAI since it sure mostly seemed like capabilities research, but we would be pretty interested in funding AI Alignment research by some of the same people.
He then confusingly went around to a lot of people around EleutherAI and told them that "Open Phil is not interested in funding pre-paradigmatic AI Alignment research and that that is the reason why they didn't fund Eleuther AI".
This was doubly confusing and misleading because Open Phil had never evaluated a grant to EleutherAI (Asya, who works at Open Phil, was involved in the grant evaluation as a fund member, but nothing else), and of course the reason he cited had nothing to do with the reason we actually gave. He seems to have kept saying this for a long time even after I think someone explicitly corrected the statement to him.
Another experience I had was Gabe from Conjecture reaching out to LessWrong and trying really quite hard to get us to delete the OpenAI critical comments on this post: https://www.lesswrong.com/posts/3S4nyoNEEuvNsbXt8/common-misconceptions-about-openai
He said he thought people in-general shouldn't criticize OpenAI in public like this because this makes diplomatic relationships much harder, and when Ruby told them we don't delete that kind of criticism he escalated to me and generally tried pretty hard to get me to delete things.
[... One additional thing that's a bit more confidential but of similar nature here...]
None of these are super bad but they give me an overall sense of wanting to keep a bunch of distance from Conjecture, and trepidation about them becoming something like a major public representative of AI Alignment stuff. When I talked to employees of Conjecture about these concerns, the responses I got also didn't tend to be "oh, no, that's totally out of character", but more like "yeah, I do think there is a lot of naive consequentialism here and I would like your help fixing that".
No response required, happy to answer any follow-up questions. Just figured I would err more on the side of sharing things like this post-FTX.
Best,
Oliver
I wish MIRI was a little more loudly active, since I think doomy people who are increasingly distrustful of moderate EA want another path, and supporting Conjecture seems pretty attractive from a distance.
Again, I'm not sure "dealing with Conjecture" is worth the time though.
Main emotional effects of the post for me
- I wish someone else had made these points, less adversarially. I feel like governance people do need to hear them. But the frame might make people less likely to engage, or make the engagement worse.
- Actually, I will admit the post generated lots of engagement in comments and this discussion. It feels uncooperative to solicit engagement via being adversarial though.
- I'm disappointed the comments were mostly "Ugh Conjecture is being adversarial" and less about "Should people be more publicly upfront about how worried they are about AI?"
- There were several other community discussions in the past few weeks that I'll tentatively call "heated community politics", and I feel overall bad about the pattern.
- (The other discussions were around whether RSPs are bad and whether Nate Soares is bad. In all three cases, I felt like those saying "bad!" had great points, but (a) their points were shrouded in frames of "is this immoral" that felt very off to me, (b) they felt overconfident and not truth-seeking, and (c) I felt like people were half-dealing with personal grudges. This all felt antithetical to parts of LessWrong and EA culture I love.)
Yeah, that also roughly matches my emotional reaction. I did like the other RSP discussion that happened that week (and liked my dialogue with Ryan which I thought was pretty productive).
Conjecture as the flag for doomers
I wish MIRI was a little more loudly active, since I think doomy people who are increasingly distrustful of moderate EA want another path, and supporting Conjecture seems pretty attractive from a distance.
Yeah, I share this feeling. I am quite glad MIRI is writing more, but am also definitely worried that somehow Conjecture has positioned itself as being aligned with MIRI in a way that makes me concerned people will end up feeling deceived.
Two thoughts
- Conjecture does seem pretty aligned with MIRI in "shut it all down" and "alignment hard" (plus more specific models that lead there).
- I notice MIRI isn't quite a satisfying place to rally around, since MIRI doesn't have suggestions for what individuals can do. Conjecture does.
Can you say more about the feeling deceived worry?
(I didn't feel deceived having joined myself, but maybe "Conjecture could've managed my expectations about the work better" and "I wish the EAs with concerns told me so more explicitly instead of giving very vague warnings".)
Well, for better or for worse I think a lot of people seem to make decisions on the basis of "is this thing a community-sanctioned 'good thing to do (TM)'". I think this way of making decisions is pretty sus, and I feel a bit confused how much I want to take responsibility for people making decisions this way, but I think because Conjecture and MIRI look similar in a bunch of ways, and because Conjecture is kind of explicitly trying to carry the "doomer" flag, a lot of people will parse Conjecture as "a community-sanctioned 'good thing to do (TM)'".
I think this kind of thing then tends to fail in one of two ways:
- The person who engaged more with Conjecture realizes that Conjecture is much more controversial than they realized within the status hierarchy of the community and that it's not actually clearly a 'good thing to do (TM)', and then they will feel betrayed by Conjecture for hiding that from them and betrayed by others by not sharing their concerns with them
- The person who engaged much more with Conjecture realizes that the organization hasn't really internalized the virtues that they associate with getting community approval, and then they will feel unsafe and like the community is kind of fake in how it claims to have certain virtues but doesn't actually follow them in the projects that have "official community approval"
Both make me pretty sad.
Also, even if you are following a less dumb decision-making structure, the world is just really complicated, and especially with tons of people doing hard-to-track behind the scenes work, it is just really hard to figure out who is doing real work or not, and Conjecture has been endorsed by a bunch of different parts of the community for-real (like they received millions of dollars in Jaan funding, for example, IIRC), and I would really like to improve the signal to noise ratio here, and somehow improve the degree to which people's endorsements accurately track whether a thing will be good.
Fair. People did warn me before I joined Conjecture (but it didn't feel very different from warnings I might get before working at MIRI). Also, most people I know in the community are aware Conjecture has a poor reputation.
I'd support, and am open to writing, a Conjecture post explaining the particulars of:
- Experiences that make me question their integrity
- Things I wish I knew before joining
- My thoughts on their lying post and RSP campaign (tl;dr: important truth to the content, but really dislike the adversarial frame)
Well, maybe this dialogue will help, if we edit and publish a bunch of it.
On that, here are a few examples of Conjecture leaving out info in what I think is a misleading way.
(Context: Control AI is an advocacy group, launched and run by Conjecture folks, that is opposing RSPs. I do not want to discuss the substance of Control AI’s arguments -- nor whether RSPs are in fact good or bad, on which question I don’t have a settled view -- but rather what I see as somewhat deceptive rhetoric.)
One, Control AI’s X account features a banner image with a picture of Dario Amodei (“CEO of Anthropic, $2.8 billion raised”) saying, “There’s a one in four chance AI causes human extinction.” That is misleading. What Dario Amodei has said is, “My chance that something goes really quite catastrophically wrong on the scale of human civilisation might be somewhere between 10-25%.” I understand that it is hard to communicate uncertainty in advocacy, but I think it would at least have been more virtuous to use the middle of that range (“one in six chance”), and to refer to “global catastrophe” or something rather than “human extinction”.
Two, Control AI writes that RSPs like Anthropic’s “contain wording allowing companies to opt-out of any safety agreements if they deem that another AI company may beat them in their race to create godlike AI”. I think that, too, is misleading. The closest thing Anthropic’s RSP says is:
Anthropic’s RSP is clearly only meant to permit labs to opt out when any other outcome very likely leads to doom, and for this to be coordinated with the government, with at least some degree of transparency. The scenario is not “DeepMind is beating us to AGI, so we can unilaterally set aside our RSP”, but more like “North Korea is beating us to AGI, so we must cooperatively set aside our RSP”.
Relatedly, Control AI writes that, with RSPs, companies “can decide freely at what point they might be falling behind – and then they alone can choose to ignore the already weak” RSPs. But part of the idea with RSPs is that they are a stepping stone to national or international policy enforced by governments. For example, prior to the Control AI campaign, ARC and Anthropic both explicitly said that they hope RSPs will be turned into standards/regulation. (That seems quite plausible to me as a theory of change.) Also, Anthropic commits to only updating its RSP in consultation with its Long-Term Benefit Trust (consisting of five people without any financial interest in Anthropic) -- which may or may not work well, but seems sufficiently different from Anthropic being able to “decide freely” when to ignore its RSP that I think Control AI’s characterisation is misleading. Again, I don't want to discuss the merits of RSPs, I just think Control AI is misrepresenting Anthropic's and others' positions.
Three, Control AI seems to say that Anthropic’s advocacy for RSPs is an instance of safetywashing and regulatory capture. (Connor Leahy: “The primary aim of responsible scaling is to provide a framework which looks like something was done so that politicians can go home and say: ‘We have done something.’ But the actual policy is nothing.” And also: “The AI companies in particular and other organisations around them are trying to capture the summit, lock in a status quo of an unregulated race to disaster.”) I don’t know exactly what Anthropic’s goals are -- I would guess that its leadership is driven by a complex mixture of motivations -- but I doubt it is so clear-cut as Leahy makes it out to be.
To be clear, I think Conjecture has good intentions, and wants the whole AI thing to go well. I am rooting for its safety work and looking forward to seeing updates on CoEm. And again, I personally do not have a settled view on whether RSPs like Anthropic’s are in fact good or bad, or on whether it is good or bad to advocate for them – it could well be that RSPs turn out to be toothless, and would displace better policy – I only take issue with the rhetoric.
(Disclosure: Open Philanthropy funds the organisation I work for, though the above represents only my views, not my employer’s.)
I'm surprised to hear they're posting updates about CoEm.
At a conference held by Connor Leahy, I said that I thought it was very unlikely to work, and asked why they were interested in this research area, and he answered that they were not seriously invested in it.
We didn't develop the topic and it was several months ago, so it's possible that (1) I misremember, (2) they changed their minds, or (3) I appeared adversarial and he didn't feel like debating CoEm. (For example, maybe he actually said that CoEm didn't look promising and this changed recently?)
Still, anecdotal evidence is better than nothing, and I look forward to seeing OliviaJ compile a document to shed some light on it.