Cross-posted from the EA Forum. See the original here. Internal linking has not been updated for LW due to time constraints and will take you back to the original post. 

In this series, we consider AI safety organizations that have received more than $10 million per year in funding. There have already been several conversations and critiques around MIRI (1) and OpenAI (1,2,3), so we will not be covering them. The authors include one technical AI safety researcher (>4 years of experience) and one non-technical community member with experience in the EA community. We would prefer to make our critiques non-anonymously, but believe doing so would be professionally unwise. We believe our criticisms stand on their own without appeal to our positions. Readers should not assume that we are completely unbiased or that we have nothing to personally or professionally gain from publishing these critiques. We’ve tried to take the benefits and drawbacks of the anonymous nature of our post seriously and carefully, and are open to feedback on anything we might have done better.

This is the second post in this series and it covers Conjecture. Conjecture is a for-profit alignment startup founded in late 2021 by Connor Leahy, Sid Black and Gabriel Alfour, which aims to scale applied alignment research. Based in London, Conjecture has received $10 million in funding from venture capitalists (VCs), and recruits heavily from the EA movement. We shared a draft of this document with Conjecture for feedback prior to publication, and include their response below. We also requested feedback on a draft from a small group of experienced alignment researchers from various organizations, and have invited them to share their views in the comments of this post.

We would like to invite others to share their thoughts in the comments openly if you feel comfortable, or contribute anonymously via this form. We will add inputs from there to the comments section of this post, but will likely not be updating the main body of the post as a result (unless comments catch errors in our writing).

Key Takeaways

For those with limited knowledge and context on Conjecture, we recommend first reading or skimming the About Conjecture section. 

Time to read the core sections (Criticisms & Suggestions and Our views on Conjecture) is 22 minutes. 

Criticisms and Suggestions

  • We think Conjecture’s research is low quality (read more). 
    • Their posts don’t always make their assumptions clear or state what evidence base they have for a given hypothesis, and the evidence they do provide is frequently cherry-picked. We also think their bar for publishing is too low, which lowers the signal-to-noise ratio. Conjecture has acknowledged some of these criticisms, but not all (read more).
    • We make specific critiques of examples of their research from their initial research agenda (read more).
    • There is limited information available on their new research direction (cognitive emulation), but from the publicly available information it appears extremely challenging and so we are skeptical as to its tractability (read more).
  • We have some concerns with the CEO’s character and trustworthiness because, in order of importance (read more):
    • The CEO and Conjecture have misrepresented themselves to external parties multiple times (read more);
    • The CEO’s involvement in EleutherAI and Stability AI has contributed to race dynamics (read more);
    • The CEO previously overstated his accomplishments in 2019 (when an undergrad) (read more);
    • The CEO has been inconsistent over time regarding his position on releasing LLMs (read more).
  • We believe Conjecture has scaled too quickly before demonstrating they have promising research results, and believe this will make it harder for them to pivot in the future (read more).
  • We are concerned that Conjecture does not have a clear plan for balancing profit and safety motives (read more).
  • Conjecture has had limited meaningful engagement with external actors (read more):
    • Conjecture lacks productive communication with external actors within the TAIS community, often reacting defensively to negative feedback and failing to address core points (read more);
    • Conjecture has not engaged sufficiently with the broader ML community; we think they would receive valuable feedback by engaging more. We’ve written more about this previously (read more).

Our views on Conjecture

About Conjecture

Funding

Conjecture received (primarily via commercial investment) roughly $10 million in 2022. According to them, they’ve received VC backing from Nat Friedman (ex-CEO of GitHub), Patrick and John Collison (co-founders of Stripe), Daniel Gross (investor and cofounder of a startup accelerator), Andrej Karpathy (ex-OpenAI), Sam Bankman-Fried, Arthur Breitman and others. We are not aware of any later funding rounds, but it’s possible they have raised more since then.

Outputs

Products

Verbalize is an automatic transcription model, released in early 2023 as a B2C SaaS product. Our impression is that it is easy to use but no more powerful than existing open-source models like Whisper, although we are not aware of any detailed empirical evaluation. The product does not appear to have seen commercial success yet, though it was only released recently. We estimate that about one third of Conjecture’s team is actively working on developing products.

Alignment Research

Conjecture studies large language models (LLMs), with a focus on empirical and conceptual work. Mechanistic interpretability was a particular focus, with outputs such as the polytope lens, sparse autoencoders and an analysis of the SVD of weight matrices, as well as work more broadly seeking to better understand LLMs, such as simulator theory.

They have recently pivoted away from this agenda towards cognitive emulation, which is reminiscent of process-based supervision. Here is a link to their full research agenda and publication list. Due to their infohazard policy (see below), some of their research may not have been publicly released.

Infohazard policy

Conjecture developed an infohazard policy in their first few months and shared it publicly to encourage other organizations to publish or adopt similar policies. They say that while many actors were “verbally supportive of the policy, no other organization has publicly committed to a similar policy”.

Governance outreach

We understand that CEO Connor Leahy does a lot of outreach to policymakers in the UK, and capabilities researchers at other prominent AI companies. He’s also appeared on several podcasts (1, FLI (1,2,3,4), 3, 4, 5) and been interviewed by several journalists (1, 2, 3, 4, 5, 6, 7, 8).

Incubator Program

Adam Shimi ran an incubator called Refine in 2022, whose purpose was to create new independent conceptual researchers and help them build original research agendas. Based on Adam’s retrospective, it seems like this project wasn’t successful at achieving its goals and Adam is now pursuing different projects.

Team

Conjecture started with a team of 4 employees in late 2021 and has grown to at least 22 employees (according to their LinkedIn), with most employees joining in 2022.

Their CEO, Connor Leahy, has a technical background (with 2 years of professional machine learning experience and a Computer Science undergraduate degree) and partially replicated GPT-2 in 2019 (discussed in more detail below). Their Chief of Staff has experience with staffing and building team culture from her time at McKinsey, and similar experience at Meta. Their co-founder Gabriel Alfour has the most relevant technical and scaling experience as the CEO of Marigold,[1] a firm performing core development on the Tezos cryptocurrency infrastructure with over 30 staff members.

Two individuals collectively publishing under the pseudonym janus published simulator theory, one of Conjecture's outputs that we understand the TAIS community to have been most favorable towards. They left Conjecture in late 2022. More recently, many researchers working on mechanistic interpretability left the team after Conjecture's pivot towards cognitive emulation. Those departing include Lee Sharkey, the lead author on the sparse autoencoders post and a contributor to the polytope lens post.

Conjecture in the TAIS ecosystem

Conjecture staff are frequent contributors on the Alignment Forum and recruit heavily from the EA movement. Their CEO has appeared on a few EA podcasts (including several times on the FLI podcast). Some TAIS researchers are positive about their work. They fiscally sponsor two TAIS field-building programs, MATS and ARENA, in London (where they are based).

Their team also spent a month in the Bay Area in 2022 (when many TAIS researchers were visiting through programs like MLAB, SERI MATS and on independent grants). Conjecture made an effort to build relationships with researchers, decisionmakers and grantmakers, and were actively fundraising from EA funders during this period. 3-4 Conjecture staff regularly worked out of the Lightcone Offices, with a peak of ~11 staff on a single day. The largest event run by Conjecture was an EA Global afterparty hosted at a Lightcone venue, with a couple hundred attendees, predominantly TAIS researchers. 

Criticisms and Suggestions

Low quality research

General thoughts on Conjecture’s research

We believe most of Conjecture’s publicly available research to date is low-quality compared to the average ML conference paper. A direct comparison is difficult, as some Conjecture members have prioritized releasing small, regular updates; however, our impression is that even combining these updates, their work would at best meet the much lower bar of a workshop paper.

As we discuss below, Conjecture does not present their research findings in a systematic way that would make it accessible for others to review and critique. Conjecture’s work often consists of isolated observations that are not built upon or adequately tested in other settings. We recommend Conjecture focus more on developing empirically testable theories, and also suggest they introduce an internal peer-review process to evaluate the rigor of work prior to publicly disseminating their results. Conjecture might also benefit from having researchers and reviewers work through (although not rigidly stick to) the Machine Learning Reproducibility Checklist.

These limitations may in part be because Conjecture is a young organization with a relatively inexperienced research team, a point they have readily acknowledged in retrospectives and when criticized on research quality. However, even taking their youth and inexperience into account, we still think their research is below the bar for funding or other significant support. When we take into account the funding that Conjecture has (at least $10M raised in their last round), we think they are significantly underperforming standard academic research labs (see our discussion of this in the Redwood post; we are significantly more excited about Redwood’s research than Conjecture’s). We believe they could significantly improve their research output by seeking out mentorship from more experienced ML or alignment researchers, and recommend they do so in the future.

Initial research agenda (March 2022 - Nov 2022)

Conjecture’s initial research agenda focused on interpretability, conceptual alignment and epistemology. Based on feedback from Conjecture, our understanding is that Conjecture is now much more excited about their new research direction in cognitive emulation. We discuss this new direction in the following section. However, as an organization's past track record is one of the best predictors of their future impact, we believe it is important to understand Conjecture's previous approach.

To Conjecture's credit, they acknowledged a number of mistakes in their retrospective. For example, they note that their simulators post was overinvested in, and "more experienced alignment researchers who have already developed their own deep intuitions about GPT-like models didn’t find the framing helpful." However, there are several issues we identify (such as lack of rigor) that are not discussed in the retrospective. There are also issues discussed in the retrospective where Conjecture leadership comes to the opposite conclusion to us: for example, Conjecture writes that they "overinvested in legibility and polish" whereas we found many of their posts to be difficult to understand and evaluate.

Three representative posts, which Conjecture leadership were excited by as of Q3 2022, are janus’s post on simulators, Sid and Lee's post on polytopes, and their infohazard policy. These accomplishments were also highlighted in their retrospective. Although we find these posts to have some merit, we would overall assess them as having limited impact. Concretely, we would evaluate Redwood's Indirect Object Identification or Causal Scrubbing papers as both more novel and more scientifically rigorous. We discuss the infohazard policy, the simulators post and the polytopes post in turn below.

Their infohazard policy is a fairly standard approach to siloing research, and is analogous to structures common in hedge funds or classified research projects. It may be positive for Conjecture to have adopted such a policy (although it introduces risks of concentrating power in the CEO, discussed in the next section), but it does not provide any particular demonstration of research capability.

The simulators and polytopes posts are both at an exploratory stage, with limited empirical evidence and unclear hypotheses. Compared to similar exploratory work (e.g. the Alignment Research Center), we think Conjecture doesn’t make their assumptions clear enough and has too low a bar for sharing, reducing the signal-to-noise ratio and diluting standards in the field. When they do provide evidence, it appears to be cherry-picked.

Their posts also do not clearly state the degree of belief they have in different hypotheses. Based on private conversations with Conjecture staff, they often appear very confident in their views and in the results of their research despite relatively weak evidence for them. In the simulators post, for example, they describe sufficiently large LLMs as converging to simulators capable of simulating “simulacra”: different generative processes that are consistent with the prompt. The post ends with speculative beliefs, stated fairly confidently, that take the framing to an extreme (e.g. if the AI system adopts the “superintelligent AI persona”, it will simply be superintelligent).

We think the framing was overall helpful, especially to those newer to the field, although it can also sometimes be confusing: see e.g. these critiques. The framing had limited novelty: our anecdotal impression is that most researchers working on language model alignment were already thinking along similar lines. The more speculative beliefs stated in the post are novel and significant if true, but the post does not present any rigorous argument or empirical evidence to support them. We believe it’s fine to start out with exploratory work that looks more like an op-ed, but at some point you need to submit your conjectures to theoretical or empirical tests. We would encourage Conjecture to explicitly state their confidence levels in written output and make clear what evidence base they do or do not have for a given hypothesis (e.g. conceptual argument, theoretical result, empirical evidence).

New research agenda (Nov 2022 - Present)

Conjecture now has a new research direction exploring cognitive emulation. The goal is to produce bounded agents that emulate human-like thought processes, rather than agents that produce good output but for alien reasons. However, it’s hard to evaluate this research direction as they are withholding details of their plan due to their infohazard policies. On the face of it, this project is incredibly ambitious, and will require huge amounts of effort and talent. Because of this, details on how they will execute the project are important to understanding how promising this project may be. We would encourage Conjecture to share some more technical detail unless there are concrete info-hazards they are concerned about. In the latter case we would suggest sharing details with a small pool of trusted TAIS researchers for external evaluation.

CEO’s character and trustworthiness

We are concerned by the character and trustworthiness of Conjecture's CEO, Connor Leahy. We are also concerned that Connor has demonstrated a lack of attention to rigor and has engaged in risky behavior, and that he, along with other staff, has demonstrated an unwillingness to take external feedback (see below).

Although this section focuses on the negatives, there are of course positive aspects to Connor's character. He is clearly a highly driven individual, who has built a medium-sized organization in his early twenties. He has shown a willingness to engage with arguments and change his mind on safety concerns, for example delaying the release of his GPT-2 replication. Moreover, in recent years Connor has been a vocal public advocate for safety: although we disagree in some cases with the framing of the resulting media articles, in general we are excited to see greater public awareness of safety risks.[2]

The character of an organization’s founder and CEO is always an important consideration, especially for early-stage companies. Moreover, we believe this consideration is particularly strong in the case of Conjecture:

  1. Conjecture engages in governance outreach that involves building relationships between government actors and the TAIS community, and there are multiple accounts of Conjecture misrepresenting themselves.
  2. As the primary stakeholder & CEO, Connor will be responsible for balancing incentives to develop capabilities from stakeholders (see below). 
  3. Conjecture's infohazard policy has the consequence of heavily centralizing power in the CEO (even more so than at a typical tech company). The policy mandates that projects be siloed, and staff may be unaware of the details (or even the existence) of significant fractions of Conjecture's work. The CEO is Conjecture's "appointed infohazard coordinator" with "access to all secrets and private projects" – and thus is the only person with full visibility. This could substantially reduce staff's ability to evaluate Conjecture's strategy and provide feedback internally. Additionally, without full information, staff may not know whether Conjecture is contributing to AI risk.[3] We are uncertain to what degree this is a problem given Conjecture's current level of internal secrecy.

Conjecture and their CEO misrepresent themselves to various parties

We are generally worried that Connor will tell the story that he expects the recipient to find most compelling, making it challenging to confidently predict his and Conjecture's behavior. We have heard credible complaints of this in their interactions with funders. One experienced technical AI safety researcher recalled Connor saying that he will tell investors that they are very interested in making products, whereas the predominant focus of the company is on AI safety.

We have heard that Conjecture misrepresent themselves in engagements with government, presenting themselves as experts with stature in the AIS community when in reality they lack it. We have heard reports that Conjecture's policy outreach is decreasing goodwill with policymakers. We think there is a reasonable risk that Connor and Conjecture’s actions may be unilateralist and prevent important relationships from being formed by other actors in the future.

Unfortunately we are unable to give further details about these incidents as our sources have requested confidentiality; we understand this may be frustrating and acknowledge it is difficult for Conjecture to substantively address these concerns.

We would recommend Connor be more honest and transparent about his beliefs, plans and Conjecture’s role in the TAIS ecosystem. We also recommend that Conjecture introduce a strong, robust governance structure (see below).

Contributions to race dynamics

We believe that Connor Leahy has contributed to increasing race dynamics and accelerating capabilities research by founding EleutherAI, which in turn supported the creation of Stability AI. EleutherAI, founded in 2020, is a community research group focused on open-source AI research. Under Connor's leadership, their plan was to build and release large open-source models to allow more people to work on important TAIS research that is only possible with pretrained LLMs. At the time, several members of the TAIS community, including Dan Hendrycks (founder of CAIS), privately warned Connor and EleutherAI that it would be hard to control an open-source collective.

Stability AI

Stability AI brands itself as an AGI lab and has raised $100M to fund research into and training of large, state-of-the-art models including Stable Diffusion.[4] The addition of another AGI-focused lab is likely to further exacerbate race dynamics. Stability is currently releasing the majority of the work they create as open-source: this has some benefits, enabling a broader range of researchers (including alignment researchers) to study these models. However, it also has significant drawbacks, such as making potential moratoriums on capabilities research much harder (if not impossible) to enforce. To our knowledge, Stability AI has not made many algorithmic advances yet.

EleutherAI was pivotal in the creation of Stability AI. Our understanding is that the founder of Stability AI, Emad Mostaque, was active on the EleutherAI Discord and recruited much of his initial team from there. On the research side, Stability AI credited EleutherAI as supporting the initial version of Stable Diffusion in August 2022, as well as their most recent open-source language model release StableLM in April 2023. Emad (in Feb 2023) described the situation as: “Eleuther basically split into two. Part of it is Stability and the people who work here on capabilities. The other part is Conjecture that does specific work on alignment, and they're also based here in London.”

Stability AI continues to provide much of EleutherAI’s compute and is a sponsor of EleutherAI, alongside Nat Friedman (who also invested in Conjecture). Legally, Stability AI directly employed key staff of EleutherAI in a relationship we believe was similar to fiscal sponsorship. We understand that EleutherAI have recently transitioned to employing staff directly via their own non-profit entity (Connor and Emad sit on the board).

EleutherAI

EleutherAI is notable for having developed open-source LLMs such as GPT-NeoX. In the announcement post in February 2022, they claimed that "GPT-NeoX-20B is, to our knowledge, the largest publicly accessible pretrained general-purpose autoregressive language model, and we expect it to perform well on many tasks."

We do not think that there was much meaningful alignment output from EleutherAI itself during Connor’s tenure – most of the research published is capabilities research, and the published alignment research is of mixed quality. On the positive side, EleutherAI’s open-source models have enabled some valuable safety research. For example, GPT-J was used in the ROME paper and is widely used in Jacob Steinhardt’s lab. EleutherAI is also developing a team focused on interpretability, whose initial work includes developing the tuned lens in collaboration with FAR AI and academics from Boston and Toronto.

Connor’s founding and management of EleutherAI indicates to us that he was overly optimistic that rapidly growing a community of people interested in language models, and attracting industry sponsorship, would translate into meaningful alignment research. We see EleutherAI as having mostly failed at its AI safety goals, and as having instead accelerated capabilities via its role in creating Stability AI and Stable Diffusion.

In particular, EleutherAI's supporters were primarily interested in gaining access to state-of-the-art LLM capabilities and had limited interest in safety. For example, the company CoreWeave provided EleutherAI with compute and then used their models to sell an LM inference API called GooseAI. We conjecture that the incentive to please their sponsors, enabling further scale-up, may have contributed to EleutherAI's limited safety output.

We feel more positively about Conjecture than about early-stage EleutherAI given Conjecture's explicit alignment research focus, but are concerned that Connor appears to be bringing a very similar strategy to Conjecture as he did to EleutherAI: scaling before producing tangible alignment research progress, and attracting backing from external actors (primarily VC investors) with opposing incentives that Conjecture may not be able to withstand. We would encourage Conjecture to share a clear theory of change that includes safeguards against these risks.

To be clear, we think Conjecture's contribution to race dynamics is far less than that of OpenAI or Anthropic, both of which have received funding and attracted talent from the EA ecosystem. We would assess OpenAI as being extremely harmful for the world. We are uncertain on Anthropic: they have undoubtedly contributed to race dynamics (albeit less so than OpenAI), but have also produced substantial safety research. We will discuss Anthropic further in an upcoming post, but in either case we do not think that AGI companies pushing forward capabilities should exempt Conjecture or other organizations from criticisms.

Overstatement of accomplishments and lack of attention to precision

In June 2019, Connor claimed to have replicated GPT-2 while he was an undergraduate. However, his results were inaccurate and his 1.5B parameter model was weaker than even the smallest GPT-2 series model.[5] He later admitted to these mistakes, explaining that his metric code was flawed and that he commingled training and evaluation datasets. Additionally, he said that he didn’t evaluate the strength of his final model, only one halfway through training. He explained this by saying: “I got cold feet once I realized what I was sitting on [something potentially impressive] and acted rashly.”[6] We think this points to a general lack of care in making true and accurate claims.

We don’t want to unfairly hold people’s mistakes from their college days against them – many people exaggerate or overestimate (intentionally or not) their own accomplishments. Even a partial replica of GPT-2 is an impressive technical accomplishment for an undergraduate, so this project does attest to Connor's technical abilities. It is also positive that he admitted his mistake publicly. However, overall we do believe the project demonstrates a lack of attention to detail and rigor. Moreover, we haven’t seen signs that his behavior has dramatically changed.

Inconsistency over time regarding releasing LLMs

Connor has changed his stance more than once regarding whether to publicly release LLMs. Given this, it is difficult to be confident that Conjecture's current approach of defaulting to secrecy will persist over time.

In July 2019, Connor released the source code used to train his replica along with pretrained models comparable in size to the already released GPT-2 117M and 345M models. The release of the training code seems hasty, enabling actors with sufficient compute but limited engineering skills to train their own, potentially superior, models. At this point, Connor was planning to release the full 1.5B parameter model to the public, but was persuaded not to.[7] In the end, he delayed the release until Nov 13, 2019, a week after OpenAI released their 1.5B parameter version, publishing the model on his personal GitHub.

In June 2021, Connor changed his mind: as part of the team at EleutherAI (see discussion above), he argued that releasing large language models would be beneficial to alignment. In Feb 2022, EleutherAI released an open-source 20B parameter model, GPT-NeoX. Their stated goal, endorsed by Connor in several places, was to "train a model comparable to the biggest GPT-3 (175 billion parameters)" and release it publicly. Regarding the potential harm of releasing models, we find Connor's arguments plausible – whether releasing open-source models closer to the state-of-the-art is beneficial or not remains a contested point. However, we are confident that sufficiently capable models should not be open-sourced, and expect strong pro-open-source messaging to be counterproductive. We think EleutherAI made an unforced error by not at least making some gesture towards publication norms (e.g. they could have pursued a staggered release giving early access to vetted researchers).

In July 2022, Connor shared Conjecture’s Infohazard Policy. This policy is amongst the most restrictive at any AI company – even more restrictive than what we would advocate for. To the best of our knowledge, Conjecture's Infohazard Policy is an internal policy that can be overturned by Connor (acting as chief executive), or by a majority of their owners (of whom Connor as a founder will have a significant stake). Thus we are hesitant to rely on Conjecture’s Infohazard Policy remaining strictly enforced, especially if subject to commercial pressures.

Scaling too quickly

We think Conjecture has grown too quickly, from 0 to at least 22 staff from 2021 to 2022. During this time, they have not had what we would consider to be insightful or promising outputs, making them analogous to a very early stage start-up. This is a missed opportunity: their founding team and early employees include some talented individuals who, given time and the right feedback, might well have been able to identify a promising approach.

We believe that Conjecture’s basic theory of change for scaling is:

1) they’ve gotten good results relative to how young they are, even though the results themselves are not that insightful or promising in absolute terms, and

2) the way to improve these results is to scale the team so that they can test out more ideas and get more feedback on what does and doesn’t work.

Regarding 1), we think that others of a similar experience level – and with substantially less funding – have produced higher-quality output. Concretely, we are more excited about Redwood’s research than Conjecture’s (see our criticisms of Conjecture’s research), despite being critical of Redwood’s cost-effectiveness to date.[8] Notably, Redwood drew on a similar talent pool to Conjecture, largely hiring people without prior ML research experience.

Regarding 2), we disagree that scaling up will improve their research quality. In general, the standard lean startup team advice is that it’s important to keep your team small while you are finding product-market fit or, in Conjecture's case, developing an exciting research agenda. We think it’s very likely Conjecture will want to make major pivots in the next few years. Rapid growth will make it harder for them to pivot. With growing scale, more time will be spent on management, and it will be easier to get people locked into the wrong project or create dynamics where people are more likely to defend their pet projects. We can't think of examples where scale up has taken place successfully before finding product-market fit.

This growth would be challenging to manage in any organization. However, in our opinion alignment research is more challenging to scale than a traditional tech start-up due to the weaker feedback loops: it's much harder to tell if your alignment research direction is promising than whether you've found product-market fit.

Compounding this problem, the founding team of Connor, Sid and Gabriel has limited experience in scaling research organizations. Connor and Sid's experience primarily comes from co-founding EleutherAI, a decentralized research collective: their frustrations with that lack of structure are part of what drove them to found Conjecture. Gabriel has the most relevant scaling experience, from his time as CEO of Marigold (see the Team section above).

Conjecture appeared to have rapid scaling plans, but their growth has slowed in 2023. Our understanding is that this slow-down is primarily due to them being unable to raise adequate funding for their expansion plans.

To address this problem, we would recommend that Conjecture:

  • Freeze hiring of junior staff until they identify scalable research directions that they and others in the alignment community are excited by. Conjecture may still benefit from making a small number of strategic hires that can help them manage their current scale and continue to grow, such as senior research engineers and people who have experience managing large teams.
  • Consider focusing on one area (e.g. technical research) and keeping other teams (e.g. product and governance) lean, or even consider whether they need them.
  • While we don’t think it’s ideal to let go of staff, we tentatively suggest Conjecture consider whether it might be worth making the team smaller to focus on improving their research quality, before growing again.

Unclear plan for balancing profit and safety motives

According to their introduction post, they think being a for-profit company is the best way to reach their goal because it lets them “scale investment quickly while maintaining as much freedom as possible to expand alignment research.” We think this could be challenging in practice: scaling investment requires delivering results that investors find impressive, as well as giving investors some control over the firm in the form of voting shares and, frequently, board seats.

Conjecture has received substantial backing from several prominent VCs. This is impressive, but since many of their backers (to our knowledge) have little interest in alignment, Conjecture will be under pressure to develop a pathway to profitability in order to raise further funds.

Many routes to developing a profitable AI company have significant capabilities externalities. Conjecture’s CEO has indicated they plan to build "a reliable pipeline to build and test new product ideas" on top of internal language models. Although this seems less bad than the OpenAI model of directly advancing the state-of-the-art in language models, we expect demonstrations of commercially viable products using language models to lead to increased investment in the entire ecosystem – not just Conjecture.

For example, if Conjecture does hit upon a promising product, it would likely be easy for a competitor to copy them. Worse, the competitor might be able to build a better product by using state-of-the-art models (e.g. those available via the OpenAI API). To keep up, Conjecture would then have to either start training state-of-the-art models themselves (introducing race dynamics), or use state-of-the-art models from competitors (and ultimately provide revenue to them).

Conjecture may have good responses to this. Perhaps there are products which are technically intricate to develop or have other barriers to entry making competition unlikely, and/or where Conjecture's internal models are sufficient. We don’t have reason to believe Verbalize falls into this category as there are several other competitors already (e.g. fireflies.ai, otter.ai, gong.io). We would encourage Conjecture to share, for review by both sets of stakeholders, any plans they have for simultaneously serving two communities (for-profit VCs and TAIS) with sometimes conflicting priorities.

Our impression is that they may not have a solid plan here (but we would invite them to share their plans if they do). Conjecture was trying to raise a series B from EA-aligned investors to become an alignment research organization. This funding round largely failed, causing them to pivot to focus more on VC funding. Based on their past actions, we think it’s likely that they will eventually hit a wall with regard to product development and decide to focus on scaling language models to get better results, contributing to race dynamics. In fairness to Conjecture, we consider their race risk to be much smaller than that of Anthropic, which operates at a much larger scale, is scaling much more rapidly, and has had more commercial success with its products.

It's not uncommon for people and organizations who conceive of or present themselves as AIS-focused to end up advancing capabilities much more than safety. OpenAI is perhaps the most egregious case of this, but we are also concerned about Anthropic (and will write about this in a future post). These examples should make us suspect that, by default, Conjecture's for-profit nature will cause it to advance capabilities, and we should demand a clear and detailed plan for avoiding this before being convinced otherwise.

In addition to sharing their plans for review, we would recommend that Conjecture introduce robust corporate governance structures. Our understanding is that Conjecture is currently structured as a standard for-profit start-up with the founders controlling the majority of voting shares and around a third of the company owned by VCs. This is notably worse than OpenAI LP, structured as a "capped-profit" corporation with the non-profit OpenAI, Inc. as the sole controlling shareholder.[9] One option would be for Conjecture to implement a "springing governance" structure in which, given some trigger (such as signs that AGI is imminent, or that their total investment exceeds some threshold), its voting shares become controlled by a board of external advisors. This would pass governance power, but not financial equity, to people who Conjecture considers to be a good board – rather than the company being controlled wholly by its founding team.

Limited meaningful engagement with external actors

Lack of productive communication between TAIS researchers and Conjecture staff

We know several members of the EA and TAIS community who have tried to share feedback privately with Conjecture but found it very challenging. When negative feedback is shared, members of the Conjecture team sometimes do not engage meaningfully with it, missing the key point or reacting defensively. Conjecture leadership will provide many counter-arguments, none of which address the core point, or are particularly strong. This is reminiscent of the Gish gallop rhetorical technique, which can overwhelm interlocutors as it’s very difficult to rebut each counter-argument. Some Conjecture staff members also frequently imply that the person giving the criticism has ulterior motives or motivated reasoning.

It can be hard to hear criticism of a project you are passionate about and have invested considerable time in, so it’s natural that Conjecture staff are defensive of their work. However, we would recommend that Conjecture staff, and especially leadership, make an effort to engage constructively with criticism, seeking to understand where the critique is coming from, and take appropriate steps to correct misunderstandings and/or resolve the substance of the critique.

Lack of engagement with the broader ML community

Conjecture primarily disseminates their findings on the Alignment Forum. However, many of their topics (particularly interpretability) are at least adjacent to active research fields, such that a range of academic and industry researchers could both provide valuable feedback on Conjecture's research and gain insights from their findings.

Conjecture is not alone in this: as we wrote previously, we also think that Redwood could engage further with the ML community. Conjecture has not published any peer-reviewed articles, so we think they would benefit even more than Redwood from publishing their work and receiving external feedback. We would recommend Conjecture focus on developing what they consider to be their most insightful research projects into a conference-level paper, and hiring more experienced ML research scientists or advisors to help them both effectively communicate their research and improve rigor.

Our views on Conjecture

We are genuinely concerned about Conjecture’s trustworthiness and how they might negatively affect the TAIS community and the TAIS community’s efforts to reduce risk from AGI. These are the main changes we call for, in rough order of importance.

We would advise against working at Conjecture

Given Conjecture's weak research track record, we expect the direct impact of working at Conjecture to be low. We think there are many more impactful places to work, including non-profits such as Redwood, CAIS and FAR; alignment teams at Anthropic, OpenAI and DeepMind; or working with academics such as Stuart Russell, Sam Bowman, Jacob Steinhardt or David Krueger. Note we would not in general recommend working at capabilities-oriented teams at Anthropic, OpenAI, DeepMind or other AGI-focused companies.

Additionally, Conjecture seems relatively weak for skill building, since their leadership team is relatively inexperienced and also stretched thin due to Conjecture's rapid scaling. We expect most ML engineering or research roles at prominent AI labs to offer better mentorship than Conjecture. Although we would hesitate to recommend taking a position at a capabilities-focused lab purely for skill building, we find it plausible that Conjecture could end up being net-negative, and so do not view Conjecture as a safer option in this regard than most competing firms.

In general, we think that the attractiveness of working at an organization that is connected to the EA or TAIS communities makes it more likely for community members to take jobs at such organizations even if this will result in a lower lifetime impact than alternatives. Conjecture's sponsorship of TAIS field building efforts may also lead new talent, who are unfamiliar with Conjecture's history, to have an overly rosy impression of them.

We would advise Conjecture to take care when engaging with important stakeholders and represent their place in the TAIS ecosystem accurately

We are concerned that Conjecture has misrepresented themselves to various important stakeholders, including funders and policymakers. We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk. These unilateral actions may therefore prevent important relationships from being formed by other actors in the future. This risk is further exacerbated by Connor’s unilateralist actions in the past, Conjecture’s overall reluctance to take feedback from external actors, and their premature and rapid scaling.

We do not think that Conjecture should receive additional funding before addressing key concerns

We have substantial concerns with the organization’s trustworthiness and the CEO’s character. We would strongly recommend that any future funding from EA sources be conditional on Conjecture putting in place a robust corporate governance structure to bring them at least on par with other for-profit and alignment-sympathetic firms such as OpenAI and Anthropic.

Even absent these concerns, we would not currently recommend Conjecture for funding due to the lack of a clear impact track record despite a considerable initial investment of $10 million. To recommend funding, we would want to see both improvements in corporate governance and some signs of high-quality work that the TAIS community are excited by.

We are largely in agreement with the status quo here: so far, Conjecture has mostly been unsuccessful in fundraising from prominent EA funders, and where they have received funding, it was for significantly less than their initial asks.

We encourage TAIS and EA community members to consider to what extent they want to legitimize Conjecture until Conjecture addresses these concerns

Conjecture has several red flags and a weak track record for impact. Although the TAIS and EA community have largely refrained from explicit endorsements of Conjecture (such as funding them), there are a variety of implicit endorsements. These include tabling at EA Global career fairs, Lightcone hosting Conjecture events and inviting Conjecture staff, field-building organizations such as MATS and ARENA working with Conjecture as a fiscal sponsor,[10] as well as a variety of individuals in the community (mostly unaware of these issues) recommending Conjecture as a place to work.

To clarify, we think individuals should still read and engage with Conjecture's research where they judge it to be individually worth their time. We also welcome public debates involving Conjecture staff, such as the one between Paul Christiano and Gabriel Alfour. Our goal is not to shun Conjecture, but to avoid giving them undue influence until their research track record and governance structure improves.

We recognize that balancing these considerations can be tricky, which is why our main recommendation is to encourage people to spend time actively reflecting on how they want to engage with Conjecture in light of the information we present in this post (alongside other independent sources).

Appendix

Communication with Conjecture

We shared a draft of this post with Conjecture to review, and have included their full response (as they indicated they would post it publicly) below. We thank them for their engagement and made several minor updates to the post in response; however, we disagree with several key claims made by Conjecture in their response. We describe the changes we made, and where we disagree, in the subsequent section.

Conjecture’s Reply

Hi,

Thank you for your engagement with Conjecture’s work and for providing us an opportunity to share our feedback.

As it stands, the document is a hit piece, whether intentional or not. It is written in a way such that it would not make sense for us to respond to points line-by-line. There are inaccuracies, critiques of outdated strategies, and references to private conversations where the details are obscured in ways that prevent us from responding substantively. The piece relies heavily on criticism of Connor, Conjecture CEO, but does not attempt to provide a balanced assessment: there are no positive comments written about Connor along with the critiques, and past mistakes he admitted to publicly are spun as examples of “low-integrity” behavior. Nuanced points such as the cost/benefit of releasing small open source models (pre-Chinchilla) are framed as “rash behavior,” even when you later write that you find Connor’s arguments “plausible.” Starting from this negative frame does not leave room for us to reply and trust that an object-level discussion will proceed.

We also find it surprising to see that most of the content of the piece is based on private discussions and documents shared between Conjecture, ~15 regrantors, and the FTX Future Fund team in August 2022. The piece does not disclose this context. Besides the fact that much of that information is outdated and used selectively, the information has either been leaked to the two anonymous authors, or one of the authors was directly involved in the regranting process. In either case, this is a violation of mutual confidentiality between Conjecture and regrantors/EA leadership involved in that channel.

We don’t mind sharing our past plans and discussions now and would be happy to publish the entire discussions from the Slack channel where those conversations took place (with consent of the other participants). However, it is a sad conclusion of that process that our openness to discussing strategy in front of regrantors formed the majority set of Bay Area TAIS leadership opinions about Conjecture that frame us as not open, despite these conversations being a deeper audit than pretty much any other TAIS organization.

We’d love to have a productive conversation here, but will only respond in detail if you reframe this post from a hit piece to something better informed. If your aim is to promote coordination, we would recommend asking questions about our plans and beliefs, focusing on the parts that do not make sense to you, and then writing your summary. Conjecture’s strategy is debatable, and we are open to changing it - and have done so in the past. Our research is also critiqueable: we agree that our research output has been weak and have already written about this publicly here. But as described above, this post doesn’t attempt to engage with Conjecture’s current direction.

Going further, if the aim of your critique is to promote truth-seeking and transparency, we would gladly participate in a project about creating and maintaining a questionnaire that all AI orgs should respond to, so that there is as little ambiguity in their plans as possible. In our posts we have argued for making AI lab’s safety plans more visible, and previously ran a project of public debates aimed at highlighting cruxes in research disagreements. Conjecture is open to our opinion being on the record, so much so that we have occasionally declined private debates with individuals who don’t want to be on record. This decision may contribute to some notion of our “lack of engagement with criticism.”

As a meta-point, we think that certain strategic disagreements between Conjecture and the Bay Area TAIS circles are bleeding into reputational accusations here. Conjecture has been critical of the role that EA actors have played in funding and supporting major AGI labs historically (OAI, Anthropic), and critical of current parts of the EA TAIS leadership and infrastructure that continue to support the development of superintelligence. For example, we do not think that GPT-4 should have been released and are concerned at the role that ARC’s benchmarking efforts played in safety-washing the model. These disagreements in the past have created friction, and we’d hazard that concerns about Conjecture taking “unilateral action” are predicted on this.

Instead of a more abstract notion of “race dynamics,” Conjecture’s main concern is that a couple of AI actors are unabashedly building superintelligence. We believe OpenAI, Deepmind, and Anthropic are not building superintelligence because the market and investors are demanding it. We believe they are building superintelligence because they want to, and because AGI has always been their aim. As such, we think you’re pointing the finger in the wrong direction here about acceleration risks.

If someone actually cares about curtailing “the race”, their best move would be to push for a ban on developing superintelligence and strongly oppose the organizations trying to build it. Deepmind, OpenAI, and Anthropic have each publicly pushed the AI state of the art. Deepmind and OpenAI have in their charters that they want to build AGI. Anthropic’s most recent pitch deck states that they are planning to train an LLM orders of magnitude larger than competitors, and that “companies that train the best 2025/26 models will be too far ahead for anyone to catch up in subsequent cycles,” which is awfully close to talking about DSA. No one at the leadership of these organizations (which you recommend people work at rather than Conjecture) have signed FLI's open letter calling for a pause in AI development. Without an alignment solution, the reasonable thing for any organization to do is stop development, not carve out space to continue building superintelligence unimpeded.

While Conjecture strongly disagrees with the strategies preferred by many in the Bay Area TAIS circles, we’d hope that healthy conversations would reveal some of these cruxes and make it easier to coordinate. As written, your document assumes the Bay Area TAIS consensus is superior (despite being what contributed largely to the push for ASI), casts our alternative as “risking unilateral action,” and deepens the rift.

We have a standing offer to anyone to debate with us, and we’d be very happy to discuss with you any part of our strategy, beliefs about AI risks, and research agenda.

More immediately, we encourage you to rewrite your post as a Q&A aimed at asking for our actual views before forming an opinion, or at a minimum, rewrite your post with more balance and breathing room to hear our view. As it stands, this post cleaves the relationship between part of the TAIS ecosystem and Conjecture further and is unproductive for both sides.

Given the importance of having these conversations in the open, we plan to make this reply public.

Thanks for your time and look forward to your response,

Conjecture C-Suite

Brief response and changes we made

Conjecture opted not to respond to our points line-by-line and instead asked us to rewrite the post as a Q&A or “with more balance and breathing room to hear our view.” While we won’t be rewriting the post, we have made changes to the post in response to their feedback, some of which are outlined below. 

Conjecture commented that the tone of the post was very negative, and in particular that there was a lack of positive comments written about Connor. We have taken that feedback into consideration and have edited the tone to be more neutral and descriptive (with particular attention to the section on Connor). Conjecture also noted that Connor admitted to some of his mistakes publicly. We had previously linked to Connor’s update post on the partial GPT-2 replication, but we edited the section to make it clearer that he did acknowledge his mistake. They also pointed out that we framed the point on releasing models as “rash behavior,” even though we later write that we find Connor’s arguments “plausible.” We’ve changed this section to be clearer.

They say “this post doesn’t attempt to engage with Conjecture’s current direction.” As we write in our section on their cognitive emulation research, there is limited public information on their current research direction for us to comment on.

They believe that “most of the content of the piece is based on private discussions and documents shared between Conjecture, ~15 regrantors, and the FTX Future Fund team in August 2022.” This is not the case: the vast majority (90+%) of this post is based on publicly available information and our own views which were formed from our independent impression of Conjecture via conversations with them and other TAIS community members. We think the content they may be referring to is:

  1. One conversation that we previously described in the research section regarding Conjecture's original research priorities. We have removed this reference. 
  2. One point providing quantitative details of Conjecture's growth plans in the scaling section; we have removed these details.
  3. The section on how Conjecture and their CEO represent themselves to other parties. This information was not received from those private discussions and documents. 

They say they wouldn’t mind “sharing our past plans and discussions now and would be happy to publish the entire discussions from the Slack channel where those conversations took place (with consent of the other participants).” We welcome and encourage the Conjecture team to share their past plans publicly. 

They note that “Conjecture is open to our opinion being on the record, so much so that we have occasionally declined private debates with individuals who don’t want to be on record. This decision may contribute to some notion of our ‘lack of engagement with criticism.’” This is not the basis for our comment on their lack of engagement. They mention that they have “a standing offer to anyone to debate with us”. We appreciate the gesture, but do not have the capacity to engage in something as in-depth as a public debate at this time (and many others who have given feedback don’t either).

Conjecture points out the role “EA actors have played in funding and supporting major AGI labs historically (OAI, Anthropic)”, that our “document assumes the Bay Area TAIS consensus is superior … casts our alternative as ‘risking unilateral action’”, and that “these disagreements in the past have created friction, and we’d hazard that concerns about Conjecture taking ‘unilateral action’ are predicted on this.” We outline our specific concerns on unilateralist action, which do not have to do with Conjecture’s critiques of EA TAIS actors, here. Examples of disagreements with TAIS actors that they cite include:

  • Conjecture being critical of the role EA actors have played in funding/supporting major AGI labs.
  • EA TAIS leadership that continue to support development of AGI.
  • They don’t think GPT-4 should have been released.
  • They are concerned that ARC’s benchmarking efforts might have safety-washed GPT-4.

We are also concerned about the role that EA actors have and potentially continue to play in supporting AGI labs (we will cover some of these concerns in our upcoming post on Anthropic). We think that Conjecture’s views on ARC are reasonable (although we may not agree with their view). Further, many other EAs and TAIS community members have expressed concerns on this topic, and about OpenAI in particular. We do not think holding this view is particularly controversial or something that people would be critical of. Views like this did not factor into our critique. 

Finally, they propose that (rather than critiquing them) we should push for a ban on AGI and oppose the organizations trying to build it (OpenAI, DeepMind and Anthropic). While we agree that other labs are concerning, this does not erase our concerns about Conjecture.

 

Notes


Gabriel Alfour is still listed as the CEO on Marigold's website; we are unsure whether this information is out of date or whether Gabriel still holds this position. We also lack a clear understanding of what Marigold's output is, but we spent limited time evaluating this. ↩︎

In particular, Connor has referred to AGI as god-like multiple times in interviews (CNN, Sifted). We are skeptical that this framing is helpful. ↩︎

Employee retention is a key mechanism by which tech companies have been held accountable: for example, Google employees' protest over Project Maven led to Google withdrawing from the project. Similarly, the exodus of AIS researchers from OpenAI to found Anthropic was partly fueled by concerns that OpenAI was contributing to AI risk. ↩︎

Stable Diffusion is a state-of-the-art generative image model with performance similar to OpenAI’s DALL-E. It is open-source and open-access: there are none of the restrictions or filters that a company like OpenAI might apply. This means that people can use the model for abusive purposes (such as deepfakes). ↩︎

Connor reports a WikiText2 perplexity of 43.79 for his replica. This is considerably worse than the 18.34 perplexity achieved by GPT-2 1.5B on this dataset (reported in Table 3 of Radford et al.), and substantially worse than the 29.41 perplexity achieved by even the smallest GPT-2 model (117M parameters). It is slightly worse than the previously reported state of the art prior to the GPT-2 paper, a perplexity of 39.14 (reported in Table 2 of Gong et al.). Overall, it’s a substantial accomplishment, especially for an undergraduate who built the entire training pipeline (including data scraping) from scratch, but it is far from a replication. ↩︎
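For readers unfamiliar with the metric, a brief note on how to read these numbers (this is the standard definition of language-model perplexity, not something taken from Connor's write-up): perplexity is the exponential of the average per-token negative log-likelihood on the test set, so lower is better, and seemingly modest absolute gaps correspond to large differences in modeling quality.

PPL = exp( −(1/N) · Σ_i log p(token_i | preceding tokens) )

On this scale, the replica's 43.79 means it assigns substantially lower probability per token to the WikiText2 test text than GPT-2 1.5B does at 18.34.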

Here is the full text from the relevant section of the article: “model is not identical to OpenAI’s because I simply didn’t have all the details of what they did … [and] the samples and metrics I have shown aren’t 100% accurate. For one, my metric code is flawed, I made several rookie mistakes in setting up accurate evaluation (let train and eval data mix, used metrics whose math I didn’t understand etc), and the model I used to generate the samples is in fact not the final trained model, but one about halfway through the training. I didn’t take my time to evaluate the strength of my model, I simply saw I had the same amount of hardware as OpenAI and code as close to the paper as possible and went with it. The reason for this is a simple human flaw: I got cold feet once I realized what I was sitting on and acted rashly.” ↩︎

This was in part due to conversations with OpenAI and Buck Shlegeris (then at MIRI). ↩︎

Redwood and Conjecture have received similar levels of funding. ↩︎

Anthropic has a public benefit corporation structure, with reports that it includes a long-term benefit committee of people unaffiliated with the company who can override the composition of its board. Overall we have too little information to judge whether this structure is better or worse than OpenAI’s, but both seem better than being a standard C-corporation. ↩︎

Conjecture has been active in running or supporting programs aimed at AI safety field-building. Most notably, they ran the Refine incubator, and are currently fiscally sponsoring ARENA and MATS for their London-based cohort. Overall we expect these programs are net-positive, and are grateful that Conjecture is contributing to them. However, Conjecture's sponsorship may have a chilling effect: individuals may be reluctant to criticize Conjecture if they want to be part of these sponsored programs. It may also make attendees more likely than they otherwise would be to work for Conjecture. We would encourage ARENA and MATS to find a more neutral fiscal sponsor in the UK to avoid potential conflicts of interest. For example, they could hire staff members using employer-of-record services such as Deel or Remote. If Conjecture does continue fiscally sponsoring organizations, we would encourage them to adopt a clear legal separation between Conjecture and the fiscally sponsored entities, along with a conflict-of-interest policy to safeguard the independence of those entities. ↩︎

Comments

(cross-commented from EA forum)

I personally have no stake in defending Conjecture (in fact, I have some questions about the CoEm agenda) but I do think there are a couple of points that feel misleading or wrong to me in your critique.

1. Confidence (meta point): I do not understand where the confidence with which you write the post (or at least how I read it) comes from. I've never worked at Conjecture (and presumably you didn't either) but even I can see that some of your critique is outdated or feels like a misrepresentation of their work to me (see below). For example, making recommendations such as "freezing the hiring of all junior people" or "alignment people should not join Conjecture" requires an extremely high bar of evidence in my opinion. I think it is totally reasonable for people who believe in the CoEm agenda to join Conjecture and while Connor has a personality that might not be a great fit for everyone, I could totally imagine working with him productively. Furthermore, making a claim about how and when to hire usually requires a lot of context and depends on many factors, most of which an outsider probably can't judge.
Given that you state early on that you are an experienced member of the alignment community and your post suggests that you did rigorous research to back up these claims, I think people will put a lot of weight on this post, and it does not feel like you use your power responsibly here.
I can very well imagine a less experienced person who is currently looking for positions in the alignment space to go away from this post thinking "well, I shouldn't apply to Conjecture then" and that feels unjustified to me.

2. Output so far: My understanding of Conjecture's research agenda so far was roughly: "They started with Polytopes as a big project and published it eventually. On reflection, they were unhappy with the speed and quality of their work (as stated in their reflection post) and decided to change their research strategy. Every two weeks or so, they started a new research sprint in search of a really promising agenda. Then, they wrote up their results in a preliminary report and continued with another project if their findings weren't sufficiently promising." In most of their public posts, they stated that these are preliminary findings and should be treated with caution, etc. Therefore, I think it's unfair to say that most of their posts do not meet the bar of a conference publication because that wasn't the intended goal.
Furthermore, I think it's actually really good that the alignment field is willing to break academic norms and publish preliminary findings. Usually, this makes it much easier to engage with and criticize work earlier and thus improves overall output quality. 
On a meta-level, I think it's bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don't find a lot until you hit. These kinds of critiques make it more likely that people follow small incremental research agendas and alignment just becomes academia 2.0. When you make a critique like that, at least acknowledge that hits-based research might be the right approach.

3. Your statements about the VCs seem unjustified to me. How do you know they are not aligned? How do you know they wouldn't support Conjecture doing mostly safety work? How do you know what the VCs were promised in their private conversations with the Conjecture leadership team? Have you talked to the VCs or asked them for a statement? 
Of course, you're free to speculate from the outside but my understanding is that Conjecture actually managed to choose fairly aligned investors who do understand the mission of solving catastrophic risks. I haven't talked to the VCs either, but I've at least asked people who work(ed) at Conjecture. 

In conclusion:
1. I think writing critiques is good but really hard without insider knowledge and context. 
2. I think this piece will actually (partially) misinform a large part of the community. You can see this already in the comments where people without context say this is a good piece and thank you for "all the insights".
3. The EA/LW community seems to be very eager to value critiques highly and for good reason. But whenever people use critiques to spread (partially) misleading information, they should be called out. 
4. That being said, I think your critique is partially warranted and things could have gone a lot better at Conjecture. It's just important to distinguish between "could have gone a lot better" and "we recommend not to work for Conjecture" or adding some half-truths to the warranted critiques.
5. I think your post on Redwood was better but suffered from some of the same problems. Especially the fact that you criticize them for having not enough tangible output when following a hits-based agenda just seems counterproductive to me. 

(cross-posted from EAF)

Some clarifications on the comment:
1. I strongly endorse critique of organisations in general and especially within the EA space. I think it's good that we as a community have the norm to embrace critiques.
2. I personally have my criticisms of Conjecture and my comment should not be seen as "everything's great at Conjecture, nothing to see here!". In fact, my main criticisms (of the leadership style, and of CoEm not being the most effective thing they could do) are also represented prominently in this post.
3. I'd also be fine with the authors of this post saying something like "I have a strong feeling that something is fishy at Conjecture, here are the reasons for this feeling". Or they could also clearly state which things are known and which things are mostly intuitions. 
4. However, I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings, and do not throw our epistemics out of the window in the process.
5. My main problem with the post is that they make a list of specific claims with high confidence, and I think that is not warranted given the evidence I'm aware of. That's all.

(cross-posted from EAF, thanks Richard for suggesting. There's more back-and-forth later.)

I'm not very compelled by this response.

It seems to me you have two points on the content of this critique. The first point:

I think it's bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don't find a lot until you hit.

I'm pretty confused here. How exactly do you propose that funding decisions get made? If some random person says they are pursuing a hits-based approach to research, should EA funders be obligated to fund them?

Presumably you would want to say "the team will be good at hits-based research such that we can expect a future hit, for X, Y and Z reasons". I think you should actually say those X, Y and Z reasons so that the authors of the critique can engage with them; I assume that the authors are implicitly endorsing a claim like "there aren't any particularly strong reasons to expect Conjecture to do more impactful work in the future".

The second point:

Your statements about the VCs seem unjustified to me. How do you know they are not aligned? [...] I haven't talked to the VCs either, but I've at least asked people who work(ed) at Conjecture.

Hmm, it seems extremely reasonable to me to take as a baseline prior that the VCs are profit-motivated, and the authors explicitly say

We have heard credible complaints of this from their interactions with funders. One experienced technical AI safety researcher recalled Connor saying that he will tell investors that they are very interested in making products, whereas the predominant focus of the company is on AI safety.

The fact that people who work(ed) at Conjecture say otherwise means that (probably) someone is wrong, but I don't see a strong reason to believe that it's the OP who is wrong.

At the meta level you say:

I do not understand where the confidence with which you write the post (or at least how I read it) comes from.

And in your next comment:

I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings and do not throw our epistemics out of the window in the process

But afaict, the only point where you actually disagree with a claim made in the OP (excluding recommendations) is in your assessment of VCs? (And in that case I feel very uncompelled by your argument.)

In what way has the OP failed to say true things? Where should they have had more uncertainty? What things did they present as facts which were actually feelings? What claim have they been confident about that they shouldn't have been confident about?

(Perhaps you mean to say that the recommendations are overconfident. There I think I just disagree with you about the bar for evidence for making recommendations, including ones as strong as "alignment researchers shouldn't work at organization X". I've given recommendations like this to individual people who asked me for a recommendation in the past, on less evidence than collected in this post.)

I'm not going to crosspost our entire discussion from the EAF. 

I just want to quickly mention that Rohin and I were able to understand where we have different opinions and he changed my mind about an important fact. Rohin convinced me that anti-recommendations should not have a higher bar than pro-recommendations even if they are conventionally treated this way. This felt like an important update for me and how I view the post. 

Omega.

(crossposted from the EA Forum)

We appreciate your detailed reply outlining your concerns with the post. 

Our understanding is that your key concern is that we are judging Conjecture based on their current output, whereas since they are pursuing a hits-based strategy we should expect in the median case for them to not have impressive output. In general, we are excited by hits-based approaches, but we echo Rohin's point: how are we meant to evaluate organizations if not by their output? It seems healthy to give promising researchers sufficient runway to explore, but $10 million and a team of twenty seems on the higher end of what we would want to see supported purely on the basis of speculation. What would you suggest as the threshold at which we should start to expect to see results from organizations?

We are unsure where else you disagree with our evaluation of their output. If we understand correctly, you agree that their existing output has not been that impressive, but think that it is positive they were willing to share preliminary findings and that we have too high a bar for evaluating such output. We've generally not found their preliminary findings to significantly update our views, whereas we would for example be excited by rigorous negative results that save future researchers from going down dead-ends. However, if you've found engaging with their output to be useful to your research then we'd certainly take that as a positive update.

Your second key concern is that we provide limited evidence for our claims regarding the VCs investing in Conjecture. Unfortunately, for confidentiality reasons we are limited in what information we can disclose: it's reasonable if you consequently wish to discount this view. As Rohin said, it is normal for VCs to be profit-seeking. We do not mean to imply these VCs are unusually bad for VCs, just that their primary focus will be the profitability of Conjecture, not safety impact. For example, Nat Friedman has expressed skepticism of safety (e.g. this Tweet) and is a strong open-source advocate, which seems at odds with Conjecture's info-hazard policy.

We have heard from multiple sources that Conjecture has pitched VCs on a significantly more product-focused vision than they are pitching EAs. These sources have either spoken directly to VCs, or have spoken to Conjecture leadership who were part of negotiation with VCs. Given this, we are fairly confident on the point that Conjecture is representing themselves differently to separate groups.

We believe your third key concern is that our recommendations are over-confident. We agree there is some uncertainty, but think it is important to make actionable recommendations, and based on the information we have, our sincerely held belief is that most individuals should not work at Conjecture. We would certainly encourage individuals to consider alternative perspectives (including those expressed in this comment) and to ultimately make up their own mind rather than deferring, especially to an anonymous group of individuals!

Separately, I think we might consider the opportunity cost of working at Conjecture higher than you do. In particular, we'd generally evaluate skill-building routes fairly highly: for example, being a research assistant or PhD student in academia, or working in an ML engineering position in an applied team at a major tech company. These are generally close to capabilities-neutral, and can make individuals vastly more productive. Given the limited information on CoEm it's hard to assess whether it will or won't work, but we think there's ample evidence that there are better places to develop skills than Conjecture.


We wholeheartedly agree that it is important to maintain high epistemic standards during the critique. We have tried hard to differentiate between well-established facts, our observations from sources, and our opinion formed from those. For example, the About Conjecture section focuses on facts; the Criticisms and Suggestions section includes our observations and opinions; and Our Views on Conjecture are more strongly focused on our opinions. We'd welcome feedback on any areas where you feel we over-claimed.

(cross-posted from EAF)

Meta: Thanks for taking the time to respond. I think your questions are in good faith and address my concerns; I do not understand why the comment is downvoted so much by other people.

1. Obviously output is a relevant factor to judge an organization among others. However, especially in hits-based approaches, the ultimate thing we want to judge is the process that generates the outputs to make an estimate about the chance of finding a hit. For example, a cynic might say "what has ARC-theory achieved so far? They wrote some nice framings of the problem, e.g. with ELK and heuristic arguments, but what have they ACtUaLLy achieved?" To which my answer would be, I believe in them because I think the process that they are following makes sense and there is a chance that they would find a really big-if-true result in the future. In the limit, process and results converge but especially early on they might diverge. And I personally think that Conjecture did respond reasonably to their early results by iterating faster and looking for hits.
2. I actually think their output is better than you make it look. The entire simulators framing made a huge difference for lots of people and writing up things that are already "known" among a handful of LLM experts is still an important contribution, though I would argue most LLM experts did not think about the details as much as Janus did. I also think that their preliminary research outputs are pretty valuable. The stuff on SVDs and sparse coding actually influenced a number of independent researchers I know (so much that they changed their research direction to that) and I thus think it was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks. 
3. (copied from response to Rohin): Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I'm aware of (not all of which are mentioned in the post and I'm not sure I can share) actually seem fairly aligned for VC standards to me. Furthermore, the way I read the critique is something like "Connor didn't tell the VCs about the alignment plans or neglects them in conversation". However, my impression from conversations with (ex-)staff was that Connor was very direct about their motives to reduce x-risks. I think it's clear that products are a part of their way to address alignment but to the best of my knowledge, every VC who invested was very aware of what they're getting into. At this point, it's really hard for me to judge because I think that a) on priors, VCs are profit-seeking, and b) different sources said different things, some of which are mutually exclusive. I don't have enough insight to confidently say who is right here. I'm mainly saying that your confidence surprised me given my previous discussions with staff.
4. Regarding confidence: For example, I think saying "We think there are better places to work at than Conjecture" would feel much more appropriate than "we advise against...". Maybe that's just me. I just felt like many statements are presented with a lot of confidence given the amount of insight you seem to have, and I would have wanted them to be a bit more hedged and less confident.
5. Sure, for many people other opportunities might be a better fit. But I'm not sure I would e.g. support the statement that a general ML engineer would learn more in general industry than with Conjecture. I also don't know a lot about CoEm, but that uncertainty would lead me to make weaker statements rather than advising against it.

Thanks for engaging with my arguments. I personally think many of your criticisms hit relevant points and I think a more hedged and less confident version of your post would have actually had more impact on me if I were still looking for a job. As it is currently written, it loses some persuasive force with me because I feel like you're making overly broad, unqualified statements, which intuitively made me a bit skeptical of your true intentions. Most of me thinks that you're trying to point out important criticism but there is a nagging feeling that it is a hit piece. Intuitively, I'm very averse to anything that looks like a click-bait hit piece by a journalist with a clear agenda. I'm not saying you should only consider me as your audience, I just want to describe the impression I got from the piece.

Omega.

(cross-posted from EAF) 

We appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.

1) We agree it's worth taking into account aspects of an organization other than their output. Part of our skepticism towards Conjecture – and we should have made this more explicit in our original post (and will be updating it) – is the limited research track record of their staff, including their leadership. By contrast, even if we accept for the sake of argument that ARC has produced limited output, Paul Christiano has a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to starting work at ARC. We're not aware of any equally significant advances from Connor or other key staff members at Conjecture; we'd be interested to hear if you have examples of their pre-Conjecture output you find impressive.

We're not particularly impressed by Conjecture's process, although it's possible we'd change our mind if we knew more about it. Maintaining high velocity in research is certainly a useful component, but hardly sufficient. The Builder/Breaker method proposed by ARC feels closer to a complete methodology. But this doesn't feel like the crux for us: if Conjecture copied ARC's process entirely, we'd still be much more excited about ARC (per-capita). Research productivity is a product of a large number of factors, and explicit process is an important but far from decisive one.

In terms of the explicit comparison with ARC, we would like to note that ARC Theory's team size is an order of magnitude smaller than Conjecture's. Based on ARC's recent hiring post, our understanding is the theory team consists of just three individuals: Paul Christiano, Mark Xu and Jacob Hilton. If ARC had a team ten times larger and had spent close to $10 million, then we would indeed be disappointed if there were not more concrete wins.

2) Thanks for the concrete examples, this really helps tease apart our disagreement.

We are overall glad that the Simulators post was written. Our view is that it could have been much stronger had it been clearer which claims were empirically supported versus hypotheses. Continuing the comparison with ARC, we found ELK to be substantially clearer and a deeper insight. Admittedly ELK is one of the outputs people in the TAIS community are most excited by so this is a high bar.

The stuff on SVDs and sparse coding [...] was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.

This sounds similar to our internal evaluation. We're a bit confused by why "3 people in two weeks" is the relevant reference class. We'd argue the costs of Conjecture's "misses" need to be accounted for, not just their "hits". Redwood's team size and budget are comparable to those of Conjecture, so if you think that causal scrubbing is more impressive than Conjecture's other outputs, then it sounds like you agree with us that Redwood was more impressive than Conjecture (unless you think the Simulators post is head and shoulders above Redwood's other output)?

Thanks for sharing the data point that this influenced independent researchers. That's useful to know, and updates us positively. Are you excited by those independent researchers' new directions? Is there any output from those researchers you'd suggest we review?

3) We remain confident in our sources regarding Conjecture's discussions with VCs, although it's certainly conceivable that Conjecture was more open with some VCs than others. To clarify, we are not claiming that Connor or others at Conjecture did not mention anything about their alignment plans or interest in x-risk to VCs (indeed, this would be a barely tenable position for them given their public discussion of these plans), simply that their pitch gave the impression that Conjecture was primarily focused on developing products. It's reasonable for you to be skeptical of this if your sources at Conjecture disagree; we would be interested to know how close to the negotiations those staff were, although we understand this may not be something you can share.

4) We think your point is reasonable. We plan to reflect on this recommendation and will reply here when we have an update.

5) This certainly depends on what "general industry" refers to: a research engineer at Conjecture might well be better for ML skill-building than, say, being a software engineer at Walmart. But we would expect ML teams at top tech companies, or working with relevant professors, to be significantly better for skill-building. Generally we expect quality of mentorship to be one of the most important components of individuals developing as researchers and engineers. The Conjecture team is stretched thin as a result of rapid scaling, and had few experienced researchers or engineers on staff in the first place. By contrast, ML teams at top tech companies will typically have a much higher fraction of senior researchers and engineers, and professors at leading universities comprise some of the best researchers in the field. We'd be curious to hear your case for Conjecture as skill building; without that it's hard to identify where our main disagreement lies.

I agree with Conjecture's reply that this reads more like a hit piece than an even-handed evaluation.

I don't think your recommendations follow from your observations, and such strong claims surely don't follow from the actual evidence you provide. I feel like your criticisms can be summarized as the following:

  1. Conjecture was publishing unfinished research directions for a while.

  2. Conjecture does not publicly share details of their current CoEm research direction, and that research direction seems hard.

  3. Conjecture told the government they were AI safety experts.

  4. Some people (who?) say Conjecture's governance outreach may be net-negative and upsetting to politicians.

  5. Conjecture's CEO Connor used to work on capabilities.

  6. One time during college Connor said that he replicated GPT-2, then found out he had a bug in his code.

  7. Connor has said at some times that open source models were good for alignment, then changed his mind.

  8. Conjecture's infohazard policy can be overturned by Connor or their owners.

  9. They're trying to scale when it is common wisdom for startups to try to stay small.

  10. It is unclear how they will balance profit and altruistic motives.

  11. Sometimes you talk with people (who?) and they say they've had bad interactions with Conjecture staff or leadership when trying to tell them what they're doing wrong.

  12. Conjecture seems like they don't talk with ML people.

I'm actually curious about why they're doing 9, and further discussion on 10 and 8. But I don't think any of the other points matter, at least to the depth you've covered them here, and I don't know why you're spending so much time on stuff that doesn't matter or you can't support. This could have been so much better if you had taken the research time spent on everything that wasn't 8, 9, or 10, and used it to do analyses of 8, 9, and 10, and then actually had a conversation with Conjecture about your disagreements with them.

I especially don't think your arguments support your suggestions that

  1. Don't work at Conjecture.

  2. Conjecture should be more cautious when talking to media, because Connor seems unilateralist.

  3. Conjecture should not receive more funding until they reach levels of organizational competence similar to OpenAI or Anthropic.

  4. Rethink whether or not you want to support conjecture's work non-monetarily. For example, maybe think about not inviting them to table at EAG career fairs, inviting Conjecture employees to events or workspaces, and taking money from them if doing field-building.

(1) seems like a pretty strong claim, which is left unsupported. I know of many people who would be excited to work at conjecture, and I don't think your points support the claim they would be doing net-negative research given they do alignment at Conjecture.

For (2), I don't know why you're saying Connor is unilateralist. Are you saying this because he used to work on capabilities?

(3) is just absurd! OpenAI will perhaps be the most destructive organization to date. I do not think your above arguments make the case they are less organizationally responsible than OpenAI. Even having an info-hazard document puts them leagues above both OpenAI and Anthropic in my book. And add onto that their primary way of getting funded isn't building extremely large models... In what way do Anthropic or OpenAI have better corporate governance structures than Conjecture?

(4) is just... what? Ok, I've thought about it, and come to the conclusion this makes no sense given your previous arguments. Maybe there's a case to be made here. If they are less organizationally competent than OpenAI, then yeah, you probably don't want to support their work. This seems pretty unlikely to me though! And you definitely don't provide anything close to the level of analysis needed to elevate such hypotheses.

Edit: I will add to my note on (2): In most news articles in which I see Connor or Conjecture mentioned, I feel glad he talked to the relevant reporter, and think he/Conjecture made that article better. It is quite an achievement in my book to have sane conversations with reporters about this type of stuff! So mostly I think they should continue doing what they're doing.

I'm not myself an expert on PR (I'm skeptical that anyone is), so maybe my impressions of the articles are naive and backwards in some way. If you do think this is important, it would likely be good to mention somewhere why you think their media outreach is net-negative, ideally pointing to particular things you think they did wrong rather than vague and menacing criticisms of unilateralism.

From my perspective 9 (scaling fast) makes perfect sense since Conjecture is aiming to stay "slightly behind state of the art", and that requires engineering power.

I'm pretty skeptical they can achieve that right now using CoEm given the limited progress I expect them to have made on CoEm. And in my opinion of greater importance than "slightly behind state of the art" is likely security culture, and commonly in the startup world it is found that too-fast scaling leads to degradation in the founding culture. So a fear would be that fast scaling would lead to worse info-sec.

However, I don't know to what extent this is an issue. I can certainly imagine a world where because of EA and LessWrong, many very mission-aligned hires are lining up in front of their door. I can also imagine a lot of other things, which is why I'm confused.

(cross-posted from the EA Forum)

Regarding your specific concerns about our recommendations: 

1) We address this point in our response to Marius (5th paragraph)  

2) As we note in the relevant section: “We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk.” This kind of relationship-building is unilateralist when it can decrease goodwill amongst policymakers.

3) To be clear, we do not expect Conjecture to have the same level of “organizational responsibility” or “organizational competence” (we aren’t sure what you mean by those phrases and don’t use them ourselves) as OpenAI or Anthropic. Our recommendation was for Conjecture to have a robust corporate governance structure. For example, they could change their corporate charter to implement a "springing governance" structure such that voting equity (but not political equity) shifts to an independent board once they cross a certain valuation threshold. As we note in another reply, Conjecture’s infohazard policy has no legal force, and is therefore not as strong as either OpenAI or Anthropic’s corporate governance models. As we’ve noted already, we have concerns about both OpenAI and Anthropic despite their having these models in place: Conjecture doesn’t even have those, which makes us more concerned.

I responded to a very similar comment of yours on the EA Forum.

To respond to the new content, I don't know if changing the board of Conjecture once a certain valuation threshold is crossed would make the organization more robust (now that I think of it, I don't even really know what you mean by strong or robust here; depending on what you mean, I can see myself disagreeing about whether that even tracks positive qualities about a corporation). You should justify claims like those, and at least include them in the original post. Is it sketchy that they don't have this?

We have heard that Conjecture misrepresent themselves in engagement with the government, presenting themselves as experts with stature in the AIS community, when in reality they are not.

What does it mean for Conjecture to be "experts with stature in the AIS community"? Can you clarify what metrics comprise expertise in AIS -- are you dissatisfied with their demonstrated grasp of alignment work, or perhaps their research output, or maybe something a little more qualitative? 

Basically, this excerpt reads like a crisp claim of common knowledge ("in reality") but the content seems more like a personal judgment call by the author(s).

Omega.

Hi TurnTrout, thanks for asking this question. We're happy to clarify:

  1. 'experts': We do not consider Conjecture at the same level of expertise as [edit] alignment leaders and researchers at other organizations such as Redwood, ARC, researchers at academic labs like CHAI, and the alignment teams at Anthropic, OpenAI and DeepMind. This is primarily because we believe their research quality is low.
  2. 'with stature in the AIS community': Based on our impression (from conversations with many senior TAIS researchers at a range of organizations, including a handful who reviewed this post and didn't disagree with this point) of the TAIS community, Conjecture is not considered a top alignment research organization within the community. 

We do not consider Conjecture at the same level of expertise as other organizations such as Redwood, ARC, researchers at academic labs like CHAI, and the alignment teams at Anthropic, OpenAI and DeepMind. This is primarily because we believe their research quality is low.

This isn't quite the right thing to look at IMO. In the context of talking to governments, an "AI safety expert" should have thought deeply about the problem, have intelligent things to say about it, know the range of opinions in the AI safety community, have a good understanding of AI more generally, etc. Based mostly on his talks and podcast appearances, I'd say Connor does decently well along these axes. (If I had to make things more concrete, there are a few people I'd personally call more "expert-y", but closer to 10 than 100. The AIS community just isn't that big and the field doesn't have that much existing content, so it seems right that the bar for being an "AIS expert" is lower than for a string theory expert.)

I also think it's weird to split this so strongly along organizational lines. As an extreme case, researchers at CHAI range on a spectrum from "fully focused on existential safety" to "not really thinking about safety at all". Clearly the latter group aren't better AI safety experts than most people at Conjecture. (And FWIW, I belong to the former group and I still don't think you should defer to me over someone from Conjecture just because I'm at CHAI.)

One thing that would be bad is presenting views that are very controversial within the AIS community as commonly agreed-upon truths. I have no special insight into whether Conjecture does that when talking to governments, but it doesn't sound like that's your critique at least?

Hi Erik, thanks for your points; we meant to say "at the same level of expertise as alignment leaders and researchers at other organizations such as...". This was a typo on our part.

mishka

As a person not affiliated with Conjecture, I want to record some of my scattered reactions. A lot of upvotes on such a post without substantial comments seems... unfair?

On one hand, it is always interesting to read something like that. Many of us have pondered Conjecture, asking ourselves whether what they are doing and the way they are doing it make sense. E.g. their infohazard policy has been remarkable, super-interesting, and controversial. My own reflections on that have been rather involved and complicated.

On the other hand, when I am reading the included Conjecture response, what they are saying there seems to me to make total sense (if I were in an artificial binary position of having to fully side with the post or with them, I would have sided with Conjecture on this). Although one has to note that their https://www.conjecture.dev/a-standing-offer-for-public-discussions-on-ai/ is returning a 404 at the moment. Is that offer still standing?

Specifically, on their research quality, the Simulator theory has certainly been controversial, but many people find it extremely valuable, and I personally tend to recommend it to people as the most important conceptual breakthrough of 2022 (in my opinion) (together with the notes I took on the subject). It is particularly valuable as a deconfusion tool on what LLMs are and aren't, and I found that framing the LLM-related problems in terms of properties of simulation runs and in terms of sculpting and controlling the simulations is very productive. So I am super-grateful for that part of their research output.

On the other hand, I did notice that the authors of that work and Conjecture had parted ways (and when I noticed that I told myself, "perhaps I don't need to follow that org all that closely anymore, although it is still a remarkable org").

I think what makes writing comments on posts like this one difficult is that the post is really structured and phrased in such a way as to make this a situation of personal conflict, internal to the relatively narrow AI safety community.

I have not downvoted the post, but I don't like this aspect, I am not sure this is the right way to approach things...

Apologies for the 404 on the page, it's an annoying cache bug. Try to hard refresh your browser page (CMD + Shift + R) and it should work.

Works now. Thanks!

I am afraid this is a more persistent problem (or, perhaps, it comes and goes, but I am even trying browsers I don't normally use (in addition to hard reload on those I do normally use), and it still returns 404).

I'll be testing this further occasionally... (You might want to check whether anyone else who does not have privileged access to your systems is seeing it at the moment; some systems like, for example, GitHub often show 404 to people who don't have access to an actually existing file instead of showing 403 as one would normally expect.)

Omega.

Thanks for commenting and sharing your reactions Mishka. 

Some quick notes on what you've shared:

Although one has to note that their https://www.conjecture.dev/a-standing-offer-for-public-discussions-on-ai/ is returning a 404 at the moment. Is that offer still standing?

In their response to us they told us this offer was still standing.

A lot of upvotes on such a post without substantial comments seems... unfair?

As of the time of your comment, we believe there were about 8 votes and 30 karma and the post had been up a few hours. We are not sure what voting frequency is on LW (e.g. we're not sure if this is higher or lower than average?) but if it's higher, some hypotheses (we'd love to hear inputs from folks who have upvoted without a comment):

  • Some people are supportive of criticism in general, and may have upvoted to support more critical discussion (even though they may disagree with object level comments)
  • Some people who upvoted may already agree with the views of this post (e.g. some of the upvoters could be our reviewers)
  • Some people may have upvoted so this post gets more attention / discussion so they could see what others think of it 
  • Some folks may have upvoted for now and might come back to the post to leave more substantive comments when they have time

I think what makes writing comments on posts like this one difficult is that the post is really structured and phrased in such a way as to make this a situation of personal conflict, internal to the relatively narrow AI safety community.

I have not downvoted the post, but I don't like this aspect, I am not sure this is the right way to approach things...

If we understand correctly, we think what you're saying is that because there are many claims in this post, it seems suboptimal that people can't indicate agreement or disagreement with specific claims via post-level voting.

We think this is a great point. We'd love to see an option for people to agree/disagree with specific claims on posts to provide a more nuanced understanding of where consensus lies. We think it's very plausible that some of our points will end up being much more controversial than others. (if you wanted to add separate comments for specific claims that people could vote on, we'd love to see that and would be happy to add a note to the top-level post encouraging folks to do so)

Our hope is that folks can comment with areas of disagreement to start a discussion on those points. 

janus

we think Conjecture [...] have too low a bar for sharing, reducing the signal-to-noise ratio and diluting standards in the field. When they do provide evidence, it appears to be cherry picked.

This is an ironic criticism, given that this post has a very low signal-to-noise ratio and, when it does provide evidence, it's obviously cherry-picked. Relatedly, I am curious whether you used AI to write many parts of this post, because the style is reminiscent of it and it reeks of a surplus of cognitive labor put to inefficient use, and seems to include some confabulations. A large percentage of the words in this post are spent on redundant, overly-detailed summaries.

I actually did not mind reading this style, because I found it intriguing, but if typical lesswrong posts were like this it would be annoying and harm the signal-to-noise ratio.

Confabulation example:

(The simulators) post ends with speculative beliefs that they stated fairly confidently that took the framing to an extreme (e.g if the AI system adopts the “superintelligent AI persona” it’ll just be superintelligent).

This is... not how the post ends, nor is it a claim made anywhere in the post, and it's hard to see how it could even be a misinterpretation of anything at the end of the post.

Your criticisms of Conjecture's research are vague statements that it's "low quality" and "not empirically testable" but you do not explain why. These potentially object-level criticisms are undermined from an outside view by your exhaustive, one-sided nitpicking of Connor's character, which gives the impression that the author is saying every possible negative thing they can against Conjecture without regard for salience or even truth.

Having known some of Conjecture's founders and their previous work in the context of "early-stage EleutherAI", I share some[1] of the main frustrations outlined in this post. At the organizational level, even setting aside the departure of key researchers, I do not think that Conjecture's existing public-facing research artifacts have given much basis for me to recommend the organization to others (aside from existing personal ties). To date, only[2] a few posts like their one on the polytope lens and their one on circumventing interpretability were at the level of quality & novelty I expected from the team. Maybe that is a function of the restrictive information policies, maybe a function of startup issues, maybe just the difficulty of research. In any case, I think that folks ought to require more rigor and critical engagement from their future research outputs[3].

  1. ^

    I didn't find the critiques of Connor's "character and trustworthiness" convincing, but I already consider him a colleague & a friend, so external judgments like these don't move the needle for me.

  2. ^

    The main other post I have in mind was their one on simulators. AFAICT the core of "simulator theory" predated (mid-2021, at least) Conjecture, and yet even with a year of additional incubation, the framework was not brought to a sufficient level of technical quality.

  3. ^

    For example, the "cognitive emulation" work may benefit from review by outside experts, since the nominal goal seems to be to do cognitive science entirely inside of Conjecture.

I think the critique of Redwood Research made a few valid points. My own critique of Redwood would go something like: 

  1. they hired too few support staff to keep their primary researchers well supported and happy, and thus had unnecessarily high turnover
  2. they hired too high a proportion of junior researchers, in an unsettled phase of life without high likelihood of sticking with a current job, again contributing to too much turnover and to a lack of researchers who knew what to expect from a workplace and how to maintain their work-life balance.

Not much of a critique, honestly. A reasonable mistake that a lot of start-ups led by young inexperienced people would make, and certainly something fixable. Also, they have longer AGI timelines than me, and thus are not acting with what I see as sufficient urgency. But I don't think that it's necessarily fair for me to critique orgs for having their own well-considered opinions on this different from my own. I'm not even sure if them having my timelines would improve their output any.

This critique on the other hand seems entirely invalid and counterproductive. You criticize Conjecture's CEO for being... a charismatic leader good at selling himself and leading people? Because he's not... a senior academic with a track record of published papers?  Nonsense. Expecting the CEO to be the primary technical expert seems highly misguided to me. The CEO needs to know enough about the technical aspects to be able to hire good technical people, and then needs to coordinate and inspire those people and promote the company. I think Connor is an excellent pick for this, and your criticisms of him are entirely beside the point, and also rather rude. 

Conjecture, and Connor, seem to actually be trying to do something which strikes at the heart of the problem. Something which might actually help save us in three years from now when the leading AI labs have in their possession powerful AGI after a period of recursive self-improvement by almost-but-not-quite-AGI. I expect this AGI will be too untrustworthy to make more than very limited use of. So then, looking around for ways to make use of their newfound dangerous power, what will they see? Some still immature interpretability research. Sure. And then? Maybe they'll see the work Conjecture has started and realize that breaking down the big black magic box into smaller more trustworthy pieces is one of the best paths forward. Then they can go knocking on Conjecture's door, collect the research so far, and finish it themselves with their abundant resources.

My criticism of their plan is primarily: you need even more staff and more funding to have a better chance of this working. Which is basically the opposite of the conclusion you come to.

As for the untrustworthiness of their centralized infohazard policy... Yeah, this would be bad if the incentives were for the central individual to betray the world for their own benefit. That's super not the case here. The incentive is very much the opposite. For much the same reason that I feel pretty trusting of the heads of Deepmind, OpenAI, and Anthropic. Their selfish incentives to not destroy themselves and everyone they love are well aligned with humanity's desire to not be destroyed. Power-seeking in this case is a good thing! Power over the world through AGI, to these clever people, clearly means learning to control that untrustworthy AGI... thus means learning how to save the world. My threat model says that the main danger comes from not the heads of the labs, but the un-safety-convinced employees who might leave to start their own projects, or outside people replicating the results the big labs have achieved but with far fewer safety precautions. 

I think reasonable safety precautions, like not allowing unlimited unsupervised recursive self-improvement, not allowing source code or model weights to leave the lab, sandbox testing, etc. can actually be quite effective in the short term in protecting humanity from rogue AGI. I don't think surprise-FOOM-in-a-single-training-run-resulting-in-a-sandbox-escaping-superintelligence is a likely threat model. I think a far more likely threat model is foolish amateurs or bad actors tinkering with dangerous open source code and stumbling into an algorithmic breakthrough they didn't expect and don't understand and foolishly releasing it onto the web.

I think putting hope in compute governance is a very limited hope. We can't govern compute for long, if at all, because there will be huge reductions in compute needed once more efficient training algorithms are found. 

You criticize Conjecture's CEO for being... a charismatic leader good at selling himself and leading people? Because he's not... a senior academic with a track record of published papers?  Nonsense. Expecting the CEO to be the primary technical expert seems highly misguided to me.

 

Yeah this confused me a little too. My current job (in soil science) has a non-academic boss, and a team of us boffins, and he doesn't need to be an academic, because it's not his job; he just has to know where the money comes from, and how to stop the stakeholders from running away screaming when us soil nerds turn up to a meeting and start emitting maths and graphs out of our heads. Likewise the previous place I was at, I was the only non-PhD-haver on technical staff (being a 'mere' postgrad) and again our boss wasn't academic at all. But he WAS a leader of men and herder of cats, and cat herding is probably a more important skill in that role than actually knowing what those cats are talking about.

And it all works fine. I don't need an academic boss, even if I think an academic boss would be nice. I need a boss who knows how to keep the payroll from derailing, and I suspect the vast majority of science workers feel the same way.

Note that we don't criticize Connor specifically, but rather the lack of a senior technical expert on the team in general (including Connor). Our primary criticisms of Connor don't have to do with his leadership skills (which we don't comment on at any point in the post).

I'm confused about the disagree votes. Can someone who disagree-voted say which of the following claims they disagreed with:
1. Omega criticized the lack of a senior technical expert on Conjecture's team.

2. Omega's primary criticisms of Connor doesn't have to do with his leadership skills.

3. Omega did not comment on Connor's leadership skills at any point in the post.

Beren Millidge is not a senior technical expert?

Nathan Helm-Burger used a different notion of "leadership" (like a startup CEO) to criticise the post, and Omega responded to it by saying something about "management" leadership, which doesn't really respond to Nathan's comment.

Ah I see. Hmm, if I say "Yesterday I said X," people-who-talk-like-me will interpret contextless disagreement with that claim as "Yesterday I didn't say X" and not as "X is not true." Perhaps this is a different communication norm from LW standards, in which case I'll try to interpret future agree/disagree comments in that light.

I agree from quickly looking at Beren's LinkedIn page that he seems like a technical expert (I don't know enough about ML to have a particularly relevant inside-view about ML technical expertise). 

I think the (perhaps annoying) fact is that LW readers aren't a monolith and different people interpret disagreement votes differently.

BTW, from a comment on the EA Forum cross-post, I discovered that Beren reportedly left Conjecture very recently. That's indeed a negative update on Conjecture for me (not so much because he specifically left, but rather because this indicates a high turnover rate). Regardless, this doesn't apply to the inference made by Omega in this report, along the lines that "Conjecture's research is iffy because they don't have senior technical experts and don't know what they are doing", because this wasn't true until very recently and probably still isn't true (overwhelmingly likely there are other technical experts who are still working at Conjecture), so this doesn't invalidate or stain the research that has been done and published previously.

Interestingly, the reception on the EA Forum is more positive (154 net karma at 136 votes), compared to here (24 net karma at 105 votes).