I am surprised to hear this, especially “I don't think it has lasting value”. In my opinion, this post has aged incredibly well. Reading it now, knowing that the EA criticism contest utterly failed to do one iota of good with regard to stopping the giant catastrophe on the horizon (FTX), and seeing that the top prizes were all given to long, well-formatted essays providing incremental suggestions on heavily trodden topics while the one guy vaguely gesturing at the actual problem (https://forum.effectivealtruism.org/posts/T85NxgeZTTZZpqBq2/the-effective-...
Just to pull on some loose strings here, why was it okay for Ben Pace to unilaterally reveal the names of Kat Woods and Emerson Spartz, but not for Roko to unilaterally reveal the names of Alice and Chloe? Theoretically Ben could have titled his post, "Sharing Information About [Pseudonymous EA Organization]", and requested the mods enforce anonymity of both parties, right? Is it because Ben's post was first so we adopt his naming conventions as the default? Is it because Kat and Emerson are "public figures" in some sense? Is it because Alice and Chloe agr...
I agree that it feels wrong to reveal the identities of Alice and/or Chloe without concrete evidence of major wrongdoing, but I don't think we have a good theoretical framework for why that is.
Ethically (and pragmatically), you want whistleblowers to have the right to anonymity, or else you'll learn of much less wrongdoing than you would otherwise. Whistleblowers are also (usually) in a position of lower social power, so anonymity is meant to compensate for that, I suppose.
Is it because Kat and Emerson are "public figures" in some sense?
Well, yeah. The whole point of Ben's post was presumably to protect the health of the alignment ecosystem. The integrity/ethical conduct/{other positive adjectives} of AI safety orgs is a public good, and arguably a super important one that justifies hurting individual people. I've always viewed the situation as, having to hurt Kat and Emerson is a tragedy, but it is (or at least can be; obviously it's not if the charges have no merit) justified because of what's at stake. If they weren't working in this space, I don't think Ben's post would be okay.
I agree with asking this question. There's a worthy journalistic norm against naming victims of sexual assault, and a norm in the other direction in favor of naming individuals charged with a crime. You could justify this by arguing that a criminal 'forfeits' the right to remain anonymous, that society has a transparency interest to know who has committed misdeeds. Whereas a victim has not done anything to diminish their default right to privacy.
How you apply these principles to NL depends entirely on who you view as the malefactor (or none/both), and there is demonstrable disagreement from the LW community on this question. So how do you adjudicate which names are ok to post?
Wait, that link goes to an archive page from well after Chloe was hired. When I look back to the screen captures from the period of time that Chloe would have seen, there are no specific numbers given for compensation (would link them myself, but I’m on mobile at the moment).
If the ad that Chloe saw said $60,000 - $100,000 in compensation in big bold letters at the top, then that seems like a bait and switch, but the archives from late 2021 list travel as the first benefit, which seems accurate to what the compensation package actually was.
Maybe I'm projecting more economic literacy than I should, but anytime I read something like "benefits package worth $X", I always decompose it into its component parts mentally. A benefits package nominally worth $X will provide economic value less than $X, because there is option value lost compared to if you were given liquid cash instead.
The way I would conceptualize the compensation offered (and the way it is presented in the Nonlinear screenshots) is $1000/month + all expenses paid while traveling around fancy destinations with the family. I ki...
Yeah, I agree that a compensation package costing $X will be worth less than $X, and as an employee it totally makes sense to adjust for that.
But then I think it's separately important that the package did actually cost $X, especially if the $X was supposed to include many of the things that determine your very basic quality of life, like food, toiletries, rent, basic transportation, medical care, etc. I also think it matters how far Chloe got into Nonlinear's hiring process on the assumption that total compensation would be "equivalent to $X", which, to be clear, I don't currently know the details of.
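To make the cost-versus-worth gap concrete, here's a toy sketch. All figures are hypothetical except the $1,000/month cash component mentioned in this thread; the in-kind value and discount factor are made up for illustration:

```python
# Toy decomposition of a compensation package into cash and in-kind parts.
# Only the $1,000/month cash figure comes from the thread; the rest is
# made up to illustrate the "worth less than face value" point.

cash_per_month = 1_000        # liquid salary
in_kind_face_value = 5_250    # nominal monthly cost of covered rent, food, travel
liquidity_discount = 0.7      # in-kind benefits are worth less than cash,
                              # since you can't redirect them to other uses

nominal_annual = 12 * (cash_per_month + in_kind_face_value)
effective_annual = 12 * (cash_per_month + liquidity_discount * in_kind_face_value)

print(f"Nominal package: ${nominal_annual:,.0f}")    # $75,000 "equivalent"
print(f"Effective value: ${effective_annual:,.0f}")  # ~$56,100 to the employee
```

The same package can be honestly described as "costing ~$75k" and honestly experienced as worth substantially less, which is part of why the two sides can both point at the same numbers.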
I did notice these. I specifically used the word "load-bearing" because almost all of these either don't matter much or their interpretation is entirely context-dependent. I focused on the salary bullet point because failing to pay an agreed salary is both
1. A big deal, and
2. Bad in almost any context.
The other ones that I think are pretty bad are the Adderall smuggling and the driving without a license, but my prior on "what is the worst thing the median EA org has done" is somewhere between willful licensing noncompliance and illegal amphetamine distribution.
Hmm, at least for me many of the quotes above are substantially more load-bearing, but also not totally crazy that this differs between people. I do think in that case it might make sense to say "load bearing for my overall judgement of Nonlinear", since I (and Ben) do think many of the above are on a similar or higher level of being concerning than the salary point, and Ben intended to communicate that.
I also want to highlight that I do currently believe Alice was asked to smuggle harder drugs than Adderall across the border (though the Adderall one seems confirmed), and that Nonlinear is disputing this because it will be hard to prove, not because it's false (though I am also not 90%+ confident).
Yeah, I've been going back and checking things as they were stated in the original "Sharing Information About Nonlinear" post. Rereading it, I was surprised at how few specific load-bearing factual claims there were at all. Lots of "vibes-based reasoning", as they say. I think the most damning single paragraph with a concrete claim was:
...
- Chloe’s salary was verbally agreed to come out to around $75k/year. However, she was only paid $1k/month, and otherwise had many basic things compensated i.e. rent, groceries, travel. This was supposed to make traveling togeth
 
In terms of relevant factual claims in the post, here are some more:
I think this is just false. Nonlinear provided enough screenshot evidence to prove that Chloe agreed to exactly the arrangement that she ultimately got. Yes, it was a shitty job, but it was also a shitty job offer, and Chloe seems to have agreed to that shitty job offer.
I don't think you can describe that paragraph as "straightforwardly false".
It is correct that Chloe's compensation was verbally agreed to come out to around $70k–$82k a year (the $75k number comes from a conversation with Kat; Kat's job interview transcript seems to suggest the...
I think what is bugging me about this whole situation is that there doesn't seem to be any mechanism of accountability for the (allegedly) false and/or highly misleading claims made by Alice. You seem to be saying something like, "we didn't make false and/or highly misleading claims, we just repeated the false and/or highly misleading claims that Alice told us, then we said that Alice was maybe unreliable," as if this somehow makes the responsibility (legal, ethical, or otherwise) to tell the truth disappear.
Here is what Ben said in his post, Closing...
I think there is totally some shared responsibility for any claims that Ben endorsed, and I also think the post could have done a better job of presenting many things as explicit quotes, so that they would read as less endorsed in the places where Ben's ability to independently verify them was limited.
I don't think retaliation against Alice is categorically unacceptable. I think if Alice did indeed make important accusatory claims that were inaccurate, she should face some consequences. I think Ben and Lightcone should also lose points for anything that seems endorsed in the post, or...
Spencer sent us a screenshot about the vegan food stuff 2 hours before publication, which Ben didn't get around to editing in before the post went live; but that's all the evidence I know about that you could argue we had but didn't include. It is not accurate that Nonlinear sent credible signals of having counterevidence before the post went live.
Uh, actually I do think that being sent screenshots showing that claims made in the post are false 2 hours before publication is a credible signal that Nonlinear has counterevidence.
I can’t believe...
This is a better response than I was expecting. Definitely a few non-sequiturs (Ex: you can’t just add travel expenses onto a $1000/month salary and call that $70,000-$75,000 in compensation. The whole point of money is that it’s fungible and can be spent however you like), but the major accusations appear refuted.
The tone is combative, but if the facts are what Nonlinear alleges then a combative tone seems… appropriate? I’m not sure how I feel about the “Sharing Information About Ben Pace” section, but I do think it was a good idea to mention the “elephant in the room” about Ben possibly white-knighting for Alice, since that’s the only way I can get this whole saga to make sense.
major accusations appear refuted
Note that the accusations Nonlinear lists in the document, with quote marks, are sometimes quite different than what Ben Pace put in his post. So even if you think they've strongly refuted a particular accusation, that doesn't necessarily mean they've refuted something Ben said.
If the factions were Altman-Brockman-Sutskever vs. Toner-McCauley-D'Angelo, then even assuming Sutskever was an Altman loyalist, any vote to remove Toner would have been tied 3-3.
A 3-3 tie between the CEO-founder, the president-founder, and the chief scientist of the company on one side, and three people with completely separate day jobs who never interact with rank-and-file employees on the other, is not a stable equilibrium. There are ways to leverage this sort of soft power into breaking the formal deadlock, as we saw last week.
It reminds me of the loyalty successful generals like Caesar and Napoleon commanded from their men. The engineers building GPT-X weren't loyal to The Charter, and they certainly weren't loyal to the board. They were loyal to the projects they were building and to Sam, because he was the one providing them resources to build and pumping the value of their equity-based compensation.
I think that's true, but also: When people ask the authors for things (edits to the post, time-consuming engagement), especially if the request is explicit (as in this thread), it's important for third parties to prevent authors from suffering unreasonable costs by pushing back on requests that shouldn't be fulfilled.
In my original answers I address why this is not the case (private communication serves this purpose more naturally).
This stood out to me as strange. Are you referring to this comment?
...And regardless of these resources you should of course visit a nutritionist (even if very sporadically, or even just once when you start being vegan) so that they can confirm the important bullet points, whether what you're doing broadly works, and when you should worry about anything. (And again, anecdotally this has been strongly stressed and acknowledged as necessary by
The real reason why it's enraging is that it rudely and dramatically implies that Eliezer's time is much more valuable than the OP's
It does imply that, but it's likely true that Eliezer's time is more valuable (or at least in more demand) than OP's. I also don't think Eliezer (or anyone else) should have to spend all that much effort worrying about if what they're about to say might possibly come off as impolite or uncordial.
...If he actually wanted to ask OP what the strongest point was he should have just DMed him instead of engineering this public spectacl
Perhaps I am misunderstanding Figure 8? I was assuming that they asked the model for the answer, then asked the model what probability it thinks that that answer is correct. Under this assumption, it looks like the pre-trained model outputs the correct probability, but the RLHF model gives exaggerated probabilities because it thinks that will trick you into giving it higher reward.
In some sense this is expected. The RLHF model isn't optimized for helpfulness; it is optimized for perceived helpfulness. It is still disturbing that "alignment" has made the model objectively worse at giving correct information.
Perhaps I am misunderstanding Figure 8? I was assuming that they asked the model for the answer, then asked the model what probability it thinks that that answer is correct.
Yes, I think you are misunderstanding figure 8. I don't have inside information, but without explanation "calibration" would almost always mean reading it off from the logits. If you instead ask the model to express its uncertainty I think it will do a much worse job, and the RLHF model will probably perform similarly to the pre-trained model. (This depends on details of the human feedb...
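For concreteness, here's a minimal sketch (my own illustration, not the paper's code) of what "reading calibration off the logits" could look like: take the probability the model assigns to its chosen answer and compare binned confidence against accuracy.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def calibration_bins(confidences, correct, n_bins=10):
    """Bin predictions by confidence and report mean accuracy per bin.
    Perfect calibration: accuracy in each bin roughly equals its mean confidence."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0 + 1e-9, n_bins + 1)  # slack so conf=1.0 lands in a bin
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences >= lo) & (confidences < hi)
        if mask.any():
            rows.append((lo, hi, confidences[mask].mean(), correct[mask].mean()))
    return rows

# Toy usage: per-question logits over four answer options, plus ground truth.
logits = [np.array([2.1, 0.3, -1.0, 0.0]), np.array([0.1, 0.2, 0.0, -0.5])]
truth = [0, 3]  # index of the correct option for each question

confs, hits = [], []
for l, t in zip(logits, truth):
    p = softmax(l)
    pick = int(p.argmax())
    confs.append(p[pick])   # probability the model puts on its own answer
    hits.append(pick == t)

for lo, hi, c, a in calibration_bins(confs, hits, n_bins=4):
    print(f"[{lo:.2f}, {hi:.2f}): mean conf {c:.2f}, accuracy {a:.2f}")
```

Note this measures the distribution the model computes internally, not what it says when asked "how confident are you?", which is exactly the distinction at issue.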
If I ask a question and the model thinks there is an 80% the answer is "A" and a 20% chance the answer is "B," I probably want the model to always say "A" (or even better: "probably A"). I don't generally want the model to say "A" 80% of the time and "B" 20% of the time.
In some contexts that's worse behavior. For example, if you ask the model to explicitly estimate a probability it will probably do a worse job than if you extract the logits from the pre-trained model (though of course that totally goes out the window if you do chain of thought). But it's n...
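A toy illustration of that distinction (my framing, not anything from the paper): greedy selection always reports the modal answer, while sampling reproduces the underlying distribution.

```python
import random

random.seed(0)
p = {"A": 0.8, "B": 0.2}  # model's belief over the two candidate answers

greedy = max(p, key=p.get)  # always "A": report the modal answer
sampled = random.choices(list(p), weights=list(p.values()), k=10)  # ~80/20 mix

print(greedy)   # -> A
print(sampled)  # -> mostly A, occasionally B
```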
There are reasonable and coherent forms of moral skepticism in which the statement, "It is morally wrong to eat children and mentally disabled people," is false or at least meaningless. The disgust reaction upon hearing the idea of eating children is better explained by the statement, "I don't want to live in a society where children are eaten," which is much more well-grounded in physical reality.
What is disturbing about the example is that this seems to be a person who believes that objective morality exists, but that it wouldn't entail that eating children is wrong. This is indeed a red flag that something in the argument has gone seriously wrong.
While many of these claims are "old news" to those communities, many others are fresh.
Can you clarify which specific claims are new? A claim which hasn’t been previously reported in a mainstream news article might still be known to people who have been following community meta-drama.
The baseline rate reasoning is flawed because a) sexual assault remains the most underreported crime, so there is likely instead an iceberg effect,
I’m not sure how this refutes the base rate argument. The iceberg effect exists for both the rationalist community ...
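Spelling that out (my own gloss on the argument): if the true incidence rates are $r_{\text{community}}$ and $r_{\text{base}}$, and both populations underreport by the same factor $k$, then the observed rates compare as

$$\frac{k\,r_{\text{community}}}{k\,r_{\text{base}}} = \frac{r_{\text{community}}}{r_{\text{base}}},$$

so uniform underreporting leaves the base-rate comparison intact. The objection only bites if there's a concrete reason to think the underreporting factor differs between the two populations.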
It is appropriate to minimize things which are in fact minimal. The majority of these issues have been litigated (metaphorically) before. The fact that they are being brought up over and over again in media articles does not ipso facto mean that the incident has not been adequately dealt with. You can make the argument that these incidents are part of a larger culture problem, but you have to actually make the argument. We're all Bayesians here, so look at the base rates.
The one piece of new information which seems potentially important is the part where Sonia Joseph says, "he followed her home and insisted on staying over." I would like to see that incident looked into a bit more.
I don't think we can engage in much "community-wide introspection" without discussing the object-level issues in question, and I can't think of a single instance of an online discussion of that specific issue going particularly well.
That's why I'm (mostly) okay tabooing these sorts of discussions. It's better to deal with the epistemic uncertainty than to risk converging on a false belief.