Running Lightcone Infrastructure, which runs LessWrong and Lighthaven.space. You can reach me at habryka@lesswrong.com.
(I have signed no contracts or agreements whose existence I cannot mention, which I am mentioning here as a canary)
Yep, I am generally on the record thinking that Deepmind's safety team is doing the best work on a few different dimensions (including taking the existential risk problem most straightforwardly seriously, and generally contributing to discourse and sanity in the space the most).
This isn't to say I think Deepmind is a great organization! I think a non-trivial reason for my optimism here comes from the fact that the Deepmind safety team is a much smaller part of Deepmind than the Anthropic and OpenAI safety teams are of their organizations, which allows them to specialize more in the important things, and to speak more freely, because their statements don't cause everyone to panic and aren't taken as representative of the org.
I mean, it seems like a straightforward case of specification gaming, which is right there in the title of the paper, "Demonstrating specification gaming in reasoning models". I haven't reread the Time article, so maybe you are referencing something in there, but the paper seems straightforward and true (and, like, it's up to the reader to decide how to interpret the evidence of specification gaming in language models).
"Giving LLMs ambiguous tasks" is like, the foundation of specification gaming. It's literally in the name. The good old specification gaming boat of course was "attempting to perform well at an ambiguous task". That doesn't make it an uninteresting thing to study!
It's possible that I am missing something in the paper and there is something more egregious happening.
Also briefly:
I think that your policies regarding providing value to Sam Altman should be transparent to your (potential) donors;
I agree! A picture of Sam Altman at Lighthaven is literally in the top-level post you are commenting on!
I am not going to comment much on this because I really have already spent an enormous amount of time clarifying my perspective here in the threads I linked. I am happy to answer other people's questions or address their misunderstandings, so if someone wants to second any specific part of this, I am happy to clarify more. I will only address one thing:
It's not really clear to me to what extent you didn't communicate your policy well back then, or changed it on your own in the meantime (what caused it?), or changed it because of the pushback, or what.
My policy has not changed at all!
It would continue to be surprising to me if we never hosted teams from AI scaling labs here, at a potentially very wide variety of taxation levels. If the Deepmind alignment team wanted to host something here, I would probably give them a substantial discount! If the Anthropic alignment team wanted to host something here, I would probably charge them a small but not enormous tax. If an OpenAI capabilities team wanted to host something here, the tax would be much higher, as I have clarified.
Like, look man, communication is hard. I am not saying it's totally unreasonable to walk away with the understanding that we would at most charge a modest tax for even the most vile events, but I have now tried to clarify this 5+ times with you, and you keep making new top-level posts that don't link to any of the previous discussion, while speaking with great confidence.
My guess is we would disagree about what appropriate levels of taxation are. If someone is genuinely curious about my policy here and wants to propose concrete counterfactuals about what kind of premium we would charge for different events, I would be happy to give my best guesses.
Lighthaven wants to be an impartial event venue.
Come on, please stop summarizing my positions in an authoritative tone. You keep giving extremely misleading summaries of my positions, even though I keep asking you to stop. Even this exact point I have already clarified like 4 different times in like 4 different contexts (1, 2, 3, 4). At least link to those or acknowledge the previous discussion!
Lighthaven is not an "impartial event venue" in the way you describe here! Lighthaven is run centrally with the purpose of trying to make the world better, on the basis of a quite opinionated worldview!
And part of that worldview is to not exclude people from spaces I run because I disagree with them! I do not ban people I disagree with on LessWrong! It would be a terribly dumb mistake to do so. It would similarly be a terribly dumb mistake to ban events from inviting people they want to talk to, that I disagree with.
This would impose huge pressures of conformity on everyone close to me. Both LessWrong and events at Lighthaven are largely institutions of discourse. You don't get better at truth-seeking if you ban anyone who disagrees with you from talking to you or your community![1]
We don't just provide services to whoever the highest bidder is. We very heavily seek out clients who do stuff that we think is valuable to the world, as should be obvious to anyone who takes a look at our event calendar. That event calendar is not the calendar of a completely random hotel or conference venue!
When we have clients who we think aren't doing things we are excited about, or who are maybe even actively harming the world, we charge (potentially enormous) premiums. If Sam Altman himself wanted to run an event here that didn't seem to me like it would improve some kind of important diplomatic conversation or facilitate some safety-relevant coordination, but was centrally just making OpenAI better at building ASI faster, we would end up charging a huge premium. I don't know how much; it depends on the details and how bad the externalities of the specific event would be, but it would not be cheap!
Of course, you can disagree with our decisions on who to work with. If you really want us to ban the Progress Community from running events at Lighthaven because they want to invite Sam Altman, or want us to ban Manifest for inviting Hanania to their event, say so directly! But please don't try to misrepresent this as some kind of general policy to just rent to the highest bidder.
I have already clarified this like 4 times. Please, for the love of god, change how you present these things, or at the very least link or mention the previous discussion. It's extremely frustrating.
I do think there are valid reasons to exclude someone from a space like Lighthaven. I think Sam Altman meets many of those criteria, but not enough to be worth taking an action as drastic as cutting ties with any organization that wants to invite him to events they host here. Whether you like it or not, Sam Altman is an important stakeholder in AI whom many people want to talk to for good and valid reasons, and he isn't violent or disruptive in a way that would make discourse around him much more difficult and so justify excluding him on that basis.
Honestly, the statement was so clearly wrong on its straightforward interpretation that my guess was you obviously meant to convey something different, but then I didn't really put in the effort to reflect on what that was.
I found this quite helpful, thank you!
I think it’s probably pretty easy to identify the top political candidates. These are the people like Alex Bores, who have a track record of getting hard legislation through their legislature and leave public evidence of strongly understanding the issue TODO LINK
A link would actually be nice.
I don't understand. Like, dozens of people have linked to evidence of LW being early on COVID, compared to basically any other reference class of smart, educated people who didn't specifically specialize in disease prevention (and arguably even compared to those who did, LW took it more seriously).
What do you mean by "early"? Is this some kind of weird definitional dispute? It was extremely blatantly obvious that while I was preparing with my friends and trying to gather evidence about disease spread, and mortality rates, that practically no one else was taking it seriously. Also, lots of people we know made huge returns in the market betting on their COVID beliefs, which also clearly indicates the same thing (that LW was substantially earlier than comparably smart other groups).
Lots of people have given lots of evidence! I don't know what you want!