cdt
cdt has not written any posts yet.

You may mean phylogenetic inertia.
I think this article would've been far better without talking about Greenpeace. The engagement with Greenpeace is brief and low-context, yet most of the argument relies on the reader accepting your position on Greenpeace as fact.
The new short-form content seems clearly way worse. Imagine children switching from watching old television shows to YouTube Kids or Shorts on that same TV.
I agree. Compared to old television shows, I wonder how small the teams that produce short-form content are, and consequently how few people are able to moderate and judge its appropriateness.
I experience cognitive dissonance, because my model of Eliezer is someone who is intelligent, rational, and aiming at using at least their public communications to increase the chance that AI goes well.
Consider that he is just as human and fallible as everyone else. "None of Eliezer's public communication is -EV for AI safety" is such an incredibly high bar that it is almost certainly not true. We all say ill-considered things sometimes.
Really enjoyed this!!
Quick question: What does the "% similarity" bar mean? It's not obviously functional (GO-based), nor is it obviously structural. Several rounds of practice have been waylaid by my misinterpreting what it means for a protein to be 95% similar to the target...
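For concreteness, here is a toy sketch of one plausible reading of that bar, naive per-residue sequence identity over a pairwise alignment. The sequences and the helper function are invented for illustration, and the site's actual metric may instead be structural (e.g. TM-score) or GO-based, which is exactly the ambiguity being asked about:

```python
# One plausible interpretation of "% similarity": per-residue sequence
# identity over pre-aligned sequences. Everything here is illustrative.

def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions with matching residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

target    = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # made-up sequences
candidate = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVA"
print(f"{percent_identity(target, candidate):.1f}% identical")  # 97.0%
```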
I'm pleased to see this, but giving me credit (or blame) for it is far too generous. It seems many other people have also enjoyed reading it.
I do feel these "reflexive ick reactions to the ideas", and it is interesting how orthogonal they are to the typical concerns around horizon scanning or post-AGI thought (e.g. coup risk).
Please put this in a top-level post. I don't agree (or rather I don't feel it's this simple), but I really enjoyed reading your two rejoinders here.
I particularly dislike that this topic has stretched into psychoanalysis (of Anthropic staff, of Mikhail Samin, of Richard Ngo) when I felt that the best part of this article was its groundedness in fact and nonreliance on speculation. Psychoanalysis of this nature is of dubious use and pretty unfriendly.
Any decision to work with people you don't know personally that relies on guessing their inner psychology is doomed to fail.
The post contains one explicit call-to-action:
If you are considering joining Anthropic in a non-safety role, I ask you to, besides the general questions, carefully consider the evidence and ask yourself in which direction it is pointing, and whether Anthropic and its leadership, in their current form, are what they present themselves as and are worthy of your trust.
If you work at Anthropic, I ask you to try to better understand the decision-making of the company and to seriously consider stopping work on advancing general AI capabilities or pressuring the company for stronger governance.
This targets a very small proportion of people who read this article. Is there another way we could operationalize this work, one that targets people who aren't working/aiming to work at Anthropic?
From discussing AI politics with the general public [i.e. not experts], it seems that public perception of AI progress is bifurcating into two camps:
A) Current AI progress is sudden and warrants a response (either acceleration or regulation)
B) Current AI progress is a flash-in-the-pan or a nothingburger.
(This is independent from responding to hypothetical AI-in-concept.)
These perspectives differ largely on facts rather than ideology. In conversation, the tension between these two incompatible perspectives is really obvious, and it makes it hard to hold meaningful conversations without coming across as overbearing or accusatory.
Where does this divide come from? Is it the image hangover from the public's interaction with the first ChatGPT? How can we bridge this when speaking to the average person?
Thanks for doing this work, this is a really important paper in my view.
One question that sprang to mind when reading this: to what extent do the disempowerment primitives and amplifying factors correlate with each other? i.e. are conversations that contain one primitive likely to contain others? Ditto for the amplifying factors?
The impression I took away is that these elements were treated as independent, which strikes me as a reasonable modelling assumption, but the rate estimates seem quite sensitive to it. Would be happy to be proven wrong.
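To make the worry concrete, here is a toy numerical sketch (all probabilities are hypothetical, not taken from the paper) of how co-occurrence between two primitives moves the joint rate away from the independence estimate:

```python
# Toy sketch: joint rate of two "disempowerment primitives" under
# independence vs. positive correlation. All numbers are hypothetical.

p_a = 0.02  # hypothetical marginal rate of primitive A per conversation
p_b = 0.03  # hypothetical marginal rate of primitive B per conversation

# Independence assumption: P(A and B) = P(A) * P(B)
p_both_indep = p_a * p_b  # 0.0006

# Positive correlation: suppose half of A-conversations also contain B
p_b_given_a = 0.5
p_both_corr = p_a * p_b_given_a  # 0.01, roughly 17x the independent estimate

print(f"independent: {p_both_indep:.4f}  correlated: {p_both_corr:.4f}")
```

Even modest co-occurrence like this would shift an estimate of "conversations containing multiple primitives" by an order of magnitude, which is why the independence assumption seemed worth flagging.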