I think I do agree with some points in this post. This failure mode is the same as the one I mentioned about why people are doing interpretability (see the section "Outside view: The proportion of junior researchers doing Interp rather than other technical work is too high"), and I do think this generalizes somewhat to the whole field of alignment. But I'm highly skeptical that recruiting a bunch of physicists to work on alignment would be that productive:
I've been testing this with @Épiphanie Gédéon for a few months now, and it's really, really good for getting more intellectually challenging work done. In my opinion, the most important point in the post is that it doesn't help that much during peak performance moments, but we're not at our peak that often, so it still matters a lot. It's a big productivity boost, especially for cognitively demanding tasks or tasks where we struggle to "eat the frog". So I highly recommend it.
But the person involved definitely needs to be pretty intelligent to keep up and to make recommendations that aren't useless. Sometimes it can feel more like co-working; there are quite a few different ways to set it up, more or less passive or active. But overall, we recommend trying it for at least a few days.
It took me quite a while to take the plunge because there's a social aspect: this kind of thing isn't very common in France, and it's not considered a real job. Honestly, though, it should be a real job for intellectual professions, in my opinion. And it's not an easy job.
I often find myself revisiting this post—it has profoundly shaped my philosophical understanding of numerous concepts. I think the notion of conflationary alliances introduced here is crucial for identifying and disentangling/dissolving many ambiguous terms and resolving philosophical confusion. I think this applies not only to consciousness but also to situational awareness, pain, interpretability, safety, alignment, and intelligence, to name a few.
I referenced this blog post in my own post, My Intellectual Journey to Dis-solve the Hard Problem of Consciousness, during a period when I was plateauing and making no progress in better understanding consciousness. I now believe that much of my confusion has been resolved.
I think the concept of conflationary alliances is almost indispensable for effective conceptual work in AI safety research. For example, it helps clarify distinctions, such as the difference between "consciousness" and "situational awareness." This will become increasingly important as AI systems grow more capable and public discourse becomes more polarized around their moral status and whether they are conscious.
Highly recommended for anyone seeking clarity in their thinking!
I don't think Tournesol is really mature currently, especially for non-French content, and I'm not sure they are trying to do governance work; it's mainly a technical project, which is already cool.
Yup, we should create an equivalent of the Nutri-Score for different recommendation AIs.
"I really don't know how tractable it would be to pressure compagnies" seems weirdly familiar. We already used the same argument for AGI safety, and we know that governance work is much more tractable than expected.
I'm a bit surprised this post has so little karma and engagement. I would be really interested to hear from people who think this is a complete distraction.
Fair enough.
I think my main problem with this proposal is that, under the current AI paradigm (GPTs, foundation models), I don't see how you would implement ATA, so is this really a priority?
I believe we should not create a Sovereign AI. Developing a goal-directed agent of this kind will always be too dangerous. Instead, we should aim for a scenario similar to CERN, where powerful AI systems are used for research in secure labs, but not deployed in the economy.
I don't want AIs to take over.
What do you think of my point about Scott Aaronson? Also, since you agree with points 2 and 3, it seems you also think that the most useful work from last year didn't require advanced physics, so isn't that a contradiction with your disagreement with point 1?