- Tell me father, when is the line
where ends everything good and fine?
I keep searching, but I don't find.
- The line my son, is just behind.
Camille Berger
There is hope that some “warning shot” would help humanity get its act together and change its trajectory to avoid extinction from AI. However, I don't think that's necessarily true.
There may be a threshold beyond which the development and deployment of advanced AI becomes essentially irreversible and inevitably leads to existential catastrophe. Humans might be happy, not even realizing that they are already doomed. There is a difference between the “point of no return” and “extinction.” We may cross the point of no return without realizing it. Any useful warning shot should happen before this point of no return.
We will need a very...
I'm not sure what you meant about studying transistors.
It seems to me that if we are studying transistors so hard, it's to push computers' capabilities (faster, smaller, more energy efficient, etc.), and not at all to make software safer. Instead, to make software safer, we use anti-viruses, automatic testing, developer liability, standards, regulations, pop-up warnings, etc.
This is the write-up for the capstone project that @cozyfractal and I completed during ARENA's 2023 summer iteration. Our project explored a novel approach for interpreting language models, focusing on understanding their internal flow of information. While the practical implementation was completed in just one week and lacks formal rigor, we believe it offers some interesting insights and holds promise as a foundation for future research in this area. The accompanying repository with code examples and more experiments can be found here.
We want to thank Alexandre Variengien whose original idea served as the inspiration for this work, and who provided extensive feedback as well as thought-provoking discussions. Additionally, we want to express our thanks to the organizers of ARENA and our fellow participants for fostering an environment that encouraged...
It's the horizontal difference that matters and not the vertical one, so the water boils about 200 s earlier, or 20% faster (according to this one experiment), which is quite nice!
Thank you for bringing those four ideas into one nicely written post! It helped me have a better overview of what happens inside transformers, even though I had worked with each idea independently before :)
I agree, that's an important point. I probably worry more about your first possibility, as we are already seeing this effect today, and worry less about the second, which would require a level of resignation that I've rarely seen. Entities that are responsible would likely try to do something about it, but the ways this “we're doomed, let's profit” might happen are:
- The warning shot comes from a small player and a bigger player feels urgency or feels threatened, in a situation where they have little control
- There is no clear responsibility and there are many...