Three of the big AI labs say that they care about alignment and that they think misaligned AI poses a potentially existential threat to humanity. These labs continue to try to build AGI. I think this is a very bad idea.
The leaders of the big labs are clear that they do not know how to build safe, aligned AGI. The current best plan is to punt the problem to a (different) AI,[1] and hope that it can solve it. It seems like a clearly bad idea to try to build AGI when you don’t know how to control it, especially if you readily admit that misaligned AGI could cause extinction.
But there are certain reasons that make trying to build AGI a more reasonable thing to do, for example:
- They want to build AGI first because they think this is better than if a less safety-focused lab builds it
- They are worried about multi-polar scenarios
- They are worried about competition from other nations, specifically from China
- They think one needs to be able to work hands-on with today’s big models in order to align the even bigger models to come, and some other factor means we will soon have those bigger models and need to align them
I think the labs should be explicit that they are attempting to build AGI[2], that this is not safe, and that specific reasons nonetheless lead them to think it is the best course of action, and that if those specific reasons no longer hold, they will stop scaling or attempting to build AGI. They should be clear about what these reasons are, both to the public and to policy makers.
I want a statement like:
We are attempting to build AGI, which is very dangerous and could cause human extinction. We are doing this because of the specific situation we are in.[3] We wish we didn’t have to do this, but given the state of the world we feel that we have to, and that doing so reduces the chance of human extinction. If we were not in this specific situation, then we would stop attempting to build AGI. If we noticed [specific, verifiable observations about the world], then we would strongly consider stopping our attempt to build AGI.
Without statements like this, I think labs should not be surprised if others think they are recklessly trying to build AGI.
I had a thought. When comparing the parameter counts of LLMs to brain synapse counts, for parity the parameter count of each attention head should be multiplied by the number of positions it can attend to, or at least by the logarithm of that number. That adjustment accounts for roughly an order of magnitude of the disparity, leaving a remaining gap of something like 2-3 orders of magnitude. A gap of that size sounds rather more plausible as the distance from sparks of AGI to full AGI.
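As a rough illustration of the adjustment, here is a minimal back-of-envelope sketch. All of the numbers in it (total parameter count, attention fraction, context length, synapse count) are made-up but plausible assumptions, not figures from the comparison above:

```python
import math

# Back-of-envelope sketch of the attention-head adjustment.
# All numbers below are illustrative assumptions, not measured values.
total_params = 1e12         # assumed total parameter count of a large LLM
attention_fraction = 1 / 3  # assumed share of parameters sitting in attention heads
context_length = 1e5        # assumed number of positions each head can attend to
synapses = 1e15             # rough upper-end estimate of human synapse count

attention_params = total_params * attention_fraction
other_params = total_params - attention_params

# Logarithmic variant of the adjustment: weight attention parameters by
# log(context_length). Multiplying by the full position count instead would
# be a much larger correction; the base of the logarithm only shifts the
# result by a small constant factor.
adjusted = other_params + attention_params * math.log(context_length)

for name, params in [("raw", total_params), ("log-adjusted", adjusted)]:
    gap = math.log10(synapses / params)
    print(f"{name}: ~10^{math.log10(params):.1f} effective parameters, "
          f"gap to synapse count ≈ {gap:.1f} orders of magnitude")
```

With these assumed numbers, the logarithmic adjustment closes roughly one order of magnitude of the gap, which is the shape of the claim above.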