slowing down LLM progress would be dangerous, as other approaches like RL agents would pass them by before appearing dangerous.
This seems misleading to me & might be a false dichotomy. It's not LLMs or RL agents. I think we'll (unfortunately) build agents on the basis of LLMs & the capabilities they have. Every further advance in LLMs gives these agents more capabilities, faster, with less time for alignment. They will be (and are!) built based on the mere (perceived) incentives of everybody involved & the unilateralist's curse. (See esp. Gwern...
Neither quite captures it, imo. I think it’s mostly an attempt at deconfusion:
Our current LLMs like GPT-4 are not, in their base configurations, agents. They do not have goals.
What is the difference between being an agent and not being an agent here? Goals seem like the obvious criterion, but GPT-4 also minimized its loss during training, and perhaps still does as they keep tweaking it. Is the implied difference that base GPT-4 is no longer minimizing its loss (which is its goal in some sense), or that it does not minimize it continually? If so, the distinction seems quite fuzzy, since you'd have to concede the same for an AutoGPT where you ...
I think most comments regarding the covid analogy miss the point made in the post. Leopold's case is that there will be a societal moment of realization, not that specific covid measures were good and that this should give us hope.
Right now talking about AI risk is like yelling about covid in Feb 2020.
I agree with this, and with there likely being a wake-up moment. This seems important to realize!
I think unless one both has an extremely fast takeoff model and doesn’t expect many more misaligned AI models with increases in capabilities to be ...
If you have multiple AI agents in mind here, then yes, though it's not the focus. Otherwise, also yes, though depending on what you mean by "multiple agent systems acting as one", one could also see this as fulfilled by some agents dominating other agents so that they (have to) act as one. I'd rather put it as: the difficulty of predicting other agents' actions and goals leads to the risks from AI, and to the difficulty of the alignment and control problems.