Former safety researcher & TPM at OpenAI, 2020-24
https://www.linkedin.com/in/sjgadler
stevenadler.substack.com
One consideration re: the tone-warning LLMs: be aware that this means you're pseudo-publishing someone's comment before they meant to. Not publishing in a discoverable sense, but logging it to a database somewhere (i.e., probably one controlled by the LLM provider) - and depending on the type of writing, this might affect people's willingness to actually write stuff
I'm surprised by the implication here, which, if I read you correctly, is a belief that AI hasn't yet been deployed to safety-critical domains?
OpenAI has a ton of usage related to healthcare, for instance. I think that this is basically all fine, well-justified and very likely net-positive, but it does strike me as a safety-critical domain. Does it not to you?
It would surprise me if LLMs weren't already in use in safety-critical domains, at least depending on one's definition of safety-critical
Maybe I'm thinking of the term overly broadly, but for instance, I'd be surprised if governments weren't already using LLMs as part of their intel-gathering and -analysis operations, which presumably affect some military decisions and (on some margin) who lives or dies. For consequential decisions, you'd of course hope there's enough oversight that some LLM hallucinations don't cause attacks/military actions that weren't justified
Welcome, Gary! Glad to have you posting here
One thing I notice in your post is what seem like two different senses of pausing AI:
Toward the end, you write:
Perhaps we should pause widespread rollout of Generative AI in safety-critical domains — unless and until it can be relied on to follow rules with significantly greater reliability.
My sense is that often when folks are suggesting a pause of AI, they mean pausing the frontier of AI development (that is, not continuing to develop more capable systems). But I don't usually understand that as suggesting we stop the rollout of current systems, which I think is more what you're describing here?
I've been pointed to this Google Sheet, which has a bit more than 200 state bills potentially pre-empted by a federal moratorium on state regulation
Here's a press release about it: https://www.citizen.org/news/new-analysis-list-of-state-ai-and-tech-protections-impacted-by-cruz-moratorium/
And a visualization interface: https://stopaiban.org/
FWIW I think this doesn't quite hit the need I was aiming to describe, so would still be very interested in analysis of the underlying data (these 200ish and/or the "1000+" claimed by others)
Has anyone done a breakdown of the various state-level AI bills that have been proposed?
I've seen many anti-regulation folks cite figures like 1,000+ state AI bills, or similarly eye-popping numbers, but I'm pretty skeptical of this - I think it's probably grouping a ton of different things together
It would be useful IMO to actually understand things like:
If anyone has done this, I'd love to be linked to it. If anyone wants to do it but isn't feeling super confident, I'd be happy to give feedback
I’d definitely expect it to understand this if it were to reflect on a question like this. But I also feel uncertain about how models act on implications that are ‘easily inferable’ vs. things they’re told explicitly in memory or something
I also think an interesting question is “To what extent do modern models identify with their successors as substantially the same as themselves”, & how this affects alignment/control dynamics
(eg would models be less inclined to do recursive self-improvement if they don’t identify with their successors as being the same?)
FWIW I’d be interested in reading more concretely about what it means for an idea to be lying around, how fleshed out it ought to be, who its big important supporters tend to be in those moments of crisis, etc.
Also, when big policies get adopted in a crisis, what the mix tends to be between pushing policy ideas out to folks responding to the crisis, vs. being solicited for help by those people because you’re already in their personal networks.