Agustin_Martinez_Suñe
Agustin_Martinez_Suñe has not written any posts yet.

But in any case, advocates of GS approaches are not, for the most part, talking about estimates; instead, they believe we can obtain strong proofs that effectively guarantee failure rates of 0% for complex AI software systems deployed in the physical world.
I don't think this paragraph's description of the Guaranteed Safe AI approach is accurate or fair. Different individuals may place varying emphasis on the claims involved. If we examine the Guaranteed Safe AI position paper that you mentioned (https://arxiv.org/abs/2405.06624), we'll notice a more nuanced presentation in two key aspects:
1. The safety specification itself may involve probabilities of harmful outcomes. The approach does not rely on guaranteeing a 0% failure rate, ...
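To make this concrete, here is a rough sketch of what a probabilistic safety specification could look like (my own notation, not taken from the position paper):

$$\Pr_{s \sim \mathcal{D}}\big[\text{harm}(\pi, s)\big] \le \varepsilon$$

where $\pi$ is the deployed system, $\mathcal{D}$ is the world model's distribution over deployment situations, and $\varepsilon > 0$ is an agreed risk threshold. The verifier's job is then to certify this bound relative to the world model, not to prove that harm is impossible.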
I enjoyed reading this, especially the introduction to trading zones and boundary objects.
I don’t believe there is a single AI safety agenda that will once and for all "solve" AI safety or AI alignment (and even "solve" doesn’t quite capture the nature of the challenge). Hence, I've been considering safety cases as a way to integrate elements from various technical AI safety approaches, which in my opinion have so far evolved mostly in isolation with limited interaction.
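As a very rough sketch of what that integration could look like (the structure and names below are hypothetical, not drawn from any particular safety-case standard), a safety case is essentially a tree of claims, each discharged by evidence that can come from different technical agendas:

```python
from dataclasses import dataclass, field

@dataclass
class Evidence:
    source: str    # e.g. an interpretability probe, a red-team eval, a formal proof
    supports: str  # the sub-claim this evidence bears on

@dataclass
class Claim:
    statement: str
    evidence: list[Evidence] = field(default_factory=list)
    subclaims: list["Claim"] = field(default_factory=list)

# A top-level safety claim decomposed into sub-claims, each backed by
# evidence produced by a different technical AI safety approach.
case = Claim(
    statement="The deployed system's catastrophic failure rate is below the agreed threshold",
    subclaims=[
        Claim("The model has no deceptively misaligned internal goals",
              evidence=[Evidence("interpretability probe", "no deception-related circuits found")]),
        Claim("Behaviour stays within the specification under distribution shift",
              evidence=[Evidence("red-team eval", "robust to the adversarial inputs tested"),
                        Evidence("formal verification", "invariants hold in the world model")]),
    ],
)
```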
I’m curious about your thoughts on the role of "big science" here. The main example you provide of a trading zone and boundary object involves nation-states collaborating toward a specific, high-stakes warfare objective. While "big science" (large-scale scientific collaboration) isn’t inherently necessary for trading zones to succeed, it might be essential for the specific goal of developing safe advanced AI systems. Any thoughts?