It's hard for me to know what's crux-y without a specific proposal.
I tend to take a dim view of proposals that have specific numbers in them (without equally specific justifications). Examples include the six-month pause and SB 1047.
Again, you can give me an infinite number of demonstrations of "here's people being dumb" and it won't cause me to agree with "therefore we should also make dumb laws".
If you have an evidence-based proposal to reduce specific harms associated with "models follow goals" and "people are dumb", then we can talk price.
“OK then! So you’re telling me: Nothing bad happened, and nothing surprising happened. So why should I change my attitude?”
I consider this an acceptable straw-man of my position.
To be clear, there are some demos that would cause me to update.
For example, I consider "the Solomonoff Prior is Malign" to be basically a failure to do counting correctly. And so if someone demonstrated a natural example of this, I would be forced to update.
Similarly, I think the chance of an EY-style utility-maximizing agent arising from next-token prediction is (with caveats) basically 0%. So if someone demonstrated this, it would update my priors. I am especially unconvinced by the version of this where the next-token predictor simulates a malign agent and the malign agent then hacks its way out of the simulation.
But no matter how many times I am shown "we told the AI to optimize a goal and it optimized the goal... we're all doomed", I will continue to not change my attitude.
Tesla fans will often claim that Tesla could easily do this
Tesla fan here.
Yes, Tesla can easily handle the situation you've described (stop-and-go traffic on a highway in good weather with no construction), with higher reliability than human beings.
I suspect the reason Tesla is not pursuing this particular certification is that, given the current rate of progress, it would be out of date by the time it was authorized. There have been several significant leaps in capability in the last 2 years (11->12, 12->12.6, and I've been told 12->13). Most likely Elon (who has undeniably been over-optimistic) is waiting to get FSD certified until it is at least Level 4.
It's worth noting that Tesla has significantly relaxed the requirements for FSD (from "hands on wheel" to "eyes on road") and has done so for all circumstances, not just optimal ones.
Seems like he could just fake this by writing a note to his best friend that says "during the next approved stock trading window I will sell X shares of GOOG to you for Y dollars".
Admittedly:
1. technically this is a derivative (maybe illegal?)
2. principal agent risk (he might not follow through on the note)
3. his best friend might encourage him to work harder for GOOG to succeed
But I have a hard time believing any of those would be a problem in the real world, assuming TurnTrout and his friend are reasonably virtuous about actually not wanting TurnTrout to make a profit off of GOOG.
You could come up with more complicated versions of the same thing. For example, instead of his best friend, TurnTrout could gift the profit to a for-charity LLC that had AI Alignment as its mandate. This would (assuming it was set up correctly) eliminate 1 and 3.
Isn't there just literally a financial product for this? TurnTrout could sell GOOG forward (or write at-the-money calls) in amounts exactly matching his vesting amounts/times.
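To make the "note to a friend" arrangement concrete, here is a minimal payoff sketch in Python. The share count and prices are hypothetical placeholders (not from the thread); the point is just that TurnTrout's proceeds are locked in at the agreed price, so all of GOOG's upside (and downside) passes to the friend, exactly as it would under a forward sale.

```python
# Hypothetical numbers for illustration only; none of these come from the thread.
SHARES = 100            # "X shares" vesting in the next approved trading window
AGREED_PRICE = 170.0    # "Y dollars" per share promised to the friend in the note

def turntrouts_payoff(market_price: float) -> float:
    """Value TurnTrout realizes under the note: he sells SHARES at AGREED_PRICE,
    regardless of where GOOG trades at vesting time."""
    return SHARES * AGREED_PRICE

def friends_payoff(market_price: float) -> float:
    """The friend captures (or eats) the difference between market and agreed price."""
    return SHARES * (market_price - AGREED_PRICE)

for px in (120.0, 170.0, 250.0):
    print(f"GOOG at {px:6.1f}: TurnTrout gets {turntrouts_payoff(px):>9,.0f}, "
          f"friend gets {friends_payoff(px):>9,.0f}")
# TurnTrout's proceeds are constant across all three prices; the GOOG exposure
# lands entirely on the friend, which is the same profile as a forward sale.
```

The exchange-traded version of the same idea is a forward (or at-the-money calls written against the vesting shares) sized to the vesting schedule, which is what the "financial product" suggestion above is gesturing at.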
Einstein didn't write a half-assed NYT op-ed about how vague 'advances in science' might soon lead to new weapons of war and the USA should do something about that; he wrote a secret letter hand-delivered & pitched to President Roosevelt by a trusted advisor.
Strongly agree.
What other issues might there be with this new ad-hoc'ed strategy...?
I am not a China Hawk. I do not speak for the China Hawks. I 100% concede your argument that these conversations should be taking place in a room that neither you nor I are in right now.
I would like to see them state things a little more clearly than commentators having to guess 'well probably it's supposed to work sorta like this idk?'
Meh. I want the national security establishment to act like a national security establishment. I admit it is frustratingly opaque from the outside, but that does not mean I want more transparency at the cost of it being worse. Tactical Surprise and Strategic Ambiguity are real things with real benefits.
A great example, thank you for reminding me of it as an illustration of the futility of these weak measures which are the available strategies to execute.
I think both can be true: Stuxnet did not stop the Iranian nuclear program, and if there were a "destroy all Chinese long-range weapons and High Performance Computing clusters" button, NATSEC would pound it.
Is your argument that a 1-year head start on AGI is not enough to build such a button, or do you really think it wouldn't be pressed?
It is a major, overt act of war and utter alarming shameful humiliating existential loss of national sovereignty which crosses red lines so red that no one has even had to state them - an invasion that no major power would accept lying down and would likely trigger a major backlash
The game theory implications of China waking up to find that all of their long-range military assets and GPUs have been destroyed are not what you are suggesting. A telling current example is the Iranian non-response to Israel's actions against Hamas/Hezbollah.
Nukes were a hyper-exponential curve too.
While this is a clever play on words, it is not a good argument. There are good reasons to expect AGI to affect the offense-defense balance in ways that are fundamentally different from nuclear weapons.
Because the USA has always looked at the cost of using that 'robust military superiority', which would entail the destruction of Seoul and possibly millions of deaths and the provoking of major geopolitical powers - such as a certain CCP - and decided it was not worth the candle, and blinked, and kicked the can down the road, and after about three decades of can-kicking, ran out of road.
I can't explicitly speak for the China Hawks (not being one myself), but I believe one of the working assumptions is that AGI will allow the "league of free nations" to disarm China without the messiness of millions of deaths. Probably this is supposed to work like EY's "nanobot swarm that melts all of the GPUs".
I agree that the details are a bit fuzzy, but from an external perspective "we don't publicly discuss capabilities" and "there are no adults in the room" are indistinguishable. OpenAI openly admits the plan is "we'll ask the AGI what to do". I suspect NATSEC's position is more like "amateurs discuss tactics, experts discuss logistics" (i.e. securing a decisive advantage is more important than planning out exactly how to melt the GPUs).
To believe that the same group that pulled off Stuxnet and this lacks the imagination or will to use AGI-enabled weapons strikes me as naive, however.
The USA, for example, has always had 'robust military superiority' over many countries it desired to not get nukes, and yet, which did get nukes.
It's also worth noting that AGI is not a zero-to-one event but rather a hyper-exponential curve. Theoretically, it may be possible to always stay far enough ahead to retain a decisive advantage (unlike nukes, where even a handful is enough to establish MAD).
Okay, this at least helps me better understand your position. Maybe you should have opened with "China Hawks won't do the thing they've explicitly and repeatedly said they are going to do."
I do not think arguing about p(doom) in the abstract is a useful exercise. I would prefer the Overton Window for p(doom) look like 2-20%; Zvi thinks it should be 20-80%. But my real disagreement with Zvi is not that his p(doom) is too high, it is that he supports policies that would make things worse.
As for the outlier cases (1-in-a-gazillion or 99.5%), I simply doubt those people are amenable to rational argumentation. So I suspect the best thing to do is to simply wait for reality to catch up to them. I doubt that when there are 100Ms of humanoid robots out on the streets, people will still be asking "but how will the AI kill us?"
That does make me feel better.