Haiku

Co-founder of AI-Plans and volunteer with PauseAI.

The risk of human extinction from artificial intelligence is a near-term threat. Time is short, p(doom) is high, and anyone can take simple, practical actions right now to help prevent the worst outcomes.

Comments

Haiku · 122

On a first reading of this summary, I take it as a small positive update on the quality of near-term AI Safety thinking at DeepMind. It treads familiar ground and doesn't raise any major red flags for me.

The two things I would most like to know are:

  1. Will DeepMind commit to halting frontier AGI research if they cannot provide a robust safety case for systems that are estimated to have a non-trivial chance of significantly harming humanity?
  2. Does the safety team have veto power on development or deployment of systems they deem to be unsafe?

A "yes" to both would be a very positive surprise to me, and would lead me to additionally ask DeepMind to publicly support a global treaty that implements these bare-minimum policies in a way that can be verified and enforced.

A "no" to either would mean this work falls under milling behavior, and will not meaningfully contribute toward keeping humanity safe from DeepMind's own actions.

Haiku · 10

The answer is to fight as hard as humanly possible right now to get the governments of the world to shut down all frontier AI development immediately. For two years, I have heard no other plan that comes within an order of magnitude of this one in viability.

I still expect to die by default, but we won't get lucky without a lot of work. CPR only works 10% of the time, but it works 0% of the time when you don't do it.

Haiku · 10

Only the sane are reaching for water.

Haiku · 21

I'm glad we now have a study to point to! "Automated Spear Phishing at scale" has been a common talking point regarding current risks from AI, and it always seemed strange to me that I hadn't heard about this strategy being validated. This paper shows that the commonly shared intuition about this risk was correct... and I'm still confused about why I haven't yet heard of this strategy being maximally exploited by scammers.

Haiku · 20

The reasoning you gave sounds sensible, but it doesn't comport with observations. Only questions with a small number of predictors (e.g. n < 10) appear to have significant problems with misaligned incentives, and even then, those issues come up only a small minority of the time.

I believe that is because the Metaculus culture of predicting one's true beliefs tends to override any other incentive that comes downstream of being interested enough in a topic to have an opinion on it.

Time can be a factor, but less so for long-shot conditionals or questions with long time horizons. The time investment to predict on a question you don't expect to update regularly can be on the order of one minute.

Some forecasters aim to maximize baseline score, and some aim to maximize peer score. That influences each forecaster's decision whether to predict, but it doesn't seem to have a significant impact on the aggregate. Maximizing peer score incentivizes forecasters to stay away from questions where they strongly agree with the community. (That choice doesn't affect the community prediction in those cases.) Maximizing baseline score incentivizes forecasters to stay away from questions on which they would predict with high uncertainty, which slightly selects for people who at least believe they have some insight.
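To make that incentive difference concrete, here is a minimal sketch in Python of the two scoring targets, assuming a simplified log-score formulation on a binary question; the `baseline_score` and `peer_score` functions and the community numbers are illustrative assumptions, not Metaculus's exact formulas or data.

```python
import math

def baseline_score(p: float, resolved_yes: bool) -> float:
    """Score relative to an ignorant 50/50 prior (toy version)."""
    q = p if resolved_yes else 1.0 - p
    return 100.0 * (math.log(q) - math.log(0.5)) / math.log(2.0)

def peer_score(p: float, others: list[float], resolved_yes: bool) -> float:
    """Score relative to the mean log score of the other forecasters (toy version)."""
    def log_score(x: float) -> float:
        return math.log(x if resolved_yes else 1.0 - x)
    mean_others = sum(log_score(x) for x in others) / len(others)
    return 100.0 * (log_score(p) - mean_others) / math.log(2.0)

community = [0.70, 0.72, 0.68]  # hypothetical predictions from other forecasters

# Agreeing with the community: peer score is ~0, so a peer-score maximizer
# gains little by piling on, even though the baseline score is solidly positive.
print(baseline_score(0.70, True), peer_score(0.70, community, True))

# Predicting near 50%: baseline score is ~0, so a baseline-score maximizer
# avoids questions where they are highly uncertain.
print(baseline_score(0.51, True), peer_score(0.51, community, True))
```

Under this toy model, agreeing with the community yields a near-zero peer score and predicting near 50% yields a near-zero baseline score, which is the asymmetry described above.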

Questions that would only resolve in 100 years, or only if something crazy happens, have essentially no bearing on scoring, so with no external incentive in any direction, people do what they want on those questions, which is almost always to predict their true beliefs.

Haiku · 10

Metaculus does not have this problem, since it is not a market and there is no cost to making a prediction. I therefore expect long-shot conditionals on Metaculus to be more meaningful, since everyone is incentivized to predict their true beliefs.

Haiku · 32

Not building a superintelligence at all is best. This whole exchange started with Sam Altman apparently failing to notice that governments exist and can break markets (and scientists) out of negative-sum games.

Haiku · 48

That requires interpretation, which can introduce unintended editorializing. If you spotted the intent, the rest of the audience can as well. (And if the audience is confused about intent, the original recipients may have been as well.)

I personally would include these sorts of notes about typos if I were writing my own thoughts about the original content, or if I were sharing a piece of it for a specific purpose. I take the intent of this post to be more of a form of accessible archiving.

Haiku · 30

I used to be a creationist, and I have put some thought into this stumbling block. I came to the conclusion that it isn't worth leaving out analogies to evolution, because the style of argument that would work best for most creationists is completely different to begin with. Creationism is correlated with religious conservatism, and most religious conservatives outright deny that human extinction is a possibility.

The Compendium isn't meant for that audience, because it explicitly presents a worldview, and religious conservatives tend to strongly resist shifts to their worldviews or the adoption of new worldviews (more so than others already do). I think it is best left to other orgs to make arguments about AI Risk that are specifically friendly to religious conservatism. (This isn't entirely hypothetical. PauseAI US has recently begun to make inroads with religious organizations.)

Haiku · 66

I don't find any use for the concept of fuzzy truth, primarily because I don't believe that such a thing meaningfully exists. The fact that I can communicate poorly does not imply that the environment itself is not a very specific way. To better grasp the specific way that things actually are, I should communicate less poorly. Everything is the way that it is, without a moment of regard for what tools (including language) we may use to grasp at it.

(In the case of quantum fluctuations, the very specific way that things are involves precise probabilistic states. The reality of superposition does not negate the above.)
