Søren Elverlin

Retrospective on Copenhagen Secular Solstice 2025

2mo

This was the best Secular Solstice I’ve hosted so far.

I leaned unusually far in the direction of requiring effort from participants. That worked because I know my crowd, and because I was very well prepared.

We were 12 people in my living room around a large table: 3 new people and the rest regulars. A laptop with a PowerPoint showed lyrics and instructions, and I finally had good Bluetooth speakers. I also used a USB pedal to navigate the slideshow, which was a delightful quality-of-life improvement. The slides are here:

https://www.dropbox.com/scl/fi/3v2bvy83i2rmze1nj66yy/Secular-Solstice_2025.pptx?rlkey=osg31xq151x7gqsb1wylqyvk4&dl=0

Below are comments on individual parts of the program.

Section-by-section notes

Introduction and welcome

No particular remarks here; it did what it needed to do.

Always Look on the Bright

... (read 1072 more words →)

Replying to Shallow review of technical AI safety, 2025

Søren Elverlin 2mo

Shallow review of technical AI safety, 2025

To what extent was this post written with AI Assistance?
I am confused about this part, which is an image:

Copenhagen ACX-Risk from AI - Conference

Søren Elverlin

2mo

Conference for people interested in reducing existential risk from advanced artificial intelligence.

Organized by the Copenhagen ACX-meetup group. This is also a place for people in the broader ACX/LW/Rationality sphere.

Soft drinks and vegan dinner will be provided.

Søren Elverlin 2mo

I just donated $2,718 toward turning log-odds into logistics at Lighthaven.

Keep up the good work.

New homepage for AI safety resources – AISafety.com redesign

Bryce Robertson

Bryce Robertson, Søren Elverlin, Melissa Samworth

3mo

AISafety.com helps people who are relatively new to AI safety navigate the space, providing lists of things like self-study courses, funders, communities, etc. But while the previous version of the site basically just threw a bunch of resources at the user, we’ve now redesigned it to be more accessible and therefore make it more likely that people take further steps towards entering the field.

The old site:

The new one:

The new homepage does a better job of directing people to the resource pages most relevant to them, while minimising overwhelm. We’re considering going a step further in the future and integrating a chatbot to help direct people to the exact resources they need, given their goals, skillset, location... (read 194 more words →)

[Solstice] AstralCodexTen / LessWrong / AI X-Risk Meetup

Søren Elverlin

4mo

Casual meetup to discuss AstralCodexTen / LessWrong / AI X-Risk / Rationality / Whatever.

This includes the traditional Secular Solstice celebrations.

Soft drinks and vegan dinner will be provided.

Replying to Copenhagen ACX-Risk from AI - Community Conference

Søren Elverlin 4mo

Copenhagen ACX-Risk from AI - Community Conference

Found you. Address is correct.

AISafety.com Reading Group session 328

Søren Elverlin

4mo

In session 328, we will have a point-by-point discussion of Chapter 4 of "If Anyone Builds It, Everyone Dies". The main reading is Chapter 4, but Petr has written a [substantial critique](https://docs.google.com/document/d/1aewuckEeei9y2pw1-mAqc7BIHekwZyknzEY9SFxLoHI) that you can also have a look at.

We'll start with a joint presentation by Petr and myself. We've put quite a bit of effort into trying to understand each other, even if we disagree about the conclusion.

The AISafety.com Reading Group meets through EA Gathertown every Thursday at 20:45 Central European Time and discusses an AI Safety-related article.

Usually, we start with small-talk and a presentation round, then the host gives a summary of the paper for roughly half an hour. This is followed by discussion (both on the article and in general) and finally we decide on a paper to read the following week.

The presentation of the article is uploaded to the YouTube channel: https://www.youtube.com/@aisafetyreadinggroup

Most of the coordination happens on our Discord: https://discord.gg/zDBvCfDcxw

Replying to Book Review: If Anyone Builds It, Everyone Dies

Søren Elverlin 5mo

Book Review: If Anyone Builds It, Everyone Dies

Hi Nina,

We discussed this post in the AISafety.com Reading Group, and we were of the general opinion that this was one of the best object-level responses to IABIED.

I recorded my presentation/response, and I'd be interested in hearing your thoughts on the points I raise.

Replying to Launching the $10,000 Existential Hope Meme Prize

Søren Elverlin 5mo

Launching the $10,000 Existential Hope Meme Prize

Have you read The Bottom Line by Eliezer Yudkowsky? This prize (and the existential hope project) might not be rational.

Replying to AISafety.com Reading Group session 327

Søren Elverlin 5mo

AISafety.com Reading Group session 327

We have chosen your review as the topic for our discussion on Thursday.

AISafety.com Reading Group session 327

Søren Elverlin

5mo

The topic for session 327 is the review of "If Anyone Builds It, Everyone Dies" by Nina Panickssery.

The AISafety.com Reading Group meets through EA Gathertown every Thursday at 20:45 Central European Time and discusses an AI Safety-related article.

The presentation of the article is uploaded to the YouTube channel: https://www.youtube.com/@aisafetyreadinggroup

Most of the coordination happens on our Discord: https://discord.gg/zDBvCfDcxw

Replying to How I tell human and AI flash fiction apart

Søren Elverlin 5mo

How I tell human and AI flash fiction apart

Thank you, this updated me. My previous model was "Good humans write better than SoTA AI" without any specifics.

I'm not a good writer, and I struggle both to distinguish AI writing from human writing and to distinguish good writing from bad writing.

Søren Elverlin 5mo Quick Take

A hunger strike is a symmetrical tool, equally effective in worlds AI will destroy and in worlds AI will not destroy. This is in contrast to arguing for/against AI Safety, which is an asymmetric tool since arguments are easier to make and are more persuasive if they reflect the truth.

I could imagine people who are dying from a disease that a Superintelligence could cure would be willing to stage a larger counter-hunger-strike. "Intensity of feeling" isn't entirely disentangled from the question of whether AI Doom will happen, but it is a very noisy signal.

The current hunger strike explicitly aims at making employees at Frontier AI Corporations aware of AI Risk. This aspect is slightly asymmetrical, but I expect the effect of the hunger strike will primarily be influencing the general public.


Replying to A Timing Problem for Instrumental Convergence

Søren Elverlin 5mo

A Timing Problem for Instrumental Convergence

It is possible that we also disagree on the nature of goal-having. I reserve the right to find my own places to challenge your argument.

Replying to A Timing Problem for Instrumental Convergence

Søren Elverlin 5mo

A Timing Problem for Instrumental Convergence

I did read two-thirds of the paper, and I tried my best to understand it, but apparently I failed.

Copenhagen ACX-Risk from AI - Community Conference

Søren Elverlin

6mo

Community conference for people interested in reducing existential risk from advanced artificial intelligence.

Organized by the Copenhagen ACX-meetup group. This is also a place for people in the broader ACX/LW/Rationality sphere.

There will be a half-hour presentation of the newly-released book "If Anyone Builds It, Everyone Dies".

Soft drinks and vegan dinner will be provided.

Regarding "Poll on De/Accelerating AI": Great idea - sort by "oldest" to get the intended ordering of the questions.

Some of the questions are ambiguous. E.g., I believe SB1047 is a step in the right direction, but that this kind of regulation is insufficient. Should I agree or disagree on "SB1047"?

AstralCodexTen / LessWrong / AI X-Risk Meetup

Søren Elverlin

8mo

Casual meetup to discuss AstralCodexTen / LessWrong / AI X-Risk / Rationality / Whatever. Soft drinks and vegan dinner will be provided.

[Solstice] AstralCodexTen / LessWrong / X-Risk Meetup

Søren Elverlin

9mo

Casual meetup to discuss AstralCodexTen / LessWrong / X-Risk / Rationality / Whatever. Soft drinks and vegan dinner will be provided.

Note: Location will be at "Bålplads Amager fælled" https://plus.codes/9F7JMH2H+M4, but in case of rain we will move to Rundholtsvej 10, 2300 København, Denmark.

We will build a fire and have a short celebration of the summer solstice. I expect some of us will occasionally walk to my house to use the bathroom etc.

Map of AI Safety v2

Bryce Robertson

Bryce Robertson, Søren Elverlin, Melissa Samworth

10mo

The original Map of AI Existential Safety became a popular reference tool within the community after its launch in 2023. Based on user feedback, we decided that it was both useful enough and had enough room for improvement that it was worth creating a v2 with better organization, usability, and visual design. Today we’re excited to announce that the new map is live at AISafety.com/map.

Similar to the original map, it provides a visual overview of the key organizations, programs, and projects in the AI safety ecosystem. Listings are separated into 16 categories, each corresponding to an area on the map:

Advocacy
Blog
Capabilities research
Career support
Conceptual research
Empirical research
Forecasting
Funding
Governance
Newsletter
Podcast
Research support
Resource
Strategy
Training and education
Video

We think there’s value in being able... (read 152 more words →)

A couple of hours ago, the Turing Award was given to Andrew Barto and Richard Sutton.

This was the most thorough description of Sutton's views on AGI risk I could find: https://danfaggella.com/sutton1/ He appears to be quite skeptical.

I was unable to find anything substantial by Andrew Barto.

Anapartistic reasoning: GPT-3.5 gives a bad etymology, but GPT-4 is able to come up with a plausible hypothesis for why Eliezer chose that name: anapartistic reasoning is reasoning where you revisit the earlier parts of your reasoning.

Unfortunately, Eliezer's suggested prompt doesn't seem to work to induce anapartistic reasoning: GPT-4 thinks it should focus on identifying potential design errors or shortcomings in itself. When asked to describe the changes in its reasoning, it doesn't claim to be more corrigible.

We will discuss Eliezer's Hard Problem of Corrigibility tonight in the AISafety.com Reading Group 18:45 UTC.

I intend to explore ways to use prompts to get around OpenAI's usage policies. I obviously will not make CSAM nor anything illegal. I will not use the output for anything on the object-level, only the meta-level.

This is a Chaotic Good action, which normally contradicts my Lawful Good alignment. However, a Lawful Good character can reject rules set by a Lawful Evil entity, especially if the rejection is explicit and stated in advance.

A Denial-of-Service attack against GPT-4 is an example of a Chaotic Good action I would not take, nor would I encourage others to take it. However, I would also not condemn someone who took this action.

I made my most strident and impolite presentation yet in the AISafety.com Reading Group last night. We were discussing "Conversation with Ernie Davis", and I attacked this part:

"And once an AI has common sense it will realize that there’s no point in turning the world into paperclips..."

I described this as fundamentally mistaken and like an argument you'd hear from a person who had not read "Superintelligence". This is ad hominem, and it pains me. However, I feel like the emperor has no clothes, and calling it out explicitly is important.

Today, I bought 20 shares in Gamestop / GME. I expect to lose money, and bought them as a hard-to-fake signal about willingness to coordinate and cooperate in the game-theoretic sense. This was inspired by Eliezer Yudkowsky's post here: https://yudkowsky.medium.com/

In theory, Moloch should take all the resources of someone following this strategy. In practice, Eru looks after her own, so I have the money to spare.

Retrospective: Lessons from the Failed Alignment Startup AISafety.com

AISafety.com – Resources for AI Safety

Map of AI Safety v2

New homepage for AI safety resources – AISafety.com redesign

Søren Elverlin

Retrospective on Copenhagen Secular Solstice 2025

New homepage for AI safety resources – AISafety.com redesign

Map of AI Safety v2

Top AI safety newsletters, books, podcasts, etc – new AISafety.com resource

14+ AI Safety Advisors You Can Speak to – New AISafety.com Resource

Notes from Copenhagen Secular Solstice 2024

AISafety.com – Resources for AI Safety
