When you chat with an AI assistant, it usually acts helpful and professional. But sometimes things get weird: the model starts speaking in a mystical tone, claims to be something else entirely, or drifts into bizarre behavior. What's going on under the hood?
A recent Anthropic paper digs into the geometry of "personas" inside language models. It finds that diverse character types (Ghost, Sage, Nomad, Demon...) cluster along a primary axis, and at one end sits the helpful Assistant we're familiar with.
We'll discuss the paper, what it tells us about how RLHF actually shapes models, and what it might mean for alignment.
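If you want a feel for the kind of geometry involved before the session, here is a toy sketch (my own construction, not the paper's actual method): fabricate some activation-like vectors for a few personas and pull out their main axis of variation with PCA. Everything here is made up for illustration, including the personas' "strengths" along the hidden axis.

```python
# Toy sketch of finding a "persona axis" with PCA. In practice the vectors
# would be hidden-state activations from a language model prompted with each
# persona; here we fabricate 64-dim vectors sharing one dominant direction.
import numpy as np

rng = np.random.default_rng(0)

personas = ["Assistant", "Sage", "Ghost", "Nomad", "Demon"]
axis = rng.normal(size=64)
axis /= np.linalg.norm(axis)
# Made-up positions along the hidden axis, Assistant at one extreme.
strength = {"Assistant": 3.0, "Sage": 1.0, "Ghost": -1.0,
            "Nomad": -2.0, "Demon": -3.0}
acts = np.stack([strength[p] * axis + 0.1 * rng.normal(size=64)
                 for p in personas])

# PCA via SVD of the centered matrix: the first right-singular vector is
# the direction of maximum variance across personas (sign is arbitrary).
centered = acts - acts.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pc1 = vt[0]

# Project each persona onto the recovered primary axis.
for p, a in zip(personas, acts):
    print(f"{p:>9}: {a @ pc1:+.2f}")
```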
Please read this paper to prepare for the session:
The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
(A shorter summary can be found here)
To start off the new year, we will discuss the art and science of forecasting and try our hand at making our own forecasts for 2026, so we can see how well calibrated (or not ;-) ) we are next year.
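For a concrete sense of what "calibrated" means here, this little Python sketch (toy numbers, not part of the session material) scores some made-up forecasts with a Brier score and a simple calibration table:

```python
# Made-up forecasts: (stated probability, did the event happen?)
forecasts = [
    (0.9, True), (0.8, True), (0.7, False), (0.6, True),
    (0.4, False), (0.3, True), (0.2, False), (0.1, False),
]

# Brier score: mean squared error between probability and outcome
# (0 is perfect, lower is better).
brier = sum((p - float(o)) ** 2 for p, o in forecasts) / len(forecasts)
print(f"Brier score: {brier:.3f}")

# Calibration table: a well-calibrated forecaster's 70% predictions
# should come true about 70% of the time.
buckets = {}
for p, o in forecasts:
    buckets.setdefault(round(p, 1), []).append(o)
for p in sorted(buckets):
    outcomes = buckets[p]
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated {p:.0%} -> observed {hit_rate:.0%} (n={len(outcomes)})")
```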
Hi all!
Let's get together for a cozy fika and reflect on 2025. What surprised you this year, what did you learn, what predictions did you get wrong or right, and what are you looking forward to in 2026?
Whether you're a regular or dropping by for the first time, you're welcome to join us for some December coziness and thoughtful discussion.
This year's ACX Everywhere Fall Meetup in Gothenburg. If you are reading this, you're invited!
We will be at Condeco Fredsgatan, on the second floor; look for some books on the table.
PS: We are on the second floor of the café; look for a book on the table.
Come by! Meet interesting people, chat interesting chat!
Normally we just chat about whatever comes up. Past topics of conversation have included AI alignment, decision theory (Newcomb's paradox, etc.), progress in AI, and much, much more.
In this session we will cover Aumann's Agreement Theorem. If you are not familiar with it, here is an explanation: Explanation of Aumann's Agreement Theorem by Scott Aaronson. That reading is optional; we will go through the basics during the meetup.
Afterwards we are going to play the Aumann Game, where we practice updating probabilities in a cooperative setting.
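For those who like to see the mechanics, here is a small Python sketch of the posterior-exchange dialogue (due to Geanakoplos and Polemarchakis) that underlies the theorem: two agents with a common prior take turns announcing their posterior for an event, and each announcement narrows what both consider possible until they agree. The states, event, and partitions below are a made-up example, not anything from the session materials.

```python
from fractions import Fraction

def posterior(possible, event):
    """P(event | possible) under a uniform common prior."""
    return Fraction(len(possible & event), len(possible))

def cell_of(partition, state):
    """The block of an agent's information partition containing `state`."""
    return next(c for c in partition if state in c)

states = {0, 1, 2, 3}                # uniform common prior over four states
event = {0, 3}                       # the event whose probability is discussed
partitions = [[{0, 1}, {2, 3}],      # what agent 1 can privately distinguish
              [{0, 1, 2}, {3}]]      # what agent 2 can privately distinguish
true_state = 0

public = set(states)                 # states compatible with all announcements
posteriors = [None, None]
speaker = 0
while True:
    # The speaker announces P(event | own cell + everything said publicly).
    mine = cell_of(partitions[speaker], true_state) & public
    posteriors[speaker] = q = posterior(mine, event)
    print(f"agent {speaker + 1} announces {q}")
    if posteriors[0] == posteriors[1]:
        break
    # The announcement is informative: keep only the states at which the
    # speaker would have announced exactly q.
    public = {s for s in public
              if posterior(cell_of(partitions[speaker], s) & public, event) == q}
    speaker = 1 - speaker

print("common-knowledge agreement:", posteriors[0])
```

Note that even a repeated announcement can carry information once the public set has shrunk, which is why the dialogue eventually terminates in agreement, just as the theorem promises.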
This year's Spring ACX Everywhere Meetup in Gothenburg. If you are reading this, you're invited!
We will be at Condeco Fredsgatan, on the second floor; look for some books on the table.
Come by! Meet interesting people, chat interesting chat!
Normally we just chat about whatever comes up. Past topics of conversation have included AI alignment, decision theory (Newcomb's paradox, etc.), progress in AI, and much, much more.
(We will be on the second floor of the Condeco café; look for a book on the table.)
Hi all,
I had to move the location to a nearby café to avoid a clash with another meetup.