How well can Claude write coding questions?
I'm curious how well Claude can write interesting coding and mathematics problems, and this post is a partial product of that exploration. Coming up with a good problem is a much harder skill than solving one of equal difficulty. While there is a large corpus of data on solving problems, there is very little on writing them: problem authors typically share only their problem, not the thought process that went into it. I also think it's a skill correlated with doing good research. FYI, I prompted Claude to write questions that are both interesting and novel, but I did not seriously research whether the questions it wrote were actually novel. (I know for sure that some of the problems it makes are novel, because they clearly don't have a workable solution and are therefore probably not published anywhere.)

Codeforces

For those who don't know, Codeforces is a competitive programming site. Its contests are split into three divisions: 1, 2, and 3. Division 1 contests are the hardest and Division 3 contests are the easiest. Contests typically run for 2 hours and contain 6-7 problems, labeled alphabetically with A being the easiest. I had a rating of 1600 and mostly competed in Division 2 contests, where I would usually solve problems A through D and no more.

I first prompted Claude: "Write a prompt that would help you generate an original and interesting codeforces problem designed for problem D in a division 2 contest." (Prompting Claude directly with "generate an original and interesting codeforces problem designed for problem D in a division 2 contest" produces a boring and easy problem.) After some small tweaks the prompt ended up as this:

Prompt

Create an original competitive programming problem suitable for Problem D in a Codeforces Division 2 contest. The problem should have the following characteristics:

Problem Difficulty and Prerequisites

* Appropriate for upper-intermediate competitive programmers
* Should re
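For readers who want to reproduce this kind of meta-prompting programmatically rather than in the chat interface, the workflow above can be sketched with the Anthropic Python SDK. This is a minimal illustrative sketch, not the method I used: the model name, token limit, and helper function are all my own assumptions.

```python
import os

# The meta-prompt from the post (with the typo corrected): ask Claude to
# write a prompt, rather than asking for a problem directly.
META_PROMPT = (
    "Write a prompt that would help you generate an original and interesting "
    "codeforces problem designed for problem D in a division 2 contest."
)

def ask_claude(prompt: str) -> str:
    """Send a single-turn prompt to Claude and return its text reply."""
    import anthropic  # pip install anthropic
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    message = client.messages.create(
        model="claude-sonnet-4-20250514",  # assumed model name
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    return message.content[0].text

if __name__ == "__main__" and os.environ.get("ANTHROPIC_API_KEY"):
    # Step 1: get a problem-writing prompt; step 2: feed that prompt back in.
    problem_writing_prompt = ask_claude(META_PROMPT)
    print(ask_claude(problem_writing_prompt))
```

The two-step structure mirrors the post: the first call produces the refined prompt, and the second call uses it to generate the actual problem.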