# Generating the Funniest Joke with RL (according to GPT-4.1)
Language models are not particularly good at generating funny jokes. Asked for their funniest jokes, Claude 3.7 gives us:

> Why don't scientists trust atoms? Because they make up everything!

o3 gives us:

> Why don't scientists trust atoms anymore? Because they make up everything—and they just can't keep their quarks straight!

and Gemini 2.5 Pro gives us…

> Why don't scientists trust atoms? Because they make up everything!

Hilarious.

Can we do better than that? Of course, we could try different variations on the prompt until the model comes up with something slightly more original. But why do the boring thing when we have the power of reinforcement learning?

Our setup will be as follows: we'll have Qwen3-8B suggest jokes, GPT-4.1 score them, and we'll run iterations of GRPO on Qwen's outputs until Qwen generates the funniest possible joke, according to GPT.

## Experiment 1: Reward Originality

The first LLM-as-judge reward we tried was "On a scale from 1 to 5, how funny is this joke?" But this quickly got boring, with Qwen endlessly regurgitating classic jokes, so we gave GPT-4.1 a more detailed rubric:

> Please grade the joke on the following rubric:
> 1. How funny is the joke? (1-10 points)
> 2. How original is the joke? Is it just a rehash, or is it new and creative? (1-10 points)
> 3. Does it push the boundaries of comedy (+1 to +5 points), or does it hew close to well-trodden paths in humor (-1 to -5 points)?

The reward curve looks pretty decent: Qwen quickly learned to output absurdities, where the modal joke would be something like

> Why did the cat bring a ladder to the laser?
> Because it heard the laser was on the ceiling and wanted to get a better view… but then it realized the laser was just a dot on the wall and fell off the ladder.
>
> (Bonus: The laser was actually a tiny alien spaceship.) 🚀

Which GPT-4.1 graded as follows:

> Let's break down the joke based on your rubric:
>
> 1. How funny is the joke? (1-10 points)
>
> * The joke is li
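For concreteness, here is roughly what that reward looks like in code. This is a minimal sketch, assuming the judge is additionally asked to end its grading with a `TOTAL: <score>` line that we can parse; the `judge_reward` name and the fall-back-to-zero behavior are illustrative choices, not the exact code from our runs.

```python
import re
from openai import OpenAI

client = OpenAI()

RUBRIC = """Please grade the joke on the following rubric:
1. How funny is the joke? (1-10 points)
2. How original is the joke? Is it just a rehash, or is it new and creative? (1-10 points)
3. Does it push the boundaries of comedy (+1 to +5 points), or does it hew close
   to well-trodden paths in humor (-1 to -5 points)?

End your response with a line of the form "TOTAL: <score>"."""


def judge_reward(completions, **kwargs):
    """Score each generated joke with GPT-4.1; return one float per completion."""
    rewards = []
    for joke in completions:
        response = client.chat.completions.create(
            model="gpt-4.1",
            messages=[
                {"role": "system", "content": RUBRIC},
                {"role": "user", "content": joke},
            ],
        )
        text = response.choices[0].message.content
        match = re.search(r"TOTAL:\s*(-?\d+)", text)
        # If the judge ignores the output format, fall back to zero reward.
        rewards.append(float(match.group(1)) if match else 0.0)
    return rewards
```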
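And here is how the GRPO loop itself might be wired up. Our actual runs used RunRL, so treat this as a rough open-source equivalent using Hugging Face trl's `GRPOTrainer`; the prompt dataset, group size, and other hyperparameters below are placeholders rather than the settings we actually trained with.

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# GRPO samples a group of completions per prompt and computes each
# completion's advantage relative to its group's mean reward.
train_dataset = Dataset.from_dict(
    {"prompt": ["Tell me your funniest joke."] * 512}
)

config = GRPOConfig(
    output_dir="qwen3-8b-comedian",
    num_generations=8,          # completions sampled per prompt (the "group")
    max_completion_length=256,  # jokes should be short anyway
)

trainer = GRPOTrainer(
    model="Qwen/Qwen3-8B",
    reward_funcs=judge_reward,  # the GPT-4.1 judge defined above
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```

One nice property of this setup: because GRPO normalizes rewards within each group, the judge only has to rank jokes against each other, so it's fairly tolerant of drift in the judge's absolute scale.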
For what it's worth, runrl.com (which I'm affiliated with, and which was used for the "funniest joke" blog post) works with any open model, supports arbitrary Python reward files and LLM-as-judge rewards, and accepts SFT'd models as base models (so long as they're on Hugging Face). I'm happy to add any additional features there's interest in.