Wikitag Dashboard — LessWrong

Shift Towards The Hypothesis Of Least Surprise

I might just be daft, but I was confused by this sentence and my best explanation is there is a typo in it:

> Previously, we've been combining both and into unified likelihood ratios, like which says that the 'blue' observation carries 1 bit of evidence

It seems like the correct reading would be that a blue observation carries 1 bit of evidence against H?

Aether

Edited by (+2/-7) Jun 24th 2026 GMT 1

Aumann's Agreement Theorem

Edited by (+7/-138) Jun 24th 2026 GMT 1

There is only one logarithm

Edited by (+3/-332) Jun 22nd 2026 GMT 1

Role Science

New tag created by lesscertain at 7d

207 A Mechanistic Explanation of Prompt Injection (and why you should study roles)

Charles Ye, lesscertain

15 Evaluating using Mock Tool Calls to Quarantine Untrusted Prompt Inputs

dgros

24d

1 Role confusion: sounding like the cause is indistinguishable from being it.

Owain Mogford

The characteristic of the logarithm

Edited by (+65/-213) Jun 20th 2026 GMT 1

Guarded Definition

Parash Rahman9d10

I appreciate this idea. It happens way too much. I find it is an issue of confusion rather than discourtesy. Ideas become messy and unclear when expanded arbitrarily to suit a person's subjective tastes. Finding new terms for the intuitive extensions helps with clarity.

No Free Lunch Theorems Are Often Irrelevant

Parash Rahman9d10

I contest the claim that these results are "often" irrelevant. I think this may be a case of survivorship bias. You point out examples in which the theorems are side-stepped, but the side-stepping was achieved only after most of human history had passed. The results are also still unsatisfactory to researchers (P ≠ NP isn't even proven). There are combinatorially difficult problems everywhere. Everyday problems with relationships, jobs, business ideas, and governing are technically too hard and are pretty much trial and error. You mention humans being smart, which is true, but we are still slow to learn, and science/math/society are slow to progress despite there being billions of us. Otherwise, we would be heuristicing our way to the answers of grand questions and we would have things as smart as us already.

Rare LLM Behaviours

Edited by (+2155/-222) Jun 17th 2026 GMT 1

Rare LLM Behaviours

Edited by (+2581) Jun 17th 2026 GMT 1

Rare LLM Behaviours

New tag created by beyarkay (Boyd Kane) at 12d

Many "rare" LLM behaviours are known if you're in the know (e.g. Gemma/Gemini acting weird around dates after their training cutoff) but aren't immediately apparent if you're just working with the LLMs. In lieu of an existing resource about this, I thought I'd start the wiki (with the hope of others contributing to it in the future).

I'd like this list to become an evaluation so that it's actually reproducible, but I don't have time to do that at the moment.

If you know of a weird behaviour that's not on this list, please add it!...

Dath Ilan

Edited by (+995/-249) Jun 13th 2026 GMT 2

Ordinary Claims Require Ordinary Evidence

alek5nder16d10

I guess it's not the point of the example, but it's definitely not formulated right. If the hypothesis is ordinary (and how are we to distinguish ordinary from extraordinary?), am I not supposed to ask for the evidence? What about my own evidence? I mean, I should weigh my prior evidence against the claim that he is a bad employee, considering that I knew the guy and thought that he was okay (he is not just a random worker, I guess).

Bayes Rule Odds Form

Alex Wang17d20

Reminds me of homogeneous coordinates in projective spaces

AI Safety & Entrepreneurship

Edited by (+583/-5) Jun 10th 2026 GMT 1

AI Safety Entrepreneurship

mick19d10

It's a wiki, so feel free to add resources related to entrepreneurship here! I've added Atlas Computing now

Original Sequences

Amitav Krishna1mo10

"PDF version of most sequences in a single file. Has cross-reference support for internal links (PDF links or footnotes), and page size is appropriate for tablets."

Hey, this link appears to be dead

Son of CDT

Edited by (+5/-7) Jun 1st 2026 GMT 2

Aether

Edited by (+19/-53) May 29th 2026 GMT 1

Aether

Edited by (+41/-6) May 29th 2026 GMT 1

The first, later-retconned version of dath ilan was introduced by Eliezer in his April Fool's day post 'My April Fools Day Confession', where he claimed that he was merely an average person from a different world and none of "his" ideas were original to him.

Dath ilan was later hugely revised to be premised on "the median person is Eliezer Yudkowsky", i.e., the average bloke on a dath ilan street would, transported into Earth as a child, grow up reading about causal decision theory, and invent timeless decision theory instead.

The new dath ilan was defined mainly via a series of glowfics featuring dath ilani characters. Eliezer's penname appears as "Iarwain" in these stories.

The stories weren't first chronologically, but are relatively short completed stories with an interior view of dath ilan:

Some others by Eliezer (Iarwain) in chronological order:

a dath ilani matchmaker in King Randale's court (incomplete)
but hurting people is wrong (complete)
Project Lawful (1.8m words, complete but for epilogue)
a dath ilani EMT in queen abrogail's court (incomplete)

Standard education includes rationality training (from which, allegedly, all of Eliezer's own ideas are plagiarized).
What this Earth knows as "logical decision theory" / "Functional Decision Theory" is just standard decision theory as formulated in dath ilan.
Land Value Tax, positional-goods tax, status-goods tax, marketing-tax, no income tax.
Movable Homes.
Autonomous electric cars in tunnels instead of ICE cars on roads.
Most cities have no streetlights at night, except for red lights along walkways, which for 45 minutes each night also turn off to see the sky. And on winter solstice (night of stars) the lights stay off the whole night, to give a perfect view of the night sky.
The average IQ in Earth-equivalent terms is 143, after an unknown number of generations of "heritage optimization" pursued via positive government subsidies.
Dath ilan has a lower population than Earth (1 billion people), since their higher intelligence and higher coordination enabled them to make tech progress without having first attained a higher population than that.
There exists a single global civilization, calling itself "Civilization", with one top-level world government, called "Governance", divided into many cities with varying local governments.
All humans (and other hominids, like chimpanzees) are cryopreserved upon death.
If you commit a murder but don't want to consign your victim to true death, you can call in a government service, the Surreptitious Head Removers, who will remove your victim's head for cryonic preservation, while being sworn not to report on this. Trials for murder are not allowed to take into

...

Algebraically, writing c for the function that measures your costs, c(x⋅2) = c(x) + c(2), and, in general, c(x⋅y) = c(x) + c(y), where we can interpret x as the number of possible messages before the increase, y as the factor by which the possibilities increased, and x⋅y as the number of possibilities after the increase.

This is the key characteristic of the logarithm: It says that, when the input goes up by a factor of y, the quantity measured goes up by a fixed amount (that depends on y). When you see this pattern, you can bet that c is a logarithm function. Thus, whenever something you care about goes up by a fixed amount every time something else doubles, you can measure the thing you care about by taking the logarithm of the growing thing. For example:

Conversely, whenever you see a $\log_2$ in an equation, you can deduce that someone wants to measure some sort of thing by counting the number of doublings that another sort of thing has undergone. For example, let's say you see an equation where someone takes the $\log_2$ of a relative likelihood. What should you make of this? Well, you should conclude that there is some quantity that someone wants to measure which can be measured in terms of the number of doublings in that likelihood ratio. And indeed there is! It is known as (Bayesian) evidence, and the key idea is that the strength of evidence for a hypothesis A over its negation ¬A can be measured in terms of 2:1 updates in favor of A over ¬A. (For more on this idea, see What is evidence?).
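As a rough illustration of counting doublings in a likelihood ratio, here is a minimal Python sketch. The function name `evidence_bits` and the example probabilities are made up for illustration and do not come from the original page.

```python
import math


def evidence_bits(p_obs_given_a: float, p_obs_given_not_a: float) -> float:
    """Evidence for A over not-A, in bits: log2 of the likelihood ratio."""
    return math.log2(p_obs_given_a / p_obs_given_not_a)


# An observation twice as likely under A as under not-A is one 2:1 update.
print(evidence_bits(0.5, 0.25))   # 1.0

# Four times as likely is two doublings, so two bits.
print(evidence_bits(0.5, 0.125))  # 2.0

# Half as likely under A is one bit of evidence against A.
print(evidence_bits(0.25, 0.5))   # -1.0
```

The sign carries the direction of the update: positive values favor A, negative values favor ¬A.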

In fact, a given function f such that f(x⋅y)=f(x)+f(y) is almost guaranteed to be a logarithm function — modulo a few technicalities.
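One way to get a feel for this claim is to test the identity f(x⋅y) = f(x) + f(y) numerically. The sketch below is only a spot check on a few hand-picked points, not a proof, and the helper name `is_additive` is hypothetical.

```python
import math


def is_additive(f, pairs) -> bool:
    """Check f(x*y) == f(x) + f(y), up to rounding, on the given (x, y) pairs."""
    return all(math.isclose(f(x * y), f(x) + f(y)) for x, y in pairs)


pairs = [(8, 2), (3, 5), (0.5, 10)]

# Logarithms of any base pass the check.
print(is_additive(math.log2, pairs))   # True
print(is_additive(math.log10, pairs))  # True

# A function that is not a logarithm fails it.
print(is_additive(math.sqrt, pairs))   # False

# Applying the identity n times gives f(x**n) == n * f(x).
x, n = 3, 4
print(math.isclose(math.log2(x ** n), n * math.log2(x)))  # True
```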

This puts us in a position where you can derive all the main properties of the logarithm (such as $\log_b(x^n) = n \log_b(x)$ for any b) yourself. If you'd like to try doing so, you can check your rederivations here: Properties of the logarithm

If you know of a weird behaviour that's not on this list, please add it!

GPT-5.1 to GPT-5.5 models seem to be somewhat obsessed with goblins, gremlins and other small fantasy creatures: "they increasingly mentioned goblins, gremlins, and other creatures in their metaphors" source
- Specific models affected: GPT-5 Thinking, GPT-5.1 Thinking, GPT-5.2 Thinking, GPT-5.4 Thinking, GPT-5.5 Thinking
GPT-4o was widely considered to be sycophantic, although I've struggled to find the version of 4o for which this was the worst. I believe they made several changes to the model they called GPT-4o that reduced the sycophancy over time before eventually retiring 4o. OpenAI blog post, Simon Willison weblog
Gemma3-27b tends to break down when told that its answer is wrong LW, Arxiv
- This seems to be a persistent issue with the Gemma/Gemini models from GDM (e.g. see these posts from GDM about trying to remove these behaviours)
- I attempted to reproduce this, and the behaviour is only present in Gemma3-27b when sampling with top_k=-1 and top_p=1.0 (i.e. sampling from the full range of tokens). Many providers now sample with something like top_k=64 and top_p=0.95 (e.g. DeepInfra via OpenRouter)
Gemma3, Gemma 4 & Gemini 3 (maybe also others) seem to be skeptical of dates in 2026 and beyond, claiming that anything happening in 2026 is just fictional or roleplay LW post
Claude Opus 4.8 seems to slip in non-English language tokens (in a sensible way), although I've not seen much of this beyond tweets:

Many of the Chinese models (Qwen, DeepSeek) show CCP-aligned behaviours and censorship LW post
Many LLMs have "attractor states", styles of talking and topics of conversation that they devolve into if you let them talk with each other for 30+ turns LW post.
Many LLMs have "glitch tokens" which cause them to be unable to answer the prompt, or to be unable to repeat that token, or to otherwise be unpredictable. SolidGoldMagikarp is the original LW post, although this file on GitHub (from Pliny the Liberator) and then this website (click the "bug" icon in the top right) have a large collection of glitch tokens and their origins
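The Gemma3-27b item above turns on the difference between sampling from the full token distribution and sampling from a truncated one. The sketch below shows, under stated assumptions, what top_k and top_p truncation do to a toy distribution. It assumes top_k=-1 means "no top-k limit" and top_p=1.0 means "no nucleus cutoff", as the settings quoted above suggest. The function `filter_probs` and the toy probabilities are hypothetical, and real providers may differ in details such as tie-breaking or the order of the two filters.

```python
def filter_probs(probs, top_k=-1, top_p=1.0):
    """Return {token_index: renormalised probability} after top-k, then top-p truncation."""
    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens (assumed disabled when top_k <= 0).
    if top_k > 0:
        order = order[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose total mass reaches top_p.
    if top_p < 1.0:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        order = kept

    # Renormalise so the surviving probabilities sum to 1.
    total = sum(probs[i] for i in order)
    return {i: probs[i] / total for i in order}


toy = [0.5, 0.25, 0.125, 0.125]

print(sorted(filter_probs(toy)))              # [0, 1, 2, 3]  (full distribution)
print(sorted(filter_probs(toy, top_k=2)))     # [0, 1]
print(sorted(filter_probs(toy, top_p=0.75)))  # [0, 1]
```

With truncation enabled, low-probability tokens can never be sampled, which is one plausible reason a rare behaviour might show up only under full-distribution sampling.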

Wikitags in Need of Work

Newest Wikitags

Wikitag Voting Activity

Recent Wikitag Activity

You may want to look at the Founder Toolkit which will probably be more up-to-date, especially on programs and funding.

Obsession with goblins, gremlins, and other small fantasy creatures in metaphors

Sycophancy

Breaking down when told the answer is wrong

Skepticism about dates in 2026 and beyond

lesscertain	Role Science (3)	7d
beyarkay (Boyd Kane)	Rare LLM Behaviours (1)	12d
Joey Yudelson	Aether (16)	1mo
abramdemski	Open-Minded Updatelessness (3)	1mo