I have to look for a while before finding any non-AI posts. Seems LW is mainly an AI / alignment discussion forum at this point.
It seems more informative to just look at top (inflation-adjusted) karma for 2022 (similar to what habryka noted in the sibling comment). AI posts in bold.
I count 18/29 about AI. A few of the AI posts are technically more general, and a few of the non-AI posts seem to be indirectly about AI.
I think the AI posts are definitely substantially more interlinked than the non-AI posts, so this particular metric oversamples AI posts.
I just updated the spreadsheet to include All Time posts. List of Lethalities is still the winner by Total Pingback Karma, although not by Pingback Count (and this seems at least partially explained by karma inflation).
The list would look pretty different if self-cites were excluded. E.g. my posts would probably all be gone 😂
Yeah, if I have time today I'll make an "exclude self-cites" column, although fwiw I think "total pingback karma" is fairly legit even when it includes self-cites. If your follow-up work got a lot of karma, that's a useful signal about your original post even if you quote yourself liberally.
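Concretely, the "exclude self-cites" version would just filter out pingbacks where the citing post and the cited post share an author. Here's a rough Python sketch; the field names (`source_author_id`, `target_author_id`, `source_karma`) are made up for illustration, not the real schema:

```python
# Rough sketch of "total pingback karma, optionally excluding self-cites".
# The pingback field names used here are illustrative, not the real schema.
def total_pingback_karma(pingbacks, target_post_id, exclude_self_cites=False):
    total = 0
    for pb in pingbacks:
        if pb["target_post_id"] != target_post_id:
            continue
        # A self-cite is a pingback from another post by the same author.
        if exclude_self_cites and pb["source_author_id"] == pb["target_author_id"]:
            continue
        total += pb["source_karma"]
    return total
```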
For the past couple of years I've wished LessWrong had a "sort posts by number of pingbacks, or, ideally, by total karma of pingbacks" option. I particularly wished for this during the Annual Review, where "which posts got cited the most?" seemed like a useful thing to track for finding potential hidden gems.
We still haven't built a full-fledged feature for this, but I just ran a query against the database and turned the results into a spreadsheet, which you can view here:
LessWrong 2022 Posts by Pingbacks
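If you're curious what the aggregation looks like, here's a rough sketch of the logic in Python rather than the actual query. The record shapes and field names are made up for illustration (not the real LessWrong schema), and the toy values are placeholders; it just shows how Pingback Count, Total Pingback Karma, and Avg Pingback Karma are derived.

```python
from collections import defaultdict

# Toy records; field names and values are placeholders, not the real schema.
posts = [
    {"id": "p1", "title": "Post A", "karma": 100},
    {"id": "p2", "title": "Post B", "karma": 50},
]
pingbacks = [
    # A pingback: some other post (the source) links to this post (the target).
    {"source_post_id": "p2", "target_post_id": "p1", "source_karma": 50},
]

pingback_count = defaultdict(int)
total_pingback_karma = defaultdict(int)
for pb in pingbacks:
    pingback_count[pb["target_post_id"]] += 1
    total_pingback_karma[pb["target_post_id"]] += pb["source_karma"]

rows = []
for post in posts:
    n = pingback_count[post["id"]]
    total = total_pingback_karma[post["id"]]
    rows.append({
        "title": post["title"],
        "post_karma": post["karma"],
        "pingback_count": n,
        "total_pingback_karma": total,
        "avg_pingback_karma": total / n if n else 0,
    })

# Sort by Total Pingback Karma, descending (the ordering used in the list below).
rows.sort(key=lambda r: r["total_pingback_karma"], reverse=True)
```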
Here are the top 100 posts, sorted by Total Pingback Karma. (The spreadsheet columns are Title/Link, Post Karma, Pingback Count, Total Pingback Karma, and Avg Pingback Karma; only the titles are listed below.)
AGI Ruin: A List of Lethalities
MIRI announces new "Death With Dignity" strategy
A central AI alignment problem: capabilities generalization, and the sharp left turn
Simulators
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Reward is not the optimization target
A Mechanistic Interpretability Analysis of Grokking
How To Go From Interpretability To Alignment: Just Retarget The Search
On how various plans miss the hard bits of the alignment challenge
[Intro to brain-like-AGI safety] 3. Two subsystems: Learning & Steering
How likely is deceptive alignment?
The shard theory of human values
Mysteries of mode collapse
[Intro to brain-like-AGI safety] 2. “Learning from scratch” in the brain
Why Agent Foundations? An Overly Abstract Explanation
A Longlist of Theories of Impact for Interpretability
How might we align transformative AI if it’s developed very soon?
A transparency and interpretability tech tree
Discovering Language Model Behaviors with Model-Written Evaluations
A note about differential technological development
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
Supervise Process, not Outcomes
Shard Theory: An Overview
Epistemological Vigilance for Alignment
A shot at the diamond-alignment problem
Where I agree and disagree with Eliezer
Brain Efficiency: Much More than You Wanted to Know
Refine: An Incubator for Conceptual Alignment Research Bets
Externalized reasoning oversight: a research direction for language model alignment
Humans provide an untapped wealth of evidence about alignment
Six Dimensions of Operational Adequacy in AGI Projects
How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme
Godzilla Strategies
(My understanding of) What Everyone in Technical Alignment is Doing and Why
Two-year update on my personal AI timelines
[Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA
[Intro to brain-like-AGI safety] 6. Big picture of motivation, decision-making, and RL
Human values & biases are inaccessible to the genome
You Are Not Measuring What You Think You Are Measuring
Open Problems in AI X-Risk [PAIS #5]
[Intro to brain-like-AGI safety] 1. What's the problem & Why work on it now?
Conditioning Generative Models
Conjecture: Internal Infohazard Policy
A challenge for AGI organizations, and a challenge for readers
Superintelligent AI is necessary for an amazing future, but far from sufficient
Optimality is the tiger, and agents are its teeth
Let’s think about slowing down AI
Niceness is unnatural
Announcing the Alignment of Complex Systems Research Group
[Intro to brain-like-AGI safety] 13. Symbol grounding & human social instincts
ELK prize results
Abstractions as Redundant Information
[Link] A minimal viable product for alignment
Acceptability Verification: A Research Agenda
What an actually pessimistic containment strategy looks like
Let's See You Write That Corrigibility Tag
chinchilla's wild implications
Worlds Where Iterative Design Fails
why assume AGIs will optimize for fixed goals?
Gradient hacking: definitions and examples
Contra shard theory, in the context of the diamond maximizer problem
We Are Conjecture, A New Alignment Research Startup
Circumventing interpretability: How to defeat mind-readers
Evolution is a bad analogy for AGI: inner alignment
Refining the Sharp Left Turn threat model, part 1: claims and mechanisms
MATS Models
Common misconceptions about OpenAI
Prizes for ELK proposals
Current themes in mechanistic interpretability research
Discovering Agents
[Intro to brain-like-AGI safety] 12. Two paths forward: “Controlled AGI” and “Social-instinct AGI”
What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?
Inner and outer alignment decompose one hard problem into two extremely hard problems
Threat Model Literature Review
Language models seem to be much better than humans at next-token prediction
Will Capabilities Generalise More?
Pivotal outcomes and pivotal processes
Conditioning Generative Models for Alignment
Training goals for large language models
It’s Probably Not Lithium
Latent Adversarial Training
“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments
Conditioning Generative Models with Restrictions
The alignment problem from a deep learning perspective
Instead of technical research, more people should focus on buying time
By Default, GPTs Think In Plain Sight
[Intro to brain-like-AGI safety] 4. The “short-term predictor”
Don't leave your fingerprints on the future
Strategy For Conditioning Generative Models
Call For Distillers
Thoughts on AGI organizations and capabilities work
Optimization at a Distance
[Intro to brain-like-AGI safety] 5. The “long-term predictor”, and TD learning
What does it take to defend the world against out-of-control AGIs?
Monitoring for deceptive alignment
Late 2021 MIRI Conversations: AMA / Discussion
How to Diversify Conceptual Alignment: the Model Behind Refine
wrapper-minds are the enemy
But is it really in Rome? An investigation of the ROME model editing technique
An Open Agency Architecture for Safe Transformative AI