LESSWRONG
AI
• Applied to Release of UN's draft related to the governance of AI (a summary of the Simon Institute's response) by Sebastian Schmidt 2h ago
• Applied to Refusal in LLMs is mediated by a single direction by habryka 3h ago
• Applied to Mercy to the Machine: Thoughts & Rights by TagWrong 4h ago
• Applied to Link: Let's Think Dot by Dot: Hidden Computation in Transformer Language Models by Jacob Pfau, William Merrill & Samuel R. Bowman by TagWrong 8h ago
• Applied to Two Vernor Vinge Book Reviews by Maxwell Tabarrok 9h ago
• Applied to Plausibility of Getting Early Warning Shots because AIs can't coordinate? by TagWrong 13h ago
• Applied to Exploring the Esoteric Pathways to AI Sentience (Part One) by TagWrong 20h ago
• Applied to Superposition is not "just" neuron polysemanticity by habryka 21h ago
• Applied to Wouldn't weak AI agents provide warning? by Mandatory Topic 1d ago
• Applied to Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter? by Gordon Seidoh Worley 1d ago
• Applied to Scaling of AI training runs will slow down after GPT-5 by TagWrong 1d ago
• Applied to An Introduction to AI Sandbagging by Teun van der Weij 1d ago
• Applied to LLMs seem (relatively) safe by JustisMills 2d ago
• Applied to Why I stopped being into basin broadness by TagWrong 2d ago
• Applied to AXRP Episode 29 - Science of Deep Learning with Vikrant Varma by TagWrong 2d ago
• Applied to Improving Dictionary Learning with Gated Sparse Autoencoders by Neel Nanda 2d ago
• Applied to Cybersecurity of Frontier AI Models by TagWrong 2d ago
• Applied to The first future and the best future by TagWrong 3d ago
• Applied to At last! ChatGPT does, shall we say, interesting imitations of “Kubla Khan” by Bill Benzon 3d ago