AI agents might not need to develop long-term drives of their own to perform well at long-term tasks, if they can use humans for that purpose. As a corollary, AI agents that do not have long-term drives of their own can still act in long-term coherent ways, if they inhabit a world with abundant, easy-to-hire human labor. If it's cheaper to hire a human to get a scaffolded AI system to stay on track towards achieving some goal than it is to automate that monitoring, then it probably makes sense to just use the human.
Concretely, consider an AI system tasked ...
I think this doesn't make sense for solving genuinely hard scientific/technological/mathematical/philosophical problems such as the strawberry problem. (It makes sense when the big task has a basically known decomposition into a large number of small easy tasks though.) A central issue is that good high-level decisions are very important, there are very many of them, and they
Small donors should not worldview-diversify.
Occasionally I encounter small donors (e.g. 10% pledgers earning <$200K) with highly specialised skills and knowledge (e.g. working on a sub-sub-topic of an EA cause area) who donate primarily to GiveWell top charities. These people do incredible amounts of good, and are highly commendable.
That said, I think they would probably do more good by donating according to their inside view and special knowledge. Worldview diversification makes sense for large funders like Coefficient Giving, but their reasons don't apply ...
I think for small donors, donating to the best unregistered charity is >>2x as good as donating to the best registered one, for the reasons OP outlines: registered charities are much better covered by large institutions, and lots of people are overanchored on registration, so the unregistered ones are neglected by comparison.
The counterargument is that bednets/givedirectly are just pretty good and it's unlikely any particular new thing beats them. Which is a fine approach, but not what we're talking about here.
Here are the 2025 AI safety papers and posts I like the most.
The list is very biased by my taste, by my views, by the people who had time to argue to me that their work is important, and by the papers that were salient to me when I wrote this list. I am highlighting the parts of papers I like, which is also very subjective. (This is similar to the 2024 edition here.)
Bangers on multiple dimensions
★★★ You can measure time horizon, and it grows predictably-ish (Measuring AI Ability to Complete Long Tasks)
★★★ Black-box techniques work better than you think + y...
I think that this should really be a top-level post.
Why I’m unconvinced by Tegmark’s argument for the mathematical universe hypothesis
The basic argument seems to be:
I don’t see why we should buy (2).
As far as I know, there are two arguments for (2)....
Freedom from human independent concepts is supposed to support the idea of a mathematical universe, not a relational one
Tegmark says (p. 10 here) "the only intrinsic properties of a mathematical structure are its relations". So I think I am representing his actual argument for MUH in the OP. I think "the claim that the idea that some, but only some, maths exists materially, is unnecessary baggage" is meant to derive the Level IV multiverse from MUH.
...It isn't obvious that maths isn't a human invention. It isn't obvious that an external world has to be indep
Something I'm thinking about today: frontier LLMs have a pretty unusual capabilities profile. This means one of two things: either I should think of LLMs as leveraging massive amounts of necessary compute and the problems they can solve as much more compute-vulnerable than I thought they were (i.e. this is Deep Blue and everything is kind of chess) or multiple intelligences models are simply true, in that cognition has multiple parts that don't necessarily have anything to do with each other. The latter predicts that step changes in capabilities are availa...
I agree that this is the relevant consideration. I think that if cognition has many parts, we should actually expect some parts that humans use to be completely missing in LLMs (and vice-versa), and it's not clear to me whether I should expect scaling architecture to actually produce more parts in this way. I have some intuitions that say (for combinatorial reasons) that within a certain architecture, training dynamics will eventually stop favoring the formation of circuits past a certain size regardless of how many layers you stack, but I am not that conf...
I'm cautiously optimistic about my new Claude Coach GitHub repo. I want to work out more but hate trying to decide what to do and tracking things, especially when I'm not working with a full gym. Now I just open Claude Code and ask it what to do (specifying the gym), do the workout, then update it with what I did and how it felt. It creates a PR to track the session and update the plan.
I still hate working out, but at least I don't have to go anywhere, deal with any people, or think about it at all.
Perhaps my favourite relation in physics is
t/T = (l/L)^{1-k/2}.
This says that for a bunch of particles in a potential V = a x^k, if you let the system evolve over time T forming a path which has size L in some sense, then there is another path which is a re-scaled version of the original one s.t. if it has size l, then the time taken to form this new path is t.
We can use this trick to create a bunch of “scaling laws” for simple physical systems. For example:
1) Let V = a x^{-1}, i.e. a gravitational potential. Then we have k = -1, so
t/T = (l/L)^{3/2}
(t/T)^{2} ...
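A minimal numerical sanity check of the relation (a Python/scipy sketch, not part of the original derivation; the quartic potential and the amplitudes are just illustrative choices):

```python
from scipy.integrate import solve_ivp

# Check t/T = (l/L)^{1 - k/2} for V = a x^k, using a quartic potential
# (k = 4). The predicted exponent is 1 - 4/2 = -1, so doubling the
# amplitude should halve the period.
a, k = 1.0, 4

def rhs(t, y):
    x, v = y
    return [v, -a * k * x ** (k - 1)]  # F = -dV/dx

def period(amplitude):
    # Released from rest at x = amplitude, the quarter period is the time
    # to first reach x = 0, located with a terminal event.
    cross = lambda t, y: y[0]
    cross.terminal, cross.direction = True, -1
    sol = solve_ivp(rhs, (0, 100.0), [amplitude, 0.0],
                    events=cross, rtol=1e-10, atol=1e-12)
    return 4 * sol.t_events[0][0]

L_, l_ = 1.0, 2.0
print(period(l_) / period(L_))   # ~0.5 from the integration
print((l_ / L_) ** (1 - k / 2))  # 0.5 predicted by the scaling relation
```

And the k = -1 case gives the familiar Kepler-style t^2 ∝ l^3 once you square both sides.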
It'll one-shot easy cases, yeah. And if you want to convert to HTML/Unicode for places where you don't have direct LaTeX support, you can also have an LLM do that, albeit there are a lot of edge-cases and I don't think LLMs will usually use more exotic Unicode like FRACTION SLASH for things like '3/2' etc, so I have a big script for that: https://gwern.net/static/build/latex2unicode.py (Github).
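For the easy cases the substitution really is tiny; a toy sketch (Python, and emphatically not the linked script) of the FRACTION SLASH style of output:

```python
# Toy illustration only (not latex2unicode.py): render a simple fraction
# like 3/2 using U+2044 FRACTION SLASH plus superscript/subscript digits.
SUP = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUB = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def frac_to_unicode(num: str, den: str) -> str:
    return num.translate(SUP) + "\u2044" + den.translate(SUB)

print(frac_to_unicode("3", "2"))  # ³⁄₂
```

The hard part, as noted above, is the long tail of edge-cases beyond this.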
Do you know a person who believes that ASI will be created in <50 years who ISN'T in the LW/rationalists circle?
My parents don't believe that a superintelligent AI will be created within this century, or ever for that matter, or that AI will ever take jobs. My relatives laugh at the idea of AI solving a high school math problem and think state-of-the-art AI is on the level of GPT-2 (I mean that the capabilities they have in mind are on the level of GPT-2, not that they know what GPT-2 is). My friend who is an organic chemist laughs at the idea of AI doi...
I've been in the mostly-academic AI circles in the Boston area for decades. Lots of people in these circles think ASI is plausibly close. I think it's difficult to pay close technical attention to the field and not think that AI is currently par-human, and improving every year. Many of them disagree with the LessWrong consensus that it will be fatal, of course. Or simply haven't thought it through.
The four humours theory, in suggesting bloodletting, might be quite bad, but at the same time it might have recommended chicken soup for viral, influenza-like infections, which is better than the treatments of modern medicine.
I run a curated Discord for high agency people with Long COVID/ME (myalgic encephalomyelitis). The group includes tech founders, researchers, rationalists/EAs, etc. The focus is on troubleshooting each other's conditions actively, as well as creating a body of knowledge to bring back to the wider community in the form of writing, education, projects, companies, etc.
Some know me as Liface, others as Liam Rosen - I have been in the Rationality community for over 10 years, previously in the Bay, now in New York City, and am the main moderator of r/slatestarc...
Sorry to hear about your condition. Starting a community sounds good; I just emailed you. I went through a two-year Long COVID ordeal and now consider myself recovered. I've written several in-depth articles on Long COVID, and am working on a Long COVID dashboard (work in progress) to help make sense of all the trials conducted so far around the world.
I'm very confused why purchasing power varies so dramatically internationally. like why are there countries where everyone has very low wages but everything is also really cheap so it balances out? prima facie, huge disparities like this should get evened out by arbitrage.
the simple explanation is that some labor can only be performed locally, labor mobility is limited (immigration laws, people don't like moving, etc), and transportation costs for goods exist (shipping and tariffs).
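a toy version of that story with made-up numbers (nothing here is real data, just illustrating the mechanism):

```python
# Tradables arbitrage toward one world price; non-tradable services are
# priced at local wages; so the same basket costs much less where wages
# are lower, even with cheap shipping for the tradable part.
phone = 500.0                      # tradable good: roughly one world price
wage_rich, wage_poor = 30.0, 5.0   # hourly wages (hypothetical)
haircut_rich = 1.0 * wage_rich     # non-tradable: ~1 hour of local labor
haircut_poor = 1.0 * wage_poor

basket_rich = phone + 20 * haircut_rich   # basket: 1 phone + 20 haircuts
basket_poor = phone + 20 * haircut_poor
print(basket_poor / basket_rich)   # ~0.55: same basket, about half the price
```

the non-tradable share of the basket is what lets overall price levels diverge.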
however, global shipping is ridiculously cheap. and the economy increasingl...
Yeah, I guess there is just more friction than one would expect. I also found out that Slovaks are now more like 66% of the price of Germans, so either the margin is great or the income gap is smaller than I thought.
Also: the value of easy communication is hard to overstate. I always enjoy working with people who have a very similar background to mine (i.e. similar milieu growing up, not just the same country); communication is so smooth.
A model to track:
The adaptation is starting to happen in earnest, including by super-high-profile folks like Tao and other Fields medalists like Tim Gowers, so I'm less worried than Litt.
It seems like the switch where high-status brands went from making high quality products to featuring their name and logo prominently was good for both producer and consumer: the producer gets free advertising, and the consumer more clearly signals their wealth. I don't know the history of this, but I'm interested in how this happened – seems like there's a tough coordination problem, where if one brand/consumer switches then they just look tacky.
(Not all brands have switched, and maybe there's a new money/old money difference.)
Meditations on Meditation:
I’ve always noticed something about meditation that I’ve never thought to articulate aloud before. (Note: I’m using mantra meditation as an example.) A beginning meditation practitioner (which is all I have ever been) is told to focus their awareness on the mantra, notice when their mind wanders without judgement, and then redirect their thoughts to the mantra. Similar instructions are given when the focus is the breath, etc. However, my experience of meditation has never been that simple; my attention comes in layers. The first a...
I'm not sure there are specifically 4 layers, but I do assume my mind is finite. There are probably many subconscious layers I can't perceive, but I doubt I could control or focus them if I can't even perceive them, so I don't worry about them.
lingao qiming is the hardest scifi I've ever read. it puts other "hard" scifi like project hail mary or three body problem to shame. the basic conceit of the book is that it's an isekai where some people discover a wormhole to a parallel universe exactly like ours but during the time of ming dynasty china, and decide to bring 500 technical specialists and a bunch of modern supplies to the past to try and conquer ancient china. the vast majority of the book is devoted to discussing every single technical aspect in excruciating, well-researched detail. you do...
Yes, I would have to agree here. The parts that are sci-fi (almost none) are not hard, and the parts that are hard are not sci-fi.
I'm seeing more sophisticated LLM-slop in the LW moderation queue.
Eight months ago, I wrote "hey, we're getting tons of AI-psychosis'd people, deluded into thinking their crackpot coherence/spiralism/emergence/ChatGPTAwakening experience is true and meaningful. We process like 15-20 of these a day."
Nowadays, we still get some of those, but a lot less. Instead, I think we often now get a somewhat more sophisticated-looking kind of LLM slop. It's often something like:
"Me and ChatGPT have been working on some ML experiments for months, checking if we get s...
That makes sense, probably the majority are in this camp.
Placebo is powered by the belief of the person who uses it, not the person who researches it.
If person A takes the placebo to e.g. fix their blood pressure, and person B measures their blood pressure but doesn't know why, and person C organizes the experiment, the results should depend on A's beliefs -- because B doesn't have any, and C already receives the final numbers.
There are norms. Examples of norms: drive on the right side of the street. Do not ghost people. Text back your friends within a day. Do not post cringe. Do not distribute explicit materials without trigger warning. Do not enforce norms too tightly. Do not use LLMs in writing without letting people know. Norms are enforced by people (I will skip examples of how here).
Most of the norms are helpful, some are harmful. I'm particularly interested in norms around being cringe and creativity. Doing something but being unskilled at it is just soo discouraged...
There is no contradiction between something being "cringe" and "a rite of passage". Actually, cringe things probably make good rites, because the signaling is (psychologically) costly.
The leadership of CEA should resign.

https://forum.effectivealtruism.org/posts/XxXnPoGQ2eKsQx3FE/cea-s-response-to-sexual-harassment
The lesson I take here is that any organization that employs people should have a legal/HR expert available and actually call them immediately when anything unusual happens.
You may think that you are in the business of telling people how to send money to charities, but as soon as you employ people, you are also in the business of solving all kinds of things that concern the people you employ. And you most likely don't have any expertise in that, and things can get very serious, so you need to have someone's phone number and to actually use it.
You need an H...