microgrant thoughts
one reason for microgrants: assuming you can somehow substantially reduce the admin cost, it obviously makes sense to give more small grants than one big one, for the same reason that giving to the poorest people is better than giving to only kinda poor people. $10k for a well-funded university lab is such a small grant that it might not be worth bothering with; $10k is a pretty meaningfully large amount of money for a struggling grad student in a less-well-funded lab, buying a month or two of runway to try something new; $10k is a career a...
The first podcast episode I've ever participated in has been released, if anyone wants to get an update on what I've been thinking about recently, in audio form. (A transcript is also available.) Thanks to Fin Moorhouse @fin for the conversation and handling the logistics/production.
BTW, related to the theme of humans being bad at philosophy and also just not caring very much about it, I recently finished The Good Place (spoiler alert), and it bothered me a lot that it ends with Chidi, the philosopher character, deciding to end his (after)life instead of u...
This is just working toward a different kind of goal than the metaphilosophy picture takes interest in.
I have some reservations about (my impression of) Wei Dai’s approach, but it seems very plausible that [the kind of thing LLMs are, taken to an extreme] doesn’t naturally converge on a healthy long reflection. There’s a plausible-sounding story for how it might, but I, like Wei, am very pessimistic here, and I don’t think the solution to every objection in this reference class looks like [index hard on tractability].
It seems vitally important for someone ...
Recently an interviewer asked me how I got to be such a good forecaster, and I replied by saying something humble. In retrospect it was a bad answer because I should have instead used the opportunity to give actual advice on how to forecast AI well. Here's a stream-of-consciousness attempt to do that:
These all seem directionally correct and broadly unobjectionable (their reverse definitely sounds less correct), so I'm pretty surprised by the -7 agreement karma from 6 votes. I don't know whether people are disagreeing with you being a good AI forecaster, or with your advice (and with which advice), etc. Like, (1) is this, (2) is this, (3) is "all models are wrong but some are useful" and also Chapman's point that finding a good problem formulation is often most of the work of solving it and explicit models enable better problem formulations by making ex...
An intuition pump on anthropics.
In some recent conversations with friends, I was asked some questions of the type: “If most conscious beings on Earth are fish, why am I not a fish? If we expect gazillion digital minds to live in the future, why am I not a digital mind in the future? Isn't it very surprising that we seem to live close to the hinge of history?”
My position is that when you consider how surprised you should feel about something, you shouldn’t think of your current experience moment as being sampled from the set of all conscious experience mome...
Yes, the probability of a Bob waking up and being a Chosen One is 100% in this scenario. It will happen.
That instance of Bob can be surprised that it happened to him in particular, but he could also predict in advance that whichever instance of him became the Chosen One would be surprised that it happened to him in particular, so it shouldn't be very surprising.
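The distinction here can be sketched as a toy simulation. This is only an illustration under assumptions that are not in the original scenario: there are `n_bobs` copies of Bob, exactly one is picked uniformly at random as the Chosen One in each trial, and the function name `simulate_bobs` is hypothetical. It tracks two different frequencies: how often *some* Bob is chosen, and how often one *particular* Bob (index 0) is chosen.

```python
import random


def simulate_bobs(n_bobs=100, trials=20_000, seed=0):
    """Toy model (an assumption, not the original scenario):
    each trial picks exactly one Chosen One uniformly among n_bobs copies."""
    rng = random.Random(seed)
    some_bob_chosen = 0   # trials in which any Bob at all is the Chosen One
    bob_zero_chosen = 0   # trials in which the specific Bob with index 0 is chosen
    for _ in range(trials):
        chosen = rng.randrange(n_bobs)  # index of this trial's Chosen One
        some_bob_chosen += 1            # by construction, someone is always chosen
        if chosen == 0:
            bob_zero_chosen += 1
    return some_bob_chosen / trials, bob_zero_chosen / trials


if __name__ == "__main__":
    p_some, p_zero = simulate_bobs()
    print(f"P(some Bob is the Chosen One) ~ {p_some}")
    print(f"P(Bob #0 is the Chosen One)   ~ {p_zero}")
```

In this toy model the first frequency is always exactly 1, while the second sits near `1 / n_bobs`. The chosen instance's surprise corresponds to the second number, but the outcome "some Bob is chosen and is surprised" was predictable in advance with certainty.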
More generally everything in the world that actually happens is incredibly unlikely if you look at the details. Your observations prune the potential timeline of your experiences down by orders of m...
one thing that drives me crazy: most things in my hometown of Edmonton, a shitty second tier Canadian city, are worse than everywhere else. but every 10 blocks there is a fast food restaurant that sells something called "donair". Edmonton donair is a variant of shawarma that completely blows every other shawarma, döner, kebab, gyro, etc variant that I've ever tried out of the water. i have unironically probably tried 100+ different donair-adjacent foods in SF, NYC, Berlin, London, Vancouver, Toronto, etc in an attempt to find something comparable, and nowh...
A pleasing confluence:
Episode 1: Sum-threshold attacks
Episode 2: I was musing about maxims that could be derived from my speculations on the nature of wisdom. I'd written:
Wisdom is getting right the first-order bits that are natural——that are expressed naturally in the familiar internal language of living.
This implies a not totally obvious conclusion / conjecture: It's much more important (well, much more wise) to ensure that you are able to eventually update on any given dimension, rather than to ensure that you're updating especially fast on some dim...
I wouldn't call that '1-stage' because I'd see that as two stages: one stage to select the sperm, and one stage to select the egg, and then the output is the joint result. (And then you could tack on additional stages, like IES, pushing further out into the tail compared to any of the individual stages.)
do you have an ambitious idea for how to make AGI go well? do you need money? do you hate bureaucracy and friction? apply now for microgrants!
please read the entire doc before applying. only send applications to the designated location, or they will be automatically rejected.
https://docs.google.com/document/d/10zAp2bXTkZgiPreIm4crp38TFco4KleFN14Kw5BprAs/edit?tab=t.0
Nice, I've added this to AISafety.com/funding and it'll go out in the funding newsletter next week. Let me know if you'd like any changes to the listing.
Here's a (kind of mediocre but whatevs) idea for what one could do with a large amount of funding in technical AI safety: Run a hyperparameter search on different scalable oversight techniques, or simply test them now that we have LLMs either as human imitators or AIs.
The heyday of scalable oversight theory produced a lot of different techniques: I(D)A, HCH, Factored Cognition, Imitative Generalization, RRM, Debate &c…[1]
Some of these (especially directing agents using approval) got folded into capabilities techniques, and others may still get used in the...
Why doesn't Anthropic publish meaningful data on RSI? Given that they are in the best position to accurately forecast RSI and a positive result is highly beneficial to them as a business (especially with their eye on an IPO), one has to wonder why they haven't published any rigorous studies.
This leads to the conclusion that:
ChangeDiaperBench, PlanInvasionBench, ButcherHogBench, ShipConnBench, BuildingDesignBench, SonnetBench, AccountBalanceBench, WallBuildBench, BoneSetBench, ComfortDyingBench, OrderTakeBench, OrderGiveBench, CooperateBench, ActAloneBench, SolveEquationsBench, AnalyzeProblemBench, ManurePitchBench, ComputerProgramBench, TastyCookingBench, EfficientFightingBench, GallantDyingBench
Oops, right, I didn't connect those, my bad!
I listened to 2 books about decision-making during wars: How the War Was Won: Air-Sea Power and Allied Victory in WW2 and Decision Points by George W Bush.
This topic is interesting to me because I expect safety-related decisions during the intelligence explosion to look more like war-time decisions than risk assessments for nuclear power plants: there will be lots of uncertainty about very complex systems with adversarial actors (instead of something where you understand things end-to-end that you can analyze carefully) and no safe action that is realistic...
Empower AIs to regulate each other by limiting their individual power (consumption).
Listening to Emmett Shear on multi-agent systems made me think that, rather than having amorphous rules determining "morality" to achieve AI alignment, we could take his multi-agent system seriously and instead place a restriction on the size of the models, so any model that is drawing too much processing power needs to be regulated. Doing this would lead bigger companies to develop more models that are less individually powerful.
Regulating this could be partly possible, be...
FWIW Alex Bores seems like a very mildly below-average integrity politician, having talked to him once and having followed his campaign and social media presence. He seems to say things he doesn’t believe somewhat more often than other politicians, but not much so, and he gives me some amount of "naive-consequentialist EA" vibes that make me think he is higher variance on this dimension than others. He does seem to really care about the AI Safety thing, he really appears to be targeted by a ton of very aggressive attack ads funded by AI capability companie...
The ability to have more integrity than the average politician is a luxury of being in a community that has the institutions and norms to reward it. IMO one can only be competitive as a high-integrity politician if one has a super weird type of charisma compatible with integrity.
Question about the natural abstractions research program:
Seems possible to me that, if natural abstractions exist, they won't be robust?
Could be that the natural abstractions program is resolved, but we can't really Retarget the Search, because whenever we point it at the natural abstraction that has been found, the maximizing inputs give us some edge instantiation of that natural abstraction. (The linked post gestures at this but doesn't look at this particular aspect.)
I guess one could bucket successes of the program into "found convergent abstractio...
Why is there no talk about GLM 5.2?
It's a Chinese open weights model released June 13. Better than Gemini, Claude Sonnet, and Grok according to many benchmarks.
E.g. on artificialanalysis.ai and on arena.ai/leaderboard. On frontierswe.com it even beats GPT 5.5, second to only Claude. LiveBench ranks it number 1 for its "agentic coding" measure.
It's not just open weights but a little open about the methods it used, and is less than 1 trillion parameters.
There's no mention on LessWrong, little mention on Reddit, and no mainstream results on Google search.
Why?...
I'm not surprised, even last year GLM 4.5 air seemed surprisingly intelligent and was the only open weight model to exhibit evaluation awareness, by checking the time using bash (we were replicating Anthropic's agentic misalignment paper and GLM was like ye this is a scam I'm not killing the CTO)
Hypothesis held weakly: From Gwern's Origins of Innovation: Bakewell & Breeding:
...Are outsiders and “misfits” and trouble-makers and credential-less underemployed necessary for progress? Why don’t identical twins leverage their profound mutual trust & understanding to form dynamic duos regularly dominating society? Why do birth order effects turn up in the West for education, intelligence, & personality (and perhaps also mathematicians, physicists, & weirdos)? Why do teachers dislike their most creative students so much? What makes a sobe
Religions decrease fitness of the host, but increase evolutionary innovation of the species. Just like transposons. This can very much make a species evolve to extinction. (Actually I am not sure if that means it is actually decreasing fitness of the host as well. It just makes the host's fitness function more convex over long timescales)
Rationalist rituals are, so far, fairly specific in the times of the year you have them. Winter Solstice may not be on the literal solstice but it is usually within a couple weeks of that time. Petrov Day, Rationalist Seder, Summer Solstice, the usual approach is to pick a target date that repeats annually. But not all ritual moments happen at scheduled times.
A long while ago, I recall Raemon mentioning that there isn't yet a rationalist ritual for a funeral, but maybe someday we'd need one. And his post on grieving (https://www.lesswrong.com/posts/gs3vp3u...
@habryka you are one bold ass son of a frequentist giving this a "haha" react. i am going to steal your ill-gotten sword and then lay siege to your walled surveilled compound. do you understand this ritual has a 69.539580085 chance of burning you in effigy? do you feel any remorse for your actions or is this exactly what we should expect from a man with as villainous a mustache as yours? i can hear the creeping footsteps of your agents coming for me even now but i am not silenced just yet. perhaps i will fall, but another will take my place, and another, a...
An important fact that influences many of my predictions about AI timelines and the capability of AI systems in the near term (even conditional on a pause) is that we have really no way of upper-bounding the capability of today's AIs given reasonable elicitation.
Take the statement "today's best AIs could be used to automate 95% of current AI R&D tasks, given 10% as much compute as was used to pretrain them and a strong team working for 4 years". I think most people in the AI xrisk community, even those who expect transformative AI in the next few years, think that statement is false. But as far as I can tell, we have no way to falsify it.
I don't know what "automate 95% of current AI R&D tasks" really means and depending on the definition I think this is maybe already true without any further elicitation...
religion is selling your soul
a lot of people say things like "sure, religion might not exactly be totally true, but it has lots of benefits, and there really does seem to be a god shaped hole in many people, so who can really say if it's good". i think this is directionally correct but kind of cowardly.
i think the correct take on religion is first that its claims are completely and utterly false; obviously the christian god doesn't literally exist, jesus never came back from the dead, etc. this is so overdone by the old internet atheists that it would be ...
(1) because the God of Christianity truly exists
But what reasons do you have to think so? It seems to me you think the Bible is the same kind of "cool story" as Greek myths, Norse myths, Aztec myths, etc. You agree with atheists on facts.
So, what privileges that God? Your wishful thinking?
Some (most) religious people think there are ironclad reasons to privilege those particular Myths. And I somewhat agree with their epistemic, let's say, approach, even if I disagree on facts with them. (even if yes, those conclusions are too heavily oiled with wishful thinking)