Quick Takes

simeon_c12974

Idea: Daniel Kokotajlo probably lost quite a bit of money by not signing an OpenAI NDA before leaving, which I consider a public service at this point. Could some of the funders of the AI safety landscape give some money or social reward for this?

I guess reimbursing everything Daniel lost might be a bit too much for funders, but providing some money, both to reward the act and to incentivize future safety people not to sign NDAs, would have very high value. 

Showing 3 of 22 replies
10AlexMennen
I wonder if it might be more effective to fund legal action against OpenAI than to compensate individual ex-employees for refusing to sign an NDA. Trying to take vested equity away from ex-employees who refuse to sign an NDA sounds unlikely to hold up in court, and if we can establish a legal precedent that OpenAI cannot do this, that might make other ex-employees much more comfortable speaking out against OpenAI than the possibility that third parties might fundraise to partially compensate them for lost equity would be (a possibility you might not even be able to make every ex-employee aware of). The fact that this would avoid financially rewarding OpenAI for bad behavior is also a plus. Of course, legal action is expensive, but so is the value of the equity that former OpenAI employees have on the line.

Yeah, at the time I didn't know how shady some of the contracts here were. I do think funding a legal defense is a marginally better use of funds (though my guess is funding both is worth it).

1wassname
Thanks, I hadn't seen that, I find it convincing.

On an apparent missing mood - FOMO on all the vast amounts of automated AI safety R&D that could (almost already) be produced safely 

Automated AI safety R&D could result in vast amounts of work produced quickly. E.g. from Some thoughts on automating alignment research (under certain assumptions detailed in the post): 

each month of lead that the leader started out with would correspond to 15,000 human researchers working for 15 months.

Despite this promise, we seem not to have much knowledge of when such automated AI safety R&D might happ... (read more)
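To make the quoted figure concrete, here's a back-of-the-envelope sketch; the serial speedup and number of parallel copies are illustrative assumptions chosen to reproduce the quoted number, not parameters taken from the linked post:

```python
# Back-of-envelope sketch. The speedup and parallelism numbers below are
# illustrative assumptions, not numbers taken from the linked post.
serial_speedup = 15        # assume each automated researcher works 15x human speed
parallel_copies = 15_000   # assume 15,000 automated researchers run in parallel
lead_months = 1            # months of lead the leader starts with

equivalent_researcher_months = lead_months * serial_speedup * parallel_copies
print(f"{parallel_copies:,} researchers x {lead_months * serial_speedup} months "
      f"= {equivalent_researcher_months:,} researcher-months")
# -> 15,000 researchers x 15 months = 225,000 researcher-months
```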

Showing 3 of 5 replies
1Bogdan Ionut Cirstea
Seems like probably the modal scenario to me too, but even limited exceptions like the one you mention seem to me like they could be very important to deploy at scale ASAP, especially if they could be deployed using non-x-risky systems (e.g. like current ones, very bad at DC evals). This seems good w.r.t. automated AI safety potentially 'piggybacking', but bad for differential progress. Sure, though wouldn't this suggest at least focusing hard on (measuring / eliciting) what might not come at the same time? 
2ryan_greenblatt
Why think this is important to measure or that this already isn't happening? E.g., on the current model organism related project I'm working on, I automate inspecting reasoning traces in various ways. But I don't feel like there is any particularly interesting thing going on here which is important to track (e.g. this tip isn't more important than other tips for doing LLM research better).

Intuitively, I'm thinking of all this as something like a race between [capabilities enabling] safety and [capabilities enabling dangerous] capabilities (related: https://aligned.substack.com/i/139945470/targeting-ooms-superhuman-models); so from this perspective, maintaining as large a safety buffer as possible (especially if not x-risky) seems great. There could also be something like a natural endpoint to this 'race', corresponding to being able to automate all human-level AI safety R&D safely (and then using this to produce a scalable solution to a... (read more)

keltan134

Note to self, write a post about the novel akrasia solutions I thought up before becoming a rationalist.

  • Figuring out how to want to want to do things
  • Personalised advertising of Things I Wanted to Want to Do
  • What I do when all else fails
5trevor
Have you tried whiteboarding-related techniques? I think that suddenly starting to use written media (even journals), in an environment without much or any guidance, is like pressing too hard on the gas; you're gaining incredible power and going from zero to one on things faster than you ever have before.  Depending on their environment and what they're interested in starting out, some people might learn (or be shown) how to steer quickly, whereas others might accumulate/scaffold really lopsided optimization power and crash and burn (e.g. getting involved in tons of stuff at once that upon reflection was way too much for someone just starting out).
keltan10

This seems incredibly interesting to me. Googling “White-boarding techniques” only gives me results about digitally shared idea spaces. Is this what you’re referring to? I’d love to hear more on this topic.

4keltan
Maybe I could even write a sequence on this?

Unfortunately, it looks like non-disparagement clauses aren't unheard of in general releases:

http://www.shpclaw.com/Schwartz-Resources/severance-and-release-agreements-six-6-common-traps-and-a-rhetorical-question

Release Agreements commonly include a “non-disparagement” clause – in which the employee agrees not to disparage “the Company.”

https://joshmcguirelaw.com/civil-litigation/adventures-in-lazy-lawyering-the-broad-general-release

The release had a very broad definition of the company (including officers, directors, shareholders, etc.), but a fairly reas

... (read more)
  1. We inhabit this real material world, the one which we perceive all around us (and which somehow gives rise to perceptive and self-conscious beings like us).
  2. Though not all of our perceptions conform to a real material world. We may be fooled by things like illusions or hallucinations or dreams that mimic perceptions of this world but are actually all in our minds.
  3. Indeed if you examine your perceptions closely, you'll see that none of them actually give you representations of the material world, but merely reactions to it.
  4. In fact, since the only evidence we
... (read more)
Showing 3 of 10 replies
2Richard_Kennaway
Dragging files around in a GUI is a familiar action that does known things with known consequences. Somewhere on the hard disc (or SSD, or somewhere in the cloud, etc.) there is indeed a "file" which has indeed been "moved" into a "folder", and taking off those quotation marks only requires some background knowledge (which in fact I have) of the lower-level things that are going on and which the GUI presents to me through this visual metaphor. Some explanations work better than others.

The idea that there is stuff out there that gives rise to my perceptions, and which I can act on with predictable results, seems to me the obvious explanation that any other contender will have to do a great deal of work to topple from the plinth.

The various philosophical arguments over doctrines such as "idealism", "realism", and so on are more like a musical recreation (see my other comment) than anything to take seriously as a search for truth. They are hardly the sort of thing that can be right or wrong, and to the extent that they are, they are all wrong. Ok, that's my personal view of a lot of philosophy, but I'm not the only one.

It sounds like you want to say things like "coherence and persistent similarity of structure in perceptions demonstrates that perceptions are representations of things external to the perceptions themselves" or "the idea that there is stuff out there seems the obvious explanation" or "explanations that work better than others are the best alternatives in the search for truth" and yet you also want to say "pish, philosophy is rubbish; I don't need to defend an opinion about realism or idealism or any of that nonsense". In fact what you're doing isn't some alternative to philosophy, but a variety of it.

2Richard_Kennaway
A lot of philosophy is like that. Or perhaps it is better compared to music. Music sounds meaningful, but no-one has explained what it means. Even so, much philosophy sounds meaningful, consisting of grammatical sentences with a sense of coherence, but actually meaning nothing. This is why there is no progress in philosophy, any more than there is in music. New forms can be invented and other forms can go out of fashion, but the only development is the ever-greater sprawl of the forest.

Several dozen people now presumably have Lumina in their mouths. Can we not simply crowdsource some assays of their saliva? I would chip in money for this. Key questions concern ethanol levels, aldehyde levels, antibacterial levels, and whether the organism itself stays colonized at useful levels.

Showing 3 of 4 replies
Lorxus10

Surely so! Hit me up if you ever end up doing this - I'm likely getting the Lumina treatment in a couple months.

4sapphire
Lumina is incredibly cheap right now. I pre-ordered for 250 USD. Even genuinely quite poor people I know don't find the price off-putting (poor in the sense of absolutely poor for the country they live in). I have never met a single person who decided not to try Lumina because the price was high. If they pass, it's always because they think it's risky.
8kave
I think Romeo is thinking of checking a bunch of mediators of risk (like aldehyde levels) as well as of function (like whether the organism stays colonised).
Wei Dai502

AI labs are starting to build AIs with capabilities that are hard for humans to oversee, such as answering questions based on large contexts (1M+ tokens), but they are still not deploying "scalable oversight" techniques such as IDA and Debate. (Gemini 1.5 report says RLHF was used.) Is this more good news or bad news?

Good: Perhaps RLHF is still working well enough, meaning that the resulting AI is following human preferences even out of training distribution. In other words, they probably did RLHF on large contexts in narrow distributions, with human rater... (read more)
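For concreteness, here is a minimal sketch of what a Debate-style oversight loop looks like, assuming a hypothetical `query_model` helper standing in for whatever LLM API is available; this is not any lab's implementation:

```python
# Minimal sketch of a two-player debate protocol (in the spirit of "AI safety
# via debate"). `query_model` is a hypothetical stand-in for an LLM API call.

def query_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API client."""
    raise NotImplementedError

def debate(question: str, rounds: int = 2) -> str:
    transcript = f"Question: {question}\n"
    for r in range(1, rounds + 1):
        for debater in ("A", "B"):
            argument = query_model(
                f"{transcript}\nDebater {debater}, argue for your answer and "
                f"point out flaws in the other debater's arguments (round {r})."
            )
            transcript += f"\nDebater {debater}, round {r}: {argument}"
    # A human (or a weaker, trusted model) acts as the judge over the transcript.
    return query_model(
        f"{transcript}\n\nAs the judge, say which debater argued more truthfully and why."
    )
```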

Showing 3 of 8 replies

Bad: AI developers haven't taken alignment seriously enough to have invested enough in scalable oversight, and/or those techniques are unworkable or too costly, causing them to be unavailable.

Turns out at least one scalable alignment team has been struggling for resources. From Jan Leike (formerly co-head of Superalignment at OpenAI):

Over the past few months my team has been sailing against the wind. Sometimes we were struggling for compute and it was getting harder and harder to get this crucial research done.

Even worse, apparently the whole Supera... (read more)

4ryan_greenblatt
I'm skeptical that increased scale makes hacking the reward model worse. Of course, it could (and likely will/does) make hacking human labelers more of a problem, but this isn't what the comment appears to be saying. Note that the reward model is of the same scale as the base model, so the relative scale should be the same. This also contradicts results from an earlier paper by Leo Gao. I think this paper is considerably more reliable than the comment overall, so I'm inclined to believe the paper or think that I'm misunderstanding the comment. Additionally, from first principles I think that RLHF sample efficiency should just increase with scale (at least with well tuned hyperparameters) and I think I've heard various things that confirm this.
2ryan_greenblatt
Oops, fixed.

If your endgame strategy involved relying on OpenAI, DeepMind, or Anthropic to implement your alignment solution that solves science / super-cooperation / nanotechnology, consider figuring out another endgame plan.

William_SΩ731669

I worked at OpenAI for three years, from 2021 to 2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using language models to generate explanations for features in language models. I was then promoted to managing a team of 4 people which worked on trying to understand language model features in context, leading to t... (read more)

Showing 3 of 34 replies
tlevin196

Kelsey Piper now reports: "I have seen the extremely restrictive off-boarding agreement that contains nondisclosure and non-disparagement provisions former OpenAI employees are subject to. It forbids them, for the rest of their lives, from criticizing their former employer. Even acknowledging that the NDA exists is a violation of it."

4PhilosophicalSoul
I have reviewed his post. Two (2) things to note:

(1) Invalidity of the NDA does not guarantee William will be compensated after the trial. Even if he is, his job prospects may be hurt long-term.

(2) States have different laws on whether the NLRA trumps internal company memorandums. More importantly, labour disputes are traditionally resolved through internal bargaining. Presumably, the collective bargaining 'hand-off' involving NDAs and gag orders at this level will waive subsequent litigation in district courts. The precedent Habryka offered refers to hostile severance agreements only, not the waiving of the dispute mechanism itself.

I honestly wish I could use this dialogue as a discreet communication to William on a way out, assuming he needs help, but I re-affirm my previous worries about the costs.

I also add here, rather cautiously, that there are solutions. However, it would depend on whether William was an independent contractor, how long he worked there, whether a trade secret was actually involved (as others have mentioned), and so on. The whole reason NDAs tend to be so effective is that they obfuscate the material needed even to know what remedies are available.
2wassname
Interesting! For most of us, this is outside our area of competence, so I appreciate your input.
niplav72

Just checked who from the authors of the Weak-To-Strong Generalization paper is still at OpenAI:

  • Collin Burns
  • Jan Hendrik Kirchner
  • Leo Gao
  • Bowen Baker
  • Yining Chen
  • Adrien Ecoffet
  • Manas Joglekar
  • Jeff Wu

Gone are:

  • Ilya Sutskever
  • Pavel Izmailov[1]
  • Jan Leike
  • Leopold Aschenbrenner

  1. Reason unknown ↩︎

quila94

(Personal) On writing and (not) speaking

I often struggle to find words and sentences that match what I intend to communicate.

Here are some problems this can cause:

  1. Wordings that are odd or unintuitive to the reader, but that are at least literally correct.[1]
  2. Not being able to express what I mean, and having to choose between not writing it or risking miscommunication by trying anyway. I tend to choose the former unless I'm writing to a close friend. Unfortunately this means I am unable to express some key insights to a general audience.
  3. Writing taking lots of
... (read more)
Showing 3 of 5 replies

Thank you, that is all very kind! ☺️☺️☺️

I expect if he continues being what he is, he'll produce lots of cool stuff which I'll learn from later.

I hope so haha

1quila
Record yourself typing?
2Emrik
quila42

At what point should I post content as top-level posts rather than shortforms?

For example, a recent writing I posted to shortform was ~250 concise words plus an image: 'Anthropics may support a 'non-agentic superintelligence' agenda'. It would be a top-level post on my blog if I had one set up (maybe soon :p).

Some general guidelines on this would be helpful.

4niplav
This is a good question, especially since there've been some short form posts recently that are high quality and would've made good top-level posts—after all, posts can be short.
Emrik10

Epic Lizka post is epic.

Also, I absolutely love the word "shard" but my brain refuses to use it because then it feels like we won't get credit for discovering these notions by ourselves. Well, also just because the words "domain", "context", "scope", "niche", "trigger", "preimage" (wrt a neural function/policy / "neureme") adequately serve the same purpose and are currently more semantically/semiotically granular in my head.

trigger/preimage ⊆ scope ⊆ domain

"niche" is a category in function space (including domain, operation, and codomain), "domain" is a set.

"scope" is great because of programming connotations and can be used as a verb. "This neural function is scoped to these contexts."

elifland5039

The word "overconfident" seems overloaded. Here are some things I think that people sometimes mean when they say someone is overconfident:

  1. They gave a binary probability that is too far from 50% (I believe this is the original one)
  2. They overestimated a binary probability (e.g. they said 20% when it should be 1%)
  3. Their estimate is arrogant (e.g. they say there's a 40% chance their startup fails when it should be 95%), or maybe they give an arrogant vibe
  4. They seem too unwilling to change their mind upon arguments (maybe their credal resilience is too high)
  5. They g
... (read more)

 Moore & Schatz (2017) made a similar point about different meanings of "overconfidence" in their paper The three faces of overconfidence. The abstract:

Overconfidence has been studied in 3 distinct ways. Overestimation is thinking that you are better than you are. Overplacement is the exaggerated belief that you are better than others. Overprecision is the excessive faith that you know the truth. These 3 forms of overconfidence manifest themselves under different conditions, have different causes, and have widely varying consequences. It is a mist

... (read more)
4Daniel Kokotajlo
I feel like this should be a top-level post.
3Garrett Baker
When I accuse someone of overconfidence, I usually mean they're being too hedgehogy when they should be being more foxy.

For anyone interested in Natural Abstractions type research: https://arxiv.org/abs/2405.07987

Claude summary:

Key points of "The Platonic Representation Hypothesis" paper:

  1. Neural networks trained on different objectives, architectures, and modalities are converging to similar representations of the world as they scale up in size and capabilities.

  2. This convergence is driven by the shared structure of the underlying reality generating the data, which acts as an attractor for the learned representations.

  3. Scaling up model size, data quantity, and task dive

... (read more)
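For a concrete handle on "converging representations", here is a sketch using linear CKA, a standard representation-similarity metric; it is not necessarily the metric the paper itself uses:

```python
# Sketch: quantify representational similarity between two models' embeddings of
# the same inputs with linear CKA (a standard metric; not necessarily the paper's).
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """X: (n, d1), Y: (n, d2) -- embeddings of the same n inputs."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    return (np.linalg.norm(X.T @ Y, "fro") ** 2
            / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")))

rng = np.random.default_rng(0)
inputs = rng.normal(size=(1000, 64))        # stand-in for shared underlying features
emb_a = inputs                              # "model A" representation
emb_b = inputs @ rng.normal(size=(64, 32))  # "model B": different basis and width
emb_c = rng.normal(size=(1000, 32))         # unrelated representation (baseline)

print(round(linear_cka(emb_a, emb_b), 3))   # clearly above the unrelated baseline
print(round(linear_cka(emb_a, emb_c), 3))   # near zero
```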

This sounds really intriguing. I would like someone who is familiar with natural abstraction research to comment on this paper.

Epistemic status: not a lawyer, but I've worked with a lot of them.

As I understand it, an NDA isn't enforceable against a subpoena (though the former employer can seek a protective order for the testimony).   Someone should really encourage law enforcement or Congress to subpoena the OpenAI resigners...

A subpoena for what?

Decomposability seems like a fundamental assumption for interpretability and a condition for it to succeed. E.g. from Toy Models of Superposition:

'Decomposability: Neural network activations which are decomposable can be decomposed into features, the meaning of which is not dependent on the value of other features. (This property is ultimately the most important – see the role of decomposition in defeating the curse of dimensionality.) [...]

The first two (decomposability and linearity) are properties we hypothesize to be widespread, while the latte... (read more)
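As a toy sketch of what "decomposable into features" buys you, assuming the feature dictionary is handed to us (real interpretability work has to learn it, e.g. with sparse autoencoders):

```python
# Toy sketch of decomposability: an activation vector is a sparse linear
# combination of feature directions, and sparse recovery identifies which
# features are active. Illustrative only; the dictionary is given here,
# whereas real work has to learn it from activations.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n_features, d_model = 256, 64                 # more features than dimensions (superposition)
dictionary = rng.normal(size=(n_features, d_model))
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

true_coeffs = np.zeros(n_features)
true_coeffs[[3, 17, 42]] = [1.0, 0.5, 2.0]    # only a few features active at once
activation = true_coeffs @ dictionary         # the observed activation, shape (d_model,)

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3, fit_intercept=False)
omp.fit(dictionary.T, activation)             # activation ≈ dictionary.T @ sparse_coeffs
print(np.flatnonzero(omp.coef_))              # expected: [ 3 17 42]
```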

1Bogdan Ionut Cirstea
Related, from The “no sandbagging on checkable tasks” hypothesis:

Quote from Shulman’s discussion of the experimental feedback loops involved in being able to check how well a proposed “neural lie detector” detects lies in models you’ve trained to lie: 

A quite early example of this is Collin Burns's work, doing unsupervised identification of some aspects of a neural network that are correlated with things being true or false. I think that is important work. It's a kind of obvious direction for the stuff to go. You can keep improving it when you have AIs that you're training to do their best to deceive humans or other

... (read more)
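For concreteness, a minimal sketch of a CCS-style probe in the spirit of Burns's unsupervised approach, run on placeholder hidden states rather than real model activations:

```python
# Minimal sketch of a CCS-style (Contrast-Consistent Search) probe: on hidden
# states for "statement is true" / "statement is false" contrast pairs, train a
# probe whose two outputs are consistent (sum to ~1) and confident.
# The hidden states below are random placeholders, not real model activations.
import torch

def ccs_loss(probe, h_pos, h_neg):
    p_pos = torch.sigmoid(probe(h_pos)).squeeze(-1)
    p_neg = torch.sigmoid(probe(h_neg)).squeeze(-1)
    consistency = (p_pos - (1 - p_neg)) ** 2       # p(true) + p(false) should be ~1
    confidence = torch.minimum(p_pos, p_neg) ** 2  # rule out the degenerate p=0.5 answer
    return (consistency + confidence).mean()

d_model = 128
h_pos = torch.randn(256, d_model)   # placeholder: activations on "X? Yes." prompts
h_neg = torch.randn(256, d_model)   # placeholder: activations on "X? No." prompts

probe = torch.nn.Linear(d_model, 1)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = ccs_loss(probe, h_pos, h_neg)
    loss.backward()
    opt.step()
print(float(loss))  # loss decreases; with real activations the probe tracks truth-like structure
```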

A list of some contrarian takes I have:

  • People are currently predictably too worried about misuse risks

  • What people really mean by "open source" vs "closed source" labs is actually "responsible" vs "irresponsible" labs, which is not affected by regulations targeting open source model deployment.

  • Neuroscience as an outer alignment[1] strategy is embarrassingly underrated.

  • Better information security at labs is not clearly a good thing, and if we're worried about great power conflict, probably a bad thing.

  • Much research on deception (Anthropic's re

... (read more)
Showing 3 of 10 replies

Ah yes, another contrarian opinion I have:

  • Big AGI corporations, like Anthropic, should by default make much of their AGI alignment research private, and not share it with competing labs. Why? So it can remain a private good, and on the off chance such research can be expected to be profitable, those labs & investors can be rewarded for that research.
4Olli Järviniemi
I talked about this with Garrett; I'm unpacking the above comment and summarizing our discussions here.

  • Sleeper Agents is very much in the "learned heuristics" category, given that we are explicitly training the behavior in the model. Corollary: the underlying mechanisms for sleeper-agents behavior and instrumentally convergent deception are presumably wildly different(!), so it's not obvious how valid an inference one can make from the results.
  • Consider framing Sleeper Agents as training a trojan instead of as an example of deception. See also Dan Hendrycks' comment.
  • Much of existing work on deception suffers from "you told the model to be deceptive, and now it deceives, of course that happens".
  • (Garrett thought that the Uncovering Deceptive Tendencies paper has much less of this issue, so yay.)
  • There is very little work on actual instrumentally convergent deception(!); a lot of work falls into the "learned heuristics" category or the failure in the previous bullet point.
  • People are prone to conflate "shallow, trained deception" (e.g. sycophancy: "you rewarded the model for leaning into the user's political biases, of course it will start leaning into users' political biases") with instrumentally convergent deception.
  • (For more on this, see also my writings here and here. My writings fail to discuss the most shallow versions of deception, however.)

Also, we talked a bit about this, and I interpreted Garrett as saying that people often consider too few and too shallow hypotheses for their observations, and are loose with verifying whether their hypotheses are correct.

Example 1: I think the Uncovering Deceptive Tendencies paper has some of this failure mode. E.g. in experiment A we considered four hypotheses to explain our observations, and these hypotheses are quite shallow/broad (e.g. "deception" includes both very shallow deception and instrumentally convergent deception).

Example 2: People generally seem to have an opinion of "chain-of-t
2Garrett Baker
I will clarify this. I think people often do causal interventions in their CoTs, but not in ways that are very convincing to me.

I thought Superalignment was a positive bet by OpenAI, and I was happy when they committed to putting 20% of their current compute (at the time) towards it. I stopped thinking about that kind of approach because OAI already had competent people working on it. Several of them are now gone.

It seems increasingly likely that the entire effort will dissolve. If so, OAI has now made the business decision to invest its capital in keeping its moat in the AGI race rather than in basic safety science. This is bad and likely another early sign of what's to come.

I think ... (read more)

kromem10

It's going to have to.

Ilya is brilliant and seems to really see the horizon of the tech, but maybe isn't the best at the business side to see how to sell it.

But this is often the curse of the ethically pragmatic. There is such a focus on the ethics part by the participants that the business side of things only sees that conversation and misses the rather extreme pragmatism.

As an example, would superaligned CEOs in the oil industry fifty years ago have still only kept their eye on quarterly share prices or considered long term costs of their choices? There'... (read more)

3Bogdan Ionut Cirstea
Strongly agree; I've been thinking for a while that something like a public-private partnership involving at least the US government and the top US AI labs might be a better way to go about this. Unfortunately, recent events seem in line with it not being ideal to only rely on labs for AI safety research, and the potential scalability of automating it should make it even more promising for government involvement. [Strongly] oversimplified, the labs could provide a lot of the in-house expertise, the government could provide the incentives, public legitimacy (related: I think of a solution to aligning superintelligence as a public good) and significant financial resources.

My timelines are lengthening. 

I've long been a skeptic of scaling LLMs to AGI*. I fundamentally don't understand how this is even possible. It must be said that very smart people give this view credence: davidad, dmurfet. On the other side are Vanessa Kosoy and Steven Byrnes. When pushed, proponents don't actually defend the position that a large enough transformer will create nanotech or even obsolete their job. They usually mumble something about scaffolding.

I won't get into this debate here but I do want to note that my timelines have lengthe... (read more)

Showing 3 of 21 replies
5Nathan Helm-Burger
My view is that there are huge algorithmic gains in peak capability, training efficiency (less data, less compute), and inference efficiency waiting to be discovered, and available to be found by a large number of parallel research hours invested by a minimally competent multimodal-LLM-powered research team. So it's not that scaling leads to ASI directly, it's:

1. Scaling leads to brute-forcing the LLM agent across the threshold of AI research usefulness.
2. Using these LLM agents in a large research project can lead to rapidly finding better ML algorithms and architectures.
3. Training these newly discovered architectures at large scales leads to much more competent automated researchers.
4. This process repeats quickly over a few months or years.
5. This process results in AGI.
6. AGI, if instructed (or allowed, if it's agentically motivated on its own to do so) to improve itself, will find even better architectures and algorithms.
7. This process can repeat until ASI. The resulting intelligence / capability / inference speed goes far beyond that of humans.

Note that this process isn't inevitable; there are many points along the way where humans can (and should, in my opinion) intervene. We aren't disempowered until near the end of this.
4Alexander Gietelink Oldenziel
Why do you think there are these low-hanging algorithmic improvements?

My answer to that is currently in the form of a detailed 2 hour lecture with a bibliography that has dozens of academic papers in it, which I only present to people that I'm quite confident aren't going to spread the details. It's a hard thing to discuss in detail without sharing capabilities thoughts. If I don't give details or cite sources, then... it's just, like, my opinion, man. So my unsupported opinion is all I have to offer publicly. If you'd like to bet on it, I'm open to showing my confidence in my opinion by betting that the world turns out how I expect it to.

Yesterday Greg Sadler and I met with the President of the Australian Association of Voice Actors. Like us, they've been lobbying for more and better AI regulation from government. I was surprised how much overlap we had in concerns and potential solutions:
1. Transparency and explainability of AI model data use (concern)

2. Importance of interpretability (solution)

3. Mis/dis information from deepfakes (concern)

4. Lack of liability for the creators of AI if any harms eventuate (concern + solution)

5. Unemployment without safety nets for Australians (concern)

6.... (read more)
