All of gw's Comments + Replies

gw10

Yeah I think I would still make this bet. I think I would still count o3's 25% for the purposes of such a bet.

gw*31

I'm somewhat surprised to see the distribution of predictions for 75% on FrontierMath. Does anyone want to bet money on this, at say, 2:1 odds (my two dollars that this won't happen against your one that it will)?

(Edit: I guess the wording doesn’t exclude something like AlphaProof, which I wasn’t considering. I think I might bet 1:1 odds if systems targeted at math are included, as opposed to general purpose models?)

1winstonBosan
Is the bet for general purpose models still open? I guess it depends on the specific resolver/resolution criteria, considering that OpenAI has gotten the answers and solutions to most of the hard questions. Does o3's 25% even count?
gw10

I think you've already given several examples:

Should I count the people I spoke to for 15 minutes for free at the imbue potlucks? That was year-changing for at least one. But if I count them I have to count all of the free people ever, even those who were uninvested. Then people will respond “Okok, how many bounties have you taken on?” Ok sure, but should I include the people who I told “Your case is not my specialty, idk if i’ll be able to help, but I'm interested in trying for a few hours if you’re into it”? Should I include the people who had an amazing

... (read more)
2Chipmonk
oh ok hm. i also don't want to be incentivized to not give easy-for-me help to people with low odds of success though
gw21

Please, tell me what metric I should use here!

Is it feasible to just generate a bunch of such metrics, with details about what was included or not included in a particular number, and share all of them?

2Chipmonk
could you give a few examples?  also seems time-intensive hmmmm also, i thought about it more and i really like the metric of "results generated per hour"
gw61

Hazarding a guess from the frame of 'having the most impact' and not of 'doing the most interesting thing':

  • It might help a lot if a metacognitive assistant already has a lot of context on the work
  • If you think someone else is doing better work than you and you can 2x them, that's better than doing your individual work. (And if instead you can 3x or 4x people...)
6Raemon
I actually meant to say "x-risk focused individuals" there (not particularly researchers), and yes was coming from the impact side of things. (i.e. if you care about x-risk, one of the options available to you is to become a thinking assistant).
gw10

Additional major epidemics or scares that didn’t pan out ($50 for first few, $25 for later)

2014-15 HPAI outbreak in the US, which didn't ultimately make it to humans

2Elizabeth
This is outside the reference class I intended (needed at least one human case), but since I didn't specify that I'll award a token $10. Please let me know what your paypal is.
gw80

I want to add two more thoughts to the competitive deliberate practice bit:

Another analogy for the scale of humanity point:

If you try to get better at something but don't have the measuring sticks of competitive games, you end up not really knowing how good you objectively are. But most people don't even try to get better at things. So you can easily find yourself feeling like whatever local optimum you've ended up in is better than it is. 

I don't know anything about martial arts, but suppose you wanted to get really good at fighting people. Then an a... (read more)

3jmh
I like the point about the need for some type of external competitive measure, but as you say, there might not be an MMA gym where you need one. Shifting the metaphor, I think your observation about the sucker punch fits well with the insight that for those with only a hammer, all problems look like nails. The gym would be someone with a screwdriver or riveter as well as the hammer. But even lacking the external check, we should always ask ourselves "Is this really a nail?" I might only have a hammer, but if this isn't a nail, then while the results might be better than nothing (maybe?), I'm not likely to achieve the best that could be done some other way. And, of course, by watching what happens when the other hammer-only people solve the problem, and comparing the results to cases where we know the problem is a nail, we might learn something from their mistakes.
7Raemon
(FYI this is George from the essay, in case people were confused)
gw30

It is a bit early to tell and seems hard to accurately measure, but I note some concrete examples at the end.

Concrete examples aside, in plan making it's probably more accurate to call it purposeful practice than deliberate practice, but it seems super clear to me that in ~every place where you can deliberately practice, deliberate practice is just way better than whatever your default is of "do the thing a lot and passively gain experience". It would be pretty surprising to me if that mostly failed to be true of purposeful practice for plan making or other metacognitive skills.

2Algon
I agree it's hard to accurately measure. All the more important to figure out some way to test if it's working, though. And there are some reasons to think it won't. Deliberate practice works when your practice is as close to real-world situations as possible. The workshop mostly covered simple, constrained, clear-feedback events. It isn't obvious to me that planning problems in Baba Is You are like useful planning problems IRL. So how do you know there's transfer learning? Some data I'd find convincing that Raemon is teaching you things which generalize: the tools you learnt getting you unstuck on some existing big problems you have, which you've been stuck on for a while.
gw93

As a concrete example, as far as I can piece together from various things I have heard, Open Phil does not want to fund anything that is even slightly right of center in any policy work. I don't think this is because of any COIs, it's because Dustin is very active in the Democratic Party and doesn't want to be affiliated with anything that is even slightly right-coded. Of course, this has huge effects by incentivizing polarization of AI policy work with billions of dollars, since any Open Phil-funded AI policy organization that wants to engage with people

... (read more)
habryka170

Yep, my model is that OP does fund things that are explicitly bipartisan (like, they are not currently filtering on being actively affiliated with the left). My sense is in-practice it's a fine balance and if there was some high-profile thing where Horizon became more associated with the right (like maybe some alumni becomes prominent in the republican party and very publicly credits Horizon for that, or there is some scandal involving someone on the right who is a Horizon alumni), then I do think their OP funding would have a decent chance of being jeopar... (read more)

gw72

Do you have any data on whether outcomes are improving over time? For example, % published / employed / etc 12 months after a given batch

7Ryan Kidd
Great suggestion! We'll publish this in our next alumni impact evaluation, given that we will have longer-term data (with more scholars) soon.
gw42

I agree! This is mostly focused on the "getting a job" part though, which typically doesn't end up testing those other things you mention. I think this is the thing I'm gesturing at when I say that there are valid reasons to think that the software interview process feels like it's missing important details.

gw10

This might look like building influence / a career in the federal orgs that would be involved in nationalization, rather than a startup. Seems like positioning yourself to be in charge of nationalized projects would be the highest impact?

2jacquesthibs
I agree that this would be impactful! I'm mostly thinking about a more holistic approach that assumes you'd have reasonable access to 'the right people' in those government positions. Similar to the current status quo where you have governance people and technical people filling in the different gaps.
gw20

Your GitHub link is broken; it includes the period in the URL.

2Ollie J
Fixed, thanks for flagging
gw411

I
Love
Interesting
Alignment
Donferences

2TsviBT
Insightful Learning Implore Agreed Delta
5TsviBT
honestly i prefer undonfrences

ah that makes sense thanks

gw10

I spoke with some people last fall who were planning to do this, perhaps it's the same people. I think the idea (at least, as stated) was to commercialize regulatory software to fund some alignment work. At the time, they were going by Nomos AI, and it looks like they've since renamed to Norm AI.

2Linda Linsefors
I found this on their website. I'm not sure if this is worrying, because I don't think AI overseeing AI is a good solution. Or it's actually good, because, again, it's not a good solution, which might lead to some early warnings?
gw*1512

+ the obvious fact that it might matter to the kid that they're going to die

(edit: fwiw I broadly think people who want to have kids should have kids)

jefftk252

I'm sure this varies by kid, but I just asked my two older kids, age 9 and 7, and they both said they're very glad that we decided to have them even if the world ends and everyone dies at some point in the next few years.

Which makes lots of sense to me: they seem quite happy, and it's not surprising they would be opposed to never getting to exist even if it isn't a full lifetime.

2Timothy Underwood
Yeah, but assuming your p(doom) isn't really high, this needs to be balanced against the chance that AI goes well, and your kid has a really, really, really good life. I don't expect my daughter to ever have a job, but think that in more than half of worlds that seem possible to me right now, she has a very satisfying life -- one that is better than it would be otherwise in part because she never has a job.
6dr_s
I think the idea here was sort of "if the kid is unaware and death comes suddenly and swiftly they at least got a few years of life out of it"... cold as it sounds. But anyway this also assumes the EY kind of FOOM scenario rather than one of the many others in which people are around, and the world just gets shittier and shittier. It's a pretty difficult topic to grapple with, especially given how much regret can come with not having had children in hindsight. Can't say I have any answers for it. But it's obviously not as simple as this answer makes it.
gw20

Hmm, I have exactly one idea. Are you pressing shift+enter to start a new line? For me, if I do shift+enter

>! I don't get a spoiler

But if I hit regular enter then type >!, the spoiler tag pops up as I'm typing (don't need to wait to submit the question for it to appear)

2Ninety-Three
That's it! Thanks, I have no idea why shift+enter is special there.
gw20

Are you thinking of

Until Dawn?

(also it seems like I can get a spoiler tag to work in comments by starting a line with >! but not by putting text into :::spoiler [text] :::)

1Ninety-Three
That's the one. I couldn't get either solution to work: >! I am told this text should be spoilered :::spoiler And this text too:::
gw30

Interesting, thanks for the detailed responses here and above!

gw30

Here's a handwavy attempt from another angle:

Suppose you have a container of gas and you can somehow run time at 2x speed in that container. It would be obvious that from an external observer's point of view (where time is running at 1x speed) that sound would appear to travel 2x as fast from one end of the container to the other. But to the external observer, running time at 2x speed is indistinguishable from doubling the velocity of each gas molecule at 1x speed. So increasing the velocity of molecules (and therefore the temperature) should cause sound t... (read more)

4AnthonyC
This seems like it should work at first glance, but doesn't. The initial intervention (double particle speed, which quadruples total kinetic energy) at first quadruples pressure (each particle imparts twice as much momentum when it hits a wall and reverses direction, and hits the walls twice as often), but the velocity distribution is no longer thermal. In a statistical mechanics sense, you've added energy without adding any entropy, and that means the colloquial concept of temperature for a gas doesn't really apply, just like accelerating a car by applying macroscopic kinetic energy doesn't mean you increased its temperature (but over time, friction will thermalize the kinetic energy). After that initial transformation, I think the gas will thermalize over a fairly short time, but I'm not 100% sure. If it does, then in that case you've quadrupled total internal energy (1.5nRT, or equivalently 1.5PV, for a monatomic ideal gas), so I think it should stabilize at quadruple the T and P. Which, yes, will double v(sound), but doesn't tell you whether that's because of the T or the P or both. In any case this transformation put the system in an unnaturally low-entropy state, and so a lot of the usual assumptions about ideal gas behavior won't apply.
gw10

If I make the room bigger or smaller while holding T and P constant, v(sound) does not change. If it did, it would be very obvious in daily life.

This feels a bit too handwavy to me; I could say the same thing about temperature: if the speed of sound were affected by making a room hotter or colder, it would be very obvious in daily life, therefore the speed of sound doesn't depend on temperature. But it isn't obvious in daily life that the speed of sound changes based on temperature either.
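For reference, the standard textbook result for sound in an ideal gas (not derived anywhere in this thread, just quoted as a sanity check) makes the dependence explicit:

$$v_{\text{sound}} = \sqrt{\frac{\gamma P}{\rho}} = \sqrt{\frac{\gamma k_B T}{m}}$$

The second equality uses the ideal gas law ($P/\rho = k_B T / m$), so the pressure and density dependence collapses into temperature alone: at fixed T, changing the room's volume or pressure leaves $v_{\text{sound}}$ unchanged, while $v_{\text{sound}} \propto \sqrt{T}$.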

So now let's increase T. It doesn't matter what effect this has on P

... (read more)
4AnthonyC
To your first point: there is no scenario in daily life where we experience a change in absolute temperature spanning orders of magnitude (at most about 30%, from ~250K to ~325K), but we do experience room sizes that span multiple orders of magnitude (a closet vs. a concert hall vs. outdoors in an open grassland). Similarly, our experiences of pressure variation almost never span more than about +/-30% from 1 atm. So I maintain that a dependence on volume would be much more obvious than a dependence on temperature or pressure unless it were something like log(V) or V^.1 (which would be hard to reconcile with ending up with units of m/s), and even then there would be scenarios that would be very odd, like a long, narrow cave with an open mouth where the speed of sound was several times faster along the short axis than the long axis, or vice versa.

Or think about it this way. If you stand across a football field in a sealed, airtight indoor stadium and bang cymbals together, I hear it about 1/4 of a second later. If we do the same thing without the stadium in a place where there's nothing blocking sound propagation for many miles, I hear it about 1/4 of a second later. This happens even though the volume of gas is at least 6 orders of magnitude larger (say, if we call the distance to the horizon the new relevant room size) or much more than that (if we count the whole atmosphere as the room size).

To your second point - I agree this is not obvious. So, we have to dig deeper. What is sound? A pressure wave. What does that mean? Well it means you create a pattern of high and low pressure regions that propagates, for example by vibrating a membrane and imparting force to (initially randomly/thermally moving) molecules. Well why does it propagate? Because all the individual molecules move randomly, so the molecules in the high pressure regions tend to flow into the nearby low pressure regions more than the reverse (diffusion along concentration gradients). Now, could the
gw30

Worth noting that the scam attempt failed. We keep hearing ‘I almost fell for it’ and keep not hearing from anyone who actually lost money.

Here's a story where someone lost quite a lot of money through an AI-powered scam:

https://www.reuters.com/technology/deepfake-scam-china-fans-worries-over-ai-driven-fraud-2023-05-22/

gw30

We can question things, how it went this way or why we are all here with this problem now - but it does not add anything IMHO.

I think it adds something.  It's a bit strongly worded, but another way to see this is "could we have done any better, and if so, why?" Asking how we could have done better in the past lets us see ways to do better in the future.

3MiguelDev
Fair, I agree on this note. 
gw104

This post comes to mind as relevant: Concentration of Force

The effectiveness of force application often depends on its concentration—on whether you can amass locally superior force at the actual decisive moment.

gw64

As someone who is definitely not a political expert (and not from or super familiar with the UK), my guess would be that you just can't muster up enough political capital or will to try again. Taxpayer money (in the US at least) seems highly scrutinized; you typically can't just fail with a lot of money and have no one say anything about it.

So if the first try does fail, it requires more political capital to push for allocating a bunch of money again, and failing again looks really bad for anyone who led or supported that effort. Politician... (read more)

gw20

Is it possible to purchase the 2018 annual review books anywhere? I can find an Amazon link for the 2019 in stock, but the 2018 is out of stock (is that indefinite?).

6Raemon
We've got a new print run of them coming out soon. 
gw21

Re: "up-skilling": I think this is underestimating the value of developing maturity in an area before trying to do novel research. These are two separate skills, and developing both simultaneously from scratch doesn't seem like the fastest path to proficiency to me. Difficulties often multiply.

There is a long-standing certification for "proving you've learned to do novel research": the PhD. A prospective student would find it difficult to enter a grad program without any relevant coursework, and it's not because those institutions think such a student has the same chance of success as one who does.

gwΩ394

I think it's more fair to say humans were "trained" over millions of years of transfer learning, and an individual human is fine tuned using much less data than Chinchilla.

2Donald Hobson
I think humans and current deep learning models are running sufficiently different algorithms that the scaling curves of one don't apply to the other. This needn't be a huge difference. Convolutional nets are more data efficient than basic dense nets. 
4Yitz
Is that fair to say? How much Kolmogorov complexity can be encoded by evolution at a maximum, considering that all information transferred through evolution must be encoded in a single (stem) cell? Especially when we consider how genetically similar we are to beings which don’t even have brains, I have trouble imagining that the amount of “training data” encoded by evolution is very large.
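A rough back-of-envelope (assuming ~3.1 billion base pairs at 2 bits per base pair, which is already an upper bound given how much of the genome is shared with brainless organisms) supports this intuition:

$$3.1 \times 10^{9} \ \text{bp} \times 2 \ \text{bits/bp} \approx 6.2 \times 10^{9} \ \text{bits} \approx 0.8 \ \text{GB}$$

That is orders of magnitude less than the text used to train Chinchilla (~1.4 trillion tokens, i.e. several terabytes), so whatever evolution contributes, it can't be a large volume of raw "training data."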
gw2-10

Can we join the race to create dangerous AGI in a way that attempts to limit the damage it can cause, but allowing it to cause enough damage to move other pivotal acts into the Overton window?

If the first AGI created is designed to give the world a second chance, it may be able to convince the world that a second chance should not happen. Obviously this could fail and just end the world earlier, but it would certainly create a convincing argument.

In the early days of the pandemic, even though all the evidence was there, virtually no one cared about covid until it was knocking on their door,  and then suddenly pandemic preparedness seemed like the most obvious thing to everyone. 

[This comment is no longer endorsed by its author]