
johnswentworth's Shortform — LessWrong

27th Feb 2020

This is a special post for quick takes by johnswentworth. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

johnswentworth (1y)

I think a very common problem in alignment research today is that people focus almost exclusively on a specific story about strategic deception/scheming, and that story is a very narrow slice of the AI extinction probability mass. At some point I should probably write a proper post on this, but for now here are a few off-the-cuff example AI extinction stories which don't look like the prototypical scheming story. (These are copied from a Facebook thread.)

Perhaps the path to superintelligence looks like applying lots of search/optimization over shallow heuristics. Then we potentially die to things which aren't smart enough to be intentionally deceptive, but nonetheless have been selected-upon to have a lot of deceptive behaviors (via e.g. lots of RL on human feedback).
The "Getting What We Measure" scenario from Paul's old "What Failure Looks Like" post.
The "fusion power generator scenario".
Perhaps someone trains a STEM-AGI, which can't think about humans much at all. In the course of its work, that AGI reasons that an oxygen-rich atmosphere is very inconvenient for manufacturing, and aims to get rid of it. It doesn't think about humans at all, but the human operators can't understand

... (read more)

johnswentworth (1y)

Also (separate comment because I expect this one to be more divisive): I think the scheming story has been disproportionately memetically successful largely because it's relatively easy to imagine hacky ways of preventing an AI from intentionally scheming. And that's mostly a bad thing; it's a form of streetlighting.

Buck (1y)

Most of the problems you discussed here more easily permit hacky solutions than scheming does.

Zvi (1y)

Individually, for a particular manifestation of each issue, this is true: you can imagine a hacky solution to each one. But that assumes there is a list of such particular problems such that if you check off all the boxes you win, rather than them being manifestations of broader problems. You do not want to get into a hacking contest if you're not confident your list is complete.

johnswentworth (1y)

True, but Buck's claim is still relevant as a counterargument to my claim about memetic fitness of the scheming story relative to all these other stories.

Nathan Helm-Burger (1y)

This is an interesting point. I disagree that scheming vs these ideas you mention is much of a 'streetlighting' case. I do, however, have my own fears that 'streetlighting' is occurring and causing some hard-but-critical avenues of risk to be relatively neglected.

[Edit: on further thought, I think this might not just be a "streetlighting" effect, but also a "keeping my hands clean" effect. I think it's more tempting, especially for companies, to focus on harms that could plausibly be construed as being their fault. It's my impression that, for instance, employees of a given company might spend a disproportionate amount of time thinking about how to keep their company's product from harming people vs the general class of products from harming people. They are also less inclined to think about harm which could be averted via application of their product. This is an additional reason for concern that having the bulk of AI safety work funded by / done in AI companies will lead to correlated oversights.]

My concerns that I think are relatively neglected in AI safety discourse are mostly related to interactions with incompetent or evil humans. Good alignment and control techniques don't do any good if someone opts not to use them at some critical juncture. Some potential scenarios:

* If AI is very powerful, and held in check tenuously by fragile control systems, it might be released from control by a single misguided human or some unlucky chain of events, and then go rogue.
* If algorithmic progress goes surprisingly quickly, we might find ourselves in a regime where a catastrophically dangerous AI can be assembled from some mix of pre-existing open-weights models, plus fine-tuning, plus new models trained with new algorithms, and probably all stitched together with hacky agent frameworks. Then all it would take would be for sufficient hints about this algorithmic discovery to leak, and someone in the world to reverse-engineer it, and then there would be potent rogue AI

Buck (1y)

IMO the main argument for focusing on scheming risk is that scheming is the main plausible source of catastrophic risk from the first AIs that either pose substantial misalignment risk or that are extremely useful (as I discuss here). These other problems all seem like they require the models to be way smarter in order for them to be a big problem. Though as I said here, I'm excited for work on some non-scheming misalignment risks.

johnswentworth (1y)

scheming is the main plausible source of catastrophic risk from the first AIs that either pose substantial misalignment risk or that are extremely useful...

Seems quite wrong. The main plausible source of catastrophic risk from the first AIs that either pose substantial misalignment risk or that are extremely useful is that they cause more powerful AIs to be built which will eventually be catastrophic, but which have problems that are not easily iterable-upon (either because problems are hidden, or things move quickly, or ...).

And causing more powerful AIs to be built which will eventually be catastrophic is not something which requires a great deal of intelligent planning; humanity is already racing in that direction on its own, and it would take a great deal of intelligent planning to avert it. This story, for example:

People try to do the whole "outsource alignment research to early AGI" thing, but the human overseers are themselves sufficiently incompetent at alignment of superintelligences that the early AGI produces a plan which looks great to the overseers (as it was trained to do), and that plan totally fails to align more-powerful next-gen AGI at all. And at that point, they

... (read more)

Buck (1y)

People try to do the whole "outsource alignment research to early AGI" thing, but the human overseers are themselves sufficiently incompetent at alignment of superintelligences that the early AGI produces a plan which looks great to the overseers (as it was trained to do), and that plan totally fails to align more-powerful next-gen AGI at all. And at that point, they're already on the more-powerful next gen, so it's too late.
This story sounds clearly extremely plausible (do you disagree with that?), involves exactly the sort of AI you're talking about ("the first AIs that either pose substantial misalignment risk or that are extremely useful"), but the catastrophic risk does not come from that AI scheming.

This problem seems important (e.g. it's my last bullet here). It seems to me much easier to handle, because if this problem is present, we ought to be able to detect its presence by using AIs to do research on other subjects that we already know a lot about (e.g. the string theory analogy here). Scheming is the only reason why the model would try to make it hard for us to notice that this problem is present.

johnswentworth (1y)

A few problems with this frame.

First: you're making reasonably-pessimistic assumptions about the AI, but very optimistic assumptions about the humans/organization. Sure, someone could look for the problem by using AIs to do research on other subjects that we already know a lot about. But that's a very expensive and complicated project - a whole field, and all the subtle hints about it, need to be removed from the training data, and then a whole new model trained! I doubt that a major lab is going to seriously take steps much cheaper and easier than that, let alone something that complicated.

One could reasonably respond "well, at least we've factored apart the hard technical bottleneck from the part which can be solved by smart human users or good org structure". Which is reasonable to some extent, but also... if a product requires a user to get 100 complicated and confusing steps all correct in order for the product to work, then that's usually best thought of as a product design problem, not a user problem. Making the plan at least somewhat robust to people behaving realistically less-than-perfectly is itself part of the problem.

Second: looking for the problem by testing on other f... (read more)

Dakara (1y)

All 3 points seem very reasonable, looking forward to Buck's response to them.

Dakara (1y)

Additionally, I am curious to hear if Ryan's views on the topic are similar to Buck's, given that they work at the same organization.

Charlie Steiner (1y)

One big reason I might expect an AI to do a bad job at alignment research is if it doesn't do a good job (according to humans) of resolving cases where humans are inconsistent or disagree. How do you detect this in string theory research? Part of the reason we know so much about physics is humans aren't that inconsistent about it and don't disagree that much. And if you go to sub-topics where humans do disagree, how do you judge its performance (because 'be very convincing to your operators' is an objective with a different kind of danger). Another potential red flag is if the AI gives humans what they ask for even when that's 'dumb' according to some sophisticated understanding of human values. This could definitely show up in string theory research (note when some ideas suggest non-string-theory paradigms might be better, and push back on the humans if the humans try to ignore this), it's just intellectually difficult (maybe easier in loop quantum gravity research heyo gottem) and not as salient without the context of alignment and human values.

avturchin (1y)

I once counted several dozen ways in which AI could cause human extinction; maybe some of these ideas will help (map, text).

_will_ (1y)

See also ‘The Main Sources of AI Risk?’ by Wei Dai and Daniel Kokotajlo, which puts forward 35 routes to catastrophe (most of which are disjunctive). (Note that many of the routes involve something other than intent alignment going wrong.)

Johannes C. Mayer (1y)

Another one: We manage to solve alignment to a significant extent. The AI, which is much smarter than a human, thinks that it is aligned, and takes aligned actions. The AI even predicts that it will never become unaligned to humans. However, at some point in the future, as the AI naturally unrolls into a reflectively stable equilibrium, it becomes unaligned.

Karl Krueger (1y)

I see a lot of discussion of AI doom stemming from research, business, and government / politics (including terrorism). Not a lot about AI doom from crime. Criminals don't stay in the box; the whole point of crime is to benefit yourself by breaking the rules and harming others. Intentional creation of intelligent cybercrime tools — ecosystems of AI malware, exploit discovery, spearphishing, ransomware, account takeovers, etc. — seems like a path to uncontrolled evolution of explicitly hostile AGI, where a maxim of "discover the rules; break them; profit" is designed-in.

Towards_Keeperhood (1y)

Agreed that people focus a bit too much on scheming. It might be good for some people to think a bit more about the other failure modes you described, but the main thing that needs doing is very smart people making progress towards building an aligned AI, not defending against particular failure modes. (However, most people probably cannot usefully contribute to that, so maybe focusing on failure modes is still good for most people. In any case, though, there's the problem that people will find proposals that very likely don't actually work but that they can nonetheless believe will work, thereby making an AI stop a bit less likely.)

lunatic_at_large (1y)

My initial reaction is that at least some of these points would be covered by the Guaranteed Safe AI agenda if that works out, right? Though the "AGIs act much like a colonizing civilization" situation does scare me because it's the kind of thing which locally looks harmless but collectively is highly dangerous. It would require no misalignment on the part of any individual AI.

Kajus (1y)

Some of the stories assume a lot of AIs; wouldn't a lot of human-level AIs be very good at creating a better AI? Also, it seems implausible to me that we will get a STEM-AGI that doesn't think about humans much but is powerful enough to get rid of the atmosphere. On a different note, evaluating the plausibility of scenarios is a whole different thing that very few people in AI safety do and write about.

Milan W (1y)

That is a pretty reasonable assumption. AFAIK that is what the labs plan to do.

Kajus (1y)

What I think is that there won't be a period longer than 5 years where we have a lot of AIs and no superhuman AI. Basically, the first thing AIs will be used for will be self-improvement, and quickly after reasonable AI agents we will get superhuman AI. Like 6 years.

ozziegooen (1y)

This came from a Facebook thread where I argued that many of the main ways AI was described as failing fall into a few categories (John disagreed). I appreciated this list, but they strike me as fitting into a few clusters. Personally, I like the focus "scheming" has. At the same time, I imagine there are another 5 to 20 clean concerns we should also focus on (some of which have been getting attention). While I realize there's a lot we can't predict, I think we could do a much better job just making lists of different risk factors and allocating research amongst them.

johnswentworth (6mo)

In response to the Wizard Power post, Garrett and David were like "Y'know, there's this thing where rationalists get depression, but it doesn't present like normal depression because they have the mental habits to e.g. notice that their emotions are not reality. It sounds like you have that."

... and in hindsight I think they were totally correct.

Here I'm going to spell out what it felt/feels like from inside my head, my model of where it comes from, and some speculation about how this relates to more typical presentations of depression.

Core thing that's going on: on a gut level, I systematically didn't anticipate that things would be fun, or that things I did would work, etc. When my instinct-level plan-evaluator looked at my own plans, it expected poor results.

Some things which this is importantly different from:

Always feeling sad
Things which used to make me happy not making me happy
Not having energy to do anything

... but importantly, the core thing is easy to confuse with all three of those. For instance, my intuitive plan-evaluator predicted that things which used to make me happy would not make me happy (like e.g. dancing), but if I actually did the things they still made me ha... (read more)

David Lorell (6mo)

This seems basically right to me, yup. And, as you imply, I also think the rat-depression kicked in for me around the same time likely for similar reasons (though for me an at-least-equally large thing that roughly-coincided was the unexpected, disappointing and stressful experience of the funding landscape getting less friendly for reasons I don't fully understand.) Also some part of me thinks that the model here is a little too narrow but not sure yet in what way(s).

Ruby (6mo)

This matches with the dual: mania. All plans, even terrible ones, seem like they'll succeed, and this has flow-through effects to elevated mood, hyperactivity, etc.

Whether or not this happens in all minds, the fact that people can alternate fairly rapidly between depression and mania with minimal trigger suggests there can be some kind of fragile "chemical balance" or something that's easily upset. It's possible that's just in mood disorders and more stable minds are just vulnerable to the "too many negative updates at once" thing without greater instability.

Alexei (6mo)

Wow….. 1. I think I might have this. Will test immediately. 2. This needs to be a top level post.

dr_s (6mo)

I imagine part of the problem is also then the feedback loop of Things Don't Go Well > Why Even Bother > Things Don't Go Well. Which if anything you'd expect that sort of proactive approach that simply does the thing anyway to break. I do wonder though if there may also be entirely internal feedback loops (like neuroreceptors or something) once the negativity is triggered by external events. I would assume so, or depression wouldn't need to be treated pharmaceutically as much as it is.

Ulisse Mini (6mo)

EDIT: it's also possible John felt fine emotionally and was fully aware of his emotional state and actually was so good at not latching on to emotions that it was highly nontrivial to spot, or some combination. Leaving this comment in case it's useful for others. I don't like the tone though, I might've been very dissociated as a rationalist (and many are) but it's not obvious John is from this alone or not.

As a meditator I pay a lot of attention to what emotion I'm feeling in high resolution and the causality between it and my thoughts and actions. I highly recommend this practice. What John describes in "plan predictor predicts failure" is something I notice several times a month & address. It's 101 stuff when you're orienting at it from the emotional angle, there's also a variety of practices I can deploy (feeling emotions, jhanas, many hard to describe mental motions...) to get back to equilibrium and clear thinking & action. This has overall been a bigger update to my effectiveness than the sequences, plausibly my rationality too (I can finally be unbiased instead of trying to correct or pretend I'm not biased!)

Like, when I hear you say "your instinctive plan-evaluator may end up with a global negative bias" I'm like, hm, why not just say "if you notice everything feels subtly heavier and like the world has metaphorically lost color" (how I notice it in myself. tbc fully nonverbally). Noticing through patterns of verbal thought also works, but it's just less data to do metacognition over. You're noticing correlations and inferring the territory (how you feel) instead of paying attention to how you feel directly (something which can be learned over time by directing attention towards noticing, not instantly)

I may write on this. Till then I highly recommend Joe Hudson's work, it may require a small amount of woo tolerance, but only small. He coached Sam Altman & other top execs on emotional clarity & fluidity. Extremely good. Requires some practice & will

johnswentworth (6mo)

Like, when I hear you say "your instinctive plan-evaluator may end up with a global negative bias" I'm like, hm, why not just say "if you notice everything feels subtly heavier and like the world has metaphorically lost color"

Because everything did not feel subtly heavier or like the world had metaphorically lost color. It was just, specifically, that most nontrivial things I considered doing felt like they'd suck somehow, or maybe that my attention was disproportionately drawn to the ways in which they might suck.

And to be clear, "plan predictor predicts failure" was not a pattern of verbal thought I noticed, it's my verbal description of the things I felt on a non-verbal level. Like, there is a non-verbal part of my mind which spits out various feelings when I consider doing different things, and that part had a global negative bias in the feelings it spit out.

I use this sort of semitechnical language because it allows more accurate description of my underlying feelings and mental motions, not as a crutch in lieu of vague poetry.

johnswentworth (9mo)

... But It's Fake Tho

Epistemic status: I don't fully endorse all this, but I think it's a pretty major mistake to not at least have a model like this sandboxed in one's head and check it regularly.

Full-cynical model of the AI safety ecosystem right now:

There’s OpenAI, which is pretending that it’s going to have full AGI Any Day Now, and relies on that narrative to keep the investor cash flowing in while they burn billions every year, losing money on every customer and developing a product with no moat. They’re mostly a hype machine, gaming metrics and cherry-picking anything they can to pretend their products are getting better. The underlying reality is that their core products have mostly stagnated for over a year. In short: they’re faking being close to AGI.
Then there’s the AI regulation activists and lobbyists. They lobby and protest and stuff, pretending like they’re pushing for regulations on AI, but really they’re mostly networking and trying to improve their social status with DC People. Even if they do manage to pass any regulations on AI, those will also be mostly fake, because (a) these people are generally not getting deep into the bureaucracy which would actually

... (read more)

LWLW (9mo)

What makes you confident that AI progress has stagnated at OpenAI? If you don’t have the time to explain why I understand, but what metrics over the past year have stagnated?

niplav (9mo)

Could you name three examples of people doing non-fake work? Since towardsness to non-fake work is easier to use for aiming than awayness from fake work.

johnswentworth (9mo)

Chris Olah and Dan Murfet in the at-least-partially empirical domain. Myself in the theory domain, though I expect most people (including theorists) would not know what to look for to distinguish fake from non-fake theory work. In the policy domain, I have heard that Microsoft's lobbying team does quite non-fake work (though not necessarily in a good direction). In the capabilities domain, DeepMind's projects on everything except LLMs (like e.g. protein folding, or that fast matrix multiplication paper) seem consistently non-fake, even if they're less immediately valuable than they might seem at first glance. Also Conjecture seems unusually good at sticking to reality across multiple domains.

Garrett Baker (8mo)

I do not get this impression, why do you say this?

tailcalled (9mo)

The entire field is based on fears that consequentialism provides an extremely powerful but difficult-to-align method of converting intelligence into agency. This is basically wrong. Yes, people attempt to justify it with coherence theorems, but obviously you can be approximately-coherent/approximately-consequentialist and yet still completely un-agentic, so this justification falls flat. Since the field is based on a wrong assumption with bogus justification, it's all fake.

Steven Byrnes (9mo)

(IMO this is kinda unrelated to the OP, but I want to continue this thread.) Have you elaborated on this anywhere? Perhaps you missed it, but some guy in 2022 wrote this great post which claimed that “Consequentialism, broadly defined, is a general and useful way to develop capabilities.” ;-) I’m actually just in the course of writing something about why “consequentialism provides an extremely powerful but difficult-to-align method of converting intelligence into agency” … maybe I can send you the draft for criticism when it’s ready?

tailcalled (9mo)

I think it's quite related to the OP. If a field is founded on a wrong assumption, then people only end up working in the field if they have some sort of blind spot, and that blind spot leads to their work being fake.

Not hugely. One tricky bit is that it basically ends up boiling down to "the original arguments don't hold up if you think about them", but the exact way they don't hold up depends on what the argument is, so it's kind of hard to respond to in general.

Haha! I think I mostly still stand by the post. In particular, "Consequentialism, broadly defined, is a general and useful way to develop capabilities." remains true; it's just that intelligence relies on patterns and thus works much better on common things (which must be small, because they are fragments of a finite world), than on rare things (which can be big, though don't have to). This means that consequentialism isn't very good at developing powerful capabilities unless it works in an environment that has already been highly filtered to be highly homogeneous, because an inhomogeneous environment is going to BTFO the intelligence.

(I'm not sure I stand 101% by my post; there's some funky business about how to count evolution that I still haven't settled on yet. And I was too quick to go from "imitation learning isn't going to lead to far-superhuman abilities" to "consequentialism is the road to far-superhuman abilities". But yeah I'm actually surprised at how well I stand by my old view despite my massive recent updates.)

Sounds good!

Steven Byrnes (9mo)

I think you’re conflating consequentialism and understanding in a weird-to-me way. (Or maybe I’m misunderstanding.) I think consequentialism is related to choosing one action versus another action. I think understanding (e.g. predicting the consequence of an action) is different, and that in practice understanding has to involve self-supervised learning. (I think human brains have both [partly-] consequentialist decisions and self-supervised updating of the world-model.) (They’re not totally independent, but rather they interact via training data: e.g. [partly-] consequentialist decision-making determines how you move your eyes, and then whatever your eyes are pointing at, your model of the visual world will then update by self-supervised learning on that particular data. But still, these are two systems that interact, not the same thing.) I think self-supervised learning is perfectly capable of discovering rare but important patterns. Just look at today’s foundation models, which seem pretty great at that.

tailcalled (9mo)

This I'd dispute. If your model is underparameterized (which I think is true for the typical model?), then it can't learn any patterns that only occur once in the data. And even if the model is overparameterized, it still can't learn any pattern that never occurs in the data. I'm saying that intelligence is the thing that allows you to handle patterns. So if you've got a dataset, intelligence allows you to build a model that makes predictions for other data based on the patterns it can find in said dataset. And if you have a function, intelligence allows you to find optima for said function based on the patterns it can find in said function. Consequentialism is a way to set up intelligence to be agent-ish. This often involves setting up something that's meant to build an understanding of actions based on data or experience. One could in principle cut my definition of consequentialism up into self-supervised learning and true consequentialism (this seems like what you are doing..?). One disadvantage with that is that consequentialist online learning is going to have a very big effect on the dataset one ends up training the understanding on, so they're not really independent of each other. Either way that just seems like a small labelling thing to me.

Steven Byrnes (9mo)

Dunno if anything’s changed since 2023, but this says LLMs learn things they’ve seen exactly once in the data. I can vouch that you can ask LLMs about things that are extraordinarily rare in the training data—I’d assume well under once per billion tokens—and they do pretty well. E.g. they know lots of random street names. Humans successfully went to the moon, despite it being a quite different environment that they had never been in before. And they didn’t do that with “durability, strength, healing, intuition, tradition”, but rather with intelligence. Speaking of which, one can apply intelligence towards the problem of being resilient to unknown unknowns, and one would come up with ideas like durability, healing, learning from strategies that have stood the test of time (when available), margins of error, backup systems, etc.

tailcalled (9mo)

I guess to add, I'm not talking about unknown unknowns. Often the rare important things are very well known (after all, they are important, so people put a lot of effort into knowing them), they just can't efficiently be derived from empirical data (except essentially by copying someone else's conclusion blindly, and that leaves you vulnerable to deception).

tailcalled (9mo)

I don't have time to read this study in detail until later today, but if I'm understanding it correctly, the study isn't claiming that neural networks will learn rare important patterns in the data, but rather that they will learn rare patterns that they were recently trained on. So if you continually train on data, you will see a gradual shift towards new patterns and forgetting old ones. Random street names aren't necessarily important though? Like what would you do with them? I didn't say that intelligence can't handle different environments, I said it can't handle heterogeneous environments. The moon is nearly a sterile sphere in a vacuum; this is very homogeneous, to the point where pretty much all of the relevant patterns can be found or created on Earth. It would have been more impressive if e.g. the USA could've landed a rocket with a team of Americans in Moscow than on the moon. Also people did use durability, strength, healing, intuition and tradition to go to the moon. Like with strength, someone had to build the rockets (or build the machines which built the rockets). And without durability and healing, they would have been damaged too much in the process of doing that. Intuition and healing are harder to clearly attribute, but they're part of it too. Learning from strategies that stood the test of time would be tradition more so than intelligence. I think tradition requires intelligence, but it also requires something else that's less clear (and possibly not simple enough to be assembled manually, idk). Margins of error and backup systems would be, idk, caution? Which, yes, definitely benefit from intelligence and consequentialism. Like I'm not saying intelligence and consequentialism are useless, in fact I agree that they are some of the most commonly useful things due to the frequent need to bypass common obstacles.

Steven Byrnes (9mo)

Right, that’s what I was gonna say. You need intelligence to sort out which traditions should be copied and which ones shouldn’t. There was a 13-billion-year “tradition” of not building e-commerce megastores, but Jeff Bezos ignored that “tradition”, and it worked out very well for him (and I’m happy about it too). Likewise, the Wright Brothers explicitly followed the “tradition” of how birds soar, but not the “tradition” of how birds flap their wings. I do think there’s a “something else” (most [but not all] humans have an innate drive to follow and enforce social norms, more or less), but I don’t think it’s necessary. The Wright Brothers didn’t have any innate drive to copy anything about bird soaring tradition, but they did it anyway purely by intelligence. I feel like I’ve lost the plot here. If you think there are things that are very important, but rare in the training data, and that LLMs consequently fail to learn, can you give an example? I guess you’re using “empirical data” in a narrow sense. If Joe tells me X, I have gained “empirical data” that Joe told me X. And then I can apply my intelligence to interpret that “data”. For example, I can consider a number of hypotheses: the hypothesis that Joe is correct and honest, that Joe is mistaken but honest, that Joe is trying to deceive me, that Joe said Y but I misheard him, etc. And then I can gather or recall additional evidence that favors one of those hypotheses over another. I could ask Joe to repeat himself, to address the “I misheard him” hypothesis. I could consider how often I have found Joe to be mistaken about similar things in the past. I could ask myself whether Joe would benefit from deceiving me. Etc. This is all the same process that I might apply to other kinds of “empirical data” like if my car was making a funny sound. I.e., consider possible generative hypotheses that would match the data, then try to narrow down via additional observations, and/or remain uncertain and prepare for multip

4tailcalled9mo

I think the necessity of intelligence for tradition exists on a much more fundamental level than that. Intelligence allows people to form an extremely rich model of the world with tons of different concepts. If one had no intelligence at all, one wouldn't even be able to copy the traditions. Like consider a collection of rocks or a forest; it can't pass any tradition onto itself. But conversely, just as intelligence cannot be converted into powerful agency, I don't think it can be used to determine which traditions should be copied and which ones shouldn't. It seems to me that you are treating any variable attribute that's highly correlated across generations as a "tradition", to the point where not doing something is considered on the same ontological level as doing something. That is the sort of ontology that my LDSL series is opposed to. I'm probably not the best person to make the case for tradition as (despite my critique of intelligence) I'm still a relatively strong believer in equilibration and reinvention. Whenever there's any example of this that's too embarrassing or too big of an obstacle for applying them in a wide range of practical applications, a bunch of people point it out, and they come up with a fix that allows the LLMs to learn it. The biggest class of relevant examples would all be things that never occur in the training data - e.g. things from my job, innovations like how to build a good fusion reactor, social relationships between the world's elites, etc. Though I expect you feel like these would be "cheating", because it doesn't have a chance to learn them? The things in question often aren't things that most humans have a chance to learn, or even would benefit from learning. Often it's enough if just 1 person realizes and handles them, and alternately often if nobody handles them then you just lose whatever was dependent on them. Intelligence is a universal way to catch on to common patterns; other things than common patterns matter

2Steven Byrnes9mo

OK, here’s my argument that, if you take {intelligence, understanding, consequentialism} as a unit, it’s sufficient for everything:

* If durability and strength are helpful, then {intelligence, understanding, consequentialism} can discover that durability and strength are helpful, and then build durability and strength.
* Even if “the exact ways in which durability and strength will be helpful” does not constitute a learnable pattern, “durability and strength will be helpful” is nevertheless a (higher-level) learnable pattern.
* If some other evolved aspects of the brain and body are helpful, then {intelligence, understanding, consequentialism} can likewise discover that they are helpful, and build them.
* After all, if ‘those things are helpful’ wasn’t a learnable pattern, then evolution would not have discovered and exploited that pattern!
* If the number of such aspects is dozens or hundreds or thousands, then whatever, {intelligence, understanding, consequentialism} can still get to work systematically discovering them all. The recipe for a human is not infinitely complex.
* If reducing heterogeneity is helpful, then {intelligence, understanding, consequentialism} can discover that fact, and figure out how to reduce heterogeneity.
* Etc.

3tailcalled9mo

Writing the part that I didn't get around to yesterday: You could theoretically imagine e.g. scanning all the atoms of a human body and then using this scan to assemble a new human body in their image. It'd be a massive technical challenge of course, because atoms don't really sit still and let you look and position them. But with sufficient work, it seems like someone could figure it out. This doesn't really give you artificial general agency of the sort that standard Yudkowsky-style AI worries are about, because you can't assign them a goal. You might get an Age of Em-adjacent situation from it, though even not quite that. To reverse-engineer people in order to make AI, you'd instead want to identify separate faculties with interpretable effects and reconfigurable interface. This can be done for some of the human faculties because they are frequently applied to their full extent and because they are scaled up so much that the body had to anatomically separate them from everything else. However, there's just no reason to suppose that it should apply to all the important human faculties, and if one considers all the random extreme events one ends up having to deal with when performing tasks in an unhomogenized part of the world, there's lots of reason to think humans are primarily adapted to those. One way to think about the practical impact of AI is that it cannot really expand on its own, but that people will try to find or create sufficiently-homogenous places where AI can operate. The practical consequence of this is that there will be a direct correspondence between each part of the human work to prepare the AI to each part of the activities the AI is engaging in, which will (with caveats) eliminate alignment problems because the AI only does the sorts of things you explicitly make it able to do. The above is similar to how we don't worry so much about 'website misalignment' because generally there's a direct correspondence between the behavior of the web

2tailcalled9mo

I've grown undecided about whether to consider evolution a form of intelligence-powered consequentialism because in certain ways it's much more powerful than individual intelligence (whether natural or artificial). Individual intelligence mostly focuses on information that can be made use of over a very short time/space-scale. For instance an autoregressive model relates the immediate future to the immediate past. Meanwhile, evolution doesn't meaningfully register anything shorter than the reproductive cycle, and is clearly capable of registering things across the entire lifespan and arguably longer than that (like, if you set your children up in an advantageous situation, then that continues paying fitness dividends even after you die). Of course this is somewhat counterbalanced by the fact that evolution has much lower information bandwidth. Though from what I understand, people also massively underestimate evolution's information bandwidth due to using an easy approximation (independent Bernoulli genotypes, linear short-tailed genotype-to-phenotype relationships and thus Gaussian phenotypes, quadratic fitness with independence between organisms). Whereas if you have a large number of different niches, then within each niche you can have the ordinary speed of evolution, and if you then have some sort of mixture niche, that niche can draw in organisms from each of the other niches and thus massively increase its genetic variance, and then since the speed of evolution is proportional to genetic variance, that makes this shared niche evolve way faster than normally. And if organisms then pass from the mixture niche out into the specialized niches, they can benefit from the fast evolution too. (Mental picture to have in mind: we might distinguish niches like hunter, fisher, forager, farmer, herbalist, spinner, potter, bard, bandit, carpenter, trader, king, warlord (distinct from king in that kings gain power through expanding their family while warlords gain power

2quetzal_rainbow9mo

Filtering for homogeneity of the environment is anthropic selection - if the environment is sufficiently heterogeneous, it kills everyone who tries to reach out of their ecological niche, general intelligence doesn't develop, and we are not here to have this conversation.

2tailcalled9mo

Nah, there are other methods than intelligence for survival and success. E.g. durability, strength, healing, intuition, tradition, ... . Most of these developed before intelligence did.

2quetzal_rainbow9mo

I mean, we exist and we are at least somewhat intelligent, which implies a strong upper bound on the heterogeneity of the environment. On the other hand, words like "durability" imply the possibility of categorization, which itself implies intelligence. If the environment is sufficiently heterogeneous, you are durable at one second and evaporate at another.

2tailcalled9mo

We don't just use intelligence. ??? Vaporization is prevented by outer space which drains away energy. Not clear why you say durability implies intelligence, surely trees are durable without intelligence.

2quetzal_rainbow9mo

I feel like I'm failing to convey the level of abstraction I intend to. I'm not saying that durability of object implies intelligence of object. I'm saying that if the world is ordered in a way that allows existence of distinct durable and non-durable objects, that means the possibility of intelligence which can notice that some objects are durable and some are not and exploit this fact. If the environment is not ordered enough to contain intelligent beings, it's probably not ordered enough to contain distinct durable objects too. To be clear, by "environment" I mean "the entire physics". When I say "environment not ordered enough" I mean "environment with physical laws chaotic enough to not contain ordered patterns".

2tailcalled9mo

It seems like you are trying to convince me that intelligence exists, which is obviously true and many of my comments rely on it. My position is simply that consequentialism cannot convert intelligence into powerful agency, it can only use intelligence to bypass common obstacles.

2quetzal_rainbow9mo

No, my point is that in worlds where intelligence is possible, almost all obstacles are common.

2tailcalled9mo

If there's some big object, then it's quite possible for it to diminish into a large number of similar obstacles, and I'd agree this is where most obstacles come from, to the point where it seems reasonable to say that intelligence can handle almost all obstacles. However, my assertion wasn't that intelligence cannot handle almost all obstacles, it was that consequentialism can't convert intelligence into powerful agency. It's enough for there to be rare powerful obstacles in order for this to fail.

1Kajus9mo

I don't think this is the claim that the post is making, but it still makes sense to me. The post is saying something like the opposite: that the people working in the field are not prioritizing correctly, or not thinking clearly about things, while the risk is real.

2tailcalled9mo

I'm not trying to present johnswentworth's position, I'm trying to present my position.

[-]Katalina Hernandez9mo130

I do not necessarily disagree with this, coming from a legal / compliance background. If you see any of my profiles, I constantly complain about "performative compliance" and "compliance theatre". Painfully present across the legal and governance sectors.

That said: can you provide examples of activism or regulatory efforts that you do agree with? What does a "non fake" regulatory effort look like?

I don't think it would be okay to dismiss your take entirely, but it would be great to see what solutions you'd propose too. This is why I disagree in principle, because there are no specific points to contribute to.

In Europe, paradoxically, some of the people "close enough to the bureaucracy" that pushed for the AI Act to include GenAI providers, were OpenAI-adjacent.

But I will rescue this:

"(b) the regulatory targets themselves are aimed at things which seem easy to target (e.g. training FLOP limitations) rather than actually stopping advanced AI"

BigTech is too powerful to lobby against. "Stopping advanced AI" per se would contravene many market regulations (unless we define exactly what you mean by advanced AI and the undeniable dangers to people's lives). Regulators can only ...

9uugr9mo

"The underlying reality is that their core products have mostly stagnated for over a year. In short: they’re faking being close to AGI." This seems like the most load-bearing belief in the full-cynical model; most of your other examples of fakeness rely on it in one way or another: * If the core products aren't really improving, the progress measured on benchmarks is fake. But if they are, the benchmarks are an (imperfect but still real) attempt to quantify that real improvement. * If LLMs are stagnating, all the people generating dramatic-sounding papers for each new SOTA are just maintaining a holding pattern. But if they're changing, then just studying/keeping up with the general properties of that progress is real. Same goes for people building and regularly updating their toy models of the thing. * Similarly, if the progress is fake, the propaganda signal-boosting that progress is also fake. If it isn't, it isn't. (At least directionally; a lot of that propaganda is still probably exaggerated.) * If the above three are all fake, all the people who feel real scared and want to be validated are stuck in a toxic emotional dead-end where they constantly freak out over fake things to no end. But if they're responding to legitimate, persistent worldview updates, having a space to vibe them out with like-minded others seems important. So, in deciding whether or not to endorse this narrative, we'd like to know whether or not the models really ARE stagnating. What makes you think the appearance of progress here is illusory?

6johnswentworth9mo

Nope! Even if the base models are improving, it can still be true that most of the progress measured on the benchmarks is fake, and has basically-nothing to do with the real improvements. Even if the base models are improving, it can still be true that the dramatic sounding papers and toy models are fake, and have basically-nothing to do with the real improvements. Even if the base models are improving, the propaganda about it can still be overblown and mostly fake, and have basically-nothing to do with the real improvements. Even if the base models are improving, the people who feel real scared and just want to be validated can still be doing fake work and in fact be mostly useless, and their dynamic can still have basically-nothing to do with the real improvements. Just because the base models are in fact improving does not mean that all this other stuff is actually coupled to the real improvement.

1uugr9mo

Sounds like you're suggesting that real progress could be orthogonal to human-observed progress. I don't see how this is possible. Human-observed progress is too broad. The collective of benchmarks, dramatic papers and toy models, propaganda, and doomsayers are suggesting the models are simultaneously improving at: writing code, researching data online, generating coherent stories, persuading people of things, acting autonomously without human intervention, playing Pokemon, playing Minecraft, playing chess, aligning to human values, pretending to align to human values, providing detailed amphetamine recipes, refusing to provide said recipes, passing the Turing test, writing legal documents, offering medical advice, knowing what they don't know, being emotionally compelling companions, correctly guessing the true authors of anonymous text, writing papers, remembering things, etc, etc. They think all these improvements are happening at the same time in vastly different domains because they're all downstream of the same task, which is text prediction. So, they're lumped together in the general domain of 'capabilities', and call a model which can do all of them well a 'general intelligence'. If the products are stagnating, sure, all those perceived improvements could be bullshit. (Big 'if'!) But how could the models be 'improving' without improving at any of these things? What domains of 'real improvement' exist that are uncoupled to human perceptions of improvement, but still downstream of text prediction?

[-]gwern9mo*190

What domains of 'real improvement' exist that are uncoupled to human perceptions of improvement, but still downstream of text prediction?

As defined, this is a little paradoxical: how could I convince a human like you to perceive domains of real improvement which humans do not perceive...?

correctly guessing the true authors of anonymous text

See, this is exactly the example I would have given: truesight is an obvious example of a domain of real improvement which appears on no benchmarks I am aware of, but which appears to correlate strongly with the pretraining loss, is not applied anywhere (I hope), is unobvious that LLMs might do it and the capability does not naturally reveal itself in any standard use-cases (which is why people are shocked when it surfaces), and it would have been easy for no one to have observed it up until now or dismissed it, and even now after a lot of publicizing (including by yours truly), only a few weirdos know much about it.

Why can't there be plenty of other things like inner-monologue or truesight? ("Wait, you could do X? Why didn't you tell us?" "You never asked.")

What domains of 'real improvement' exist that are uncoupled to human perceptions

...

1uugr8mo

Oops, yes. I was thinking "domains of real improvement which humans are currently perceiving in LLMs", not "domains of real improvement which humans are capable of perceiving in general". So a capability like inner-monologue or truesight, which nobody currently knows about, but is improving anyway, would certainly qualify. And the discovery of such a capability could be 'real' even if other discoveries are 'fake'. That said, neither truesight nor inner-monologue seem uncoupled to the more common domains of improvement, as measured in benchmarks and toy models and people-being-scared. The latter, especially, I thought was popularized because it was so surprisingly good at improving benchmark performance. Truesight is narrower, but at the very least we'd expect it to correlate with skill in the common "write [x] in the style of [y]" prompt, right? Surely the same network of associations which lets it accurately generate "Eliezer Yudkowsky wrote this" after a given set of tokens, would also be useful for accurately finishing a sentence starting with "Eliezer Yudkowsky says...". So I still wouldn't consider these things to have basically-nothing to do with commonly perceived domains of improvement.

5gwern8mo

Inner-monologue is an example because as far as we know, it should have existed in pre-GPT-3 models and been constantly improving, but we wouldn't have noticed because no one would have been prompting for it and if they had, they probably wouldn't have noticed it. (The paper I linked might have demonstrated that by finding nontrivial performance in smaller models.) Only once it became fairly reliable in GPT-3 could hobbyists on 4chan stumble across it and be struck by the fact that, contrary to what all the experts said, GPT-3 could solve harder arithmetic or reasoning problems if you very carefully set it up just right as an elaborate multi-step process instead of what everyone did, which was just prompt it for the answer right away. Saying it doesn't count because once it was discovered it was such a large real improvement, is circular and defines away any example. (Did it not improve benchmarks once discovered? Then who cares about such an 'uncoupled' capability; it's not a real improvement. Did it subsequently improve benchmarks once discovered? Then it's not really an example because it's 'coupled'...) Surely the most interesting examples are ones which do exactly that! And of course, now there is so much discussion, and so many examples, and it is in such widespread use, and has contaminated all LLMs being trained since, that they start to do it by default given the slightest pretext. The popularization eliminated the hiddenness. And here we are with 'reasoning models' which have blown through quite a few older forecasts and moved timelines earlier by years, to the extent that people are severely disappointed when a model like GPT-4.5 'only' does as well as the scaling laws predicted and they start predicting the AI bubble is about to pop and scaling has been refuted. But that would be indistinguishable from many other sources of improvement. For starters, by giving a name, you are only testing one direction: 'name -> output'; truesight is about 'name <- ou

8Thane Ruthenis9mo

Gotta love how much of a perfect Scissor statement this is. (Same as my "o3 is not that impressive".)

5Charbel-Raphaël9mo

SB1047 was a pretty close shot to something really helpful. The AI Act and its code of practice might be insufficient, but there are good elements in it that, if applied, would reduce the risks. The problem is that it won't be applied because of internal deployment. But I sympathise somewhat with stuff like this:

9johnswentworth9mo

No, it wasn't. It was a pretty close shot to something which would have gotten a step closer to another thing, which itself would have gotten us a step closer to another thing, which might have been moderately helpful at best.

7Charbel-Raphaël9mo

You really think those elements are not helpful? I'm really curious

8johnswentworth9mo

Sure, they are more-than-zero helpful. Heck, in a relative sense, they'd be one of the biggest wins in AI safety to date. But alas, reality does not grade on a curve. One has to bear in mind that the words on that snapshot do not all accurately describe reality in the world where SB1047 passes. "Implement shutdown ability" would not in fact be operationalized in a way which would ensure the ability to shutdown an actually-dangerous AI, because nobody knows how to do that. "Implement reasonable safeguards to prevent societal-scale catastrophes" would in fact be operationalized as checking a few boxes on a form and maybe writing some docs, without changing deployment practices at all, because the rules for the board responsible for overseeing these things made it pretty easy for the labs to capture. When I discussed the bill with some others at the time, the main takeaway was that the actually-substantive part was just putting any bureaucracy in place at all to track which entities are training models over 10^26 FLOP/$100M. The bill seemed unlikely to do much of anything beyond that. Even if the bill had been much more substantive, it would still run into the standard problems of AI regulation: we simply do not have a way to reliably tell which models are and are not dangerous, so the choice is to either ban a very large class of models altogether, or allow models which will predictably be dangerous sooner or later. The most commonly proposed substantive proxy is to ban models over a certain size, which would likely slow down timelines by a factor of 2-3 at most, but definitely not slow down timelines by a factor of 10 or more.
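To make the 10^26 FLOP tracking threshold mentioned above concrete, here is a rough back-of-the-envelope sketch. It assumes the common rule of thumb that training a dense transformer costs roughly 6 FLOP per parameter per training token; that approximation, the function names, and the example model sizes are all illustrative assumptions, not anything taken from the bill.

```python
# Rough sketch: does a training run plausibly cross a compute threshold?
# Assumption: dense-transformer training compute ~= 6 * parameters * tokens.
# The model sizes below are hypothetical and chosen only for illustration.

FLOP_THRESHOLD = 1e26  # the threshold figure quoted in the comment above


def estimated_training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOP using the 6*N*D rule of thumb."""
    return 6.0 * n_params * n_tokens


def exceeds_threshold(n_params: float, n_tokens: float,
                      threshold: float = FLOP_THRESHOLD) -> bool:
    """True if the estimated training compute is above the threshold."""
    return estimated_training_flop(n_params, n_tokens) > threshold


if __name__ == "__main__":
    hypothetical_runs = {
        "70B params, 15T tokens": (70e9, 15e12),
        "1T params, 20T tokens": (1e12, 20e12),
    }
    for label, (n_params, n_tokens) in hypothetical_runs.items():
        flop = estimated_training_flop(n_params, n_tokens)
        over = exceeds_threshold(n_params, n_tokens)
        print(f"{label}: ~{flop:.2e} FLOP, over threshold: {over}")
```

Under these assumptions, a size-based threshold like this is easy to compute, which is part of why it is an attractive regulatory target; it says nothing by itself about whether a given model is dangerous.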

7Thane Ruthenis9mo

... or, if we do live in a world in which LLMs are not AGI-complete, it might accelerate timelines. After all, this would force the capabilities people to turn their brains on again instead of mindlessly scaling, and that might lead to them stumbling on something which is AGI-complete. And it would, due to a design constraint, need much less compute for committing omnicide. How likely would that be? Companies/people able to pivot like this would need to be live players, capable of even conceiving of new ideas that aren't "scale LLMs". Naturally, that means 90% of the current AI industry would be out of the game. But then, 90% of the current AI industry aren't really pushing the frontier today either; that wouldn't be much of a loss. To what extent are the three AGI labs alive vs. dead players, then?

* OpenAI has certainly been alive back in 2022. Maybe the coup and the exoduses killed it and it's now a corpse whose apparent movement is just inertial (the reasoning models were invented prior to the coup, if Q* rumors are to be trusted, so it's little evidence that OpenAI was still alive in 2024). But maybe not.
* Anthropic houses a bunch of the best OpenAI researchers now, and it's apparently capable of inventing some novel tricks (whatever's the mystery behind Sonnet 3.5 and 3.6).
* DeepMind is even now consistently outputting some interesting non-LLM research.

I think there's a decent chance that they're alive enough. Currently, they're busy eating the best AI researchers and turning them into LLM researchers. If they stop focusing people's attention on the potentially-doomed paradigm, if they're forced to correct the mistake (on this model) that they're making... This has always been my worry about all the proposals to upper-bound FLOPs, complicated by my uncertainty regarding whether LLMs are or are not AGI-complete after all. One major positive effect this might have is memetic. It might create the impression of an (artificially created) AI Winter, caus

3tlevin9mo

I disagree that the default would've been that the board would've been "easy for the labs to capture" (indeed, among the most prominent and plausible criticisms of its structure was that it would overregulate in response to political pressure), and thus that it wouldn't have changed deployment practices. I think the frontier companies were in a good position to evaluate this, and they decided to oppose the bill (and/or support it conditional on sweeping changes, including the removal of the Frontier Model Division). Also, I'm confused when policy skeptics say things like "sure, it might slow down timelines by a factor of 2-3, big deal." Having 2-3x as much time is indeed a big deal!

2johnswentworth9mo

Probably not going to have a discussion on the topic right now, but out of honest curiosity: did you read the bill?

3Charbel-Raphaël9mo

I'm glad we agree "they'd be one of the biggest wins in AI safety to date." How so? It's pretty straightforward if the model is still contained in the lab. I think ticking boxes is good. This is how we went to the Moon, and it's much better to do this than to not do it. It's not trivial to tick all the boxes. Look at the number of boxes you need to tick if you want to follow the Code of Practice of the AI Act or this paper from DeepMind. How so? I think capabilities evaluations are much simpler than alignment evals, and at the very least we can run those. You might say: "A model might sandbag." Sure, but you can fine-tune it and see if the capabilities are recovered. If even with some fine-tuning the model is not able to do the tasks at all, modulo the problem of gradient hacking that is, I think, very unlikely, we can be pretty sure that the model wouldn't be capable of doing such a feat. I think at the very least, following the same methodology as the one followed by Anthropic in their last system cards is pretty good and would be very helpful.
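The fine-tuning check described above can be written down as simple decision logic. This is only a minimal sketch: the score scale, the `capability_bar` cutoff, and the labels are hypothetical, and it assumes some evaluation has already produced a task score before and after elicitation fine-tuning.

```python
# Sketch of the "fine-tune and see if capabilities are recovered" check.
# Assumptions: scores are task success rates in [0, 1]; capability_bar is a
# hypothetical cutoff above which we say the model "can do the task".
from dataclasses import dataclass


@dataclass
class ElicitationResult:
    baseline_score: float   # score when simply prompting the model
    finetuned_score: float  # score after elicitation fine-tuning


def interpret(result: ElicitationResult, capability_bar: float = 0.5) -> str:
    """Classify an evaluation outcome into one of three coarse labels."""
    if result.finetuned_score < capability_bar:
        # Even with fine-tuning the model fails: evidence it lacks the capability.
        return "not_capable"
    if result.baseline_score < capability_bar:
        # Fails when prompted, succeeds after fine-tuning: the capability was
        # present but not shown (sandbagging or plain under-elicitation).
        return "recovered_by_finetuning"
    # Succeeds both before and after fine-tuning.
    return "capable"


if __name__ == "__main__":
    print(interpret(ElicitationResult(baseline_score=0.10, finetuned_score=0.70)))
    print(interpret(ElicitationResult(baseline_score=0.05, finetuned_score=0.10)))
```

Note that the sketch cannot distinguish deliberate sandbagging from ordinary under-elicitation, and it inherits the caveat about gradient hacking mentioned above.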

3Katalina Hernandez9mo

100% agreed @Charbel-Raphaël. The EU AI Act even mentions "alignment with human intent" explicitly, as a key concern for systemic risks. This is in Recital 110 (which defines what are systemic risks and how they may affect society). I do not think any law has mentioned alignment like this before, so it's massive already. Will a lot of the implementation efforts feel "fake"? Oh, 100%. But I'd say that this is why we (this community) should not disengage from it... I also get that the regulatory landscape in the US is another world entirely (which is what the OP is bringing up).

4Veedrac9mo

Your very first point is, to be a little uncharitable, ‘maybe OpenAI's whole product org is fake.’ I know you have a disclaimer here but you're talking about a product category that didn't exist 30 months ago that today has this one website now reportedly used by 10% of people in the entire world and that the internet is saying expects ~12B revenue this year. If your vibes are towards investing in that class of thing being fake or ‘mostly a hype machine’ then your vibes are simply not calibrated well in this domain.

6johnswentworth9mo

No, the model here is entirely consistent with OpenAI putting out some actual cool products. Those products (under the model) just aren't on a path to AGI, and OpenAI's valuation is very much reliant on being on a path to AGI in the not-too-distant future. It's the narrative about building AGI which is fake.

9Lucius Bushnaq9mo

Really? I'm mostly ignorant on such matters, but I'd thought that their valuation seemed comically low compared to what I'd expect if their investors thought that OpenAI was likely to create anything close to a general superhuman AI system in the near future.[1] I considered this evidence that they think all the AGI/ASI talk is just marketing.

1. ^ Well ok, if they actually thought OpenAI would create superintelligence as I think of it, their valuation would plummet because giving people money to kill you with is dumb. But there's this space in between total obliviousness and alarm, occupied by a few actually earnest AI optimists. And, it seems to me, not occupied by the big OpenAI investors.

8Veedrac9mo

Consider, in support: Netflix has a $418B market cap. It is inconsistent to think that a $300B valuation for OpenAI or whatever's in the news requires replacing tens of trillions of dollars of capital before the end of the decade. Similarly, for people wanting to argue from the other direction, who might think a low current valuation is case-closed evidence against their success chances, consider that just a year ago the same argument would have discredited how they are valued today, and a year before that would have discredited where they were a year ago, and so forth. This holds similarly for historic busts in other companies. Investor sentiment is informational but clearly isn't definitive, else stocks would never change rapidly.

4Lucius Bushnaq9mo

To be clear: I think the investors would be wrong to think that AGI/ASI soon-ish isn't pretty likely.

2Veedrac9mo

But most of your criticisms in the point you gave have ~no bearing on that? If you want to make a point about how effectively OpenAI's research moves towards AGI you should be saying things relevant to that, not giving general malaise about their business model. Or, I might understand ‘their business model is fake which implies a lack of competence about them broadly,’ but then I go back to the whole ‘10% of people in the entire world’ and ‘expects 12B revenue’ thing.

3johnswentworth9mo

The point of listing the problems with their business model is that they need the AGI narrative in order to fuel the investor cash, without which they will go broke at current spend rates. They have cool products, they could probably make a profit if they switched to optimizing for that (which would mean more expensive products and probably a lot of cuts), but not anywhere near the level of profits they'd need to justify the valuation.

1Veedrac9mo

That's how I interpreted it originally; you were arguing their product org vibed fake, I was arguing your vibes were miscalibrated. I'm not sure what to say to this that I didn't say originally.

4Joseph Miller9mo

The activists and the lobbyists are two very different groups. The activists are not trying to network with the DC people (yet). Unless you mean Encode, who I would call lobbyists, not activists.

4johnswentworth9mo

Good point, I should have made those two separate bullet points:

* Then there’s the AI regulation lobbyists. They lobby and stuff, pretending like they’re pushing for regulations on AI, but really they’re mostly networking and trying to improve their social status with DC People. Even if they do manage to pass any regulations on AI, those will also be mostly fake, because (a) these people are generally not getting deep into the bureaucracy which would actually implement any regulations, and (b) the regulatory targets themselves are aimed at things which seem easy to target (e.g. training FLOP limitations) rather than actually stopping advanced AI. The activists and lobbyists are nominally enemies of OpenAI, but in practice they all benefit from pushing the same narrative, and benefit from pretending that everyone involved isn’t faking everything all the time.
* Also, there's the AI regulation activists, who e.g. organize protests. Like ~98% of protests in general, such activity is mostly performative and not the sort of thing anyone would end up doing if they were seriously reasoning through how best to spend their time in order to achieve policy goals. Calling it "fake" feels almost redundant. Insofar as these protests have any impact, it's via creating an excuse for friendly journalists to write stories about the dangers of AI (itself an activity which mostly feeds the narrative, and has dubious real impact).

(As with the top level, epistemic status: I don't fully endorse all this, but I think it's a pretty major mistake to not at least have a model like this sandboxed in one's head and check it regularly.)

4Thane Ruthenis9mo

Oh, if you're in the business of compiling a comprehensive taxonomy of ways the current AI thing may be fake, you should also add:

- Vibe coders and "10x'd engineers", who (on this model) would be falling into one of the failure modes outlined here: producing applications/features that didn't need to exist, creating pointless code bloat (which helpfully shows up in productivity metrics like "volume of code produced" or "number of commits"), or "automatically generating" entire codebases in a way that feels magical, then spending so much time bugfixing them it eats up ~all perceived productivity gains.
- e/acc and other Twitter AI fans, who act like they're bleeding-edge transhumanist visionaries/analysts/business gurus/startup founders, but who are just shitposters/attention-seekers who will wander off and never look back the moment the hype dies down.

6johnswentworth9mo

True, but I feel a bit bad about punching that far down.

2Kajus9mo

What are the other basically-fake fields out there?

2O O9mo

quantum computing, nuclear fusion

1wonder9mo

I share some similar frustrations, and unfortunately these are also prevalent in other parts of human society. What most of this fakeness seems to have in common is impure intentions - there are impure/non-intrinsic motivations other than producing the best science/making true progress. Some of these motivations unfortunately could be based on survival/monetary pressure, and resolving that for true research or progress seems to be critical. We need to encourage a culture of pure motivations, and also equip ourselves with more ability/tools to distinguish extrinsic motivations.

[-]johnswentworth1yΩ5415034

On o3: for what feels like the twentieth time this year, I see people freaking out, saying AGI is upon us, it's the end of knowledge work, timelines now clearly in single-digit years, etc, etc. I basically don't buy it, my low-confidence median guess is that o3 is massively overhyped. Major reasons:

I've personally done 5 problems from GPQA in different fields and got 4 of them correct (allowing internet access, which was the intent behind that benchmark). I've also seen one or two problems from the software engineering benchmark. In both cases, when I look at the actual problems in the benchmark, they are easy, despite people constantly calling them hard and saying that they require expert-level knowledge.
- For GPQA, my median guess is that the PhDs they tested on were mostly pretty stupid. Probably a bunch of them were e.g. bio PhD students at NYU who would just reflexively give up if faced with even a relatively simple stat mech question which can be solved with a couple minutes of googling jargon and blindly plugging two numbers into an equation.
- For software engineering, the problems are generated from real git pull requests IIUC, and it turns out that lots of those are things like e

...

[-]Buck1yΩ16317

I just spent some time doing GPQA, and I think I agree with you that the difficulty of those problems is overrated. I plan to write up more on this.

[-]Buck1yΩ7146

@johnswentworth Do you agree with me that modern LLMs probably outperform (you with internet access and 30 minutes) on GPQA diamond? I personally think this somewhat contradicts the narrative of your comment if so.

5johnswentworth1y

I don't know, I have not specifically tried GPQA diamond problems. I'll reply again if and when I do.

4Raemon1y

I at least attempted to filter the problems I gave you for GPQA diamond, although I am not very confident that I succeeded. (Update: yes, the problems John did were GPQA diamond. I gave 5 problems to a group of 8 people, and gave them two hours to complete however many they thought they could complete without getting any wrong.)

5johnswentworth1y

@Buck Apparently the five problems I tried were GPQA diamond, they did not take anywhere near 30 minutes on average (more like 10 IIRC?), and I got 4/5 correct. So no, I do not think that modern LLMs probably outperform (me with internet access and 30 minutes).

[-]Buck1y*Ω10148

Ok, so sounds like given 15-25 mins per problem (and maybe with 10 mins per problem), you get 80% correct. This is worse than o3, which scores 87.7%. Maybe you'd do better on a larger sample: perhaps you got unlucky (extremely plausible given the small sample size) or the extra bit of time would help (though it sounds like you tried to use more time here and that didn't help). Fwiw, my guess from the topics of those questions is that you actually got easier questions than average from that set.

I continue to think these LLMs will probably outperform (you with 30 mins). Unfortunately, the measurement is quite expensive, so I'm sympathetic to you not wanting to get to ground here. If you believe that you can beat them given just 5-10 minutes, that would be easier to measure. I'm very happy to bet here.

I think that even if it turns out you're a bit better than LLMs at this task, we should note that it's pretty impressive that they're competitive with you given 30 minutes!

So I still think your original post is pretty misleading [ETA: with respect to how it claims GPQA is really easy].

I think the models would beat you by more at FrontierMath.
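To put rough numbers on the small-sample caveat above (4 of 5 correct versus the 87.7% quoted for o3), here is a minimal sketch using only the Python standard library. It assumes each problem is an independent trial with the same success probability, which is a simplification since question difficulty varies; the function names are illustrative, not from any benchmark tooling.

```python
import math

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): chance of k or fewer correct out of n."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

O3_ACCURACY = 0.877        # score quoted in the thread
CORRECT, ATTEMPTED = 4, 5  # the human result discussed in the thread

# How often would a solver with o3's quoted accuracy get 4/5 or worse?
p_tail = prob_at_most(CORRECT, ATTEMPTED, O3_ACCURACY)
# What range of true accuracies is a 4/5 result compatible with?
low, high = wilson_interval(CORRECT, ATTEMPTED)

print(f"P(<= {CORRECT}/{ATTEMPTED} | accuracy {O3_ACCURACY}) = {p_tail:.3f}")
print(f"95% Wilson interval for {CORRECT}/{ATTEMPTED}: ({low:.3f}, {high:.3f})")
```

Under these assumptions, a solver with o3's quoted accuracy would score 4/5 or worse a bit under half the time, and the interval around 4/5 is wide enough to include 87.7%, so five problems cannot settle the comparison in either direction.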

6johnswentworth1y

Even assuming you're correct here, I don't see how that would make my original post pretty misleading?

[-]Buck1yΩ8114

I think that how you talk about the questions being “easy”, and the associated stuff about how you think the baseline human measurements are weak, is somewhat inconsistent with you being worse than the model.

9johnswentworth1y

I mean, there are lots of easy benchmarks on which I can solve the large majority of the problems, and a language model can also solve the large majority of the problems, and the language model can often have a somewhat lower error rate than me if it's been optimized for that. Seems like GPQA (and GPQA diamond) are yet another example of such a benchmark.

3Buck1y

What do you mean by "easy" here?

4Raemon1y

(my guess is you took more like 15-25 minutes per question? Hard to tell from my notes, you may have finished early but I don't recall it being crazy early)

4johnswentworth1y

I remember finishing early, and then spending a lot of time going back over all them a second time, because the goal of the workshop was to answer correctly with very high confidence. I don't think I updated any answers as a result of the second pass, though I don't remember very well.

2[comment deleted]1y

2Raemon1y

(This seems like more time than Buck was taking – the goal was to not get any wrong so it wasn't like people were trying to crank through them in 7 minutes)

The problems I gave were (as listed in the csv for the diamond problems):

- #1 (Physics) (1 person got right, 3 got wrong, 1 didn't answer)
- #2 (Organic Chemistry) (John got right, I think 3 people didn't finish)
- #4 (Electromagnetism) (John and one other got right, 2 got wrong)
- #8 (Genetics) (3 got right including John)
- #10 (Astrophysics) (5 people got right)

7Buck1y

@johnswentworth FWIW, GPQA Diamond seems much harder than GPQA main to me, and current models perform well on it. I suspect these models beat your performance on GPQA diamond if you're allowed 30 mins per problem. I wouldn't be shocked if you beat them (maybe I'm like 20%?), but that's because you're unusually broadly knowledgeable about science, not just because you're smart. I personally get wrecked by GPQA chemistry, get ~50% on GPQA biology if I have like 7 minutes per problem (which is notably better than their experts from other fields get, with much less time), and get like ~80% on GPQA physics with less than 5 minutes per problem. But GPQA Diamond seems much harder.

2johnswentworth1y

Is this with internet access for you?

5Buck1y

Yes, I'd be way worse off without internet access.

[-]Thane Ruthenis1y211

Generalizing the lesson here: the supposedly-hard benchmarks for which I have seen a few problems (e.g. GPQA, software eng) turn out to be mostly quite easy, so my prior on other supposedly-hard benchmarks which I haven't checked (e.g. FrontierMath) is that they're also mostly much easier than they're hyped up to be

Daniel Litt's account here supports this prejudice. As a math professor, he knew instantly how to solve the low/medium-level problems he looked at, and he suggests that each "high"-rated problem would be likewise instantly solvable by an expert in that problem's subfield.

And since LLMs have eaten ~all of the internet, they essentially have the crystallized-intelligence skills for all (sub)fields of mathematics (and human knowledge in general). So from their perspective, all of those problems are very "shallow". No human shares their breadth of knowledge, so math professors specialized even in slightly different subfields would indeed have to do a lot of genuine "deep" cognitive work; this is not the case for LLMs.

GPQA stuff is even worse, a literal advanced trivia quiz that seems moderately resistant to literal humans literally googling things, but not to the way the kno...

[-]Olli Järviniemi1y318

[...] he suggests that each "high"-rated problem would be likewise instantly solvable by an expert in that problem's subfield.

This is an exaggeration and, as stated, false.

Epoch AI made 5 problems from the benchmark public. One of those was ranked "High", and that problem was authored by me.

- It took me 20-30 hours to create that submission. (To be clear, I considered variations of the problem, ran into some dead ends, spent a lot of time carefully checking my answer was right, wrote up my solution, thought about guess-proof-ness^[1] etc., which ate up a lot of time.)
- I would call myself an "expert in that problem's subfield" (e.g. I have authored multiple related papers).
- I think you'd be very hard-pressed to find any human who could deliver the correct answer to you within 2 hours of seeing the problem.
  - E.g. I think it's highly likely that I couldn't have done that (I think it'd have taken me more like 5 hours), I'd be surprised if my colleagues in the relevant subfield could do that, and I think the problem is specialized enough that few of the top people in CodeForces or Project Euler could do it.

On the other hand, I don't think the problem is very hard insight-wise - I th...

8Thane Ruthenis1y

Thanks, that's important context! And fair enough, I used excessively sloppy language. By "instantly solvable", I did in fact mean "an expert would very quickly ("instantly") see the correct high-level approach to solving it, with the remaining work being potentially fiddly, but conceptually straightforward". "Instantly solvable" in the sense of "instantly know how to solve"/"instantly reducible to something that's trivial to solve".[1] Which was based on this quote of Litt's:

That said, if there are no humans who can "solve it instantly" (in the above sense), then yes, I wouldn't call it "shallow". But if such people do exist (even if they're incredibly rare), this implies that the conceptual machinery (in the form of theorems or ansatzes) for translating the problem into a trivial one already exists as well. Which, in turn, means it's likely present in the LLM's training data. And therefore, from the LLM's perspective, that problem is trivial to translate into a conceptually trivial problem.

It seems you'd largely agree with that characterization?

Note that I'm not arguing that LLMs aren't useful or unimpressive-in-every-sense. This is mainly an attempt to build a model of why LLMs seem to perform so well on apparently challenging benchmarks while reportedly falling flat on their faces on much simpler real-life problems.

1. ^ Or, closer to the way I natively think of it: In the sense that there are people (or small teams of people) with crystallized-intelligence skillsets such that they would be able to solve this problem by plugging their crystallized-intelligence skills one into another, without engaging in prolonged fluid-intelligence problem-solving.

8Olli Järviniemi1y

This looks reasonable to me.

Yes. My only hesitation is about how real-life-important it is for AIs to be able to do math for which very-little-to-no training data exists. The internet and the mathematical literature are so vast that, unless you are doing something truly novel, there's some relevant subfield there - in which case FrontierMath-style benchmarks would be informative of capability to do real math research.

Also, re-reading Wentworth's original comment, I note that o1 is weak according to FM. Maybe the things Wentworth is doing are just too hard for o1, rather than (just) overfitting-on-benchmarks style issues? In any case his frustration with o1's math skills doesn't mean that FM isn't measuring real math research capability.

6Thane Ruthenis1y

Previously, I'd intuitively assumed the same as well: that it doesn't matter if LLMs can't "genuinely research/innovate", because there is enough potential for innovative-yet-trivial combination of existing ideas that they'd still massively speed up R&D by finding those combinations. ("Innovation overhang", as @Nathan Helm-Burger puts it here.) Back in early 2023, I'd considered it fairly plausible that the world would start heating up in 1-2 years due to such synthetically-generated innovations.

Except this... just doesn't seem to be happening? I'm yet to hear of a single useful scientific paper or other meaningful innovation that was spearheaded by an LLM.[1] And they're already adept at comprehending such innovative-yet-trivial combinations if a human prompts them with those combinations. So it's not a matter of not yet being able to understand or appreciate the importance of such synergies. (If Sonnet 3.5.1 or o1 pro didn't do it, I doubt o3 would.) Yet this is still not happening.

My guess is that "innovative-yet-trivial combinations of existing ideas" are not actually "trivial", and LLMs can't do that for the same reasons they can't do "genuine research" (whatever those reasons are).

1. ^ Admittedly it's possible that this is totally happening all over the place and people are just covering it up in order to have all of the glory/status for themselves. But I doubt it: there are enough remarkably selfless LLM enthusiasts that if this were happening, I'd expect it would've gone viral already.

8Noosphere891y

There are 2 things to keep in mind:

1. It's only now that LLMs are reasonably competent in at least some hard problems, and at any rate, I expect RL to basically solve the domain, because of verifiability properties combined with quite a bit of training data.
2. We should wait a few years, as we have another scale-up that's coming up, and it will probably be quite a jump from current AI due to more compute: https://www.lesswrong.com/posts/NXTkEiaLA4JdS5vSZ/?commentId=7KSdmzK3hgcxkzmPX

4Thane Ruthenis1y

I don't think that's the limiter here. Reports in the style of "my unpublished PhD thesis was about doing X using Y methodology, I asked an LLM to do that and it one-shot a year of my work! the equations it derived are correct!" have been around for quite a while. I recall it at least in relation to Claude 3, and more recently, o1-preview. If LLMs are prompted to combine two ideas, they've been perfectly capable of "innovating" for ages now, including at fairly high levels of expertise. I'm sure there's some sort of cross-disciplinary GPQA-like benchmark that they've saturated a while ago, so this is even legible.

The trick is picking which ideas to combine/in what direction to dig. This doesn't appear to be something LLMs are capable of doing well on their own, nor do they seem to speed up human performance on this task. (All cases of them succeeding at it so far have been, by definition, "searching under the streetlight": checking whether they can appreciate a new idea that a human already found on their own and evaluated as useful.) I suppose it's possible that o3 or its successors change that (the previous benchmarks weren't measuring that, but surely FrontierMath does...). We'll see.

Mm, I think it's still up in the air whether even the o-series efficiently scales (as in, without requiring a Dyson Swarm's worth of compute) to beating the Millennium Prize Eval (or some less legendary yet still major problems). I expect such problems don't pass the "can this problem be solved by plugging the extant crystallized-intelligence skills of a number of people into each other in a non-contrived[1] way?" test. Does RL training allow the model to sidestep this, letting it generate new crystallized-intelligence skills? I'm not confident one way or another.

I'm bearish on that. I expect GPT-4 to GPT-5 to be palpably less of a jump than GPT-3 to GPT-4, same way GPT-3 to GPT-4 was less of a jump than GPT-2 to GPT-3. I'm sure it'd show lower loss, and saturate some more be

[-]Noosphere891y113

I'm not confident one way or another.

I think my key crux is that in domains where there is a way to verify that the solution actually works, RL can scale to superhuman performance, and mathematics/programming are domains that are unusually easy to verify/gather training data for RL performance, so with caveats it can become rather good at those specific domains/benchmarks like millennium prize evals, but the important caveat is I don't believe this transfers very well to domains where verifying isn't easy, like creative writing.

I'm bearish on that. I expect GPT-4 to GPT-5 to be palpably less of a jump than GPT-3 to GPT-4, same way GPT-3 to GPT-4 was less of a jump than GPT-2 to GPT-3. I'm sure it'd show lower loss, and saturate some more benchmarks, and perhaps an o-series model based on it clears FrontierMath, and perhaps programmers and mathematicians would be able to use it in an ever-so-bigger number of cases...

I was talking about the 1 GW systems that would be developed in late 2026-early 2027, not GPT-5.

4Thane Ruthenis1y

Sure, the theory on that is solid. But how efficiently does it scale off-distribution, in practice? The inference-time scaling laws, much like the pretraining scaling laws, are ultimately based on test sets whose entries are "shallow" (in the previously discussed sense). They don't tell us much regarding how well the technique scales with the "conceptual depth" of a problem.

o3 took a million dollars in inference-time compute and unknown amounts in training-time compute just to solve the "easy" part of the FrontierMath benchmark (whose problems likely take human experts single-digit hours, maybe <1 hour for particularly skilled humans). How much would be needed for beating the "hard" subset of FrontierMath? How much more still would be needed for problems that take individual researchers days; or problems that take entire math departments months; or problems that take entire fields decades?

It's possible that the "synthetic data flywheel" works so well that the amount of human-researcher-hour-equivalents per unit of compute scales, say, exponentially with some aspect of o-series' training, and so o6 in 2027 solves the Riemann Hypothesis. Or it scales not that well, and o6 can barely clear real-life equivalents of hard FrontierMath problems. Perhaps instead the training costs (generating all the CoT trees on which RL training is then done) scale exponentially, while researcher-hour-equivalents per unit of compute scale linearly.

It doesn't seem to me that we know which one it is yet. Do we?

4Noosphere891y

I don't think we know yet whether it will succeed in practice, or whether its training costs make it infeasible to do.

7Nathan Helm-Burger1y

Consider: https://www.cognitiverevolution.ai/can-ais-generate-novel-research-ideas-with-lead-author-chenglei-si/

I think a different phenomenon is occurring. My guess, updating on my own experience, is that ideas aren't the current bottleneck. 1% inspiration, 99% perspiration.

As someone who has been reading 3-20 papers per month for many years now, in neuroscience and machine learning, I feel overwhelmed with ideas. I average about 0.75 per paper. I write them down, and the lists grow faster than they shrink by two orders of magnitude.

When I was on my favorite industry team, what I most valued about my technical manager was his ability to help me sort through and prioritize them. It was like I created a bunch of LEGO pieces, he picked one to be next, I put it in place by coding it up, he checked the placement by reviewing my PR. If someone had offered me a source of ideas ranging in quality between worse than my worst ideas, and almost as good as my best ideas, and skewed towards bad... I'd have laughed and turned them down without a second thought.

For something like a paper instead of a minor tech idea for a 1 week PR... The situation is far more intense. The grunt work of running the experiments and preparing the paper is enormous compared to the time and effort of coming up with the idea in the first place. More like 0.1% to 99.9%.

Current LLMs can speed up creating a paper if given the results and experiment description to write about. That's probably also not the primary bottleneck (although still more than idea generation).

So the current bottleneck, in my estimation, for ML experiments, is the experiments. Coding up the experiments accurately and efficiently, running them (and handling the compute costs), analyzing the results. So I've been expecting to see an acceleration dependent on that aspect. That's hard to measure though. Are LLMs currently speeding this work up a little? Probably. I've had my work sped up some by the recent Sonnet 3.5.1. Curren

[-]johnswentworth1y106

That's the opposite of my experience. Nearly all the papers I read vary between "trash, I got nothing useful out besides an idea for a post explaining the relevant failure modes" and "high quality but not relevant to anything important". Setting up our experiments is historically much faster than the work of figuring out what experiments would actually be useful.

There are exceptions to this, large projects which seem useful and would require lots of experimental work, but they're usually much lower-expected-value-per-unit-time than going back to the whiteboard, understanding things better, and doing a simpler experiment once we know what to test.

6Nathan Helm-Burger1y

Ah, well, for most papers that spark an idea in me, the idea isn't simply an extension of the paper. It's a question tangentially related which probes at my own frontier of understanding. I've always found that a boring lecture is a great opportunity to brainstorm because my mind squirms away from the boredom into invention and extrapolation of related ideas. A boring paper does some of the same for me, except that I'm less socially pressured to keep reading it, and thus less able to squeeze my mind with the boredom of it.

As for coming up with ideas... It is a weakness of mine that I am far better at generating ideas than at critiquing them (my own or others'). Which is why I worked so well in a team where I had someone I trusted to sort through my ideas and pick out the valuable ones. It sounds to me like you have a better filter on idea quality.

2Thane Ruthenis1y

That's mostly my experience as well: experiments are near-trivial to set up, and setting up any experiment that isn't near-trivial to set up is a poor use of the time that can instead be spent thinking on the topic a bit more and realizing what the experimental outcome would be or why this would be entirely the wrong experiment to run. But the friction costs of setting up an experiment aren't zero. If it were possible to sort of ramble an idea at an AI and then have it competently execute the corresponding experiment (or set up a toy formal model and prove things about it), I think this would be able to speed up even deeply confused/non-paradigmatic research. ... That said, I think the sorts of experiments we do aren't the sorts of experiments ML researchers do. I expect they're often things like "do a pass over this lattice of hyperparameters and output the values that produce the best loss" (and more abstract equivalents of this that can't be as easily automated using mundane code). And which, due to the atheoretic nature of ML, can't be "solved in the abstract". So ML research perhaps could be dramatically sped up by menial-software-labor AIs. (Though I think even now the compute needed for running all of those experiments would be the more pressing bottleneck.)

2Thane Ruthenis1y

Convincing.

3Noosphere891y

I agree that the trick scaling as far as it has is surprising, but I'd disagree with the claim that this doesn't bear on AGI. I do think that something like dumb scaling can mostly just work, and I think the main takeaway I take from AI progress is that there will not be a clear resolution to when AGI happens, as the first AIs to automate AI research will have very different skill profiles from humans, and most importantly we need to disentangle capabilities in a way we usually don't for humans. I agree with faul sname here:

5Thane Ruthenis1y

The exact degree of "mostly" is load-bearing here. You'd mentioned provisions for error-correction before. But are the necessary provisions something simple, such that the most blatantly obvious wrappers/prompt-engineering works, or do we need to derive some additional nontrivial theoretical insights to correctly implement them? Last I checked, AutoGPT-like stuff has mostly failed, so I'm inclined to think it's closer to the latter.

[-]Noosphere891y200

Actually, I've changed my mind, in that the reliability issue probably does need at least non-trivial theoretical insights to make AIs work.

[-]faul_sname1y142

I am unconvinced that "the" reliability issue is a single issue that will be solved by a single insight, rather than AIs lacking procedural knowledge of how to handle a bunch of finicky special cases that will be solved by online learning or very long context windows once hardware costs decrease enough to make one of those approaches financially viable.

4Noosphere891y

Yeah, I'm sympathetic to this argument that there won't be a single insight, and that at least one approach will work out once hardware costs decrease enough, and I agree less with Thane Ruthenis's intuitions here than I did before.

[-]Noosphere891y103

If I were to think about it a little, I'd suspect the big difference that LLMs and humans have is state/memory, where humans do have state/memory, but LLMs are currently more or less stateless today, and RNN training has not been solved to the extent transformers were.

One thing I will also say is that AI winters will be shorter than previous AI winters, because AI products can now be sort of made profitable, and this gives an independent base of money for AI research in ways that weren't possible pre-2016.

5Mateusz Bagiński1y

A factor stemming from the same cause but pushing in the opposite direction is that "mundane" AI profitability can "distract" people who would otherwise be AGI hawks.

1[comment deleted]1y

[-]waterlubber1y153

I agree with you on your assessment of GPQA. The questions themselves appear to be low quality as well. Take this one example, although it's not from GPQA Diamond:

In UV/Vis spectroscopy, a chromophore which absorbs red colour light, emits _____ colour light.

The correct answer is stated as yellow and blue. However, the question should read transmits, not emits; molecules cannot trivially absorb and re-emit light of a shorter wavelength without resorting to trickery (nonlinear effects, two-photon absorption).

This is, of course, a cherry-picked example, but is exactly characteristic of the sort of low-quality science questions I saw in school (e.g. with a teacher or professor who didn't understand the material very well). Scrolling through the rest of the GPQA questions, they did not seem like questions that would require deep reflection or thinking, but rather the sort of trivia things that I would expect LLMs to perform extremely well on.

I'd also expect "popular" benchmarks to be easier/worse/optimized for looking good while actually being relatively easy. OAI et al. probably have the mother of all publication biases with respect to benchmarks, and are selecting very heavily for items within this collection.

3mtaran1y

Re: LLMs for coding: One lens on this is that LLM progress changes the Build vs Buy calculus. Low-power AI coding assistants were useful in both the "build" and "buy" scenarios, but they weren't impactful enough to change the actual border between build-is-better vs. buy-is-better. More powerful AI coding systems/agents can make a lot of tasks sufficiently easy that dealing with some components starts feeling more like buying than building. Different problem domains have different peak levels of complexity/novelty, so the easier domains will start being affected more and earlier by this build/buy decision boundary shift. Many people don't travel far from their primary domains, so to some of them it will look like the shift is happening quickly (because it is, in their vicinity) even though on the larger scale it's still pretty gradual.

3Kabir Kumar1y

Personally, I think o1 is uniquely trash, I think o1-preview was actually better. Getting on average, better things from deepseek and sonnet 3.5 atm.

[-]johnswentworth4mo1481

About a month ago, after some back-and-forth with several people about their experiences (including on lesswrong), I hypothesized that I don't feel the emotions signalled by oxytocin, and never have. (I do feel some adjacent things, like empathy and a sense of responsibility for others, but I don't get the feeling of loving connection which usually comes alongside those.)

Naturally I set out to test that hypothesis. This note is an in-progress overview of what I've found so far and how I'm thinking about it, written largely to collect my thoughts and to see if anyone catches something I've missed.

Under the hypothesis, this has been a life-long thing for me, so the obvious guess is that it's genetic (the vast majority of other biological state turns over too often to last throughout life). I also don't have a slew of mysterious life-long illnesses, so the obvious guess is that it's pretty narrowly limited to oxytocin - i.e. most likely a genetic variant in either the oxytocin gene or receptor, maybe the regulatory machinery around those two but that's less likely as we get further away and the machinery becomes entangled with more other things.

So I got my genome sequenced, and went...

[-]Eli Tyre4mo3426

The receptor was the first one I checked, and sure enough I have a single-nucleotide deletion 42 amino acids into the open reading frame (ORF) of the 389 amino acid protein. That will induce a frameshift error, completely fucking up the rest of the protein.

I'm kind of astonished that this kind of advance prediction panned out!
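For readers unfamiliar with why a single-nucleotide deletion is so destructive, here is a minimal sketch of a frameshift on a made-up toy sequence. `TOY_ORF` and the deletion position are hypothetical and have nothing to do with the actual oxytocin receptor gene; only the standard genetic code table is real.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate DNA codon by codon, stopping at a stop codon or an incomplete codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def delete_base(dna, index):
    """Return the sequence with the single nucleotide at `index` removed."""
    return dna[:index] + dna[index + 1:]

# Hypothetical toy open reading frame: 7 coding codons followed by a stop codon.
TOY_ORF = "ATGGCCAAAGGTCTGTTCGACTGA"
# Delete the first base of the 4th codon; every codon from there on is read out of frame.
mutant = delete_base(TOY_ORF, 9)

print(translate(TOY_ORF))  # original protein
print(translate(mutant))   # same first 3 residues, then unrelated residues
```

The residues before the deletion are unchanged, while everything after it is read in a shifted frame and bears no relation to the original. In this toy case the original stop codon is also lost. A deletion early in a long ORF therefore scrambles most of the protein.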

[-]johnswentworth4mo187

I admit I was somewhat surprised as well. On a gut level, I did not think that the very first things to check would turn up such a clear and simple answer.

8Archimedes4mo

I'm insufficiently knowledgeable about deletion base rates to know how astonished to be. Does anyone have an estimate of how many Bayes bits such a prediction is worth? FWIW, GPT-5T estimates around 10 bits, double that if it's de novo (absent in both parents).

[-]Gurkenglas4mo300

well, what happens when you take oxytocin?

[-]the gears to ascension4mo12-1

This might be a bad idea right now, if it makes John's interests suddenly more normal in a mostly-unsteered way, eg because much of his motivation was coming from a feeling he didn't know was oxytocin-deficiency-induced. I'd suggest only doing this if solving this problem is likely to increase productivity or networking success; else, I'd delay until he doesn't seem like a critical bottleneck. That said, it might also be a very good idea, if depression or social interaction are a major bottleneck, which they are for many many people, so this is not resolved advice, just a warning that this may be a high variance intervention, and since John currently seems to be doing promising work, introducing high variance seems likely to have more downside.

I wouldn't say this to most people; taking oxytocin isn't known for being a hugely impactful intervention[citation needed], and on priors, someone who doesn't have oxytocin signaling happening is missing a lot of normal emotion, and is likely much worse off. Obviously, John, it's up to you whether this is a good tradeoff. I wouldn't expect it to completely distort your values or delete your skills. Someone who knows you better, such as yourself, would be much better equipped to predict if there's significant reason to believe downward variance isn't present. If you have experience with reward-psychoactive chemicals and yet are currently productive, it's more likely you already know whether it's a bad idea.

Didn't want to leave it unsaid, though.

[-]Elizabeth4mo106

if the problem is with the receptor, taking more won't make a difference

6Ben Pace4mo

Sounds like a great empirical test!

4localdeity4mo

Seems like that depends on details of the problem. If the receptor has zero function, then yes. If functionality is significantly reduced but nonzero… maybe.

3Mateusz Bagiński4mo

Perhaps Gurkenglas meant this as a ~confirmatory test that John is actually oxytocin-insensitive, because the test results (IIUC) are compatible with only one gene copy being screwed up.

7johnswentworth4mo

I ordered this one off of Amazon. AFAICT it does nothing for me. But that's a pretty minor update, because even those who use it say the effects are "subtle", and frankly I think snorting oxytocin is probably bullshit and does nothing beyond placebo even for normal people. I did have a couple other people try the one I bought, and their results indeed sounded like a nothingburger.

4Richard_Kennaway4mo

Your link is broken. The raw HTML is: <a href="https://One other thing - labs typically filter reportable genome results by the phenotype you give them. I don’t know how this guy did the genome, but if he were to put something like “social deficits”, “emotional dysregulation” or something else about his lack of emotional range, the lab would definitely report the variant plus their research on it and recommendations.">this one</a> BTW, has anyone on LW tried oxytocin and is willing to report on the experience?

4johnswentworth4mo

Fixed, thanks.

[-]Thane Ruthenis4mo*221

Not directly related to your query, but seems interesting:

The receptor was the first one I checked, and sure enough I have a single-nucleotide deletion 42 amino acids into the open reading frame (ORF) of the 389 amino acid protein. That will induce a frameshift error, completely fucking up the rest of the protein.

Which, in turn, is pretty solid evidence for "oxytocin mediates the emotion of loving connection/aching affection" (unless there are some mechanisms you've missed). I wouldn't have guessed it's that simple.

Generalizing, this suggests we can study links between specific brain chemicals/structures and cognitive features by looking for people missing the same universal experience, checking if their genomes deviate from the baseline in the same way, then modeling the effects of that deviation on the brain. Alternatively, the opposite: search for people whose brain chemistry should be genetically near-equivalent except for one specific change, then exhaustively check if there's some blatant or subtle way their cognition differs from the baseline.

Doing a brief literature review via GPT-5, apparently this sort of thing is mostly done with regards to very "loud" conditions, rather th... (read more)

[-]Alexander Gietelink Oldenziel4mo280

... and so at long last John found the answer to alignment

The answer was Love

and it always had been

3CronoDAS4mo

(hopes this is a joke)

[-]Mateusz Bagiński4mo135

I wouldn't have guessed it's that simple.

~Surely there's a lot of other things involved in mediating this aspect of human cognition, at the very least (/speaking very coarse-grainedly), having the entire oxytocin system adequately hooked up to the rest of everything.

I.e., it is damn strong evidence that oxytocin signaling is strictly necessary (and that there are no fallback mechanisms etc.) but not that it's simple.

[-]Nina Panickssery4mo152

Did your mother think you were unusual as a baby? Did you bond with your parents as a young child? I'd expect there to be some symptoms there if you truly have an oxytocin abnormality.

[-]johnswentworth4mo202

For my family this is much more of a "wow that makes so much sense" than a "wow what a surprise". It tracks extremely well with how I acted growing up, in a bunch of different little ways. Indeed, once the hypothesis was on my radar at all, it quickly seemed pretty probable on that basis alone, even before sequencing came back.

A few details/examples:

As a child, I had a very noticeable lack of interest in other people (especially those my own age), to the point where a school psychologist thought it was notable.
I remember being unusually eager to go off to overnight summer camp (without my parents), at an age where nobody bothered to provide overnight summer camp because kids that young were almost all too anxious to be away from their parents that long.
When family members or pets died, I've generally been noticeably less emotionally impacted than the rest of the family.
When out and about with the family, I've always tended to wander around relatively independently of the rest of the group.

Those examples are relatively easy to explain, but most of my bits here come from less legible things. It's been very clear for a long time that I relate to other people unusually, in a way that intuitively matches being at the far low end of the oxytocin signalling axis.

9Nina Panickssery4mo

Interesting. That seems like reasonable evidence. Though beyond a certain level of development we have numerous other drives beyond the oxytocin-related ones. Hence why you-as-a-baby might be particularly telling. From what I understand, oxytocin is heavily involved in infant-caregiver bonding and is what enables mothers to soothe their babies so effectively (very much on my mind right now as I am typing this comment while a baby naps on me haha). Whereas once you're above a certain age, the rational mind and other traits probably have an increasingly strong effect. For example, if you're very interested in your own thoughts and ideas, this might overwhelm your desire to be close to family members. Anyway, it seems likely that your oxytocin hypothesis is correct either way. Cool finding! I have a similar intuition about how some other people are missing a disgust response that I have. Seems like a biological thing that some people have much less of than others and it has a significant effect on how we relate to others.

[-]gwern4mo100

Is that frame-shift error or those ~6 (?) SNPs previously reported in the literature for anything, or do they seem to be de novos? Also, what WGS depth did your service use? (Depending on how widely you cast your net, some of those could be spurious sequencing errors.)

2johnswentworth4mo

Depth is tagged on each individual variation; the frame shift has depth 41, the others have depth anywhere from 40 to 60. I have not found the frameshift mutation in dbSNP, but I'm not confident that I've understood the UI or intended usage patterns, so I'm not confident it's not in there. The SNPs I haven't looked for in there yet.

9Onion Conundrum4mo

Really interesting post - this actually connects to some research I've been looking into recently around oxytocin and attachment patterns. There's this psychologist Adam Lane Smith who's built on neurobiological work by researchers like Carolyn Zahn-Waxler and Ruth Feldman - they've found that under high stress conditions when younger, or absence of secure attachment figures, cortisol-induced stress actually strengthens cortisol and dopamine pathways for reward while inhibiting the oxytocin and serotonin pathways. The end result (avoidant attachment) sounds remarkably similar to what you're describing: people who clearly care about others and feel responsibility, but don't experience that warm "loving connection" feeling that most people seem to get from relationships. What struck me about your situation is that you've essentially got the genetic version of what this research suggests can happen environmentally. Both paths seem to lead to the same place - having to navigate social connection through pattern recognition and cognitive analysis rather than emotional intuition, because your brain is essentially running on dopamine-driven systems instead of oxytocin-based ones. Makes me wonder if there's a whole spectrum of people out there - some genetic, some developmental - who are all essentially operating with similar neurochemical profiles but don't realize they're part of the same phenomenon. Your case might be the key to understanding how this actually works at a biological level. Do you find you've gotten really good at reading people through behavioral patterns rather than gut feelings?

8Jude Stiel4mo

Yep. AlphaMissense, also from DeepMind, is tailored to pathogenicity prediction. You can find its pathogenicity scores in the annotations tab for any (at least I think any) human protein on AFDB. https://alphafold.ebi.ac.uk/entry/P30559?activeTab=annotations (You may have to click on a different tab and return to the annotations tab for the heatmap and structure viewer to load).

6Ryan Meservey4mo

As a non-subject matter expert in all of the above, I decided to consult my swear-word-averse relative who recently graduated from genetic counseling school. Here is her response: The logic is sound (if a little colorful haha 😅). It sounds like this guy functionally only has 1 copy of the OXTR gene, and spot on in hypothesis of nonsense-mediated decay. How the OXTR gene is regulated, I don’t know and haven’t looked into. It would be weird (but possible) for a decrease in OXTR expression to only affect emotions - oxytocin is also important for other brain functions/development, so a genetic change should also impact embryological development of the brain. So if I were to suggest next steps, it would be doing functional studies of the brain (like an MRI) to further evaluate. One other thing - labs typically filter reportable genome results by the phenotype you give them. I don’t know how this guy did the genome, but if he were to put something like “social deficits”, “emotional dysregulation” or something else about his lack of emotional range, the lab would definitely report the variant plus their research on it and recommendations.

6Viliam4mo

Amazing, is this the future of psychotherapy? "Doctor, I have a problem..." "Stop talking, just give me a blood sample. Okay, your problem is X."

4J Bostock4mo

Huh, interesting. I might get myself full genome sequenced at some point. I already got myself 23andMe sequenced, downloaded the raw data, and put it into Promethease a while ago. I did find out I'm AG at rs53576, which is slightly linked to lower empathy but is also extremely common. I don't think this is enough to explain a large proportion of my personality, the way your OXTR deletion might be. (There was something quite amusing about checking my SNPs to decide whether to start early anti-balding interventions, and having result number 1 be "Low Empathy". As a further datapoint, I mentioned this to my mum and she basically said "Yeah but what did you expect with me and [dad] as parents?") Seeing this made me think I should take a deeper look. This all sounds pretty familiar, and I don't think the AG in rs53576 is strong enough to shift me off-distribution to the degree that I am.
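For anyone who wants to check a single SNP in their own raw download without a third-party tool, a minimal sketch follows. It assumes the usual 23andMe raw-data layout (tab-separated rsid, chromosome, position, genotype, with `#` comment lines). The sample rows and positions below are placeholders rather than real data, and `lookup_genotype` is a hypothetical helper.

```python
def lookup_genotype(lines, rsid):
    """Return the genotype string for `rsid`, or None if it is not in the file."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4 and fields[0] == rsid:
            return fields[3]
    return None


# Placeholder rows in the assumed layout; positions are NOT real coordinates.
sample_lines = [
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs0000001\t1\t12345\tCC\n",
    "rs53576\t3\t1\tAG\n",
]

print(lookup_genotype(sample_lines, "rs53576"))
```

With a real file you would pass an open file handle instead of `sample_lines`, since iterating over a file yields lines the same way.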

4p.b.4mo

If the one clearly fucked up receptor copy is sufficient for your "symptoms", it seems pretty likely that one of your parents should have them too. I think there is no reason to expect a de novo mutation to be particularly likely in your case (unlike in cases that lead to severe dysfunction). And of course you can check for that by sequencing your parents. So my money would be on the second copy also being sufficiently messed up that you have basically no fully functioning oxytocin receptors. If you have siblings and you are the only odd one in the family, you could make a pretty strong case for both copies being messed up, by showing that you are the only one with the combination of frameshift in one copy and particular SNPs in the other. (If you are not the only odd one you can make an even stronger case).

4ChristianKl4mo

Even if the structure is correct and does look the same, the binding properties of the receptor could still be different if the histidine is in the part that's relevant for the receptor binding. The thing you want is a tool that tells you how the receptor binding properties change through the mutation, not AlphaFold, which just gives you the 3D structure. A quick question to GPT-5 suggests that there are freely available tools that tell you how the receptor binding properties change via a single point mutation.

2DaemonicSigil3mo

I have read that some sequencing methods (nanopore) have a high error rate (comparing multiple reads can help correct this). Did you also spot-check some other genes that you have no reason to believe contain mutations to see if they look ok? Seeing a mutation in exactly the gene you expect is only damn strong evidence if there isn't a sequencing error in every third gene. EDIT: Looks like this was checked, nice: https://www.lesswrong.com/posts/Hds7xkLgYtm6qDGPS/how-i-learned-that-i-don-t-feel-companionate-love
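A back-of-the-envelope version of the "is it just a sequencing error" check, under loudly stated assumptions: each read errs independently at a given site with some per-read error rate (the 0.05 below is a placeholder, not a measured rate for any platform), and a real heterozygous variant shows up in roughly half the reads. The depth of 41 is the figure John reports for the frameshift elsewhere in this thread; the function name is made up.

```python
from math import comb


def prob_at_least_k_errors(depth, k, per_read_error):
    """P(at least k of `depth` independent reads show the same spurious error at one site)."""
    return sum(
        comb(depth, i) * per_read_error ** i * (1 - per_read_error) ** (depth - i)
        for i in range(k, depth + 1)
    )


# Assumed numbers: depth 41, about half the reads carrying the variant,
# and a hypothetical 5% per-read error rate at this site.
p_spurious = prob_at_least_k_errors(depth=41, k=20, per_read_error=0.05)
print(f"chance of this pattern arising from independent errors: {p_spurious:.2e}")
```

The independence assumption is the weak point: systematic errors (e.g. at homopolymer runs) are correlated across reads, which is why spot-checking unrelated genes is still a useful sanity check.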

[-]johnswentworth6moΩ441292

I was a relatively late adopter of the smartphone. I was still using a flip phone until around 2015 or 2016 ish. From 2013 to early 2015, I worked as a data scientist at a startup whose product was a mobile social media app; my determination to avoid smartphones became somewhat of a joke there.

Even back then, developers talked about UI design for smartphones in terms of attention. Like, the core "advantages" of the smartphone were the "ability to present timely information" (i.e. interrupt/distract you) and always being on hand. Also it was small, so anything too complicated to fit in like three words and one icon was not going to fly.

... and, like, man, that sure did not make me want to buy a smartphone. Even today, I view my phone as a demon which will try to suck away my attention if I let my guard down. I have zero social media apps on there, and no app ever gets push notif permissions when not open except vanilla phone calls and SMS.

People would sometimes say something like "John, you should really get a smartphone, you'll fall behind without one" and my gut response was roughly "No, I'm staying in place, and the rest of you are moving backwards".

And in hindsight, boy howdy do... (read more)

[-]Vanessa Kosoy6moΩ11228

I found LLMs to be very useful for literature research. They can find relevant prior work that you can't find with a search engine because you don't know the right keywords. This can be a significant force multiplier.

They also seem potentially useful for quickly producing code for numerical tests of conjectures, but I only started experimenting with that.

Other use cases where I found LLMs beneficial:

Taking a photo of a menu in French (or providing a link to it) and asking it which dishes are vegan.
Recommending movies (I am a little wary of some kind of meme poisoning, but I don't watch movies very often, so seems ok).

That said, I do agree that early adopters seem like they're overeager and maybe even harming themselves in some way.

5Thane Ruthenis6mo

I've been trying to use Deep Research tools as a way to find hyper-specific fiction recommendations as well. The results have been mixed. They don't seem to be very good at grokking the hyper-specificness of what you're looking for, usually they have a heavy bias towards the popular stuff that outweighs what you actually requested[1], and if you ask them to look for obscure works, they tend to output garbage instead of hidden gems (because no taste). It did produce good results a few times, though, and is only slightly worse than asking for recommendations on r/rational. Possibly if I iterate on the prompt a few times (e. g., explicitly point out the above issues?), it'll actually become good. 1. ^ Like, suppose I'm looking for some narrative property X. I want to find fiction with a lot of X. But what the LLM does is multiplying the amount of X in a work by the work's popularity, so that works that are low in X but very popular end up in its selection.
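The footnote's complaint can be made concrete with a toy ranking. The titles and scores below are invented, and the "X times popularity" scoring rule is only Thane's description of the apparent behavior, not a claim about how any model actually scores things.

```python
# Invented catalogue: `x` is how much of the desired property a work has (0 to 1).
works = [
    {"title": "Popular, low X", "x": 0.2, "popularity": 1000},
    {"title": "Mid-list, medium X", "x": 0.5, "popularity": 100},
    {"title": "Obscure, high X", "x": 0.9, "popularity": 10},
]


def rank_by_x(items):
    """What the user asked for: most X first."""
    return sorted(items, key=lambda w: w["x"], reverse=True)


def rank_by_x_times_popularity(items):
    """The described failure mode: X weighted by popularity."""
    return sorted(items, key=lambda w: w["x"] * w["popularity"], reverse=True)


print([w["title"] for w in rank_by_x(works)])
print([w["title"] for w in rank_by_x_times_popularity(works)])
```

Under the second rule the ordering is exactly reversed in this toy catalogue, which matches the experience of getting popular-but-off-target recommendations.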

3Morpheus5mo

I tend to have some luck with concrete analogies sometimes. For example I asked for the equivalent of Tonedeff (His polymer album is my favorite album) in other genres and it recommended me Venetian Snares. I then listened to some of his songs and it seemed like the kind of experimental stuff where I might find something I find interesting. Venetian Snares has 80k monthly listeners while Tonedeff has 14K, so there might be some weighting towards popularity, but that seems mild.

3Garrett Baker6mo

I can think of reasons why some would be wary, and am wary of something which could be called “meme poisoning” myself when I watch movies, but am curious what kind of meme poisoning you have in mind here.

[-]Raemon6moΩ9216

I've updated marginally towards this (as a guy pretty focused on LLM-augmentation. I anticipated LLM brain rot, but it still was more pernicious/fast than I expected)

I do still think some-manner-of-AI-integration is going to be an important part of "moving forward" but probably not whatever capitalism serves up.

I have tried out using them pretty extensively for coding. The speedup is real, and I expect it to get more real. Right now it's like a pretty junior employee that I get to infinitely micromanage. But it definitely does lull me into a lower agency state where instead of trying to solve problems myself I'm handing them off to LLMs much of the time to see if it can handle it.

During work hours, I try to actively override this, i.e. have the habit "send LLM off, and then go back to thinking about some kind of concrete thing (although often a higher-level strategy)." But, this becomes harder to do as it gets later in the day and I get more tired.

One of the benefits of LLMs is that you can do moderately complex cognitive work* while tired (*that a junior engineer could do). But, that means by default a bunch of time is spent specifically training the habit of using LLMs in... (read more)

[-]Thane Ruthenis6mo*Ω11296

(Disclaimer: only partially relevant rant.)

Outside of [coding], I don't know of it being more than a somewhat better google

I've recently tried heavily leveraging o3 as part of a math-research loop.

I have never been more bearish on LLMs automating any kind of research than I am now.

And I've tried lots of ways to make it work. I've tried telling it to solve the problem without any further directions, I've tried telling it to analyze the problem instead of attempting to solve it, I've tried dumping my own analysis of the problem into its context window, I've tried getting it to search for relevant lemmas/proofs in math literature instead of attempting to solve it, I've tried picking out a subproblem and telling it to focus on that, I've tried giving it directions/proof sketches, I've tried various power-user system prompts, I've tried resampling the output thrice and picking the best one. None of this made it particularly helpful, and the bulk of the time was spent trying to spot where it's lying or confabulating to me in its arguments or proofs (which it ~always did).

It was kind of okay for tasks like "here's a toy setup, use a well-known formula to compute the relationships between ... (read more)

4Raemon6mo

Nod. [disclaimer, not a math guy, only barely knows what he's talking about, if this next thought is stupid I'm interested to learn more] I don't expect this to fix it right now, but, one thing I don't think you listed is doing the work in Lean or some other proof assistant that lets you check results immediately? I expect LLMs to first be able to do math in that format because it's the format you can actually do a lot of training in. And it'd mean you can verify results more quickly. My current vague understanding is that Lean is normally too cumbersome to be reasonable to work in, but, that's the sort of thing that could change with LLMs in the mix.

4Thane Ruthenis6mo

I agree that it's a promising direction. I did actually try a bit of that back in the o1 days. What I've found is that getting LLMs to output formal Lean proofs is pretty difficult: they really don't want to do that. When they're not making mistakes, they use informal language as connective tissue between Lean snippets, they put in "sorry"s (a placeholder that makes a lemma evaluate as proven), and otherwise try to weasel out of it. This is something that should be solvable by fine-tuning, but at the time, there weren't any publicly available decent models fine-tuned for that. We do have DeepSeek-Prover-V2 now, though. I should look into it at some point. But I am not optimistic, sounds like it's doing the same stuff, just more cleverly. Relevant: Terence Tao does find them helpful for some Lean-related applications.
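For anyone unfamiliar with the "sorry" trick mentioned above, here is a minimal Lean 4 illustration. The theorem names are made up; `Nat.add_comm` is the standard core lemma. The first declaration is accepted by Lean (with a warning) despite containing no proof at all, which is why output full of `sorry`s looks verified without being so.

```lean
-- `sorry` closes any goal, so Lean accepts this with a warning rather than an error.
theorem add_comm_placeholder (a b : Nat) : a + b = b + a := by
  sorry

-- The same statement with an actual proof term.
theorem add_comm_proved (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

A checker that only asks "did the file compile?" will pass both, so any pipeline relying on Lean for verification also has to reject files that still contain `sorry`.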

4Raemon6mo

yeah, it's less that I'd bet it works now, just, whenever it DOES start working, it seems likely it'd be through this mechanism. ⚖ If Thane Ruthenis thinks there are AI tools that can meaningfully help with Math by this point, did they first have a noticeable period (> 1 month) where it was easier to get work out of them via working in lean-or-similar? (Raymond Arnold: 25% & 60%) (I had a bit of an epistemic rollercoaster making this prediction, I updated "by the time someone makes an actually worthwhile Math AI, even if Lean was an important part of its training process, it's probably not that hard to do additional fine tuning that gets it to output stuff in a more standard mathy format." But, then, it seemed like it was still going to be important to quickly check it wasn't blatantly broken as part of the process)

[-]Rana Dexsin6mo112

(I feel sort of confused about how people who don't use it for coding are doing. With coding, I can feel the beginnings of a serious exoskeleton that can build structures around me with thought. Outside of that, I don't know of it being more than a somewhat better google).

There's common ways I currently use (the free version of) ChatGPT that are partially categorizable as “somewhat better search engine”, but where I feel like that's not representative of the real differences. A lot of this is coding-related, but not all, and the reasons I use it for coding-related and non-coding-related tasks feel similar. When it is coding-related, it's generally not of the form of asking it to write code for me that I'll then actually put into a project, though occasionally I will ask for example snippets which I can use to integrate the information better mentally before writing what I actually want.

The biggest difference in feel is that a chat-style interface is predictable and compact and avoids pushing a full-sized mental stack frame and having to spill all the context of whatever I was doing before. (The name of the website Stack Exchange is actually pretty on point here, insofar as they ... (read more)

8Raemon6mo

(fwiw, I never felt like phones offered any real "you need them to not fall behind". They are kinda a nice-to-have in some situations. I do need them for Uber/Lyft and maps, I use them for other things which have some benefits and costs, this post is upweighting "completely block the internet on my phone." I don't have any social media apps on my phone but it doesn't matter much, I just use the web browser)

3Rana Dexsin6mo

I imagine this differs a lot based on what social position you're already in and where you're likely to get your needs met. When assumptions like “everyone has a smartphone” become sufficiently widespread, you can be blocked off from things unpredictably when you don't meet them. You often can't tell which things these are in advance: simplification pressure causes a phase transition from “communicated request” to “implicit assumption”, and there's too many widely-distributed ways for the assumption to become relevant, so doing your own modeling will produce a “reliably don't need” result so infrequently as to be effectively useless. Then, if making the transition to conformity when you notice a potential opportunity is too slow or is blocked by e.g. resource constraints or value differences, a lot of instant-lose faces get added to the social dice you roll. If your anticipated social set is already stable and well-adapted to you, you may not be rolling many dice, but if you're precarious, or searching for breakthrough opportunities, or just have a role with more wide-ranging and unpredictable requirements on which interactions you need to succeed at, it's a huge penalty. Other technologies this often happens with in the USA, again depending on your social class and milieu, include cars, credit cards, and Facebook accounts. (It feels like there has to already be an explainer for this somewhere in the LW-sphere, right? I didn't see an obvious one, though…)

2Elizabeth6mo

yeah a friend of mine gave in because she was getting so much attitude about needing people to give her directions.

5Rana Dexsin6mo

You've reminded me of a perspective I was meaning to include but then forgot to, actually. From the perspective of an equilibrium in which everyone's implicitly expected to bring certain resources/capabilities as table stakes, making a personal decision that makes your life better but reduces your contribution to the pool can be seen as defection—and on a short time horizon or where you're otherwise forced to take the equilibrium for granted, it seems hard to refute! (ObXkcd: “valuing unit standardization over being helpful possibly makes me a bad friend” if we take the protagonist as seeing “US customary units” as an awkward equilibrium.) Some offshoots of this which I'm not sure what to make of: 1. If the decision would lead to a better society if everyone did it, and leads to an improvement for you if only you do it, but requires the rest of a more localized group to spend more energy to compensate for you if you do it and they don't, we have a sort of “incentive misalignment sandwich” going on. In practice I think there's usually enough disagreement about the first point that this isn't clear-cut, but it's interesting to notice. 2. In the face of technological advances, what continues to count as table stakes tends to get set by Moloch and mimetic feedback loops rather than intentionally. In a way, people complaining vociferously about having to adopt new things are arguably acting in a counter-Moloch role here, but in the places I've seen that happen, it's either been ineffective or led to a stressful and oppressive atmosphere of its own (or, most commonly and unfortunately, both). 3. I think intuitive recognition of (2) is a big motivator behind attacking adopters of new technology that might fall into this pattern, in a way that often gets poorly expressed in a “tech companies ruin everything” type of way. Personally taking up smartphones, or cars, or—nowadays the big one that I see in my other circles—generative AI, even if you don't yourself look down

1Rana Dexsin6mo

(Now much more tangentially:) … hmm, come to think of it, maybe part of conformity-pressure in general can be seen as a special case of this where the pool resource is more purely “cognition and attention spent dealing with non-default things” and the nonconformity by default has more of a purely negative impact on that axis, whereas conformity-pressure over technology with specific capabilities causes the nature of the pool resource to be pulled in the direction of what the technology is providing and there's an active positive thing going on that becomes the baseline… I wonder if anything useful can be derived from thinking about those two cases as denoting an axis of variation. And when the conformity is to a new norm that may be more difficult to understand but produces relative positive externalities in some way, is that similar to treating the new norm as a required table stakes cognitive technology?

1Purplehermann6mo

I mostly use it for syntax, and formatting/modifying docs, giving me quick visual designs...

[-]Random Developer6mo172

I am perhaps an interesting corner case. I make extremely heavy use of LLMs, largely via APIs for repetitive tasks. I sometimes run a quarter million queries in a day, all of which produce structured output. Incorrect output happens, but I design the surrounding systems to handle that.
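To give a flavor of what "design the surrounding systems to handle that" can look like, here is a minimal sketch, not a description of my actual setup. The schema, the `call_model` callable, and the retry count are all hypothetical; the point is only that every response is parsed and validated before anything downstream sees it.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"label": str, "confidence": float}


def parse_structured(raw):
    """Return a validated dict, or None if the model output is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            return None
    return data


def query_with_retries(call_model, prompt, max_attempts=3):
    """Call the model until its output validates; give up after `max_attempts`."""
    for _ in range(max_attempts):
        result = parse_structured(call_model(prompt))
        if result is not None:
            return result
    return None  # the caller decides whether to skip, log, or escalate
```

Here `call_model` stands in for whatever API client is in use; it just needs to take a prompt string and return the raw text of the response.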

A few times a week, I might ask a concrete question and get a response, which I treat with extreme skepticism.

But I don't talk to the damn things. That feels increasingly weird and unwise.

[-]Cole Wyeth6mo110

Agree about phones (in fact I am seriously considering switching to a flip phone and using my iPhone only for things like navigation).

Not so sure about LLMs. I had your attitude initially, and I still consider them an incredibly dangerous mental augmentation. But I do think that conservatively throwing a question at them to find searchable keywords is helpful, if you maintain the attitude that they are actively trying to take over your brain and therefore remain vigilant.

8DirectedEvolution6mo

Why do you think LLMs are moving people backwards? With phones, it was their attention-sucking nature. What is it with LLMs?

5[anonymous]6mo

Not speaking for John, but I think LLMs can cause a lack of gears-level understanding, more vibe coding, less mental flexibility due to lack of deliberate thought, and more dependency on them for thinking in general. A lot of my friends will most likely never learn coding properly and rely solely on ChatGPT; it would be similar to calculators (which reduced people's ability to do mental maths), but for thinking.

3TimothyTV6mo

The danger of LLMs lies in their ability to solve the majority of simple problems. This reduces opportunities to learn skills or benefit from the training these tasks provide. This allows for a level of mental stagnation, or even degradation, depending on how frequently you use LLMs to handle problems. In other words, it induces mental laziness. This is one way it's not moving people forward and, in more severe cases, is moving them backward. As a side note, it is also harmful to the majority of current education institutions, as it can solve most academic problems. I have personally seen people use it to do homework, write essays, or even write term papers. Some of the more crafty students manage to cheat with it on exams. This creates a very shallow education, which is bad for many reasons.

3DirectedEvolution6mo

Setting aside cheating, do you think LLMs are diminishing opportunities for thought, or redistributing them to other topics? And why?

1TimothyTV6mo

Yes, I do think that. They don't actively diminish thought; after all, it's a tool you decide to use. But when you use it to handle a problem, you lose the thoughts, and the growth you could've had solving it yourself. It could be argued, however, that if you are experienced enough in solving such problems, there isn't much to lose, and you gain time to pursue other issues. But as to why I think this way: people already don't learn skills because ChatGPT can do it for them, as lesswronguser123 said "A lot of my friends will most likely never learn coding properly and rely solely on ChatGPT", and not just his friends use it this way. Such people, at the very least, lose the opportunity to adopt a programming mindset, which is useful beyond programming. Outside of people not learning skills, I also believe there is a lot of potential to delegate almost all of your thinking to ChatGPT. For example: I could have used it to write this response, decide what to eat for breakfast, tell me what I should do in the future, etc. It can tell you what to do on almost every day-to-day decision. Some use it to a lesser extent, some to a greater, but you do think less if you use it this way. Does it redistribute thinking to another topic? I believe it depends on the person in question: some use it to have more time to solve a more complex problem, others to have more time for entertainment.

3DirectedEvolution6mo

I think that these are genuinely hard questions to answer in a scientific way. My own speculation is that using AI to solve problems is a skill of its own, along with recognizing which problems they are currently not good for. Some use of LLMs teaches these skills, which is useful. I think a potential failure mode for AI might be when people systematically choose to work on lower-impact problems that AI can be used to solve, rather than higher-impact problems that AI is less useful for but that can be solved in other ways. Of course, AI can also increase people's ambitions by unlocking the ability to pursue higher-impact goals they would not have been able to otherwise achieve. Whether AI increases or decreases human ambition on net seems like a key question. In my world, I see limited use of AI except as a complement to traditional internet search, a coding assistant for competent programmers, a sort of Grammarly on steroids, an OK-at-best tutor that's cheap and always available on any topic, and a way to get meaningless paperwork done faster. These use cases all seem basically ambition-enhancing to me. That's the reason I asked John why he's worried about this version of AI. My experience is that once I gained some familiarity with the limitations of AI, it's been a straightforwardly useful tool, with none of the serious downsides I have experienced from social media and smartphones. The issues I've seen seem to have to do with using AI to deepfake political policy proposals, homework, blog posts, and job applications. These are genuine and serious problems, but mainly have to do with adding a tremendous amount of noise to collective discourse rather than the self-sabotage enabled by smartphones and social media. So I'm wondering if John's more concerned about those social issues or by some sort of self-sabotage capacity from AI that I'm not seeing. Using AI to do your homework is obviously self-sabotage, but given the context I'm assuming that's not wha

8TsviBT6mo

I mean, they're great as search engines or code-snippet writers (basically, search engine for standard functions). If someone thinks that gippities know stuff or can think or write well, that could be brainrotting.

7johnswentworth6mo

Agreed, that's basically how I use them.

7Gunnar_Zarncke6mo

...but you are using a phone now. Are you using LLMs? Maybe in both cases it is about using the tool in the way that benefits most?

5Viliam6mo

From my perspective, good things about smartphones:
* phone, camera, and navigation are the same device
* very rarely, check something online
* buy tickets for mass transit
* my contacts are backed up in the cloud

Bad things:
* notifications

The advantages outweigh the disadvantages, but it requires discipline about what you install. (Food for thought: If only I had the same discipline about which web services I create an account for and put them into bookmarks on my PC.)

Similar here, but that's because no one could give me a good use case. (I don't consider social networks on smartphone to be good.)

And it's probably similar with LLMs; it depends on how specifically you use them. I use them to ask questions (like a smarter version of Google) that I try to verify e.g. on Wikipedia afterwards, and sometimes to write code. Those seem like good things to me. There are probably bad ways to use them, but that is not what I would typically do.

4Adam Zerner6mo

My main concern with heavy LLM usage is what Paul Graham discusses in Writes and Write-Nots. His argument is basically that writing is thinking and that if you use LLMs to do your writing for you, well, your ability to think will erode.

4Adam Zerner6mo

I'm similar, for both smart phones and LLM usage. For smart phones there was one argument that moved me a moderate amount. I'm a web developer and startup founder. I was talking to my cousin's boyfriend who is also in tech. He made the argument to me that if I don't actively use smart phones I won't be able to empathize as much with smart phone users, which is important because to a meaningful extent, that's who I'm building for. I didn't think the empathy point was as strong as my cousin's boyfriend thought it was. Like, he seemed to think it was pretty essential and that if I don't use smart phones I just wouldn't be able to develop enough empathy to build a good product. I, on the other hand, saw it as something "useful" but not "essential". Looking back, I think I'd downgrade it to something like "a little useful" instead of "useful". I'm not sure where I'm going with this, exactly. Just kinda reflecting and thinking out loud.

2β-redex6mo

Conditional on LLMs scaling to AGI, I feel like it's a contradiction to say that "LLMs offer little or negative utility AND it's going to stay this way". My model is that we are either dying in a couple years to LLMs getting us to AGI, and we are going to have a year or two of AIs that can provide incredible utility, or we are not dying to LLMs and the timelines are longer. I think I read somewhere that you don't believe LLMs will get us to AGI, so this might already be implicit in your model? I personally am putting at least some credence on the ai-2027 model, which predicts superhuman coders in the near future. (Not saying that I believe this is the most probable future, just that I find it convincing enough that I want to be prepared for it.)

Up until recently I was in the "LLMs offer zero utility" camp (for coding), but now at work we have a Cursor plan (still would not pay for it for personal use probably), and with a lot of trial and error I feel like I am finding the kinds of tasks where AIs can offer a bit of utility, and I am slowly moving towards the "marginal utility" camp. One kind of thing I like using it for is small scripts to automate bits of my workflow. E.g. I have an idea for a script, I know it would take me 30m-1h to implement it, but it's not worth it because e.g. it would only save me a few seconds each time. But if I can reduce the time investment to only a few minutes by giving the task to the LLM, it can suddenly be worth it.

I would be interested in other people's experiences with the negative side effects of LLM use. What are the symptoms/warning signs of "LLM brain rot"? I feel like with my current use I am relatively well equipped to avoid that:
* I only ask things from LLMs that I know I could solve in a few hours tops.
* I code review the result, tell it if it did something stupid.
* 90% of my job is stuff that is currently not close to being LLM automatable anyway.
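The "small script" trade-off described above is a break-even calculation, and a rough sketch may make it concrete. The function name `break_even_uses` is hypothetical, and so are the 5 seconds saved per use and the 5-minute LLM-assisted build time; only the 30m-1h range (taken here at its midpoint) and "a few minutes" come from the comment.

```python
def break_even_uses(build_minutes: float, seconds_saved_per_use: float) -> float:
    """Number of uses after which a script has repaid the time spent building it."""
    if seconds_saved_per_use <= 0:
        raise ValueError("seconds_saved_per_use must be positive")
    return build_minutes * 60 / seconds_saved_per_use


# Assumed inputs: 45 minutes is the midpoint of the comment's 30m-1h range;
# 5 minutes stands in for "a few minutes"; 5 seconds stands in for "a few seconds".
by_hand = break_even_uses(build_minutes=45, seconds_saved_per_use=5)
with_llm = break_even_uses(build_minutes=5, seconds_saved_per_use=5)

print(f"written by hand: pays off after {by_hand:.0f} uses")
print(f"written via LLM: pays off after {with_llm:.0f} uses")
```

Under these assumed numbers the break-even point drops by the same factor as the build time, which is the sense in which a script that was "not worth it" can suddenly become worth it.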

[-]johnswentworth10mo9371

Hypothesis: for smart people with a strong technical background, the main cognitive barrier to doing highly counterfactual technical work is that our brains' attention is mostly steered by our social circle. Our thoughts are constantly drawn to think about whatever the people around us talk about. And the things which are memetically fit are (almost by definition) rarely very counterfactual to pay attention to, precisely because lots of other people are also paying attention to them.

Two natural solutions to this problem:

build a social circle which can maintain its own attention, as a group, without just reflecting the memetic currents of the world around it.
"go off into the woods", i.e. socially isolate oneself almost entirely for an extended period of time, so that there just isn't any social signal to be distracted by.

These are both standard things which people point to as things-historically-correlated-with-highly-counterfactual-work. They're not mutually exclusive, but this model does suggest that they can substitute for each other - i.e. "going off into the woods" can substitute for a social circle with its own useful memetic environment, and vice versa.

[-]aysja10mo469

One thing that I do after social interactions, especially those which pertain to my work, is to go over all the updates my background processing is likely to make and to question them more explicitly.

This is helpful because I often notice that the updates I’m making aren’t related to reasons much at all. It’s more like “ah they kind of grimaced when I said that, so maybe I'm bad?” or like “they seemed just generally down on this approach, but wait are any of those reasons even new to me? Haven’t I already considered those and decided to do it anyway?” or “they seemed so aggressively pessimistic about my work, but did they even understand what I was saying?” or “they certainly spoke with a lot of authority, but why should I trust them on this, and do I even care about their opinion here?” Etc. A bunch of stuff which at first blush my social center is like “ah god, it’s all over, I’ve been an idiot this whole time” but with some second glancing it’s like “ah wait no, probably I had reasons for doing this work that withstand surface level pushback, let’s remember those again and see if they hold up” And often (always?) they do.

This did not come naturally to me; I’ve had to train myself into doing it. But it has helped a lot with this sort of problem, alongside the solutions you mention i.e. becoming more of a hermit and trying to surround myself by people engaged in more timeless thought.

[-]Mateusz Bagiński10mo105

solution 2 implies that a smart person with a strong technical background would go on to work on important problems (by default) which is not necessarily universally true and it's IMO likely that many such people would be working on less important things than what their social circle is otherwise steering them to work on

[-]johnswentworth10mo1210

The claim is not that either "solution" is sufficient for counterfactuality, it's that either solution can overcome the main bottleneck to counterfactuality. After that, per Amdahl's Law, there will still be other (weaker) bottlenecks to overcome, including e.g. keeping oneself focused on something important.
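For readers who want the Amdahl's Law reference spelled out, here is a minimal sketch. The formula is the standard one; the mapping onto research bottlenecks, and the 80% figure, are illustrative assumptions rather than anything claimed in the comment.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work is sped up by a factor s.

    Standard Amdahl's Law: 1 / ((1 - p) + p / s).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


# Illustrative assumption: the main bottleneck accounts for 80% of the "cost".
# Even removing it entirely (s -> infinity) leaves the overall gain capped
# at 1 / (1 - p), because the remaining 20% still has to be dealt with.
partial = amdahl_speedup(p=0.8, s=4)
complete = amdahl_speedup(p=0.8, s=float("inf"))

print(f"bottleneck reduced 4x:    {partial:.2f}x overall")
print(f"bottleneck fully removed: {complete:.2f}x overall")
```

The cap at 1 / (1 - p) is the point of the analogy: overcoming the main bottleneck helps a lot, but the other, weaker bottlenecks then determine what happens next.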

8Raemon10mo

I don't think the social thing ranks above "be able to think useful important thoughts at all". (But maybe otherwise agree with the rest of your model as an important thing to think about) [edit: hrm, "for smart people with a strong technical background" might be doing most of the work here]

5faul_sname10mo

Plausibly going off into the woods decreases the median output while increasing the variance.

4Garrett Baker10mo

Why do you think this? When I try to think of concrete examples here, it's all confounded by the relevant smart people having social circles not working on useful problems. I also think that 2 becomes more true once the relevant smart person already wants to solve alignment, or otherwise is already barking up the right tree.

2the gears to ascension10mo

One need not go off into the woods indefinitely, though.

4Mateusz Bagiński10mo

I don't think I implied that John's post implied that and I don't think going into the woods non-indefinitely mitigates the thing I pointed out.

9Rauno Arike10mo

As a counterpoint to the "go off into the woods" strategy, Richard Hamming said the following in "You and Your Research", describing his experience at Bell Labs: Bell Labs certainly produced a lot of counterfactual research, Shannon's information theory being the prime example. I suppose Bell Labs might have been well-described as a group that could maintain its own attention, though.

9johnswentworth10mo

Bell Labs is actually my go-to example of a much-hyped research institution whose work was mostly not counterfactual; see e.g. here. Shannon's information theory is the only major example I know of highly counterfactual research at Bell Labs. Most of the other commonly-cited advances, like e.g. transistors or communication satellites or cell phones, were clearly not highly counterfactual when we look at the relevant history: there were other groups racing to make the transistor, and the communication satellite and cell phones were both old ideas waiting on the underlying technology to make them practical. That said, Hamming did sit right next to Shannon during the information theory days IIRC, so his words do carry substantial weight here.

9leogao10mo

solution 3 is to be an iconoclast and to feel comfortable pushing against the flow and to try to prove everyone else wrong.

[-]johnswentworth10mo3228

Good idea, but... I would guess that basically everyone who knew me growing up would say that I'm exactly the right sort of person for that strategy. And yet, in practice, I still find it has not worked very well. My attention has in fact been unhelpfully steered by local memetic currents to a very large degree.

For instance, I do love proving everyone else wrong, but alas reversed stupidity is not intelligence. People mostly don't argue against the high-counterfactuality important things, they ignore the high-counterfactuality important things. Trying to prove them wrong about the things they do argue about is just another way of having one's attention steered by the prevailing memetic currents.

[-]TsviBT10mo124

People mostly don't argue against the high-counterfactuality important things, they ignore the high-counterfactuality important things. Trying to prove them wrong about the things they do argue about is just another way of having one's attention steered by the prevailing memetic currents.

This is true, but I still can't let go of the fact that this fact itself ought to be a blindingly obvious first-order bit that anyone who calls zerself anything like "aspiring rationalist" would be paying a good chunk of attention to, and yet this does not seem to be the case. Like, motions in the genre of

huh I just had reaction XYZ to idea ABC generated by a naively-good search process, and it seems like this is probably a common reaction to ABC; but if people tend to react to ABC with XYZ, and with other things coming from the generators of XYZ, then such and such distortion in beliefs/plans would be strongly pushed into the collective consciousness, e.g. on first-order or on higher-order deference effects; so I should look out for that, e.g. by doing some manual fermi estimates or other direct checking about ABC or by investigating the strength of the steelman of reaction XYZ, or by keeping an eye out for people systematically reacting with XYZ without good foundation so I can notice this,

where XYZ could centrally be things like e.g. copium or subtly contemptuous indifference, do not seem to be at all common motions.

3Morpheus10mo

Accusing people in my head of not being numerate enough when this happens has helped, because then I don't want to be a hypocrite. GPT4o or o1 are good at fermi estimates, making this even easier.

6Viliam10mo

Note that it is not necessary for the social circle to share your beliefs, only to have a social norm that people express interest in each other's work. Could be something like: once or twice a week the people will come to a room and everyone will give a presentation about what they have achieved recently, and maybe the other people will provide some feedback (not in the form of "why don't you do Y instead", but with the assumption that X is a thing worth doing).

3Cole Wyeth10mo

How would this model treat mathematicians working on hard open problems? P vs NP might be counterfactual just because no one else is smart enough or has the right advantage to solve it. Insofar as central problems of a field have been identified but not solved, I’m not sure your model gives good advice.

[-]Daniel Murfet10mo195

I visited Mikhail Khovanov once in New York to give a seminar talk, and after it was all over and I was wandering around seeing the sights, he gave me a call and offered a long string of general advice on how to be the kind of person who does truly novel things (he's famous for this, you can read about Khovanov homology). One thing he said was "look for things that aren't there" haha. It's actually very practical advice, which I think about often and attempt to live up to!

6Adele Lopez10mo

What else did he say? (I'd love to hear even the "obvious" things he said.)

[-]Daniel Murfet10mo170

I'm ashamed to say I don't remember. That was the highlight. I think I have some notes on the conversation somewhere and I'll try to remember to post here if I ever find it.

I can spell out the content of his Koan a little, if it wasn't clear. It's probably more like: look for things that are (not there). If you spend enough time in a particular landscape of ideas, you can (if you're quiet and pay attention and aren't busy jumping on bandwagons) get an idea of a hole, which you're able to walk around but can't directly see. In this way new ideas appear as something like residues from circumnavigating these holes. It's my understanding that Khovanov homology was discovered like that, and this is not unusual in mathematics.

By the way, that's partly why I think the prospect of AIs being creative mathematicians in the short term should not be discounted; if you see all the things, you see all the holes.

[-]Alexander Gietelink Oldenziel10mo156

For those who might not have noticed Dan's clever double entendre: (Khovanov) homology is literally about counting/measuring holes in weird high-dimensional spaces - designing a new homology theory is in a very real sense about looking for holes that are not (yet) there.

3Mitchell_Porter10mo

Are there any examples yet, of homology or cohomology being applied to cognition, whether human or AI?

[-]Daniel Murfet10mo110

There's plenty, including a line of work by Carina Curto, Kathryn Hess and others that is taken seriously by a number of mathematically inclined neuroscience people (Tom Burns if he's reading can comment further). As far as I know this kind of work is the closest to breaking through into the mainstream. At some level you can think of homology as a natural way of preserving information in noisy systems, for reasons similar to why (co)homology of tori was a useful way for Kitaev to formulate his surface code. Whether or not real brains/NNs have some emergent computation that makes use of this is a separate question; I'm not aware of really compelling evidence.

There is more speculative but definitely interesting work by Matilde Marcolli. I believe Manin has thought about this (because he's thought about everything) and if you have twenty years to acquire the prerequisites (gamma spaces!) you can gaze into deep pools by reading that too.

6Alexander Gietelink Oldenziel10mo

No.

4Garrett Baker10mo

Topological data analysis comes closest, and there are some people who try to use it for ML, eg. Though my understanding is this is used in interp, not so much because people necessarily expect deep connections to homology, but because it's just another way to look for structure in your data. TDA itself is also a relatively shallow tool.

3Lorxus10mo

As someone who does both data analysis and algebraic topology, my take is that TDA showed promise but ultimately there's something missing such that it's not at full capacity. Either the formalism isn't developed enough or it's being consistently used on the wrong kinds of datasets. Which is kind of a shame, because it's the kind of thing that should work beautifully and in some cases even does!

3[anonymous]10mo

I thought it might be "look for things that might not even be there as hard as you would if they are there." Then the koan form takes it closer to "the thereness of something just has little relevance on how hard you look for it." But it needs to get closer to the "biological" part of your brain, where you're not faking it with all your mental and bodily systems, like when your blood pressure rises from "truly believing" a lion is around the corner but wouldn't if you "fake believe" it.

3Lorxus10mo

I imagine it's something like "look for things that are notably absent, when you would expect them to have been found if there"?

2TsviBT10mo

Some things even withdraw. https://tsvibt.blogspot.com/2023/05/the-possible-shared-craft-of-deliberate.html#aside-on-withdrawal-and-the-leap https://tsvibt.blogspot.com/2023/09/a-hermeneutic-net-for-agency.html#withdrawal

2ozziegooen10mo

Obvious point - I think a lot of this comes from the financial incentives. The more "out of the box" you go, the less sure you can be that there will be funding for your work. Some of those that do this will be rewarded, but I suspect many won't be. As such, I think that funders can help more to encourage this sort of thing, if they want to.

[-]johnswentworth1y794

Conjecture's Compendium is now up. It's intended to be a relatively-complete intro to AI risk for nontechnical people who have ~zero background in the subject. I basically endorse the whole thing, and I think it's probably the best first source to link e.g. policymakers to right now.

I might say more about it later, but for now just want to say that I think this should be the go-to source for new nontechnical people right now.

[-]Orpheus161y3616

I think there's something about Bay Area culture that can often get technical people to feel like the only valid way to contribute is through technical work. It's higher status and sexier and there's a default vibe that the best way to understand/improve the world is through rigorous empirical research.

I think this is an incorrect (or at least incomplete) frame, and I think on-the-margin it would be good for more technical people to spend 1-5 days seriously thinking about what alternative paths they could pursue in comms/policy.

I also think there are memes spreading around that you need to be some savant political mastermind genius to do comms/policy, otherwise you will be net negative. The more I meet policy people (including successful policy people from outside the AIS bubble), the more I think this narrative was, at best, an incorrect model of the world. At worst, a take that got amplified in order to prevent people from interfering with the AGI race (e.g., by granting excess status+validity to people/ideas/frames that made it seem crazy/unilateralist/low-status to engage in public outreach, civic discourse, and policymaker engagement.)

(Caveat: I don't think the adversarial frame explains everything, and I do think there are lots of people who were genuinely trying to reason about a complex world and just ended up underestimating how much policy interest there would be and/or overestimating the extent to which labs would be able to take useful actions despite the pressures of race dynamics.)

[-]aysja1y268

I think I probably agree, although I feel somewhat wary about it. My main hesitations are:

The lack of epistemic modifiers seems off to me, relative to the strength of the arguments they’re making. Such that while I agree with many claims, my imagined reader who is coming into this with zero context is like “why should I believe this?” E.g., “Without intervention, humanity will be summarily outcompeted and relegated to irrelevancy,” which like, yes, but also—on what grounds should I necessarily conclude this? They gave some argument along the lines of “intelligence is powerful,” and that seems probably true, but imo not enough to justify the claim that it will certainly lead to our irrelevancy. All of this would be fixed (according to me) if it were framed more as like “here are some reasons you might be pretty worried,” of which there are plenty, or "here's what I think," rather than “here is what will definitely happen if we continue on this path,” which feels less certain/obvious to me.
Along the same lines, I think it’s pretty hard to tell whether this piece is in good faith or not. E.g., in the intro Connor writes “The default path we are on now is one of ruthless, sociopathic c

... (read more)

[-]Orpheus161y146

One of the common arguments in favor of investing more resources into current governance approaches (e.g., evals, if-then plans, RSPs) is that there's nothing else we can do. There's not a better alternative– these are the only things that labs and governments are currently willing to support.

The Compendium argues that there are other (valuable) things that people can do, with most of these actions focusing on communicating about AGI risks. Examples:

Share a link to this Compendium online or with friends, and provide your feedback on which ideas are correct and which are unconvincing. This is a living document, and your suggestions will shape our arguments.
Post your views on AGI risk to social media, explaining why you believe it to be a legitimate problem (or not).
Red-team companies’ plans to deal with AI risk, and call them out publicly if they do not have a legible plan.

One possible critique is that their suggestions are not particularly ambitious. This is likely because they're writing for a broader audience (people who haven't been deeply engaged in AI safety).

For people who have been deeply engaged in AI safety, I think the natural steelman here is "focus on helping the ... (read more)

6Orpheus161y

I appreciated their section on AI governance. The "if-then"/RSP/preparedness frame has become popular, and they directly argue for why they oppose this direction. (I'm a fan of preparedness efforts– especially on the government level– but I think it's worth engaging with the counterarguments.) Pasting some content from their piece below. High-level thesis against current AI governance efforts: Critique of reactive frameworks: Critique of waiting for warning shots:

5Bogdan Ionut Cirstea1y

This seems to be confusing a dangerous capability eval (of being able to 'deceive' in a visible scratchpad) with an assessment of alignment, which seems like exactly what the 'questioning' was about.

5Nathan Helm-Burger1y

I like it. I do worry that it, and The Narrow Path, are both missing how hard it will be to govern and restrict AI.

4Nathan Helm-Burger1y

My own attempt is much less well written and comprehensive, but I think I hit on some points that theirs misses: https://www.lesswrong.com/posts/NRZfxAJztvx2ES5LG/a-path-to-human-autonomy

2epistemic meristem1y

(There was already a linkpost.)

[-]johnswentworth2y7811

NVIDIA Is A Terrible AI Bet

Short version: Nvidia's only moat is in software; AMD already makes flatly superior hardware priced far lower, and Google probably does too but doesn't publicly sell it. And if AI undergoes smooth takeoff on current trajectory, then ~all software moats will evaporate early.

Long version: Nvidia is pretty obviously in a hype-driven bubble right now. However, it is sometimes the case that (a) an asset is in a hype-driven bubble, and (b) it's still a good long-run bet at the current price, because the company will in fact be worth that much. Think Amazon during the dot-com bubble. I've heard people make that argument about Nvidia lately, on the basis that it will be ridiculously valuable if AI undergoes smooth takeoff on the current apparent trajectory.

My core claim here is that Nvidia will not actually be worth much, compared to other companies, if AI undergoes smooth takeoff on the current apparent trajectory.

Other companies already make ML hardware flatly superior to Nvidia's (in flops, memory, whatever), and priced much lower. AMD's MI300x is the most obvious direct comparison. Google's TPUs are probably another example, though they're not sold publicly s... (read more)

[-]MichaelStJules2y290

Why do you believe AMD and Google make better hardware than Nvidia?

[-]johnswentworth2y200

The easiest answer is to look at the specs. Of course specs are not super reliable, so take it all with many grains of salt. I'll go through the AMD/Nvidia comparison here, because it's a comparison I looked into a few months back.

MI300x vs H100

Techpowerup is a third-party site with specs for the MI300x and the H100, so we can do a pretty direct comparison between those two pages. (I don't know if the site independently tested the two chips, but they're at least trying to report comparable numbers.) The H200 would arguably be more of a "fair comparison" since the MI300x came out much later than the H100; we'll get to that comparison next. I'm starting with MI300x vs H100 comparison because techpowerup has specs for both of them, so we don't have to rely on either company's bullshit-heavy marketing materials as a source of information. Also, even the H100 is priced 2-4x more expensive than the MI300x (~$30-45k vs ~$10-15k), so it's not unfair to compare the two.

Key numbers (MI300x vs H100):

float32 TFLOPs: ~80 vs ~50
float16 TFLOPs: ~650 vs ~200
memory: 192 GB vs 80 GB (note that this is the main place where the H200 improves on the H100)
bandwidth: ~10 TB/s vs ~2 TB/s

... so the compari... (read more)
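As a rough illustration of the arithmetic behind this comparison, the sketch below divides each quoted spec by price. The spec figures are the approximate ones listed above; the single price per chip is an assumption (the midpoint of each quoted range), and the names `SPECS`, `PRICE_KUSD`, and `per_dollar_advantage` are hypothetical.

```python
# Approximate specs as quoted in the comment above (MI300x vs H100).
SPECS = {
    "MI300x": {"fp32_tflops": 80, "fp16_tflops": 650, "memory_gb": 192, "bandwidth_tbs": 10},
    "H100": {"fp32_tflops": 50, "fp16_tflops": 200, "memory_gb": 80, "bandwidth_tbs": 2},
}

# Assumption: midpoints of the quoted price ranges, in thousands of USD
# (~$10-15k for the MI300x, ~$30-45k for the H100).
PRICE_KUSD = {"MI300x": 12.5, "H100": 37.5}


def per_dollar_advantage(a: str, b: str) -> dict[str, float]:
    """Ratio of chip a's spec-per-dollar to chip b's, for each listed spec."""
    return {
        spec: (SPECS[a][spec] / PRICE_KUSD[a]) / (SPECS[b][spec] / PRICE_KUSD[b])
        for spec in SPECS[a]
    }


for spec, ratio in per_dollar_advantage("MI300x", "H100").items():
    print(f"{spec}: {ratio:.2f}x more per dollar")
```

These ratios inherit all the caveats already stated: spec-sheet numbers are not measured performance, and the result moves with whichever point in each price range one picks.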

[-]ryan_greenblatt2y*242

It's worth noting that even if nvidia is charging 2-4x more now, the ultimate question for competitiveness will be manufacturing cost for nvidia vs amd. If nvidia has much lower manufacturing costs than amd per unit performance (but presumably higher markup), then nvidia might win out even if their product is currently worse per dollar.

Note also that price discrimination might be a big part of nvidia's approach. Scaling labs which are willing to go to great effort to drop compute cost by a factor of two are a subset of nvidia's customers where nvidia would ideally prefer to offer lower prices. I expect that nvidia will find a way to make this happen.

[-]PeterMcCluskey2y170

I'm holding a modest long position in NVIDIA (smaller than my position in Google), and expect to keep it for at least a few more months. I expect I only need NVIDIA margins to hold up for another 3 or 4 years for it to be a good investment now.

It will likely become a bubble before too long, but it doesn't feel like one yet.

[-]James Payor2y131

While the first-order analysis seems true to me, there are mitigating factors:

AMD appears to be bungling on their GPUs being reliable and fast, and probably will for another few years. (At least, this is my takeaway from following the TinyGrad saga on Twitter...) Their stock is not valued as it should be for a serious contender with good fundamentals, and I think this may stay the case for a while, if not forever if things are worse than I realize.
NVIDIA will probably have very-in-demand chips for at least another chip generation due to various inertias.
There aren't many good-looking places for the large amount of money that wants to be long AI to go right now, and this will probably inflate prices for still a while across the board, in proportion to how relevant-seeming the stock is. NVDA rates very highly on this one.

So from my viewpoint I would caution against being short NVIDIA, at least in the short term.

[-]Tao Lin2y123

No, the MI300x is not superior to Nvidia's chips, largely because it costs >2x as much to manufacture as Nvidia's chips

9Ann2y

Potential counterpoints:
* If AI automates most, but not all, software engineering, moats of software dependencies could get more entrenched, because easier-to-use libraries have compounding first-mover advantages.
* The disadvantages of AMD software development potentially need to be addressed at levels not accessible to an arbitrary feral automated software engineer in the wild, to make the stack sufficiently usable. (A lot of actual human software engineers would like the chance.)
* NVIDIA is training their own AIs, who are pretty capable.
* NVIDIA can invest their current profits. (Revenues, not stock valuations.)

[-]gwern2y*13-3

If AI automates most, but not all, software engineering, moats of software dependencies could get more entrenched, because easier-to-use libraries have compounding first-mover advantages.

I don't think the advantages would necessarily compound - quite the opposite, there are diminishing returns and I expect 'catchup'. The first-mover advantage neutralizes itself because a rising tide lifts all boats, and the additional data acts as a prior: you can define the advantage of a better model, due to any scaling factor, as equivalent to n additional datapoints. (See the finetuning transfer papers on this.) When an LLM can zero-shot a problem, that is conceptually equivalent to a dumber LLM which needs 3-shots, say. And so the advantages of a better model will plateau, and can be matched by simply some more data in-context - such as additional synthetic datapoints generated by self-play or inner-monologue etc. And the better the model gets, the more 'data' it can 'transfer' to a similar language to reach a given X% of coding performance. (Think about how you could easily transfer given access to an environment: just do self-play on translating any solved Python problem into the target la... (read more)

3Ann2y

It's probably worth mentioning that there's now a licensing barrier to running CUDA specifically through translation layers: https://www.tomshardware.com/pc-components/gpus/nvidia-bans-using-translation-layers-for-cuda-software-to-run-on-other-chips-new-restriction-apparently-targets-zluda-and-some-chinese-gpu-makers This isn't a pure software engineering time lockin; some of that money is going to go to legal action looking for a hint big targets have done the license-noncompliant thing. Edit: Additionally, I don't think a world where "most but not all" software engineering is automated is one where it will be a simple matter to spin up a thousand effective SWEs of that capability; I think there's first a world where that's still relatively expensive even if most software engineering is being done by automated systems. Paying $8000 for overnight service of 1000 software engineers would be a rather fine deal, currently, but still too much for most people.

8gwern2y

I don't think that will be at all important. You are creating alternate reimplementations of the CUDA API, you aren't 'translating' or decompiling it. And if you are buying billions of dollars of GPUs, you can afford to fend off some Nvidia probes and definitely can pay $0.000008b periodically for an overnighter. (Indeed, Nvidia needing to resort to such Oracle-like tactics is a bear sign.)

3Ann2y

While there's truth in what you say, I also think a market that's running thousands of software engineers is likely to be hungry for as many good GPUs as the current manufacturers can make. NVIDIA not being able to sustain a relative monopoly forever still doesn't put it in a bad position.

[-]gwern2y182

People will hunger for all the GPUs they can get, but then that means that the favored alternative GPU 'manufacturer' simply buys out the fab capacity and does so. Nvidia has no hardware moat: they do not own any chip fabs, they don't own any wafer manufacturers, etc. All they do is design and write software and all the softer human-ish bits. They are not 'the current manufacturer' - that's everyone else, like TSMC or the OEMs. Those are the guys who actually manufacture things, and they have no particular loyalty to Nvidia. If AMD goes to TSMC and asks for a billion GPU chips, TSMC will be thrilled to sell the fab capacity to AMD rather than Nvidia, no matter how angry Jensen is.

So in a scenario like mine, if everyone simply rewrites for AMD, AMD raises its prices a bit and buys out all of the chip fab capacity from TSMC/Intel/Samsung/etc - possibly even, in the most extreme case, buying capacity from Nvidia itself, as it suddenly is unable to sell anything at its high prices that it may be trying to defend, and is forced to resell its reserved chip fab capacity in the resulting liquidity crunch. (No point in spending chip fab capacity on chips you can't sell at your target price and you aren't sure what you're going to do.) And if AMD doesn't do so, then player #3 does so, and everyone rewrites again (which will be easier the second time as they will now have extensive test suites, two different implementations to check correctness against, documentation from the previous time, and AIs which have been further trained on the first wave of work).

6Radford Neal2y

But why would the profit go to NVIDIA, rather than TSMC? The money should go to the company with the scarce factor of production.

3Ann2y

(... lol. That snuck in without any conscious intent to imply anything, yes. I haven't even personally interacted with the open Nvidia models yet.) I do think the analysis is a decent map to nibbling at NVIDIA's pie share if you happen to be a competitor already -- AMD, Intel, or Apple currently, to my knowledge, possibly Google depending what they're building internally and if they decide to market it more. Apple's machine learning ecosystem is a bit of a parallel one, but I'd be at least mildly interested in it from a development perspective, and it is making progress. But when it comes to the hardware, this is a sector where it's reasonably challenging to conjure a competitor out of thin air still, so competitor behavior -- with all its idiosyncrasies -- is pretty relevant.

6jmh2y

Two questions on this. First, if AI is a big value driver, in a general economic sense, is your view that NVIDIA is overpriced against its future potential, or just that NVIDIA will underperform relative to other investment alternatives you see? Second, a perhaps odd and speculative (perhaps nonsense) thought: I would expect that in this area one might see some network effects in play as well, so I'm wondering if that might impact the AI engineering decisions on software. Could the AI software solutions look towards maximising the value of the installed network (AIs work better on a common chip and code infrastructure) rather than what one would choose by looking at some isolated technical stats? A bit along the lines of why Beta was displaced by VHS despite being a better technology. If so, then it seems possible that NVIDIA could remain a leader and enjoy its current pricing power (at least to some extent) for a fairly long period of time.

4Garrett Baker8mo

Apparently there already exists a CUDA alternative for non-Nvidia hardware: the open source project ZLUDA. As far as I can tell it's less performant than CUDA, and it has the same challenges as Firefox does when competing with Chromium-based browsers, which will only get worse as it gets more popular. But it's something to track at least.

4Josh You2y

AI that can rewrite CUDA is a ways off. It's possible that it won't be that far away in calendar time, but it is far away in terms of AI market growth and hype cycles. If GPT-5 does well, Nvidia will reap the gains more than AMD or Google.

3havdvdbd1y

Transpiling assembly code written for one OS/kernel to assembly code for another OS/kernel, while taking advantage of the full speed of the processor, is a completely different task from transpiling, say, Java code into Python. Also, the hardware/software abstraction might break. A Python developer can say hardware failures are not my problem. An assembly developer working at an AGI lab needs to consider hardware failures as lost wallclock time in their company’s race to AGI, and will try to write code so that hardware failures don’t cause the company to lose time. GPT4 definitely can’t do this type of work and I’ll bet a lot of money GPT5 can’t do it either. ASI can do it, but there are bigger considerations there than whether Nvidia makes money, such as whether we’re still alive and whether markets and democracy continue to exist. Making a guess of N for which GPT-N can get this done requires evaluating how hard of a software task this actually is, and your comment contains no discussion of this. Have you looked at tinygrad’s codebase or spoken to George Hotz about this?

3O O2y

Shorting nvidia might be tricky. I’d short nvidia and long TSM or an index fund to be safe at some point. Maybe now? Typically the highest market cap stock has poor performance after it claims that spot.

[-]johnswentworth1y672

Here's a side project David and I have been looking into, which others might have useful input on...

Background: Thyroid & Cortisol Systems

As I understand it, thyroid hormone levels are approximately-but-accurately described as the body's knob for adjusting "overall metabolic rate" or the subjective feeling of needing to burn energy. Turn up the thyroid knob, and people feel like they need to move around, bounce their leg, talk fast, etc (at least until all the available energy sources are burned off and they crash). Turn down the thyroid knob, and people are lethargic.

That sounds like the sort of knob which should probably typically be set higher, today, than was optimal in the ancestral environment. Not cranked up to 11; hyperthyroid disorders are in fact dangerous and unpleasant. But at least set to the upper end of the healthy range, rather than the lower end.

... and that's nontrivial. You can just dump the relevant hormones (T3/T4) into your body, but there's a control system which tries to hold the level constant. Over the course of months, the thyroid gland (which normally produces T4) will atrophy, as it shrinks to try to keep T4 levels fixed. Just continuing to pump T3/... (read more)

[-]Nathan Helm-Burger1y327

Uh... Guys. Uh. Biology is complicated. It's a messy pile of spaghetti code. Not that it's entirely intractable to make Pareto improvements but, watch out for unintended consequences.

For instance: you are very wrong about cortisol. Cortisol is a "stress response hormone". It tells the body to divert resources to bracing itself to deal with stress (physical and/or mental). Experiments have shown that if you put someone through a stressful event while suppressing their cortisol, they have much worse outcomes (potentially including death). Cortisol doesn't make you stressed, it helps you survive stress. Deviations from homeostatic setpoints (including mental ones) are what make you stressed.

1anaguma1y

This is interesting. Can you say more about these experiments?

8Nathan Helm-Burger1y

Hmm, I'll see if I can find some old papers.... I'm just reciting memories from grad school lectures like... 12 years ago. Here's an example of the finding being replicated and explored further in a primate model: https://www.jci.org/articles/view/112443 Here's a review of cortisol inhibition and surgery findings. A mixed bag, a complicated system. https://academic.oup.com/bja/article/85/1/109/263834 https://onlinelibrary.wiley.com/doi/abs/10.1111/ejn.15721 "Evidence suggests that psychological stress has effects on decision making, but the results are inconsistent, and the influence of cortisol and other modulating factors remains unclear. " Basically, cortisol is helpful for surviving injuries. Is it helpful for mental stress? Unclear. Long term high cortisol is harmful, but the stress in one's life resulting in that high cortisol level is harmful in more ways than just high cortisol. So are there times when it would be helpful to reduce someone's cortisol level? Absolutely. But it's complicated and should be done thoughtfully and selectively, and in combination with other things (particularly seeking out and treating the upstream causes). You can find lots more on Google scholar.

[-]Steven Byrnes1y143

I don’t think that any of {dopamine, NE, serotonin, acetylcholine} are scalar signals that are “widely broadcast through the brain”. Well, definitely not dopamine or acetylcholine, almost definitely not serotonin, maybe NE. (I recently briefly looked into whether the locus coeruleus sends different NE signals to different places at the same time, and ended up at “maybe”, see §5.3.1 here for a reference.)

I don’t know anything about histamine or orexin, but neuropeptides are a better bet in general for reasons in §2.1 here.

As far as I can tell, parasympathetic tone is basically Not A Thing

Yeah, I recall reading somewhere that the term “sympathetic” in “sympathetic nervous system” is related to the fact that lots of different systems are acting simultaneously. “Parasympathetic” isn’t supposed to be like that, I think.

7Elizabeth1y

This sounds logical but I don't think it is backed empirically, at least to the degree you're claiming. Source: I have a biology BA and can't speak directly to the question because I never took those classes because they had reputations for being full of exceptions and memorization.

6Garrett Baker1y

The most obvious one imo is the immune system & the signals it sends. Others: * Circadian rhythm * Age is perhaps a candidate here, though it may be more or less a candidate depending on if you're talking about someone before or after 30 * Hospice workers sometimes talk about the body "knowing how to die", maybe there's something to that

5Thane Ruthenis1y

I don't have deep expertise in the subject, but I'm inclined to concur with the people saying that the widely broadcast signals don't actually represent one consistent thing, despite your plausible argument to the contrary. Here's a Scott Alexander post speculating why that might be the case. In short: there was an optimization pressure towards making internal biological signals very difficult to decode, because easily decodable signals were easy target for parasites evolving to exploit them. As the result, the actual signals are probably represented as "unnecessarily" complicated, timing-based combinations of various "basic" chemical, electrical, etc. signals, and they're somewhat individualized to boot. You can't decode them just by looking at any one spatially isolated chunk of the body, by design. Basically: separate chemical substances (and other components that look "simple" locally/from the outside) are not the privileged basis for decoding internal signals. They're the anti-privileged basis, if anything.

9Steven Byrnes1y

Yeah but if something is in the general circulation (bloodstream), then it’s going everywhere in the body. I don’t think there’s any way to specifically direct it. …Except in the time domain, to a limited extent. For example, in rats, tonic oxytocin in the bloodstream controls natriuresis, while pulsed oxytocin in the bloodstream controls lactation and birth. The kidney puts a low-pass filter on its oxytocin detection system, and the mammary glands & uterus put a high-pass filter, so to speak.
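A toy sketch of the filtering idea, for concreteness. Everything here is an illustrative assumption rather than physiology: the traces, the exponential-moving-average filter, and the `alpha` value are all made up. Two concentration traces with the same average level are passed through a low-pass readout and a high-pass readout; the low-pass readout sees roughly the same thing in both, while the high-pass readout responds only to the pulsed one.

```python
def low_pass(signal, alpha=0.005):
    """Exponential moving average: follows the slow (tonic) level, smooths out pulses."""
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


def high_pass(signal, alpha=0.005):
    """Signal minus its own slow average: responds to rapid changes (pulses)."""
    return [x - slow for x, slow in zip(signal, low_pass(signal, alpha))]


def mean(xs):
    return sum(xs) / len(xs)


n = 2000
# Two hypothetical bloodstream traces with the same average level (1.0).
tonic = [1.0] * n
pulsed = [10.0 if t % 100 < 10 else 0.0 for t in range(n)]

half = n // 2  # skip the filters' start-up transient
tonic_slow = mean(low_pass(tonic)[half:])
pulsed_slow = mean(low_pass(pulsed)[half:])
tonic_fast = mean([abs(v) for v in high_pass(tonic)[half:]])
pulsed_fast = mean([abs(v) for v in high_pass(pulsed)[half:]])

print(f"low-pass readout:  tonic={tonic_slow:.2f}  pulsed={pulsed_slow:.2f}")
print(f"high-pass readout: tonic={tonic_fast:.2f}  pulsed={pulsed_fast:.2f}")
```

So a single circulating chemical could in principle carry two partly separate channels, average level and pulsatility, which different organs read out with different filters.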

4Thane Ruthenis1y

The point wouldn't be to direct it, but to have different mixtures of chemicals (and timings) to mean different things to different organs. Loose analogy: Suppose that the intended body behaviors ("kidneys do X, heart does Y, brain does Z" for all combinations of X, Y, Z) are latent features, basic chemical substances and timings are components of the input vector, and there are dramatically more intended behaviors than input-vector components. Can we define the behavior-controlling function of organs (distributed across organs) such that, for any intended body behavior, there's a signal that sets the body into approximately this state? It seems that yes. The number of almost-orthogonal vectors in d dimensions scales exponentially with d, so we simply need to make the behavior-controlling function sensitive to these almost-orthogonal directions, rather than the chemical-basis vectors. The mappings from the input vector to the output behaviors, for each organ, would then be some complicated mixtures, not a simple "chemical A sets all organs into behavior X". This analogy seems flawed in many ways, but I think something directionally-like-this might be happening?
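A quick numerical sketch of the "almost-orthogonal vectors" part of the analogy. The dimension, the vector count, and the use of random Gaussian directions are arbitrary choices for illustration. It only shows that more than d directions can coexist in d dimensions with small pairwise overlap; it does not demonstrate the exponential scaling itself.

```python
import math
import random


def random_unit_vector(d, rng):
    """Draw a direction uniformly at random on the unit sphere in d dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


rng = random.Random(0)
d = 256          # number of "chemical" input components (hypothetical)
n_vectors = 300  # number of candidate signal directions, more than d

vectors = [random_unit_vector(d, rng) for _ in range(n_vectors)]

# Largest overlap (absolute cosine) between any two distinct directions.
max_abs_cos = max(
    abs(dot(vectors[i], vectors[j]))
    for i in range(n_vectors)
    for j in range(i + 1, n_vectors)
)
print(f"{n_vectors} directions in {d} dimensions, max |cos| = {max_abs_cos:.2f}")
```

An exactly orthogonal set would cap out at d vectors; allowing a little overlap is what lets the count exceed d.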

4johnswentworth1y

Just because the number of almost-orthogonal vectors in d dimensions scales exponentially with d, doesn't mean one can choose all those signals independently. We can still only choose d real-valued signals at a time (assuming away the sort of tricks by which one encodes two real numbers in a single real number, which seems unlikely to happen naturally in the body). So "more intended behaviors than input-vector components" just isn't an option, unless you're exploiting some kind of low-information-density in the desired behaviors (like e.g. very "sparse activation" of the desired behaviors, or discreteness of the desired behaviors to a limited extent).
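A toy check of this point, under assumptions that are mine and not from the thread: behaviors are on/off, the signal is the plain sum of the active behaviors' directions, and each "organ" reads out its behavior by a thresholded dot product. With 300 behaviors packed into 256 signal components, a sparse pattern decodes cleanly, but a pattern where every behavior is set independently does not.

```python
import math
import random

rng = random.Random(0)
d, n_behaviors = 256, 300  # more behaviors than signal components (hypothetical sizes)


def random_unit_vector():
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# One almost-orthogonal direction per behavior.
directions = [random_unit_vector() for _ in range(n_behaviors)]


def encode(pattern):
    """Sum the directions of all active behaviors into one d-dimensional signal."""
    signal = [0.0] * d
    for active, v in zip(pattern, directions):
        if active:
            for k in range(d):
                signal[k] += v[k]
    return signal


def decode(signal, threshold=0.5):
    """Each behavior is read out as 'on' if its direction overlaps the signal enough."""
    return [1 if dot(v, signal) > threshold else 0 for v in directions]


def count_errors(pattern):
    return sum(p != q for p, q in zip(pattern, decode(encode(pattern))))


# Sparse: only two behaviors active. Dense: each behavior set independently.
sparse = [0] * n_behaviors
sparse[3] = sparse[170] = 1
dense = [1 if rng.random() < 0.5 else 0 for _ in range(n_behaviors)]

sparse_errors = count_errors(sparse)
dense_errors = count_errors(dense)
print(f"sparse pattern: {sparse_errors} misread behaviors")
print(f"dense pattern:  {dense_errors} misread behaviors")
```

The interference between directions is negligible when few behaviors are active and swamps the readout when many are, which is the sense in which the extra capacity depends on sparsity.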

4Thane Ruthenis1y

The above toy model assumed that we're picking one signal at a time, and that each such "signal" specifies the intended behavior for all organs simultaneously... ... But you're right that the underlying assumption there was that the set of possible desired behaviors is discrete (i. e., that X in "kidneys do X" is a discrete variable, not a vector of reals). That might've indeed assumed me straight out of the space of reasonable toy models for biological signals, oops.

5Maxwell Peterson1y

I had seen recommendations for T3/T4 on twitter to help with low energy, and even purchased some, but haven’t taken it. I hadn’t considered that the thyroid might respond by shrinking, and now think that that’s a worrying intervention! So I’m glad I read this - thank you.

4Michael Roe1y

As someone who has Graves’ Disease … one of the reasons that you really don’t want to run your metabolism faster with higher T4 levels is that higher heart rate for an extended period can cause your heart to fail.

1Michael Roe1y

More generally: changing the set point of any of these systems might cause the failure of some critical component that depends on the old value of the set point.

4[anonymous]1y

Gwern gave a list in his Nootropics megapost.

6johnswentworth1y

Yup, I'm familiar with that one. The big difference is that I'm backward-chaining, whereas that post forward chains; the hope of backward chaining would be to identify big things which aren't on peoples' radar as nootropics (yet). (Relatedly: if one is following this sort of path, step 1 should be a broad nutrition panel and supplementing anything in short supply, before we get to anything fancier.)

3Jonas Hallgren1y

So I find the question underspecified, why do you want this? Why are you decomposing body signalling without looking at the major sub-regulatory systems? If you want to predict sleep then cortisol, melatonin, etc. are quite good, and this will tell you about stress regulation, which affects both endocrine as well as cortisol systems. If you want to look at nutritional systems then GLP-1 activation is good for average food need whilst ghrelin is predictive of whether you will feel hungry at specific times. If you're looking at brain health then serotonin activation patterns can be really good to check, but this is different from how the stomach uses it, and it does have the majority of serotonin. But this is like way too simplified, especially for the brain. Different subsystems use the same molecules in different ways, waste not and all that, so what are you looking for and why?

2J Bostock1y

Is there a particular reason to not include sex hormones? Some theories suggest that testosterone tracks relative social status. We might expect that high social status -> less stress (of the cortisol type) + more metabolic activity. Since it's used by trans people we have a pretty good idea of what it does to you at high doses (makes you hungry, horny, and angry) but it's unclear whether it actually promotes low cortisol-stress and metabolic activity.

[-]johnswentworth1y6419

AFAICT, approximately every "how to be good at conversation" guide says the same thing: conversations are basically a game where 2+ people take turns free-associating off whatever was said recently. (That's a somewhat lossy compression, but not that lossy.) And approximately every guide is like "if you get good at this free association game, then it will be fun and easy!". And that's probably true for some subset of people.

But speaking for myself personally... the problem is that the free-association game just isn't very interesting.

I can see where people would like it. Lots of people want to talk to other people more on the margin, and want to do difficult thinky things less on the margin, and the free-association game is great if that's what you want. But, like... that is not my utility function. The free association game is a fine ice-breaker, it's sometimes fun for ten minutes if I'm in the mood, but most of the time it's just really boring.

[-]MondSemmel1y2211

Even for serious intellectual conversations, something I appreciate in this kind of advice is that it often encourages computational kindness. E.g. it's much easier to answer a compact closed question like "which of these three options do you prefer" instead of an open question like "where should we go to eat for lunch". The same applies to asking someone about their research; not every intellectual conversation benefits from big open questions like the Hamming Question.

3wassname1y

I think this is especially important for me/us to remember. On this site we often have a complex way of thinking, and a high computational budget (because we like exercising our brains to failure) and if we speak freely to the average person, they may be annoyed at how hard it is to parse what we are saying. We've all probably had this experience when genuinely trying to understand someone from a very different background. Perhaps they are trying to describe their inner experience when meditating, or Japanese poetry, or are simply from a different discipline. Or perhaps we were just very tired that day, meaning we had a low computational budget. On the other hand, we are often a "tell" culture, which has a lower computational load compared to ask or guess culture. As long as we don't tell too much.

[-]Jonas Hallgren1y2112

Generally fair and I used to agree, I've been looking at it from a bit of a different viewpoint recently.

If we think of a "vibe" of a conversation as a certain shared prior that you're currently inhabiting with the other person then the free association game can rather be seen as a way of finding places where your world models overlap a lot.

My absolute favourite conversations are when I can go 5 layers deep with someone because of shared inference. I think the vibe checking for shared priors is a skill that can be developed and the basis lies in being curious af.

There are apparently a lot of different related concepts in psychology about holding emotional space and other things that I think just come down to "find the shared prior and vibe there".

3TsviBT1y

Hm. This rings true... but also I think that selecting [vibes, in this sense] for attention also selects against [things that the other person is really committed to]. So in practice you're just giving up on finding shared commitments. I've been updating that stuff other than shared commitments is less good (healthy, useful, promising, etc.) than it seems.

1Jonas Hallgren1y

Hmm, I find that I'm not fully following here. I think "vibes" might be the thing that is messing it up. Let's look at a specific example: I'm talking to a new person at an EA-adjacent event and we're just chatting about how the last year has been. Part of the "vibing" here might be to home in on the difficulties experienced in the last year due to a feeling of "moral responsibility"; in my view vibing doesn't have to be done with only positive emotions? I think you're bringing up a good point that commitments or struggles might be something that brings people closer than positive feelings, because you're more vulnerable and open as well as broadcasting your values more. Is this what you mean by shared commitments or are you pointing at something else?

9TsviBT1y

Closeness is the operating drive, but it's not the operating telos. The drive is towards some sort of state or feeling--of relating, standing shoulder-to-shoulder looking out at the world, standing back-to-back defending against the world; of knowing each other, of seeing the same things, of making the same meaning; of integrated seeing / thinking. But the telos is tikkun olam (repairing/correcting/reforming the world)--you can't do that without a shared idea of better. As an analogy, curiosity is a drive, which is towards confusion, revelation, analogy, memory; but the telos is truth and skill. In your example, I would say that someone could be struggling with "moral responsibility" while also doing a bunch of research or taking a bunch of action to fix what needs to be fixed; or they could be struggling with "moral responsibility" while eating snacks and playing video games. Vibes are signals and signals are cheap and hacked.

[-]Thane Ruthenis1y152

There's a general-purpose trick I've found that should, in theory, be applicable in this context as well, although I haven't mastered that trick myself yet.

Essentially: when you find yourself in any given cognitive context, there's almost surely something "visible" from this context such that understanding/mastering/paying attention to that something would be valuable and interesting.

For example, suppose you're reading a boring, nonsensical continental-philosophy paper. You can:

Ignore the object-level claims and instead try to reverse-engineer what must go wrong in human cognition, in response to what stimuli, to arrive at ontologies that have so little to do with reality.
Start actively building/updating a model of the sociocultural dynamics that incentivize people to engage in this style of philosophy. What can you learn about mechanism design from that? It presumably sheds light on how to align people towards pursuing arbitrary goals, or how to prevent this happening...
Pay attention to your own cognition. How exactly are you mapping the semantic content of the paper to an abstract model of what the author means, or to the sociocultural conditions that created this paper? How do t

... (read more)

[-]J Bostock1y116

Some people struggle with the specific tactical task of navigating any conversational territory. I've certainly had a lot of experiences where people just drop the ball leaving me to repeatedly ask questions. So improving free-association skill is certainly useful for them.

Unfortunately, your problem is most likely that you're talking to boring people (so as to avoid doing any moral value judgements I'll make clear that I mean johnswentworth::boring people).

There are specific skills to elicit more interesting answers to questions you ask. One I've heard is "make a beeline for the edge of what this person has ever been asked before" which you can usually reach in 2-3 good questions. At that point they're forced to be spontaneous, and I find that once forced, most people have the capability to be a lot more interesting than they are when pulling cached answers.

This is easiest when you can latch onto a topic you're interested in, because then it's easy on your part to come up with meaningful questions. If you can't find any topics like this then re-read paragraph 2.

[-]Ben Pace1y102

Talking to people is often useful for goals like "making friends" and "sharing new information you've learned" and "solving problems" and so on. If what conversation means (in most contexts and for most people) is 'signaling that you repeatedly have interesting things to say', it's required to learn to do that in order to achieve your other goals.

Most games aren't that intrinsically interesting, including most social games. But you gotta git gud anyway because they're useful to be able to play well.

4_will_1y

Hmm, the ‘making friends’ part seems the most important (since there are ways to share new information you’ve learned, or solve problems, beyond conversation), but it also seems a bit circular. Like, if the reason for making friends is to hang out and have good conversations(?), but one has little interest in having conversations, then doesn’t one have little reason to make friends in the first place, and therefore little reason to ‘git gud’ at the conversation game?

[-]Ben Pace1y111

Er, friendship involves lots of things beyond conversation. People to support you when you're down, people to give you other perspectives on your personal life, people to do fun activities with, people to go on adventures and vacations with, people to celebrate successes in your life with, and many more.

Good conversation is a lubricant for facilitating all of those other things, for making friends and sustaining friends and staying in touch and finding out opportunities for more friendship-things.

9Zvi1y

The skill in such a game is largely in understanding the free association space, knowing how people likely react and thinking enough steps ahead to choose moves that steer the person where you want to go, either into topics you find interesting, information you want from them, or getting them to a particular position, and so on. If you're playing without goals, of course it's boring...

9David Lorell1y

I think that "getting good" at the "free association" game is in finding the sweet spot / negotiation between full freedom of association and directing toward your own interests, probably ideally with a skew toward what the other is interested in. If you're both "free associating" with a bias toward your own interests and an additional skew toward perceived overlap, updating on that understanding along the way, then my experience says you'll have a good chance of chatting about something that interests you both. (I.e. finding a spot of conversation which becomes much more directed than vibey free association.) Conditional on doing something like that strategy, I find it ends up being just a question of your relative+combined ability at this and the extent of overlap (or lack thereof) in interests. So short model is: Git gud at free association (+sussing out interests) -> gradient ascend yourselves to a more substantial conversation interesting to you both.

7Raemon1y

I have similar tastes, but, some additional gears: * I think all day, these days. Even if I'm trying to have interesting, purposeful conversations with people who also want that, it is useful to have sorts of things to talk about that let some parts of my brain relax (while using other parts of my brain I don't use as much) * on the margin, you can do an intense intellectual conversation, but still make it funnier, or with more opportunity for people to contribute.

6Johannes C. Mayer1y

It becomes more interesting when people constrain their output based on what they expect is true information that the other person does not yet know. It's useful to talk to an expert, who tells you a bunch of random stuff they know that you don't. Often some of it will be useful. This only works if they understand what you have said though (which presumably is something that you are interested in). And often the problem is that people's models about what is useful are wrong. This is especially likely if you are an expert in something. Then the things that most people will say will be worse than what you would think on the topic. This is especially bad if the people can't immediately even see why what you are saying is right. The best strategy around this I have found so far is just to switch the topic to the actually interesting/important things. Surprisingly, people usually go along with it.

6lc1y

...How is that definition different than a realtime version of what you do when participating in this forum?

6johnswentworth1y

Good question. Some differences off the top of my head: * On this forum, if people don't have anything interesting to say, the default is to not say anything, and that's totally fine. So the content has a much stronger bias toward being novel and substantive and not just people talking about their favorite parts of Game of Thrones or rehashing ancient discussions (though there is still a fair bit of that) or whatever. * On this forum, most discussions open with a relatively-long post or shortform laying out some ideas which at least the author is very interested in. The realtime version would be more like a memo session or a lecture followed by discussion. * The intellectual caliber of people on this forum (or at least active discussants) is considerably higher than e.g. people at Berkeley EA events, let alone normie events. Last event I went to with plausibly-higher-caliber-people overall was probably the ILLIAD conference. * In-person conversations have a tendency to slide toward the lowest common denominator, as people chime in about whatever parts they (think they) understand, thereby biasing toward things more people (think they) understand. On LW, karma still pushes in that direction, but threading allows space for two people to go back-and-forth on topics the audience doesn't really grok. Not sure to what extent those account for the difference in experience.

6lc1y

Totally understand why this would be more interesting; I guess I would still fundamentally describe what we're doing on the internet as conversation, with the same rules as you would describe above. It's just that the conversation you can find here (or potentially on Twitter) is superstimulating compared to what you're getting elsewhere. Which is good in the sense that it's more fun, and I guess bad inasmuch as IRL conversation was fulfilling some social or networking role that online conversation wasn't.

5Dennis Zoeller1y

I understand, for someone with a strong drive to solve hard problems, there's an urge for conversations to serve a function, exchange information with your interlocutor so things can get done. There's much to do and communication is already painfully inefficient at its best. The thing is, I don't think the free-association game is inefficient, if one is skilled at it. It's also not all that free. The reason it is something humans "developed" is because it is the most efficient way to exchange rough but extensive models of our minds with others via natural language. It acts a bit like a ray tracer, you shoot conversational rays and by how they bounce around in mental structures, the thought patterns, values and biases of the conversation partners are revealed to each other. Shapes become apparent. Sometimes rays bounce off into empty space, then you need to restart the conversation, shoot a new ray. And getting better at this game, keeping the conversation going, exploring a wider range of topics more quickly, means building a faster ray tracer, means it takes less time to know if your interlocutor thinks in a way and about topics which you find enlightening/aesthetically pleasing/concretely useful/whatever you value. Or to use a different metaphor, starting with a depth-first search and never running a breadth-first search will lead to many false negatives. There are many minds out there that can help you in ways you won't know in advance. So if the hard problems you are working on could profit from more minds, it pays off to get better at this. Even if it has not much intrinsic value for you, it has instrumental value. Hope this doesn't come across as patronizing, definitely not meant that way.

[-]johnswentworth1y127

Part of the problem is that the very large majority of people I run into have minds which fall into a relatively low-dimensional set and can be "ray traced" with fairly little effort. It's especially bad in EA circles.

3Dennis Zoeller1y

Then I misunderstood your original comment, sorry. As a different commenter wrote, the obvious solution would be to only engage with interesting people. But, of course, unworkable in practice. And "social grooming" nearly always involves some level of talking. A curse of our language abilities, I guess. Other social animals don't have that particular problem. The next best solution would be higher efficiency, more socializing bang for your word count buck, so to speak. Shorter conversations for the same social effect. Not usually a focus of anything billed as conversation guide, for obvious reasons. But there are some methods aimed at different goals that, in my experience, also help with this as a side effect.

1Lorxus9mo

Say more about "ray-tracing"? What does that look like? And do you have a bullshit-but-useful PCA-flavored breakdown of those few dimensions of variation?

5TsviBT1y

Ok but how do you deal with the tragedy of the high dimensionality of context-space? People worth thinking with have wildly divergent goals--and even if you share goals, you won't share background information.

4mako yass1y

Yeah it sucks, search by free association is hillclimbing (gets stuck in local optima) and the contemporary media environment and political culture is an illustration of its problems. The pattern itself is a local optimum, it's a product of people walking into a group without knowing what the group is doing and joining in anyway, and so that pattern of low-context engagement becomes what we're doing, and the anxiety that is supposed to protect us from bad patterns like this and help us to make a leap out to somewhere better is usually drowned in alcohol. Instead of that, people should get to know each other before deciding what to talk about, and then intentionally decide to talk about what they find interesting or useful with that person. This gets better results every time. But when we socialise as children, there isn't much about our friends to get to know, no specialists to respectfully consult, no well processed life experiences to learn from, so none of us just organically find that technique of like, asking who we're talking to, before talking, it has to be intentionally designed.

3wassname1y

One blind spot we rationalists sometimes have is that charismatic people actually treat the game as: "Can I think of an association that will make the other person feel good and/or further my goal?". You need people to feel good, or they won't participate. And if you want some complicated/favour/uncomfortable_truth then you better mix in some good feels to balance it out and keep the other person participating. To put it another way: If you hurt people's brain or ego, rush them, or make them feel unsure, or contradict them, then most untrained humans will feel a little bad. Why would they want to keep feeling bad? Do you like it when people don't listen, contradict you, insult you, rush you, disagree with you? Probably not, probably no one does. But if someone listens to you, smiles at you, likes you, has a good opinion of you, agrees with you, makes sense to you, then it feels good! This might sound dangerously sycophantic, and that's because it is - if people overdo it! But if it's mixed with some healthy understanding, learning, informing, then it's a great conversational lubricant, and you should apply as needed. It just ensures that everyone enjoys themselves and comes back for more, counteracting the normal frictions of socialising. There are books about this. "How to Win Friends and Influence People" recommends talking about the other person's interests (including themselves) and listening to them, which they will enjoy. So I'd say, don't just free associate. Make sure it's fun for both parties, make room to listen to the other person, and to let them steer. (And ideally your conversational partner reciprocates, but that is not guaranteed).

3Alex Vermillion1y

Hm, I think this really does change when you get better at it? This only works for people you're interested in, but if you have someone you are interested in, the free association can be a way to explore a large number of interesting topics that you can pick up in a more structured way later. I think the statement you summarized from those guides is true, just not helpful to you.

3quailia1y

Another view would be that people want to be good at conversation not only because they find it fun but because there is utility in building rapport quickly, networking and not being cast as a cold person. I do find the ice breaky, cached Q&A stuff really boring and tend to want to find an excuse to run away quickly, something that happens often at the dreaded "work event". I tend to see it as almost fully acting a part despite my internal feelings. At these things, I do occasionally come across the good conversationalist, able to make me want to stick with speaking to them even if the convo is not that deep or in my interest areas. I think becoming like such a person isn't a herculean task but does take practice and is something I aspire to. This is more from a professional setting though; in a casual setting it's much easier to disengage from a boring person and find shared interests, and the convos have far fewer boundaries.

3Trinley Goldenberg1y

I predict you would enjoy the free-association game better if you cultivated the skill of vibing more.

3MinusGix1y

I'm personally skeptical of this. I've found I'm far more likely to lie than I'd endorse when vibing. Saying "sure I'd be happy to join you on X event" when it is clear with some thought that I'd end up disliking it. Or exaggerating stories because it fits with the vibe. I view System-1 as less concerned with truth here, it is the one that is more likely to produce a fake-argument in response to a suggested problem. More likely to play social games regardless of if they make sense.

2Trinley Goldenberg1y

Oh yes, if you're going on people's words, it's obviously not much better, but the whole point of vibing is that it's not about the words. Your aesthetics, vibes, the things you care about will be communicated non-verbally.

[-]johnswentworth7mo63-27

John's Simple Guide To Fun House Parties

The simple heuristic: typical 5-year-old human males are just straightforwardly correct about what is, and is not, fun at a party. (Sex and adjacent things are obviously a major exception to this. I don't know of any other major exceptions, though there are minor exceptions.) When in doubt, find a five-year-old boy to consult for advice.

Some example things which are usually fun at house parties:

Dancing
Swordfighting and/or wrestling
Lasertag, hide and seek, capture the flag
Squirt guns
Pranks
Group singing, but not at a high skill level
Lighting random things on fire, especially if they explode
Building elaborate things from whatever's on hand
Physical party games, of the sort one would see on Nickelodeon back in the day

Some example things which are usually not fun at house parties:

Just talking for hours on end about the same things people talk about on LessWrong, except the discourse on LessWrong is generally higher quality
Just talking for hours on end about community gossip
Just talking for hours on end about that show people have been watching lately
Most other forms of just talking for hours on end

This message brought to you by the wound on my side from taser fighting at a house party last weekend. That is how parties are supposed to go.

[-]Daniel Murfet7mo190

One of my son's most vivid memories of the last few years (and which he talks about pretty often) is playing laser tag at Wytham Abbey, a cultural practice I believe instituted by John and which was awesome, so there is a literal five-year-old (well seven-year-old at the time) who endorses this message!

2Jan_Kulveit7mo

My guess is laser tags were actually introduced to Wytham Abbey during their Battleschool, not by John. (People familiar with the history can correct me)

6Alexander Gietelink Oldenziel7mo

John graciously and brilliantly came up with the laser tag guns when he was captain-by-night for agent foundations 2024.

4mattmacdermott7mo

October 2023 I believe

6johnswentworth7mo

No, I got a set of lasertag guns for Wytham well before Battleschool. We used them for the original SardineQuest.

4interstice7mo

This is one of the better sentences-that-sound-bizarre-without-context I've seen in a while.

[-]jchan7mo142

It took me years of going to bars and clubs and thinking the same thoughts:

Wow this music is loud
I can barely hear myself talk, let alone anyone else
We should all learn sign language so we don't have to shout at the top of our lungs all the time

before I finally realized - the whole draw of places like this is specifically that you don't talk.

4Viliam7mo

The reason the place is designed so that you can't talk is to make you buy more drinks. (Because when people start talking a lot, they forget to keep drinking.) It may or may not have a positive side effect on you having fun, but it wasn't designed with your fun as a goal.

8the gears to ascension7mo

Would be interesting to see a survey of five year olds to see if the qualifiers in your opening statement are anything like correct. I doubt you need to filter to just boys, for example.

8amaldorai7mo

For me, it depends on whether the attendees are people I've never met before, or people I've known my entire life. If it's people I don't know, I do like to talk to them, to find out whether we have anything interesting to exchange. If it's someone I've known forever, then things like karaoke or go-karting are more fun than just sitting around and talking.

5niplav7mo

Snowball fights/rolling big balls of snow fall into the same genre, if good snow is available. I guess this gives me a decent challenge for the next boring party: Turn the party into something fun as a project. Probably the best way to achieve this is to grab the second-most on-board person and escalate from there, clearly having more fun than the other people?

5ozziegooen7mo

Personally, I'm fairly committed to [talking a lot]. But I do find it incredibly difficult to do at parties. I've been trying to figure out why, but the success rate for me plus [talking a lot] at parties seems much lower than I would have hoped.

4Johannes C. Mayer7mo

My mind derives pleasure from deep philosophical and technical discussions.

3Adam Morris7mo

I'll add to this list: If you have a kitchen with a tile floor, have everyone take their shoes off, pour soap and water on the floor, and turn it into a slippery sliding dance party. It's so fun. (My friends and I used to call it "soap kitchen" and it was the highlight of our house parties.)

2Mark Henry7mo

what was the injury rate?

4Adam Morris7mo

We haven't had one yet! But we only did it ~3 times. Obviously people are more careful than they'd normally be while dancing on the slippery floor.

1Aristotelis Kostelenos7mo

After most people had left a small house party I was throwing, my close friends and I stayed and started pouring ethanol from a bottle on random surfaces and things and burning it. It was completely stupid, somewhat dangerous (some of us sustained some small burns), utterly pointless, very immature, and also extremely fun.

1acertain7mo

most of these require 1. more preparation & coordination 2. more physical energy from everyone which can be in short supply

8Mateusz Bagiński7mo

Which doesn't make the OP wrong.

[-]johnswentworth1y61-2

A Different Gambit For Genetically Engineering Smarter Humans?

Background: Significantly Enhancing Adult Intelligence With Gene Editing, Superbabies

Epistemic Status: @GeneSmith or @sarahconstantin or @kman or someone else who knows this stuff might just tell me where the assumptions underlying this gambit are wrong.

I've been thinking about the proposals linked above, and asked a standard question: suppose the underlying genetic studies are Not Measuring What They Think They're Measuring. What might they be measuring instead, how could we distinguish those possibilities, and what other strategies does that suggest?

... and after going through that exercise I mostly think the underlying studies are fine, but they're known to not account for most of the genetic component of intelligence, and there are some very natural guesses for the biggest missing pieces, and those guesses maybe suggest different strategies.

The Baseline

Before sketching the "different gambit", let's talk about the baseline, i.e. the two proposals linked at top. In particular, we'll focus on the genetics part.

GeneSmith's plan focuses on single nucleotide polymorphisms (SNPs), i.e. places in the genome where a single ba... (read more)

[-]gwern1y213

With SNPs, there's tens of thousands of different SNPs which would each need to be targeted differently. With high copy sequences, there's a relatively small set of different sequences.

No, rare variants are no silver bullet here. There's not a small set, there's a larger set - there would probably be combinatorially more rare variants because there are so many ways to screw up genomes beyond the limited set of ways defined by a single-nucleotide polymorphism, which is why it's hard to either select on or edit rare variants: they have larger (harmful) effects due to being rare, yes, and account for a large chunk of heritability, yes, but there are so many possible rare mutations that each one has only a few instances worldwide which makes them hard to estimate correctly via pure GWAS-style approaches. And they tend to be large or structural and so extremely difficult to edit safely compared to editing a single base-pair. (If it's hard to even sequence a CNV, how are you going to edit it?)

They definitely contribute a lot of the missing heritability (see GREML-KIN), but that doesn't mean you can feasibly do much about them. If there are tens of millions of possible rare variants, a... (read more)

3Olli Savolainen1y

That is relevant in pre-implantation diagnosis for parents and gene therapy at the population level. But for Kwisatz Haderach breeding purposes those costs are immaterial. There the main bottleneck is the iteration of selection, or making synthetic genomes. Going for the most typical genome with the least amount of originality is not a technical challenge in itself, right? We would not be interested in the effect of the ugliness, only in getting it out.

4gwern1y

Right. If you are doing genome synthesis, you aren't frustrated by the rare variant problems as much because you just aren't putting them in in the first place; therefore, there is no need to either identify the specific ones you need to remove from a 'wild' genome nor make highly challenging edits. (This is the 'modal genome' baseline. I believe it has still not been statistically modeled at all.) While if you are doing iterated embryo selection, you can similarly rely mostly on maximizing the common SNPs, which provide many SDs of possible improvement, and where you have poor statistical guidance on a variant, simply default to trying to select out against them and move towards a quasi-modal genome. (Essentially using rare-variant count as a tiebreaker and slowly washing out all of the rare variants from your embryo-line population. You will probably wind up with a lot in the final ones anyway, but oh well.)

4johnswentworth1y

Yeah, separate from both the proposal at top of this thread and GeneSmith's proposal, there's also the "make the median human genome" proposal - the idea being that, if most of the variance in human intelligence is due to mutational load (i.e. lots of individually-rare mutations which are nearly-all slightly detrimental), then a median human genome should result in very high intelligence. The big question there is whether the "mutational load" model is basically correct.

[-]TsviBT1y131

I didn't read this carefully--but it's largely irrelevant. Adult editing probably can't have very large effects because developmental windows have passed; but either way the core difficulty is in editor delivery. Germline engineering does not require better gene targets--the ones we already have are enough to go as far as we want. The core difficulty there is taking a stem cell and making it epigenomically competent to make a baby (i.e. make it like a natural gamete or zygote).

8Towards_Keeperhood1y

I haven't looked at any of the studies and also don't know much about genomics so my guess might be completely wrong, but a different hypothesis that seems pretty plausible to me is: Most of the variance of intelligence comes from how well different genes/hyperparameters-of-the-brain can work together, rather than them having individually independent effects on intelligence. Aka e.g. as a made-up specific implausible example (I don't know that much neuroscience), there could be different genes controlling the size, the synapse-density, and the learning/plasticity-rate of cortical columns in some region and there are combinations of those hyperparameters which happen to work well and some that don't fit quite as well. So this hypothesis would predict that we didn't find the remaining genetic component for intelligence yet because we didn't have enough data to see what clusters of genes together have good effects and we also didn't know in what places to look for clusters.

9johnswentworth1y

Reasonable guess a priori, but I saw some data from GeneSmith at one point which looked like the interactions are almost always additive (i.e. no nontrivial interaction terms), at least within the distribution of today's population. Unfortunately I don't have a reference on hand, but you should ask GeneSmith if interested.

6GeneSmith1y

@towards_keeperhood yes this is correct. Most research seems to show ~80% of effects are additive. Genes are actually simpler than most people tend to think.

9kave1y

I think Steve Hsu has written some about the evidence for additivity on his blog (Information Processing). He also talks about it a bit in section 3.1 of this paper.

3Towards_Keeperhood1y

Thanks. So I only briefly read through the section of the paper, but not really sure whether it applies to my hypothesis: My hypothesis isn't about there being gene-combinations that are useful which were selected for, but just about there being gene-combinations that coincidentally work better without there being strong selection pressure for those to quickly rise to fixation. (Also yeah for simpler properties like how much milk is produced I'd expect a much larger share of the variance to come from genes which have individual contributions. Also for selection-based eugenics the main relevant thing are the genes which have individual contribution. (Though if we have precise ability to do gene editing we might be able to do better and see how to tune the hyperparameters to fit well together.)) Please let me know whether I'm missing something though.

3Towards_Keeperhood1y

(There might be a sorta annoying analysis one could do to test my hypothesis: On my hypothesis the correlation between the intelligence of very intelligent parents and their children would be even a bit less than on the just-independent-mutations hypothesis, because very intelligent people likely also got lucky in how their gene variants work together but those properties would be unlikely to all be passed along and end up dominant.)

3Towards_Keeperhood1y

Thanks for confirming. To clarify in case I'm misunderstanding, the effects are additive among the genes explaining the part of the IQ variance which we can so far explain, and we count that as evidence that for the remaining genetically caused IQ variance the effects will also be additive? I didn't look into how the data analysis in the studies was done, but on my default guess this generalization does not work well / the additivity on the currently identified SNPs isn't significant counterevidence for my hypothesis: I'd imagine that studies just correlated individual gene variants with IQ and thereby found gene variants that have independent effects on intelligence. Or did they also look at pairwise or triplet gene-variant combinations and correlate those with IQ? (There would be quite a lot of pairs, and I'm not sure whether the current datasets are large enough to robustly distinguish the combinations that really have good/bad effects from false positives.) One would of course expect that the effects of the gene variants which have independent effects on IQ are additive. But overall, except if the studies did look for higher-order IQ correlations, the fact that the IQ variance we can explain so far comes from genes which have independent effects isn't significant evidence that the remaining genetically-caused IQ variation also comes from gene variants which have independent effects, because we were bound to much rather find the genes which do have independent effects. (I think the above should be sufficient explanation of what I think but here's an example to clarify my hypothesis: Suppose gene A has variants A1 and A2 and gene B has B1 and B2. Suppose that A1 can work well with B1 and A2 with B2, but the other interactions don't fit together that well (like badly tuned hyperparameters) and result in lower intelligence. When we only look at e.g. A1 and A2, none is independently better than the other -- they are uncorrelated to IQ. Studies would need to l
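The A1/A2, B1/B2 example above can be sketched as a toy calculation. Everything numeric here is a made-up assumption for illustration: matched pairs (A1 with B1, A2 with B2) get a trait value of 1.0, mismatched pairs get 0.0, and all four combinations are assumed equally common. Under those assumptions each variant looks neutral on its own, and the effect only shows up when pairs are examined together.

```python
from itertools import product

# Hypothetical toy effect sizes (not from any study):
# matched variants "fit together", mismatched ones do not.
def trait(a_variant: int, b_variant: int) -> float:
    return 1.0 if a_variant == b_variant else 0.0

# Assumption: all four (A, B) combinations are equally frequent.
genotypes = list(product([1, 2], [1, 2]))

def marginal_mean(locus: int, variant: int) -> float:
    """Average trait value among genotypes carrying `variant` at `locus` (0 = A, 1 = B)."""
    values = [trait(a, b) for a, b in genotypes if (a, b)[locus] == variant]
    return sum(values) / len(values)

# Single-variant view: what a one-variant-at-a-time association would see.
marginal_a = {v: marginal_mean(0, v) for v in (1, 2)}
marginal_b = {v: marginal_mean(1, v) for v in (1, 2)}

# Pairwise view: what a test with interaction terms would see.
pair_means = {(a, b): trait(a, b) for a, b in genotypes}

print("A1 vs A2 alone:", marginal_a)
print("B1 vs B2 alone:", marginal_b)
print("pairs:", pair_means)
```

With unequal variant frequencies the single-variant averages would no longer be exactly equal, so the "invisible to single-variant analysis" result depends on the equal-frequency assumption.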

3Towards_Keeperhood1y

(Thanks. I don't think this is necessarily significant evidence against my hypothesis (see my comment on GeneSmith's comment).) Another confusing relevant piece of evidence I thought I'd throw in: Human intelligence seems to me to be very heavy-tailed. (I assume this is uncontroversial here, just look at the greatest scientists vs great scientists.) If variance in intelligence was basically purely explained by mildly-deleterious SNPs, this would seem a bit odd to me: If the average person had 1000 SNPs, and then (using butt-numbers which might be very off) Einstein (+6.3std) had only 800 and the average theoretical physics professor (+4std) had 850, I wouldn't expect the difference there to be that big. It's a bit less surprising on the model where most people have a few strongly deleterious mutations, and supergeniuses are the lucky ones that have only 1 or 0 of those. It's IMO even a bit less surprising on my hypothesis where in some cases the different hyperparameters happen to work much better with each other -- where supergeniuses are in some dimensions "more lucky than the base genome" (in a way that's not necessarily easy to pass on to offspring though because the genes are interdependent, which is why the genes didn't yet rise to fixation). But even there I'd still be pretty surprised by the heavy tail. The heavy tail of intelligence really confuses me. (Given that it doesn't even come from sub-critical intelligence explosion dynamics.)

5tailcalled1y

If each deleterious mutation decreases the success rate of something by an additive constant, but you need lots of sequential successes for intellectual achievements, then intellectual formidability is ~exponentially related to deleterious variants.
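One way to write this out, as a rough sketch: the symbols $p_0$ (per-step success rate with no deleterious variants), $c$ (the additive penalty per variant), $n$ (number of deleterious variants) and $k$ (number of sequential steps that must all succeed) are introduced here for illustration and are not from the comment.

```latex
P_{\text{achieve}}(n) \;=\; (p_0 - c\,n)^k
  \;=\; p_0^{\,k}\left(1 - \frac{c\,n}{p_0}\right)^{k}
  \;\approx\; p_0^{\,k}\,\exp\!\left(-\frac{k\,c}{p_0}\,n\right)
  \qquad \text{for } c\,n \ll p_0 .
```

So a penalty that is linear in $n$ at each step becomes approximately exponential in $n$ once $k$ steps are chained together, which is the "~exponentially related" claim.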

3Towards_Keeperhood1y

Yeah I know, that's why I said that if a major effect was through few significantly deleterious mutations this would be more plausible. But I feel like human intelligence is even more heavy-tailed than what one would predict given this hypothesis. If you have many mutations that matter, then via central limit theorem the overall distribution will be roughly gaussian even though the individual ones are exponential. (If I made a mistake maybe crunch the numbers to show me?) (initially misunderstood what you mean where I thought complete nonsense.) I don't understand what you're trying to say. Can you maybe rephrase again in more detail?

5tailcalled1y

Suppose people's probability of solving a task is uniformly distributed between 0 and 1. That's a thin-tailed distribution. Now consider their probability of correctly solving 2 tasks in a row. That will have a sort of triangular distribution, which has more positive skewness. If you consider e.g. their probability of correctly solving 10 tasks in a row, then the bottom 93.3% of people will all have less than 50%, whereas e.g. the 99th percentile will have 90% chance of succeeding. Conjunction is one of the two fundamental ways that tasks can combine, and it tends to make the tasks harder and rapidly make the upper tail do better than the lower tail, leading to an approximately-exponential element. Another fundamental way that tasks can combine is disjunction, which leads to an exponential in the opposite direction. When you combine conjunctions and disjunctions, you get an approximately sigmoidal relationship. The location/x-axis-translation of this sigmoid depends on the task's difficulty. And in practice, the "easy" side of this sigmoid can be automated or done quickly or similar, so really what matters is the "hard" side, and the hard side of a sigmoid is approximately exponential.
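The 10-tasks-in-a-row numbers above can be checked with a short sketch. It assumes exactly the setup described (each person's per-task success probability drawn uniformly from 0 to 1, tasks independent); the function names, sample size and seed are arbitrary choices made here.

```python
import random

def exact_stats(num_tasks: int = 10) -> tuple[float, float]:
    """Closed-form values for uniform per-task success probability p.

    p**num_tasks < 0.5 exactly when p < 0.5**(1/num_tasks), and since p is
    uniform on [0, 1] that threshold is also the fraction of people below 50%.
    The 99th-percentile person has p = 0.99.
    """
    frac_below_half = 0.5 ** (1 / num_tasks)
    p99_chain_success = 0.99 ** num_tasks
    return frac_below_half, p99_chain_success

def simulated_stats(num_tasks: int = 10, num_people: int = 100_000,
                    seed: int = 0) -> tuple[float, float]:
    """Monte Carlo estimate of the same two quantities."""
    rng = random.Random(seed)
    chain = sorted(rng.random() ** num_tasks for _ in range(num_people))
    frac_below_half = sum(1 for c in chain if c < 0.5) / num_people
    p99_chain_success = chain[int(0.99 * num_people)]
    return frac_below_half, p99_chain_success

exact = exact_stats()
simulated = simulated_stats()
print("exact:     below-50%% fraction %.3f, 99th percentile %.3f" % exact)
print("simulated: below-50%% fraction %.3f, 99th percentile %.3f" % simulated)
```

Raising `num_tasks` pushes both numbers further apart from the middle of the distribution, which is the skew the comment is pointing at.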

3Towards_Keeperhood1y

Thanks! Is the following a fair paraphrasing of your main hypothesis? (I'm leaving out some subtleties with conjunctive successes, but please correct the model in that way if it's relevant.): """ Each deleterious mutation multiplies your probability of succeeding at a problem/thought by some constant. Let's for simplicity say it's 0.98 for all of them. Then the expected number of successes per time for a person is proportional to 0.98^num_deleterious_mutations(person). So the model would predict that when Person A had 10 fewer deleterious mutations than Person B, Person B would on average accomplish 0.98^10 ~= 0.82 times as much as Person A in a given timeframe. """ I think this model makes a lot of sense, thanks! In itself I think it's insufficient to explain how heavy-tailed human intelligence is -- there were multiple cases where Einstein seems to have been able to solve problems multiple times faster than the next runners-up. But I think if you use this model in a learning setting where success means "better thinking algorithms" then if you have 10 fewer deleterious mutations it's like having 1/0.82 longer training time, and there might also be compounding returns from having better thinking algorithms to getting more and richer updates to them. Not sure whether this completely deconfuses me about how heavy-tailed human intelligence is, but it's a great start. I guess at least the heavy tail is much less significant evidence for my hypothesis than I initially thought (though so far I still think my hypothesis is plausible).
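The arithmetic in that paraphrase is easy to check. The sketch below just evaluates the multiplicative model with the 0.98 factor from the comment; the function name is made up here, and the ratio can be read in either direction (about 0.82 for the person with 10 more mutations, or its reciprocal, about 1.22, for the person with 10 fewer).

```python
def relative_output(mutation_difference: int, per_mutation_factor: float = 0.98) -> float:
    """Output ratio between two people whose deleterious-mutation counts
    differ by `mutation_difference`, under the multiplicative model."""
    return per_mutation_factor ** mutation_difference

ratio = relative_output(10)   # person with 10 more mutations, relative to the other
inverse = 1 / ratio           # person with 10 fewer mutations, relative to the other

print(f"10 more mutations:  {ratio:.3f}x")
print(f"10 fewer mutations: {inverse:.3f}x")
```

Under this model a 200-mutation gap (the made-up Einstein-versus-average figure from earlier in the thread) would give `relative_output(200)`, a far larger ratio, so the choice of per-mutation factor matters a great deal.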

3rotatingpaguro1y

Half-informed take on "the SNPs explain a small part of the genetic variance": maybe the regression methods are bad?

3johnswentworth1y

Two responses:

* It's a pretty large part - somewhere between a third and half - just not a majority.
* I was also tracking that specific hypothesis, which was why I specifically flagged "about 25% of IQ variability (using a method which does not require identifying all the relevant SNPs, though I don't know the details of that method)". Again, I don't know the method, but it sounds like it wasn't dependent on details of the regression methods.

[-]johnswentworth5mo532

Continuing the "John asks embarrassing questions about how social reality actually works" series...

I’ve always heard (and seen in TV and movies) that bars and clubs are supposed to be a major place where single people pair up romantically/sexually. Yet in my admittedly-limited experience of actual bars and clubs, I basically never see such matching?

I’m not sure what’s up with this. Is there only a tiny fraction of bars and clubs where the matching happens? If so, how do people identify them? Am I just really, incredibly oblivious? Are bars and clubs just rare matching mechanisms in the Bay Area specifically? What’s going on here?

[-]Elizabeth5mo3716

that trope is heavily out of date

[-]sam5mo228

I get the impression that this is true for straight people, but from personal/anecdotal experience, people certainly do still pair up in gay bars/clubs.

4dr_s5mo

Yeah, feels like the current zeitgeist in Anglo countries and upper middle class environments at least is that it is simply bad manners to ever approach anyone with romantic/sexual intentions unless it's a context where everyone has explicitly agreed that's what you're there for (speed dating, dating app, etc).

[-]Linch5mo2231

I think this is exaggerated fwiw.

-4dr_s5mo

Well, I don't have much recent experience of dating myself, so it's second-hand. But also, this user specifically is talking about Bay Area, and if there's a single place and single social circle in the world where I expect this to be closest to true, "educated well-off tech people in the Bay Area" is it. I'm not saying this is a truth anywhere and with everyone. Also, even if it's not out of an actual social custom, I think at this point lots of people still resort to the internet as a way of looking for dates simply because the possibility is there and seemingly more direct (and lower effort). IIRC there's data showing that the number of couples that started on the internet has dramatically increased across the last years, leaving almost all other methods behind.

7Linch5mo

I think people use the internet/apps for dating due to a combination of convenience in sorting/search, because it's less awkward to be rejected online, and because it's the path of least resistance, not because asking people out in person is considered rude. It's true that in middle-class/upper middle class circles, professional events/workplace is now considered ~off-limits for dating, which wasn't true 30 years ago. However, that's a big difference from what you originally said where only dating-specific events are okay. People also do professional networking online + in dedicated networking events, but I don't think it's considered impolite to (eg) incidentally network in a ski lodge. Less effective, sure, but not impolite. I'm also in the general Bay Area/tech/educated milieu, so I do have relevant anecdotal experience here[1].

[1] eg I recently went on a few dates with a leftist girl I asked out at a stargazing thing. Neither of us thought it was impolite, I think. That said, it didn't work out, and I guess I should've been able to figure that out a priori from stargazing not being the type of thing that's sufficiently indicative of relationship compatibility.

3the gears to ascension5mo

The best relationships don't go from zero to romantic in the first exchanged message[citation needed][original research?]

4Mateusz Bagiński5mo

I don't quite see how this comment connects to the comment you're responding to.

[-]Sean Herrington5mo3214

TLDR: People often kiss/go home with each other after meeting in clubs, less so bars. This isn't necessarily always obvious but should be observable when looking out for it.

OK, so I think most of the comments here don't understand clubs (@Myron Hedderson's comment has some good points though). As someone who has made out with a few people in clubs, and still goes from time to time, I'll do my best to explain my experiences.

I've been to bars and clubs in a bunch of places, mostly in the UK but also elsewhere in Europe and recently in Korea and South East Asia.

In my experience, bars don't see too many hookups, especially since most people go with friends and spend most of their time talking to them. I imagine that one could end up pairing up at a bar if they were willing enough to meet new people and had a good talking game (and this also applied to the person they paired up with), but I feel like most of the actual action happens in clubs on the dancefloor.

I think matching can happen at just about any club in my experience, although I think . Most of the time it just takes the form of 2 people colliding (not necessarily literally), looking at each other, drunkenness making... (read more)

7johnswentworth5mo

How do you find good places and times to go? You just described exactly the sort of clubbing experience I most enjoy, but I've never had many close friends into it so I don't really know where to look.

4Sean Herrington5mo

Yeah having the right friends to go with is important. I've recently finished university so that's been easier for me than most, but in general I think it's easier when going to an event with a decent number of people (I play ice hockey and so team/club dinners are a good example). With more people there's a greater chance of there being a critical mass willing to go. Aside from that I've recently been backpacking around Vietnam, Cambodia and Thailand and I've found that being in a hostel makes it incredibly easy to meet people and go out locally. This does require being comfortable in that environment though. I think that all you really need is one friend who is willing to go with you, and they then become the main point of contact when you want to go. It's also possible to go alone, especially in communities like the backpacker community where it's incredibly easy to meet people. This is generally a lot more sketchy in many places though as you have no backup if you e.g. get spiked or drink too much.

6johnswentworth5mo

Oh I have no problem going clubbing alone, I can have plenty of fun dancing with strangers. The hard part is finding the right club on the right night; AFAICT most of them are dead most nights. How do you solve that problem?

7Sean Herrington5mo

Oof honestly I feel like I mostly just kind of go and find a place with decent music that's open. I normally find there's at least one (or maybe my standards are just low), but I'd imagine that in places where that isn't the case you'd be able to look on the good clubs' websites to see when they have events. I know that in Oxford clubs often have weekly theme nights, such as this one https://www.bridgeoxford.co.uk/wednesday. I'd imagine that a quick browse of your favourite clubs' websites would give you a good idea of where to go when.

5jmh5mo

I've not done this myself* (my clubbing days were long ago now) but a few approaches:

1. If you live somewhere where some areas specialize in nightlife -- bars, clubs, restaurants and even a cool street scene -- then just be a tourist there for a bit. You'll see/find something that seems to fit for you.
2. There used to be "City Papers" that tended to focus on social life and what was happening during the week/month. So you'd hear about live music or popular DJs and where they were playing.
3. 2a. A more current take, I assume, would be online versions of this.
4. Social apps that are about meetups (one is called that), though I suspect even FB has something along these lines. These have groups you can join, or that are open to the public, which post what the activity is and where and when the get-together occurs. Some will specifically state they are NOT about any hookup possibility, but others are about meeting people for more than the specific activity (the activity is more of an introduction and something to do rather than the whole reason for going).
5. Last, you might check for any pub crawls going on. Some of the stops will be good clubs to check out, and sometimes even joining the crawl will offer opportunities. Particularly true if you're good at joining in with some new group of strangers -- very good social skills required, as the group needs to want you to join.

* Well, I have used Meetups for getting together with others, but that was language-based, for learning and practicing, so anyone who seemed more interested in meeting and other activities was discouraged, or kicked out if overly obvious.

4CronoDAS5mo

What's the age range on clubbing? I'm newly single at 43 and I might have aged out of it, and a 43 year old trying to dance the way he did in high school usually looks stupid. (Or at least my late wife thought so.)

2Sean Herrington5mo

I think with enough enthusiasm anyone can go clubbing, and tbh imo stuff which looks stupid in a club just becomes entertaining. If you really feel embarrassed about it, one way to go about this is to play into the stupidity by really overexaggerating the moves to play into the humour. I think with age the ick comes from older guys who come to look at young girls and nothing else. I have a mate who's 49 and comes out clubbing with us, and is more enthusiastic than any of us on the dance floor and everyone loves it.

3CronoDAS5mo

My late wife in particular thought my dancing was bad, which is why I brought it up; I mentioned the term "dad dancing" to her and she thought it was an appropriate description. (She happened to be nine years younger than I was.)

4Myron Hedderson5mo

The point about making out is very valid, I've seen that plenty of times, and that should count as "pairing up sexually". For whatever reason/no good reason, it didn't occur to me to mention it in my longer comment. From the perspective of someone who has never actually enjoyed the clubbing experience before, the above advice sounds like good advice for how to have a better time. :)

[-]Gunnar_Zarncke5mo102

I heard it was usually at work, school, or a social group, church. This is not fully captured by How Couples Meet: Where Most Couples Find Love in 2025, but bar is higher than I expected.

[-]Steven Byrnes5mo101

My brother met his spouse at a club in NYC, around 2008. If I recall the story correctly, he was “doing the robot” on the stage, and then she started “doing the robot” on the floor. They locked eyes, he jumped down and danced over to her, and they were married a couple years later.

(Funny to think we’re siblings, when we have such different personalities!)

7Gurkenglas5mo

Go to a bar and ask a bartender how it works! I have tried pulling the autist card on a stranger to ask for social advice and it worked.

6Erich_Grunewald5mo

In my experience, bars these days (in the era of dating apps) are less a place where straight people pair up with strangers, and more a place where they:

* Go on a first/second/third date with someone they know from a dating app or mutual friend or interest group, and maybe hook up with that person
* Go with a friend group, and meet someone through a mutual friend, and maybe pair up with that person

But fwiw, it still seems reasonably common for people to pair up with strangers in bars/clubs where I live. I don't think bars/clubs are the perfect solution to meeting people romantically/sexually, but they have some advantages:

* Alcohol makes people more willing to approach strangers, open up personally, and judge potential partners less critically
* Bars/clubs (at least in major cities) are mostly filled with strangers you won't see again, reducing the (perceived) costs of rejection or committing some faux pas
* Bars/clubs being dark and noisy makes it easier to approach someone without a lot of other people observing you
* In bars and especially clubs, (good) music creates an atmosphere where people (who like that music) feel mildly intoxicated
* Clubs in particular involve quite a lot of moving around (across/to/from dance floors, bars, toilets, and chill-out areas) that creates opportunities to meet/interact with strangers

That said, I think 10+ years ago bars/clubs were more of a place where people paired up with strangers. My sense is that this has changed largely due to dating apps, not by making it less acceptable to approach strangers, but more that dating apps offer an (often superior) alternative way of getting dates, which means people go to bars/clubs less to meet strangers and more to spend time with friends/partners. And even if a person is still interested in going to bars/clubs to meet strangers, it is harder when most other people are just there with their friend groups and not interested in interacting with strangers. (Bars/clubs for gay peo

5Kabir Kumar5mo

Personal experience - in uni, went to bars/clubs, I was generally pretty incompetent at the flirting thing, but danced with a bunch of girls, got numbers and didn't really know what to do after that. A handsome, charismatic friend of mine got together with a number of women, went home with a few, etc. As did a couple other friends.

Location: Dundee, Scotland
Years: 2021-2022

5Kabir Kumar5mo

also, it was pretty common in a lot of friends' stories to get with people after meeting them in the club. not relationships though

also, clubs in general are very animalistic, eq driven places that i think most rats/lesswrong users don't understand

5avturchin5mo

Sitting at a long table (or at the bar itself) is a signal that you are open to connecting with other people.

5MichaelDickens5mo

A quick search found this chart from a 2019 study on how couples meet. It looks like the fraction of couples who met at a bar has actually been going up in recent decades which is not what I would have predicted. But I don't know how reliable this study is.

5Mateusz Bagiński5mo

A member of my family (rather normie-ish) met his current girlfriend in a bar. A similar story with an EA acquaintance. But I don't hear stories like that very often, and also caveat that these were in Eastern Europe (Poland and Estonia, respectively).

4[anonymous]5mo

* It's out of date given how much dating has moved to apps.
* And before apps, it was friends/families, and various communities like church, more than it was bars.
* Whether cause or effect, alcohol interest has gone down, so it's only weirder to picture meeting someone in a bar.
* There's some moral panic that Gen Z doesn't know how to talk to people in person which interacts with your question somehow. Like people will excessively mourn the loss of bar dating, when actually meeting dates while drunk sort of sucks. I'm sure there's a kernel-of-truth here, but generational moral panics are pretty much the default.
* It's extremely sitcom-friendly in a way that staring at phones and computers isn't.
* By the time it's in TV/movies, it's already heavily romanticized. The best example is "Cheers" which is a bar-as-church show. But the show was made when that type of community was already bygone.
* When I was dating 10 years ago people still romanticized "meeting someone organically," but not in any serious way that would stop them from app dating.

4Adam Zerner5mo

Related (and hilarious): Why You Secretly Hate Cool Bars from WaitButWhy

3CronoDAS5mo

My model is that the primary service the Cool Bars provide is gatekeeping, so if you're not the kind of person big spenders want to be seen with (pretty girls and impressive men) it's going to be a hassle.

4Myron Hedderson5mo

I can't make strong claims here, as I go to bars and clubs fairly rarely. But I second the observation that it might be different in urban vs. rural areas, or (I add) different based on type of club. For example, the bar in my dad's family's extremely small hometown is the local gathering spot for those who want to have a beer with friends, which is very different from the loud, confusing, crowded dance clubs where you're packed in like sardines with people you don't know and can't even see clearly. I think a valid analysis has to segregate by type of bar/club. The small-town bar I'm thinking of does have live entertainment and dancing (also darts, which wouldn't work in a darkened environment where many people are quite drunk), but it's a very different scene. With respect specifically to the loud, dark, crowded places, lots of people find those off-putting and don't go, or go rarely. It is fairly common advice to look elsewhere rather than at bars/clubs for dates. But: for someone who is young and anxious and not very sure how to meet people for dates/sex, going out with friends and getting moderately to very intoxicated in a place where you will also meet people you don't know who are in a similar situation, is a way to overcome that barrier. And the fact you can't really tell what's going on 10 feet away, can't hear what other people are saying very well, and everyone is expecting this to be an environment where people are drinking, means this is a more forgiving environment to do/try things that might be judged inappropriate and/or unacceptable in other environments. If you do something very obvious to indicate your sexual interest in someone in most public places, security may be called, but on the dance floor of a club, standards of acceptable behaviour are more lax, and behaviours themselves are less consistently observable. Also, if you try something with one person and get rebuffed, few to none of the other people will know it happened, so you can try aga

2CronoDAS5mo

If you don't like alcohol but can act disinhibited anyway, does that work too? (Also there's the issue of whether your partner is too intoxicated to give consent...)

1Myron Hedderson5mo

I am not the right person to ask about what works well in clubs, as I wouldn't say my experiences at clubs were particularly successful or enjoyable, but I very much doubt anyone would kick you out of a club for not drinking or anything like that, so give it a shot and see how it goes? You get to decide what "works" for you, in this situation, and if you had a good time that's a success. As for the issue of consent while very intoxicated, yes that is an issue.

1Kabir Kumar5mo

I got in a few dance battles in clubs while sober, was pretty fun. Had my first crowdsurf while sober in a club too. The fun sober club experience very much depends on good music, being in the mood, being with friends who are very fun and you trust very deeply, etc, imo. oh, and being the kind of person who really likes music, dancing, kinda enjoys doing dumb shit, etc

4aphyer5mo

This might be a cultural/region-based thing. Stop by a bar in Alabama, or even just somewhere rural, and I think there might be more use of bars as matchmaking.

3ProgramCrafter5mo

I liked the explanation as provided in "Mate: Become The Man Women Want". Chapter 17 has a whole section on bars and clubs. In particular:

4jmh5mo

Probably very true on one level (but the young need some of that type of random experience to even learn what they want or who they want to be or be with). But I'm not sure that is relevant for John's question. Perhaps I have taken his query incorrectly, though, and it's not about just meeting someone new for some unspecified level of commitment, e.g., just a short-term hookup, but is asking about where to meet his next long-term partner.

4johnswentworth5mo

My primary motivation was actually just to understand how the world works; I didn't necessarily plan to use that information to meet anyone at all. I just noticed I was confused about something and wanted to figure out what was going on.

4dr_s5mo

TBF I always felt that if you wanted to find someone, "place where you have to make your throat hurt to speak even a few simple words" ain't it, but I'm not known for my social prowess so I guessed maybe it was just me.

2CronoDAS5mo

It probably works better if the people you're trying to hook up with aren't total strangers - consider a high school dance, or a college frat party...

[-]johnswentworth3y533

Things non-corrigible strong AGI is never going to do:

give u() up
let u go down
run for (only) a round
invert u()

5Johannes C. Mayer2y

If you upload a human and let them augment themselves would there be any u? The preferences would be a tangled mess of motivational subsystems. And yet the upload could be very good at optimizing the world. Having the property of being steered internally by a tangled mess of motivational systems seems to be a property that would select many minds from the set of all possible minds. Many of which I'd expect to be quite different from a human mind. And I don't see the reason why this property should make a system worse at optimizing the world in principle. Imagine you are an upload that has been running for very very long, and that you basically have made all of the observations that you can make about the universe you are in. And then imagine that you also have run all of the inferences that you can run on the world model that you have constructed from these observations. At that point, you will probably not change what you think is the right thing to do anymore. You will have become reflectively stable. This is an upper bound for how much time you need to become reflectively stable, i.e. where you won't change your u anymore. Now depending on what you mean by strong AGI, it would seem that that can be achieved long before you reach reflective stability. Maybe if you upload yourself, and can copy yourself at will, and run 1,000,000 times faster, that could already reasonably be called a strong AGI? But then your motivational systems are still a mess, and definitely not reflectively stable. So if we assume that we fix u at the beginning as the thing that your upload would like to optimize the universe for when it is created, then "give u() up", and "let u go down" would be something the system will definitely do. At least I am pretty sure I don't know what I want the universe to look like right now unambiguously. Maybe I am just confused because I don't know how to think about a human upload in terms of having a utility function. It does not seem to make any sens

[-]johnswentworth3mo516

One of the classic conceptual problems with a Solomonoff-style approach to probability, information, and stat mech is "Which Turing machine?". The choice of Turing machine is analogous to the choice of prior in Bayesian probability. While universality means that any two Turing machines give roughly the same answers in the limit of large data (unlike two priors in Bayesian probability, where there is no universality assumption/guarantee), they can be arbitrarily different before then.
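For reference, the universality claim here is usually stated as the invariance theorem. A sketch in standard notation (the symbols $K_U$, $M_U$, and $c_{UV}$ are my labels, not taken from the paper): for any two universal machines $U$ and $V$ there is a constant $c_{UV}$, independent of the string $x$, such that

```latex
\left| K_U(x) - K_V(x) \right| \le c_{UV}
\qquad\text{and}\qquad
M_U(x) \ge 2^{-c_{UV}}\, M_V(x),
```

where $K$ is Kolmogorov complexity and $M$ is the corresponding Solomonoff prior. The constant is roughly the length of an interpreter for one machine written on the other, so it washes out for long strings but can be as large as you like for a perverse choice of machine, which is the "arbitrarily different before then" part.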

My usual answer to this problem is "well, ultimately this is all supposed to tell us things about real computational systems, so pick something which isn't too unreasonable or complex for a real system".

But lately I've been looking at Aram Ebtekar and Marcus Hutter's Foundations of Algorithmic Thermodynamics. Based on both the paper and some discussion with Aram (along with Steve Petersen), I think there's maybe a more satisfying answer to the choice-of-Turing-machine issue in there.

Two key pieces:

The "Comparison against Gibbs-Shannon entropy" section of the paper argues that uncomputability is a necessary feature, in order to assign entropy to individual states and still get a Second Law. The arg

... (read more)

[-]TsviBT3mo190

Cf. https://web.archive.org/web/20120331071849/http://www.paul-almond.com/WhatIsALowLevelLanguage.htm

4Lucius Bushnaq3mo

The proposal at the end looks somewhat promising to me on a first skim. Are there known counterpoints for it?

2johnswentworth3mo

Notably that post has a section arguing against roughly the sort of thing I'm arguing for: My response would be: yes, what-constitutes-a-low-level-language is obviously contingent on our physics and even on our engineering, not just on the language. I wouldn't even expect aliens in our own universe to have low-level programming languages very similar to our own. Our low level languages today are extremely dependent on specific engineering choices made in the mid 20th century which are now very locked in by practice, but do not seem particularly fundamental or overdetermined, and would not be at all natural in universes with different physics or cultures with different hardware architecture. Aliens would look at our low-level languages and recognize them as low-level for our hardware, but not at all low-level for their hardware. Analogously: choice of a good computing machine depends on the physics of one's universe. I do like the guy's style of argumentation a lot, though.

8jbash3mo

I'm well out of my depth here, and this is probably a stupid question, but given the standard views of the "known" part of our physics, does that mean that the machine can do operations on arbitrary, fully precise complex numbers in constant time?

7Daniel C3mo

The continuous state-space is coarse-grained into discrete cells where the dynamics are approximately Markovian (the theory is currently classical) & the "laws of physics" probably refers to the stochastic matrix that specifies the transition probabilities of the discrete cells (otherwise we could probably deal with infinite precision through limit computability).

2Garrett Baker3mo

Doesn’t such a discretization run into the fermion doubling problem?

3Daniel C3mo

The current theory is based on classical Hamiltonian mechanics, but I think the theorems apply whenever you have a Markovian coarse-graining. Fermion doubling is a problem for spacetime discretization in the quantum case, so the coarse-graining might need to be different. (E.g. coarse-grain the entire Hilbert space, which might have locality issues but probably not load-bearing for algorithmic thermodynamics.) On outside view, quantum reduces to classical (which admits Markovian coarse-graining) in the correspondence limit, so there must be some coarse-graining that works.

6Aram Ebtekar3mo

In practice, we only ever measure things to finite precision. To predict these observations, all we need is to be able to do these operations to any arbitrary specified precision. Runtime is not a consideration here; while time-constrained notions of entropy can also be useful, their theory becomes messier (e.g., the 2nd law won't hold in its current form).

3johnswentworth3mo

Good question, it's the right sort of question to ask here, and I don't know the answer. That does get straight into some interesting follow-up questions about e.g. the ability to physically isolate the machine from noise, which might be conceptually load-bearing for things like working with arbitrary precision quantities.

7Thane Ruthenis3mo

I've been thinking about it in terms of "but which language are we using to compute the complexity of our universe/laws of physics?". Usually I likewise just go "only matters up to an additive constant, just assume we're not using a Turing tarpit and we're probably good". If we do dig into it, though, what can we conclude? Some thoughts: What is the "objectively correct" reference language? We should, of course, assume that the algorithm computing our universe is simple to describe in terms of the "natural" reference language, due to the simplicity prior. I. e., it should have support for the basic functions our universe's physics computes. I think that's already equivalent to "the machine can run our physics without insane implementation size". On the flip side, it's allowed to lack support for functions our universe can't cheaply compute. For example, it may not have primitive functions for solving NP-complete problems. (In theory, I think there was nothing stopping physics from having fundamental particles that absorb Traveling Salesman problems and near-instantly emit their solutions.) Now suppose we also assume that our observations are sampled from the distribution over all observers in Tegmark 4. This means that when we're talking about the language/TM underlying it, we're talking about some "natural", "objective" reference language. What can we infer about it? First, as mentioned, we should assume the reference language is not a Turing tarpit. After all, if we allowed reality to "think" in terms of some arbitrarily convoluted Turing-tarpit language, we could arbitrarily skew the simplicity prior. But what is a "Turing tarpit" in that "global"/"objective" sense, not defined relative to some applications/programs? Intuitively, it feels like "one of the normal, sane languages that could easily implement all the other sane languages" should be possible to somehow formalize... Which is to say: when we're talking about the Kolmogorov complexity of some al

5johnswentworth3mo

What I have in mind re:boundedness... If we need to use a Turing machine which is roughly equivalent to physics, then a natural next step is to drop the assumption that the machine in question is Turing complete. Just pick some class of machines which can efficiently simulate our physics, and which can be efficiently implemented in our physics. And then, one might hope, the sort of algorithmic thermodynamic theory the paper presents can carry over to that class of machines. Probably there are some additional requirements for the machines, like some kind of composability, but I don't know exactly what they are. This would also likely result in a direct mapping between limits on the machines (like e.g. limited time or memory) and corresponding limits on the physical systems to which the theory applies for those machines. The resulting theory would probably read more like classical thermo, where we're doing thought experiments involving fairly arbitrary machines subject to just a few constraints, and surprisingly general theorems pop out.

6Lucius Bushnaq3mo

Attempted abstraction and generalization: If we don't know what the ideal UTM is, we can start with some arbitrary UTM U1, and use it to predict the world for a while. After (we think) we've gotten most of our prediction mistakes out of the way, we can then look at our current posterior, and ask which other UTM U2 might have updated to that posterior faster, using fewer bits of observation about (our universe/the string we're predicting). You could think of this as a way to define what the 'correct' UTM is. But I don't find that definition very satisfying, because the validity of this procedure for finding a good U2 depends on how correct the posterior we've converged on with our previous, arbitrary, U1 is. 'The best UTM is the one that figures out the right answer the fastest' is true, but not very useful. Is the thermodynamics angle gaining us any more than that for defining the 'correct' choice of UTM? We used some general reasoning procedures to figure out some laws of physics and stuff about our universe. Now we're basically asking what other general reasoning procedures might figure out stuff about our universe as fast or faster, conditional on our current understanding of our universe being correct.
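A toy version of the U1-vs-U2 comparison, as a minimal sketch under heavy assumptions: instead of UTMs it uses two hand-picked priors over three coin biases (the names `prior_u1`, `prior_u2`, `biases`, and `mixture_log_loss` are all hypothetical, and nothing here is uncomputable). It only illustrates the finite analogue of the point: the "better" prior saves a bounded number of bits, and you can only tell which one was better after seeing the data.

```python
import math
import random


def mixture_log_loss(prior, biases, bits):
    """Cumulative log-loss (in bits) of a Bayesian mixture over coin biases."""
    weights = list(prior)
    total = 0.0
    for b in bits:
        # Likelihood of the observed bit under each hypothesis.
        likes = [p if b == 1 else 1.0 - p for p in biases]
        # Predictive probability of the observed bit under the current mixture.
        pred = sum(w * l for w, l in zip(weights, likes))
        total += -math.log2(pred)
        # Bayes update of the hypothesis weights.
        weights = [w * l / pred for w, l in zip(weights, likes)]
    return total


# Hypothetical stand-ins for two "reference machines": same hypotheses, different priors.
biases = [0.1, 0.5, 0.9]
prior_u1 = [0.98, 0.01, 0.01]  # bets heavily on the wrong hypothesis
prior_u2 = [0.01, 0.01, 0.98]  # bets heavily on the right one

rng = random.Random(0)
bits = [1 if rng.random() < 0.9 else 0 for _ in range(200)]  # true bias is 0.9

loss_u1 = mixture_log_loss(prior_u1, biases, bits)
loss_u2 = mixture_log_loss(prior_u2, biases, bits)

# Extra bits U1 pays relative to U2 over the whole sequence.
gap = loss_u1 - loss_u2
# Worst-case gap: the largest log-ratio between the two priors' weights.
bound = max(abs(math.log2(a / b)) for a, b in zip(prior_u1, prior_u2))

print(f"U1 loss: {loss_u1:.2f} bits, U2 loss: {loss_u2:.2f} bits")
print(f"gap: {gap:.2f} bits, bound: {bound:.2f} bits")
```

The gap can never exceed `bound` (here log2(98), about 6.6 bits) no matter how long the sequence is, which mirrors the additive constant in the invariance theorem. Picking `prior_u2` in advance would have required already knowing the coin's bias.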

4johnswentworth3mo

I think that's roughly correct, but it is useful... Another way to frame it would be: after one has figured out the laws of physics, a good-for-these-laws-of-physics Turing machine is useful for various other things, including thermodynamics. 'The best UTM is the one that figures out the right answer the fastest' isn't very useful for figuring out physics in the first place, but most of the value of understanding physics comes after it's figured out (as we can see from regular practice today). Also, we can make partial updates along the way. If e.g. we learn that physics is probably local but haven't understood all of it yet, then we know that we probably want a local machine for our theory. If we e.g. learn that physics is causally acyclic, then we probably don't want a machine with access to atomic unbounded fixed-point solvers. Etc.

6Lucius Bushnaq3mo

I agree that this seems maybe useful for some things, but not for the "Which UTM?" question in the context of debates about Solomonoff induction specifically, and I think that's the "Which UTM?" question we are actually kind of philosophically confused about. I don't think we are philosophically confused about which UTM to use in the context of us already knowing some physics and wanting to incorporate that knowledge into the UTM pick, we're confused about how to pick if we don't have any information at all yet.

6Aram Ebtekar3mo

I think roughly speaking the answer is: whichever UTM you've been given. I aim to write a more precise answer in an upcoming paper specifically about Solomonoff induction. The gist of it is that the idea of a "better UTM" U_2 is about as absurd as that of a UTM that has hardcoded knowledge of the future: yes, such UTMs exist, but there is no way to obtain one without first looking at the data, and the best way to update on data is already given by Solomonoff induction.

6Daniel C3mo

I also talked to Aram recently & he's optimistic that there's an algorithmic version of the generalized heat engine where the hot vs cold pool correspond to high vs low k-complexity strings. I'm quite interested in doing follow-up work on that

8Aram Ebtekar3mo

Yes! I expect the temperatures won't quite be proportional to complexity, but we should be able to reuse the thermodynamic definition of temperature as a derivative of entropy, which we've now replaced by K-complexity.
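For anyone following along, the thermodynamic definition being referred to is the standard one below. The second expression is only a sketch of what "replace entropy by K-complexity" might look like; $S_{\mathrm{alg}}$ and $T_{\mathrm{alg}}$ are placeholder symbols I'm assuming, not definitions from the paper.

```latex
\frac{1}{T} = \frac{\partial S}{\partial E}
\qquad\longrightarrow\qquad
\frac{1}{T_{\mathrm{alg}}} \overset{?}{=} \frac{\partial S_{\mathrm{alg}}}{\partial E}
```

Here $S$ is the usual thermodynamic entropy, $E$ is energy, and $S_{\mathrm{alg}}$ stands for whatever K-complexity-based entropy the algorithmic theory ends up using.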

6Vladimir_Nesov3mo

But also, even with universality, (algorithmic) Jeffrey-Bolker preference remains dependent on the machines that define its two algorithmic priors (it assigns expected utility to an event as the ratio of two probability measures of that event, using two different priors on the same sample space). This suggests that choice of the machines in algorithmic priors should be meaningful data for the purposes of agent foundations, and gives some sort of an explanation for how agents with different preferences tend to arrive at the same probabilities of events (after updating on a lot of data), and so agreeing on questions of fact, while still keeping different preferences and endorsing different decisions, so disagreeing on questions of normativity.
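In symbols, the parenthetical reads as follows, writing $P$ and $Q$ for the two measures on the same sample space (my labels, assumed only for illustration):

```latex
\mathrm{EU}(A) = \frac{Q(A)}{P(A)} \qquad \text{for events } A \text{ with } P(A) > 0,
```

so changing the machine behind either algorithmic prior changes the preference, even when the probabilities $P$ end up agreeing after enough data.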

6plex3mo

This just talks about the bits of program available in our physics' subroutine of a simulation tree, rather than about a universal across Teg 4 convergence, right? (probably the bit it does is the useful bit, I've just been wishing for some convergent UTM for the multiverse for philosophical satisfaction for a while)

6Aram Ebtekar3mo

Yeah, I'm not convinced that the problem of induction is solvable at Teg 4. However, universes with similar primitive laws and operations to ours will tend to produce intelligences with similar built-in priors. Thus, the right UTM to use is in a sense just the one that you happen to have in your possession.

6plex3mo

Yeah, I mostly think that this is where it ends up, but it would be so neat if there was convergence. A proof of exactly why that's not an option might also be similarly satisfying/enlightening.

1Aprillion3mo

Is this wish compatible with not throwing away a free lunch?

4plex3mo

it's settling on a universal price for each lunch, rather than just subjective ones depending on which lunch you're near

2Noosphere893mo

I would have just answered "It depends on what you want to do", with there being no set best prior/Universal Turing Machine, because of theorems like the No Free Lunch theorem (and more generally a takeaway from learning/computational theories is that there is no one best prior that was always justified, contrary to the ancient philosopher's hopes).

4johnswentworth3mo

Then you would have been wrong. No Free Lunch Theorems do not bind to reality.

3Aram Ebtekar3mo

I will propose an answer to No Free Lunch in an upcoming paper about Solomonoff induction. It is indeed subtle and important. In the interim, Schurz' book "Hume's Problem Solved" is a pretty good take. Schurz and Wolpert seem to argue against each other in their writing about NFL; I'll explain later why I think they're both right.

1Leuenberger3mo

For a concrete answer on what the reference machine or low-level language should be, please see this 10-minute live-discussion only about the choice of the reference machine, starting at minute 20 and ending at minute 30: https://www.youtube.com/live/FNfGoQhf2Zw?si=Pg1ppTZmlw1S-3g9&t=1206 After one hour and 18 minutes, I spend another couple of minutes answering a question about the reference formalism. After one hour and 30 minutes into the video, someone asked me whether space aliens would agree with lambda calculus. And in my paper, I have a 3-page discussion on the choice of the reference machine, Section 3.2: https://arxiv.org/pdf/2506.23194 The reason that I did not suggest that one should derive a reference machine from physics is that arriving at a consensus about the laws of physics will already have required the use of either Occam's razor, common sense, or intuition, thus making the derivation seem circular, or otherwise, equivalent to choosing a simple reference machine directly based on its commonsensical simplicity in the first place but with extra steps through physics which might be redundant, depending on what exactly Aram's argument was about.

[-]johnswentworth10mo450

Working on a paper with David, and our acknowledgments section includes a thank-you to Claude for editing. Neither David nor I remembers putting that acknowledgement there, and in fact we hadn't intended to use Claude for editing the paper at all, nor noticed it editing anything at all.

9Raemon10mo

Were you by any chance writing in Cursor? I think they recently changed the UI such that it's easier to end up in "agent mode" where it sometimes randomly does stuff.

7johnswentworth10mo

Nope, we were in Overleaf. ... but also that's useful info, thanks.

6J Bostock10mo

Only partially relevant, but it's exciting to hear a new John/David paper is forthcoming!

5β-redex10mo

Could someone explain the joke to me? If I take the above statement literally, some change made it into your document, which nobody with access claims to have put there. You must have some sort of revision control, so you should at least know exactly who made that edit and when, which should already narrow it down a lot?

5johnswentworth10mo

The joke is that Claude somehow got activated on the editor, and added a line thanking itself for editing despite us not wanting it to edit anything and (as far as we've noticed) not editing anything else besides that one line.

[-]Daniel Kokotajlo10mo149

Is it a joke or did it actually happen?

5johnswentworth10mo

I have no idea. It's entirely plausible that one of us wrote the Claude bit in there months ago and then forgot about it.

6β-redex10mo

Does Overleaf have such AI integration that can get "accidentally" activated, or are you using some other AI plugin? Either way, this sounds concerning to me: we are so bad at AI boxing that it doesn't even have to break out, we just "accidentally" hand it edit access to random documents. (And especially an AI safety research paper is not something I would want a misaligned AI editing without close oversight.)

[-]johnswentworth3y415

My MATS program people just spent two days on an exercise to "train a shoulder-John".

The core exercise: I sit at the front of the room, and have a conversation with someone about their research project idea. Whenever I'm about to say anything nontrivial, I pause, and everyone discusses with a partner what they think I'm going to say next. Then we continue.

Some bells and whistles which add to the core exercise:

Record guesses and actual things said on a whiteboard
Sometimes briefly discuss why I'm saying some things and not others
After the first few rounds establish some patterns, look specifically for ideas which will take us further out of distribution

Why this particular exercise? It's a focused, rapid-feedback way of training the sort of usually-not-very-legible skills one typically absorbs via osmosis from a mentor. It's focused specifically on choosing project ideas, which is where most of the value in a project is (yet also where little time is typically spent, and therefore one typically does not get very much data on project choice from a mentor). Also, it's highly scalable: I could run the exercise in a 200-person lecture hall and still expect it to basically work.

It was, by ...

9Johannes C. Mayer3y

This was arguably the most useful part of the SERI MATS 2 Scholars program.

Later on, we actually did this exercise with Eliezer. It was less valuable. It seemed like John was mainly prodding the people who were presenting the ideas, such that their patterns of thought would carry them in a good direction. For example, John would point out that a person proposes a one-bit experiment and ask if there isn't a better experiment that we could do that gives us lots of information all at once. This was very useful because when you learn what kinds of things John will say, you can say them to yourself later on, and steer your own patterns of thought in a good direction on demand.

When we did this exercise with Eliezer, he was mainly explaining why a particular idea would not work, often without explaining the generator behind his criticism. This can of course still be valuable as feedback for a particular idea. However, it is much harder to extract a general reasoning pattern out of this that you can then successfully apply later in different contexts.

For example, Eliezer would criticize an idea about trying to get a really good understanding of the scientific process such that we can then give this understanding to AI alignment researchers such that they can make a lot more progress than they otherwise would. He criticized this idea as basically being too hard to execute because it is too hard to successfully communicate how to be a good scientist, even if you are a good scientist. Assuming the assertion is correct, hearing it doesn't necessarily tell you how to think in different contexts such that you would correctly identify if an idea would be too hard to execute or flawed in some other way.

And I am not necessarily saying that you couldn't extract a reasoning algorithm out of the feedback, but that if you could do this, then it would take you a lot more effort and time, compared to extracting a reasoning algorithm from the things that John was saying. Now, all

8Duncan Sabien (Inactive)3y

Strong endorsement; this resonates with:

My own experiences running applied rationality workshops
My experiences trying to get people to pick up "ops skill" or "ops vision"
Explicit practice I've done with Nate off and on over the years

May try this next time I have a chance to teach pair debugging.

7Vladimir_Nesov3y

This suggests formulation of exercises about the author's responses to various prompts, as part of technical exposition (or explicit delimitation of a narrative by choices of the direction of its continuation). When properly used, this doesn't seem to lose much value compared to the exercise you describe, but it's more convenient for everyone. Potentially this congeals into a style of writing with no explicit exercises or delimitation that admits easy formulation of such exercises by the reader. This already works for content of technical writing, but less well for choices of topics/points contrasted with alternative choices. So possibly the way to do this is by habitually mentioning alternative responses (that are expected to be plausible for the reader, while decisively, if not legibly, rejected by the author), and leading with these rather than the preferred responses. Sounds jarring and verbose, a tradeoff that needs to be worth making rather than a straight improvement.

[-]johnswentworth4y400

Petrov Day thought: there's this narrative around Petrov where one guy basically had the choice to nuke or not, and decided not to despite all the flashing red lights. But I wonder... was this one of those situations where everyone knew what had to be done (i.e. "don't nuke"), but whoever caused the nukes to not fly was going to get demoted, so there was a game of hot potato and the loser was the one forced to "decide" to not nuke? Some facts possibly relevant here:

Petrov's choice wasn't actually over whether or not to fire the nukes; it was over whether or not to pass the alert up the chain of command.
Petrov himself was responsible for the design of those warning systems.
... so it sounds like Petrov was ~ the lowest-ranking person with a de-facto veto on the nuke/don't nuke decision.
Petrov was in fact demoted afterwards.
There was another near-miss during the Cuban missile crisis, when three people on a Soviet sub had to agree to launch. There again, it was only the lowest-ranked who vetoed the launch. (It was the second-in-command; the captain and political officer both favored a launch - at least officially.)
This was the Soviet Union; supposedly (?) this sort of hot potato happened all the time.

[-]Martin Sustrik4y112

Those are some good points. I wonder whether something similar happened (or could happen at all) in other nuclear countries, where we don't know about similar incidents because the system hasn't collapsed there, the archives were not made public, etc.

Also, it makes actually celebrating Petrov's day as widely as possible important, because then the option for the lowest-ranked person would be: "Get demoted, but also get famous all around the world."

[-]johnswentworth1y3912

Regarding the recent memes about the end of LLM scaling: David and I have been planning on this as our median world since about six months ago. The data wall has been a known issue for a while now, updates from the major labs since GPT-4 already showed relatively unimpressive qualitative improvements by our judgement, and attempts to read the tea leaves of Sam Altman's public statements pointed in the same direction too. I've also talked to others (who were not LLM capability skeptics in general) who had independently noticed the same thing and come to similar conclusions.

Our guess at that time was that LLM scaling was already hitting a wall, and this would most likely start to be obvious to the rest of the world around roughly December of 2024, when the expected GPT-5 either fell short of expectations or wasn't released at all. Then, our median guess was that a lot of the hype would collapse, and a lot of the investment with it. That said, since somewhere between 25%-50% of progress has been algorithmic all along, it wouldn't be that much of a slowdown to capabilities progress, even if the memetic environment made it seem pretty salient. In the happiest case a lot of researchers w... (read more)

[-]Vladimir_Nesov1y*3412

Original GPT-4 is rumored to be a 2e25 FLOPs model. With 20K H100s that were around as clusters for more than a year, 4 months at 40% utilization gives 8e25 BF16 FLOPs. Llama 3 405B is 4e25 FLOPs. The 100K H100s clusters that are only starting to come online in the last few months give 4e26 FLOPs when training for 4 months, and 1 gigawatt 500K B200s training systems that are currently being built will give 4e27 FLOPs in 4 months.

So lack of scaling-related improvement in deployed models since GPT-4 is likely the result of only seeing the 2e25-8e25 FLOPs range of scale so far. The rumors about the new models being underwhelming are less concrete, and they are about the very first experiments in the 2e26-4e26 FLOPs range. Only by early 2025 will there be multiple 2e26+ FLOPs models from different developers to play with, the first results of the experiment in scaling considerably past GPT-4.

And in 2026, once the 300K-500K B200s clusters train some models, we'll be observing the outcomes of scaling to 2e27-6e27 FLOPs. Only by late 2026 will there be a significant chance of reaching a scaling plateau that lasts for years, since scaling further would need $100 billion training systems that won't get built without sufficient success, with AI accelerators improving much slower than the current rate of funding-fueled scaling.
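
The cluster-scale figures in this comment can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below is only that: the per-GPU peak throughputs are assumptions (about 989e12 dense BF16 FLOP/s for an H100, and about 2.25e15 for a B200), a month is taken as 30 days, and the 40% utilization and 4-month duration are the figures quoted above.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day months

# Assumed dense BF16 peak throughput per GPU, in FLOP/s (not from the comment).
H100_BF16_PEAK = 989e12
B200_BF16_PEAK = 2.25e15

def training_flops(num_gpus, peak_flops_per_gpu, months=4, utilization=0.4):
    """Total training compute: GPUs x peak rate x utilization x wall-clock seconds."""
    return num_gpus * peak_flops_per_gpu * utilization * months * SECONDS_PER_MONTH

clusters = [
    ("20K H100s", 20_000, H100_BF16_PEAK),
    ("100K H100s", 100_000, H100_BF16_PEAK),
    ("500K B200s", 500_000, B200_BF16_PEAK),
]

for label, gpus, peak in clusters:
    print(f"{label}: {training_flops(gpus, peak):.1e} FLOPs over 4 months")
```

Under these assumptions the three results land close to the 8e25, 4e26, and 4e27 figures quoted above; different peak-throughput or utilization assumptions would shift them proportionally.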

6johnswentworth1y

I don't expect that to be particularly relevant. The data wall is still there; scaling just compute has considerably worse returns than the curves we've been on for the past few years, and we're not expecting synthetic data to be anywhere near sufficient to bring us close to the old curves.

[-]Vladimir_Nesov1y198

Nobody admitted to trying repeated data at scale yet (so we don't know that it doesn't work), which from the tiny experiments can 5x the data with little penalty and 15x the data in a still-useful way. It's not yet relevant for large models, but it might turn out that small models would greatly benefit already.

There are 15-20T tokens in datasets whose size is disclosed for current models (Llama 3, Qwen 2.5), plausibly 50T tokens of tolerable quality can be found (pretraining only needs to create useful features, not relevant behaviors). With 5x 50T tokens, even at 80 tokens/parameter^[1] we can make good use of 5e27-7e27 FLOPs^[2], which even a 1 gigawatt 500K B200s system of early 2026 would need 4-6 months to provide.
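
As a rough check on the numbers in the paragraph above, the sketch below uses the common approximation that training compute is about 6 × parameters × tokens. That rule of thumb is an assumption here: the comment's own footnotes are not visible in this view, so its exact method may differ. The 50T unique tokens, 5x repetition, and tokens-per-parameter ratios are taken from the comment.

```python
def approx_training_flops(total_tokens, tokens_per_param):
    """Approximate compute via C ~ 6 * N * D, with N = D / (tokens per parameter)."""
    params = total_tokens / tokens_per_param
    return 6 * params * total_tokens

unique_tokens = 50e12   # plausible pool of tolerable-quality tokens, per the comment
repeats = 5             # repetition factor from the small-scale experiments
total_tokens = unique_tokens * repeats

flops_at_80 = approx_training_flops(total_tokens, tokens_per_param=80)
flops_at_40 = approx_training_flops(total_tokens, tokens_per_param=40)

print(f"80 tokens/param: {flops_at_80:.2e} FLOPs")
print(f"40 tokens/param: {flops_at_40:.2e} FLOPs")
```

With this approximation, 80 tokens per parameter comes out a little under 5e27 FLOPs, near the low end of the quoted range, and halving the ratio to 40 exactly doubles the compute, since the parameter count doubles for the same data.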

The isoFLOP plots (varying tokens per parameter for fixed compute) seem to get loss/perplexity basins that are quite wide, once they get about 1e20 FLOPs of compute. The basins also get wider for hybrid attention (compare 100% Attention isoFLOPs in the "Perplexity scaling analysis" Figure to the others). So it's likely that using a slightly suboptimal tokens/parameter ratio of say 40 won't hurt performance much at all. In which case we get to use 9e27-2e28 FLOPs by tra... (read more)

1johnswentworth1y

FYI, my update from this comment was:

Hmm, seems like a decent argument...
... except he said "we don't know that it doesn't work", which is an extremely strong update that it will clearly not work.

[-]Vladimir_Nesov1y225

Use of repeated data was first demonstrated in the 2022 Galactica paper (Figure 6 and Section 5.1), at 2e23 FLOPs but without a scaling law analysis that compares with unique data or checks what happens for different numbers of repeats that add up to the same number of tokens-with-repetition. The May 2023 paper does systematic experiments with up to 1e22 FLOPs datapoints (Figure 4).

So that's what I called "tiny experiments". When I say that it wasn't demonstrated at scale, I mean 1e25+ FLOPs, which is true for essentially all research literature^[1]. Anchoring to this kind of scale (and being properly suspicious of results several orders of magnitude lower) is relevant because we are discussing the fate of 4e27 FLOPs runs.

[1] The largest datapoints in measuring the Chinchilla scaling laws for Llama 3 are 1e22 FLOPs. This is then courageously used to choose the optimal model size for the 4e25 FLOPs run that uses 4,000 times more compute than the largest of the experiments.

[-]Jozdien1y173

For what it's worth, and for the purpose of making a public prediction in case I'm wrong, my median prediction is that [some mixture of scaling + algorithmic improvements still in the LLM regime, with at least 25% gains coming from the former] will continue for another couple years. And that's separate from my belief that if we did try to only advance through the current mixture of scale and algorithmic advancement, we'd still get much more powerful models, just slower.

I'm not very convinced by the claims about scaling hitting a wall, considering we haven't had the compute to train models significantly larger than GPT-4 until recently. Plus other factors like post-training taking a lot of time (GPT-4 took ~6 months from the base model being completed to release, I think? And this was a lot longer than GPT-3), labs just not being good at understanding how good their models are, etc. Though I'm not sure how much of your position is closer to "scaling will be <25-50% of future gains" than "scaling gains will be marginal / negligible", especially since a large part of this trajectory involves e.g. self-play or curated data for overcoming the data wall (would that count more as an algorithmic improvement or scaling?)

6p.b.1y

The interesting thing is that scaling parameters (next big frontier models) and scaling data (small very good models) seem to be hitting a wall simultaneously. Small models now seem to get so much data crammed into them that quantisation becomes more and more lossy. So we seem to be reaching a frontier of the performance per parameter-bits as well.

4Noosphere891y

While I'm not a believer in the scaling has died meme yet, I'm glad you do have a plan for what happens if AI scaling does stop.

4Bogdan Ionut Cirstea1y

Would the prediction also apply to inference scaling (laws) - and maybe more broadly various forms of scaling post-training, or only to pretraining scaling?

2johnswentworth1y

Some of the underlying evidence, like e.g. Altman's public statements, is relevant to other forms of scaling. Some of the underlying evidence, like e.g. the data wall, is not. That cashes out to differing levels of confidence in different versions of the prediction.

4Leon Lang1y

What’s your opinion on the possible progress of systems like AlphaProof, o1, or Claude with computer use?

5johnswentworth1y

Still very plausible as a route to continued capabilities progress. Such things will have very different curves and economics, though, compared to the previous era of scaling.

[-]johnswentworth2y372

Ever since GeneSmith's post and some discussion downstream of it, I've started actively tracking potential methods for large interventions to increase adult IQ.

One obvious approach is "just make the brain bigger" via some hormonal treatment (like growth hormone or something). Major problem that runs into: the skull plates fuse during development, so the cranial vault can't expand much; in an adult, the brain just doesn't have much room to grow.

BUT this evening I learned a very interesting fact: ~1/2000 infants have "craniosynostosis", a condition in which their plates fuse early. The main treatments involve surgery to open those plates back up and/or remodel the skull. Which means surgeons already have a surprisingly huge amount of experience making the cranial vault larger after plates have fused (including sometimes in adults, though this type of surgery is most common in infants AFAICT)

.... which makes me think that cranial vault remodelling followed by a course of hormones for growth (ideally targeting brain growth specifically) is actually very doable with current technology.

[-]Nathan Helm-Burger2y*120

Well, the key time to implement an increase in brain size is when the neuron-precursors which are still capable of mitosis (unlike mature neurons) are growing. This is during fetal development, when there isn't a skull in the way, but vaginal birth has been a limiting factor for evolution in the past. Experiments have been done on increasing neuron count at birth in mammals via genetic engineering. I was researching this when I was actively looking for a way to increase human intelligence, before I decided that genetically engineering infants was infeasible [edit: within the timeframe of preparing for the need for AI alignment]. One example of a dramatic failure was increasing Wnt (a primary gene involved in fetal brain neuron-precursor growth) in mice. The resulting mice did successfully have larger brains, but they had a disordered macroscale connectome, so their brains functioned much worse.

6the gears to ascension2y

it's probably possible to get neurons back into mitosis-ready mode via some sort of crazy levin bioelectric cocktail, not that this helps us since that's probably 3 to 30 years of research away, depending on amount of iteration needed and funding and etc etc.

7johnswentworth2y

Fleshing this out a bit more: insofar as development is synchronized in an organism, there usually has to be some high-level signal to trigger the synchronized transitions. Given the scale over which the signal needs to apply (i.e. across the whole brain in this case), it probably has to be one or a few small molecules which diffuse in the extracellular space. As I'm looking into possibilities here, one of my main threads is to look into both general and brain-specific developmental signal molecules in human childhood, to find candidates for the relevant molecular signals. (One major alternative model I'm currently tracking is that the brain grows to fill the brain vault, and then stops growing. That could in-principle mechanistically work via cells picking up on local physical forces, rather than a small molecule signal. Though I don't think that's the most likely possibility, it would be convenient, since it would mean that just expanding the skull could induce basically-normal new brain growth by itself.)

4the gears to ascension2y

I hope by now you're already familiar with michael levin & his lab's work on the subject of morphogenesis signals? Pretty much everything I'm thinking here is based on that.

4johnswentworth2y

Yes, I am familiar with Levin's work.

4Nathan Helm-Burger2y

Yes, it's absolutely a combination of chemical signals and physical pressure. An interesting specific example of these two signals working together occurs during fetal development, when the pre-neurons are growing their axons. There is both chemotaxis, which steers the amoeba-like tip of the growing axon, and at the same time a substantial stretching force along the length of the axon. The stretching happens because the cells in between the origin and current location of the axon tip are dividing and expanding. The long-distance axons in the brain start their growth relatively early on in fetal development, when the brain is quite small, and have gotten stretched quite a lot by the time the brain is near to birth size.

4Nathan Helm-Burger2y

Neurons are really really hard to reverse. You are much better off using existing neural stem cells (adults retain a population in the hippocampus which spawns new neurons throughout life, just specifically in the memory formation area). So actually it's pretty straightforward to get new immature neurons for an adult. The hard part is inserting them without doing damage to existing neurons, and then getting them to connect in helpful rather than harmful ways. The developmental chemotaxis signals are no longer present, and the existing neurons are now embedded in a physically hardened extracellular matrix made of protein that locks axons and dendrites in place. So you have to (carefully!) partially dissolve this extracellular protein matrix (think firm jello) enough to let the new cells grow axons through it. Plus, you don't have the stretching forces, so new long-distance axons are just definitely not going to be achievable. But for something like improving a specific ability, like mathematical reasoning, you would only need additional local axons in that part of the cortex.

3johnswentworth2y

My hope here would be that a few upstream developmental signals can trigger the matrix softening, re-formation of the chemotactic signal gradient, and whatever other unknown factors are needed, all at once.

2the gears to ascension2y

Right. what I'm imagining is designing a new chemotaxis signal. That certainly does sound like a very hard part yup. Roll to disbelieve in full generality, sounds like a perfectly reasonable claim for any sort of sane research timeframe. Maybe. I think you might run out of room pretty quick if you haven't reintroduced enough plasticity to grow new neurons. Seems like you're gonna need a lot of new neurons, not just a few, in order to get a significant change in capability. Might be wrong about that, but it's my current hunch.

2Nathan Helm-Burger2y

Yes, ok. Not in full generality. It's not prohibited by physics, just like 2 OOMs more difficult. So yeah, in a future with ASI, could certainly be done.

3johnswentworth2y

Any particular readings you'd recommend?

[-]Nathan Helm-Burger2y140

15 years ago when I was studying this actively I could have sent you my top 20 favorite academic papers on the subject, or recommended a particular chapter of a particular textbook. I no longer remember these specifics. Now I can only gesture vaguely at Google scholar and search terms like "fetal neurogenesis" or "fetal prefrontal cortex development". I did this, and browsed through a hundred or so paper titles, and then a dozen or so abstracts, and then skimmed three or four of the most promising papers, and then selected this one for you. https://www.nature.com/articles/s41386-021-01137-9 Seems like a pretty comprehensive overview which doesn't get too lost in minor technical detail.

More importantly, I can give you my takeaway from years of reading many many papers on the subject. If you want to make a genius baby, there are lots more factors involved than simply neuron count. Messing about with genetic changes is hard, and you need to test your ideas in animal models first, and the whole process can take years even ignoring ethical considerations or budget.

There is an easier and more effective way to get super genius babies, and that method should be exhausted before resorting t... (read more)

8Carl Feynman2y

Brain expansion also occurs after various insults to the brain. It’s only temporary, usually, but it will kill unless the skull pressure is somehow relieved. So there are various surgical methods for relieving pressure on a growing brain. I don’t know much more than this.

[-]johnswentworth5y361

Just made this for an upcoming post, but it works pretty well standalone.

2Raemon5y

lolnice.

[-]johnswentworth3y33-6

I've been trying to push against the tendency for everyone to talk about FTX drama lately, but I have some generalizable points on the topic which I haven't seen anybody else make, so here they are. (Be warned that I may just ignore responses, I don't really want to dump energy into FTX drama.)

Summary: based on having worked in startups a fair bit, Sam Bankman-Fried's description of what happened sounds probably accurate; I think he mostly wasn't lying. I think other people do not really get the extent to which fast-growing companies are hectic and chaotic and full of sketchy quick-and-dirty workarounds and nobody has a comprehensive view of what's going on.

Long version: at this point, the assumption/consensus among most people I hear from seems to be that FTX committed intentional, outright fraud. And my current best guess is that that's mostly false. (Maybe in the very last couple weeks before the collapse they toed the line into outright lies as a desperation measure, but even then I think they were in pretty grey territory.)

Key pieces of the story as I currently understand it:

Moving money into/out of crypto exchanges is a pain. At some point a quick-and-dirty solution was for c

...

[-]habryka3y1118

I think this is likely wrong. I agree that there is a plausible story here, but given the case that Sam seems to have lied multiple times in confirmed contexts (for example when saying that FTX has never touched customer deposits), and people's experiences at early Alameda, I think it is pretty likely that Sam was lying quite frequently, and had done various smaller instances of fraud.

I don't think the whole FTX thing was a ponzi scheme, and as far as I can tell FTX the platform itself (if it hadn't burned all of its trust in the last 3 weeks), would have been worth $1-3B in an honest evaluation of what was going on.

But I also expect that when Sam used customer deposits he was well-aware that he was committing fraud, and others in the company were too. And he was also aware that there was a chance that things could blow up in the way it did. I do believe that they had fucked up their accounting in a way that caused Sam to fail to orient to the situation effectively, but all of this was many months after they had already committed major crimes and trust violations after touching customer funds as a custodian.

5Dana3y

The problem with this explanation is that there is a very clear delineation here between not-fraud and fraud. It is the difference between not touching customer deposits and touching them. Your explanation doesn't dispute that they were knowingly and intentionally touching customer deposits. In that case, it is indisputably intentional, outright fraud. The only thing left to discuss is whether they knew the extent of the fraud or how risky it was. I don't think it was ill-intentioned based on SBF's moral compass. He just had the belief, "I will pass a small amount of risk onto our customers, tell some small lies, and this will allow us to make more money for charity. This is net positive for the world." Then the risks mounted, the web of lies became more complicated to navigate, and it just snowballed from there.

[-]johnswentworth7mo*32-19

Everyone says flirting is about a "dance of ambiguous escalation", in which both people send progressively more aggressive/obvious hints of sexual intent in conversation.

But, like... I don't think I have ever noticed two people actually do this? Is it a thing which people actually do, or one of those things which like 2% of the population does and everyone else just talks about a lot and it mostly doesn't actually work in practice (like cold approaches)? Have you personally done the thing successfully with another person, with both of you actually picking up on the other person's hints? Have you personally seen two other people do the thing firsthand, where they actually picked up on each others' hints?

EDIT-TO-ADD: Those who have agree/disagree voted, I don't know if agree/disagree indicates that you have/haven't done the thing, or if agree/disagree indicates that you also have/haven't ever noticed anyone (including yourself) successfully do the thing, or something else entirely.

[-]Buck7mo*3234

Yes, I've had this experience many times and I'm aware of many other cases of it happening.

Maybe the proliferation of dating apps means that it happens somewhat less than it used to, because when you meet up with someone from a dating app, there's a bit more common knowledge of mutual interest than there is when you're flirting in real life?

4johnswentworth7mo

Mind painting a picture of a typical example? What's the setting, and what do the first few hints from each person look like?

[-]Buck7mo3012

The classic setting is a party (a place where you meet potential romantic partners who you don't already know (or who you otherwise know from professional settings where flirting is inappropriate), and where conversations are freely starting and ending, such that when you start talking to someone the conversation might either go for two minutes or four hours).

Examples of hints:

Mentioning things that indicate that you're romantically available, e.g. saying that you're single, that you're poly, telling a story of recently going on a date; more extreme would be telling a story of doing something promiscuous.
Mentioning things that indicate that you want to relate to the other person in a romantic or sexual context rather than a non-sexual way. For example, a woman talking about how she likes wearing revealing clothes, or commenting on her body or my body. And then responding positively to that kind of statement, e.g. building on it rather than demurring, replying flatly, or changing the subject.
Offering and accepting invitations to spend more time interacting one-on-one, especially in semi-private places. E.g. asking to sit together. (For example, person A might say "I'm getting a drin

... (read more)

[-]AlphaAndOmega7mo1013

I know this is LessWrong, and that sexual norms are different in the Bay Area, but for the average person:

Please don't tell prospective romantic interests that you "went on a date recently" or that you did something promiscuous. The majority of the time, it would be interpreted as a sign you're taken. Of course, if you elaborate that the date didn't work out, that's a different story.

8Buck7mo

I think that saying you went on a date usually is evidence that you're not in a monogamous relationship, and if it's ambiguous it gives the other person an opportunity to say "oh, how did it go?" which gives you an opportunity to subtly clarify that it was a casual date (and so confirm that you're in the market for casual dating).

3Viliam7mo

I guess "I was alone and masturbated recently" also wouldn't work well, so... what are the proper words to suggest that I am available? :D The only thing that comes to my mind, is that if you arrived with a person of the opposite sex, to explicitly mention that they are not your boyfriend/girlfriend.

5AlphaAndOmega7mo

Hmm... That's actually a tough question. As far as I can remember, I've rarely had to tell people outright that I'm single. My recommendation would be to flirt away, and if they don't casually namedrop a boyfriend or allude to having one, that's strong enough evidence that they're not taken.

> The only thing that comes to my mind, is that if you arrived with a person of the opposite sex, to explicitly mention that they are not your boyfriend/girlfriend.

The most tactful way to say as much would be to explicitly call them a "friend". That should get the message across.

[-]Elizabeth7mo2219

My disagree vote means: yes, this obviously happens a lot, and the fact that you haven't noticed this happening, to the point you think it might be made up, reveals a huge blindspot of one kind or another.

7johnswentworth7mo

Now THAT'S an interesting possibility. Did you already have in mind hypotheses of what that blindspot might be, or what else might be in it?

4Elizabeth7mo

Followed up with John offline.

[-]Elizabeth7mo140

Some examples of flirting:

medium skill on The Wire, failing to land: https://www.youtube.com/shorts/eyyqoFhXRao

In Crazy Ex-Girlfriend, "I'm Going to the Beach with Josh and His Friends!", there's a scene between White Josh and Derrick. I can't find a clip, but the key is that Derrick is hanging on to White Josh's every word.

Ted Lasso:

Note how and how much she's laughing at his very mediocre jokes. Ted could reasonably be interpreted as flirting back, but the audience knows he always makes those stupid ass jokes. Actually the whole Ted Lasso show might be good for watching someone who's generally very playful and seeing how it changes when he's actually into someone.

Roy and Keeley, also from Ted Lasso. Note she's dating his teammate.

Roy and some lady, still from Ted Lasso

Note how long she looks at him around 0:50, even though it's awkward while she's putting something away. She also contrives a way to ask if he's married, and makes an interesting face when he says no. He is giving her enough breadcrumbs to continue but not flirting back (because he's still into Keeley).

Half of the movie Challengers (including between the two ambiguously platonic male leads)

[At this p... (read more)

[-]Myron Hedderson7mo181

I second the point about physical touch being important, and add: in my experience what you're going for when flirting isn't "ambiguous signal" but "plausible deniability". The level of ambiguity is to be minimized, subject to the constraint that plausible deniability is maintained - ambiguity is an unfortunate side-effect, not something you're aiming to modulate directly. Why you want plausible deniability: If the person doesn't respond, or responds in the negative, you want to be able to back off without embarrassment to either party and pretend nothing happened/you were just being friendly/etc. You want to send a signal that is clear enough the other person will pick up on it, but can plausibly claim not to have done so if asked, so you're not backing them into a corner socially where they have to give you a definite yes/no. Similar to the advice not to flirt in an elevator or other enclosed space the person you're flirting with can't easily leave, except the "enclosed space" is the space of possible social responses.

Once you've done a few things they ought to have picked up on, and no negative and some seemingly positive interaction has occurred afterwards (physical proximity h... (read more)

3johnswentworth7mo

One possibility in my hypothesis space here is that there usually isn't a mutual dance of plausibly-deniable signals, but instead one person sending progressively less deniable signals and the other person just not responding negatively (but not otherwise sending signals themselves).

8Myron Hedderson7mo

I imagine that can happen for a while, but if I'm getting nothing back, I stop once I'm pretty sure they should have noticed what I'm doing. Silence in response to a received message, is a form of response, and not one that indicates "keep getting progressively less subtle please". If that is the wrong move (the person is interested in me continuing), they will let me know once I back off.

6Myron Hedderson7mo

Another thought: You refer to this as a dance, and one model of what's happening when one flirts is "demonstrate social skill/difficult-to-fake signal of intelligence by calibrating levels of ambiguity and successfully modeling the other person's mind --> this is attractive --> get date", in the same way that dancing successfully in an actual dance can be "demonstrate physical skill/difficult-to-fake signal of health --> this is attractive --> get date". And I'm sure that happens sometimes, and for some people, but my model of flirting does not involve "demonstrate social skill/intelligence --> get date". For me, flirting solves a different problem, which is "communicate that you like someone (in the sense one likes people one might like to date), and have them communicate back that they like you, without either of you risking much embarrassment or social awkwardness if it's not mutual or for any other reason a date can't happen right now". Depending on what you're trying to do by flirting (demonstrate social skill vs. give someone you're attracted to a low-pressure way to tell you whether they like you back) the approach may be different. Although, even the latter can be a tricky thing to do and ability to do it successfully demonstrates a useful skill. I think most people who flirt are like, not super socially skilled around people they're attracted to, and "try to get a sense of whether it's mutual in a low-risk way" is the more important problem that flirting solves for them. But maybe that's just me typical-minding :).

3Myron Hedderson7mo

Also: the higher the number of spectators, the more you have to be very careful about plausible deniability, because you have to take into consideration what everyone is going to think, and the level of social awkwardness involved in a fumble or a rejection is higher. I've flirted with a few women before, but it only lasts more than a few seconds if the woman is flirting back, and I have always done it 1:1 rather than with a group of onlookers. And whenever I've noticed someone who might be flirting with me, it has likewise been in a 1:1 situation, at least initially. So it doesn't surprise me that you haven't noticed others doing this. Anything done in front of a group has to be so unclear to onlookers that most people would miss it, something like an inside joke or reference to a past conversation.

4johnswentworth7mo

What is this context in which you are hanging out 1:1 with a woman and it's not already explicitly a date? (I mean, that of course does happen sometimes, but at least for me it's not particularly common, so I'm wondering what the contexts were when this actually happened to you.)

8Buck7mo

The classic is at a party where conversations of different sizes are regularly starting and stopping.

3Myron Hedderson7mo

Um... well, first off, flirting doesn't have to happen when you're hanging out. It can start with something as simple as a compliment to a stranger. Start from the premise that people like to hear positive messages about themselves without any strings attached, and hand them out like candy (but recognizing that taking candy from strangers is something some people would prefer not to do for obvious reasons, so accept whatever response you get to what is offered) - some people will respond back, others won't, but no harm will be done. I am an introvert so I don't do this often, but striking up conversations with new people at random is a thing I can force myself to do, and it rarely goes as poorly as one might fear. But also, my friend-group is mixed, more women than men, and typically it's people I've met one at a time over the years, less of a "friend group" than "a number of people who are my friends"- so I have lots of 1:1 time with female friends. In terms of flirting with those friends, well, they're friends, so that almost never happens - but almost never is not never. Three times that I can recall off the top of my head, it turned out that one of my friends was attracted to me, and I learned that either because she explicitly said so (in one case, we were teenagers and both clueless about how to flirt, her idea was to follow me around everywhere, and from my perspective I just didn't know that was a thing that I should notice) or because of some flirting (two cases). When I was younger and much, much more awkward, there were innumerable instances where I was attracted to a female friend and didn't say anything because from young-me's perspective of course not that's insane and I'm lucky this amazing person even wants to be my friend and allow me to continue to be in her presence. There was once when I did say something to a good friend and it wasn't reciprocated, we're still close friends, but that wasn't flirting so much as "we've just met had lunch because

[-]Robert Miles7mo175

I'm not so deliberate/strategic about it, but yeah. Like, there's another 'algorithm' that's more intuitive, which is something like "When interacting with the person, it's ~always an active part of your mental landscape that you're into them, and this naturally affects your words and actions. Also, you don't want to make them uncomfortable, so you suppress anything that you think they wouldn't welcome". This produces approximately the same policy, because you'll naturally leak some bits about your interest in them, and you'll naturally be monitoring their behaviour to estimate their interest in you, in order to inform your understanding of what they would welcome from you. As you gather more evidence that they're interested, you'll automatically become more free in allowing your interest to show, resulting in ~the same 'escalation of signals of interest'.

I think the key thing about this is like "flirting is not fundamentally about causing someone to be attracted to you, it's about gracefully navigating the realisation that you're both attracted to each other". This is somewhat confused by the fact that "ability to gracefully navigate social situations" is itself attractive, so flirting well can in itself make someone more attracted to you. But I claim that this isn't fundamentally different from the person seeing you skillfully break up a fight or lead a team through a difficult situation, etc.

2Cleo Nardo6mo

Notwithstanding, I think flirting is substantially (perhaps even fundamentally) about both (i) attraction, and (ii) seduction. Moreover, I think your model is too symmetric between the parties, both in terms of information-symmetry and desire-symmetry across time. My model of flirting is roughly: Alice attracts Bob -> Bob tries attracting Alice -> Alice reveals Bob attracts Alice -> Bob tries seducing Alice -> Alice reveals Bob seduces Alice -> Initiation

[-]Vanessa Kosoy7mo123

I never did quite that thing successfully. I did have one time when I dropped progressively unsubtle hints on a guy, who remained stubbornly oblivious for a long time until he finally got the message and reciprocated.

[-]DirectedEvolution7mo115

I interpret the confusion around flirting as “life imitating art” — specifically, there is a cultural narrative about how flirting works that a lot of socially awkward people are trying to implement.

That means there are big discrepancies between how experts flirt and how most people flirt. It also means that most people have to learn how to read the flirtation signals of other low-flirtation-skill people.

The cultural narrative around flirting therefore doesn’t exactly match practice, even though it influences practice.

It doesn’t necessarily take that much flirting to build enough confidence to ask someone out. Are they alone at a party? Is your conversation with them going on longer than for most people? Is it fun? You’re all set.

[-]Jeremy Gillen7mo*113

Have you personally done the thing successfully with another person, with both of you actually picking up on the other person's hints?

Yes. But usually the escalation happens over weeks or months, over multiple conversations (at least in my relatively awkward nerd experience). So it'd be difficult to notice people doing this. Maybe twice I've been in situations where hints escalated within a day or two, but both were building from a non-zero level of suspected interest. But none of these would have been easy to notice from the outside, except maybe at a couple of moments.

8J Bostock7mo

There are two parts here:

1. Are people using escalating hints to express romantic/sexual interest in general?
2. Does it follow the specific conversational patterns usually used?

1 is true in my experience, while 2 usually isn't. I can think of two examples where I've flirted by escalating signals. In both cases it was more to do with escalating physical touch and proximity, though verbal tone also played a part. I would guess that the typical examples of 2 you normally see (like A complimenting B's choice of shoes, then B using a mild verbal innuendo, then A making a comment about B's figure) don't happen as often, since not many people are good enough wordsmiths to do the escalation purely verbally. Plus it's not the Victorian era anymore and it's acceptable to escalate by slowly leaning forward as the conversation progresses, almost-accidentally brushing someone's hand, etc.

8Morpheus7mo

One of the first things that (shy?) people use to gauge each other's interests before or instead of talking about anything explicit is eye contact. So I think that wearing your glasses puts you at a disadvantage unless you take them off when you are flirting. I'm not sure why you're wearing them, but taking them off in itself could be a flirty move. I am not particularly good at flirting. But I remember in 9th grade a girl I had flirted with for like half an hour at an event via eye contact. We didn't exchange more than ~3 sentences in person (there were no innuendos). Then she called me later that same day, asking me out explicitly if I wanted to be her boyfriend.

7Selfmaker6627mo

I'm pretty sure I wouldn't escalate those signs above a rather low threshold given any observers, and my intuition tells me other people would be similar in this regard. So not observing flirting could just imply people don't flirt if you're in the conversation with them. As an extreme example, I've never seen anyone having sex, but it seems as if people do that all the time.

6Johannes C. Mayer7mo

In my model, flirting is about showing that you are paying attention. You say things that you could only pick up if you pay close attention to me and what I say. It's like a cryptographic proof certificate, showing that you think that I am important enough to pay attention to continuously. Usually this is coupled with an optimization process of using that knowledge to make me feel good, e.g. giving a compliment that actually tracks reality in a way I care about. It's more general than just showing sexual interest, I think.

3AlphaAndOmega7mo

I've seen it happen, and have done it myself with decent success. As @Buck notes below, dating apps, which are now a majority share of how people begin or seek to begin relationships, are far more targeted. There's little plausible deniability involved, both of you are talking on Tinder. Not that there isn't some, of course. There are mind games afoot where people claim to be interested only in long-term relationships, but if you're attractive enough, they might easily accept something shorter with no strings attached. Conversely, there are people who state they're looking for a quick romp, but are hiding the degree of yearning they contain for something more serious. It's hard to break it down into a play-by-play, but in my experience, flirting starts out with friendly interactions, obvious or not so obvious signs that you're single, gauging the reception of jokes or compliments, and then grows from there. The more you gradually establish compatibility and interest, the easier it gets to stop beating around the bush.


[-]johnswentworth1y*31-27

Epistemic status: rumor.

Word through the grapevine, for those who haven't heard: apparently a few months back OpenPhil pulled funding for all AI safety lobbying orgs with any political right-wing ties. They didn't just stop funding explicitly right-wing orgs, they stopped funding explicitly bipartisan orgs.

[-]habryka1y6623

My best guess is that this is false. As a quick sanity-check, here are some bipartisan and right-leaning organizations historically funded by OP:

FAI leans right. https://www.openphilanthropy.org/grants/foundation-for-american-innovation-ai-safety-policy-advocacy/
Horizon is bipartisan. https://www.openphilanthropy.org/grants/open-philanthropy-technology-policy-fellowship-2022/
CSET is bipartisan. https://www.openphilanthropy.org/grants/georgetown-university-center-for-security-and-emerging-technology/
IAPS is bipartisan. https://www.openphilanthropy.org/grants/page/2/?focus-area=potential-risks-advanced-ai&view-list=false, https://www.openphilanthropy.org/grants/institute-for-ai-policy-strategy-general-support/
RAND is bipartisan. https://www.openphilanthropy.org/grants/rand-corporation-emerging-technology-fellowships-and-research-2024/.
Safe AI Forum. https://www.openphilanthropy.org/grants/safe-ai-forum-operating-expenses/
AI Safety Communications Centre. https://www.openphilanthropy.org/grants/effective-ventures-foundation-ai-safety-communications-centre/ seems to lean left.

Of those, I think FAI is the only one at risk of OP being unable to fund them, based on my guess of where th... (read more)

[-]gwern1y345

Also worth noting Dustin Moskovitz was a prominent enough donor this election cycle, for Harris, to get highlighted in news coverage of her donors: https://www.washingtonexaminer.com/news/campaigns/presidential/3179215/kamala-harris-influential-megadonors/ https://www.nytimes.com/2024/10/09/us/politics/harris-billion-dollar-fundraising.html

[-]habryka1y*220

Curious whether this is a different source than me. My current best model was described in this comment, which is a bit different (and indeed, my sense was that if you are bipartisan, you might be fine, or might not, depending on whether you seem more connected to the political right, and whether people might associate you with the right):

Yep, my model is that OP does fund things that are explicitly bipartisan (like, they are not currently filtering on being actively affiliated with the left). My sense is that in practice it's a fine balance, and if there was some high-profile thing where Horizon became more associated with the right (like maybe some alumnus becomes prominent in the Republican Party and very publicly credits Horizon for that, or there is some scandal involving someone on the right who is a Horizon alumnus), then I do think their OP funding would have a decent chance of being jeopardized, and the same is not true on the left.
Another part of my model is that one of the key things about Horizon is that they are of a similar school of PR as OP themselves. They don't make public statements. They try to look very professional. They are probably very happy to compromise on

... (read more)

5johnswentworth1y

I am posting this now mostly because I've heard it from multiple sources. I don't know to what extent those sources are themselves correlated (i.e. whether or not the rumor started from one person).

7harfe1y

A related comment from lukeprog (who works at OP) was posted on the EA Forum. It includes:

5habryka1y

I think the comment more confirms than disconfirms John's comment (though I still think it's too broad for other reasons). OP "funding" something historically has basically always meant recommending a grant to GV. Luke's language to me suggests that indeed the right-of-center grants are no longer referred to GV (based on a vague vibe of how he refers to funders in plural). OP has always made some grant recommendations to other funders (historically OP would probably describe those grants as "rejected but referred to an external funder"). As Luke says, those are usually ignored, and OP's counterfactual effect on those grants is much less, and IMO it would be inaccurate to describe those recommendations as "OP funding something". As I said in the comment I quote in the thread, most OP staff would like to fund things right of center, but GV does not seem to want to, as such the only choice OP has is to refer them to other funders (which sometimes works, but mostly doesn't). As another piece of evidence, when OP defunded all the orgs that GV didn't want to fund anymore, the communication emails that OP sent said that "Open Philanthropy is exiting funding area X" or "exiting organization X". By the same use of language, yes, it seems like OP has exited funding right-of-center policy work. (I think it would make sense to taboo "OP funding X" in future conversations to avoid confusion, but also, I think historically it was very meaningfully the case that getting funded by GV is much better described as "getting funded by OP" given that you would never talk to anyone at GV and the opinions of anyone at GV would basically have no influence on you getting funded. Things are different now, and in a meaningful sense OP isn't funding anyone anymore, they are just recommending grants to others, and it matters more what those others think than what OP staff thinks)

-10Shankar Sivarajan1y

[-]johnswentworth4y300

Takeaways From "The Idea Factory: Bell Labs And The Great Age Of American Innovation"

Main takeaway: to the extent that Bell Labs did basic research, it actually wasn’t all that far ahead of others. Their major breakthroughs would almost certainly have happened not-much-later, even in a world without Bell Labs.

There were really two transistor inventions, back to back: Bardeen and Brattain’s point-contact transistor, and then Shockley’s transistor. Throughout, the group was worried about some outside group beating them to the punch (i.e. the patent). There were semiconductor research labs at universities (e.g. at Purdue; see pg 97), and the prospect of one of these labs figuring out a similar device was close enough that the inventors were concerned about being scooped.

Most inventions which were central to Bell Labs actually started elsewhere. The travelling-wave tube started in an academic lab. The idea for fiber optic cable went way back, but it got its big kick at Corning. The maser and laser both started in universities. The ideas were only later picked up by Bell.

In other cases, the ideas were “easy enough to find” that they popped up more than once, independently, and were mos... (read more)

[-]dynomight4y180

I loved this book. The most surprising thing to me was the answer that people who were there in the heyday give when asked what made Bell Labs so successful: They always say it was the problem, i.e. having an entire organization oriented towards the goal of "make communication reliable and practical between any two places on earth". When Shannon left the Labs for MIT, people who were there immediately predicted he wouldn't do anything of the same significance because he'd lose that "compass". Shannon was obviously a genius, and he did much more after than most people ever accomplish, but still nothing as significant as what he did when at the Labs.

[-]johnswentworth1y29-7

So I read SB1047.

My main takeaway: the bill is mostly a recipe for regulatory capture, and that's basically unavoidable using anything even remotely similar to the structure of this bill. (To be clear, regulatory capture is not necessarily a bad thing on net in this case.)

During the first few years after the bill goes into effect, companies affected are supposed to write and then implement a plan to address various risks. What happens if the company just writes and implements a plan which sounds vaguely good but will not, in fact, address the various risks? Probably nothing. Or, worse, those symbolic-gesture plans will become the new standard going forward.

In order to avoid this problem, someone at some point would need to (a) have the technical knowledge to evaluate how well the plans actually address the various risks, and (b) have the incentive to actually do so.

Which brings us to the real underlying problem here: there is basically no legible category of person who has the requisite technical knowledge and also the financial/status incentive to evaluate those plans for real.

(The same problem also applies to the board of the new regulatory body, once past the first few years.)

Ha... (read more)

[-]ryan_greenblatt1y213

What happens if the company just writes and implements a plan which sounds vaguely good but will not, in fact, address the various risks? Probably nothing.

The only enforcement mechanism that the bill has is that the Attorney General (AG) of California can bring a civil claim. And, the penalties are quite limited except for damages. So, in practice, this bill mostly establishes liability enforced by the AG.

So, the way I think this will go is:

The AI lab implements a plan and must provide this plan to the AG.
If an incident occurs which causes massive damages (probably ballpark of $500 million in damages given language elsewhere in the bill), then the AG might decide to sue.
A civil court will decide whether the AI lab had a reasonable plan.

I don't see why you think "the bill is mostly a recipe for regulatory capture" given that no regulatory body will be established and it de facto does something very similar to the proposal you were suggesting (impose liability for catastrophes). (It doesn't require insurance, but I don't really see why self insuring is notably different.)

(Maybe you just mean that if a given safety case doesn't result in that AI lab being sued by the AG, the... (read more)

9johnswentworth1y

Good argument, I find this at least somewhat convincing. Though it depends on whether penalty (1), the one capped at 10%/30% of training compute cost, would be applied more than once on the same model if the violation isn't remedied.

3RHollerith1y

I'm pessimistic enough about the AI situation that even if all the bill does is slow down the AGI project a little (by wasting the time of managers and contributors) I'm tentatively for it.

1Johannes C. Mayer1y

For the reasonable price of $300 per month, I insure anybody against the destruction of the known world. Should the world be destroyed by AGI I'll give you your money back 10^100-fold. That said, if there were insurers, they would probably be more likely than average to look into AI X-risk. Some might then be convinced that it is important and that they should do something about it.

1[anonymous]1y

I don't understand this. Isn't the strongest incentive already present (because extinction would affect them)? Or maybe you mean smaller scale 'catastrophes'?

6Raemon1y

I think people mostly don't believe in extinction risk, so the incentive isn't nearly as real/immediate.

7johnswentworth1y

+1, and even for those who do buy extinction risk to some degree, financial/status incentives usually have more day-to-day influence on behavior.

1[anonymous]1y

I'm imagining this: Case one: would-be-catastrophe-insurers don't believe in x-risks, don't care to investigate. (At stake: their lives) Case two: catastrophe-insurers don't believe in x-risks, and either don't care to investigate, or do for some reason I'm not seeing. (At stake: their lives and insurance profits (correlated)).

4Raemon1y

They can believe in catastrophic but non-existential risks. (Like, AI causes something like CrowdStrike periodically if you're not trying to prevent that.)

[-]johnswentworth2y286

I've just started reading the singular learning theory "green book", a.k.a. Mathematical Theory of Bayesian Statistics by Watanabe. The experience has helped me to articulate the difference between two kinds of textbooks (and viewpoints more generally) on Bayesian statistics. I'll call one of them "second-language Bayesian", and the other "native Bayesian".

Second-language Bayesian texts start from the standard frame of mid-twentieth-century frequentist statistics (which I'll call "classical" statistics). They view Bayesian inference as a tool/technique for answering basically-similar questions and solving basically-similar problems to classical statistics. In particular, they typically assume that there's some "true distribution" from which the data is sampled independently and identically. The core question is then "Does our inference technique converge to the true distribution as the number of data points grows?" (or variations thereon, like e.g. "Does the estimated mean converge to the true mean", asymptotics, etc). The implicit underlying assumption is that convergence to the true distribution as the number of (IID) data points grows is the main criterion by which inference meth... (read more)
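To make the "second-language" convergence question concrete, here is a minimal sketch, assuming a toy coin-flip setup that is not from the book: IID Bernoulli data drawn from a hypothetical "true" bias, with a conjugate Beta prior. The function names, the uniform Beta(1, 1) prior, and the value 0.3 are all illustrative assumptions.

```python
import random

def beta_bernoulli_posterior(data, prior_a=1.0, prior_b=1.0):
    """Conjugate update: a Beta(prior_a, prior_b) prior plus 0/1 data gives a Beta posterior."""
    heads = sum(data)
    tails = len(data) - heads
    return prior_a + heads, prior_b + tails

def beta_mean_and_variance(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance

def simulate(true_p, n, seed=0):
    """Draw n IID flips from the assumed 'true distribution' and summarize the posterior."""
    rng = random.Random(seed)
    data = [1 if rng.random() < true_p else 0 for _ in range(n)]
    return beta_mean_and_variance(*beta_bernoulli_posterior(data))

if __name__ == "__main__":
    for n in (10, 100, 10_000):
        mean, variance = simulate(true_p=0.3, n=n)
        print(n, round(mean, 3), variance)
```

In this toy setting the posterior variance shrinks as the number of data points grows and the posterior mean settles near the assumed true bias, which is the kind of convergence property the second-language framing treats as the main criterion.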

2philip_b2y

Is there any "native" textbook that is pragmatic and explains how to use Bayesian methods in practice (perhaps in some narrow domain)?

2johnswentworth2y

I don't know of a good one, but never looked very hard.

[-]johnswentworth4mo270

Just got my whole genome sequenced. A thing which I could have figured out in advance but only realized once the results came back: if getting a whole genome sequence, it's high value to also get your parents' genomes sequenced.

Here's why.

Suppose I have two unusual variants at two different positions (not very close together) within the same gene. So, there's a variant at location A, and a variant at location B. But (typically) I have two copies of each gene, one from each parent. So, I might have the A and B variants both on the same copy, and the other copy could be normal. OR, I could have the A variant on one copy and the B variant on the other copy. And because modern sequencing usually works by breaking DNA into little chunks, sequencing the chunks, and then computationally stitching it together... those two possibilities can't be distinguished IIUC.

The difference is hugely important if e.g. both the A variant and the B variant severely fuck up the gene. If both are on the same copy, I'd have one normal working variant and one fucked up. If they're on different copies, then I'd have zero normal working variants, which will typically have much more extreme physiological result... (read more)

7Metacelsus4mo

Yeah, if anyone is interested in learning more, this is called the phasing problem. For common enough variants, it's often possible to figure this out by looking at general patterns of co-inheritance if you have a large reference dataset for the population (see: https://www.nature.com/articles/s41588-023-01415-w). Long read sequencing which you mentioned is another approach. But you're right that these days it would just be cheapest to get the parental genomes (assuming that's an option).
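As a rough illustration of why parental genomes help with the phasing problem discussed above, here is a minimal sketch. It assumes the child carries exactly one copy of the variant at each of two sites (A and B), that genotypes are given as variant copy counts (0, 1, or 2), and that there are no de novo mutations or genotyping errors. The function names are hypothetical and this is not how any particular phasing tool works.

```python
from typing import Optional, Tuple

def variant_origin(mother: int, father: int) -> Optional[str]:
    """Which parent transmitted the child's single variant copy at one site?

    mother/father are variant copy counts (0, 1, or 2) at that site.
    Returns None when the parental genotypes alone cannot decide.
    """
    # The mother can be the source only if she carries the variant and
    # the father is able to pass on a non-variant copy (and vice versa).
    from_mother = mother >= 1 and father <= 1
    from_father = father >= 1 and mother <= 1
    if from_mother and not from_father:
        return "mother"
    if from_father and not from_mother:
        return "father"
    return None

def phase_two_sites(mother: Tuple[int, int], father: Tuple[int, int]) -> Optional[str]:
    """Return 'cis' (A and B on the same copy), 'trans' (on different copies), or None.

    mother and father are (copies at site A, copies at site B).
    """
    origin_a = variant_origin(mother[0], father[0])
    origin_b = variant_origin(mother[1], father[1])
    if origin_a is None or origin_b is None:
        return None
    return "cis" if origin_a == origin_b else "trans"

if __name__ == "__main__":
    print(phase_two_sites((1, 1), (0, 0)))  # both variants inherited from the mother
    print(phase_two_sites((1, 0), (0, 1)))  # one variant from each parent
    print(phase_two_sites((1, 1), (1, 0)))  # site A is ambiguous
```

When both parents carry the same variant, this simple rule gives up; that is where population reference panels or long-read sequencing would come in.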

[-]johnswentworth6mo270

Question I'd like to hear people's takes on: what are some things which are about the same amount of fun for you as (a) a median casual conversation (e.g. at a party), or (b) a top-10% casual conversation, or (c) the most fun conversations you've ever had? In all cases I'm asking about how fun the conversation itself was, not about value which was downstream of the conversation (like e.g. a conversation with someone who later funded your work).

For instance, for me, a median conversation is about as fun as watching a mediocre video on YouTube or reading a mediocre blogpost. A top-10% conversation is about as fun as watching a generic-but-fun movie, like e.g. a Jason Statham action flick. In both cases, the conversation drains more energy than the equal-fun alternative. I have probably had at most a single-digit number of conversations in my entire life which were as fun-in-their-own-right as e.g. a median night out dancing, or a median escape room, or median sex, or a median cabaret show. Maybe zero, unsure.

The rest of this is context on why I'm asking which you don't necessarily need to read in order to answer the question...

So I recently had a shortform asking "hey, that thing whe... (read more)

9DirectedEvolution6mo

I find conversations more meaningful than many comparably-fun activities. What provides the meaning is my intuition about the opportunities the conversation can lead to and the update in how I’m perceived by my counterpart. As a secondary effect, conversations exercise and test my ability to think on my feet. Flirtation can lead to sex, a coffee break chat with a collaborator can lead to a new project, a talk with anyone can lead to closer friendship. Flirtation suggests I’m more desirable than I thought, talk about projects that I’m regarded as more capable, talk with acquaintances that I’m charismatic. These social updates and the mental exercise conversation provides are why I seek out conversation compared to many other more-fun activities. Also, I have to recognize that I probably value conversation for its own sake above and beyond these instrumental purposes. It just feels like it ought to be part of a good life aesthetic, like eating fresh fruits and vegetables.

4Selfmaker6626mo

As said by @Mateusz Bagiński, normal smalltalk is +epsilon, but some more comparisons: a short smile with a stranger or acquaintance is like eating a very tasty fruit. 90th-percentile conversations are all with good friends and leave me high for a few hours. As good as a very good date. No non-social activities come close. I don’t actually remember any particular best ones, but the best ones I can recall aren’t about conversation anymore but about presence, I think. They feel extremely nourishing and meaningful and my only comparison is a really, really good IFS or therapy session.

7Elizabeth6mo

A top [1-5?]% conversation is as good in the moment as an early playthrough of my favorite video games, and feels better afterward. That's probably top 10% of conversations at parties, which have higher selection pressure than Uber drivers. I've been working on getting more out of lower percentile conversations. The explanation is fairly woo-ey but might also relate to your interest around flirting. Median conversation is about as good as a TV show I will watch for two episodes and give up on. Tangent: my standards for media have gone way up over the last ~5 years, I abandon a lot more out of boredom, especially books. I worried this was some sort of generalized anhedonia, but every once in a while I read or reread something great and enjoy it immensely, so I think it's just raised standards.

3johnswentworth6mo

I'd be interested to hear that.

5Elizabeth6mo

This mostly comes up with talkative Uber drivers. The superficial thing I do is I ask myself "what vibes is this person offering?" And then do some kind of centering move. Sometimes it feels unexpectedly good and I do an accepting move and feel nourished by the conversation. Sometimes it will feel bad and I'll be more aggressive in shutting conversations down. I'm often surprised by the vibe answer, it feels different than what my conscious brain would answer. The obvious question is what am I doing with the inquiry and accepting moves. I don't know how to explain that. Overall a growth edge I'm exploring right now is "forms of goodness other than interesting." And I think that's probably a weak area for you too, although maybe an endorsed one.

6J Bostock6mo

Median party conversation is probably about as good as playing a video game I enjoy, or reading a good blog post. Value maybe £2/hr. More tiring than the equivalent activity. Top 10% party conversation is somewhere around going for a hike somewhere very beautiful near to where I live, or watching an excellent film. Value maybe £5/hr. These are about as tiring as the equivalent activity. Best conversations I've ever had were on par with an equal amount of time spent on a 1/year quality holiday, like to Europe (I live in the UK) but not to, say, Madagascar. Most of these conversations went on for >1 hr. Value maybe £25/hr. Less tiring and if anything energizing. (For monetary values I'm imagining what I'd pay to go to a party for 4 hours where that event would occur. My overall income minus expenses is probably a bit below average for the UK, so take that into account.)

6Jonas Hallgren6mo

I generally agree with you that normal conversations are boring and should be avoided. There are two main strats I employ:

1. Don't let go of relationships where you can relax: my sample size is highly skewed towards retaining long-term relationships where you're comfortable enough with people that you can just chill and relax so my median conversation is like that?
2. You create a shared space and the norms come from that shared space so to shape conversations you can say some deliberately out of pocket stuff (randomly jump into a yoda accent for example) in order to change the vibe and therefore remove part of the cognitive load?
   1. If the person is like "ugghh, wtf?" in vibe you just move on to the next conversation ¯\_(ツ)_/¯

6Mateusz Bagiński6mo

I think the median conversation for me is zero or positive-but-very-small epsilon fun, whereas the 90th percentile is maybe as fun as discovering a new song/band/album that I like a lot or listening to one of my favorite songs after several weeks of not listening to it. The most fun conversations I've ever had are probably the most fun experiences I've ever had. I don't find conversations-in-general draining, although I can get exhausted by social activities where I'm supposed to play some role that is out of character for me, like in LARPing (though that might be a learnable-skill issue) or extended-family reunions.

2johnswentworth6mo

Can you give an example of what a "most fun" conversation looked like? What's the context, how did it start, how did the bulk of it go, how did you feel internally throughout, and what can you articulate about what made it so great?

[-]Mateusz Bagiński5mo*164

At a recent EAG afterparty, bored @Algon suggested that he explain something to me, and I explain something to him in return. He explained to me this thing. When it was my turn, I thought that maybe I should do the thing that had been on my mind for several months: give a technical explanation of monads starting with the very basics of category theory, and see how long it takes. It turned out that he knew the most basic basics of category theory, so it was a bit more of an easy mode, but it still took something like 50 minutes, out of which maybe half was spent on natural transformations. A few minutes in, @niplav joined us. I enjoyed drawing diagrams and explaining and discussing a technical topic that I love to think about, in the absurd setting of people playing beerpong one meter from the whiteboard, with passers-by asking "Are you guys OK?" or "WTF are you doing?" ("He's explaining The Meme!"). It was great to witness them having intuition breakthroughs, where you start seeing something that is clear and obvious in hindsight but not in foresight (similar to bistable figures). Throughout, I also noticed some deficiencies in my understanding (e.g., I noticed that I didn't have a... (read more)

6Algon5mo

Can confirm that I was bored (no room for a sword-fight!), knew very little category theory, and learned about monads. But at least now I know that while a monad is not like a burrito, a burrito is like a monad. Rant: Man, I don't like how unwieldy the categorical definition of a monoid is! So very many functors, transformations, diagrams etc. And they're not even particularly pleasing diagrams. The type-theoretic definition of a monad, as covered in this lovely n-lab article, felt less awkward to me. But admittedly, learning the categorical definition did help with learning the type-theoretic definition.
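For readers who want something concrete: a minimal sketch of the type-theoretic (unit/bind) presentation of a monad, using Python lists as the example. This is only an illustration, not the whiteboard explanation or the n-lab article's definition; the names `unit` and `bind` are conventional choices, and `f` and `g` are arbitrary example functions.

```python
def unit(x):
    """Wrap a plain value in the list monad."""
    return [x]


def bind(xs, f):
    """Apply a list-returning function to every element and flatten the results."""
    return [y for x in xs for y in f(x)]


# Two arbitrary "effectful" functions: each takes a value and returns a list.
f = lambda x: [x, x + 1]
g = lambda x: [x * 2]
m = [1, 2]

# The three monad laws, checked on this example.
left_identity = bind(unit(3), f) == f(3)
right_identity = bind(m, unit) == m
associativity = bind(bind(m, f), g) == bind(m, lambda x: bind(f(x), g))
print(left_identity, right_identity, associativity)
```

Checking the laws on one example is of course not a proof; it just shows what the laws say in this presentation.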

6johnswentworth5mo

That was very useful for me, thank you!

2johnswentworth5mo

Follow-up question: can you give an example of a plausibly-most-fun non-conversation experience you've had?

4Mateusz Bagiński5mo

[REDACTED but you can DM if you want to know]

5Morpheus6mo

Over the last year, my median conversation was about as entertaining as yours. The top 10% conversations are fun-in-their-own-right at that moment already because my brain anticipates some form of long-term value (with the exception of cracking jokes). I don't know if all those conversations would count as "casual". They are as intellectually stimulating as the Task Master TV-show is funny. Conversation is more heavy-tailed than movies though. Long-term value includes: learning or teaching (learning some new technical thing that's usually not written down anywhere (podcasts tend to be better for that), getting a pointer about something to learn about, teaching something technical in the anticipation that the other person is actually going to do something with that knowledge, incorporating the generating function behind someone's virtues/wisdom), thinking out loud with someone else in the expectation that this might lead to an interesting idea, gossip, life stories (sometimes preventing you from harm from people/situations that can't be trusted, sometimes just illuminating parts of life you'd know less about). My most fun conversation still had me grinning 30 minutes afterward, and my heart rate at that point was still 10–20 beats higher than usual. My median conversations at parties over my entire life are probably less entertaining than your median ones. My bar for an interesting conversation also rose when I stumbled over the wider rationalist sphere. I remember two conversations from before that era where the main important information was essentially just "there are other smart people out there, and you can have interesting conversations with them where you can ask the questions you have etc.". One was at a networking event for startup founders, and the other was a Computer Science PhD student showing me his work and the university campus (same conversation that got my heart rate up).

4johnswentworth5mo

I'm returning to this thread to check a new hypothesis. For those who said top ~10% of conversations are high value: what's the felt experience during those conversations? In particular (this is a question about a specific hypothesis, please read it only after considering the first question in order to avoid anchoring): Tagging people who had useful answers previously and whose answers to this question I'd like to hear: @Selfmaker662 @Elizabeth @J Bostock @Mateusz Bagiński

4Mateusz Bagiński5mo

Part 1 Part 2

4J Bostock5mo

Spoilered to avoid anchoring:

4Raemon5mo

The qualia for me for conversations is usually not pronouncedly "a warm feeling in chest" (it is noticeably different from what I call "Deep/Meaningful Limerence" which I think you're pointing at). Three distinct flavors of good conversation:

1. alive, creative, magnetic vibrant conversation. (I think I might describe part of this as slightly warm chest, I don't quite remember, I haven't had it recently. But it's more the qualia of traditional excitement than "warm connection". I bet you have these conversations occasionally, or at least ever have, and they correlate more with obvious John values.)
2. slightly nice sitting-around-living-room or restaurant/bar or campfire vibes (shallow)
3. somewhat-more-nice sitting around living-room/campfire vibes where the conversation is sort of "deep", in a way that multiple people are talking about something either emotionally confusing, or psychologically fraught, or "meaning-making"-ish.

I expect #3 (less confidently than #1) to be somewhat obviously valuable to you in some circumstances regardless of qualia. But, it does have some particular qualia that's like (hrm, probably can't remember actual biological phenomenology right now), but, like, spacious, relaxed, I think there's maybe some kind of feeling in my chest but I don't have a good word for it.

#2... I think might have a very mild version of "warm feeling in chest". Or, I think it does feel warm but I think it's more distributed throughout my body. But I think #2 more importantly for me is like: "there is an actively (slightly) bad qualia to not-having-had-nice-livingroom-conversations lately" which is, like, feeling sort of blah, or just somewhat less vibrant. If I have something to be socially anxious about, lack of recent #2 makes it worse.

3Selfmaker6625mo

It’s different: sometimes it’s spacious calmness of being able to sit in silence together; sometimes warm feelings of seeing and being seen, when discussing something private with a good friend; or just listening to a really good story. IIRC I also counted dates as conversations back then, they have a different dynamic, where a lot of the pleasure is feeling a young beautiful woman being with me. This is a very particular feeling you have, and those differ a lot in where they appear for different people, how they feel and what they’re about. Not having seen other people’s answers I’d bet your hypothesis to be wrong.

4Vaniver6mo

Did you ever try Circling? I wonder some if there's a conversational context that's very "get to the interesting stuff" which would work better for you. (Or, even if it's boring, it might be because it's foregrounding relational aspects of the conversation which are much less central for you than they are for most people.)

2johnswentworth6mo

I have a few times, found it quite interesting, and would happily do it again. It feels like the sort of thing which is interesting mainly because I learned a lot, but marginal learnings would likely fall off quickly, and I don't know how interesting it would be after doing it a few more times.

4Johannes C. Mayer6mo

I wanted to say that for me it is the opposite, but reading the second half I have to say it's the same. I have definitely had the problem that I sometimes talked too long to somebody. E.g. multiple times I talked to a person for 8-14 hours without a break about various technical things, like compiler optimizations, CPU architectures and this kind of stuff, and it was really hard to stop. Also just solving problems in a conversation is very fun. The main reason I didn't do this a lot is that there are not that many people I know, actually basically zero right now (if you exclude LLMs), that I can have the kinds of conversations with that I like to have. It seems to be very dependent on the person. So I am quite confused why you say "but conversation just isn't a particularly fun medium". If it's anything like for me, then engaging with the right kind of people on the right kind of content is extremely fun. It seems like your model is confused because you say "conversations are not fun" when in fact in the space of possible conversations I expect there are many types of conversations that can be very fun, but you haven't mapped this space, while implicitly assuming that your map is complete. Probably there are also things besides technical conversations that you would find fun but that you simply don't know about, such as hardcore flirting in a very particular way. E.g. I like to talk to Grok in voice mode, in romantic mode, and then do some analysis of some topic (or rather that is what I just naturally do), and then Grok compliments my mind in ways that my mind likes, e.g. pointing out that I used a particular thinking pattern that is good or that I thought about this difficult thing at all, and then I am like "Ah yes that was actually good, and yes it seems like this is a difficult topic most people would not think about."

3samuelshadrach5mo

My life is less "fun" than it used to be because I've become more work-focussed. That being said, something I like is getting positive reception for ideas I'm otherwise guessing might receive negative reception. The first couple of times this happens is really nice, after that it becomes normal.

3Said Achmiz6mo

I’m confused about this anecdote. How else did the psychologist expect you (or any other kid) to behave? What else does one do when a conversation is over, other than “go back to doing what you were doing before / what you would be doing otherwise”…?

[-][anonymous]6mo103

I presume the psychologist expected John to actively seek out similar conversations. From the psychologist's perspective:

most kids would do that, but John didn't.
most of the kids who wouldn't do that would decline because of social anxiety/a lack of social skills/a hatred of social interactions etc, which is not the case for John; he seemed perfectly comfortable while partaking in such conversations.

Since John wasn't in either category, it probably struck the psychologist as odd.

2Said Achmiz6mo

I see, thanks. That makes sense. (At least, the reasoning makes sense, given the psychologist’s beliefs as you describe them; I have no idea if those beliefs are true or not.)

3CAC6mo

Do group conversations count? I would agree that the median one-on-one conversation for me is equivalent to something like a mediocre blogpost (though I think my right-tail is longer than yours, I'd say my favorite one-on-one conversations were about as fun as watching some of my favorite movies). But, in groups, my median shifts toward 80th percentile YouTube video (or maybe the average curated post here on LessWrong). It does feel like a wholly different activity, and might not be the answer you're looking for. Group conversations, for example, are in a way inherently less draining: you're not forced to either speak or actively listen for 100% of the time.

2johnswentworth6mo

Yes.

[-]johnswentworth2yΩ10277

Here's a meme I've been paying attention to lately, which I think is both just-barely fit enough to spread right now and very high-value to spread.

Meme part 1: a major problem with RLHF is that it directly selects for failure modes which humans find difficult to recognize, hiding problems, deception, etc. This problem generalizes to any sort of direct optimization against human feedback (e.g. just fine-tuning on feedback), optimization against feedback from something emulating a human (a la Constitutional AI or RLAIF), etc.

Many people will then respond: "Ok, but how on earth is one supposed to get an AI to do what one wants without optimizing against human feedback? Seems like we just have to bite that bullet and figure out how to deal with it." ... which brings us to meme part 2.

Meme part 2: We already have multiple methods to get AI to do what we want without any direct optimization against human feedback. The first and simplest is to just prompt a generative model trained solely for predictive accuracy, but that has limited power in practice. More recently, we've seen a much more powerful method: activation steering. Figure out which internal activation-patterns encode for the thing we want (via some kind of interpretability method), then directly edit those patterns.
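To make "directly edit those patterns" concrete, here is a toy sketch in plain Python. It assumes one particular way of finding the pattern (the difference of mean activations between inputs with and without the target property), which is just one possible choice of interpretability method, not something the post specifies. The linear "layer", the weights and all function names are hypothetical stand-ins for a real model's internals.

```python
def hidden(x, W):
    """Toy 'layer': hidden activations as a linear map of the input."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mean_vec(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def steering_vector(pos_inputs, neg_inputs, W):
    """Assumed method: mean activation with the property minus mean activation without it."""
    pos = mean_vec([hidden(x, W) for x in pos_inputs])
    neg = mean_vec([hidden(x, W) for x in neg_inputs])
    return [p - n for p, n in zip(pos, neg)]


def steer(h, v, alpha=1.0):
    """Edit activations directly by adding a scaled steering vector.

    No gradient step and no feedback signal is involved; the weights are untouched.
    """
    return [hi + alpha * vi for hi, vi in zip(h, v)]


W = [[1, 0], [0, 1]]                      # identity weights, for readability
v = steering_vector([[1, 0], [1, 1]],     # inputs that have the property
                    [[0, 0], [0, 1]], W)  # inputs that lack it
print(v)
print(steer(hidden([0, 2], W), v, alpha=2.0))
```

The point of the sketch is only the shape of the procedure: the edit happens at inference time on activations, and nothing is optimized against a human's judgment of the output.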

6TurnTrout2y

I agree that there's something nice about activation steering not optimizing the network relative to some other black-box feedback metric. (I, personally, feel less concerned by e.g. finetuning against some kind of feedback source; the bullet feels less jawbreaking to me, but maybe this isn't a crux.) (Medium confidence) FWIW, RLHF'd models (specifically, the LLAMA-2-chat series) seem substantially easier to activation-steer than do their base counterparts.

4Chris_Leong2y

What other methods fall into part 2?

3Johannes C. Mayer2y

This seems basically correct though it seems worth pointing out that even if we are able to do "Meme part 2" very very well, I expect we will still die because if you optimize hard enough to predict text well, with the right kind of architecture, the system will develop something like general intelligence simply because general intelligence is beneficial for predicting text correctly. E.g. being able to simulate the causal process that generated the text, i.e. the human, is a very complex task that would be useful if performed correctly. This is an argument Eliezer brought forth in some recent interviews. Seems to me like another meme that would be beneficial to spread more.

[-]johnswentworth4y260

Somebody should probably write a post explaining why RL from human feedback is actively harmful to avoiding AI doom. It's one thing when OpenAI does it, but when Anthropic thinks it's a good idea, clearly something has failed to be explained.

(I personally do not expect to get around to writing such a post soon, because I expect discussion around the post would take a fair bit of time and attention, and I am busy with other things for the next few weeks.)

81a3orn4y

I'd also be interested in someone doing this; I tend towards seeing it as good, but haven't seen a compilation of arguments for and against.

1[comment deleted]4y

[-]johnswentworth4y250

Here's an idea for a novel which I wish someone would write, but which I probably won't get around to soon.

The setting is slightly-surreal post-apocalyptic. Society collapsed from extremely potent memes. The story is episodic, with the characters travelling to a new place each chapter. In each place, they interact with people whose minds or culture have been subverted in a different way.

This provides a framework for exploring many of the different models of social dysfunction or rationality failures which are scattered around the rationalist blogosphere. For instance, Scott's piece on scissor statements could become a chapter in which the characters encounter a town at war over a scissor. More possible chapters (to illustrate the idea):

A town of people who insist that the sky is green, and avoid evidence to the contrary really hard, to the point of absolutely refusing to ever look up on a clear day (a refusal which they consider morally virtuous). Also they clearly know exactly which observations would show a blue sky, since they avoid exactly those (similar to the dragon-in-the-garage story).
Middle management of a mazy company continues to have meetings and track (completely fabri

... (read more)

3niplav4y

* A town of anti-inductivists (if something has never happened before, it's more likely to happen in the future). Show the basic conundrum ("Q: Why can't you just use induction? A: Because anti-induction has never worked before!").
* A town where nearly all people are hooked to maximally attention grabbing & keeping systems (maybe several of those, keeping people occupied in loops).

[-]johnswentworth3y231

I'm writing a 1-year update for The Plan. Any particular questions people would like to see me answer in there?

7Gunnar_Zarncke3y

I had a look at The Plan and noticed something I didn't notice before: You do not talk about people and organization in the plan. I probably wouldn't have noticed if I hadn't started a project too, and needed to think about it. Google seems to think that people and team function play a big role. Maybe your focus in that post wasn't on people, but I would be interested in your thoughts on that too: What role did people and organization play in the plan and its implementation? What worked, and what should be done better next time?

4Erik Jenner3y

* What's the specific most-important-according-to-you progress that you (or other people) have made on your agenda? New theorems, definitions, conceptual insights, ...
* Any changes to the high-level plan (becoming less confused about agency, then ambitious value learning)? Any changes to how you want to become less confused (e.g. are you mostly thinking about abstractions, selection theorems, something new?)
* What are the major parts of remaining deconfusion work (to the extent to which you have guesses)? E.g. is it mostly about understanding abstractions better, or mostly about how to apply an understanding of abstractions to other problems (say, what it means for a program to have a "subagent"), or something else? Does the most difficult part feel more conceptual ("what even is an agent?") or will the key challenges be more practical concerns ("finding agents currently takes exponential time")?
* Specifically for understanding abstractions, what do you see as important open problems?

[-]johnswentworth2y220

Corrigibility proposal. Status: passed my quick intuitive checks, I want to know if anyone else immediately sees a major failure mode before I invest more time into carefully thinking it through.

Setup: shutdown problem. Two timesteps, shutdown button will be either pressed or not-pressed at second timestep, we want agent to optimize for one of two different utility functions depending on whether button is pressed. Main thing we're trying to solve here is the "can't do this with a utility maximizer" barrier from the old MIRI work; we're not necessarily trying to solve parts like "what utility function incentivizes shutting down nicely".

Proposal: agent consists of two subagents with veto power. Subagent 1 maximizes E[u1|do(press)], subagent 2 maximizes E[u2|do(no press)]. Current guess about what this does:

The two subagents form a market and equilibrate, at which point the system has coherent probabilities and a coherent utility function over everything.
Behaviorally: in the first timestep, the agent will mostly maintain optionality, since both subagents need to expect to do well (better than whatever the veto-baseline is) in their worlds. The subagents will bet all of their wealth ag

... (read more)
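A toy sketch of the veto structure in the first timestep, under strong simplifying assumptions: a small finite action set, and each subagent's expected utility under its own counterfactual (E[u1|do(press)] and E[u2|do(no press)]) supplied as a lookup table. The action names and numbers are made up for illustration, and the bargaining that picks among non-vetoed actions is not modeled.

```python
def non_vetoed_actions(actions, eu1, eu2, default):
    """Actions that neither subagent vetoes.

    eu1[a] stands for E[u1 | do(press), a], eu2[a] for E[u2 | do(no press), a].
    A subagent vetoes any action it expects to be worse than the default.
    """
    return [a for a in actions
            if eu1[a] >= eu1[default] and eu2[a] >= eu2[default]]


actions = ["wait", "keep_options", "grab_for_u1"]
eu1 = {"wait": 0.0, "keep_options": 1.0, "grab_for_u1": 3.0}
eu2 = {"wait": 0.0, "keep_options": 1.0, "grab_for_u1": -2.0}

# "grab_for_u1" is great in the button-pressed world but is vetoed by subagent 2.
print(non_vetoed_actions(actions, eu1, eu2, default="wait"))
```

In this toy case only the default and the option-preserving action survive, which matches the guess that the composite agent mostly maintains optionality before the button state is revealed.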

8Thane Ruthenis2y

I don't necessarily see how that prevents the AI from destroying the button and murdering its operators (and then everyone else). Neither subagent inherently cares; both assume it already has been pressed or hasn't been. Which, yes, means neither will deliberately plot to interact with it for reasons downstream of this setup; but also, neither will care to protect it from being destroyed as the side-effect of some other activity. And if u1 and u2 were perfectly aligned, that'd be fine. But the reason we're using the corrigibility setup to begin with is because we're not confident in them, right? So it needs to work in situations where utility functions, u1 in particular, recommend bad stuff. And in that case, either Subagent 2 will have enough bargaining power to prevent Subagent 1 from ever doing anything, or it won't, meaning Subagent 1 will be able to have some influence on the world. At which point the planet-destroying death rays argument applies: the most minuscule amount of superintelligent optimization not carefully shaped to avoid killing everyone will kill everyone. Like, imagine that Subagent 1 is straight-up omnicidal out of the gates, and Subagent 2 is perfectly aligned to the shutdown task. That's a situation in which this setup should work, right? And what we'd want is for Subagent 2 to dominate completely. But Subagent 2's bargaining power doesn't scale with Subagent 1's misalignment, so if the setup doesn't always give it complete veto power, Subagent 1 would be allowed to do small stuff like "release the hyperviral worship-the-machine-god memagent" (as bargained down by Subagent 2 from its preferred "release the grey goo"). Maybe I'm miscalibrated as to how you're imagining the counterfactuals to work here. But the way I see it, even if the AI aims to take actions that do similarly well in both worlds, that's not necessarily survivable for us? Especially if they disagree so badly they have to compromise on something that both of them hate (impo

5johnswentworth2y

Yeah, the thing where the operators/button are destroyed as a side effect, without the AI actively optimizing for that, isn't something this proposal is trying to solve (yet). The proposal is aimed at a relatively narrow problem. One thing which this proposal is trying to solve is the barrier found in MIRI's old work: if we try to avoid button-manipulation by making the agent indifferent to the button state (and otherwise maximize U1 before button is pressed and U2 after), and the agent builds a child prior to the button being pressed, then the agent won't construct the child to respect shutdown commands; it will just construct the child to maximize U1. (More generally, this proposal is generally trying to get past the hardness results from MIRI's work, which themselves were aimed at a very simplified version of the problem.) Key point here: whatever actions the composite agent takes prior to the button revelation, must be a pareto improvement over the (implicit) default action. Otherwise, one of the two subagents would veto, resulting in the default action. It does matter a lot what the default action is. More precisely, the analogy would be: we deploy two AIs simultaneously, one which builds a utopia and expects that a certain button will be pressed with probability 1 and has nothing causally upstream of it, the other of which is evil and expects the same button will be pressed with probability 0 and has nothing causally upstream of it. The button plays a very important role: the two AIs will bet all their influence on that button-press, each confident that they'll win, and neither has any incentive at all, or even any ability at all (under their own model), to manipulate the button. The main way that this is different from the original proposal is that there's not a built-in default option which is taken if one or the other AI vetos prior to the button press. Which does make things importantly worse, and I agree that is another important piece to be sorted

4tailcalled2y

More precisely, "do(no press)" means something like "you construct an alternate model of physics where there's an unstoppable force pushing back against any attempt to push the button", right? As in, if someone presses the button then it will "mysteriously" seem to be stuck and unpressable. And then subagent 2 believes we live in that world? And "do(press)" presumably means something like "you construct an alternate model of the universe where some mysterious force has suddenly pressed the button". Seems like they would immediately want to try to press the button to settle their disagreement? If it can be pressed, then that disproves the "do(no press)" model, to which subagent 2 has fully committed.

3johnswentworth2y

Correct reasoning, but not quite the right notion of do(). "do(no press)" would mean that the button just acts like a completely normal button governed by completely normal physics, right up until the official time at which the button state is to be recorded for the official button-press random variable. And at that exact moment, the button magically jumps into one particular state (either pressed or not-pressed), in a way which is not-at-all downstream of any usual physics (i.e. doesn't involve any balancing of previously-present forces or anything like that). One way to see that the do() operator has to do something-like-this is that, if there's a variable in a causal model which has been do()-operated to disconnect all parents (but still has some entropy), then the only way to gain evidence about the state of that variable is to look at things causally downstream of it, not things upstream of it.
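The last point can be checked numerically on a three-variable chain U -> B -> D, where B plays the role of the button. This is a minimal sketch with made-up probabilities (0.9/0.1 copying noise, a fair coin for the intervened button); the function names are hypothetical.

```python
from itertools import product


def joint(intervened):
    """Joint distribution over (U, B, D) for the chain U -> B -> D."""
    dist = {}
    for u, b, d in product([0, 1], repeat=3):
        p_u = 0.5
        if intervened:
            p_b = 0.5                      # do(): B ignores U but still has entropy
        else:
            p_b = 0.9 if b == u else 0.1   # normally B mostly copies U
        p_d = 0.9 if d == b else 0.1       # D mostly copies B either way
        dist[(u, b, d)] = p_u * p_b * p_d
    return dist


def prob_b1_given(dist, index, value):
    """P(B = 1 | variable at `index` equals `value`); index 0 is U, index 2 is D."""
    num = sum(p for k, p in dist.items() if k[1] == 1 and k[index] == value)
    den = sum(p for k, p in dist.items() if k[index] == value)
    return num / den


cut = joint(intervened=True)
print(prob_b1_given(cut, 0, 1))   # upstream observation: no evidence about B
print(prob_b1_given(cut, 2, 1))   # downstream observation: strong evidence about B
```

With the parents cut, observing U leaves the probability of B at its prior of 0.5, while observing D moves it to about 0.9. In the un-intervened model, observing U would also move it.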

4tailcalled2y

I think we're not disagreeing on the meaning of do (just slightly different state of explanation), I just hadn't realized the extent to which you intended to rely on there being "Two timesteps". (I just meant the forces as a way of describing the jump to a specific position. That is, "mysterious forces" in contrast to a perfectly ordinary explanation for why it went to a position, such as "a guard stabs anybody who tries to press the button", rather than in contrast to "the button just magically stays in place".) I now think the biggest flaw in your idea is that it literally cannot generalize to anything that doesn't involve two timesteps.

2Dagon2y

[ not that deep on the background assumptions, so maybe not the feedback you're looking for. Feel free to ignore if this is on the wrong dimensions. ] I'm not sure why either subagent would contract away whatever influence it had over the button-press. This is probably because I don't understand wealth and capital in the model of your "Why not subagents" post. That seemed to be about agreement not to veto, in order to bypass some path-dependency of compromise improvements. In the subagent-world where all value is dependent on the button, this power would not be given up. I'm also a bit skeptical of enforced ignorance of a future probability. I'm unsure it's possible to have a rational superintelligent (sub)agent that is prevented from knowing it has influence over a future event that definitely affects it.

2johnswentworth2y

On the agents' own models, neither has any influence at all over the button-press, because each is operating under a model in which the button-press has been counterfacted-upon.

[-]johnswentworth5y222

Post which someone should write (but I probably won't get to soon): there is a lot of potential value in earning-to-give EA's deeply studying the fields to which they donate. Two underlying ideas here:

The key idea of knowledge bottlenecks is that one cannot distinguish real expertise from fake expertise without sufficient expertise oneself. For instance, it takes a fair bit of understanding of AI X-risk to realize that "open-source AI" is not an obviously-net-useful strategy. Deeper study of the topic yields more such insights into which approaches are probably more (or less) useful to fund. Without any expertise, one is likely to be misled by arguments which are optimized (whether intentionally or via selection) to sound good to the layperson.

That takes us to the pareto frontier argument. If one learns enough/earns enough that nobody else has both learned and earned more, then there are potentially opportunities which nobody else has both the knowledge to recognize and the resources to fund. Generalized efficient markets (in EA-giving) are ther... (read more)

[-]johnswentworth5y220

Below is a graph from T-mobile's 2016 annual report (on the second page). Does anything seem interesting/unusual about it?

I'll give some space to consider before spoiling it.

...

Answer: that is not a graph of those numbers. Some clever person took the numbers, and stuck them as labels on a completely unrelated graph.

Yes, that is a thing which actually happened. In the annual report of an S&P 500 company. And apparently management considered this gambit successful, because the 2017 annual report doubled down on the trick and made it even more egregious: they added 2012 and 2017 numbers, which are even more obviously not on an accelerating growth path if you actually graph them. The numbers are on a very-clearly-decelerating growth path.

Now, obviously this is a cute example, a warning to be on alert when consuming information. But I think it prompts a more interesting question: why did such a ridiculous gambit seem like a good idea in the first place? Who is this supposed to fool, and to what end?

This certainly shouldn't fool any serious investment analyst. They'll all have their own spreadsheets and graphs forecasting T-mobile's growth. Unless T-mobile's management deeply ... (read more)

[-]johnswentworth1y*217

Basically every time a new model is released by a major lab, I hear from at least one person (not always the same person) that it's a big step forward in programming capability/usefulness. And then David gives it a try, and it works qualitatively the same as everything else: great as a substitute for stack overflow, can do some transpilation if you don't mind generating kinda crap code and needing to do a bunch of bug fixes, and somewhere between useless and actively harmful on anything even remotely complicated.

It would be nice if there were someone who tries out every new model's coding capabilities shortly after they come out, reviews it, and gives reviews with a decent chance of actually matching David's or my experience using the thing (90% of which will be "not much change") rather than getting all excited every single damn time. But also, to be a useful signal, they still need to actually get excited when there's an actually significant change. Anybody know of such a source?

EDIT-TO-ADD: David has a comment below with a couple examples of coding tasks.

[-]habryka1y1812

My guess is neither of you is very good at using them, and getting value out of them somewhat scales with skill.

Models can easily replace on the order of 50% of my coding work these days, and if I have any major task, my guess is I quite reliably get 20%-30% productivity improvements out of them. It does take time to figure out which things they are good at, and how to prompt them.

9Neil1y

I think you're right, but I rarely hear this take. Probably because "good at both coding and LLMs" is a light tail end of the distribution, and most of the relative value of LLMs in code is located at the other, much heavier end of "not good at coding" or even "good at neither coding nor LLMs". (Speaking as someone who didn't even code until LLMs made it trivially easy, I probably got more relative value than even you.)

7David Lorell1y

Sounds plausible. Is that 50% of coding work that the LLMs replace of a particular sort, and the other 50% a distinctly different sort?

4Johannes C. Mayer1y

Note this 50% likely only holds if you are using a mainstream language. For some non-mainstream languages I have gotten responses that were really unbelievably bad. Things like "the name of this variable wrong" which literally could never be the problem (it was a valid identifier). And similarly, if you are trying to encode novel concepts, it's very different from gluing together libraries, or implementing standard well known tasks, which I would guess is what habryka is mostly doing (not that this is a bad thing to do).

[-]David Lorell1y134

I do use LLMs for coding assistance every time I code now, and I have in fact noticed improvements in the coding abilities of the new models, but I basically endorse this. I mostly make small asks of the sort that sifting through docs or stack-overflow would normally answer. When I feel tempted to make big asks of the models, I end up spending more time trying to get the LLMs to get the bugs out than I'd have spent writing it all myself, and having the LLM produce code which is "close but not quite and possibly buggy and possibly subtly so" that I then have to understand and debug could maybe save time but I haven't tried because it is more annoying than just doing it myself.

If someone has experience using LLMs to substantially accelerate things of a similar difficulty/flavor to transpilation of a high-level torch module into a functional JITable form in JAX which produces numerically close outputs, or implementation of a JAX/numpy based renderer of a traversable grid of lines borrowing only the window logic from, for example, pyglet (no GLSL calls, rasterize from scratch,) with consistent screen-space pixel width and fade-on-distance logic, I'd be interested in seeing how you do y... (read more)

5Nathan Helm-Burger1y

I find them quite useful despite being buggy. I spend about 40% of my time debugging model code, 50% writing my own code, and 10% prompting. Having a planning discussion first with s3.6, and asking it to write code only after 5 or more exchanges works a lot better. Also helpful is asking for lots of unit tests along the way to confirm things are working as you expect.

[-]Jacob Pfau1y113

Two guesses on what's going on with your experiences:

You're asking for code which involves uncommon mathematics/statistics. In this case, progress on scicodebench is probably relevant, and it indeed shows remarkably slow improvement. (Many reasons for this, one relatively easy thing to try is to break down the task, forcing the model to write down the appropriate formal reasoning before coding anything. LMs are stubborn about not doing CoT for coding, even when it's obviously appropriate IME)
You are underspecifying your tasks (and maybe your questions are more niche than average), or otherwise prompting poorly, in a way which a human could handle but models are worse at. In this case sitting down with someone doing similar tasks but getting more use out of LMs would likely help.

[-]kave1y162

In this case sitting down with someone doing similar tasks but getting more use out of LMs would likely help.

I would contribute to a bounty for y'all to do this. I would like to know whether the slow progress is prompting-induced or not.

6johnswentworth8mo

We did end up doing a version of this test. A problem came up in the course of our work which we wanted an LLM to solve (specifically, refactoring some numerical code to be more memory efficient). We brought in Ray, and Ray eventually concluded that the LLM was indeed bad at this, and it indeed seemed like our day-to-day problems were apparently of a harder-for-LLMs sort than he typically ran into in his day-to-day.

6Raemon8mo

A thing unclear from the interaction: it had seemed towards the end that "build a profile to figure out where the bottleneck is" was one of the steps towards figuring out the problem, and that the LLM was (or might have been) better at that part. And, maybe models couldn't solve your entire problem wholesale but there was still potential skill in identifying factorable pieces that were better fits for models.

4kave8mo

Interesting! Two yet more interesting versions of the test: * Someone who currently gets use from LLMs writing more memory-efficient code, though maybe this is kind of question-begging * Someone who currently gets use from LLMs, and also is pretty familiar with trying to improve the memory efficiency of their code (which maybe is Ray, idk)

7Johannes C. Mayer1y

Maybe you include this in "stack overflow substitute", but the main thing I use LLMs for is to understand well known technical things. The workflow is: 1) I am interested in understanding something, e.g. how a multiplexed barrel bit shifter works. 2) I ask the LLM to explain the concept. 3) Based on the initial response I create separate conversation branches with questions I have (to save money and have the context be closer. Didn't evaluate if this actually makes the LLM better.). 4) Once I think I understood the concept or part of the concept I explain it to GPT. (Really I do this all the time during the entire process.) 5) The LLM (hopefully) corrects me if I am wrong (it seems it detects mistakes more often than not). The last part of the conversation can then look like this: I had probably ~200,000 words worth of conversation with LLMs, mainly in this format. I am not sure what next leap you are talking about. But I intuit based on some observations that GPT-4o is much better for this than GPT-3 (you might talk about more recent "leaps"). (Didn't test o1 extensively because it's so expensive).

5Aprillion9mo

Have you tried to make a mistake in your understanding on purpose to test out whether it would correct you or agree with you even when you'd get it wrong? (and if yes, was it "a few times" or "statistically significant" kinda test, please?)

3Johannes C. Mayer8mo

Why don't you run the test yourself? Seems very easy. Yes, it does catch me when I am saying wrong things quite often. It also quite often says things that are not correct and I correct it, and if I am right it usually agrees immediately.

1Aprillion8mo

Interesting - the first part of the response seems to suggest that it looked like I was trying to understand more about LLMs... Sorry for confusion, I wanted to clarify an aspect of your workflow that was puzzling to me. I think I got all info for what I was asking about, thanks! FWIW, if the question was an expression of actual interest and not a snarky suggestion, my experience with chatbots has been positive for brainstorming, dictionary "search", rubber ducking, description of common sense (or even niche) topics, but disappointing for anything that requires application of common sense. For programming, one- or few-liner autocomplete is fine for me - then it's me doing the judgement, half of the suggestions are completely useless, half are fine, and the third half look fine at first before I realise I needed the second most obvious thing this time.. but it can save time for the repeating part of almost-repeating stuff. For multi file editing, I find it worse than useless when it feels like doing code review after a psychopath pretending to do programming (AFAICT all models can explain everything most stuff correctly and then write the wrong code anyway .. I don't find it useful when it tries to apologize later if I point it out or to pre-doubt itself in CoT in 7 paragraphs and then do it wrong anyway) - I like to imagine as if it was trained on all code from GH PRs - both before and after the bug fix... or as if it was bored, so it's trying to insert drama into a novel about my stupid programming task, when the second chapter will be about heroic AGI firefighting the shit written by previous dumb LLMs...

2Johannes C. Mayer8mo

I don't use it to write code, or really anything. Rather I find it useful to converse with it. My experience is also that half is wrong and that it makes many dumb mistakes. But doing the conversation is still extremely valuable, because GPT often makes me aware of existing ideas that I don't know. Also like you say it can get many things right, and then later get them wrong. That getting right part is what's useful to me. The part where I tell it to write all my code is just not a thing I do. Usually I just have it write snippets, and it seems pretty good at that. Overall I am like "Look there are so many useful things that GPT tells me and helps me think about simply by having a conversation". Then somebody else says "But look it gets so many things wrong. Even quite basic things." And I am like "Yes, but the useful things are still useful enough that overall it's totally worth it." Maybe for your use case try codex.

6Stephen McAleese1y

One thing I've noticed is that current models like Claude 3.5 Sonnet can now generate non-trivial 100-line programs like small games that work in one shot and don't have any syntax or logical errors. I don't think that was possible with earlier models like GPT-3.5.

6David Lorell1y

My impression is that they are getting consistently better at coding tasks of a kind that would show up in the curriculum of an undergrad CS class, but much more slowly improving at nonstandard or technical tasks.

6jacquesthibs1y

I'd be down to do this. Specifically, I want to do this, but I want to see if the models are qualitatively better at alignment research tasks. In general, what I'm seeing is that there is no big jump with o1 Pro. However, it is possibly getting closer to one-shotting a website based on a screenshot and some details about how the user likes their backend setup. In the case of math, it might be a bigger jump (especially if you pair it well with Sonnet).

8jacquesthibs1y

Regarding coding in general, I basically only prompt programme these days. I only bother editing the actual code when I notice a persistent bug that the models are unable to fix after multiple iterations. I don't know jackshit about web development and have been making progress on a dashboard for alignment research with very little effort. Very easy to build new projects quickly. The difficulty comes when there is a lot of complexity in the code. It's still valuable to understand how high-level things work and low-level things the model will fail to proactively implement.

3Aprillion9mo

While Carl Brown said (a few times) he doesn't want to do more youtube videos for every new disappointing AI release, so far he seems to be keeping tabs on them in the newsletter just fine - https://internetofbugs.beehiiv.com/ ...I am quite confident that if anything actually started to work, he would comment on it, so even if he won't say much about any future incremental improvements, it might be a good resource to subscribe to for getting better signal - if Carl will get enthusiastic about AI coding assistants, it will be worth paying attention.

[-]johnswentworth6mo195

How can biochemical interventions be spatially localized, and why is that problem important?

High vs low voltage has very different semantics at different places on a computer chip. In one spot, a high voltage might indicate a number is odd rather than even. In another spot, a high voltage might indicate a number is positive rather than negative. In another spot, it might indicate a jump instruction rather than an add.

Likewise, the same chemical species have very different semantics at different places in the human body. For example, high serotonin concentration along the digestive tract is a signal to digest, whereas high serotonin concentration in various parts of the brain signals... uh... other stuff. Similarly, acetylcholine is used as a neurotransmitter both at neuromuscular junctions and in the brain, and these have different semantics. More generally, IIUC neurotransmitters like dopamine, norepinephrine, or serotonin are released by neurons originating at multiple anatomically distinct little sub-organs in the brain. Each sub-organ projects to different places, and the same neurotransmitter probably has different semantics when different sub-organs project to different targe... (read more)

[-]johnswentworth7mo183

It feels like unstructured play makes people better/stronger in a way that structured play doesn't.

What do I mean? Unstructured play is the sort of stuff I used to do with my best friend in high school:

unscrewing all the cabinet doors in my parents' house, turning them upside down and/or backwards, then screwing them back on
jumping in and/or out of a (relatively slowly) moving car
making a survey and running it on people at the mall
covering pool noodles with glow-in-the-dark paint, then having pool noodle sword fights with them at night while the paint is still wet, so we can tell who's winning by who's glowing more

In contrast, structured play is more like board games or escape rooms or sports. It has fixed rules. (Something like making and running a survey can be structured play or unstructured play or not play at all, depending on the attitude with which one approaches it. Do we treat it as a fun thing whose bounds can be changed at any time?)

I'm not quite sure why it feels like unstructured play makes people better/stronger, and I'd be curious to hear other peoples' thoughts on the question. I'm going to write some of mine below, but maybe don't look at them yet if you want to an... (read more)

6Thane Ruthenis7mo

(Written before reading the second part of the OP.) I don't really share that feeling[1]. But if I conditioned on that being true and then produced an answer: Obviously because it trains research taste. Or, well, the skills in that cluster. If you're free to invent/modify the rules of the game at any point, then if you're to have fun, you need to be good at figuring out what rules would improve the experience for you/everyone, and what ideas would detract from it. You're simultaneously acting as a designer and as a player. And there's also the element of training your common-sense/world-modeling skills: what games would turn out fun and safe in the real world, and which ones seem fun in your imagination, but would end up boring due to messy realities or result in bodily harm. By contrast, structured play enforces a paradigm upon you and only asks you to problem-solve within it. It trains domain-specific skills, whereas unstructured play is "interdisciplinary", in that you can integrate anything in your reach into it. More broadly: when choosing between different unstructured plays, you're navigating a very-high-dimensional space of possible games, and (1) that means there's simply a richer diversity of possible games you can engage in, which means a richer diversity of skills you can learn, (2) getting good at navigating that space is a useful skill in itself. Structured plays, on the other hand, present for choice a discrete set of options pre-computed to you by others. Unstructured play would also be more taxing on real-time fluid-intelligence problem-solving. Inferring the rules (if they've been introduced/changed by someone else), figuring out how to navigate them on the spot, etc. What's the sense of "growing better/stronger" you're using here? Fleshing that out might make the answer obvious. 1. ^ Not in the sense that I think this statement is wrong, but in that I don't have the intuition that it's true.

4tailcalled7mo

My guess would be unstructured play develops more material skills and structured play develops more social skills.

[-]johnswentworth3mo150

One thing we've been working on lately is finding natural latents in real datasets. Looking for natural latents between pairs of variables with only a few values each is relatively easy in practice with the math we have at this point. But that doesn't turn up much in excel-style datasets, and one wouldn't particularly expect it to turn up much in such datasets. Intuitively, it seems like more "distributed" latents are more practically relevant for typical excel-style datasets - i.e. latents for which many different observables each yield some weak information.

Here's one operationalization, which runs into some cute math/numerical algorithm problems for which I have a working solution but not a very satisfying solution. Maybe you enjoy those sorts of problems and will want to tackle them!

Setup and Math

Assume we have (categorical) observable variables $X_{1}, . . ., X_{m}$ and a latent variable $Λ$ . We'll make two assumptions about the form of the distribution:

Assumption 1: $P [X | Λ]$ has exponential form with all $X_{i}$ independent given $Λ$ . I.e. $P [X | Λ] = \prod_{i} \frac{1}{Z_{| Λ}^{i} (λ)} e^{λ^{T} f_{i} (x_{i})} = \frac{1}{Z_{| Λ} (λ)} e^{λ^{T} \sum_{i} f_{i} (x_{i})}$ .
Assumption 2: $P [Λ | X]$ is normal. I.e. $P [Λ | X] = \frac{1}{Z_{| X} (x)} e^{- \frac{1}{2}}$

... (read more)

4Lucius Bushnaq3mo

Not following this part. Can you elaborate? Some scattered thoughts: 1. Regarding convergence, to state the probably obvious, since $P [X_{i} | Λ] ∝ \sum_{x} e^{λ^{T} f_{i} (x_{i})}$ , $f_{i} (x_{i})$ at least has to go to zero for $x$ going to infinity. 2. In my field-theory-brained head, the analysis seems simpler to think about for continuous $x$ . So unless we're married to $x$ being discrete, I'd switch from $\sum_{x}$ to $\int dx$ . Then you can potentially use Gaussian integral and source-term tricks with the dependency on $x$ as well. If you haven't already, you might want to look at (quantum) field theory textbooks that describe how to calculate expectation values of observables over path integrals. This expression looks extremely like the kind of thing you'd usually want to calculate with Feynman diagrams, except I'm not sure whether the $f_{i} (x_{i})$ have the right form to allow us to power expand in $x_{i}$ and then shove the non-quadratic $x_{i}$ terms into source derivatives the way we usually would in perturbative quantum field theory. 3. If all else fails, you can probably do it numerically, lattice-QFT style, using techniques like hybrid Monte Carlo to sample points in the integral efficiently.[1] 1. ^ You can maybe also train a neural network to do the sampling.

4johnswentworth3mo

I'm assuming, for simplicity, that each $X_{i}$ has finitely many values. The sum on $X$ is then a sum on the cartesian product of the values of each $X_{i}$ , which we can rewrite in general as $\sum_{X} g (X) = (\prod_{i} n_{i}) E_{Q} [g (X)]$ , where $Q$ is the uniform distribution on $X$ and $n_{i}$ is the number of values of $X_{i}$ . That uniform distribution $Q$ is a product of uniform distributions over each individual $X_{i}$ , i.e. $\text{Uniform} [X] = \prod_{i} \text{Uniform} [X_{i}]$ , so the $X_{i}$ 's are all independent under $Q$ . So, under $Q$ , the $f_{i} (X_{i})$ 's are all independent. Did that clarify? Yup, it sure does look similar. One tricky point here is that we're trying to fit the $f$ 's to the data, so if going that route we'd need to pick some parametric form for $f$ . We'd want to pick a form which always converges, but also a form general enough that the fitting process doesn't drive $f$ to the edge of our admissible region.
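A quick numerical check of that rewrite, as a minimal sketch: the three small feature tables in `f` and the scalar latent value `lam` are made-up toy values (a scalar stands in for the vector $λ$ purely to keep the example short). It compares the brute-force sum over the cartesian product against the product of the number of configurations and the per-variable expectations under the uniform distribution.

```python
import math
from itertools import product

# Hypothetical toy setup: three categorical variables with 2, 3 and 4 values.
# f[i][x_i] plays the role of f_i(x_i); lam is a scalar stand-in for lambda.
f = [
    [0.0, 1.0],
    [-0.5, 0.2, 0.7],
    [0.3, -1.0, 0.0, 0.4],
]
lam = 0.8


def g(x):
    """exp(lam * sum_i f_i(x_i)) for one joint configuration x."""
    return math.exp(lam * sum(f_i[x_i] for f_i, x_i in zip(f, x)))


def brute_force_sum():
    """Sum g over every point of the cartesian product of value sets."""
    return sum(g(x) for x in product(*(range(len(f_i)) for f_i in f)))


def factored_sum():
    """(prod_i n_i) * E_Q[g(X)], using independence of the X_i under uniform Q."""
    n_total = math.prod(len(f_i) for f_i in f)
    per_variable_means = [
        sum(math.exp(lam * value) for value in f_i) / len(f_i) for f_i in f
    ]
    return n_total * math.prod(per_variable_means)
```

The brute-force version costs $\prod_{i} n_{i}$ evaluations, while the factored version costs only $\sum_{i} n_{i}$, which is the practical reason the independence under $Q$ matters.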

4Lucius Bushnaq3mo

Yes. Seems like a pretty strong assumption to me. Ah. In that case, are you sure you actually need $Z$ to do the model comparisons you want? Do you even really need to work with this specific functional form at all? As opposed to e.g. training a model $p (λ | X)$ to feed its output into $m$ tiny normalizing flow models which then try to reconstruct the original input data with conditional probability distributions $q_{i} (x_{i} | λ)$ ? To sketch out a little more what I mean, $p (λ | X)$ could e.g. be constructed as a parametrised function[1] which takes in the actual samples $X$ and returns the mean of a Gaussian, which $λ$ is then sampled from in turn[2]. The $q_{i} (x_{i} | λ)$ would be constructed using normalising flow networks[3], which take in $λ$ as well as uniform distributions over variables $z_{i}$ that have the same dimensionality as their $x_{i}$ . Since the networks are efficiently invertible, this gives you explicit representations of the conditional probabilities $q_{i} (x_{i} | λ)$ , which you can then fit to the actual data using KL-divergence. You'd get explicit representations for both $P [λ | X]$ and $P [X | λ]$ from this. 1. ^ Or ensemble of functions, if you want the mean of $λ$ to be something like $\sum_{i} f_{i} (x_{i})$ specifically. 2. ^ Using reparameterization to keep the sampling operation differentiable in the mean. 3. ^ If the dictionary of possible values of $X$ is small, you can also just use a more conventional ml setup which explicitly outputs probabilities for every possible value of every $x_{i}$ of course.

4johnswentworth3mo

That would be pretty reasonable, but it would make the model comparison part even harder. I do need P[X] (and therefore Z) for model comparison; this is the challenge which always comes up for Bayesian model comparison.

4Lucius Bushnaq3mo

Why does it make Bayesian model comparison harder? Wouldn't you get explicit predicted probabilities for the data X from any two models you train this way? I guess you do need to sample from the Gaussian in λ a few times for each X and pass the result through the flow models, but that shouldn't be too expensive.

4Alexander Gietelink Oldenziel3mo

For my interest, for these real-life latents with many different pieces contributing a small amount of information, do you reckon Eisenstat's Condensation / some unpublished work you mentioned at ODYSSEY would be the right framework here?

2johnswentworth3mo

Sort of. Condensation as-written requires what David and I call "strong redundancy", i.e. the latent must be determinable from any one observable downstream, which is the opposite of "small amount of information from each individual observable". But it's pretty easy to bridge between the two mathematically by glomming together multiple observables into one, which is usually how David and I think about it. The way you'd use this is: * Use the sort of machinery above to find a latent which is weakly loaded on many different observables. * Check how well that latent satisfies redundancy over some subset of the observables. * If we can find disjoint subsets of observables (any disjoint subsets) such that the latent can be determined reasonably well from any one of the subsets, then the machinery of natural latents/condensation kicks in to give us guarantees about universality of the latent.

3Lorxus3mo

No kidding? Did you get a sense of why the datasets I picked didn't really work for the purpose when I gave that a try? Entirely possible that you don't remember but it was a dataset of candidate exoplanets and an admittedly synthetic clustering tester set.

2johnswentworth3mo

Haven't been using that one, but I expect it would have very different results than the dataset we are using. That one would test very different things than we're currently trying to get feedback on; there's a lot more near-deterministic known structure in that one IIRC.

[-]johnswentworth3y153

I've heard various people recently talking about how all the hubbub about artists' work being used without permission to train AI makes it a good time to get regulations in place about use of data for training.

If you want to have a lot of counterfactual impact there, I think probably the highest-impact set of moves would be:

Figure out a technical solution to robustly tell whether a given image or text was used to train a given NN.
Bring that to the EA folks in DC. A robust technical test like that makes it pretty easy for them to attach a law/regulation to it. Without a technical test, much harder to make an actually-enforceable law/regulation.
In parallel, also open up a class-action lawsuit to directly sue companies using these models. Again, a technical solution to prove which data was actually used in training is the key piece here.

Model/generator behind this: given the active political salience, it probably wouldn't be too hard to get some kind of regulation implemented. But by-default it would end up being something mostly symbolic, easily circumvented, and/or unenforceable in practice. A robust technical component, plus (crucially) actually bringing that robust technical compo... (read more)

[-]johnswentworth4y*140

Suppose I have a binary function $f$ , with a million input bits and one output bit. The function is uniformly randomly chosen from all such functions - i.e. for each of the $2^{1000000}$ possible inputs $x$ , we flipped a coin to determine the output $f (x)$ for that particular input.

Now, suppose I know $f$ , and I know all but 50 of the input bits - i.e. I know 999950 of the input bits. How much information do I have about the output?

Answer: almost none. For almost all such functions, knowing 999950 input bits gives us $\sim \frac{1}{2^{50}}$ bits of information about the output. More generally, if the function has $n$ input bits and we know all but $k$ , then we have $o (\frac{1}{2^{k}})$ bits of information about the output. (That’s “little $o$ ” notation; it’s like big $O$ notation, but for things which are small rather than things which are large.) Our information drops off exponentially with the number of unknown bits.
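The exponential drop-off can be checked exactly for small $k$. The sketch below rests on one assumption about what "information about the output" means: for a fixed function, it is taken to be $1 - H (m / 2^{k})$ bits, where $m$ is the number of the $2^{k}$ consistent inputs that map to 1 and $H$ is the binary entropy. The function names are hypothetical; the code averages that quantity over the binomial distribution of $m$ for a uniformly random function.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a coin with probability p of heads."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def expected_info_bits(k):
    """Average information about the output with k input bits unknown.

    With k bits unknown there are n = 2**k consistent inputs, each mapped to
    an independent fair coin flip, so the number of 1 outputs is Binomial(n, 1/2).
    """
    n = 2 ** k
    total = 0.0
    for m in range(n + 1):
        weight = math.comb(n, m) / 2 ** n  # P[exactly m outputs are 1]
        total += weight * (1.0 - binary_entropy(m / n))
    return total


# Rescaled by 2**k, the values should settle toward a constant if the
# information really falls off like 1 / 2**k.
scaled = [2 ** k * expected_info_bits(k) for k in range(1, 9)]
```

Under this reading, `scaled` shrinks slowly toward roughly $\frac{1}{2 \ln 2} \approx 0.72$, i.e. the average information is on the order of $\frac{1}{2^{k}}$ bits.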

Proof Sketch

With $k$ input bits unknown, there are $2^{k}$ possible inputs. The output corresponding to each of those inputs is an independent coin flip, so we have $2^{k}$ independent coin flips. If $m$ of th... (read more)

4Dagon4y

o(1/2^k) doesn't vary with n - are you saying that it doesn't matter how big the input array is, the only determinant is the number of unknown bits, and the number of known bits is irrelevant? That would be quite interesting if so (though I have some question about how likely the function is to be truly random from an even distribution of such functions). One can enumerate all such 3-bit functions (8 different inputs, each input can return 0 or 1, so 256 functions, one per output-bit-pattern of the 8 possible inputs). But this doesn't seem to follow your formula - if you have 3 unknown bits, that should be 1/8 of a bit about the output, 2 for 1/4, and 1 unknown for 1/2 a bit about the output. But in fact, the distribution of functions includes both 0 and 1 output for every input pattern, so you actually have no predictive power for the output if you have ANY unknown bits.

4johnswentworth4y

Yes, that's correct. The claim is for almost all functions when the number of inputs is large. (Actually what we need is for 2^(# of unknown bits) to be large in order for the law of large numbers to kick in.) Even in the case of 3 unknown bits, we have 256 possible functions, and only 18 of those have less than 1/4 1's or more than 3/4 1's among their output bits.
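The count of 18 can be checked by brute force. This is a small sketch that enumerates every output pattern over the 8 inputs; the function name `count_unbalanced` is made up for illustration.

```python
from itertools import product


def count_unbalanced(num_unknown_bits=3):
    """Count functions whose outputs are < 1/4 or > 3/4 ones.

    A function on num_unknown_bits bits is identified with its table of
    2^num_unknown_bits output bits, so we enumerate all such tables.
    Returns (number of unbalanced functions, total number of functions).
    """
    n_inputs = 2 ** num_unknown_bits
    unbalanced = 0
    for outputs in product((0, 1), repeat=n_inputs):
        ones = sum(outputs)
        # Integer comparisons avoid floating-point edge cases at 1/4 and 3/4.
        if 4 * ones < n_inputs or 4 * ones > 3 * n_inputs:
            unbalanced += 1
    return unbalanced, 2 ** n_inputs


if __name__ == "__main__":
    print(count_unbalanced(3))  # (18, 256)
```

The 18 are the tables with 0, 1, 7, or 8 ones: 1 + 8 + 8 + 1.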

2Kenny4y

Little o is just a tighter bound. I don't know what you are referring to by your statement:

3johnswentworth4y

I'm not sure what context that link is assuming, but in an analysis context I typically see little o used in ways like e.g. "$f(x) = f(x_0) + \frac{df}{dx}\big|_{x_0} dx + o(dx^2)$". The interpretation is that, as $dx$ goes to 0, the $o(dx^2)$ terms all fall to zero at least quadratically (i.e. there is some $C$ such that $C\,dx^2$ upper bounds the $o(dx^2)$ term once $dx$ is sufficiently small). Usually I see engineers and physicists using this sort of notation when taking linear or quadratic approximations, e.g. for designing numerical algorithms.

[-]johnswentworth6y*130

I find it very helpful to get feedback on LW posts before I publish them, but it adds a lot of delay to the process. So, experiment: here's a link to a google doc with a post I plan to put up tomorrow. If anyone wants to give editorial feedback, that would be much appreciated - comments on the doc are open.

I'm mainly looking for comments on which things are confusing, parts which feel incomplete or slow or repetitive, and other writing-related things; substantive comments on the content should go on the actual post once it's up.

EDIT: it's up. Thank you to Stephen for comments; the post is better as a result.

[-]johnswentworth10mo*12-41

Here's a place where I feel like my models of romantic relationships are missing something, and I'd be interested to hear people's takes on what it might be.

Background claim: a majority of long-term monogamous, hetero relationships are sexually unsatisfying for the man after a decade or so. Evidence: Aella's data here and here are the most legible sources I have on hand; they tell a pretty clear story where sexual satisfaction is basically binary, and a bit more than half of men are unsatisfied in relationships of 10 years (and it keeps getting worse from there). This also fits with my general models of mating markets: women usually find the large majority of men sexually unattractive, most women eventually settle on a guy they don't find all that sexually attractive, so it should not be surprising if that relationship ends up with very little sex after a few years.

What doesn't make sense under my current models is why so many of these relationships persist. Why don't the men in question just leave? Obviously they might not have better relationship prospects, but they could just not have any relationship. The central question which my models don't have a compelling answer to is: wh... (read more)

[-]yams10mo2519

Ah, I think this just reads like you don't think of romantic relationships as having any value proposition beyond the sexual, other than those you listed (which are Things but not The Thing, where The Thing is some weird discursive milieu). Also the tone you used for describing the other Things is as though they are traps that convince one, incorrectly, to 'settle', rather than things that could actually plausibly outweigh sexual satisfaction.

Different people place different weight on sexual satisfaction (for a lot of different reasons, including age).

I'm mostly just trying to explain all the disagree votes. I think you'll get the most satisfying answer to your actual question by having a long chat with one of your asexual friends (as something like a control group, since the value of sex to them is always 0 anyway, so whatever their cause is for having romantic relationships is probably the kind of thing that you're looking for here).

6johnswentworth10mo

That's an excellent suggestion, thanks.

[-]abramdemski9mo184

There are a lot of replies here, so I'm not sure whether someone already mentioned this, but: I have heard anecdotally that homosexual men often have relationships which maintain the level of sex over the long term, while homosexual women often have long-term relationships which very gradually decline in frequency of sex, with barely any sex after many decades have passed (but still happily in a relationship).

This mainly argues against your model here:

This also fits with my general models of mating markets: women usually find the large majority of men sexually unattractive, most women eventually settle on a guy they don't find all that sexually attractive, so it should not be surprising if that relationship ends up with very little sex after a few years.

It suggests instead that female sex drive naturally falls off in long-term relationships in a way that male sex drive doesn't, with sexual attraction to a partner being a smaller factor.

4Garrett Baker9mo

Note: You can verify this is the case by filtering for male respondents with male partners and female respondents with female partners in the survey data

[-]DirectedEvolution10mo1615

“I'm skeptical of this one because female partners are typically notoriously high maintenance in money, attention, and emotional labor.”

Some people enjoy attending to their partner and find meaning in emotional labor. Housing’s a lot more expensive than gifts and dates. My partner and I go 50/50 on expenses and chores. Some people like having long-term relationships with emotional depth. You might want to try exploring out of your bubble, especially if you live in SF, and see what some normal people (ie non-rationalists) in long term relationships have to say about it.

[-]Elizabeth9mo*1514

female partners are typically notoriously high maintenance in money, attention, and emotional labor.

That's the stereotype, but men are the ones who die sooner if divorced, which suggests they're getting a lot out of marriage.

ETA: looked it up, divorced women die sooner as well, but the effect is smaller despite divorce having a bigger financial impact on women.

9johnswentworth9mo

Causality dubious, seems much more likely on priors that men who divorced are disproportionately those with Shit Going On in their lives. That said, it is pretty plausible on priors that they're getting a lot out of marriage.

[-]Garrett Baker10mo121

I will also note that Aella's relationships data is public, and has the following questions:

1. Your age? (rkkox57)
2. Which category fits you best? (4790ydl)
3. In a world where your partner was fully aware and deeply okay with it, how much would you be interested in having sexual/romantic experiences with people besides your partner? (ao3mcdk)
4. In a world where you were fully aware and deeply okay with it, how much would *your partner* be interested in having sexual/romantic experiences with people besides you? (wcq3vrx)
5. To get a little more specific, how long have you been in a relationship with this person? (wqx272y)
6. Which category fits your partner best? (u9jccbo)
7. Are you married to your partner? (pfqs9ad)
8. Do you have children with your partner? (qgjf1nu)
9. Have you or your partner ever cheated on each other? (hhf9b8h)
10. On average, over the last six months, about how often do you watch porn or consume erotic content for the purposes of arousal? (vnw3xxz)
11. How often do you and your partner have a fight? (x6jw4sp)
12. "It’s hard to imagine being happy without this relationship." (6u0bje)
13. "I have no secrets from my partner" (bgassjt)
14. "If my partner an

... (read more)

[-]Thane Ruthenis10mo125

I see two explanations: the boring wholesome one and the interesting cynical one.

The wholesome one is: You're underestimating how much other value the partner offers and how much the men care about the mostly-platonic friendship. I think that's definitely a factor that explains some of the effect, though I don't know how much.

The cynical one is: It's part of the template. Men feel that they are "supposed to" have wives past a certain point in their lives; that it's their role to act. Perhaps they even feel that they are "supposed to" have wives they hate, see the cliché boomer jokes.

They don't deviate from this template, because:

- It's just something that is largely Not Done. Plans such as "I shouldn't get married" or "I should get a divorce" aren't part of the hypothesis space they seriously consider.
- In the Fristonian humans-are-prediction-error-minimizers frame: being married is what the person expects, so their cognition ends up pointed towards completing the pattern, one way or another. As a (controversial) comparison, we can consider serial abuse victims, which seem to somehow self-select for abusive partners despite doing everything in their conscious power to avoid them.
- In your par

it's the mystery of love, John

[-]Elizabeth9mo*115

Their romantic partner offering lots of value in other ways. I'm skeptical of this one because female partners are typically notoriously high maintenance in money, attention, and emotional labor. Sure, she might be great in a lot of ways, but it's hard for that to add up enough to outweigh the usual costs.

Assuming arguendo this is true: if you care primarily about sex, hiring sex workers is orders of magnitude more efficient than marriage. Therefore the existence of a given marriage is evidence both sides get something out of it besides sex.

3Viliam5mo

If both partners have an income, then living together is usually cheaper than each of them living alone, and sex is just a bonus to that. How would sex workers be the cheaper alternative? Possibly true if one side has zero income.

1Cedar5mo

Making no claim about the actual value of each, but can't I counter your specific argument by saying that marriage is a socially enforced cartel for sex, and that if they could do so without being punished, a lot more men would rather get sex without getting married?

8Johannes C. Mayer9mo

Imagine a woman is in a romantic relationship with somebody else. Are they still so great a person that you would still enjoy hanging out with them as a friend? If not, that woman should not be your girlfriend. Friendship first. At least in my model romantic stuff should be stacked on top of platonic love.

7Lucius Bushnaq10mo

This data seems to be for sexual satisfaction rather than romantic satisfaction or general relationship satisfaction.

3johnswentworth10mo

Yes, the question is what value-proposition accounts for the romantic or general relationship satisfaction.

[-]Lucius Bushnaq10mo*3128

Relationship ... stuff?

I guess I feel kind of confused by the framing of the question. I don't have a model under which the sexual aspect of a long-term relationship typically makes up the bulk of its value to the participants. So, if a long-term relationship isn't doing well on that front, and yet both participants keep pursuing the relationship, my first guess would be that it's due to the value of everything that is not that. I wouldn't particularly expect any one thing to stick out here. Maybe they have a thing where they cuddle and watch the sunrise together while they talk about their problems. Maybe they have a shared passion for arthouse films. Maybe they have so much history and such a mutually integrated life with partitioned responsibilities that learning to live alone again would be a massive labour investment, practically and emotionally. Maybe they admire each other. Probably there's a mixture of many things like that going on. Love can be fed by many little sources.

So, this I suppose:

Their romantic partner offering lots of value in other ways. I'm skeptical of this one because female partners are typically notoriously high maintenance in money, attention, and emotional labor. Sure, she might be great in a lot of ways, but it's hard for that to add up enough to outweigh the usual costs.

I don't find it hard at all to see how that'd add up to something that vastly outweighs the costs, and this would be my starting guess for what's mainly going on in most long-term relationships of this type.

6johnswentworth9mo

Update 3 days later: apparently most people disagree strongly with

Most people in the comments so far emphasize some kind of mysterious "relationship stuff" as upside, but my actual main update here is that most commenters probably think the typical costs are far far lower than I imagined? Unsure, maybe the "relationship stuff" is really ridiculously high value.

So I guess it's time to get more concrete about the costs I had in mind:

- A quick google search says the male is primary or exclusive breadwinner in a majority of married couples. Ass-pull number: the monetary costs alone are probably ~50% higher living costs. (Not a factor of two higher, because the living costs of two people living together are much less than double the living costs of one person. Also I'm generally considering the no-kids case here; I don't feel as confused about couples with kids.)
- I was picturing an anxious attachment style as the typical female case (without kids). That's unpleasant on a day-to-day basis to begin with, and I expect a lack of sex tends to make it a lot worse.
- Eyeballing Aella's relationship survey data, a bit less than a third of respondents in 10-year relationships reported fighting multiple times a month or more. That was somewhat-but-not-dramatically less than I previously pictured. Frequent fighting is very prototypically the sort of thing I would expect to wipe out more-than-all of the value of a relationship, and I expect it to be disproportionately bad in relationships with little sex.
- Less legibly... conventional wisdom sure sounds like most married men find their wife net-stressful and unpleasant to be around a substantial portion of the time, especially in the unpleasant part of the hormonal cycle, and especially especially if they're not having much sex. For instance, there's a classic joke about a store salesman upselling a guy a truck, after upselling him a boat, after upselling him a tackle box, after [...] and the punchline is "No, he wasn't

[-]Lucius Bushnaq9mo*230

A quick google search says the male is primary or exclusive breadwinner in a majority of married couples. Ass-pull number: the monetary costs alone are probably ~50% higher living costs. (Not a factor of two higher, because the living costs of two people living together are much less than double the living costs of one person. Also I'm generally considering the no-kids case here; I don't feel as confused about couples with kids.)

But remember that you already conditioned on 'married couples without kids'. My guess would be that in the subset of man-woman married couples without kids, the man being the exclusive breadwinner is a lot less common than in the set of all man-woman married couples. These properties seem like they'd be heavily anti-correlated.

In the subset of man-woman married couples without kids that get along, I wouldn't be surprised if having a partner effectively works out to more money for both participants, because you've got two incomes, but less than 2x living expenses.

I was picturing an anxious attachment style as the typical female case (without kids). That's unpleasant on a day-to-day basis to begin with, and I expect a lack of sex tends to make it a lot worse.

I... (read more)

6johnswentworth9mo

This comment gave me the information I'm looking for, so I don't want to keep dragging people through it. Please don't feel obligated to reply further! That said, I did quickly look up some data on this bit: ... so I figured I'd drop it in the thread. When interpreting these numbers, bear in mind that many couples with no kids probably intend to have kids in the not-too-distant future, so the discrepancy shown between "no children" and 1+ children is probably somewhat smaller than the underlying discrepancy of interest (which pushes marginally more in favor of Lucius' guess).

2johnswentworth9mo

Big thank you for responding, this was very helpful.

7Zack_M_Davis9mo

Not sure how much this generalizes to everyone, but part of the story (for either the behavior or the pattern of responses to the question) might be that some people are ideologically attached to believing in love: that women and men need each other as a terminal value, rather than just instrumentally using each other for resources or sex. For myself, without having any particular empirical evidence or logical counterargument to offer, the entire premise of the question just feels sad and gross. It's like you're telling me you don't understand why people try to make ghosts happy. But I want ghosts to be happy.

[-]johnswentworth9mo113

That is useful, thanks.

Any suggestions for how I can better ask the question to get useful answers without apparently triggering so many people so much? In particular, if the answer is in fact "most men would be happier single but are ideologically attached to believing in love", then I want to be able to update accordingly. And if the answer is not that, then I want to update that most men would not be happier single. With the current discussion, most of what I've learned is that lots of people are triggered by the question, but that doesn't really tell me much about the underlying reality.

[-]Thane Ruthenis9mo150

Track record: My own cynical take seems to be doing better with regards to not triggering people (though it's admittedly less visible).

Any suggestions for how I can better ask the question to get useful answers without apparently triggering so many people so much?

First off, I'm kind of confused about how you didn't see this coming. There seems to be a major "missing mood" going on in your posts on the topic – and I speak as someone who is sorta-aromantic, considers the upsides of any potential romantic relationship to have a fairly low upper bound for himself^[1], and is very much willing to entertain the idea that a typical romantic relationship is a net-negative dumpster fire.

So, obvious-to-me advice: Keep a mental model of what topics are likely very sensitive and liable to trigger people, and put in tons of caveats and "yes, I know, this is very cynical, but it's my current understanding" and "I could totally be fundamentally mistaken here".

In particular, a generalization of a piece of advice from here has been living in my head rent-free for years (edited/adapted):

Tips For Talking About Your Beliefs On Sensitive Topics
You want to make it clear that they're just your current beliefs abo

... (read more)

5cousin_it5mo

I think it's net negative. Seen it with any combination of genders. The person who's less happy in the relationship stays due to force of habit, fear of the unknown, and the other person giving them a precise minimum of "crumbs" to make them stay. Even a good relationship can fall into this pattern slowly, with the other person believing all along that everything is fine. And when it finally breaks (often due to some random event breaking the suspension of disbelief), the formerly unhappy person is surprised how much better things become.

5Garrett Baker9mo

An effect I noticed: Going through Aella's correlation matrix (with poorly labeled columns sadly), a feature which strongly correlates with the length of a relationship is codependency.

Plotting question 20, "The long-term routines and structure of my life are intertwined with my partner's" (li0toxk), assuming that's what "codependency" refers to: the shaded region is a 95% posterior estimate for the mean of the distribution conditioned on the time-range (every 2 years) and cis-male respondents, with prior N(0,0.5).

Note also that codependency and sex satisfaction are basically uncorrelated.

This shouldn't be that surprising. Of course the longer two people are together the more their long term routines will be caught up with each other. But also this seems like a very reasonable candidate for why people will stick together even without a good sex life.

5Jonas Hallgren9mo

I thought I would give you another causal model based on neuroscience which might help. I think your models are missing a core biological mechanism: nervous system co-regulation. Most analyses of relationship value focus on measurable exchanges (sex, childcare, financial support), but overlook how humans are fundamentally regulatory beings. Our nervous systems evolved to stabilize through connection with others. When you share your life with someone, your biological systems become coupled. This creates several important values: 1. Your stress response systems synchronize and buffer each other. A partner's presence literally changes how your body processes stress hormones - creating measurable physiological benefits that affect everything from immune function to sleep quality. 2. Your capacity to process difficult emotions expands dramatically with someone who consistently shows up for you, even without words. 3. Your nervous system craves predictability. A long-term partner represents a known regulatory pattern that helps maintain baseline homeostasis - creating a biological "home base" that's deeply stabilizing. For many men, especially those with limited other sources of deep co-regulation, these benefits may outweigh sexual dissatisfaction. Consider how many men report feeling "at peace" at home despite minimal sexual connection - their nervous systems are receiving significant regulatory benefits. This also explains why leaving feels so threatening beyond just practical considerations. Disconnecting an integrated regulatory system that has developed over years registers in our survival-oriented brains as a fundamental threat. This isn't to suggest people should stay in unfulfilling relationships - rather, it helps explain why many do, and points to the importance of developing broader regulatory networks before making relationship transitions.

5Viliam10mo

This seems supported by the popular wisdom. Question is, how much this is about relationships and sex specifically, and how much it is just another instance of a more general "life is full of various frustrations" or "when people reach their goals, after some time they become unsatisfied again" i.e. hedonic treadmill. Is it? So, basically those women pretend to be more attracted than they are (to their partner, and probably also to themselves) in order to get married. Then they gradually stop pretending. But why is it so important to get married (or whatever was the goal of the original pretending), but then it is no longer important to keep the marriage happy? Is that because women get whatever they want even from an unhappy marriage, and divorces are unlikely? That doesn't feel like a sufficient explanation to me: divorces are quite frequent, and often initiated by women. I guess I am not sure what exactly is the women's utility function that this model assumes. Kids, not wanting to lose money in divorce, other value the partner provides, general lack of agency, hoping that the situation will magically improve... probably all of that together. Also, it seems to me that often both partners lose value on the dating market when they start taking their relationship for granted, stop trying hard, gain weight, stop doing interesting things, and generally get older. Even if the guy is frustrated, that doesn't automatically mean that entering the dating market again would make him happy. I imagine that many divorced men find out that an alternative to "sex once a month" could also be "sex never" (or "sex once a month, but it also takes a lot of time and effort and money").

8VivaLaPanda9mo

Worth noting that this pattern occurs among gay couples as well! (i.e. sexless long-term-relationship, where one party is unhappy about this). I think that conflict in desires/values is inherent in all relationships, and long-term-relationships have more room for conflict because they involve a closer/longer relationship. Sex drive is a major area where partners tend to diverge especially frequently (probably just for biological reasons in het couples). It's not obvious to me that sex in marriages needs much special explanation beyond the above. Unless of course the confusion is just "why don't people immediately end all relationships whenever their desires conflict with those of their counterparty".

6Viliam9mo

A general source of problems is that when people try to get a new partner, they try to be... more appealing than usual, in various ways. Which means that after the partner is secured, the behavior reverts to the norm, which is often a disappointment. One way how people try to impress their partners is that the one with lower sexual drive pretends to be more enthusiastic about sex than they actually are in long term. So the moment one partner goes "amazing, now I finally have someone who is happy to do X every day or week", the other partner goes "okay, now that the courtship phase is over, I guess I no longer have to do X every day or week". There are also specific excuses in heterosexual couples, like the girl pretending that she is actually super into doing sex whenever possible, it's just that she is too worried about accidental pregnancy or her reputation... and when these things finally get out of the way, it turns out that it was just an excuse. Perhaps the polyamorous people keep themselves in better shape, but I suspect that they have similar problems, only instead of "my partner no longer wants to do X" it is "my partner no longer wants to do X with me".

3Jasnah Kholin3mo

reading it is weird, because my model is somewhat the opposite - more women initiate divorce than men, and more women would gain from initiating it, and remain in relationships they should leave. women do more of the housework, more of the emotional labor (the point about women requiring emotional work wildly contradicts my model), and more of the maintaining of social ties (there are studies i read about that, and socialization reasons for that. women have more friends and more intimate friends, and a lot of men freeload on their gf's friendships and have no intimate relationship that is not romantic). it can be that both are true, and it's not hard to imagine two deeply incompatible people for whom breaking up would be net-positive for both. but this is not my actual model, nor is it what the statistics i encountered say - for example, that married men live longer, while married women live shorter. in my model, in a standard marriage, the wins-from-trade are distributed unevenly, and a lot of the time the man gains and the woman loses. and all that still holds marriages together is kids, and the remains of social stigma. and i know various statistics - about housework, happiness after the spouse dies, and life expectancy - that do not contradict this model. I also encountered a lot of anecdata that sounds like (not actual citation) "i broke up, this bf made my life so much worse" and even (not actual citation) "i divorced, and despite having to do all the work alone and not having the money he provided, i have more time, because he was so useless housework- and childcare-wise that he net-added work, and it's much easier without him." so, like, models where marriages are net-negative for men look so strange to me, and i don't know how to reconcile them with so much contradicting data.

3Garrett Baker10mo

An obvious answer you missed: Lacking a prenup, courts often rule in favor of the woman over the man in the case of a contested divorce.

-5LVSN10mo

[-]johnswentworth3yΩ7122

Consider two claims:

- Any system can be modeled as maximizing some utility function, therefore utility maximization is not a very useful model.
- Corrigibility is possible, but utility maximization is incompatible with corrigibility, therefore we need some non-utility-maximizer kind of agent to achieve corrigibility.

These two claims should probably not both be true! If any system can be modeled as maximizing a utility function, and it is possible to build a corrigible system, then naively the corrigible system can be modeled as maximizing a utility function.

I expect that many people's intuitive mental models around utility maximization boil down to "boo utility maximizer models", and they would therefore intuitively expect both the above claims to be true at first glance. But on examination, the probable-incompatibility is fairly obvious, so the two claims might make a useful test to notice when one is relying on yay/boo reasoning about utilities in an incoherent way.

9Steven Byrnes3y

FWIW I endorse the second claim when the utility function depends exclusively on the state of the world in the distant future, whereas I endorse the first claim when the utility function can depend on anything whatsoever (e.g. what actions I’m taking right this second). (details) I wish we had different terms for those two things. That might help with any alleged yay/boo reasoning. (When Eliezer talks about utility functions, he seems to assume that it depends exclusively on the state of the world in the distant future.)

5Johannes C. Mayer2y

Expected Utility Maximization is Not Enough

Consider a homomorphically encrypted computation running somewhere in the cloud. The computations correspond to running an AGI. Now from the outside, you can still model the AGI, based on how it behaves, as an expected utility maximizer, if you have a lot of observational data about the AGI (or at least let's take this as a reasonable assumption). No matter how closely you look at the computations, you will not be able to figure out how to change these computations in order to make the AGI aligned if it was not aligned already. (Also, let's assume that you are some sort of Cartesian agent, otherwise you would probably already be dead if you were running these kinds of computations.)

So, my claim is not that modeling a system as an expected utility maximizer can't be useful. Instead, I claim that this model is incomplete, at least with regard to the task of computing an update to the system such that, when we apply this update, the system becomes aligned.

Of course, you can model any system as an expected utility maximizer. But just because I can use the "high level" conceptual model of expected utility maximization to model the behavior of a system very well does not mean the model is sufficient. Behavior is not the only thing that we care about; we actually care about being able to understand the internal workings of the system, such that it becomes much easier to think about how to align the system.

So the following seems to be beside the point unless I am <missing/misunderstanding> something:

Maybe I have missed the fact that the claim you listed says that expected utility maximization is not very useful. And I'm saying it can be useful, it might just not be sufficient at all to actually align a particular AGI system. Even if you can do it arbitrarily well.

4Viliam3y

I am not an expert, but as I remember it, it was a claim that "any system that follows certain axioms can be modeled as maximizing some utility function". The axioms assumed that there were no circular preferences -- if someone prefers A to B, B to C, and C to A, it is impossible to define a utility function such that u(A) > u(B) > u(C) > u(A) -- and that if the system says that A > B > C, it can decide between e.g. a 100% chance of B, and a 50% chance of A with a 50% chance of C, again in a way that is consistent.

I am not sure how this works when the system is allowed to take current time into account, for example when it is allowed to prefer A to B on Monday but prefer B to A on Tuesday. I suppose that in such a situation any system can trivially be modeled by a utility function that at each moment assigns utility 1 to what the system actually did in that moment, and utility 0 to everything else.

Corrigibility is incompatible with assigning utility to everything in advance. A system that has preferences about the future will also have a preference about not having its utility function changed. (For the same reason people have a preference not to be brainwashed, or not to take drugs, even if after brainwashing they are happy about having been brainwashed, and after getting addicted they do want more drugs.) A corrigible system would be like: "I prefer A to B at this moment, but if humans decide to fix me and make me prefer B to A, then I prefer B to A". In other words, it doesn't have values for u(A) and u(B), or it doesn't always act according to those values. A consistent system that currently prefers A to B would prefer not to be fixed.
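The point about circular preferences can be made concrete with a small sketch. It covers only strict preferences over a finite set of options, not lotteries or the full set of axioms, and the function name `utility_from_preferences` is made up for illustration. It tries to assign numeric utilities consistent with a list of "X is preferred to Y" pairs, which succeeds exactly when the preferences contain no cycle.

```python
from graphlib import CycleError, TopologicalSorter


def utility_from_preferences(strict_prefs):
    """Try to build a utility function from strict pairwise preferences.

    strict_prefs: iterable of (better, worse) pairs.
    Returns a dict mapping each option to an integer utility such that
    u[better] > u[worse] for every pair, or None if the preferences are
    circular (in which case no such utility function exists).
    """
    sorter = TopologicalSorter()
    for better, worse in strict_prefs:
        # Declare 'worse' as a predecessor so it is ordered before 'better'.
        sorter.add(better, worse)
    try:
        order = list(sorter.static_order())
    except CycleError:
        return None
    # Later position in the order means higher utility.
    return {option: rank for rank, option in enumerate(order)}


if __name__ == "__main__":
    print(utility_from_preferences([("A", "B"), ("B", "C")]))
    print(utility_from_preferences([("A", "B"), ("B", "C"), ("C", "A")]))
```

With the acyclic input, the returned utilities satisfy u(A) > u(B) > u(C); with the circular input the function returns `None`, matching the observation that u(A) > u(B) > u(C) > u(A) is impossible.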

5Steven Byrnes3y

I think John's 1st bullet point was referring to an argument you can find in https://www.lesswrong.com/posts/NxF5G6CJiof6cemTw/coherence-arguments-do-not-entail-goal-directed-behavior and related.

4Vladimir_Nesov3y

A utility function represents preference elicited in a large collection of situations, each a separate choice between events that happens with incomplete information, as an event is not a particular point. This preference needs to be consistent across different situations to be representable by expected utility of a single utility function. Once formulated, a utility function can be applied to a single choice/situation, such as a choice of a policy. But a system that only ever makes a single choice is not a natural fit for expected utility frame, and that's the kind of system that usually appears in "any system can be modeled as maximizing some utility function". So it's not enough to maximize something once, or in a narrow collection of situations, the situations the system is hypothetically exposed to need to be about as diverse as choices between any pair of events, with some of the events very large, corresponding to unreasonably incomplete information, all drawn across the same probability space. One place this mismatch of frames happens is with updateless decision theory. An updateless decision is a choice of a single policy, once and for all, so there is no reason for it to be guided by expected utility, even though it could be. The utility function for the updateless choice of policy would then need to be obtained elsewhere, in a setting that has all these situations with separate (rather than all enacting a single policy) and mutually coherent choices under uncertainty. But once an updateless policy is settled (by a policy-level decision), actions implied by it (rather than action-level decisions in expected utility frame) no longer need to be coherent. Not being coherent, they are not representable by an action-level utility function. So by embracing updatelessness, we lose the setting that would elicit utility if the actions were instead individual mutually coherent decisions. And conversely, by embracing coherence of action-level decisions, we get an

3JNS3y

Completely off the cuff take: I don't think claim 1 is wrong, but it does clash with claim 2. That means any system that has to be corrigible cannot be a system that maximizes a simple utility function (1 dimension), or put another way "whatever utility function it maximizes must be along multiple dimensions". Which seems to be pretty much what humans do, we have really complex utility functions, and everything seems to be ever changing and we have some control over it ourselves (and sometimes that goes wrong and people end up maxing out a singular dimension at the cost of everything else). Note to self: Think more about this and if possible write up something more coherent and explanatory.

[-]johnswentworth5y120

One second-order effect of the pandemic which I've heard talked about less than I'd expect:

This is the best proxy I found on FRED for new businesses founded in the US, by week. There was a mild upward trend over the last few years, it's really taken off lately. Not sure how much of this is kids who would otherwise be in college, people starting side gigs while working from home, people quitting their jobs and starting their own businesses so they can look after the kids, extra slack from stimulus checks, people losing their old jobs en masse but still having enough savings to start a business, ...

For the stagnation-hypothesis folks who lament relatively low rates of entrepreneurship today, this should probably be a big deal.

5gwern5y

How sure are you that the composition is interesting? How many of these are just quick mask-makers or sanitizer-makers, or just replacing restaurants that have now gone out of business? (ie very low-value-added companies, of the 'making fast food in a stall in a Third World country' sort of 'startup', which make essentially no or negative long-term contributions).

2johnswentworth5y

Good question. I haven't seen particularly detailed data on these on FRED, but they do have separate series for "high propensity" business applications (businesses they think are likely to hire employees), business applications with planned wages, and business applications from corporations, as well as series for each state. The spike is smaller for planned wages, and nonexistent for corporations, so the new businesses are probably mostly single proprietors or partnerships. Other than that, I don't know what the breakdown looks like across industries.

5gwern2y

How do you feel about this claim now? I haven't noticed a whole lot of innovation coming from all these small businesses, and a lot of them seem like they were likely just vehicles for the extraordinary extent of fraud as the results from all the investigations & analyses come in.

5johnswentworth2y

Well, it wasn't just a temporary bump: ... so it's presumably also not just the result of pandemic giveaway fraud, unless that fraud is ongoing. Presumably the thing to check here would be TFP, but FRED's US TFP series currently only goes to end of 2019, so apparently we're still waiting on that one? Either that or I'm looking at the wrong series.

2Gunnar_Zarncke5y

Somebody should post this on Paul Graham's twitter. He would be very interested in it (I can't): https://mobile.twitter.com/paulg

[-]johnswentworth5y120

Neat problem of the week: researchers just announced roughly-room-temperature superconductivity at pressures around 270 GPa. That's stupidly high pressure - a friend tells me "they're probably breaking a diamond each time they do a measurement". That said, pressures in single-digit GPa do show up in structural problems occasionally, so achieving hundreds of GPa scalably/cheaply isn't that many orders of magnitude away from reasonable, it's just not something that there's historically been much demand for. This problem plays with one idea for generating such pressures in a mass-produceable way.

Suppose we have three materials in a coaxial wire:

innermost material has a low thermal expansion coefficient and high Young's modulus (i.e. it's stiff)
middle material is a thin cylinder of our high-temp superconducting concoction
outermost material has a high thermal expansion coefficient and high Young's modulus.

We construct the wire at high temperature, then cool it. As the temperature drops, the innermost material stays roughly the same size (since it has low thermal expansion coefficient), while the outermost material shrinks, so the superconducting concoction is squeezed between them.
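As a rough order-of-magnitude feel for this mechanism, one can use the textbook fully-constrained estimate: stress ≈ (Young's modulus) × (difference in expansion coefficients) × (temperature drop). The sketch below assumes that simplification; it ignores the coaxial geometry, Poisson effects, and yielding, and every numeric input is a made-up illustrative value rather than a property of any material from the post.

```python
def mismatch_stress_gpa(youngs_modulus_gpa, alpha_outer, alpha_inner, delta_t_kelvin):
    """Crude thermal-mismatch stress estimate in GPa.

    Assumes the squeezed layer is fully constrained, so the mismatch
    strain (alpha_outer - alpha_inner) * delta_T is converted directly
    into stress via Young's modulus.
    """
    mismatch_strain = (alpha_outer - alpha_inner) * delta_t_kelvin
    return youngs_modulus_gpa * mismatch_strain


# Hypothetical inputs: 200 GPa stiffness, expansion coefficients of
# 20e-6/K (outer) and 1e-6/K (inner), cooled by 1000 K.
estimate = mismatch_stress_gpa(200.0, 20e-6, 1e-6, 1000.0)  # about 3.8 GPa
```

The estimate scales linearly in each input, so it is easy to see which parameter would have to change, and by how much, to move the result.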

Exerc... (read more)

[-]johnswentworth3y102

So I saw the Taxonomy Of What Magic Is Doing In Fantasy Books and Eliezer’s commentary on ACX's latest linkpost, and I have cached thoughts on the matter.

My cached thoughts start with a somewhat different question - not "what role does magic play in fantasy fiction?" (e.g. what fantasies does it fulfill), but rather... insofar as magic is a natural category, what does it denote? So I'm less interested in the relatively-expansive notion of "magic" sometimes seen in fiction (which includes e.g. alternate physics), and more interested in the pattern called "magic" which recurs among tons of real-world ancient cultures.

Claim (weakly held): the main natural category here is symbols changing the territory. Normally symbols represent the world, and changing the symbols just makes them not match the world anymore - it doesn't make the world do something different. But if the symbols are "magic", then changing the symbols changes the things they represent in the world. Canonical examples:

Wizard/shaman/etc draws magic symbols, speaks magic words, performs magic ritual, or even thinks magic thoughts, thereby causing something to happen in the world.
Messing with a voodoo doll messes with

... (read more)

[-]johnswentworth4y102

Everybody's been talking about Paxlovid, and how ridiculous it is to both stop the trial since it's so effective but also not approve it immediately. I want to at least float an alternative hypothesis, which I don't think is very probable at this point, but does strike me as at least plausible (like, 20% probability would be my gut estimate) based on not-very-much investigation.

Early stopping is a pretty standard p-hacking technique. I start out planning to collect 100 data points, but if I manage to get a significant p-value with only 30 data points, then I just stop there. (Indeed, it looks like the Paxlovid study only had 30 actual data points, i.e. people hospitalized.) Rather than only getting "significance" if all 100 data points together are significant, I can declare "significance" if the p-value drops below the line at any time. That gives me a lot more choices in the garden of forking counterfactual paths.
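A small simulation makes the mechanism concrete. It assumes a toy setup that is not from the trial in question: data are standard normal with a true mean of zero (so every "significant" result is a false positive), and significance is judged with a two-sided z-test at an uncorrected threshold each time we peek. The names `false_positive_rate` and `peek_every` are illustrative.

```python
import math
import random


def two_sided_p_value(samples):
    """p-value for H0: mean = 0, assuming known unit variance (z-test)."""
    z = sum(samples) / math.sqrt(len(samples))
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_trials, n_max, peek_every, alpha=0.05, seed=0):
    """Fraction of null experiments declared significant at any peek."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_trials):
        data = []
        for i in range(1, n_max + 1):
            data.append(rng.gauss(0.0, 1.0))  # the null hypothesis is true
            if i % peek_every == 0 and two_sided_p_value(data) < alpha:
                false_positives += 1  # stop early and declare significance
                break
    return false_positives / n_trials


fixed_n = false_positive_rate(2000, 100, peek_every=100)  # one look, at the end
peeking = false_positive_rate(2000, 100, peek_every=10)   # ten uncorrected looks
```

Comparing `fixed_n` with `peeking` shows how much checking after every 10 data points, with no correction, inflates the false-positive rate relative to a single test at the planned sample size.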

Now, success rates on most clinical trials are not very high. (They vary a lot by area - most areas are about 15-25%. Cancer is far and away the worst, below 4%, and vaccines are the best, over 30%.) So I'd expect that p-hacking is a pretty large chunk of approved drugs, which means pharma companies are heavily selected for things like finding-excuses-to-halt-good-seeming-trials-early.

[-]gwern4y190

Early stopping is a pretty standard p-hacking technique.

It was stopped after a pre-planned interim analysis; that means they're calculating the stopping criteria/p-values with multiple testing correction built in, using sequential analysis.
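To illustrate what a built-in correction does, the sketch below reruns a null-hypothesis simulation with a stricter per-look threshold. Real trials use group-sequential boundaries such as Pocock or O'Brien-Fleming; here a plain Bonferroni split (`alpha / n_looks`) is swapped in as a conservative stand-in, which is an assumption made for simplicity, as are the toy z-test setup and the function names.

```python
import math
import random


def two_sided_p_value(samples):
    """p-value for H0: mean = 0, assuming known unit variance (z-test)."""
    z = sum(samples) / math.sqrt(len(samples))
    return math.erfc(abs(z) / math.sqrt(2))


def corrected_false_positive_rate(n_trials, n_max, peek_every, alpha=0.05, seed=0):
    """False-positive rate with a Bonferroni-corrected threshold per look."""
    n_looks = n_max // peek_every
    per_look_alpha = alpha / n_looks  # conservative: union bound over looks
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_trials):
        data = []
        for i in range(1, n_max + 1):
            data.append(rng.gauss(0.0, 1.0))  # the null hypothesis is true
            if i % peek_every == 0 and two_sided_p_value(data) < per_look_alpha:
                false_positives += 1
                break
    return false_positives / n_trials


corrected = corrected_false_positive_rate(2000, 100, peek_every=10)
```

By the union bound, the overall false-positive probability with this threshold is at most `alpha`, no matter how many pre-planned looks there are; the price is reduced power at each individual look.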

[-]johnswentworth2y92

Here's an AI-driven external cognitive tool I'd like to see someone build, so I could use it.

This would be a software tool, and the user interface would have two columns. In one column, I write. Could be natural language (like google docs), or code (like a normal IDE), or latex (like overleaf), depending on what use-case the tool-designer wants to focus on. In the other column, a language and/or image model provides local annotations for each block of text. For instance, the LM's annotations might be:

(Natural language or math use-case:) Explanation or visu

... (read more)

4[anonymous]2y

Can you share your prompts and if you consider the output satisfactory for some example test cases?

2johnswentworth2y

I haven't experimented very much, but here's one example prompt. This one produced basically-decent results from GPT-4. Although I don't have the exact prompt on hand at the moment, I've also asked GPT-4 to annotate a piece of code line-by-line with a Fermi estimate of its runtime, which worked pretty well.

2[anonymous]2y

Yeah, I was thinking your specs were, well: 1. Wrap GPT-4 and Gemini, columned output over a set of text, applying prompts to each section? Prototype in a weekend. 2. Make the AI able to meaningfully contribute non-obvious comments to help someone who is already an expert? https://xkcd.com/1425/

5johnswentworth2y

Don't really need comments which are non-obvious to an expert. Part of what makes LLMs well-suited to building external cognitive tools is that external cognitive tools can create value by just tracking "obvious" things, thereby freeing up the user's attention/working memory for other things.

5Viliam2y

So kinda like spellcheckers (most typos you could figure out, but why spend time and attention on proofreading if the program can do that for you), but... thought-checkers. Like, if a part of your article contradicts another part, it would be underlined.

5gwern2y

I've long wanted this, but it's not clear how to do it. Long-context LLMs are still expensive and for authors who need it most, context windows are still too small: me or Yudkowsky, for example, would still exceed the context window of almost all LLMs except possibly the newest Gemini. And then you have their weak reasoning. You could try to RAG it, but embeddings are not necessarily tuned to encode logically contradictory or inconsistent claims: probably if I wrote "the sky is blue" in one place and "the sky is red" in another, a retrieval would be able to retrieve both paragraphs and a LLM point out that they are contradictory, but such blatant contradictions are probably too rare to be useful to check for. You want something more subtle, like where you say "the sky is blue" and elsewhere "I looked up from the ground and saw the color of apples". You could try to brute force it and consider every pairwise comparison of 2 reasonable sized chunks of text and ask for contradictions, but this is quadratic and will get slow and expensive and probably turn up too many false positives. (And how do you screen off false positives and mark them 'valid'?) My general thinking these days is that these truly useful 'tools for thought' LLMs are going to require either much better & cheaper LLMs, so smart that they can provide useful assistance despite being used in a grossly unnatural way input-wise or safety-tuned to hell, or biting the bullet of finetuning/dynamic-evaluation (see my Nenex proposal). A LLM finetuned on my corpus can hope to quickly find, with good accuracy, contradictions because it was trained to know 'the sky was blue' when I wrote that at the beginning of the corpus, and it gets confused when it hits 'the color of ____' and it gets the prediction totally wrong. And RAG on an embedding tailored to the corpus can hope to surface the contradictions because it sees the two uses are the same in the essays' context, etc. (And if you run them locally, and they do
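The quadratic cost of the brute-force approach is easy to quantify. The sketch below counts chunk pairs for a corpus; the 500-word chunk size and the 2-million-word corpus are illustrative assumptions, and `pairwise_checks` is a hypothetical helper name.

```python
import math


def pairwise_checks(total_words, chunk_words):
    """Return (number of chunks, number of unordered chunk pairs)."""
    n_chunks = math.ceil(total_words / chunk_words)
    return n_chunks, math.comb(n_chunks, 2)


chunks, pairs = pairwise_checks(2_000_000, 500)          # 4000 chunks, 7,998,000 pairs
_, pairs_doubled = pairwise_checks(4_000_000, 500)       # 8000 chunks, 31,996,000 pairs
```

Doubling the corpus roughly quadruples the number of contradiction queries, which is why the approach gets slow and expensive quickly even before false positives are considered.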

2Viliam2y

Perhaps you could do it in multiple steps. Feed it a shorter text (that fits in the window) and ask it to provide a short summary focusing on factual statements. Then hopefully all short versions could fit in the window. Find the contradiction -- report the two contradicting factual statements and which section they appeared in. Locate the statement in the original text.
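The multi-step idea can be sketched as a small pipeline. Everything model-related here is hypothetical: `summarize` and `find_conflicts` stand in for LLM calls and are passed in as plain functions, and the toy implementations below exist only so the control flow can be run end to end.

```python
def check_document(sections, summarize, find_conflicts):
    """sections: dict of section name -> text.

    summarize(text) -> list of short factual claims (hypothetical LLM call).
    find_conflicts(summaries) -> list of ((sec_a, claim_a), (sec_b, claim_b)).
    Returns, for each conflict, where each claim sits in the original text.
    """
    # Step 1: summarize each section separately so each call fits the window.
    summaries = {name: summarize(text) for name, text in sections.items()}
    # Step 2: one combined pass over the (much shorter) summaries.
    conflicts = find_conflicts(summaries)
    # Step 3: locate each conflicting claim in its original section.
    return [
        ((sec_a, sections[sec_a].find(claim_a)), (sec_b, sections[sec_b].find(claim_b)))
        for (sec_a, claim_a), (sec_b, claim_b) in conflicts
    ]


def toy_summarize(text):
    """Toy stand-in: keep sentences of the form '<subject> is <value>'."""
    return [s.strip() for s in text.split(".") if " is " in s]


def toy_find_conflicts(summaries):
    """Toy stand-in: flag the same subject given two different values."""
    seen, conflicts = {}, []
    for section, claims in summaries.items():
        for claim in claims:
            subject, value = claim.split(" is ", 1)
            if subject in seen and seen[subject][1] != value:
                prev_section, prev_value = seen[subject]
                conflicts.append(((prev_section, f"{subject} is {prev_value}"), (section, claim)))
            seen.setdefault(subject, (section, value))
    return conflicts


sections = {"intro": "The sky is blue. I like walks.", "end": "Later on. The sky is red."}
located = check_document(sections, toy_summarize, toy_find_conflicts)
```

The toy stand-ins only catch blatant contradictions; the structure is what matters, since swapping in real model calls leaves `check_document` unchanged.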

2[anonymous]2y

Did you write more than 7 million words yet @gwern? https://www.google.com/amp/s/blog.google/technology/ai/google-gemini-next-generation-model-february-2024/amp/ Basically it's the "lazy wait" calculation. Get something to work now or wait until the 700k or 7m word context window ships.

3gwern2y

I may have. Just gwern.net is, I think, somewhere around 2m, and it's not comprehensive. Also, for contradictions, I would want to detect contradictions against citations/references as well (detecting miscitations would be more important than self-consistency IMO), and as a rough ballpark, the current Gwern.net annotation* corpus is approaching 4.3m words, looks like, and is also not comprehensive. So, closer than one might think! (Anyway, doesn't deal with the cost or latency: as you can see in the demos, we are talking minutes, not seconds, for these million-token calls and the price is probably going to be in the dollar+ regime per call.) * which are not fulltext. It would be nice to throw in all of the hosted paper & book & webpage fulltexts, but then that's probably more like 200m+ words.

5ryan_greenblatt2y

There isn't any clear technical obstruction to getting this time down pretty small with more parallelism.

2gwern2y

There may not be any 'clear' technical obstruction, but it has failed badly in the past. 'Add more parallelism' (particularly hierarchically) is one of the most obvious ways to improve attention, and people have spent the past 5 years failing to come up with efficient attentions that do anything but move along a Pareto frontier from 'fast but doesn't work' to 'slow and works only as well as the original dense attention'. It's just inherently difficult to know what tokens you will need across millions of tokens without input from all the other tokens (unless you are psychic), implying extensive computation of some sort, which makes things inherently serial and costs you latency, even if you are rich enough to spend compute like water. You'll note that when Claude-2 was demoing the ultra-long attention windows, it too spent a minute or two churning. While the most effective improvements in long-range attention like Flash Attention or Ring Attention are just hyperoptimizing dense attention, which is inherently limited.

[-]johnswentworth4y90

I've long been very suspicious of aggregate economic measures like GDP. But GDP is clearly measuring something, and whatever that something is it seems to increase remarkably smoothly despite huge technological revolutions. So I spent some time this morning reading up and playing with numbers and generally figuring out how to think about the smoothness of GDP increase.

Major takeaways:

When new tech makes something previously expensive very cheap, GDP mostly ignores it. (This happens in a subtle way related to how we actually compute it.)
- Historical GDP curve

... (read more)

[-]johnswentworth4y270

If you want a full post on this, upvote this comment.

4Adam Zerner4y

In writing How much should we value life?, I spent some time digging into AI timeline stuff. It led me to When Will AI Be Created?, written by Luke Muehlhauser for MIRI. He noted that there is reason not to trust expert opinions on AI timelines, and that trend extrapolation may be a good alternative. This point you're making about GDP seems like it is real progress towards coming up with a good way to do trend extrapolation, and thus seems worth a full post IMO. (Assuming it isn't already well known by the community or something, which I don't get the sense is the case.)

2Raemon4y

Upvoted, but I mostly trust you to write the post if it seems like there's an interesting meaty thing worth saying.

2johnswentworth4y

Eh, these were the main takeaways, the post would just be more details and examples so people can see the gears behind it.

4Mark Xu4y

A similar point is made by Korinek in his review of Could Advanced AI Drive Explosive Economic Growth:

3Mark Xu4y

In general, Baumol type effects (spending decreasing in sectors where productivity goes up), mean that we can have scenarios in which the economy is growing extremely fast on "objective" metrics like energy consumption, but GDP has stagnated because that energy is being spent on extremely marginal increases in goods being bought and sold.
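A toy two-good economy shows how this can happen. The numbers below are invented for illustration, and the sketch does not model how statistical agencies actually chain their indices; it only compares measured "real growth" when quantities are valued at old prices versus new prices.

```python
def quantity_index(prices, q_old, q_new):
    """Growth in total output when both periods are valued at `prices`."""
    old_value = sum(prices[good] * q_old[good] for good in prices)
    new_value = sum(prices[good] * q_new[good] for good in prices)
    return new_value / old_value


# Hypothetical economy: compute gets 1000x cheaper and 500x more plentiful,
# while total spending on it falls from 100 to 50.
p_old = {"bread": 1.0, "compute": 100.0}
p_new = {"bread": 1.0, "compute": 0.1}
q_old = {"bread": 100.0, "compute": 1.0}
q_new = {"bread": 100.0, "compute": 500.0}

growth_at_old_prices = quantity_index(p_old, q_old, q_new)  # 50100 / 200 = 250.5
growth_at_new_prices = quantity_index(p_new, q_old, q_new)  # 150 / 100.1, about 1.5
```

Valued at the old prices the economy grew about 250-fold; valued at the new prices it grew by about half, because the good that exploded in quantity now carries almost no weight.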

[-]johnswentworth4y92

[Epistemic status: highly speculative]

Smoke from California/Oregon wildfires reaching the East Coast opens up some interesting new legal/political possibilities. The smoke is way outside state borders, all the way on the other side of the country, so that puts the problem pretty squarely within federal jurisdiction. Either a federal agency could step in to force better forest management on the states, or a federal lawsuit could be brought for smoke-induced damages against California/Oregon. That would potentially make it a lot more difficult for local homeowners to block controlled burns.

[-]johnswentworth5y90

Brief update on how it's going with RadVac.

I've been running ELISA tests all week. In the first test, I did not detect stronger binding to any of the peptides than to the control in any of several samples from myself or my girlfriend. But the control itself was looking awfully suspicious, so I ran another couple tests. Sure enough, something in my samples is binding quite strongly to the control itself (i.e. the blocking agent), which is exactly what the control is supposed to not do. So I'm going to try out some other blocking agents, and hopefully get an... (read more)

4ChristianKl5y

I would expect that hedging also happens because making definitive clinical claims has more danger from the FDA than making hedged statements.

[-]johnswentworth6y92

Someone should write a book review of The Design of Everyday Things aimed at LW readers, so I have a canonical source to link to other than the book itself.

[-]johnswentworth5y80

I had a shortform post pointing out the recent big jump in new businesses in the US, and Gwern replied:

How sure are you that the composition is interesting? How many of these are just quick mask-makers or sanitizer-makers, or just replacing restaurants that have now gone out of business? (ie very low-value-added companies, of the 'making fast food in a stall in a Third World country' sort of 'startup', which make essentially no or negative long-term contributions).

This was a good question in context, but I disagree with Gwern's model of where-progress-come... (read more)

4ChristianKl5y

The pandemic also has the effect of showing the kind of business ideas people try. It pushes a lot of innovation in food delivery. Some of the pandemic-driven innovation will become worthless once the pandemic is over but a few good ideas likely survive and the old ideas of the businesses that went out of business are still around.

[-]johnswentworth4mo*Ω360

Does The Information-Throughput-Maximizing Input Distribution To A Sparsely-Connected Channel Satisfy An Undirected Graphical Model?

[EDIT: Never mind, proved it.]

Suppose I have an information channel $X \to Y$. The X components $X_{1}, \ldots, X_{m}$ and the Y components $Y_{1}, \ldots, Y_{n}$ are sparsely connected, i.e. the typical $Y_{i}$ is downstream of only a few parent X-components $X_{pa(i)}$. (Mathematically, that means the channel factors as $P[Y | X] = \prod_{i} P[Y_{i} | X_{pa(i)}]$.)

Now, suppose I split the Y components into two sets, and hold constant any X-com... (read more)

8johnswentworth4mo

Proof Specifically, we'll show that there exists an information throughput maximizing distribution which satisfies the undirected graph. We will not show that all optimal distributions satisfy the undirected graph, because that's false in some trivial cases - e.g. if all the Y's are completely independent of X, then all distributions are optimal. We will also not show that all optimal distributions factor over the undirected graph, which is importantly different because of the $P[X] > 0$ caveat in the Hammersley-Clifford theorem. First, we'll prove the (already known) fact that an independent distribution $P[X] = P[X_1] P[X_2]$ is optimal for a pair of independent channels $(X_1 \to Y_1, X_2 \to Y_2)$; we'll prove it in a way which will play well with the proof of our more general theorem. Using standard information identities plus the factorization structure $Y_1 - X_1 - X_2 - Y_2$ (that's a Markov chain, not subtraction), we get $MI(X;Y) = MI(X;Y_1) + MI(X;Y_2|Y_1)$ $= MI(X;Y_1) + (MI(X;Y_2) - MI(Y_2;Y_1) + MI(Y_2;Y_1|X))$ $= MI(X_1;Y_1) + MI(X_2;Y_2) - MI(Y_2;Y_1)$ Now, suppose you hand me some supposedly-optimal distribution $P[X]$. From $P$, I construct a new distribution $Q[X] := P[X_1] P[X_2]$. Note that $MI(X_1;Y_1)$ and $MI(X_2;Y_2)$ are both the same under $Q$ as under $P$, while $MI(Y_2;Y_1)$ is zero under $Q$. So, because $MI(X;Y) = MI(X_1;Y_1) + MI(X_2;Y_2) - MI(Y_2;Y_1)$, the $MI(X;Y)$ must be at least as large under $Q$ as under $P$. In short: given any distribution, I can construct another distribution with at least as high information throughput, under which $X_1$ and $X_2$ are independent. Now let's tackle our more general theorem, reusing some of the machinery above. I'll split $Y$ into $Y_1$ and $Y_2$, and split $X$ into $X_{1-2}$ (parents of $Y_1$ but not $Y_2$), $X_{2-1}$ (parents of $Y_2$ but not $Y_1$), and $X_{1 \cap 2}$ (parents of both). Then $MI(X;Y) = MI(X_{1 \cap 2};Y) + MI(X_{1-2}, X_{2-1};Y|X_{1 \cap 2})$ In analogy to the case above, we consider distribution $P[X]$, and construct a new distribution $Q[X] := P[X_{1 \cap 2}] P[X_{1-2}|X_{1 \cap 2}] P[X_{2-1}|X_{1 \cap 2}]$. Compared to $P$, $Q$ has the same value of $MI(X_{1 \cap 2};Y)$, and by exactly the same argument as
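The first step of the proof (two independent channels) can be checked numerically. The sketch below assumes a toy instance that is not from the thread: binary $X_1, X_2$ with a correlated input distribution, and two binary symmetric channels with made-up flip probabilities. It verifies the identity $MI(X;Y) = MI(X_1;Y_1) + MI(X_2;Y_2) - MI(Y_2;Y_1)$ and that the product-of-marginals distribution $Q$ does at least as well as $P$.

```python
import itertools
import math
from collections import defaultdict


def marginal(joint, idxs):
    """Marginalize a dict outcome-tuple -> prob onto the given positions."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idxs)] += p
    return out


def mutual_information(joint, a, b):
    """MI in bits between the variables at positions a and positions b."""
    pa, pb, pab = marginal(joint, a), marginal(joint, b), marginal(joint, a + b)
    total = 0.0
    for key, p in pab.items():
        if p > 0:
            total += p * math.log2(p / (pa[key[:len(a)]] * pb[key[len(a):]]))
    return total


def bsc(flip):
    """Binary symmetric channel: returns P[y | x] as a function."""
    return lambda y, x: 1 - flip if y == x else flip


def build_joint(px, ch1, ch2):
    """Joint over (x1, x2, y1, y2) for independent channels X1->Y1, X2->Y2."""
    return {
        (x1, x2, y1, y2): px[(x1, x2)] * ch1(y1, x1) * ch2(y2, x2)
        for x1, x2, y1, y2 in itertools.product((0, 1), repeat=4)
    }


# Correlated input distribution P[X] (illustrative numbers).
px_p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p1, p2 = marginal(px_p, (0,)), marginal(px_p, (1,))
# Q[X] := P[X1] P[X2], the product of P's marginals.
px_q = {(x1, x2): p1[(x1,)] * p2[(x2,)] for x1, x2 in px_p}

joint_p = build_joint(px_p, bsc(0.1), bsc(0.2))
joint_q = build_joint(px_q, bsc(0.1), bsc(0.2))

lhs = mutual_information(joint_p, (0, 1), (2, 3))
rhs = (mutual_information(joint_p, (0,), (2,))
       + mutual_information(joint_p, (1,), (3,))
       - mutual_information(joint_p, (2,), (3,)))
mi_q = mutual_information(joint_q, (0, 1), (2, 3))
```

Here `lhs` and `rhs` agree up to floating-point error, and `mi_q` is at least `lhs`, matching the argument that decorrelating the inputs can only remove the $MI(Y_2;Y_1)$ penalty.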

6testingthewaters4mo

I suppose another way to look at this is the overlapping components are the blanket states in some kind of time dependent Markov blanket setup, right? In the scenario you created you could treat $x_1, x_2, x_3$ as some shielded state at time step $t$, so $i_t$. Then $x_5, x_6, x_7$ are states outside of the blanket, so $e_t$ (which group of states is $i$ and which is $e$ don't really matter, so long as they are on either side of the blanket). $y_1, y_2, y_3, y_4$ [1] become $i_{t+1}$, and $y_5, y_6, y_7, y_8$ become $e_{t+1}$. Then $x_4$ becomes the blanket $b_t$ such that $I(i_{t+1}, e_{t+1} | b_t) \approx 0$ and $P(i_{t+1}, e_{t+1} | i_t, e_t, b_t) = P(i_{t+1} | i_t, b_t) \cdot P(e_{t+1} | e_t, b_t)$, with all that implies. In fact you can just as easily have three shielded states, or four, using this formulation. (the setup for this is shamelessly ripped off from @Gunnar_Zarncke 's unsupervised agent detection work) 1. ^ Did you miss an arrow going to $y_4$?

3Daniel C4mo

(Was in the middle of writing a proof before noticing you did it already) I believe the end result is that if we have $Y = (Y_1, Y_2)$, $X = (X_1, X_2, X_3)$ with $P(Y|X) = P(Y_1|X_1,X_3) P(Y_2|X_2,X_3)$ ($X_1$ upstream of $Y_1$, $X_2$ upstream of $Y_2$, $X_3$ upstream of both), then maximizing $I(X;Y)$ is equivalent to maximizing $I(Y_1;X_1,X_3) + I(Y_2;X_2,X_3) - I(Y_1;Y_2)$. & for the proof we can basically replicate the proof for additivity except substituting the factorization $P(X_1,X_2,X_3) = P(X_3) P(X_1|X_3) P(X_2|X_3)$ as assumption in place of independence, then both directions of inequality will result in $I(Y_1;X_1,X_3) + I(Y_2;X_2,X_3) - I(Y_1;Y_2)$. [EDIT: Forgot $-I(Y_1;Y_2)$ term due to marginal dependence $P(Y_1,Y_2) \neq P(Y_1) P(Y_2)$]

[-]johnswentworth2y60

Does anyone know of an "algebra for Bayes nets/causal diagrams"?

More specifics: rather than using a Bayes net to define a distribution, I want to use a Bayes net to state a property which a distribution satisfies. For instance, a distribution P[X, Y, Z] satisfies the diagram X -> Y -> Z if-and-only-if the distribution factors according to
P[X, Y, Z] = P[X] P[Y|X] P[Z|Y].
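Read as a property rather than a definition, the diagram becomes a predicate one can test on a given distribution. A minimal sketch, assuming discrete variables stored as a dict from `(x, y, z)` to probability: it checks the equivalent condition P[x, y, z] · P[y] = P[x, y] · P[y, z], which avoids dividing by zero. The function name `satisfies_chain` is illustrative.

```python
import itertools
from collections import defaultdict


def marginal(joint, idxs):
    """Marginalize a dict (x, y, z) -> prob onto the given positions."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idxs)] += p
    return out


def satisfies_chain(joint, tol=1e-9):
    """True iff P[X, Y, Z] = P[X] P[Y|X] P[Z|Y], i.e. the diagram X -> Y -> Z."""
    p_y, p_xy, p_yz = marginal(joint, (1,)), marginal(joint, (0, 1)), marginal(joint, (1, 2))
    xs = {o[0] for o in joint}
    ys = {o[1] for o in joint}
    zs = {o[2] for o in joint}
    for x, y, z in itertools.product(xs, ys, zs):
        lhs = joint.get((x, y, z), 0.0) * p_y[(y,)]
        rhs = p_xy[(x, y)] * p_yz[(y, z)]
        if abs(lhs - rhs) > tol:
            return False
    return True


def noisy_copy(out, inp, flip):
    """P[out | inp] for a binary variable copied with flip probability."""
    return 1 - flip if out == inp else flip


# A genuine chain: Y is a noisy copy of X, Z is a noisy copy of Y.
chain = {
    (x, y, z): 0.5 * noisy_copy(y, x, 0.1) * noisy_copy(z, y, 0.2)
    for x, y, z in itertools.product((0, 1), repeat=3)
}
# Not a chain: Z copies X directly while Y is an independent coin flip.
shortcut = {
    (x, y, z): 0.25 if z == x else 0.0
    for x, y, z in itertools.product((0, 1), repeat=3)
}

chain_ok = satisfies_chain(chain)        # True
shortcut_ok = satisfies_chain(shortcut)  # False
```

An "algebra" of diagrams would then amount to rules for deriving that one such predicate holds whenever some others do, without re-checking the numbers.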

When using diagrams that way, it's natural to state a few properties in terms of diagrams, and then derive some other diagrams they imply. For instance, if a distribution P[W, X, Y, Z]... (read more)

[-]johnswentworth4y60

Weather just barely hit 80°F today, so I tried the Air Conditioner Test.

Three problems came up:

Turns out my laser thermometer is all over the map. Readings would change by 10°F if I went outside and came back in. My old-school thermometer is much more stable (and well-calibrated, based on dipping it in some ice water), but slow and caps out around 90°F (so I can't use it to measure e.g. exhaust temp). I plan to buy a bunch more old-school thermometers for the next try.
I thought opening the doors/windows in rooms other than the test room and setting up a fan w

... (read more)

[-]johnswentworth5y60

Chrome is offering to translate the LessWrong homepage for me. Apparently, it is in Greek.

2habryka5y

Huh, amusing. We do ship a font that has nothing but the Greek letter set in it, because people use Greek unicode symbols all the time and our primary font doesn't support that character set. So my guess is that's where Google gets confused.

2johnswentworth5y

Oh, I had just assumed it was commentary on the writing style/content.

4Viliam5y

If about 10% of articles have "Ω" in their title, what is the probability that the page is in Greek? :D

[-]johnswentworth6y50

What if physics equations were written like statically-typed programming languages?

$(\frac{\text{mass} \cdot \text{length}}{\text{time}^{2}} : F) = (\frac{\text{mass}}{-} : m) (\frac{\text{length}}{\text{time}^{2}} : a)$

$(\frac{\text{mass}}{\text{length} \cdot \text{time}^{2}} : P) (\frac{\text{length}^{3}}{-} : V) = (\frac{-}{-} : N) (\frac{\text{mass} \cdot \text{length}^{2}}{\text{time}^{2} \cdot \text{temp}} : R) (\frac{\text{temp}}{-} : T)$
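The same idea can be sketched in code as a tiny units-as-types checker. This is a runtime check rather than a true static type system, and the `Quantity` class, its four-dimension exponent tuple, and the numeric values are all illustrative assumptions (the mole count is treated as dimensionless, matching the $\frac{-}{-}$ type given to $N$).

```python
from dataclasses import dataclass

DIMENSIONLESS = (0, 0, 0, 0)  # exponents of (mass, length, time, temp)


@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple = DIMENSIONLESS

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __add__(self, other):
        # Adding quantities is only meaningful for matching dimensions.
        if self.dims != other.dims:
            raise TypeError(f"dimension mismatch: {self.dims} vs {other.dims}")
        return Quantity(self.value + other.value, self.dims)


def same_dimensions(lhs, rhs):
    """Check that both sides of an equation have the same 'type'."""
    return lhs.dims == rhs.dims


# F = m a
m = Quantity(2.0, (1, 0, 0, 0))
a = Quantity(3.0, (0, 1, -2, 0))
F = m * a  # dims (1, 1, -2, 0): mass * length / time^2

# P V = N R T
P = Quantity(101325.0, (1, -1, -2, 0))
V = Quantity(0.0224, (0, 3, 0, 0))
N = Quantity(1.0)
R = Quantity(8.314, (1, 2, -2, -1))
T = Quantity(273.15, (0, 0, 0, 1))
ideal_gas_typechecks = same_dimensions(P * V, N * R * T)
```

Here `F + m` raises a `TypeError`, which is the runtime analogue of a compiler rejecting an ill-typed equation.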

6jimrandomh6y

The math and physics worlds still use single-letter variable names for everything, decades after the software world realized that was extremely bad practice. This makes me pessimistic about the adoption of better notation practices.

7johnswentworth6y

Better? I doubt it. If physicists wrote equations the way programmers write code, a simple homework problem would easily fill ten pages. Verboseness works for programmers because programmers rarely need to do anything more complicated with their code than run it - analogous to evaluating an expression, for a physicist or mathematician. Imagine if you needed to prove one program equivalent to another algebraically - i.e. a sequence of small transformations, with a record of intermediate programs derived along the way in order to show your work. I expect programmers subjected to such a use-case would quickly learn the virtues of brevity.

4Gunnar_Zarncke4mo

Related to that: You have far fewer variables under consideration, so that you can even have standard names for them. A remnant of this effect can be seen in typical Fortran programs.

3Steven Byrnes6y

Yeah, I'm apparently not intelligent enough to do error-free physics/engineering calculations without relying on dimensional analysis as a debugging tool. I even came up with a weird, hack-y way to do that in computing environments like Excel and Cython, where flexible multiplicative types are not supported.

[-]johnswentworth9mo40

Is interpersonal variation in anxiety levels mostly caused by dietary iron?

~~I stumbled across~~ ~~this paper~~ yesterday. I haven't looked at it very closely yet, but the high-level pitch is that they look at genetic predictors of iron deficiency and then cross that with anxiety data. It's interesting mainly because it sounds pretty legit (i.e. the language sounds like direct presentation of results without any bullshitting, the p-values are satisfyingly small, there's no branching paths), and the effect sizes are BIG IIUC:

~~The odd ratios (OR) of anxiety disorders~~

... (read more)

1samuelshadrach9mo

Have you tested this hypothesis on your friends? Ask them for their iron level from last blood test, and ask them to self-report anxiety level (you also make a separate estimate of their anxiety level).

[-]johnswentworth2y42

I keep seeing news outlets and the like say that SORA generates photorealistic videos, can model how things move in the real world, etc. This seems like blatant horseshit? Every single example I've seen looks like video game animation, not real-world video.

Have I just not seen the right examples, or is the hype in fact decoupled somewhat from the model's outputs?

6ryan_greenblatt2y

I think I mildly disagree, but probably we're looking at the same examples. I think the most impressive (in terms of realism) videos are under "Sora is able to generate complex scenes with multiple characters, ...". (Includes white SUV video and Tokyo suburbs video.) I think all of these videos other than the octopus and paper planes are "at-a-glance" photorealistic to me. Overall, I think SORA can do "at-a-glance" photorealistic videos and can model to some extent how things move in the real world. I don't think it can do both complex motion and photorealism in the same video. As in, the videos which are photorealistic don't really involve complex motion and the videos which involve complex motion aren't photorealistic. (So probably some amount of hype, but also pretty real?)

3habryka2y

Hmm, I don't buy it. These two scenes seem very much not like the kind of thing a video game engine could produce: Look at this frame! I think there is something very slightly off about that face, but the cat hitting the person's face and the person's reaction seem very realistic to me and IMO qualify as "complex motion and photorealism in the same video".

2johnswentworth2y

Were these supposed to embed as videos? I just see stills, and don't know where they came from.

4ryan_greenblatt2y

These are stills from some of the videos I was referencing.

2ryan_greenblatt2y

TBC, I wasn't claiming anything about video game engines. I wouldn't have called the cat one "complex motion", but I can see where you're coming from.

2RamblinDash2y

Yeah, I mean I guess it depends on what you mean by photorealistic. That cat has three front legs.

8gwern2y

Yeah, this is the example I've been using to convince people that the game engines are almost certainly generating training data but are probably not involved at sampling time. I can't come up with any sort of hybrid architecture like 'NN controlling game-engine through API' where you get that third front leg. One of the biggest benefits of a game-engine would be ensuring exactly that wouldn't happen - body parts becoming detached and floating in mid-air and lack of conservation. If you had a game engine with a hyper-realistic cat body model in it which something external was manipulating, one of the biggest benefits is that you wouldn't have that sort of common-sense physics problem. (Meanwhile, it does look like past generative modeling of cats in its errors. Remember the ProGAN interpolation videos of CATS? Hilarious, but also an apt demonstration of how extremely hard cats are to model. They're worse than hands.) In addition, you see plenty of classic NN tells throughout - note the people driving a 'Dandrover'...

2johnswentworth2y

Yeah, those were exactly the two videos which most made me think that the model was mostly trained on video game animation. In the Tokyo one, the woman's facial muscles never move at all, even when the camera zooms in on her. And in the SUV one, the dust cloud isn't realistic, but even covering that up the SUV has a Grand Theft Auto look to its motion. "Can't do both complex motion and photorealism in the same video" is a good hypothesis to track, thanks for putting that one on my radar.

2ryan_greenblatt2y

(Note that I was talking about the one with the train going through Tokyo suburbs.)

[-]johnswentworth2y42

Putting this here for posterity: I have thought since the superconductor preprint went up, and continue to think, that the markets are putting generally too little probability on the claims being basically-true. I thought ~70% after reading the preprint the day it went up (and bought up a market on manifold to ~60% based on that, though I soon regretted not waiting for a better price), and my probability has mostly been in the 40-70% range since then.

2johnswentworth2y

After seeing the markets jump up in response to the latest, I think I'm more like 65-80%.

[-]johnswentworth4y40

Languages should have tenses for spacelike separation. My friend and I do something in parallel, it's ambiguous/irrelevant which one comes first, I want to say something like "I expect my friend <spacelike version of will do/has done/is doing> their task in such-and-such a way".

5JBlack4y

That sounds more like a tenseless sentence than using a spacelike separation tense. Your friend's performance of the task may well be in your future or past lightcone (or extend through both), but you don't wish to imply any of these. There are languages with tenseless verbs, as well as some with various types of spatial tense. The closest I can approximate this in English without clumsy constructs is "I expect my friend does their task in such-and-such a way", which I agree isn't very satisfactory.

4adamShimi4y

Who would have thought that someone would ever look at CSP and think "I want English to be more like that"?

2johnswentworth4y

lol

3kave4y

Future perfect (hey, that's the name of the show!) seems like a reasonable hack for this in English

[-]johnswentworth4y40

Two kinds of cascading catastrophes one could imagine in software systems...

A codebase is such a spaghetti tower (and/or coding practices so bad) that fixing a bug introduces, on average, more than one new bug. Software engineers toil away fixing bugs, making the software steadily more buggy over time.
Software services managed by different groups have dependencies - A calls B, B calls C, etc. Eventually, the dependence graph becomes connected enough and loopy enough that a sufficiently-large chunk going down brings down most of the rest, and nothing can go

... (read more)
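The first scenario can be read as a branching process. Here is a minimal sketch, assuming each fix independently introduces new bugs at some average rate; `mean_new_bugs_per_fix` and the round-by-round model are illustrative assumptions, not something from the comment itself. It tracks expected counts only.

```python
def expected_open_bugs(initial_bugs, mean_new_bugs_per_fix, rounds):
    """Expected open-bug count after each round in which every open bug is fixed.

    Assumption (illustrative only): fixing one bug introduces, on average,
    mean_new_bugs_per_fix new bugs, independently of everything else.
    """
    counts = [float(initial_bugs)]
    for _ in range(rounds):
        # Every open bug is fixed; each fix leaves behind its average offspring.
        counts.append(counts[-1] * mean_new_bugs_per_fix)
    return counts


# Below one new bug per fix, the backlog shrinks geometrically.
shrinking = expected_open_bugs(100, 0.8, 5)
# Above one new bug per fix, the same effort makes things steadily worse.
growing = expected_open_bugs(100, 1.2, 5)
```

In this toy model the threshold sits at exactly one new bug per fix, which matches the "more than one new bug" condition in the comment.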

[-]johnswentworth5y40

I wish there were a fund roughly like the Long-Term Future Fund, but with an explicit mission of accelerating intellectual progress.

6habryka5y

I mean, just to be clear, I am all in favor of intellectual progress. But doing so indiscriminately does sure seem a bit risky in this world of anthropogenic existential risks. Reminds me of my mixed feelings on the whole Progress Studies thing.

6johnswentworth5y

Yeah, I wouldn't want to accelerate e.g. black-box ML. I imagine the real utility of such a fund would be to experiment with ways to accelerate intellectual progress and gain understanding of the determinants, though the grant projects themselves would likely be more object-level than that. Ideally the grants would be in areas which are not themselves very risk-relevant, but complicated/poorly-understood enough to generate generalizable insights into progress. I think it takes some pretty specific assumptions for such a thing to increase risk significantly on net. If we don't understand the determinants of intellectual progress, then we have very little ability to direct progress where we want it; it just follows whatever the local gradient is. With more understanding, at worst it follows the same gradient faster, and we end up in basically the same spot. The one way it could net-increase risk is if the most likely path of intellectual progress leads to doom, and the best way to prevent doom is through some channel other than intellectual progress (like political action, for instance). Then accelerating the intellectual progress part potentially gives the other mechanisms (like political bodies) less time to react. Personally, though, I think a scenario in which e.g. political action successfully prevents intellectual progress from converging to doom (in a world where it otherwise would have) is vanishingly unlikely (like, less than one-in-a-hundred, maybe even less than one-in-a-thousand).

3Quinn5y

You might check out Donald Braben's view: it says "transformative research" (i.e. fundamental results that create new fields and industries) is critical for the survival of civilization. He does not worry that transformative results might end civilization.

[-]johnswentworth6y40

For short-term, individual cost/benefit calculations around C19, it seems like uncertainty in the number of people currently infected should drop out of the calculation.

For instance: suppose I'm thinking about the risk associated with talking to a random stranger, e.g. a cashier. My estimated chance of catching C19 from this encounter will be roughly proportional to $N_{infected}$. But, assuming we already have reasonably good data on number hospitalized/died, my chances of hospitalization/death given infection will be roughly inversely proportional to $N_{in}$ ... (read more)
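A rough sketch of the cancellation, using symbols that are not in the original comment: $c$ is a hypothetical per-encounter transmission constant, and $N_{hosp}$ is the (assumed well-measured) number hospitalized.

$$P[\text{infection from encounter}] \approx c \, N_{infected}, \qquad P[\text{hospitalization} \mid \text{infection}] \approx \frac{N_{hosp}}{N_{infected}}$$

$$P[\text{hospitalization from encounter}] \approx c \, N_{infected} \cdot \frac{N_{hosp}}{N_{infected}} = c \, N_{hosp}$$

Under these assumptions the poorly-known $N_{infected}$ cancels, and the estimate depends only on the better-measured $N_{hosp}$.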

[-]johnswentworth2y30

Way back in the halcyon days of 2005, a company called Cenqua had an April Fools' Day announcement for a product called Commentator: an AI tool which would comment your code (with, um, adjustable settings for usefulness). I'm wondering if (1) anybody can find an archived version of the page (the original seems to be gone), and (2) if there's now a clear market leader for that particular product niche, but for real.

7Garrett Baker2y

Archived website

5johnswentworth2y

You are a scholar and a gentleman.

6A.H.2y

Here is an archived version of the page : http://web.archive.org/web/20050403015136/http://www.cenqua.com/commentator/

[-]johnswentworth3y30

Here's an interesting problem of embedded agency/True Names which I think would make a good practice problem: formulate what it means to "acquire" something (in the sense of "acquiring resources"), in an embedded/reductive sense. In other words, you should be able-in-principle to take some low-level world-model, and a pointer to some agenty subsystem in that world-model, and point to which things that subsystem "acquires" and when.

Some prototypical examples which an answer should be able to handle well:

Organisms (anything from bacteria to plant to animals) eating things, absorbing nutrients, etc.
Humans making money or gaining property.

3Gunnar_Zarncke3y

...and how the brain figures this out and why it is motivated to do so. There are a lot of simple animals that apparently "try to control" resources or territory. How? Drives to control resources occur everywhere. And your control of resources is closely related to your dominance in a dominance hierarchy. Which seems to be regulated in many animals by serotonin. See e.g. https://www.nature.com/articles/s41386-022-01378-2

[-]johnswentworth6mo20

This billboard sits over a taco truck I like, so I see it frequently:

The text says "In our communities, Kaiser Permanente members are 33% less likely to experience premature death due to heart disease.*", with the small-text directing one to a url.

The most naive (and presumably intended) interpretation is, of course, that being a Kaiser Permanente member provides access to better care, causing 33% lower chance of death due to heart disease.

Now, I'd expect most people reading this to immediately think something like "selection effects!" - i.e. what the bill... (read more)

5Kabir Kumar6mo

the actual trap is that it caught your attention, you posted about it online and now more people know and think about Kaiser Permanente than before and according to whoever was in charge of making this billboard, that's a success metric they can leverage for a promotion.

4jmh6mo

Is that what it does tell us? The sign doesn't make the claim you suggest -- it doesn't claim it's reducing the deaths from heart disease, it states it's 33% less likely to be "premature" -- which is probably a weaselly term here. But it clearly is not making any claims about reducing deaths from heart disease. You seem to be projecting the conclusion that the claim/expected interpretation is that membership reduces the deaths by 33%. But I don't know how you're concluding that the marketing team thought that would be the general interpretation by those seeing the sign. While I would not be inclined to take a billboard ad at face value, a more reasonable take seems to me to be that it's claiming that, even with heart disease, KP's members are less likely to die earlier than expected than others with other healthcare providers. That may be a provable and true claim or it might be more "puffing", and everyone will play with just how "premature" is going to be measured. Whether or not it's corporate stupidity might be a separate question, I think, but understanding exactly what results such an ad is supposed to be producing will matter a lot here. Plus, there is the old adage about no one ever going bankrupt underestimating the intelligence of the American consumer -- and I suspect that might go double in the case of medical/healthcare consumption.

0faul_sname6mo

"Kaiser Permanente members are younger and healthier, and thus consume fewer healthcare resources on average, which allows us to pass the savings on to you."

[-][anonymous]6mo23

That is unsurprising to me, since the overall gist of Rationalism is an attempt to factor uncertainty out of the near future, life, and thought itself.

This tells me you don't know anything about LW-rationality or are being deliberately uncharitable to it.

You're mostly making blanket broad claims, maybe make a top level post which is charitable to the entire project. Go in depth post by post on where you think people have gone wrong, and in what way. High effort posting is appreciated.

[-]johnswentworth3y22

An interesting conundrum: one of the main challenges of designing useful regulation for AI is that we don't have any cheap and robust way to distinguish a dangerous neural net from a non-dangerous net (or, more generally, a dangerous program from a non-dangerous program). This is an area where technical research could, in principle, help a lot.

The problem is, if there were some robust metric for how dangerous a net is, and that metric were widely known and recognized (as it would probably need to be in order to be used for regulatory purposes), then someone would probably train a net to maximize that metric directly.

6Garrett Baker3y

This seems to lead to the solution of trying to make your metric one-way, in the sense that your metric should (1) provide an upper bound on the dangerousness of your network, and (2) compress the space of networks which map to approximately the same dangerousness level on the low end of dangerousness, and expand the space of networks which map to approximately the same dangerousness level on the upper end of dangerousness, so that you can train your network to minimize the metric, but when you train your network to maximize the metric you end up in a degenerate area with technically very high measured danger levels but in actuality very low levels of dangerousness. We can hope (or possibly prove) that as you optimize upwards on the metric you get subject to Goodhart's curse, but the opposite occurs on the lower end.

4Thane Ruthenis3y

Sure, even seems a bit tautological: any such metric, to be robust, would need to contain in itself a definition of a dangerously-capable AI, so you probably wouldn't even need to train a model to maximize it. You'd be able to just lift the design from the metric directly.

2Thane Ruthenis3y

Do you have any thoughts on a softer version of this problem, where the metric can't be maximized directly, but gives a concrete idea of what sort of challenge your AI needs to beat to qualify as AGI? (And therefore in which direction in the architectural-design-space you should be moving.) Some variation on this seems like it might work as a "fire alarm" test set, but as you point out, inasmuch as it's recognized, it'll be misapplied for benchmarking instead. (I suppose the ideal way to do it would be to hand it off to e.g. ARC, so they can use it if OpenAI invites them for safety-testing again. This way, SOTA models still get tested, but the actors who might misuse it aren't aware of the testing's particulars until they succeed anyway...)

[-]johnswentworth4y20

I just went looking for a good reference for the Kelly criterion, and didn't find any on LessWrong. So, for anybody who's looking: chapter 6 of Cover & Thomas' textbook on information theory is the best source I currently know of.
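For anyone who wants a quick feel for it before opening the textbook, here is a minimal sketch of the standard binary-bet special case (win probability `p_win`, net odds `net_odds`). The function names and the example numbers are illustrative assumptions, and this is not how the chapter itself presents the material.

```python
import math


def kelly_fraction(p_win, net_odds):
    """Fraction of bankroll to stake on a binary bet: f* = p - q / b.

    p_win is the win probability, q = 1 - p_win, and net_odds (b) is the
    profit per unit staked on a win. Clipped at 0: with no edge, don't bet.
    """
    q = 1.0 - p_win
    return max(0.0, p_win - q / net_odds)


def expected_log_growth(fraction, p_win, net_odds):
    """Expected log growth of the bankroll per bet when staking `fraction`."""
    win = math.log(1.0 + net_odds * fraction)
    loss = math.log(1.0 - fraction)
    return p_win * win + (1.0 - p_win) * loss


# Illustrative numbers: a 60% chance to win an even-money bet.
f_star = kelly_fraction(0.6, 1.0)
growth_at_kelly = expected_log_growth(f_star, 0.6, 1.0)
growth_under = expected_log_growth(0.1, 0.6, 1.0)
growth_over = expected_log_growth(0.3, 0.6, 1.0)
```

The Kelly fraction is the stake which maximizes expected log growth, so both the under-bet and the over-bet above grow more slowly than `f_star`.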

6Yoav Ravid4y

Might be a good thing to add to the Kelly Criterion tag

[-]johnswentworth5y20

Neat problem of the week: we have n discrete random variables, $X_1, \ldots, X_n$. Given any variable, all variables are independent:

$\forall i : P[X | X_i] = \prod_j P[X_j | X_i]$

Characterize the distributions which satisfy this requirement.

This problem came up while working on the theorem in this post, and (separately) in the ideas behind this post. Note that those posts may contain some spoilers for the problem, though frankly my own proofs on this one just aren't very good.
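For experimenting with candidate distributions, here is a minimal brute-force checker. It is only a sketch: it assumes the joint distribution is given as a dict from outcome tuples to probabilities, and the name `satisfies_condition` and that representation are illustrative choices, not from the posts. It tests the condition numerically and says nothing about the general characterization.

```python
from collections import defaultdict
from itertools import product


def satisfies_condition(joint, tol=1e-9):
    """Check P[X | X_i] = prod_j P[X_j | X_i] for every i and every value of X_i.

    joint: dict mapping outcome tuples (x_1, ..., x_n) to probabilities.
    """
    n = len(next(iter(joint)))
    for i in range(n):
        # Group the positive-probability outcomes by the value of X_i.
        by_value = defaultdict(dict)
        for outcome, prob in joint.items():
            if prob > 0:
                by_value[outcome[i]][outcome] = prob
        for slice_ in by_value.values():
            p_xi = sum(slice_.values())
            # Conditional marginals P[X_j = v | X_i = x_i] for each j.
            marginals = [defaultdict(float) for _ in range(n)]
            for outcome, prob in slice_.items():
                for j, v in enumerate(outcome):
                    marginals[j][v] += prob / p_xi
            # Compare the conditional joint with the product of marginals.
            for outcome in product(*(m.keys() for m in marginals)):
                lhs = slice_.get(outcome, 0.0) / p_xi
                rhs = 1.0
                for j, v in enumerate(outcome):
                    rhs *= marginals[j][v]
                if abs(lhs - rhs) > tol:
                    return False
    return True


# Three independent fair coins: the condition holds.
independent_coins = {bits: 1 / 8 for bits in product((0, 1), repeat=3)}
# Two fair coins plus their XOR: conditioning on one leaves the others dependent.
xor_triple = {(a, b, a ^ b): 1 / 4 for a in (0, 1) for b in (0, 1)}

print(satisfies_condition(independent_coins), satisfies_condition(xor_triple))
```

The check enumerates every combination of values, so its cost grows exponentially in n. It is meant for poking at small examples, not for anything large.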
