I'm confident enough in this take to write it as a PSA: playing music at medium-size-or-larger gatherings is a Chesterton's Fence situation.
It serves the very important function of reducing average conversation size: the louder the music, the more groups naturally split into smaller groups, as people on the far end develop (usually unconscious) common knowledge that it's too much effort to keep participating in the big one and that they can start a new conversation without being unduly disruptive.
If you've ever been at a party with no music where people gravitate towards a single group (or a handful of groups) of 8+ people, you've experienced the failure mode that this solves: usually these conversations are then actually conversations of 2-3 people with 5-6 observers, which is usually unpleasant for the observers and does not facilitate close interactions that easily lead to getting to know people.
By making it hard to have bigger conversations, the music naturally produces smaller ones; you can modulate the volume to have the desired effect on a typical discussion size. Quiet music (e.g. at many dinner parties) makes it hard to have conversations bigger than ~4-5, which is already a big improvement. Medium-volume music (think many bars) facilitates easy conversations of 2-3. The extreme end of this is dance clubs, where very loud music (not coincidentally!) makes it impossible to maintain conversations bigger than 2.
I suspect that high-decoupler hosts are just not in the habit of thinking "it's a party, therefore I should put music on," or even actively think "music makes it harder to talk and hear each other, and after all isn't that the point of a party?" But it's a very well-established cultural practice to play music at large gatherings, so, per Chesterton's Fence, you need to understand what function it plays. The function it plays is to stop the party-destroying phenomenon of big group conversations.
Having been to Lighthaven, does this still feel marginally worth it there, where we mostly tried to make it architecturally difficult to have larger conversations? I can see the case for music here, but like, I do think music makes it harder to talk to people (especially on the louder end), and that does seem like a substantial cost to me.
Talking 1-1 with music is so difficult for me that I don't enjoy a place if there's music. I expect many people on/towards the spectrum could be similar.
Having been at two LH parties, one with music and one without, I definitely ended up in the "large conversation with 2 people talking and 5 people listening"-situation much more in the party without music.
That said, I did find it much easier to meet new people at the party without music, as this also makes it much easier to join conversations that sound interesting when you walk past (being able to actually overhear them).
This might be one of the reasons why people tend to progressively increase the volume of the music during parties. First give people a chance to meet interesting people and easily join conversations. Then increase the volume to facilitate smaller conversations.
Yeah, when there's loud music it's much easier for me to understand people I know than people I don't, because I'm already used to their speaking patterns and can more easily infer what they said even when I don't hear it perfectly. And also because any misunderstanding or difficulty that arises out of not hearing each other well is less awkward with someone I already know than with someone I don't.
As someone who's spent meaningful amounts of time at LH during parties, absolutely yes. You successfully made it architecturally awkward to have large conversations, but that often cashes out as "there's a giant conversation group in and totally blocking [the Entry Hallway Room of Aumann]/[the lawn between A&B]/[one or another firepit and its surrounding walkways]; that conversation group is suffering from the failure modes described above, but no one in it is sufficiently confident or agentic or charismatic to successfully break out into a subgroup/subconversation."
I'd recommend quiet music during parties? Or maybe even just a soundtrack of natural noises - birdsong and wind? rain and thunder? - to serve the purpose instead.
@habryka Forgot to comment on the changes you implemented to the soundscape at LH during the mixer - you may want to put a speaker in the Bayes window overlooking the courtyard firepit. People started congregating/pooling there (and notably not at the other firepit next to it!) because it was the locally quietest location, and then the usual failure modes of an attempted 12-person conversation ensued.
Seems cheap to get the info value, especially for quieter music? Can be expensive to set up a multi-room sound system, but it's probably most valuable in the room that is largest/most prone to large group formation, so maybe worth experimenting with a speaker playing some instrumental jazz or something. I do think the architecture does a fair bit of work already.
I’m being slightly off-topic here, but how does one "make it architecturally difficult to have larger conversations"? More broadly, the topic of designing spaces where people can think better/do cooler stuff/etc. is fascinating, but I don’t know where to learn more than the very basics of it. Do you know good books, articles, etc. on these questions, by any chance?
I like Christopher Alexander's stuff.
On the object level question, the way to encourage small conversations architecturally is to have lots of nooks that only fit 3-6 people.
“Nook”, a word which here includes both “circles of seats with no other easily movable seats nearby” and “easily accessible small rooms”.
Thanks! I knew of Alexander, but you reminded me that I’ve been procrastinating on tackling the 1,200+ pages of A Pattern Language for a few months, and I’ve now started reading it :-)
It was one giant cluster the last two times I was there. In the outside area. Not sure why the physical space arrangement wasn't working. I guess walking into a cubby feels risky/imposing, and leaving feels rude. I would have liked it to work.
I'm not sure how you could improve it. I was trying to think of something last time I was there. "Damn all these nice cubbies are empty." I could not think of anything.
Just my experience.
I agree music has this effect, but I think the Fence is mostly because it also hugely influences the mood of the gathering, i.e. of the type and correlatedness of people's emotional states.
(Music also has some costs, although I think most of these aren't actually due to the music itself and can be avoided with proper acoustical treatment. E.g. people sometimes perceive music as too loud because the emitted volume is literally too high, but ime people often say this when the noise is actually overwhelming for other reasons, like echo (insofar as walls/floor/ceiling are near/hard/parallel), or bass traps/standing waves (such that the peak amplitude of the perceived wave is above the painfully loud limit, even though the average amplitude is fine; in the worst cases, this can result in barely being able to hear the music while simultaneously perceiving it as painfully loud!))
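To make the standing-wave point a bit more concrete, here's a minimal sketch (the speed of sound and the room dimensions are assumed round numbers, not measurements of any particular venue) of where the axial room modes of a rectangular room land; near those frequencies, the pressure at a wall or in a corner can sit well above the room-average level:

```python
SPEED_OF_SOUND = 343.0  # m/s, approximate speed of sound in air at room temperature

def axial_modes(length_m: float, count: int = 5) -> list[float]:
    """First `count` axial room-mode frequencies (Hz) along one dimension: f_n = n * c / (2 * L)."""
    return [n * SPEED_OF_SOUND / (2 * length_m) for n in range(1, count + 1)]

for dim in (8.0, 5.0, 3.0):  # hypothetical room dimensions in meters
    print(f"{dim} m dimension -> first axial modes at", [round(f) for f in axial_modes(dim, 3)], "Hz")
```

Below roughly 100 Hz these modes are sparse, which is part of why bass in an untreated room tends to read as "too loud" in some spots and missing in others.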
Other factors to consider:
1. Gatherings with generous alcohol drinking tend to have louder music, because alcohol relaxes the inner ear muscles, so less vibration gets conveyed and sound is dampened: anyone drinking alcohol experiences lower sound volumes. This means that a comfortable volume for a drunk person is quite a bit higher than for a sober person - a fact that can be quite unpleasant if you are the designated driver! I always try to remember to bring earplugs if I'm going to be the designated driver for a group going out drinking.
If you are drinking less than the average amount of alcohol at a social gathering, chances are your opinion of the music will be that it is too loud.
2. The intent of the social gathering in some cases is to facilitate good conversations. In such a case the person managing the music (host or DJ) should be thoughtful of this, and aim for a 'coffee shop' vibe with quiet background music and places to go in the venue where the music dwindles away.
In the alternate case, where the intent of the party is to facilitate social connection and/or flirtation and/or fun dancing... then the host / DJ may be actively pushing the music loud to discourage any but the most minimal conversation, trying to get people to drink alcohol and dance rather than talk, and at most have brief simple 1-1 conversations. A dance club is an example of a place deliberately aiming for this end of the spectrum.
So, in designing a social gathering, these factors are definitely something to keep in mind. What are the goals of the gathering? How much, if any, alcohol will the guests be drinking? If you have put someone in charge of controlling the music, are they on the same page about this? Or are they someone who is used to controlling music in a way appropriate to dance hall style scenarios and will default to that?
In regards to intellectual-discussion-focused gatherings, I do actually think that there can be a place for gatherings where only a small subset of people talk... but I agree this shouldn't be the default. The scenario where I think this makes sense is something more like a debate club or mini-lecture, with people taking turns to ask questions or challenge the lecturer's assumptions. This is less a social gathering and more an educational experience, but it can certainly sit on the borderlands between coffeeshop-style small-group conversation and a formal academic setting. Rousing debates, speeches, or mini-lectures on topics the group finds interesting, relevant, and important can be both an educational experience and a fun social experience to perform or watch. I think this is something that needs more planning and structure to go well, and people should know in advance that it's intended and what rules the audience will be expected to follow regarding interruptions, etc.
Wow, I had no idea about the effects of alcohol on hearing! It makes so much sense - I never drink and I hate how loud the music is at parties!
get people to drink alcohol and dance rather than talk
Also important to notice that restaurants and bars are not fully aligned with your goals. On one hand, if you feel good there, you are likely to come again, and thus generate more profit for them -- this part is win/win. On the other hand, it is better for them if you spend less time talking (even if that's what you like), and instead eat and drink more, and then leave, so that other paying customers can come -- that part is win/lose.
(Could restaurants become better aligned if instead of food we paid them for time? I suspect this would result in other kind of frustrating actions, such as them taking too much time to bring the food in very small portions.)
So while it is true that the music serves a socially useful purpose, it also serves a profit-increasing purpose, so I suspect that the usual volume of music we are used to is much higher than would be socially optimal.
I also like Lorxus's proposal of playing natural noises instead.
Could restaurants become better aligned if instead of food we paid them for time?
The “anti-café” concept is like this. I've never been to one myself, but I've seen descriptions on the Web of a few of them existing. They don't provide anything like restaurant-style service, from what I've heard; instead, there are often cheap or free snacks along the lines of what an office break room might carry, along with other amenities, and you pay for the amount of time you spend there.
I think a restaurant where you paid for time, if the food was nothing special, would quickly turn into a coworking space. Maybe it would be more open-office and more amenable to creative, conversational, interpersonal work rather than laptop work. You probably want it to be a cafe - or at least look like a cafe from the outside in signage / branding; you may want architectural sound dampening like a Denny's booth. You could sell pre-packaged food and sodas, even though that isn't what people are there for. Or you could even sell or rent activities like coloring books, simple social tabletop games, small toys, lockpicking practice locks, tiny marshmallow-candle s'more sets, and so on.
Unfortunately different people have different levels of hearing ability, so you're not setting the conversation size at the same level for all participants. If you set the volume too high, you may well be excluding some people from the space entirely.
I think that people mostly put music on in these settings as a way to avoid awkward silences and to create the impression that the room is more active than it is, whilst people are arriving. If this is true, then it serves no great purpose once people have arrived and are engaged in conversation.
Another important consideration is sound-damping. I've been in venues where there's no music playing and the conversations are happening between 3-5 people, but everyone is shouting to be heard above the crowd, and it's incredibly difficult for someone with hearing damage to participate at all. This is primarily a result of hard, echoey walls and very few soft furnishings.
I think there's something to be said for having different areas with different noise levels, allowing people to choose what they're comfortable with, and observing where they go.
which is usually unpleasant for the observers
It seems to me that this claim has a lot to overcome, given that the observers could walk away at any time.
does not facilitate close interactions that easily lead to getting to know people.
Is that a goal? I've never been much of a partygoer, but if I want to have a one-on-one conversation with somebody and get to know them, a party is about the last place I'd think about going. Too many annoying interruptions.
The function it plays is to stop the party-destroying phenomenon of big group conversations.
It may do that, but that doesn't necessarily mean that that's the function. You could equally well guess that its function was to exclude people who don't like loud music, since it also does that.
This is an incredible insight! From this I think we can design better nightclub-like social spaces for people who don't like loud sounds (such as people in this community with signal-processing issues due to autism).
One idea I have is to do it in the digital realm: a VR chat silent nightclub where the sound falloff is super high (perhaps this exists?). Or a 2D top-down equivalent. I will note that Gather Town is backwards - the sound radius is so large that there are still lots of lurkers, but at the same time you can't read people's body language from across the room - instead, the emotive radius from webcam / face-tracking needs to be larger than the sound radius. Or you can have a trad UI with "rooms" of very small size that you have to join to talk. Tricky to get that kind of app right, though, since IRL there's a fluid boundary between being in and out of a convo, and a binary demarcation would be subtly unpleasant.
Another idea is to find alternative ways to sound-isolate in meatspace. Other people have talked about architectural approaches like in Lighthaven. Or imagine a party where everyone had to wear earplugs: sound falls off with the square of distance, and you can calculate how many decibels you need to deafen everyone by to get the group sizes you want (a rough version of that calculation is sketched below). Or a party with a rule that you have to plug your ears when you aren't actively in a conversation.
Or you could lay out some hula hoops with space between them, and the rule is you can only talk within a hula hoop with the other people in it, and you can't listen in on someone else's hula hoop convo. You have to plug your ears as you walk around. Better get real comfortable with your friends! Maybe secretly you can move the hoops around to combine into bigger groups if you are really motivated. Or, with way more effort, you could similarly do a bed-fort building competition.
These are very cheap experiments!
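Here's a rough back-of-the-envelope version of that decibel calculation. The speech level at 1 m and the intelligibility floor are made-up round numbers, it assumes free-field inverse-square falloff, and it ignores background noise and reverberation, so treat it as a sketch rather than a spec:

```python
import math

# All numbers below are rough assumptions, not measurements.
SPEECH_DB_AT_1M = 60.0      # typical conversational speech level at 1 m
INTELLIGIBILITY_DB = 45.0   # rough level below which following speech gets hard

def audible_radius(attenuation_db: float) -> float:
    """Distance (m) at which attenuated speech hits the intelligibility floor,
    using free-field inverse-square falloff: level drops by 20*log10(d) dB at distance d."""
    headroom_db = SPEECH_DB_AT_1M - attenuation_db - INTELLIGIBILITY_DB
    return 10 ** (headroom_db / 20)

def required_attenuation(target_radius_m: float) -> float:
    """Earplug attenuation (dB) needed so speech stops being intelligible beyond target_radius_m."""
    return SPEECH_DB_AT_1M - INTELLIGIBILITY_DB - 20 * math.log10(target_radius_m)

for plugs_db in (0, 10, 20, 30):
    print(f"{plugs_db:2d} dB of earplug attenuation -> conversations reach ~{audible_radius(plugs_db):.1f} m")
print(f"To cap conversations at ~1.5 m, you need ~{required_attenuation(1.5):.0f} dB of attenuation")
```

Roughly, every 6 dB of attenuation halves the distance at which a conversation still works, which is why fairly modest earplugs seem to cap group sizes so effectively.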
I just ran a party where everyone was required to wear earplugs. I think this did effectively cap the max size of groups at 5 people, past which people tend to split into mini conversations. People say the initial silence feels a bit odd though. I'm definitely going to try this more
I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable (edit to add: whereas successfully proposing minor changes achieves hard-to-reverse progress, making ideal policy look more reasonable).
I'm not familiar with much systematic empirical evidence on either side, but it seems to me that the more effective actors in the DC establishment are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies, but it seems possible that at least some "Overton Window-moving" strategies, as executed in practice, have negative effects (associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who lean heavily on signals of credibility and consensus when quickly evaluating policy options) that outweigh the positive effects (increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies).
In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.
Would be interested in empirical evidence on this question (ideally actual studies from psych, political science, sociology, econ, etc literatures, rather than specific case studies due to reference class tennis type issues).
These are plausible concerns, but I don't think they match what I see as a longtime DC person.
We know that the legislative branch is less productive in the US than it has been in any modern period, and fewer bills get passed (there are many different metrics for this, but one is https://www.reuters.com/graphics/USA-CONGRESS/PRODUCTIVITY/egpbabmkwvq/). Those bills that do get passed tend to be bigger swings as a result -- either a) transformative legislation (e.g., Obamacare, Trump tax cuts and COVID super-relief, Biden Inflation Reduction Act and CHIPS) or b) big omnibus "must-pass" bills like FAA reauthorization, into which many small proposals get added.
I also disagree with the claim that policymakers focus on credibility and consensus generally, except perhaps in the executive branch to some degree. (You want many executive actions to be noncontroversial "faithfully executing the laws" stuff, but I don't see that as "policymaking" in the sense you describe it.)
In either of those, it seems like the current legislative "meta" favors bigger policy asks, not small wins, and I'm having trouble thinking of anyone I know who's impactful in DC who has adopted the opposite strategy. What are examples of the small wins that you're thinking of as being the current meta?
Agree with lots of this– a few misc thoughts [hastily written]:
I agree that more research on this could be useful. But I think it would be most valuable to focus less on "is X in the Overton Window" and more on "is X written/explained well and does it seem to have clear implications for the target stakeholders?"
Quick reactions:
Unless you're talking about financial conflicts of interest, but there are also financial incentives for orgs pursuing a "radical" strategy to downplay boring real-world constraints, as well as social incentives (e.g. on LessWrong IMO) to downplay these constraints, and cognitive biases against thinking your preferred strategy has big downsides.
It's not just that problem, though: they will likely also be biased to think that their policy is helpful for AI safety at all, and this is a point that sometimes gets forgotten.
But you're correct that Akash's argument is fully general.
Recently, John Wentworth wrote:
Ingroup losing status? Few things are more prone to distorted perception than that.
And I think this makes sense (e.g. Simler's Social Status: Down the Rabbit Hole which you've probably read), if you define "AI Safety" as "people who think that superintelligence is serious business or will be some day".
The psych dynamic that I find helpful to point out here is Yud's Is That Your True Rejection post from ~16 years ago. A person who hears about superintelligence for the first time will often respond to their double-take at the concept by spamming random justifications for why that's not a problem (which, notably, feels like legitimate reasoning to that person, even though it's not). An AI-safety-minded person becomes wary of being effectively attacked by high-status people immediately turning into what is basically a weaponized justification machine, and develops a deep drive wanting that not to happen. Then justifications ensue for wanting that to happen less frequently in the world, because deep down humans really don't want their social status to be put at risk (via denunciation) on a regular basis like that. These sorts of deep drives are pretty opaque to us humans but their real world consequences are very strong.
Something that seems more helpful than playing whack-a-mole whenever this issue comes up is having more people in AI policy putting more time into improving perspective. I don't see shorter paths to increasing the number of people-prepared-to-handle-unexpected-complexity than giving people a broader and more general thinking capacity for thoughtfully reacting to the sorts of complex curveballs that you get in the real world. Rationalist fiction like HPMOR is great for this, as well as others e.g. Three Worlds Collide, Unsong, Worth the Candle, Worm (list of top rated ones here). With the caveat, of course, that doing well in the real world is less like the bite-sized easy-to-understand events in ratfic, and more like spotting errors in the methodology section of a study or making money playing poker.
I think, given the circumstances, it's plausibly very valuable e.g. for people already spending much of their free time on social media or watching stuff like The Office, Garfield reruns, WWI and Cold War documentaries, etc, to only spend ~90% as much time doing that and refocusing ~10% to ratfic instead, and maybe see if they can find it in themselves to want to shift more of their leisure time to that sort of passive/ambient/automatic self-improvement productivity.
I'm not a decel, but the way this stuff often is resolved is that there are crazy people that aren't taken seriously by the managerial class but that are very loud and make obnoxious asks. Think the evangelicals against abortion or the Columbia protestors.
Then there is some elite, part of the managerial class, that makes reasonable policy claims. For abortion, this is Mitch McConnell, being disciplined over a long period of time in choosing the correct judges. For Palestine, this is Blinken and his State Department bureaucracy.
The problem with decels is that theoretically they are part of the managerial class themselves. Or at least, they act like they are. They call themselves rationalists, read Eliezer and Scott Alexander, and whatnot. But the problem is that it's very hard for an uninterested third party to take these bogus Overton Window claims seriously when they come from people who were supposed to be measured, part of the managerial class.
You need to split. There are the crazy ones that people don't take seriously but who will move the managerial class. And there are the serious people that EA money will send to D.C. to work at Blumenthal's office. This person needs to make small policy requests that will sabotage AI, without looking like it. And slowly, you get policy wins and you can sabotage the whole effort.
[reposting from Twitter, lightly edited/reformatted] Sometimes I think the whole policy framework for reducing catastrophic risks from AI boils down to two core requirements -- transparency and security -- for models capable of dramatically accelerating R&D.
If you have a model that could lead to general capabilities much stronger than human-level within, say, 12 months, by significantly improving subsequent training runs, the public and scientific community have a right to know this exists and to see at least a redacted safety case; and external researchers need to have some degree of red-teaming access. Probably various other forms of transparency would be useful too. It feels like this is a category of ask that should unite the "safety," "ethics," and "accelerationist" communities?
And the flip side is that it's very important that a model capable of kicking off that kind of dash to superhuman capabilities not get stolen/exfiltrated, such that you don't wind up with multiple actors facing enormous competitive tradeoffs to rush through this process.
These have some tradeoffs, especially as you approach AGI -- e.g. if you develop a system that can do 99% of foundation model training tasks and your security is terrible you do have some good reasons not to immediately announce it -- but not if we make progress on either of these before then, IMO. What the Pareto Frontier of transparency and security looks like, and where we should land on that curve, seems like a very important research agenda.
If you're interested in moving the ball forward on either of these, my colleagues and I would love to see your proposal and might fund you to work on it!
And the flip side is that it's very important that a model capable of kicking off that kind of dash to superhuman capabilities not get stolen/exfiltrated, such that you don't wind up with multiple actors facing enormous competitive tradeoffs to rush through this process.
Is it? My sense is the race dynamics get worse if you are worried that your competitor has access to a potentially pivotal model but you can't verify that because you can't steal it. My guess is the best equilibrium is major nations being able to access competing models.
Also, at least given present compute requirements, a smaller actor stealing a model is not that dangerous, since you need to invest hundreds of millions into compute to use the model for dangerous actions, which is hard to do secretly (though how much dangerous inference will actually cost is something I am quite confused about).
In general I am not super confident here, but I at least really don't know what the sign of hardening models against exfiltration with regards to race dynamics is.
My sense is the race dynamics get worse if you are worried that your competitor has access to a potentially pivotal model but you can't verify that because you can't steal it. My guess is the best equilibrium is major nations being able to access competing models.
What about limited API access to all actors for verification (aka transparency) while still having security?
It's really hard to know that the other party is giving you API access to their most powerful model. If you could somehow verify that the API you are accessing is indeed directly hooked up to their most powerful model, and that the capabilities of that model aren't being intentionally hobbled to deceive you, then I do think this gets you a lot of the same benefit.
Some of the benefit is still missing though. I think lack of moats is a strong disincentive to develop technology, and so in a race scenario you might be a lot less tempted to make a mad sprint towards AGI if you think your opponents can catch up almost immediately, and so you might end up with substantial timeline-accelerating effects by enabling better moats.
I do think the lack-of-moat benefit is smaller than the verification benefit.
I think it should be possible to get a good enough verification regime in practice with considerable effort. It's possible that sufficiently good verification occurs by default due to spies.
I agree there will potentially be a lot of issues downstream of verification issues by default.
I think lack of moats is a strong disincentive to develop technology, and so in a race scenario you might be a lot less tempted to make a mad sprint towards AGI if you think your opponents can catch up almost immediately
Hmm, this isn't really how I model the situation with respect to racing. From my perspective, the question isn't "security or no security", but is instead "when will you have extreme security".
(My response might overlap with tlevin's, I'm not super sure.)
Here's an example way things could go:
In this scenario, if you had extreme security ready to go earlier, then the US would potentially have a larger lead and better negotiating position. I think this probably gets you longer delays prior to qualitatively wildly superhuman AIs in practice.
There is a case that if you don't work on extreme security in advance, then there will naturally be a pause to implement this. I'm a bit skeptical of this in practice, especially in short timelines. I also think that the timing of this pause might not be ideal - you'd like to pause when you already have transformative AI rather than before.
Separately, if you imagine that USG is rational and at least somewhat aligned, then I think security looks quite good, though I can understand why you wouldn't buy this.
Hmm, this isn't really how I model the situation with respect to racing. From my perspective, the question isn't "security or no security"
Interesting, I guess my model is that the default outcome (in the absence of heroic efforts to the contrary) is indeed "no security for nation state attackers", which as far as I can tell is currently the default for practically everything that is developed using modern computing systems. Getting to a point where you can protect something like the weights of an AI model from nation state actors would be extraordinarily difficult and an unprecedented achievement in computer security, which is why I don't expect it to happen (even as many actors would really want it to happen).
My model of cybersecurity is extremely offense-dominated for anything that requires internet access or requires thousands of people to have access (both of which I think are quite likely for deployed weights).
Interesting. I would have to think harder about whether this is a tractable problem. My gut says it's pretty hard to build confidence here without leaking information, but I might be wrong.
If probability of misalignment is low, probability of human+AI coups (including e.g. countries invading each other) is high, and/or there aren't huge offense-dominant advantages to being somewhat ahead, you probably want more AGI projects, not fewer. And if you need a ton of compute to go from an AI that can do 99% of AI R&D tasks to an AI that can cause global catastrophe, then model theft is less of a factor. But the thing I'm worried about re: model theft is a scenario like this, which doesn't seem that crazy:
So, had the weights not been available to Y, X would be confident that it had 12 + 5 months to manage a capabilities explosion that would have happened in 8 months at full speed; it can spend >half of its compute on alignment/safety/etc, and it has 17 rather than 5 months of serial time to negotiate with Y, possibly develop some verification methods and credible mechanisms for benefit/power-sharing, etc. If various transparency reforms have been implemented, such that the world is notified in ~real-time that this is happening, there would be enormous pressure to do so; I hope and think it will seem super illegitimate to pursue this kind of power without these kinds of commitments. I am much more worried about X not doing this and instead just trying to grab enormous amounts of power if they're doing it all in secret.
[Also: I just accidentally went back a page by command-open bracket in an attempt to get my text out of bullet format and briefly thought I lost this comment; thank you in your LW dev capacity for autosave draft text, but also it is weirdly hard to get out of bullets]
I expect that having a nearly-AGI-level AI, something capable of mostly automating further ML research, means the ability to rapidly find algorithmic improvements that result in:
1. drastic reductions in training cost for an equivalently strong AI.
- Making it seem highly likely that a new AI trained using this new architecture/method and a similar amount of compute as the current AI would be substantially more powerful. (thus giving an estimate of time-to-AGI)
- Making it possible to train a much smaller cheaper model than the current AI with the same capabilities.
2. speed-ups and compute-efficiency for inference on current AI, and for the future cheaper versions
3. ability to create and deploy more capable narrow tool-AIs which seem likely to substantially shift military power when deployed to existing military hardware (e.g. better drone piloting models)
4. ability to create and deploy more capable narrow tool-AIs which seem likely to substantially increase economic productivity of the receiving factories.
5. ability to rapidly innovate in non-ML technology, and thereby achieve military and economic benefits.
6. ability to create and deploy self-replicating weapons which would kill most of humanity (e.g. bioweapons), and also to create targeted ones which would wipe out just the population of a specific country.
If I were the government of a country in which such a tech were being developed, I would really not want other countries to be able to steal it. It would not seem like a worthwhile trade-off that the thieves would then have a more accurate estimate of how far from AGI my country's company was.
but also it is weirdly hard to get out of bullets
Just pressing enter twice seems to work well-enough for me, though I feel like I vaguely remember some bugged state where that didn't work.
Company/country X has an AI agent that can
do 99%[edit: let's say "automate 90%"] of AI R&D tasks, call it Agent-GPT-7, and enough of a compute stock to have that train a significantly better Agent-GPT-8 in 4 months at full speed ahead, which can then train a basically superintelligent Agent-GPT-9 in another 4 months at full speed ahead. (Company/country X doesn't know the exact numbers, but their 80% CI is something like 2-8 months for each step; company/country Y has less info, so their 80% CI is more like 1-16 months for each step.)
Spicy take: it might be more realistic to subtract 1 or even 2 from the numbers for the GPT generations, and also to consider that the intelligence explosion might be quite widely-distributed: https://www.lesswrong.com/posts/wr2SxQuRvcXeDBbNZ/bogdan-ionut-cirstea-s-shortform?commentId=6EFv8PAvELkFopLHy
I strongly disagree, habryka, on the basis that I believe LLMs are already providing some uplift for highly harmful offense-dominant technology (e.g. bioweapons). I think this effect worsens the closer you get to full AGI. The inference cost to do this, even with a large model, is trivial. You just need to extract the recipe.
This gives a weak state-actor (or wealthy non-state-actor) that has high willingness to undertake provocative actions the ability to gain great power from even temporary access to a small amount of inference from a powerful model. Once they have the weapon recipe, they no longer need the model.
I'm also not sure about tlevin's argument about 'right to know'. I think the State has a responsibility to protect its citizens. So I certainly agree the State should be monitoring closely all the AI companies within its purview. On the other hand, making details of the progress of the AI publicly known may lead to increased international tensions or risk of theft or terrorism. I suspect it's better that the State have inspectors and security personnel permanently posted in the AI labs, but that the exact status of the AI progress be classified.
I think the costs of biorisks are vastly smaller than AGI-extinction risk, and so they don't really factor into my calculations here. Having intermediate harms before AGI seems somewhat good, since it seems more likely to cause rallying around stopping AGI development, though I feel pretty confused about the secondary effects here (but am pretty confident the primary effects are relatively unimportant).
I think that doesn't really make sense, since the lowest hanging fruit for disempowering humanity routes through self-replicating weapons. Bio weapons are the currently available technology which is in the category of self-replicating weapons. I think that would be the most likely attack vector for a rogue AGI seeking rapid coercive disempowerment.
Plus, having bad actors (human or AGI) have access to a tech for which we currently have no practical defense, which could wipe out nearly all of humanity for under $100k... seems bad? Just a really unstable situation to be in?
I do agree that it seems unlikely that some terrorist org is going to launch a civilization-ending bioweapon attack within the remaining 36 months or so until AGI (or maybe even ASI). But I do think that manipulating a terrorist org into doing this, and giving them the recipe and supplies to do so, would be a potentially tempting tactic for a hostile AGI.
I think if AI kills us all it would be because the AI wants to kill us all. It is (in my model of the world) very unlikely to happen because someone misuses AI systems.
I agree that bioweapons might be part of that, but actually killing everyone via bioweapons requires extensive planning and deployment strategies, which humans won't want to execute (since they don't want to die). So if bioweapons are involved in all of us dying, it will very likely be because an AI saw using them as an opportunity to take over, which I think is unlikely to happen because someone runs some leaked weights on some small amount of compute (or like, that would happen years after the same AIs would have done the same when run on the world's largest computing clusters).
In general, for any story of "dumb AI kills everyone" you need a story for why a smart AI hasn't killed us first.
I think if AI kills us all it would be because the AI wants to kill us all. It is (in my model of the world) very unlikely to happen because someone misuses AI systems.
I agree that it seems more likely to be a danger from AI systems misusing humans than humans misusing the AI systems.
What I don't agree with is jumping forward in time to thinking about when there is an AI so powerful it can kill us all at its whim. In my framework, that isn't a useful time to be thinking about, it's too late for us to be changing the outcome at that point.
The key time to be focusing on is the time before the AI is sufficiently powerful to wipe out all of humanity, and there is nothing we can do to stop it.
My expectation is that this period of time could be months or even several years, where there is an AI powerful enough and agentic enough to make a dangerous-but-stoppable attempt to take over the world. That's a critical moment for potential success, since potentially the AI will be contained in such a way that the threat will be objectively demonstrable to key decision makers. That would make for a window of opportunity to make sweeping governance changes, and further delay take-over. Such a delay could be super valuable if it gives alignment research more critical time for researching the dangerously powerful AI.
Also, the period of time between now and when the AI is that powerful is one where AI-as-a-tool makes it easier and easier for humans aided by AI to deploy civilization-destroying self-replicating weapons. Current AIs are already providing non-zero uplift (both lowering barriers to access, and raising peak potential harms). This is likely to continue to rapidly get worse over the next couple years. Delaying AGI doesn't much help with biorisk from tool AI, so if you have a 'delay AGI' plan then you need to also consider the rapidly increasing risk from offense-dominant tech.
Also - I'm not sure I'm getting the thing where verifying that your competitor has a potentially pivotal model reduces racing?
Same reason that knowing how many nukes your opponent has reduces racing. If you are conservative, the uncertainty in how far ahead your opponent is causes escalating races, even if you would both rather not escalate (as long as your mean is well-calibrated).
E.g. let's assume you and your opponent are de facto equally matched in the capabilities of your systems, but both have substantial uncertainty, e.g. each assigns 30% probability to the other being substantially ahead. Then if you think those 30% of worlds are really bad, you will probably invest a bunch more into developing your systems (which your opponent will of course observe and respond to by increasing their own investment, and then you repeat).
However, if you can both verify how many nukes you have, you can reach a more stable equilibrium even under more conservative assumptions.
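Here's a toy numerical version of that loop; the 30% figure and the linear response rule are purely illustrative, not a model of how states actually behave:

```python
# Toy model of the escalation loop described above; all numbers are illustrative.
def simulate(rounds: int, verified: bool, p_opponent_ahead: float = 0.3) -> list[float]:
    """Each side starts evenly matched; without verification, each hedges against the
    p_opponent_ahead worlds by boosting investment, which the other side observes and matches."""
    investment_a = investment_b = 1.0
    history = []
    for _ in range(rounds):
        if verified:
            escalation_a = escalation_b = 0.0  # both sides can see they're evenly matched
        else:
            escalation_a = p_opponent_ahead * investment_b  # hedge against "they're ahead" worlds
            escalation_b = p_opponent_ahead * investment_a
        investment_a += escalation_a
        investment_b += escalation_b
        history.append(investment_a)
    return history

print("no verification:  ", [round(x, 2) for x in simulate(6, verified=False)])
print("with verification:", [round(x, 2) for x in simulate(6, verified=True)])
```

The point is just that fear-driven hedging compounds: each side's response to uncertainty becomes the other side's evidence that it needs to invest more.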
Gotcha. A few disanalogies though -- the first two specifically relate to the model theft/shared access point, the last is true even if you had verifiable API access:
Me verifying how many nukes you have doesn't mean I suddenly have that many nukes, unlike AI model capabilities (where getting the weights does mean that), though due to compute differences it does not mean we suddenly have the same time-distance to superintelligence.
It's not super clear whether from a racing perspective having an equal number of nukes is bad. I think it's genuinely messy (and depends quite sensitively on how much actors are scared of losing vs. happy about winning vs. scared of racing).
I do also currently think that the compute-component will likely be a bigger deal than the algorithmic/weights dimension, making the situation more analogous to nukes, but I do think there is a lot of uncertainty on this dimension.
Me having more nukes only weakly enables me to develop more nukes faster, unlike AI that can automate a lot of AI R&D.
Yeah, totally agree that this is an argument against proliferation, and an important one. While you might not end up with additional racing dynamics, the fact that more global resources can now use the cutting edge AI system to do AI R&D is very scary.
This model seems to assume you have an imprecise but unbiased estimate of how many nukes I have, but companies will probably be underestimating each other's proximity to superintelligence, for the same reason that they're underestimating their own proximity to superintelligence, until it's way more salient/obvious.
In-general I think it's very hard to predict whether people will overestimate or underestimate things. I agree that literally right now countries are probably underestimating it, but an overreaction in the future also wouldn't surprise me very much (in the same way that COVID started with an underreaction, and then was followed by a massive overreaction).
It's not super clear whether from a racing perspective having an equal number of nukes is bad. I think it's genuinely messy (and depends quite sensitively on how much actors are scared of losing vs. happy about winning vs. scared of racing).
Importantly though, once you have several thousand nukes the strategic returns to more nukes drop pretty close to zero, regardless of how many your opponents have, while if you get the scary model's weights and then don't use them to push capabilities even more, your opponent maybe gets a huge strategic advantage over you. I think this is probably true, but the important thing is whether the actors think it might be true.
In-general I think it's very hard to predict whether people will overestimate or underestimate things. I agree that literally right now countries are probably underestimating it, but an overreaction in the future also wouldn't surprise me very much (in the same way that COVID started with an underreaction, and then was followed by a massive overreaction).
Yeah, good point.