All of davekasten's Comments + Replies

We're hiring at ControlAI for folks who want to work on UK and US policy advocacy.  Come talk to Congress and Parliament and stop risks from unsafe superintelligences!  controlai.com/careers

(Admins: I don't tend to see many folks posting this sort of thing here, so feel free to nuke this post if it's not the sort of content you're going for.  But given the audience here, I figured it might be of interest.)

1Felix C.
Thank you for posting this. Are there any opportunities for students about to graduate to apply themselves, particularly without a C.S. background? My undergraduate experience was focused on Business and IR (Cold War history, Sino-U.S. relations) before I pivoted my long-term focus to AI safety policy, and it's been difficult to find good entry points for EA work in this field as a new grad. I've been monitoring 80,000 Hours and applying to research fellowships where I can so far, but I'm always looking for new positions. If you or anyone else knows an org looking to onboard some fresh talent, I'd be happy to help. Edit: Application submitted.
davekasten1310

I think I am too much inside the DC policy world to understand why this is seen as a gaffe, really.  Can you unpack why it's seen as a gaffe to them?  In the DC world, by contrast, "yes, of course, this is a major national security threat, and no you of course could never use military capabilities to address it," would be a gaffe.

habryka112

I mean, you saw people make fun of it when Eliezer said it, and then my guess is people conservatively assumed that this would generalize to the future. I've had conversations with people where they tried to convince me that Eliezer mentioning kinetic escalation was one of the worst things that anyone has ever done for AI policy, and they kept pointing to twitter threads and conversations where opponents made fun of it as evidence. I think there clearly was something real here, but I also think people really fail to understand the communication dynamics here.

I particularly appreciated its coverage of explicitly including conventional ballistic escalation as part of a sabotage strategy for datacenters

One thing I find very confusing about existing gaps between the AI policy community and the national security community is that natsec policymakers have already explicitly said that kinetic (i.e., blowing things up) responses are acceptable for cyberattacks under some circumstances, while the AI policy community seems to somehow unconsciously rule those sorts of responses out of the policy window.  (To be clear: any day that American servicemembers go into combat is a bad day, I don't think we should choose such approaches lightly.)

9habryka
My sense is that a lot of the x-risk oriented AI policy community is very focused on avoiding "gaffes" and has a very short-term and opportunistic relationship with reputation and public relations and all that kind of stuff. My sense is that people in the space don't believe being principled or consistently honest basically ever gets rewarded or recognized, so the right strategy is to try to identify what the overton window is, only push very conservatively on expanding it, and focus on staying in the good graces of whatever process determines social standing, which is generally assumed to be pretty random and arbitrary. I think many people in the space, if pushed, would of course acknowledge that kinetic responses are appropriate in many AI scenarios, but they would judge it as an unnecessarily risky gaffe, and that perception of a gaffe creates a pretty effective enforcement regime for people to basically never bring it up, lest you be judged as politically irresponsible.

I think a lot of this boils down to the fact that Sam Vimes is a copper, and sees poverty lead to precarity, and precarity lead to Bad Things Happening In Bad Neighborhoods.  The most salient fact about Lady Sybil is that she never has to worry, never is on the rattling edge; she's always got more stuff, new stuff, old stuff, good stuff.  Vimes (at that point in the Discworld series) isn't especially financially sophisticated, so he narrows it down to the piece he understands best, and builds a theory off of that.

2philh
Hm, does he? It's certainly a reasonable guess, but offhand I don't remember it coming up in the books, and the Thieves and Assassins guilds will change the dynamic compared to what we'd expect on Earth.

You can definitely meet your own district's staff locally (e.g., if you're in Berkeley, Congresswoman Simon has an office in Oakland, Senator Padilla has an office in SF, and Senator Schiff's offices don't look to be finalized yet but will undoubtedly include a Bay Area office).  

You can also meet most Congressional offices' staff via Zoom or phone (though some offices strongly prefer in-person meetings).  

There is also indeed a meaningful rationalist presence in DC, though opinions vary as to whether the enclave is in Adams Morgan-Columbia Heights... (read more)

2Alexander Gietelink Oldenziel
The people require it, sir.

The elites do want you to know it: you can just email a Congressional office and get a meeting

6Seth Herd
Would I have to go to DC? Because I hate going to DC. Not that I wouldn't to save the world, but I'd want to be sure it was necessary. Only partly kidding. Maybe if people got a rationalist enclave in DC going we'd be less averse?

I think on net, there are relatively fewer risks related to getting governments more AGI-pilled vs. them continuing on their current course; governments are broadly AI-pilled even if not AGI/ASI-pilled and are doing most of the accelerating actions an AGI-accelerator would want.

7Seth Herd
I wasn't able to finish that post in the few minutes I've got so far today, so here's the super short version. I remain highly uncertain whether my comments will include any mention of AGI. (Edit: I finally finished it: Whether governments will control AGI is important and neglected)

I think whether AGI-pilling governments is a good idea is quite complex. Pushing the government to become aware of AGI x-risks will probably decelerate progress, but it could even accelerate it if the conclusion is "build it first, don't worry, we'll be super careful when we get close". Even if it does help with alignment, it's not necessarily net good.

If governments take control early enough to prevent proliferation of AGI, that helps a lot with the risks of misalignment and catastrophic misuse. The US could even cooperate with China to prevent proliferation to other countries and to nongovernmental groups, just as the US cooperated with Russia on nuclear nonproliferation. But government control also raises the risks of power concentration. Intent-aligned AGI in untrustworthy hands could create a permanent dictatorship and unbreakable police state. The current governments of both the US and China don't seem like the best types to control the future.

So it's a matter of balancing fear of centralized power vs. fear of misaligned AGI. This also needs to be balanced against the possibility of misuse of intent-aligned AGI if it does proliferate broadly; see If we solve alignment, do we die anyway?

If I had a firm estimate of how hard technical alignment is, I'd have a better answer. But I don't, and I think the best objective conclusion, taking in all of the arguments made to date and the very wide variance in opinion even among those who've thought deeply about it, is that nobody has a very good estimate. (Edit: I mean estimates between very very hard and modestly tricky. I don't know of anyone who's addressed the hard parts and concluded that it happens by default.) Neither do we h
davekasten*13011

The Trump administration (or, more specifically, the White House Office of Science and Technology Policy, but they are in the lead on most AI policy, it seems) is asking for comment on what its AI Action Plan should include.  Literally anyone can comment on it.  You should consider commenting on it; comments are due Saturday at 8:59pm PT/11:59pm ET via an email address.  These comments will actually be read, and a large number of comments on an issue usually does influence any White House's policy.  I encourage you to submit comment... (read more)

In the future, there should be some organization or some group of individuals in the LW community who raise awareness about these sorts of opportunities and offer content and support to ensure submissions from the most knowledgeable and relevant actors. This seems like a very low-hanging fruit and is something several groups I know are doing.

6Seth Herd
Edit: I finished that post on this topic: Whether governments will control AGI is important and neglected. I'm hoping for discussion on that post, and I'm quite ready to change my draft comment, or not submit one, based on those arguments.

After putting a bunch of thought into it, my planned comment will recommend forming a committee that can work in private to investigate the opportunities and risks of AI development, to inform future policy. I will note that this was Roosevelt's response to Einstein's letter on the potential of nuclear weaponry. I hope that such a committee will conclude that yeah, there are some big dangers in expectation. I will emphasize the disagreement among experts, and suggest that the sane thing to do is put real effort into sorting out the many conflicting claims and possibilities, while also pursuing our current best guesses.

I think any request for a slowdown is wasted, given the request's note about reducing regulatory barriers. But I will note that there are both dangers to our economy from potential rapid job loss and large security risks from adversaries stealing or copying our AI, such that we may currently be building tools and weapons that will be used against us.

I think I will not emphasize x-risk, and may not even include it. But I will probably mention that predictions of reaching human-level autonomous operation are very mixed, so we're not sure how far we are from creating what's effectively a new intelligent species. I'm hoping that triggers the right intuitions of danger. Again, I'm highly uncertain and very open to changing my mind on what to say.

Original comment: This raises the question: what should we say? Fortunately, I've almost finished a post about this. It analyzes many aspects of the question "do we want governments to recognize the potential of AGI?". Unfortunately, it doesn't answer the question. There are strong points on both sides, and it needs more careful thought. Nonetheless, I'll probably get it

I think there's at least one missing scenario: "You wake up one morning and find out that a private equity firm has bought up a company everyone knows the name of, fired 90% of the workers, and says they can replace them with AI."

1mhampton
I agree that mass unemployment may spark policy change, but why do you see that change as being relevant to misalignment vs. specific to automation? 
2mako yass
Mm, scenario where mass unemployment can be framed as a discrete event with a name and a face. I guess I think it's just as likely there isn't an event, human-run businesses die off, new businesses arise, none of them outwardly emphasise their automation levels, the press can't turn it into a scary story because automation and foreclosures are nothing fundamentally new (only in quantity, but you can't photograph a quantity), the public become complicit by buying their cheaper higher quality goods and services so appetite for public discussion remains low.

This essay earns a read for the line, "It would be difficult to find a policymaker in DC who isn’t happy to share a heresy or two with you, a person they’ve just met" alone.

I would amplify this by suggesting that while many things are outside the Overton Window, policymakers are also aware of the concept of slowly moving the Overton Window, and if you explicitly admit you're doing that, they're usually on board (see, e.g., the conservative legal movement, the renewable energy movement, etc.).  It's mostly only if you don't realize that's what you're proposing that you trigger a dismissive response.

8David James
Right. To expand on this: there are also situations where an interest group pushes hard on a broader coalition to move faster, sometimes even accusing their partners or allies of “not caring enough” or “dragging their feet”. Assuming bad faith or impugning the motives of one’s allies can sour working relationships. Understanding the constraints in play goes a long way towards fostering compromise.

Ok, so it seems clear that we are, for better or worse, likely going to try to get AGI to do our alignment homework. 

Who has thought through all the other homework we might give AGI that is as good of an idea, assuming a model that isn't an instant-game-over for us?  E.G., I remember @Buck rattling off a list of other ideas that he had in his The Curve talk, but I feel like I haven't seen the list of, e.g., "here are all the ways I would like to run an automated counterintelligence sweep of my organization" ideas.

(Yes, obviously, if the AI is sne... (read more)

2Quinn
I'm working on making sure we get high quality critical systems software out of early AGI. Hardened infrastructure buys us a lot in the slightly crazy story of "self-exfiltrated model attacks the power grid", but buys us even more in less crazy stories about all the software modules adjacent to AGI having vulnerabilities rapidly patched at crunchtime.
3Ebenezer Dukakis
I think unlearning could be a good fit for automated alignment research. Unlearning could be a very general tool to address a lot of AI threat models. It might be possible to unlearn deception, scheming, manipulation of humans, cybersecurity, etc. I challenge you to come up with an AI safety failure story that can't, in principle, be countered through targeted unlearning in some way, shape, or form. Relative to some other kinds of alignment research, unlearning seems easy to automate, since you can optimize metrics for how well things have been unlearned. I like this post.
4Thane Ruthenis
Technology for efficient human uploading. Ideally backed by theory we can independently verify as correct and doing what it's intended to do (rather than e.g. replacing the human upload with a copy of the AGI who developed this technology).
5trevor
How to build a lie detector app/program to release to the public (preferably packaged with advice/ideas on ways to use it and strategies for marketing the app, e.g. packaging it with an animal-body-language-to-English translator).
1yams
Preliminary thoughts from Ryan Greenblatt on this here.
Buck142

@ryan_greenblatt is working on a list of alignment research applications. For control applications, you might enjoy the long list of control techniques in our original post.

Huh?  "fighting election misinformation" is not a sentence on this page as far as I can tell. And if you click through to the election page, you will see that the elections content is them praising a bipartisan bill backed by some of the biggest pro-Trump senators.  

-3ChristianKl
You are right, the wording is even worse. It says "Partnering with governments to fight misinformation globally". That would be more than just "election misinformation". I just tested that ChatGPT is willing to answer "Tell me about the latest announcement of the Trump administration about cutting USAID funding?" while Gemini isn't willing to answer that question, so in practice their policy isn't as bad as Gemini's.  It still sounds different from what Elon Musk advocates as "truth-aligned" AI. Lobbyists should be able to use AI to inform themselves about proposed laws. If you asked David Sacks, the person who coordinates AI policy, I'm very certain that he supports Elon Musk's idea that AI should help people learn the truth about political questions.  If they wanted to appeal to the current administration, they could say something about the importance of AI telling truthful information and not misleading the user, instead of speaking about "fighting misinformation". 
-1Maxwell Peterson
The Elections panel on OP’s image says “combat disinformation”, so while you’re technically right, I think Christian’s “fighting election misinformation” rephrasing is close enough to make no difference.

Without commenting on any strategic astronomy and neurology, it is worth noting that "bias", at least, is a major concern of the new administration (e.g., the Republican chair of the House Financial Services Committee is actually extremely worried about algorithmic bias being used for housing and financial discrimination and has given speeches about this).  

I am not a fan, but it is worth noting that these are the issues that many politicians bring up already, if they're unfamiliar with the more catastrophic risks. The only one missing there is job loss. So while this choice by OpenAI sucks, it sort of usefully represents a social fact about the policy waters they swim in.

3ChristianKl
The page does not seem to be directed at what's politically advantageous. The Trump administration, which fights DEI, is not looking favorably on the mission to prevent AI from reinforcing stereotypes even if those stereotypes are true. "Fighting election misinformation" is similarly a keyword that likely invites skepticism from the Trump administration. They just shut down USAID, and its investment in "combating misinformation" is one of the reasons for that. It seems to me more likely that they hired a bunch of woke and deep state people into their safety team and this reflects the priorities of those people.
7aog
I’m surprised they list bias and disinformation. Maybe this is a galaxy brained attempt to discredit AI safety by making it appear left-coded, but I doubt it. Seems more likely that x-risk focused people left the company while traditional AI ethics people stuck around and rewrote the website.

I am (sincerely!) glad that this is obvious to other people too and that they are talking about it already!

I mean, the literal best way to incentivize @Ricki Heicklen and me to do this again for LessOnline and Manifest 2025 is to create a prediction market on it, so I encourage you to do that

One point that maybe someone's made, but I haven't run across recently:  if you want to turn AI development into a Manhattan Project, you will by default face some real delays from the reorganization of private efforts into one big national effort.  In a close race, you might actually see pressures not to do so, because you don't want to give up 6 months to a year on reorg drama -- so in some possible worlds, the Project is actually a deceleration move in the short term, even if it accelerates in the long term!

3Nathan Helm-Burger
This is a point that's definitely come up in private discussions I've been a part of. I don't remember if I saw it said publicly somewhere.

Incidentally, spurred by @Mo Putera's posting of Vernor Vinge's A Fire Upon The Deep annotations, I want to remind folks that Vinge's Rainbows End is very good and doesn't get enough attention, and will give you a less-incorrect understanding of how national security people think.  

Oh, fair enough then, I trust your visibility into this.  Nonetheless one Should Just Report Bugs

Note for posterity that there has been at least $15K of donations since this got turned back on -- You Can Just Report Bugs

[This comment is no longer endorsed by its author]
3habryka
Those were mostly already in-flight, so not counterfactual (and also the fundraising post still has the donation link at the top), but I do expect at least some effect!

Ok, but you should leave the donation box up -- link now seems to not work?  I bet there would be at least several $K USD of donations from folks who didn't remember to do it in time.

5habryka
Oops, you're right, fixed. That was just an accident.

I think you're missing at least one strategy here.  If we can get folks to agree that different societies can choose different combos, so long as they don't infringe on some subset of rights to protect other societies, then you could have different societies expand out into various pieces of the future in different ways.  (Yes, I understand that's a big if, but it reduces the urgency/crux nature of value agreement). 

2jbash
Societies aren't the issue; they're mindless aggregates that don't experience anything and don't actually even have desires in anything like the way a human, or even an animal or an AI, has desires. Individuals are the issue. Do individuals get to choose which of these societies they live in?
4Noosphere89
I think the if condition either relies on an impossibility as presented, or requires you to exclude some human values, at which point you should at least admit that which values you choose to retain is a political decision, based on your own values.
2sloonz
I'm not missing that strategy at all. It's an almost certainty that any solution will have to involve something like that, barring some extremely strong commitment to Unity which by itself will destroy a lot of Values. But there are some pretty fundamental values that some people (even/especially here) care a lot about, like negative utilitarianism ("minimize suffering"), which are flatly incompatible with simple implementations of that solution. Negative utilitarians care very much about the total suffering in the universe, and their calculus does not stop at the boundaries of "different societies". And if you say "screw them", well, what about the guy who basically goes "let's create the baby-eaters society"? If you recoil at that, it means there's at least a bit of negative utilitarianism in you. Which is normal, don't worry; it's a pretty common human value, even in people who don't describe themselves as "negative utilitarians". Now you can recognize the problem, which is that every individual will have a different boundary in the Independence-Freedom-Diversity vs. Negative-Utilitarianism tradeoff (which I do not think is the only tradeoff/conflict, but clearly one of the biggest ones, if not THE biggest one, if you set aside transhumanism). And if you double down on the "screw them" solution? Well, you end up exactly in what I described with "even with perfect play, you are going to lose some Human Values". For it is a non-negligible chunk of Human Values.

Note that the production function of the 10x really matters.  If it's "yeah, we get to net-10x if we have all our staff working alongside it," it's much more detectable than, "well, if we only let like 5 carefully-vetted staff in a SCIF know about it, we only get to 8.5x speedup".  

(It's hard to prove that the results are from the speedup instead of just, like, "One day, Dario woke up from a dream with The Next Architecture in his head")

Basic clarifying question: does this imply, under the hood, some sort of diminishing returns curve, such that the lab pays for that labor until it reaches a net 10x faster rate of improvement, but can't squeeze out much more?

And do you expect that's a roughly consistent multiplicative factor, independent of lab size? (I mean, I'm not sure lab size actually matters that much, to be fair, it seems that Anthropic keeps pace with OpenAI despite being smaller-ish) 

5ryan_greenblatt
Yeah, for it to reach exactly 10x as good, the situation would presumably be that this was the optimum point given diminishing returns to spending more on AI inference compute. (It might be that the returns curve looks very punishing. For instance, many people get a relatively large amount of value from extremely cheap queries to 3.5 Sonnet on claude.ai and the inference cost of this is very small, but greatly increasing the cost (e.g. o1-pro) often isn't any better because 3.5 Sonnet already gave an almost perfect answer.) I don't have a strong view about AI acceleration being a roughly constant multiplicative factor independent of the number of employees. Uplift just feels like a reasonably simple operationalization.
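(A toy sketch of that "optimum point given diminishing returns" picture, purely as my own illustration rather than anything specified in the comment above: assume speedup is a concave, log-shaped function of inference spend, and the lab buys compute until the marginal speedup no longer pays for itself. The functional form and every number below are made-up assumptions.)

```python
import numpy as np

def speedup(spend, scale=3.0):
    # Assumed concave, log-like returns: each doubling of inference spend
    # buys roughly a fixed increment of speedup. Purely illustrative.
    return 1.0 + scale * np.log1p(spend)

def net_value(spend, value_per_speedup_unit=1.0):
    # Value of faster AI R&D minus the inference bill, in arbitrary units.
    return value_per_speedup_unit * speedup(spend) - spend

# The lab keeps buying inference until marginal value crosses marginal cost.
spends = np.linspace(0.01, 20.0, 2000)
best = spends[np.argmax(net_value(spends))]
print(f"optimal spend ~ {best:.2f}, speedup there ~ {speedup(best):.1f}x")
```

Under a curve like this, spending well past the optimum buys almost no extra speedup per marginal dollar, which is the "punishing returns" case described above.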

For the record: signed up for a monthly donation starting in Jan 2025.  It's smaller than I'd like given some financial conservatism until I fill out my taxes, may revisit it later.

Everyone who's telling you there aren't spoilers in here is well-meaning, but wrong.  But to justify why I'm saying that is also spoilery, so to some degree you have to take this on faith.

(Rot13'd for those curious about my justification: Bar bs gur znwbe cbvagf bs gur jubyr svp vf gung crbcyr pna, vs fhssvpvragyl zbgvingrq, vasre sne zber sebz n srj vfbyngrq ovgf bs vasbezngvba guna lbh jbhyq anviryl cerqvpg. Vs lbh ner gryyvat Ryv gung gurfr ner abg fcbvyref V cbyvgryl fhttrfg gung V cerqvpg Nfzbqvn naq Xbein naq Pnevffn jbhyq fnl lbh ner jebat.)

davekasten3917

Opportunities that I'm pretty sure are good moves for Anthropic generally: 

  1. Open an office literally in Washington, DC, that does the same work that any other Anthropic office does (i.e., NOT purely focused on policy/lobbying, though I'm sure you'd have some folks there who do that).  If you think you're plausibly going to need to convince policymakers on critical safety issues, having nonzero numbers of your staff that are definitively not lobbyists being drinking or climbing gym buddies that get called on the "My boss needs an opinion on this bi
... (read more)

FWIW re: the Dario 2025 comment, Anthropic very recently posted a few job openings for recruiters focused on policy and comms specifically, which I assume is a leading indicator for hiring. One plausible rationale there is that someone on the executive team smashed the "we need more people working on this, make it happen" button.

In an ideal world (perhaps not reasonable given your scale), you would have some sort of permissions and logging around sensitive types of queries on DM metadata.  (E.g., perhaps you would let any Lighthaven team member see the aggregate "rate of DMs from accounts <1 month in age compared to historic baseline" number on the dashboard, but "how many DMs has Bob (an account over 90 days old) sent to Alice" would require more guardrails.)

Edit: to be clear, I am comfortable with you doing this without such logging at your current scale and think it is reasonable to do so.
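(For concreteness, here's a minimal sketch of what tiered guardrails plus an audit log for metadata queries could look like. Everything here, including the tier names, the second-approver rule, and the example queries, is a hypothetical illustration of the idea above, not anything Lighthaven actually runs.)

```python
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("dm_metadata_audit")

# Hypothetical sensitivity tiers: aggregate dashboard stats are low-risk,
# queries about specific named accounts are high-risk.
AGGREGATE = "aggregate"
PER_USER = "per_user"

def run_metadata_query(requester, tier, query_fn, approved_by=None):
    """Run a DM-metadata query with tiered guardrails and an audit trail."""
    now = datetime.now(timezone.utc).isoformat()
    if tier == PER_USER and approved_by is None:
        audit_log.warning("DENIED %s query by %s at %s (no second approver)",
                          tier, requester, now)
        raise PermissionError("Per-user DM metadata queries need a second approver.")
    audit_log.info("%s query by %s (approved_by=%s) at %s",
                   tier, requester, approved_by, now)
    return query_fn()

# Any team member can pull the aggregate dashboard number:
new_account_dm_rate = run_metadata_query("team_member", AGGREGATE, lambda: 0.07)

# "How many DMs has Bob sent to Alice" needs a second set of eyes:
bob_to_alice = run_metadata_query("team_member", PER_USER, lambda: 3,
                                  approved_by="second_admin")
```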

7Karl Krueger
In a former job where I had access to logs containing private user data, one of the rules was that my queries were all recorded and could be reviewed. Some of them were automatically visible to anyone else with the same or higher level of access, so if I were doing something blatantly bad with user data, my colleagues would have a chance of noticing.

I have a few weeks off coming up shortly, and I'm planning on spending some of it monkeying around with AI and code stuff.  I can think of two obvious tacks: 1. go do some fundamentals learning on technical stuff I don't have hands-on technical experience with, or 2. go build some new fun stuff.

Does anyone have particular lists of learning topics / syllabi / similar things like that which would be a good fit for a "fairly familiar with the broad policy/technical space, but his largest shipped chunk of code is a few hundred lines of Python" person like me? 

3Joseph Miller
The ARENA curriculum is very good.

Note also that this work isn't just papers; e.g., as a matter of public record MIRI has submitted formal comments to regulators to inform draft regulation based on this work.  

(For those less familiar, yes, such comments are indeed actually weirdly impactful in the American regulatory system).

In a hypothetical, bad future where we have to do VaccinateCA 2.0 against e.g. bird flu, I personally wonder if "aggressively help people source air filters" would be a pre-vaccine-distribution-time step we would consider.  (Not canon!  Might be very wrong! Just idle musing)

Also, I would generally volunteer to help with selling Lighthaven as an event venue to boring consultant things that will give you piles of money, and IIRC Patrick Ward is interested in this as well, so please let us know how we can help. 

8habryka
That sounds great! Let's definitely chat about that. I'll reach out as soon as fundraising hustle has calmed down a bit.

I am excited for this on the grounds of "we deserve to have nice things," though for boring financial planning reasons I am not sure whether I will donate additional funds prior to calendar year end or in calendar year 2025.

(Note that I made a similar statement in the past and then donated $100 to Lighthaven very shortly thereafter, so, like, don't attempt to reverse-engineer my financial status from this or whatever.)

3davekasten
For the record: signed up for a monthly donation starting in Jan 2025.  It's smaller than I'd like given some financial conservatism until I fill out my taxes, may revisit it later.

Also, I would generally volunteer to help with selling Lighthaven as an event venue to boring consultant things that will give you piles of money, and IIRC Patrick Ward is interested in this as well, so please let us know how we can help. 

I think I'm also learning that people are way more interested in this detail than I expected! 

I debated changing it to "203X" when posting to avoid this becoming the focus of the discussion but figured, "eh, keep it as I actually wrote it in the workshop" for good epistemic hygiene.  

Oh, it very possibly is the wrongest part of the piece!  I put it in the original workshop draft as I was running out of time and wanted to provoke debate.

A brief gesture at a sketch of the intuition:  imagine a different, crueler world, where there were orders of magnitude more nation-states, but at the start only a few nuclear powers, like in our world, with a 1950s-level tech base.  If the few nuclear powers want to keep control, they'll have to divert huge chunks of their breeder reactors' output to pre-emptively nuking any site in the m... (read more)

Interesting! You should definitely think more about this and write it up sometime, either you'll change your mind about timelines till superintelligence or you'll have found an interesting novel argument that may change other people's minds (such as mine).

As you know, I have huge respect for USG natsec folks.  But there are (at least!) two flavors of them: 1) the cautious, measure-twice-cut-once sort that have carefully managed deterrence for decades, and 2) the "fuck you, I'm doing Iran-Contra" folks.  Which do you expect will end up in control of such a program?  It's not immediately clear to me which ones would.

4Orpheus16
@davekasten I know you posed this question to us, but I'll throw it back on you :) what's your best-guess answer? Or perhaps put differently: What do you think are the factors that typically influence whether the cautious folks or the non-cautious folks end up in charge? Are there any historical or recent examples of these camps fighting for power over an important operation?

I think this is a (c) leaning (b), especially given that we're doing it in public.  Remember, the Manhattan Project was a highly-classified effort and we know it by an innocuous name given to it to avoid attention.  

Saying publicly, "yo, China, we view this as an all-costs priority, hbu" is a great way to trigger a race with China...

But if it turned out that we knew from ironclad intel with perfect sourcing that China was already racing (I don't expect this to be the case), then I would lean back more towards (c).  

I'll be in Berkeley Weds evening through next Monday, would love to chat with, well, basically anyone who wants to chat. (I'll be at The Curve Fri-Sun, so if you're already gonna be there, come find me there between the raindrops!)

Thanks, looking forward to it!  Please do let us folks who worked on A Narrow Path (especially me, @Tolga , and @Andrea_Miotti ) know if we can be helpful in bouncing around ideas as you work on the treaty proposal!

2otto.barten
Thanks for the offer, we'll do that!

Is there a longer-form version with draft treaty language (even an outline)? I'd be curious to read it.

1otto.barten
Not publicly, yet. We're working on a paper providing more details about the conditional AI safety treaty. We'll probably also write a post about it on lesswrong when that's ready.

I think people opposing this have a belief that the counterfactual is "USG doesn't have LLMs" instead of "USG spins up its own LLM development effort using the NSA's no-doubt-substantial GPU clusters". 

Needless to say, I think the latter is far more likely.
 

1uhds
The NSA building it is arguably better because at least they won't sell it to countries like Saudi Arabia, and they have better ability to prevent people quitting or diffusing knowledge and code to companies outside. Also, most people in SF agree working for the NSA is morally grey at best, and Anthropic won't be telling everyone this is morally okay.

I think the thing that you're not considering is that when tunnels are more prevalent and more densely packed, the incentives to use the defensive strategy of "dig a tunnel, then set off a very big bomb in it that collapses many tunnels" get far higher.  It wouldn't always be infantry combat; it would often be a subterranean equivalent of indirect fires.

3Daniel Kokotajlo
Thanks, I hadn't considered that. So as per my argument, there's some threshold of density above which it's easier to attack underground; as per your argument, there's some threshold of density where 'indirect fires' of large tunnel-destroying bombs become practical. Unclear which threshold comes first, but I'd guess it's the first. 

Ok, so Anthropic's new policy post (explicitly NOT linkposting it properly since I assume @Zac Hatfield-Dodds or @Evan Hubinger or someone else from Anthropic will, and figure the main convo should happen there, and don't want to incentivize fragmenting of conversation) seems to have a very obvious implication.

Unrelated, I just slammed a big AGI-by-2028 order on Manifold Markets.
 

Yup.  The fact that the profession that writes the news sees "I should resign in protest" as their own responsibility in this circumstance really reveals something. 

At LessOnline, there was a big discussion one night around the picnic tables with @Eliezer_Yudkowsky , @habryka , and some interlocutors from the frontier labs (you'll momentarily see why I'm being vague on the latter names). 

One question was: "does DC actually listen to whistleblowers?" and I contributed that, in fact, DC does indeed have a script for this, and resigning in protest is a key part of it, especially ever since the Nixon years.

Here is a usefully publicly-shareable anecdote on how strongly this norm is embedded in national security decisi... (read more)

gwern*146

Also of relevance is the wave of resignations from the DC newspaper The Washington Post the past few days over Jeff Bezos suddenly exerting control.

Does "highest status" here mean highest expertise in a domain generally agreed by people in that domain, and/or education level, and/or privileged schools, and/or from more economically powerful countries etc?

I mean, functionally all of those things.  (Well, minus the country dynamic.  Everyone at this event I talked to was US, UK, or Canadian, which is all sorta one team for purposes of status dynamics at that event)

I was being intentionally broad here.  For purposes of this particular post, I am probably less interested in "who controls the future" swerves and more interested in "what else would interested, agentic actors do" questions. 

It is not at all clear to me that OpenPhil is the only org who feels this way -- I can think of several non-EA-ish charities that, if they genuinely 100% believed "none of the people you care for will die of the evils you fight if you can just keep them alive for the next 90 days," would plausibly do some interestingly agentic stuff.  

Oh, to be clear, I'm not sure this is at all actually likely, but I was curious if anyone had explored the possibility conditional on it being likely.
