Yeah, referring to international sentiments. We'd want to avoid a "chip export controls" scenario, which would be tempting, I think.
Re: HCAST tasks, most are being kept private since it's a benchmark. If you want to learn more, here's METR's paper on HCAST.
Thanks for the detailed response!
Re: my meaning, you got it correct here:
Spiritually, genomic liberty is individualistic / localistic; it says that if some individual or group or even state (at a policy level, as a large group of individuals) wants to use germline engineering technology, it is good for them to do so, regardless of whether others are using it. Thus, it justifies unequal access, saying that a world with unequal access is still a good world.
Re: genomic liberty makes narrow claims, yes I agree, but my point is that if implemented it will lead ...
This is a thoughtful post, and I appreciate it. I don't think I disagree with it from a liberty perspective, and agree there are potential huge benefits for humanity here.
However, my honest first reaction is "this reasoning will be used to justify a world in which citizens of rich countries have substantially superior children to citizens of poor countries (as viewed by both groups)". These days, I'm much more suspicious of policies likely to be socially corrosive: that corrosion leads to bad governance at a time when, because of AI risk, we need excellent governance...
Here's an interesting thread of tweets from one of the paper's authors, Elizabeth Barnes.
Quoting the key sections:
...Extrapolating this suggests that within about 5 years we will have generalist AI systems that can autonomously complete ~any software or research engineering task that a human professional could do in a few days, as well as a non-trivial fraction of multi-year projects, with no human assistance or task-specific adaptations required.
However, (...) It’s unclear how to interpret “time needed for humans”, given that this varies wildly between diffe
Random commentary on bits of the paper I found interesting:
Under Windows of opportunity that close early:
...Veil of ignorance
Lastly, some important opportunities are only available while we don’t yet know for sure who has power after the intelligence explosion. In principle at least, the US and China could make a binding agreement that if they “win the race” to superintelligence, they will respect the national sovereignty of the other and share in the benefits. Both parties could agree to bind themselves to such a deal in advance, because a guarantee of contr
Okay I got trapped in a Walgreens and read more of this, found something compelling. Emphasis mine:
...The best systems today fall short at working out complex problems over longer time horizons, which require some mix of creativity, trial-and-error, and autonomy. But there are signs of rapid improvement: the maximum duration of ML-related tasks that frontier models can generally complete has been doubling roughly every seven months. Naively extrapolating this trend suggests that, within three to six years, AI models will become capable of automating many cogn
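For a sense of what that naive extrapolation implies, here's a quick back-of-the-envelope sketch. The seven-month doubling time comes from the quote; the ~1-hour starting horizon and the work-hour targets are my own assumed round numbers, not figures from the paper.

```python
import math

# Naive extrapolation sketch: METR-style task time horizon doubling every ~7 months.
current_horizon_hours = 1.0   # assumed current horizon (~1 hour of human task time)
doubling_time_months = 7.0    # doubling time quoted above

def years_until(target_hours):
    doublings = math.log2(target_hours / current_horizon_hours)
    return doublings * doubling_time_months / 12

print(years_until(8))    # one work day   -> ~1.8 years
print(years_until(40))   # one work week  -> ~3.1 years
print(years_until(170))  # one work month -> ~4.3 years
```

That's roughly where the "within three to six years" range comes from, depending on where you put the current horizon and the target.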
Meta: I'm kind of weirded out by how apparently everyone is making their own high-effort custom-website whitepapers? Is this something that's just easier with LLMs now? Did Situational Awareness create a trend? I can't read all this stuff, man.
In general there seems to be way more high-effort work coming out since reasoning models got released. Maybe it's just crunchtime.
I think it's something of a trend relating to a mix of 'tools for thought' and imitation of some websites (LW2, Read The Sequences, Asterisk, Works in Progress & Gwern.net in particular), and also a STEM meta-trend arriving in this area: you saw this in security vulnerabilities where for a while every major vuln would get its own standalone domain + single-page website + logo + short catchy name (eg. Shellshock, Heartbleed). It is good marketing which helps you stand out in a crowded ever-shorter-attention-span world.
I also think part of it is that it ...
I meant test-time compute as in the compute expended in the thinking Claude does playing the game. I'm not sure I'm convinced that reasoning models other than R1 took only a few million dollars, but it's plausible. Appreciate the prediction!
Amazingly, Claude managed to escape the blackout strategy somehow. Exited Mt. Moon at ~68 hours.
It does have a lot of the info, but it doesn't always use it well. For example, it knows that Route 4 leads to Cerulean City, and so sometimes thinks there's a way around Mt. Moon that sticks solely to Route 4.
No idea. Be really worried, I guess—I tend a bit towards doomer. There's something to be said for not leaving capabilities overhangs lying around, though. Maybe contact Anthropic?
The thing is, the confidence the top labs have in short-term AGI makes me think there's a reasonable chance they have the solution to this problem already. I made the mistake of thinking they didn't once before - I was pretty skeptical that "more test-time compute" would really unhobble LLMs in a meaningful fashion when Situational Awareness came out and didn't elaborate at all on how that would work. But it turned out that at least OpenAI, and probably Anthropic too, already had the answer at the time.
I think this is a fair criticism, but I think it's also partly balanced out by the fact that Claude is committed to trying to beat the game. The average person who has merely played Red probably did not beat it, yes, but also they weren't committed to beating it. Also, Claude has pretty deep knowledge of Pokémon in its training data, making it a "hardcore gamer" both in terms of knowledge and willingness to keep playing. In that way, the reference class of gamers who put forth enough effort to beat the game is somewhat reasonable.
It's definitely possible to get confused playing Pokémon Red, but as a human, you're much better at getting unstuck. You try new things, have more consistent strategies, and learn better from mistakes. If you tried as long and as consistently as Claude has, even as a 6-year-old, you'd do much better.
I played Pokémon Red as a kid too (still have the cartridge!). It wasn't easy, but I beat it in something like that 26-hour figure, IIRC. You have a point that howlongtobeat is biased towards gamers, but it's the most objective number I can find, and it feels reasonable to me.
as a human, you're much better at getting unstuck
I'm not sure! Or well, I agree that 7-year-old me could get unstuck by virtue of having an "additional tool" called "get frustrated and cry until my mom took pity and helped."[1] But we specifically prevent Claude from doing stuff like that!
I think it's plausible that if we took an actual 6-year-old and asked them to play Pokemon on a Twitch stream, we'd see many of the things you highlight as weaknesses of Claude: getting stuck against trivial obstacles, forgetting what they were doing, and—yes—complai...
Thanks for the correction! I've added the following footnote:
Actually it turns out this hasn't been done, sorry! A couple RNG attempts were completed, but they involved some human direction/cheating. The point still stands only in the sense that, if Claude took more random/exploratory actions rather than carefully-reasoned shortsighted actions, he'd do better.
I think the idea behind MAIM is to make it so neither China nor the US can build superintelligence without at least implicit consent from the other. This is before we get to the possibility of first strikes.
If you suspect an enemy state is about to build a superintelligence which they will then use to destroy you (or that will destroy everyone), you MAIM it. You succeed in MAIMing it because everyone agreed to measures making it really easy to MAIM it. Therefore, for either side to build superintelligence, there must be a general agreement to do so. If the...
This is creative.
TL;DR: To mitigate race dynamics, China and the US should deliberately leave themselves open to the sabotage ("MAIMing") of their frontier AI systems. This gives both countries an option other than "nuke the enemy"/"rush to build superintelligence first" if superintelligence appears imminent: MAIM the opponent's AI. The deliberately unmitigated risk of being MAIMed also encourages both sides to pursue carefully-planned and communicated AI development, with international observation and cooperation, reducing AINotKillEveryone-ism risks.
The ...
After an inter-party power-struggle, the CCP commits to the perpetual existence of at least one billion Han Chinese people with biological reproductive freedom
You know, this isn't such a bad idea - that is, explicit government commitments against discarding their existing, economically-unproductive populace. Easier to ask for today, rather than later.
Hypothetically this is more valuable in autocracies than in democracies, where the 1 person = 1 vote rule keeps political power in the hands of the people, but I think I'd support adding a constitutional amend...
It's unclear exactly what the product GPT-5 will be, but according to OpenAI's Chief Product Officer today it's not merely a router between GPT-4.5/o3.
swyx
appreciate the update!! in gpt5, are gpt* and o* still separate models under the hood and you are making a model router? or are they going to be unified in some more substantive way?
Kevin Weil
Unified 👍
Here's a fun related hypothetical. Let's say you're a mid-career software engineer making $250k TC right now. In a world with no AI progress you plausibly have $5m+ career earnings still coming. In a world with AGI, maybe <$1m. Would you take a deal where you sell all your future earnings for, say, $2.5m right now?
(me: no, but I might consider selling a portion of future earnings in such a deal as a hedge)
Is there any way to make this kind of trade? Arguably a mortgage is kind of like this, but you have to pay that back unless the government steps in when everyone loses their jobs...
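For what it's worth, the break-even on that hypothetical deal mostly comes down to your probability of the AGI scenario. A rough sketch with the numbers above (ignoring discounting and risk aversion, which matter for the real decision):

```python
# Rough break-even sketch using the hypothetical numbers above.
no_agi_earnings = 5_000_000   # remaining career earnings if AI progress stalls
agi_earnings = 1_000_000      # remaining earnings in the AGI world
offer = 2_500_000             # lump sum offered today

def expected_remaining_earnings(p_agi):
    return p_agi * agi_earnings + (1 - p_agi) * no_agi_earnings

# The offer matches expected earnings at p_agi = 0.625, so taking the deal is
# better in expectation only if you put >62.5% odds on the AGI scenario.
print(expected_remaining_earnings(0.625))  # 2,500,000
```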
You're right that there's nuance here. The scaling laws involved mean exponential investment -> linear improvement in capability, so yeah it naturally slows down unless you go crazy on investment... and we are, in fact, going crazy on investment. GPT-3 is pre-ChatGPT, pre-current paradigm, and GPT-4 is nearly so. So ultimately I'm not sure it makes that much sense to compare the GPT1-4 timelines to now. I just wanted to note that we're not off-trend there.
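To make the shape of that claim concrete, here's a stylized sketch (the log relationship and the compute numbers are illustrative only, not an actual scaling-law fit):

```python
import math

# Stylized picture: treat "capability" as roughly log10 of training compute.
# Real scaling laws are power laws in loss, but the qualitative point is the
# same: each equal capability step costs a constant *multiple* of investment.
def capability(compute_flops):
    return math.log10(compute_flops)

for flops in (1e23, 1e24, 1e25, 1e26):
    print(f"{flops:.0e} FLOP -> capability {capability(flops):.0f}")
# 10x more compute per +1 step: exponential investment, linear improvement.
```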
soon when we were racing through GPT-2, GPT-3, to GPT-4. We just aren't in that situation anymore
I don't think this is right.
GPT-1: 11 June 2018
GPT-2: 14 February 2019 (248 days later)
GPT-3: 28 May 2020 (469 days later)
GPT-4: 14 March 2023 (1,020 days later)
Basically, the wait until the next model doubled every time. By that pattern, GPT-5 ought to come around October 13, 2028, but Altman said today it'll be out within months. (And frankly, I think o1 qualifies as a sufficiently-improved successor model, and that was released December 5, 2024, or really September 12, 2024 if you count o1-preview; either way, a shorter gap than GPT-3 to GPT-4.)
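For concreteness, here's the date arithmetic behind that extrapolation (a quick sketch that just doubles the last gap; other ways of fitting the pattern give somewhat different dates):

```python
from datetime import date, timedelta

releases = {
    "GPT-1": date(2018, 6, 11),
    "GPT-2": date(2019, 2, 14),
    "GPT-3": date(2020, 5, 28),
    "GPT-4": date(2023, 3, 14),
}
dates = list(releases.values())
gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
print(gaps)                                      # [248, 469, 1020]
# "Wait doubles every time": project GPT-5 at double the last gap after GPT-4.
print(dates[-1] + timedelta(days=2 * gaps[-1]))  # 2028-10-13
```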
GPT-5 ought to come around October 13, 2028, but Altman said today it'll be out within months
I don't think what he said meant what you think it meant. Exact words:
In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3
The "GPT-5" he's talking about is not the next generation of GPT-4, not an even bigger pretrained LLM. It is some wrapper over GPT-4.5/Orion, their reasoning models, and their agent models. My interpretation is that "GPT-5" the product and GPT-5 the hypothetical 100x-bigger GPT mo...
Still-possible good future: there's a fast takeoff to ASI in one lab, contemporary alignment techniques somehow work, that ASI prevents any later unaligned AI from ruining the world, and the ASI provides life and a path for continued growth to humanity (and to shrimp, if you're an EA).
Copium perhaps, and certainly less likely in our race-to-AGI world, but possible. This is something like the “original”, naive plan for AI pre-rationalism, but it might be worth remembering as a possibility?
The only sane version of this I can imagine is where there's either one aligned ASI, or a coalition of aligned ASIs, and everyone has equal access. Because the AI(s) are aligned they won't design bioweapons for misanthropes and such, and hopefully they also won't make all human effort meaningless by just doing everything for us and seizing the lightcone etc etc.
It's strange that he doesn't mention DeepSeek-R1-Zero anywhere in that blogpost, which is arguably the most important development DeepSeek announced (self-play RL on reasoning models). R1-Zero is what stuck out to me in DeepSeek's papers, and, for example, the ARC Prize team behind the ARC-AGI benchmark says:
R1-Zero is significantly more important than R1.
Was R1-Zero already obvious to the big labs, or is Amodei deliberately underemphasizing that part?
I like it. Thanks for sharing.
(spoilers below)
While I recognize that in the story it's assumed alignment succeeds, I'm curious on a couple worldbuilding points.
First, about this stage of AI development:
His work becomes less stressful too — after AIs surpass his coding abilities, he spends most of his time talking to users, trying to understand what problems they’re trying to solve.
The AIs in the story are really good at understanding humans. How does he retain this job when it seems like AIs would do it better? Are AIs just prevented from taking over socie...
The problem with Dark Forest theory is that, in the absence of FTL detection/communication, it requires a very high density and an absurdly high proportion of hiding civilizations. Without that, expansionary civilizations dominate. The only known civilization, us, is expansionary for reasons that don't seem path-dependent, so it seems unlikely that the preconditions for Dark Forest theory exist.
To explain:
Hiders have limited space and mass-energy to work with. An expansionary civilization, once in its technological phase, can spread to thousands of star sys...
As a non-finance person who subscribes to Matt Levine, I'm not sure any other industry is as endlessly creative and public.
Take for example a story from Levine's most recent newsletter, about a recently-popular retail trader strategy. You buy 1 share of a company about to do a reverse split to avoid being delisted from Nasdaq for falling under $1. If the company doesn't support fractional shares, anyone left holding a fractional share after the split gets it rounded up to a full share, so, for example, a $0.40 pre-split share becomes a full post-split share worth $4.00 in a 1:10 split. Only ...
I read A Deepness in the Sky first and enjoyed it quite a lot; I haven't yet gotten around to A Fire Upon the Deep.
Hm, I was interpreting 'pulls towards existential catastrophe' as meaning Leopold's map mismatches the territory because it overrates the chance of existential catastrophe.
If the argument is instead "Leopold publishing his map increases the chance of existential catastrophe" (by charging race dynamics, for example) then I agree that's plausible. (Though I don't think the choice to publish it was inexcusable - the effects are hard to predict, and there's much to be said for trying to say true things.)
If the argument is "following Leopold's plan likely leads to existential catastrophe", same opinion as above.
I agree that it's a good read.
I don't agree that it "pulls towards existential catastrophe". Pulls towards catastrophe, certainly, but not existential catastrophe? He's explicitly not a doomer,[1] and is much more focused on really-bad-but-survivable harms like WW3, authoritarian takeover, and societal upheaval.
Page 105 of the PDF, "I am not a doomer.", with a footnote where he links a Yudkowsky tweet agreeing that he's not a doomer. Also, he listed his p(doom) as 5% last year. I didn't see an updated p(doom) in Situational Awareness or his Dwarkesh
Whether something 'pulls towards catastrophe' doesn't depend on whether the author believes their work pulls towards catastrophe. The direction of the pull is in the eye of the reader. Therefore, you must evaluate whether Jan (or you, or I) believe that the futures which Leopold's maps pull us toward will result in existential catastrophes. For a simplified explanation, imagine that Leopold is driving fast at night on a winding cliffside road, and his vision is obscured by a heads-up display of a map of his own devising. If his map directs him to take a left an...
One example: Leopold spends a lot of time talking about how we need to beat China to AGI and even talks about how we will need to build robo armies. He paints it as liberal democracy against the CCP. Seems that he would basically burn timeline and accelerate to beat China. At the same time, he doesn't really talk about his plan for alignment which kind of shows his priorities. I think his narrative shifts the focus from the real problem (alignment).
This part shows some of his thinking. Dwarkesh makes some good counterpoints here, like how is Donald Trump ...
I'm curious for opinions on what I think is a crux of Leopold's "Situational Awareness":
picking the many obvious low-hanging fruit on “unhobbling” gains should take us from chatbots to agents, from a tool to something that looks more like drop-in remote worker replacements.[1]
This disagrees with my own intuition - the gap between chatbot and agent seems stubbornly large. He suggests three main angles of improvement:[2]
and regardless, CEV merely re-allocates influence to the arbitrary natural preferences of the present generation of humans
I thought CEV was meant to cover the (idealized, extrapolated) preferences of all living humans in perpetuity. In other words, it would include future generations as they were born, and would also update if the wisdom of the current generation grew. (or less charitably, if its moral fashions changed)
I do recognize that classical CEV being speciesist in favor of Humans is probably its central flaw (forget about hypothetical sentient AIs ...
GitHub Copilot is a great deal for the user at only $10 per month. It loses GitHub $20/user/month, says the Wall Street Journal.
FWIW, the former GitHub CEO Nat Friedman claims this is false and that Copilot was profitable. He was CEO at the time Copilot was getting started, but left in late 2021. So it's possible that costs have increased >3x since then (losing $20 on a $10/month user implies roughly $30/user/month in costs, versus under $10 back when it was profitable), though unless they're constantly using GPT-4 under the hood, I would be surprised to learn that.
Others have speculated that maybe Copilot loses money on average because it's made available for free to stud...
(The OpenAI-Microsoft relationship seems like a big deal. Why haven't I heard more about this?)
It is a big deal, but it's been widely reported on and discussed here for years, and particularly within the last year, given that Microsoft keeps releasing AI products based on OpenAI tech. Not sure why you haven't heard about it.
I would be curious to see what the poll results for Question 1 look like, say, a week from now. I only saw the message in my inbox after Petrov Day was over, and still responded.
I don't think they're closely tied in the public mind, but I do think the connection is known to the organs of media and government that interact with AI alignment. It comes up often enough, in the background - details like FTX having a large stake in Anthropic, for example. And the opponents of AI x-risk and EA certainly try to bring it up as often as possible.
Basically, my model is that FTX seriously undermined the insider credibility of AINotKillEveryoneIsm's most institutionally powerful proponents, but the remaining credibility was enough to work with.
Why was the AI Alignment community so unprepared for engaging with the wider world when the moment finally came?
I reject the premise. Actually, I think public communication has gone pretty dang well since ChatGPT. AI existential risk has become a mainstream, semi-respectable concern (especially among top AI researchers and labs, which count the most!), and this is obviously because of the 20 years of groundwork the rationality and EA communities have laid down.
We had well-funded organizations like CAIS able to get credible mainstream signatories. ...
Nobody ever talks about the lack of drawdown after the Spanish-American war!
The proximate cause appears to be the occupation of the Philippines after the US decided to take them as a colony rather than liberate them. The unexpected insurgency that followed forced Congress to maintain the army's wartime size.
A complete explanation of why the army stayed large after the general end of the Philippine insurgency in 1902 is beyond me, however. I am seeing several general explanations along the lines of "the Spanish-American war revealed serious problems in the ...
It's nice to see OpenAI, Anthropic, and DeepMind collaborating on a paper like this.
"Sufficiently advanced" tech could also plausibly identify all those hidden civilizations. For example, an underground civilization would produce unusual seismic activity, and taking up some inner portion of a gas giant or star would alter their outward behavior. Ultimately, civilizations use mass-energy in unnatural ways, and I don't see a fundamental physical principle that could protect that from all possible sensing.
More importantly, I don't think your suggestions address my point that hostile civilizations would get you before you even evolve.
Bu...
As applied to aliens, I think the Dark Forest frame is almost certainly wrong. Perhaps it's useful in other contexts, and I know you repeatedly disclaimed its accuracy in the alien context, but at least for others I want to explain why it's unlikely.
Basically, there are two reasons:
To expand on the first, consider that humanity has consistently spammed out radio waves and sent out probes with the express hope that aliens might find them. Now, these are ...
Doesn't sound very convincing to me. Sufficiently advanced tech could allow things like:
5-10% YoY is baseline for CS in the US right now. (Ironclad stats are a bit hard to come by, given vagaries in what counts as a CS degree, but, for example, here's a claim that awarded CS degrees increased 12% in 2020.)
CS Bachelor's degrees, unlikely. There's already substantial, growing interest. (they make up roughly 2.5% of awarded degrees in the US right now, roughly 50k of 2M, and for comparison all of engineering makes up ~10% of degrees - though obviously "interest in" is far away from "awarded degree")
Master's degrees in ML, also unlikely, but I could imagine a semi-plausible scenario where public opinion in the software industry suddenly decided they would be valuable for repositioning careers going forward. I'd be surprised if that happened, though, especially ...
No mention of AI Safety in the DeepMind announcement, but in the linked message from Sundar Pichai there's a lot of "safe" and "responsible" thrown around.
Anyway, what I'm most curious to know is what happens to DeepMind's safety team. Are they effectively neutered or siloed by this merger? Cannibalized for full-bore AI LLMsSayNiceThings at the expense of anything near AI Don'tKillEveryoneIsm?
if you say safe and responsible enough times, you summon the safety and it protects you safely from the antitrust suit that happens eventually
No, the point is to not signal false alarms, so that when there is a real threat we are less likely to be ignored.
It proves little if others dismiss a clearly false alarm.
Also, Microsoft themselves infamously unplugged Tay. That incident is part of why they're doing a closed beta for Bing Search.
I didn't sign the petition.
I think this petition will be perceived as crying wolf by those we wish to convince of the dangers of AI. It is enough to simply p...
There are costs to even temporarily shutting down Bing Search. Microsoft would take a significant reputational hit—Google’s stock price fell 8% after a minor kerfuffle with their chatbot.
Doesn't that support the claim being made in the original post? Admitting that the new AI technology has flaws carries reputational costs, so Microsoft/Google/Facebook will not admit that their technology has flaws and will continue tinkering with it long past the threshold where any reasonable external observer would call for it to be shut down, simply because the cost...
I mostly agree and have strongly upvoted. However, I have one small but important nitpick about this sentence:
The risks of imminent harmful action by Sydney are negligible.
I think when it comes to x-risk, the correct question is not "what is the probability that this will result in existential catastrophe". Suppose that there is a series of potentially harmful and increasingly risky AIs that each have some probability of causing existential catastrophe unless you press a stop button. If the probabilities are growing sufficiently slowly, then e...
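To illustrate with made-up numbers (purely illustrative; the point is only that individually negligible, slowly growing risks can add up):

```python
# Purely illustrative numbers: per-AI catastrophe probabilities that start tiny
# and grow slowly, and the cumulative risk if the stop button is never pressed.
risks = [0.001 * 1.05**i for i in range(60)]   # 0.1%, growing 5% per generation
p_survive_all = 1.0
for p in risks:
    p_survive_all *= (1 - p)
print(max(risks))         # ~0.018: no single AI ever looks riskier than ~2%
print(1 - p_survive_all)  # ~0.30: yet the cumulative chance of catastrophe is ~30%
```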
I must admit, I have removed my signature from the petition, and I have learned an important lesson.
Let this be a lesson for us not to pounce on the first signs of problems.
Good objection. I think gene editing would be different because it would feel more unfair and insurmountable. That's probably not rational - the effect size would have to be huge for it to be bigger than existing differences in access to education and healthcare, which are not fair or really surmountable in most cases - but something about other people getting to make their kids "superior" off the bat, inherently, is more galling to our sensibilities. Or at least mine, but I think most people feel the same way.