Re: HCAST tasks, most are being kept private since it's a benchmark. If you want to learn more, here's METR's paper on HCAST.
Thanks for the detailed response!
Re: my meaning, you got it correct here:
Spiritually, genomic liberty is individualistic / localistic; it says that if some individual or group or even state (at a policy level, as a large group of individuals) wants to use germline engineering technology, it is good for them to do so, regardless of whether others are using it. Thus, it justifies unequal access, saying that a world with unequal access is still a good world.
Re: genomic liberty making narrow claims: yes, I agree, but my point is that, if implemented, it will lead to a world with unequal access for some substantial period of time, and I expect that to be socially corrosive.
Switching to quoting your post and responding to those quotes:
To be honest, mainly I've thought about inequality within single economic and jurisdictional regimes. (I think that objection is more common than the international version.)
Yeah, that's the common variant of the concern, but I think it's less compelling - rich countries will likely be able to afford to subsidize gene editing for their citizens, and will be strongly incentivized to do so even if it's quite expensive. So my expectation is that the intra-country effects for rich countries won't be as bad as science fiction has generally predicted, but that the international effects will be.
(and my fear is this would play into general nationalizing trends worldwide that increase competition and make nation-states bitter towards each other, when we want international cooperation on AI)
I am however curious to hear examples of technologies that {snip}
My worry is mostly that the tech won't spread "soon enough" to avoid socially corrosive effects, less so that it will never spread. As for a tech that never fully spread but should have benefitted everyone, all that comes to mind is nuclear energy.
So maybe developing the tech here binds it up with "all people should have this".
I think this would happen, but it would be expressed mostly resentfully, not positively.
The ideology should get a separate treatment--genomic liberty but as a positive right--what I've been calling genomic emancipation.
Sounds interesting!
This is a thoughtful post, and I appreciate it. I don't think I disagree with it from a liberty perspective, and I agree there are potentially huge benefits for humanity here.
However, my honest first reaction is "this reasoning will be used to justify a world in which citizens of rich countries have substantially superior children to citizens of poor countries (as viewed by both groups)". These days, I'm much more suspicious of policies likely to be socially corrosive: that corrosion leads to bad governance at a time when, because of AI risk, we need excellent governance.
I'm sure you've thought about this question; it's the classic objection. Do you have any idea how to avoid, or at least mitigate, the inequality adopting genomic liberty would cause? Or do you think it wouldn't happen at all? Or do you think that it's simply worth it and natural that any new technology is first adopted by those who can afford it, and that adoption drives down prices and will spread the technology widely soon enough?
Here's an interesting thread of tweets from one of the paper's authors, Elizabeth Barnes.
Quoting the key sections:
Extrapolating this suggests that within about 5 years we will have generalist AI systems that can autonomously complete ~any software or research engineering task that a human professional could do in a few days, as well as a non-trivial fraction of multi-year projects, with no human assistance or task-specific adaptations required.
However, (...) It’s unclear how to interpret “time needed for humans”, given that this varies wildly between different people, and is highly sensitive to expertise, existing context and experience with similar tasks. For short tasks especially, it makes a big difference whether “time to get set up and familiarized with the problem” is counted as part of the task or not.
(...)
We’ve tried to operationalize the reference human as: a new hire, contractor or consultant; who has no prior knowledge or experience of this particular task/codebase/research question; but has all the relevant background knowledge, and is familiar with any core frameworks / tools / techniques needed.
This hopefully is predictive of agent performance (given that models have likely memorized most of the relevant background information, but won’t have training data on most individual tasks or projects), whilst maintaining an interpretable meaning (it’s hopefully intuitive what a new hire or contractor can do in 10 mins vs 4hrs vs 1 week).
(...)
Some reasons we might be *underestimating* model capabilities include a subtlety around how we calculate human time. In calculating human baseline time, we only use successful baselines. However, a substantial fraction of baseline attempts result in failure. If we use human success rates to estimate the time horizon of our average baseliner, using the same methodology as for models, this comes out to around 1hr - suggesting that current models will soon surpass human performance. (However, we think that baseliner failure rates are artificially high due to our incentive scheme, so this human horizon number is probably significantly too low)
Other reasons include: For tasks that both can complete, models are almost always much cheaper, and much faster in wall-clock time, than humans. This also means that there's a lot of headroom to spend more compute at test time if we have ways to productively use it - e.g. BoK
That bit at the end about "time horizon of our average baseliner" is a little confusing to me, but I understand it to mean "if we used the 50% reliability metric on the humans we had do these tasks, our model would say humans can't reliably perform tasks that take longer than an hour". Which is a pretty interesting point.
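To make that concrete, here's a minimal sketch of how I understand the 50% time-horizon computation to work (this is my own illustration with made-up data, not METR's code): fit a logistic curve of success probability against log task length, then report the length at which the fitted probability crosses 50%. Running the same fit on human baseline attempts instead of model runs is what would produce that ~1hr figure.

```python
# Minimal sketch of a "50% time horizon" fit, assuming METR-style methodology:
# success probability modeled as a logistic function of log2(task length).
# The data below is made up purely for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# (task length in minutes, 1 = attempt succeeded, 0 = attempt failed)
task_minutes = np.array([2, 5, 10, 15, 30, 60, 120, 240, 480, 960])
succeeded    = np.array([1, 1,  1,  1,  1,  0,   1,   0,   0,   0])

X = np.log2(task_minutes).reshape(-1, 1)
model = LogisticRegression().fit(X, succeeded)

# p(success) = 0.5 exactly where the linear term is zero:
#   coef * log2(t) + intercept = 0  =>  t = 2 ** (-intercept / coef)
horizon_minutes = 2 ** (-model.intercept_[0] / model.coef_[0][0])
print(f"50% time horizon ≈ {horizon_minutes:.0f} minutes")
```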
Random commentary on bits of the paper I found interesting:
Under Windows of opportunity that close early:
Veil of ignorance
Lastly, some important opportunities are only available while we don’t yet know for sure who has power after the intelligence explosion. In principle at least, the US and China could make a binding agreement that if they “win the race” to superintelligence, they will respect the national sovereignty of the other and share in the benefits. Both parties could agree to bind themselves to such a deal in advance, because a guarantee of controlling 20% of power and resources post-superintelligence is valued more than a 20% chance of controlling 100%. However, once superintelligence has been developed, there will no longer be incentive for the ‘winner’ to share power.
Similarly for power within a country. At the moment, virtually everyone in the US might agree that no tiny group or single person should be able to grab complete control of the government. Early on, society could act unanimously to prevent that from happening. But as it becomes clearer which people might gain massive power from AI, they will do more to maintain and grow that power, and it will be too late for those restrictions.
Strong agree here; this is something governments should move quickly on: "No duh" agreements that put up some legal or societal barriers to malfeasance later.
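One note on the 20%-guarantee-versus-20%-chance claim: it holds for any risk-averse (concave) utility over share of power/resources. A toy check, using a square-root utility I'm picking purely for illustration (the paper doesn't commit to a specific utility function):

```python
# Toy check: a guaranteed 20% share beats a 20% chance of 100% under a
# concave utility over share of power/resources. u(x) = sqrt(x) is just an
# example utility, not something taken from the paper.
import math

u = math.sqrt
certain_share = u(0.20)                        # ≈ 0.447
risky_gamble  = 0.2 * u(1.0) + 0.8 * u(0.0)    # = 0.200
print(certain_share > risky_gamble)            # True: the certain 20% is preferred
```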
Next, under Space Governance:
Missions beyond the Solar System. International agreements could require that extrasolar missions should be permitted only with a high degree of international consensus. This issue isn’t a major focus of attention at the moment within space law but, perhaps for that reason, some stipulation to this effect in any new treaty might be regarded as unobjectionable.
Also a good idea. I don't want to spend hundreds of years having to worry about the robot colony five solar systems over...
Finally, under Value lock-in mechanisms:
Human preference-shaping technology. Technological advances could enable us to choose and shape our own or others’ preferences, plus those of future generations. For example, with advances in neuroscience, psychology, or even brain-computer interfaces, a religious adherent could self-modify to make it much harder to change their mind about their religious beliefs (and never self-modify to undo the change). They could modify their children’s beliefs, too.
Gotta ask, was this inspired by To the Stars at all? There's no citation, but that story is currently covering the implications of having the technology to choose/shape "preference-specifications" for yourself and for society.
Okay I got trapped in a Walgreens and read more of this, found something compelling. Emphasis mine:
The best systems today fall short at working out complex problems over longer time horizons, which require some mix of creativity, trial-and-error, and autonomy. But there are signs of rapid improvement: the maximum duration of ML-related tasks that frontier models can generally complete has been doubling roughly every seven months. Naively extrapolating this trend suggests that, within three to six years, AI models will become capable of automating many cognitive tasks which take human experts up to a month.
This is presented without much fanfare but feels like a crux to me. After all, the whole paper is predicated on the idea that AI will be able to effectively replace the work of human researchers. The paragraph has a footnote (44), which reads:
METR, ‘Quantifying the Exponential Growth in AI Ability to Complete Longer Tasks’ (forthcoming). See also Pimpale et al., ‘Forecasting Frontier Language Model Agent Capabilities’.
So the citation is an unreleased paper! That unreleased paper may make a splash, since (assuming this 7-month-doubling trend is not merely 1-2 years old) it strongly implies we really will find good solutions for turning LLMs agentic fairly soon.
(The second paper cited, only a couple weeks old itself, was presumably mentioned for its forecast of RE-Bench performance; key conclusion: "Our forecast suggests that agent performance on RE-Bench may reach a score of 1—equivalent to the expert baseline reported by Wijk et al. (2024)—around December 2026. We have much more uncertainty about this forecast, and our 95% CI reflects this. It has a span of over 8 years, from August 2025 to May 2033." But it's based on just a few data points spanning only about a year, so not super convincing.)
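To sanity-check that headline claim, the naive extrapolation is easy to reproduce. Back-of-envelope (mine, not the paper's; the ~1 hour starting horizon is my assumption from the METR discussion above, and "a month" is read as ~21 working days):

```python
# Naive extrapolation of the 7-month doubling trend in 50% time horizons.
# The 1-hour starting point is an assumption, not a figure from this paper.
import math

current_horizon_hours = 1.0        # assumed current 50% time horizon
doubling_time_months  = 7.0        # the quoted trend
target_hours = 21 * 8              # "up to a month" ≈ 21 working days of 8 hours

doublings = math.log2(target_hours / current_horizon_hours)
months    = doublings * doubling_time_months
print(f"{doublings:.1f} doublings ≈ {months:.0f} months ≈ {months / 12:.1f} years")
# ~7.4 doublings ≈ 52 months ≈ 4.3 years, comfortably inside the 3-6 year window
```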
Meta: I'm kind of weirded out by how apparently everyone is making their own high-effort custom-website-whitepapers? Is this something that's just easier with LLMs now? Did Situational Awareness create a trend? I can't read all this stuff man.
In general there seems to be way more high-effort work coming out since reasoning models got released. Maybe it's just crunchtime.
Good objection. I think gene editing would be different because it would feel more unfair and insurmountable. That's probably not rational - the effect size would have to be huge for it to be bigger than existing differences in access to education and healthcare, which are not fair or really surmountable in most cases - but something about other people getting to make their kids "superior" off the bat, inherently, is more galling to our sensibilities. Or at least mine, but I think most people feel the same way.