Oliver Sourbut

oliversourbut.net

  • Autonomous Systems @ UK AI Safety Institute (AISI)
  • DPhil AI Safety @ Oxford (Hertford college, CS dept, AIMS CDT)
  • Former senior data scientist and software engineer + SERI MATS

I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.

I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently:

  • Ord - The Precipice
  • Pearl - The Book of Why
  • Bostrom - Superintelligence
  • McCall Smith - The No. 1 Ladies' Detective Agency (and series)
  • Melville - Moby-Dick
  • Abelson & Sussman - Structure and Interpretation of Computer Programs
  • Stross - Accelerando
  • Simsion - The Rosie Project (and trilogy)

Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites:

  • Hanabi (can't recommend enough; try it out!)
  • Pandemic (ironic at time of writing...)
  • Dungeons and Dragons (I DM a bit and it keeps me on my creative toes)
  • Overcooked (my partner and I enjoy the foody themes and frantic realtime coordination playing this)

People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.

Sequences

Breaking Down Goal-Directed Behaviour

Wikitag Contributions

Comments


I am hopeful that one of the things we can do with just-before-the-brink AI will be to accelerate the design and deployment of such voluntary coordination contracts. Could we manage to use AI to speed-run the invention and deployment of such subsidiarity governance systems? I think the biggest challenge to this is how fast it would need to move in order to take effect in time. For a system that needs extremely broad buy-in from a large number of heterogeneous actors, speed of implementation and adoption is a key weak point.

FYI, FLF has a Fellowship on AI for Human Reasoning which centrally targets objectives like this (if I've understood).

I wrote a bit about experimentation recently.

You seem very close to taking this position seriously when you talk about frontier experiments and experiments in general, but I think you also need to notice that experiments come in more dimensions than that. Like, you don't learn how to be better at chemistry just by playing around with GPUs.

It's quite clear that labs both want more high-quality researchers -- top talent has very high salaries, reflecting large marginal value-add.

Three objections, one obvious. I'll state them strongly, a bit devil's advocate; not sure where I actually land on these things.

Obvious: salaries aren't that high.

Also, I model a large part of the value of legible, credentialed talent to companies as marketing value to VCs and investors, who can't tell talent apart except by (rare) legible signs (even if lab leadership can). This is actually a way to get more compute (and other capital). (The legible signs are rare because compute is a bottleneck! So a Matthew effect pertains.)

Finally, the utility of labs is very convex in the production of AI: the actual profit comes from time spent selling a non-commoditised frontier offering at large margin. So small AI production speed gains translate into large profit gains.
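A minimal sketch of that convexity, with made-up speeds and margins (nothing here is anyone's actual figures): whoever currently holds the frontier sells the non-commoditised product at high margin, so a small speed edge captures almost all of the profit.

```python
# Toy model (invented numbers, purely illustrative): two labs race to the frontier.
# Whoever currently holds the frontier sells at a high margin; the laggard's
# offering is commoditised and earns roughly nothing. A small speed edge then
# captures almost all of the profit -- utility is sharply convex in production.

def race_profits(speed_a: float, speed_b: float, months: int = 36,
                 frontier_margin: float = 1.0, commodity_margin: float = 0.05) -> tuple[float, float]:
    profit_a = profit_b = 0.0
    for month in range(months):
        cap_a, cap_b = speed_a * (month + 1), speed_b * (month + 1)
        # The lab with higher capability this month sells the non-commoditised product.
        if cap_a > cap_b:
            profit_a += frontier_margin
            profit_b += commodity_margin
        elif cap_b > cap_a:
            profit_b += frontier_margin
            profit_a += commodity_margin
        else:
            profit_a += commodity_margin
            profit_b += commodity_margin
    return profit_a, profit_b

# A 10% speed advantage yields far more than 10% extra profit:
print(race_profits(speed_a=1.1, speed_b=1.0))  # ~(36.0, 1.8): nearly all profit to A
```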

The best objection to an SIE

 

I think the compute bottleneck is a reasonable objection, but there is also the fairly straightforward objection that gaining skills takes experience, and experience takes interaction (or hoovering up from web data).

You can get experience in things like 'writing fast code' easily, so a speed explosion is fairly plausible (up to diminishing returns). But various R&D domains, or influencing humans, or whatever, are much harder to get experience in. So our exploded SIs might be super fast and maybe super good at learning from experience, but out of the gate they'd be at best human-expert level where relevant data are abundant, and at best human-novice level where data aren't.

What constitutes cooperation?

Realised my model pipeline:

  1. surface options (or find common ground)
  2. negotiate choices (agree a course of action)
  3. cooperate/enforce (counteract defections, actually do the joint good thing)

was missing an important preliminary step.

For cooperation to happen, you also need:

  1. identify potential coalitions (who could benefit from cooperating)!

(Could break down further: identifying, getting common knowledge, and securing initial prospective cooperative intent.)

In some cases, 'identifying potential coalitions' might be a large, even dominant part of the challenge of cooperation, especially when effects are diffuse!
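A toy sketch of that preliminary step, with hypothetical actors and payoffs (names and numbers invented for illustration): before any negotiation, someone has to notice which subsets of actors would jointly gain from acting at all, and diffuse benefits make that enumeration harder.

```python
from itertools import combinations

# Toy sketch (hypothetical payoffs): step 0 of the pipeline is noticing which
# groups of actors would jointly benefit from cooperating at all. Each actor
# pays a cost to cooperate; the benefit is diffuse, shared per coalition member.

cost = {"A": 3, "B": 2, "C": 4, "D": 1}
joint_benefit_per_member = 2.5  # benefit each member receives if the coalition acts

def viable_coalitions(actors, min_size=2):
    """Return coalitions whose shared benefit exceeds the total cost of acting."""
    viable = []
    for size in range(min_size, len(actors) + 1):
        for coalition in combinations(actors, size):
            total_cost = sum(cost[a] for a in coalition)
            total_benefit = joint_benefit_per_member * len(coalition)
            if total_benefit > total_cost:
                viable.append(coalition)
    return viable

print(viable_coalitions(list(cost)))  # e.g. ('A','D'), ('B','D'), ('A','B','D'), ('B','C','D')
```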

That applies to global commons and it applies when coordinating political action. What other cases?

'Identifying potential coalitions' is what a lot of activism is about, and it might also be a big part of what various cooperative memeplexes like tribes, religions, political parties etc are doing.

This feels to me like another important part of the picture that new tech could potentially amplify!

Could we newly empower large groups of humans to cooperate by recognising and fulfilling the requirements of this cooperation pipeline?

I polished and published the draft:

  • Introducing exploration and experimentation
  • Why does exploration matter?
  • Research and taste
  • From play to experimentation
  • Exploration in AI, past and future
  • Research by AI: AI with research taste?
  • Opportunities

If you want to be twice as profitable as your competitors, you don’t have to be twice as good as them. You just have to be slightly better.

I think AI development is mainly compute constrained (relevant for intelligence explosion dynamics).

There are some arguments against, based on the high spending of firms on researcher and engineer talent. The claim is that this supports one or both of a) large marginal returns to having more (good) researchers or b) steep power laws in researcher talent (implying large production multipliers from the best researchers).

Given that lab workforces remain fairly small, I think the spending naively supports (b) better.

But in fact I think there is another, even better explanation:

  • Researchers' taste (an AI production multiplier) varies more smoothly
  • (research culture/collective intelligence of a team or firm may be more important)
  • Marginal parallel researchers have sharply diminishing AI production returns (sometimes negative, when the researchers have worse taste)
  • (also determining a researcher's taste ex ante is hard)
  • BUT firms' utility is sharply convex in AI production
    • capturing more accolades and market share are basically the entire game
    • spending as much time as possible with a non-commoditised offering allows profiting off fast-evaporating margin
  • so firms are competing over getting cool stuff out first
    • time-to-delivery of non-commoditised (!) frontier models
  • and getting loyal/sticky customer bases
    • ease-of-adoption of product wrapping
    • sometimes differentiation of offerings
  • this turns small differences in human capital/production multiplier/research taste into big differences in firm utility
  • so demand for the small pool of researchers with (legibly) great taste is very hot

This also explains why it's been somewhat 'easy' (but capital-intensive) for a few new competitors to pop into existence each year, and why firms' revealed-preference savings rate into compute capital is enormous (much greater than 100%!).

We see token prices drop incredibly sharply, which supports the non-commoditised margin claim (though this is also consistent with a Wright's Law effect from (runtime) algorithmic efficiency gains, which should definitely also be expected).
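(For reference, Wright's Law says unit cost falls by a roughly constant fraction with each doubling of cumulative production. A quick sketch with an assumed 20% learning rate, not fitted to any actual token-price data:)

```python
import math

# Wright's Law: unit cost falls by a fixed fraction ("learning rate") with every
# doubling of cumulative production. The 20% learning rate here is an assumption
# for illustration, not an estimate from real token-price data.

def wrights_law_cost(initial_cost: float, cumulative_units: float,
                     learning_rate: float = 0.20) -> float:
    doublings = math.log2(cumulative_units)
    return initial_cost * (1 - learning_rate) ** doublings

# After a 1000x increase in cumulative tokens served (~10 doublings),
# cost per token falls to about 11% of its starting value:
print(wrights_law_cost(initial_cost=1.0, cumulative_units=1000))  # ~0.11
```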

A lot of engineering effort is being put into product wrappers and polish, which supports the customer base claim.

The implications include: headroom above top human expert teams' AI research taste could be on the small side (I think this is right for many R&D domains, because a major input is experimental throughput). So both quantity and quality of (perhaps automated) researchers should have steeply diminishing returns in AI production rate. But might they nevertheless unlock a practical monopoly (or at least an increasingly expensive barrier to entry) on AI-derived profit, by keeping the (more monetisable) frontier out of reach of competitors?
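A rough sketch of the shape of this argument, with arbitrary functional forms and made-up numbers (log returns to headcount, a power law as a stand-in for convex firm utility): smooth differences in taste and diminishing returns to headcount still get amplified into larger differences in firm utility.

```python
import math

# Rough sketch (arbitrary functional forms, invented numbers): researcher taste
# varies smoothly and parallel researchers hit diminishing returns, but if firm
# utility is convex in AI production, small production differences still become
# noticeably larger utility differences.

def production(mean_taste: float, n_researchers: int) -> float:
    # Diminishing returns to headcount; smooth multiplier from average taste.
    return mean_taste * math.log1p(n_researchers)

def firm_utility(prod: float) -> float:
    # Convex utility: being modestly ahead captures disproportionate margin/market share.
    return prod ** 4

baseline     = production(mean_taste=1.00, n_researchers=300)
better_taste = production(mean_taste=1.05, n_researchers=300)  # 5% better average taste
more_people  = production(mean_taste=1.00, n_researchers=600)  # double the headcount

print(better_taste / baseline)                               # 1.05x production
print(more_people / baseline)                                 # ~1.12x production despite 2x headcount
print(firm_utility(better_taste) / firm_utility(baseline))    # ~1.22x utility from a 5% taste edge
```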

"choosing better experiments" is a relatively advanced skill, which will likely not emerge until well after experiment implementation skills

 

I have a draft discussing this. (Facepalm, should publish more often...)

Certainly choosing better experiments requires at least one of:

  • large scaleup in experimental observations (to get the experience to drive taste acquisition)
  • superhuman sample efficiency in taste acquisition
  • extreme reasoning/deliberation on top of weak taste, adding up to greater taste (I think there are likely very diminishing returns to this, but superspeed might yield it)

I think your claim is betting on the first one, and also assuming that you can only get that by increasing throughput.

But maybe you could slurp enough up from existing research logs, or from interviews with existing researchers, or something like that. Then you'd be competing with the still overall larger, but more tacit and more distributed-between-brains research experience of all the humans in the org.

nit: I found your graphical quantifiers and implications a bit confusing to read. It's the middle condition that looks weird.

If I've understood, I think you want (given a set of $X_i$, writing $X$ for the whole collection, there exists $\Lambda$ with all of):

$$\forall i:\ P[\Lambda \mid X] = P[\Lambda \mid X_{\bar i}]$$

$$\forall \Lambda' \text{ redund over } X:\ X \perp \Lambda' \mid \Lambda$$

$$H(\Lambda \mid X) \approx 0$$

(the third line is just my crude latex rendition of your third diagram)

Is that order of quantifiers and precedence correct?

In words, you want a $\Lambda$:

  1. which is a redund over the $X_i$
  2. where any $\Lambda'$ which is a redund over the $X_i$ is screened off from $X$ by $\Lambda$
  3. which is approximately deterministic on $X$

An unverified intuition (it's unlikely I'll work on this further): would the joint distribution of all candidate $\Lambda'$s work as $\Lambda$? It screens off $X$ from any given $\Lambda'$, right (90% confidence)? And it is itself a redund over the $X_i$, I think (again 90% confidence ish)? I don't really grok what it means to be approximately deterministic, but it feels like this should be? Overall confidence 30% ish, deflating for outside-view and definedness reasons.

Hey, some thoughts in case they're helpful. I was exploring a little into the 'agent structure' sort of questions and the Good/Gooder Regulator landscape.

You can take GR a bit further by looking at a temporally indexed MDP-like causal diagram and applying various bookkeeping transformations. Search 'combine nodes' in John's post on Bayes net algebra and 'uncombine' in my comment on the same.

Then you can see a 'good regulator motif' across many timesteps and timescales and draw some richer conclusions.

Here's a comment where I hastily sketch a version of this.

Wasn't planning to expand on any of those things, but if you think it'd be especially helpful let me know.
