Oliver Sourbut

oliversourbut.net

  • Autonomous Systems @ UK AI Safety Institute (AISI)
  • DPhil AI Safety @ Oxford (Hertford college, CS dept, AIMS CDT)
  • Former senior data scientist and software engineer + SERI MATS

I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.

I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently:

  • Ord - The Precipice
  • Pearl - The Book of Why
  • Bostrom - Superintelligence
  • McCall Smith - The No. 1 Ladies' Detective Agency (and series)
  • Melville - Moby-Dick
  • Abelson & Sussman - Structure and Interpretation of Computer Programs
  • Stross - Accelerando
  • Simsion - The Rosie Project (and trilogy)

Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites:

  • Hanabi (can't recommend enough; try it out!)
  • Pandemic (ironic at time of writing...)
  • Dungeons and Dragons (I DM a bit and it keeps me on my creative toes)
  • Overcooked (my partner and I enjoy the foody themes and frantic realtime coordination playing this)

People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.

Sequences

Breaking Down Goal-Directed Behaviour

Wikitag Contributions

Comments


I am hopeful that one of the things we can do with just-before-the-brink AI will be to accelerate the design and deployment of such voluntary coordination contracts. Could we manage to use AI to speed-run the invention and deployment of such subsidiarity governance systems? I think the biggest challenge to this is how fast it would need to move in order to take effect in time. For a system that needs extremely broad buy-in from a large number of heterogeneous actors, speed of implementation and adoption is a key weak point.

FYI, FLF has a Fellowship on AI for Human Reasoning which centrally targets objectives like this (if I've understood).

I wrote a bit about experimentation recently.

You seem very close to taking this position seriously when you talk about frontier experiments and experiments in general, but I think you also need to notice that experiments come in more dimensions than that. Like, you don't learn how to be better at chemistry just by playing around with GPUs.

It's quite clear that labs both want more high-quality researchers -- top talent has very high salaries, reflecting large marginal value-add.

Three objections, one obvious. I'll state them strongly, a bit devil's advocate; not sure where I actually land on these things.

Obvious: salaries aren't that high.

Also, I model a large part of the value of legible, credentialed talent to companies as marketing value to VCs and investors, who can't tell talent apart except by (rare) legible signs (even if lab leadership can). This is actually a way to get more compute (and other capital). (The legible signs are rare because compute is a bottleneck! So a Matthew effect pertains.)

Finally, the utility of labs is very convex in the production of AI: the actual profit comes from time spent selling a non-commoditised frontier offering at large margin. So small AI production speed gains translate into large profit gains.
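A minimal sketch of that convexity, with made-up speeds and margins (nothing here is anyone's actual figures): whoever currently holds the frontier sells the non-commoditised product at high margin, so a small speed edge captures almost all of the profit.

```python
# Toy model (invented numbers, purely illustrative): two labs race to the frontier.
# Whoever currently holds the frontier sells at a high margin; the laggard's
# offering is commoditised and earns roughly nothing. A small speed edge then
# captures almost all of the profit -- utility is sharply convex in production.

def race_profits(speed_a: float, speed_b: float, months: int = 36,
                 frontier_margin: float = 1.0, commodity_margin: float = 0.05) -> tuple[float, float]:
    profit_a = profit_b = 0.0
    for month in range(months):
        cap_a, cap_b = speed_a * (month + 1), speed_b * (month + 1)
        # The lab with higher capability this month sells the non-commoditised product.
        if cap_a > cap_b:
            profit_a += frontier_margin
            profit_b += commodity_margin
        elif cap_b > cap_a:
            profit_b += frontier_margin
            profit_a += commodity_margin
        else:
            profit_a += commodity_margin
            profit_b += commodity_margin
    return profit_a, profit_b

# A 10% speed advantage yields far more than 10% extra profit:
print(race_profits(speed_a=1.1, speed_b=1.0))  # ~(36.0, 1.8): nearly all profit to A
```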

The best objection to an SIE

 

I think the compute bottleneck is a reasonable objection, but there is also the fairly straightforward objection that gaining skills takes experience, and experience takes interaction (or hoovering up from web data).

You can get experience in things like 'writing fast code' easily, so a speed explosion is fairly plausible (up to diminishing returns). But various R&D domains, or influencing humans, or whatever, are much harder to get experience in. So our exploded SIs might be super fast and maybe super good at learning from experience, but out of the gate they'd be at best human-expert level where relevant data are abundant, and at best human-novice level where data aren't.

What constitutes cooperation?

Realised my model pipeline:

  1. surface options (or find common ground)
  2. negotiate choices (agree a course of action)
  3. cooperate/enforce (counteract defections, actually do the joint good thing)

was missing an important preliminary step.

For cooperation to happen, you also need:

  1. identify potential coalitions (who could benefit from cooperating)!

(Could break down further: identifying, getting common knowledge, and securing initial prospective cooperative intent.)

In some cases, 'identifying potential coalitions' might be a large, even dominant part of the challenge of cooperation, especially when effects are diffuse!
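A toy sketch of that preliminary step, with hypothetical actors and payoffs (names and numbers invented for illustration): before any negotiation, someone has to notice which subsets of actors would jointly gain from acting at all, and diffuse benefits make that enumeration harder.

```python
from itertools import combinations

# Toy sketch (hypothetical payoffs): step 0 of the pipeline is noticing which
# groups of actors would jointly benefit from cooperating at all. Each actor
# pays a cost to cooperate; the benefit is diffuse, shared per coalition member.

cost = {"A": 3, "B": 2, "C": 4, "D": 1}
joint_benefit_per_member = 2.5  # benefit each member receives if the coalition acts

def viable_coalitions(actors, min_size=2):
    """Return coalitions whose shared benefit exceeds the total cost of acting."""
    viable = []
    for size in range(min_size, len(actors) + 1):
        for coalition in combinations(actors, size):
            total_cost = sum(cost[a] for a in coalition)
            total_benefit = joint_benefit_per_member * len(coalition)
            if total_benefit > total_cost:
                viable.append(coalition)
    return viable

print(viable_coalitions(list(cost)))  # e.g. ('A','D'), ('B','D'), ('A','B','D'), ('B','C','D')
```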

That applies to global commons and it applies when coordinating political action. What other cases?

'Identifying potential coalitions' is what a lot of activism is about, and it might also be a big part of what various cooperative memeplexes like tribes, religions, political parties etc are doing.

This feels to me like another important part of the picture that new tech could potentially amplify!

Could we newly empower large groups of humans to cooperate by recognising and fulfilling the requirements of this cooperation pipeline?

I polished and published the draft:

  • Introducing exploration and experimentation
  • Why does exploration matter?
  • Research and taste
  • From play to experimentation
  • Exploration in AI, past and future
  • Research by AI: AI with research taste?
  • Opportunities

If you want to be twice as profitable as your competitors, you don’t have to be twice as good as them. You just have to be slightly better.

I think AI development is mainly compute constrained (relevant for intelligence explosion dynamics).

There are some arguments against, based on the high spending of firms on researcher and engineer talent. The claim is that this supports one or both of a) large marginal returns to having more (good) researchers or b) steep power laws in researcher talent (implying large production multipliers from the best researchers).

Given that lab workforces remain fairly small, I think the spending naively supports (b) better.

But in fact I think there is another, even better explanation:

  • Researchers' taste (an AI production multiplier) varies more smoothly
  • (research culture/collective intelligence of a team or firm may be more important)
  • Marginal parallel researchers have sharply diminishing AI production returns (sometimes negative, when the researchers have worse taste)
  • (also determining a researcher's taste ex ante is hard)
  • BUT firms' utility is sharply convex in AI production
    • capturing more accolades and market share are basically the entire game
    • spending as much time as possible with a non-commoditised offering allows profiting off fast-evaporating margin
  • so firms are competing over getting cool stuff out first
    • time-to-delivery of non-commoditised (!) frontier models
  • and getting loyal/sticky customer bases
    • ease-of-adoption of product wrapping
    • sometimes differentiation of offerings
  • this turns small differences in human capital/production multiplier/research taste into big differences in firm utility
  • so demand for the small pool of researchers with (legibly) great taste is very hot

This also explains why it's been somewhat 'easy' (but capital-intensive) for a few new competitors to pop into existence each year, and why firms' revealed-preference savings rate into compute capital is enormous (much greater than 100%!).

We see token prices drop incredibly sharply, which supports the non-commoditised margin claim (though this is also consistent with a Wright's Law effect from (runtime) algorithmic efficiency gains, which should definitely also be expected).
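(For reference, Wright's Law says unit cost falls by a roughly constant fraction with each doubling of cumulative production. A quick sketch with an assumed 20% learning rate, not fitted to any actual token-price data:)

```python
import math

# Wright's Law: unit cost falls by a fixed fraction ("learning rate") with every
# doubling of cumulative production. The 20% learning rate here is an assumption
# for illustration, not an estimate from real token-price data.

def wrights_law_cost(initial_cost: float, cumulative_units: float,
                     learning_rate: float = 0.20) -> float:
    doublings = math.log2(cumulative_units)
    return initial_cost * (1 - learning_rate) ** doublings

# After a 1000x increase in cumulative tokens served (~10 doublings),
# cost per token falls to about 11% of its starting value:
print(wrights_law_cost(initial_cost=1.0, cumulative_units=1000))  # ~0.11
```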

A lot of engineering effort is being put into product wrappers and polish, which supports the customer base claim.

The implications include: headroom above top human expert teams' AI research taste could be on the small side (I think this is right for many R&D domains, because a major input is experimental throughput). So both quantity and quality of (perhaps automated) researchers should have steeply diminishing returns in AI production rate. But might they nevertheless unlock a practical monopoly (or at least an increasingly expensive barrier to entry) on AI-derived profit, by keeping the (more monetisable) frontier out of reach of competitors?
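A rough sketch of the shape of this argument, with arbitrary functional forms and made-up numbers (log returns to headcount, a power law as a stand-in for convex firm utility): smooth differences in taste and diminishing returns to headcount still get amplified into larger differences in firm utility.

```python
import math

# Rough sketch (arbitrary functional forms, invented numbers): researcher taste
# varies smoothly and parallel researchers hit diminishing returns, but if firm
# utility is convex in AI production, small production differences still become
# noticeably larger utility differences.

def production(mean_taste: float, n_researchers: int) -> float:
    # Diminishing returns to headcount; smooth multiplier from average taste.
    return mean_taste * math.log1p(n_researchers)

def firm_utility(prod: float) -> float:
    # Convex utility: being modestly ahead captures disproportionate margin/market share.
    return prod ** 4

baseline     = production(mean_taste=1.00, n_researchers=300)
better_taste = production(mean_taste=1.05, n_researchers=300)  # 5% better average taste
more_people  = production(mean_taste=1.00, n_researchers=600)  # double the headcount

print(better_taste / baseline)                               # 1.05x production
print(more_people / baseline)                                 # ~1.12x production despite 2x headcount
print(firm_utility(better_taste) / firm_utility(baseline))    # ~1.22x utility from a 5% taste edge
```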

"choosing better experiments" is a relatively advanced skill, which will likely not emerge until well after experiment implementation skills

 

I have a draft discussing this. (Facepalm, should publish more often...)

Certainly choosing better experiments requires at least one of:

  • large scaleup in experimental observations (to get the experience to drive taste acquisition)
  • superhuman sample efficiency in taste acquisition
  • extreme reasoning/deliberation on top of weak taste, adding up to greater taste (I think there are likely very diminishing returns to this, but superspeed might yield it)

I think your claim is betting on the first one, and also assuming that you can only get that by increasing throughput.

But maybe you could slurp enough up from existing research logs, or from interviews with existing researchers, or something like that. Then you'd be competing with the still overall larger, but more tacit and more distributed-between-brains research experience of all the humans in the org.

nit: I found your graphical quantifiers and implications a bit confusing to read. It's the middle condition that looks weird.

If I've understood, I think you want (given a set of $X_i$, writing $X$ for the whole collection, there exists $\Lambda$ with all of):

$$\forall i:\ P[\Lambda \mid X] = P[\Lambda \mid X_{\bar i}]$$

$$\forall \Lambda' \text{ redund over } X:\ X \perp \Lambda' \mid \Lambda$$

$$H(\Lambda \mid X) \approx 0$$

(the third line is just my crude latex rendition of your third diagram)

Is that order of quantifiers and precedence correct?

In words, you want a $\Lambda$:

  1. which is a redund over the $X_i$
  2. where any $\Lambda'$ which is a redund over the $X_i$ is screened off from $X$ by $\Lambda$
  3. which is approximately deterministic on $X$

An unverified intuition (it's unlikely I'll work on this further): would the joint distribution of all candidate $\Lambda'$s work as $\Lambda$? It screens off $X$ from any given $\Lambda'$, right (90% confidence)? And it is itself a redund over the $X_i$, I think (again 90% confidence ish)? I don't really grok what it means to be approximately deterministic, but it feels like this should be? Overall confidence 30% ish, deflating for outside-view and definedness reasons.

Hey, some thoughts in case they're helpful. I was exploring a little into the 'agent structure' sort of questions and the Good/Gooder Regulator landscape.

You can take GR a bit further by looking at a temporally indexed MDP-like causal diagram and applying various bookkeeping transformations. Search 'combine nodes' in John's post on Bayes net algebra and 'uncombine' in my comment on the same.

Then you can see a 'good regulator motif' across many timesteps and timescales and draw some richer conclusions.

Here's a comment where I hastily sketch a version of this.

Wasn't planning to expand on any of those things, but if you think it'd be especially helpful let me know.
