My blog is here. You can subscribe for new posts there.
My personal site is here.
You can contact me using this form.
Thanks for the heads-up, that looks very convenient. I've updated the post to link to this instead of the scraper repo on GitHub.
As far as I know, my post started the recent trend you complain about.
Several commenters on this thread (e.g. @Lucius Bushnaq here and @MondSemmel here) mention LessWrong's growth and the resulting influx of uninformed new users as the likely cause. Any such new users may benefit from reading my recently-curated review of Planecrash, the bulk of which is about summarising Yudkowsky's worldview.
i continue to feel so confused at what continuity led to some users of this forum asking questions like, "what effect will superintelligence have on the economy?" or otherwise expecting an economic ecosystem of superintelligences
If there's decision-making about scarce resources, you will have an economy. Even superintelligence does not necessarily imply infinite abundance of everything, if only because our universe contains a finite number of atoms. Multipolar outcomes seem plausible under continuous takeoff, which the consensus view in AI safety (as I understand it) sees as more likely than fast takeoff. I admit there are strong reasons to think that the aggregate of a bunch of sufficiently smart things is agentic, but that isn't directly relevant to my post's concerns about the humans within the system.
a value-aligned superintelligence directly creates utopia
In his review of Peter Singer's commentary on Marx, Scott Alexander writes:
[...] Marx was philosophically opposed, as a matter of principle, to any planning about the structure of communist governments or economies. He would come out and say it was irresponsible to talk about how communist governments and economies will work. He believed it was a scientific law, analogous to the laws of physics, that once capitalism was removed, a perfect communist government would form of its own accord. There might be some very light planning, a couple of discussions, but these would just be epiphenomena of the governing historical laws working themselves out.
Peter Thiel might call this "indefinite optimism": delay all planning or visualisation because it's trusted that things will sort themselves out at some later point. Now, if you think that takeoff will definitely be extremely hard and the resulting superintelligence will effortlessly take over the world, then obviously it makes sense to focus on what that superintelligence will want to do. But what if takeoff lasts months or years or decades? (Note that there can be lots of change even within months if the stakes look extreme to powerful actors!) Aren't you curious about what an aligned superintelligence will end up deciding about society and humans? Are you so sure that the transition period will be so short, the superintelligence so unitary, and multipolar outcomes so unlikely that we'll never have to worry about problems downstream of the incentive issues and competitive pressures that I discuss (which Beren recently had an excellent post on)? Are you so sure that there is not a single interesting, a priori deducible fact about the superintelligent economy beyond "a singleton is in charge and everything is utopia"?
- The bottlenecks to compute production are constructing chip fabs; electricity; the availability of rare earth minerals.
Chip fabs and electricity generation are capital!
Right now, both companies have an interest in a growing population with growing wealth and are on the same side. If the population and its buying power begins to shrink, they will be in an existential fight over the remainder, yielding AI-insider/AI-outsider division.
Yep, AI buying power winning out over human buying power in setting the direction of the economy is an important dynamic that I'm thinking about.
I also think the AI labor replacement is initially on the side of equality. [...] Now, any single person who is a competent user of Claude can feasibly match the output of any traditional legal team, [...]. The exclusive access to this labor is fundamental to the power imbalance of wealth inequality, so its replacement is an equalizing force.
Yep, this is an important point, and a big positive effect of AI! I write about this here. We shouldn't lose track of all the positive effects.
Great post! I'm also a big (though biased) fan of Owain's research agenda, and share your concerns with mech interp.
I'm therefore coining the term "prosaic interpretability" - an approach to understanding model internals [...]
Concretely, I've been really impressed by work like Owain Evans' research on the Reversal Curse, Two-Hop Curse, and Connecting the Dots[3]. These feel like they're telling us something real, general, and fundamental about how language models think. Despite being primarily empirical, such work is well-formulated conceptually, and yields gearsy mental models of neural nets, independently of existing paradigms.
[emphasis added]
I don't understand how the papers mentioned are about understanding model internals, and as a result I find the term "prosaic interpretability" confusing.
Some points that are relevant in my thinking (stealing a diagram from an unpublished draft of mine):
So overall, I don't think the type of work you mention is really focused on internals or interpretability at all, except incidentally in minor ways. (There's perhaps a similar vibe difference here to category theory v set theory: the focus being relations between (black-boxed) objects, versus the focus being the internals/contents of objects, with relations and operations defined by what they do to those internals)
I think thinking about internals can be useful—see here for a Neel Nanda tweet arguing that the reversal curse is obvious if you understand mech interp—but black-box research often has a different conceptual frame, and is often powerful precisely when it can skip all theorising about internals while still bringing true, generalising statements about models to the table.
And therefore I'd suggest a different name than "prosaic interpretability". "LLM behavioural science"? "Science of evals"? "Model psychology"? (Though I don't particularly like any of these terms)
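To make the black-box framing above concrete, here's a minimal sketch (not from the papers themselves, just an illustration) of what a reversal-curse-style behavioural experiment looks like. The `query_model` helper is a hypothetical stand-in for whatever LLM API you'd actually call, and the facts are toy examples (the second one is fictional); the point is that the whole experiment only compares prompts and completions, never weights, activations, or circuits.

```python
# Minimal sketch of a purely behavioural (black-box) test in the spirit of the
# Reversal Curse experiments. `query_model` is a hypothetical placeholder; no
# model internals (weights, activations, circuits) are inspected anywhere.

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call; returns the model's completion."""
    return ""  # replace with a real client call to whichever model you're testing

def accuracy(items) -> float:
    """Fraction of prompts whose completion contains the expected answer."""
    hits = sum(1 for prompt, answer in items
               if answer.lower() in query_model(prompt).lower())
    return hits / len(items)

# Facts phrased in the direction the model plausibly saw in training ("A is B").
forward = [
    ("Tom Cruise's mother is", "Mary Lee Pfeiffer"),
    ("Uriah Hawthorne is the composer of", "Abyssal Melodies"),  # fictional toy fact
]

# The same facts phrased in the reversed direction ("B is A").
reversed_direction = [
    ("Mary Lee Pfeiffer's son is", "Tom Cruise"),
    ("The composer of 'Abyssal Melodies' is", "Uriah Hawthorne"),
]

# Reversal-curse-style claim: forward accuracy is high while reversed accuracy is
# near chance; a general statement about model behaviour, made without any
# reference to internals.
print("forward accuracy: ", accuracy(forward))
print("reversed accuracy:", accuracy(reversed_direction))
```

The experimental unit here is (prompt, completion) pairs rather than anything inside the model, which is the sense in which this reads to me as behavioural science rather than interpretability.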
If takeoff is more continuous than hard, why is it so obvious that there exists exactly one superintelligence rather than multiple? Or are you assuming hard takeoff?
Also, your post writes about "labor-replacing AGI" but writes as if the world it might cause near-term lasts eternally
If things go well, human individuals continue existing (and humans continue making new humans, whether digitally or not). Also, it seems more likely than not that fairly strong property rights continue (if property rights aren't strong, and humans aren't augmented to be competitive with the superintelligences, then prospects for human survival seem weak since humans' main advantage is that they start out owning a lot of the stuff—and yes, that they can shape the values of the AGI, but I tentatively think CEV-type solutions are neither plausible nor necessarily desirable). The simplest scenario is that there is continuity between current and post-singularity property ownership (especially if takeoff is slow and there isn't a clear "reset" point). The AI stuff might get crazy and the world might change a lot as a result, but these guesses, if correct, seem to pin down a lot of what the human situation looks like.
I already added this to the start of the post:
Edited to add: The main takeaway of this post is meant to be: Labour-replacing AI will shift the relative importance of human v non-human factors of production, which reduces the incentives for society to care about humans while making existing powers more effective and entrenched. Many people are reading this post in a way where either (a) "capital" means just "money" (rather than also including physical capital like factories and data centres), or (b) the main concern is human-human inequality (rather than broader societal concerns about humanity's collective position, the potential for social change, and human agency).
However:
perhaps you should clarify that you aren't trying to argue that saving money to spend after AGI is a good strategy, you agree it's a bad strategy
I think my take is a bit more nuanced:
This is despite agreeing with the takes in your earlier comment. My exact views in more detail (comments/summaries in square brackets):
- The post-AGI economy might not involve money, it might be more of a command economy. [yep, this is plausible, but as I write here, I'm guessing my odds on this are lower than yours—I think a command economy with a singleton is plausible but not the median outcome]
- Even if it involves money, the relationship between how much money someone has before and how much money they have after might not be anywhere close to 1:1. For example: [loss of control, non-monetary power, destructive war] [yep, the capital strategy is not risk-free, but this only really applies re selfish concerns if there are better ways to prepare for post-AGI; cf. my point about social connections above]
- Even if saving money through AGI converts 1:1 into money after the singularity, it will probably be worth less in utility to you
- [because even low levels of wealth will max out personal utility post-AGI] [seems likely true, modulo some uncertainty about: (a) utility from positional goods v absolute goods v striving, and (b) whether "everyone gets UBI"-esque stuff is stable/likely, or fails due to despotism / competitive incentives / whatever]
- [because for altruistic goals the leverage from influencing AI now is probably greater than leverage of competing against everyone else's saved capital after AGI] [complicated, but I think this is very correct at least for individuals and most orgs]
Regarding:
you are taking it to mean "we'll all be living in egalitarian utopia after AGI" or something like that
I think there's a decent chance we'll live in a material-quality-of-life-utopia after AGI, assuming "Things Go Fine" (i.e. misalignment / war / going-out-with-a-whimper don't happen). I think it's unlikely to be egalitarian in the "everyone has the same opportunities and resources" sense, for the reasons I lay out above. There are lots of valid arguments for why, if Things Go Fine, it will still be much better than today despite that inequality, and the inequality might not practically matter very much because consumption gets maxed out etc. To be clear, I am very against cancelling the transhumanist utopia because some people will be able to buy 30 planets rather than just a continent. But there are some structural things that make me worried about stagnation, culture, and human relevance in such worlds.
In particular, I'd be curious to hear your takes about the section on state incentives after labour-replacing AI, which I don't think you've addressed and which I think is fairly cruxy to why I'm less optimistic than you about things going well for most humans even given massive abundance and tech.
For example:
Imagine for example the world where software engineering is incredibly cheap. You can start a software company very easily, yes, but Google can monitor the web for any company that makes revenue off of software, instantly clone the functionality (because software engineering is just a turn-the-crank-on-the-LLM thing now) and combine it with their platform advantage and existing products and distribution channels. Whereas right now, it would cost Google a lot of precious human time and focus to try to even monitor all the developing startups, let alone launch a competing product for each one. Of course, it might be that Google itself is too bureaucratic and slow to ever do this, but someone else will then take this strategy.
Cf. the oft-quoted line about how the startup challenge is getting distribution before the incumbents get innovation. But if the innovation is engineering, and the engineering is trivial, how do you get the time to get distribution right?
(Interestingly, as I'm describing it above, the most key thing is not so much capital intensity, and more just that innovation/engineering is no longer a source of differential advantage, because everyone can do it really well with their AIs)
There's definitely a chance that there's some "crack" in this, either from the economics or the nature of AI performance or some interaction. In particular, as I mentioned at the end, I don't think modelling the AI as an approaching blank wall of perfect, all-obsoleting intelligence is the right model for short-to-medium-term dynamics. Would be very curious if you have thoughts.
Note, firstly, that money will continue being a thing, at least unless we have one single AI system doing all economic planning. Prices are largely about communicating information. If there are many actors and they trade with each other, the strong assumption should be that there are prices (even if humans do not see them or interact with them). Remember too that however sharp the singularity, abundance will still be finite, and must therefore be allocated.
Though yes, I agree that a superintelligent singleton controlling a command economy means this breaks down.
However, it seems far from clear that we will end up exactly there. The finiteness of the future lightcone and the resulting necessity of allocating "scarce" resources, the usefulness of a single medium of exchange (which you can see as motivated by coherence theorems if you want), and trade between different entities all seem like very general concepts. So even in futures that are otherwise very alien, but just not in the exact "singleton-run command economy" direction, I expect a high chance that those concepts matter.
Zero to One is a book that everyone knows about, but somehow it's still underrated.
Indefinite v definite in particular is a frame that's stuck with me.
Indefinite:
Definite:
You talk here about an impact/direction v ambition/profit tradeoff. I've heard many other people talking about this tradeoff too. I think it's overrated; in particular, if you're constantly having to think about it, that's a bad sign.
Instead, I think the real value of doing things that are startup-like comes from: