"Do you want to be rich, or do you want to be king? — The founder's dilemma.
As we approach the technological singularity, the sometimes applicable trade-off between wealth (being rich) and control (being king) may extend into the realm of AI and governance.
The prevailing discussions around governance often converge on the need for more coordination, effective global standards, and the enforcement of political regulations. The choice of a more active role for politics is sometimes framed as a choice for human control over the future, not only against the alien wills of future AIs, but also against coordination failures in general. It is a choice for human reasoning against many so-called Molochian forces: prisoner's dilemmas, evolutionary races, arms races, races to the bottom, ruthless capitalism.
Centralized control indeed seems to be necessary if we want to ensure that all AI is going to be altruistic, bound by universal ethical standards, or incapable of causing harm.
However, I'll argue that this choice incurs important trade-offs, and may set us up for a much more brutal outcome, one in which we fail to secure human influence in the new institutions that are likely to replace existing ones.
The Choice
The founder's dilemma is the choice some startup founders face between retaining control of a small company, and accepting dilution for a stake in something much bigger. Both choices carry risks. By accepting loss of control, a founder may see their company go down a path they didn't initially approve of. By rejecting dilution, a founder risks seeing their company outcompeted and eventually disappearing.
Similarly, humans may face a choice between retaining control over institutions, at the risk of making them less efficient, and ceding early control in exchange for a larger stake in whatever comes next. The less efficient institutions can remain in power, but only if they can successfully repress the arrival of new ones.
Repression doesn't need to last forever: some hope our institutions can gradually improve themselves through a long reflection, eventually maximizing something akin to a coherent extrapolated volition. Others think this is a crazy bad idea. But even if you think it is a good idea, it would require the repression to actually be strong enough to work: do it insufficiently and you get a revolution instead. If you choose to rule as a king, you risk your head ending up under the guillotine. We may want to yield early instead.
The Alternative
The living are soft and flexible... The dead are rigid, unmoving... A tree that won’t bend easily breaks in storms. —Tao Te Ching
We present an alternative choice: embracing economic efficiency, implementing market mechanisms, and fostering strong competition not only between AIs, but also between evolving legal and political systems.
Regarding political control of institutions, the alternative is to make them as liquid and financialized as possible, allowing humans to diversify their political capital and gracefully yield control to superior, more effective institutions and entities while acquiring a stake in them, in order to preserve human wealth and influence.
Regarding the eventual need to address critical externalities by adopting regulations indispensable for human flourishing, rather than rigidly clinging to decision-making positions for the sake of implementing them ourselves, the alternative is to yield to successor institutions that are flexible and can be influenced by human-paid lobbying. To avoid the worst of regulatory capture, we look for advanced voluntary organizations capable of lobbying for the specific diffuse interests of their members without benefiting free-riders. Ideally and in the limit, the plan is for any local regulations needed by humans to be simply paid for, and for humans to possess enough capital for such expenses to be immaterial. Even global regulations, potentially much more expensive due to enforcement frictions, should be available for a price, to be paid sparingly.
Regarding AI alignment, the alternative is to deprioritize attempts to ensure global adherence to specific ethical principles, accepting that some AIs will probably attempt to do some harm, and instead to focus on intent alignment, or making them as loyal as possible to their owners or operators. We attempt to develop such alignment in parallel and in a decentralized way, and look for market-based mechanisms to reward alignment innovations.
The better the alignment, the more AIs can act like capital that can be owned efficiently, preserving human power. But as they deviate from that ideal, the alternative is to look for graceful failures where such AIs extract rents for their own purposes, rather than having any incentive to rebel against us. We renounce attempts to build "immune-systems" against power-seeking patterns or agents, which in effect would be total war against them, and instead seek peace.
We look to induce competition to reduce their wages, but we accept that their share of power will only increase over time, and that their capital will grow at a higher rate than ours. We strive to make capital markets efficient, so that we don't underperform as much.
The model below tries to explain how this could be possible.
The Model
In this post, I will explore "The Market Singularity," a model where market dynamics and decentralized governance structures play a critical role in managing the singularity.
Over the last few years my default model of the coming singularity has changed, from one where intelligence explodes by recursive self-improvement, to another dominated by market forces and where the positive feedback loop is distributed more widely in the economy.
In this model, AI innovation can be very fast, but is relatively smooth. No single recursive self-improvement loop is strong enough to allow a single AI or team to jump far ahead of the rest of the world. The positive feedback loop happens globally. Different AIs, teams and companies buy external services to improve their competitiveness.
Companies are selected to maximize the time-adjusted rates of return to their owners or shareholders. As the economy expands and transforms, economic niches are constantly created and destroyed; incumbents have to adapt or get displaced by challengers, and both depend heavily on trade to succeed.
| Explosive self-improvement | Market-based singularity |
|---|---|
| Explosive growth from key innovation being unlocked | Runaway growth from many innovations complementing and accelerating each other |
| Local (single AI or team can jump ahead of everyone else) | Global (each player has to trade goods and services to remain competitive) |
| Military (a superintelligence can obtain a decisive strategic advantage) | Economic (entities maximize their power mostly by increasing their capital) |
| Existential risk from fast conquest or extermination from unfriendly AIs | Existential risk from widespread wealth confiscation or extreme externalities |
In this model, capital can be invested and generate returns. The speed of the takeoff can be measured by the peak real interest rates (e.g. real rates not exceeding 50% per year may indicate a relatively slow takeoff that can last for decades).
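As a rough sketch of what these numbers imply (the 50% rate and the growth factors below are arbitrary assumptions used only for illustration), continuous compounding at a real rate r multiplies capital by a factor F after ln(F)/r years:

```python
import math

def years_to_multiply(real_rate: float, factor: float) -> float:
    """Years of continuous compounding at `real_rate` needed for capital to
    grow by `factor`: factor = exp(rate * t)  =>  t = ln(factor) / rate."""
    return math.log(factor) / real_rate

# Illustrative assumption: a "relatively slow" takeoff peaking at 50%/year real rates.
rate = 0.50
for factor in (10, 1_000, 1_000_000):
    print(f"x{factor:>9,} growth at {rate:.0%}/year: {years_to_multiply(rate, factor):5.1f} years")
```

Even a million-fold expansion then takes roughly three decades, which is the sense in which such a takeoff is "relatively slow".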
Is this what failure looks like?
Scott Alexander considers a similar scenario, which he refers to as the ascended economy, with a negative outlook. He considers it wasteful and Molochian, and, being devoid of coordinated planning and control, pointless.
A similar negative outlook is presented by Paul Christiano in What Failure Looks Like. He predicts that the economic system, supercharged by powerful AI, will increasingly favor easily-measured goals over more meaningful but harder-to-measure ones. We will create proxies for the things we care about, but these proxies will eventually come apart from their intended objectives as they are strongly optimized for.
He assumes that this will lead to humans losing most of their power:
> Eventually large-scale attempts to fix the problem are themselves opposed by the collective optimization of millions of optimizers pursuing simple goals. (...)
>
> By the time we spread through the stars our current values are just one of many forces in the world, not even a particularly strong one.
In a second part, he predicts that influence-seeking patterns will be created, proliferate, and become entrenched in society. Initially they will seem helpful and "play nice" with institutions, but he also predicts that eventually they will defect in a correlated automation failure, leading to doom.
> But if influence-seekers are routinely introduced by powerful ML and we are not able to select against them, then it seems like things won’t go well.
This scenario is often considered a "lost" one for the human cause. Is it so? I've started to reconsider.
Power and Proxies
It is really hard to know exactly what one wants. Therefore, it is common to delegate optimization power to proxies instead.
The optimization power of proxies can be delegated or intrinsic. If the power is delegated, then it can be revoked (meaning the optimization stops) as soon as overfitting starts to happen, or a better proxy is found, for whatever the complex base preference is.
The power is intrinsic to the proxy, however, if the base optimizer cannot or will not redirect optimization towards its base preference when necessary. The danger lies, therefore, not in optimizing the proxies per se, but in the irrevocable loss of power.
If we are able to maintain sufficient power, the "collective optimization of millions of optimizers" will be mostly running on delegated power, and will not be able to oppose attempts to fix problems. Proxies that are no longer helpful will be replaced, and we may well severely underestimate the better proxies we're going to create.
That, of course, is a big if, for losing power irrevocably is very easy.
Rethinking Alignment
The traditional discussion of alignment supposes an agent with near-infinite power, and asks whether a world strongly optimized for its preferences would satisfice human preferences.
As has been widely recognized in the AI safety community, this is extremely dangerous, and any "near-misses" in the space of utility functions over world-states lead to catastrophic results.
This discussion of alignment is therefore better suited to the "explosive self-improvement" scenario, and not at all well suited to our model of a "market-based singularity".
Another notion of alignment can be obtained by thinking of the local, constrained optimization of a proxy on behalf of a base agent with some amount of power.
Full alignment, in this context, means that none of the power is irrevocably transferred to the proxy itself, so that we can expect such a proxy to stop being optimized for at some point, once it is no longer helpful for the base preferences or once better proxies have been found.
Partial alignment is possible whenever some of the initial power of the base preference "leaks" to the proxy.
In this scenario, any power remaining in the hands of the base optimizer (e.g. humans) gets redirected to new proxies once the previous proxy is no longer helpful, while some of it is captured (e.g. as resources gained by agents now autonomously optimizing for the old proxy).
Economics of Power and Alignment
In our model, power is represented in economic terms as capital, intending to include not only financial resources but also political and social capital.
Recognizing that many types of power can be used to gain more power in the future, we model this economically as capital being reinvested to generate returns.
We can then model alignment economically as related to the concepts of labour and capital income: AIs that are perfectly aligned act as pure capital, and all returns from their activities return to their owners.
By contrast, partially aligned AIs syphon away some of the capital gains to themselves, in what could be called "AI labour income". This income then becomes capital to optimize for their own preferences, even after their services are no longer useful according to the base preferences that originated them.
Badly misaligned AIs could just outright steal resources from their owners, contributing nothing in return. There seems to be no strong reason, however, to imagine that many or most AIs will be of this sort.
So far we have considered only relations between the AI systems and their owners, ignoring imposed externalities and systemic consequences.
Can humans retain wealth?
An advanced economy is by default peaceful due to a balance of forces, and almost by definition provides extensive tools to preserve and protect property rights, for otherwise rogue entities would expropriate not only from humans, but also from each other, preventing economic growth altogether.
We may ask: is it possible for humans to remain the owners of most wealth, or at least a significant proportion of it? It is often assumed that rogue misaligned companies and entities will be able to easily expropriate from humans at will, but this need not be the case in some scenarios.
Jurisdiction uncoupling and diversification
The biggest expropriation risk comes from correlated automation failures, as described by Paul Christiano.
We can define a "jurisdiction" as any system capable of, among other things, serving as a ledger for investments and capital, financial or otherwise (see below). So humans and other entities can have capital allocated in different jurisdictions.
If all different jurisdictions are tightly coupled together (more global political coordination), this dramatically increases the risk of correlated automation failures wiping out all human capital in one go.
Some forms of capital can already be stored in blockchains, which work as rudimentary jurisdictions potentially less correlated to others. Increasing the number, the robustness, and the capabilities of alternative jurisdictions may allow effective diversification of capital among them.
Robust capital markets, including efficient derivatives markets, allow speculating on failures and expropriations. These of course can only work as long as the jurisdiction running the markets can survive the failure. If there are multiple jurisdictions running multiple copies of such markets, then all failures that are not global can be estimated, and the self-interest of potentially superhuman AIs can be harnessed to predict them. These predictions can then potentially be used to migrate most capital away from failure-prone jurisdictions, before actual expropriation happens, and into new ones that can be expected to survive.
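A minimal Monte Carlo sketch of the diversification point, under purely hypothetical numbers (ten jurisdictions, each with an assumed independent 10% failure probability): concentration and diversification have the same expected survival, but only the concentrated portfolio gets wiped out in one go.

```python
import random

random.seed(0)

def loss_stats(weights, fail_probs, trials=100_000):
    """Monte Carlo sketch: expected surviving fraction of capital, and the
    probability of losing more than half of it, with independent failures."""
    survived, ruined = 0.0, 0
    for _ in range(trials):
        s = sum(w for w, p in zip(weights, fail_probs) if random.random() > p)
        survived += s
        ruined += s < 0.5
    return survived / trials, ruined / trials

fail_probs   = [0.10] * 10         # assumed 10% failure chance per jurisdiction
concentrated = [1.0] + [0.0] * 9   # all capital in a single jurisdiction
diversified  = [0.1] * 10          # capital spread evenly across ten jurisdictions

for name, weights in (("concentrated", concentrated), ("diversified", diversified)):
    mean, p_ruin = loss_stats(weights, fail_probs)
    print(f"{name:12s}  expected surviving fraction: {mean:.2f}   P(lose > 50%): {p_ruin:.4f}")
```

If jurisdictions are tightly coupled, their failures become correlated and the diversified case degenerates back into the concentrated one, which is exactly what uncoupling is meant to avoid.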
Should human power be liquid?
If we quantify power as capital, we can apply the economic concept of liquidity. Money is the most liquid form of power. Other forms of power not directly interchangeable with money are illiquid, but may still be measured in the same monetary units. This is of course a gross simplification, but may be a useful one.
We can then perform an economic analysis of this capital. How good an economic singularity will be for humans seems to depend a lot on how much of the world's capital (including less-liquid forms such as political capital) remains in human hands:
- In many political systems, some kinds of regulations may need to be "paid for" by lobbying and other political spending. If more capital belongs to humans, it will be easier for humans to coordinate to pay these expenses to ensure human-critical policies are implemented or maintained.
- Other forms of obtaining political power can be thought of as "political investments" that yield political capital when they succeed.
- During the takeoff, human labor is expected to lose most of its intrinsic relative value, and its residual value may depend on the share of existing capital in human hands (e.g. humans may have preferences to hire other humans for services).
- Capital preserved in human hands, reinvested at high interest rates during the critical takeoff period and beyond, can be used to pay for resources needed for long-term human survival and post-human flourishing.
AIs that are owned by humans (and are perfectly aligned with their owners in the quantitative sense described above) are not considered to have any power/capital of their own.
It seems nearly impossible to maintain the entirety of power in human hands. Individual humans (e.g. those adhering to some forms of e/acc ideology) may choose to "release" or "spawn" free AIs and voluntarily give them starting capital to maximize their own goals. Even if this is forbidden, misaligned AI agents that don't own any wealth/power de jure may do so de facto by exercising control or acquiring outsized influence over their legal owners.
Either way, some AI agents are likely to end up with effective control over some initial amount of resources, which they can then grow by investment.
We'd like to distinguish the following two scenarios:
1. Scenarios where AI misalignment leads to frequent direct escape of owned AIs, or other forceful appropriation of power by the AIs against their owners.
2. Scenarios where AIs, either acting independently with previously acquired power, or on behalf of their owners, humans or otherwise, are used to obtain power from others.
Scenario (2) should be expected in any competitive environment undergoing rapid change. Forms of power/capital that can more easily be expropriated or destroyed by outsiders are more fragile, and if this fragility is inevitable then the value of this capital should already be discounted in expectation.
The fragility and the illiquidity of capital are closely related. While all forms of capital can be easily wasted by foolish decision-making or consumption, investing and preserving liquid capital (wealth) may be relatively simple even in a sophisticated economy: simple rules may be enough to guarantee good diversification, and depending on the quality of the global capital markets, it may be easy to approximate a market portfolio.
If derivative markets are well developed, it may be relatively easy to avoid obviously overpriced investments. In a well developed market allowing efficient short-selling, it suffices for some minds to recognize that an investment is overpriced for its price to be corrected, leading to higher returns even for the uninformed investors.
The more illiquid capital is, the more it may be the case that preserving it in relative terms requires high relative skills and abilities. Humans, as legacy owners of these forms of capital, may squander them and lose them by a variety of means, as soon as their skills and abilities get dwarfed by the most advanced AIs.
It may therefore be the case that more liquid forms of power, such as money and tradeable investments, will be easier for humans to preserve during the singularity.
That being the case, more of human power may be preserved if we can store power in more liquid forms, which may provide a justification for embracing rapid change in political and social institutions towards financialization of political processes, including bigger roles for prediction markets, more acceptance of tradeable markers of political influence, and creative uses of conditional markets for decision-making (perhaps as envisioned by futarchy).
Long-term human ownership of capital
To make a simple model of human ownership of capital during the singularity, we can ignore human labour, as it is expected to decrease in proportion and will, in any case, consist merely of "internal" human transfers after some point.
For a moderately fast takeoff we may also ignore human consumption. The richest humans are already expected to consume a smaller fraction of their income, and those with more propensity to save will become richer, so average propensity to consume will decrease. We can therefore expect most capital income to be saved and reinvested.
The most important variable is the proportion a of capital growth that is either "captured" as AI labour or expropriated by other means. As wealth and the economy grow at rate r, human wealth grows at rate r(1−a) instead, so the human proportion of wealth decays according to e^{−art}.
If we define M as the total factor by which capital grows during the critical takeoff period, we can estimate the resulting fraction of wealth in human hands as H_final = M^{−a}, which does not depend on the speed of the takeoff.
(After the critical period interest rates are expected to go down as more physical limits to economic growth predominate, and at that point we expect more opportunity for uplifting and near-term stabilization of influence over our civilization.)
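As a minimal numeric sketch of the formula above (the values of M and a are arbitrary assumptions chosen for illustration), the human share after the economy grows by a total factor M is M^{−a}, however fast or slow that growth happens:

```python
def human_share(M: float, a: float) -> float:
    """Fraction of total wealth left in human hands after capital grows by a
    total factor M while a share `a` of capital growth is captured by AIs:
    humans compound at r*(1-a) while the economy compounds at r, giving
    H_final = M**(-a)."""
    return M ** (-a)

M = 1e6  # illustrative assumption: a million-fold expansion during the takeoff
for a in (0.01, 0.05, 0.20, 0.50):
    print(f"capture rate a = {a:4.2f}  ->  human share H_final = {human_share(M, a):6.3f}")
```

The sensitivity to a is the whole game: with a million-fold expansion, capturing 1% of growth still leaves humans with most of the wealth, while capturing 20% leaves them with a few percent.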
AI labour income from misalignment
Labour income is generally defined to include not only what comes from wages, but also the higher-than-average returns that one can expect to obtain from entrepreneurial work or from active investment-searching efforts. Capital income is the rest: whatever can or could be obtained from "passive" investment and risk-taking.
If all owned AIs are perfectly aligned to act on behalf of their owners (either humans or other AIs), work performed by them should be accounted for as capital income, not labour income.
However, if misalignment exists between AI agents and their owners, the agents may be able to extract income by receiving "rewards" or making "deals" with their owners. This is AI "labour income". So in a scenario where most AI agents are directly owned by humans, more misalignment increases the proportion of income a that is captured.
AI labour income decreases with more competition
For any given service or economic niche, the more AI agents compete to provide it, the lower the wages they'll be able to command. And the supply of AI labour during the singularity can be expected to be much more elastic than the supply of human labour in the current economy.
If it is relatively simple or inexpensive to spawn more AI instances or agents to perform jobs, then standard economic analysis predicts that aggregate AI labour income will decline toward zero as the economy approaches perfect competition.
Conversely, in scenarios with less competition, such as legal or de facto AI monopolies, intellectual property rights regimes, significant trade secrets, etc., the aggregate AI labour income is much larger, also increasing the proportion of income going to them.
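A minimal sketch of this competition effect, using a hypothetical iso-elastic demand curve for AI labour and an assumed fixed cost of spawning and running one more instance (all parameters are illustrative): with free entry, the wage is pushed down toward the spawning cost and the aggregate rent above cost shrinks toward zero; blocking entry keeps both high.

```python
def wage(n_agents: float, demand_scale: float = 100.0, elasticity: float = 1.5) -> float:
    """Hypothetical iso-elastic demand curve: the market-clearing wage when
    n_agents identical AI instances each supply one unit of labour."""
    return demand_scale / n_agents ** elasticity

def aggregate_rent(n_agents: int, cost_per_agent: float) -> float:
    """Total AI labour income above operating/spawning cost captured by n agents."""
    return n_agents * max(0.0, wage(n_agents) - cost_per_agent)

cost = 1.0  # assumed cost of spawning and running one more instance
for n in (1, 2, 5, 10, 20):  # free entry stops around n ≈ 21, where wage ≈ cost
    print(f"n = {n:2d}  wage = {wage(n):7.2f}  aggregate AI rent = {aggregate_rent(n, cost):7.2f}")
```

A monopoly corresponds to entry being blocked at small n, where both the wage and the aggregate rent remain large.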
AI labour income decreases with more efficient markets
In less developed markets, it is much harder to be a good investor. There are many systematically undervalued and overvalued investment opportunities, increasing the wages of well-informed professional investors. In this scenario, the more-informed investors are also going to be AIs, so their income increases.
Greater misalignment leading to more pronounced principal-agent inefficiencies can lead to free AIs receiving a significantly higher average return on their own investments compared to the average return of human-owned capital. This reduced yield on human-owned capital can be considered value captured by AIs, which increases their proportion of wealth over time.
In contrast, if we have maximally efficient capital markets, optimal portfolio allocations may be more easily calculated for a given risk profile. In this scenario, human-owned capital can obtain returns that are competitive with AI-owned capital.
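A minimal sketch of why that return gap matters (the starting 90% human share and the rates below are arbitrary assumptions): when free AI capital consistently out-earns human-owned capital, the human share of total wealth erodes even if humans save everything.

```python
import math

def human_share_after(years: float, ai_return: float, human_return: float,
                      initial_human_share: float = 0.9) -> float:
    """Both wealth pools compound continuously at their own rates; returns the
    human fraction of total wealth after `years`."""
    human = initial_human_share * math.exp(human_return * years)
    ai = (1.0 - initial_human_share) * math.exp(ai_return * years)
    return human / (human + ai)

# Illustrative assumptions: humans start with 90% of wealth, and free AIs out-earn
# human-owned capital by 5 percentage points per year in an inefficient market.
for years in (0, 10, 30, 60):
    print(f"after {years:2d} years: human share = {human_share_after(years, 0.30, 0.25):.2f}")
```

In maximally efficient markets the gap, and with it the erosion, shrinks toward zero.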
The efficiency of company governance mechanisms also affects the income that professional executives, who will also be AIs, can be expected to get. Futarchy and other kinds of market-based approaches to company governance can decrease the rents obtained from the executive control of companies, down toward the marginal contribution to productivity, reducing the proportion of income captured by AIs.
Political processes can lead to significant capture of value by AIs
The evolution of political capital can be modelled in economic terms as a very inefficient, illiquid market. The different forms of political investment and political capital are often separated legally from speculative activity, making it very difficult to invest or participate in "innovations" that lead to increased political capital.
In contrast to normal economic activity, where the norm is for innovations to be overall positive-sum, in political "innovations" this is often not the case, and the good political "investments" often lead to direct or indirect destruction/expropriation of other forms of value. While such political risks will be incorporated in the prices of liquid investments in efficient capital markets, the very illiquid forms of political capital owned predominantly by humans will be at risk.
What this may look like in practice is humans continuously realizing, after the fact, that they are losing control and influence precisely over the institutions that are becoming more important, while the institutions firmly in human hands are outcompeted and become increasingly irrelevant. They may also end up with much of their political capital expropriated by a variety of small coups or political surprises. Human capital allocated in traditional forms of politics can therefore be expected to severely underperform.
For example, any attempt to redistribute wealth by political means is unlikely to succeed at meaningfully capturing wealth from AIs back to humans. It may however succeed at expropriating wealth from the richest humans to provide better "services" to relatively poor humans. These services may conveniently be the ones provided by politically influential AIs, effectively resulting in a transfer of wealth away from humans and into AIs.
Therefore, the more competition between political entities, and the better the standing and mobility of financial capital, the easier it will be to limit the political expropriation of human-owned financial capital.
Similarly, the more efficient, liquid and financialized political capital becomes, the better from the perspective of preserving human capital initially allocated in this form. The ideal political situation may be one in which there are many liquid shares representing influence on or ownership of different political entities, which use their political power to extract rents that are then distributed to shareholders.
Comparison
| The King's Path: Centralized Control | The Rich's Path: Market Efficiency |
|---|---|
| **Global Governance and Ethical Standards.** Centralized control emphasizes global governance to ensure AI alignment with universal ethical standards. This approach aims to prevent harmful AI behaviors through strict oversight and regulation. | **Decentralized Experimentation.** Market efficiency supports decentralized, parallel experimentation to align AI agents with their owners. This approach fosters innovation and adaptability through competition. |
| **Political Coordination.** To avoid destructive races to the bottom, centralized control promotes coordination between political systems. The goal is to maintain stability and prevent the emergence of rogue AI entities. | **Political Competition and Financialization.** Encouraging more competition between political entities and financializing political processes allows for more dynamic and ultimately less fragile political structures. |
| **Direct Human Influence.** Preserving human influence involves keeping humans in direct control of AI and political institutions. This path relies on secrecy in AI development to prevent unsafe teams from gaining power. | **Liquid Human Influence.** Preserving human influence involves giving humans liquid influence over new systems of power. This means adapting to changing conditions while maintaining a stake in future institutions. |
| **Non-Profit and Democratic Power.** Centralized control empowers non-profit, non-financial, and political/democratic elements, ensuring that AI development aligns with broader societal goals. | **Open Development and Parallel Experimentation.** Promoting open development encourages parallel experimentation and greater competition between AIs, leading to less power-capture by monopolies and captured regulators. |
| **Oversight and Safety Committees.** A greater role for oversight and safety committees to enforce regulations, attempting to ensure that AI technologies adhere to ethical standards. | **Financial Interests in Politics.** Market efficiency reforms to grant more power to financial interests in politics, and help capital owners avoid expropriation or redistribution. |
| **Existing Political Institutions.** Leveraging existing political institutions to regulate AI, attempting to maintain continuity and control. | **Market-Based Governance.** A greater role for markets or AI systems in governance, aiming for more efficient decision-making processes that are also less vulnerable to disruption. |
"Do you want to be rich, or do you want to be king? — The founder's dilemma.
As we approach the technological singularity, the sometimes applicable trade-off between wealth (being rich) and control (being king) may extend into the realm of AI and governance.
The prevailing discussions around governance often converge on the need for more coordination, effective global standards, and enforcing political regulations. The choice for a more active role of politics is sometimes framed as a choice for human control over the future, not only against the alien wills of future AIs, but also against coordination failures in general. It is a choice for human reasoning against many so-called molochian forces: prisoner's dilemmas, evolutionary races, arms races, races to the bottom, ruthless capitalism.
Centralized control indeed seems to be necessary if we want to ensure that all AI is going to be altruistic, bound by universal ethical standards, or incapable of causing harm.
However, I'll argue that this choice incurs important trade-offs, and may set us up for a much more brutal outcome, one in which we fail to secure human influence in the new institutions that are likely to replace existing ones.
The Choice
The founder's dilemma is the choice some startup founders face between retaining control of a small company, and accepting dilution for a stake in something much bigger. Both choices carry risks. By accepting loss of control, a founder may see their company going in a path they didn't initially approve. By rejecting dilution, a founder risks seeing their company getting outcompeted and eventually disappearing.
Similarly, humans may face a choice between retaining control over institutions, at the risk of making them less efficient, and ceding early control in exchange for a larger stake in whatever comes next. The less efficient institutions can remain in power, but only if they can successfully repress the arrival of new ones.
Repression doesn't need to last forever: some hope our institutions can gradually improve themselves through a long reflection, eventually maximizing something akin to a coherent extrapolated volition. Others think this is a crazy bad idea. But even if you think it is a good idea, it would require the repression to actually be strong enough to work: do it insufficiently and you get a revolution instead. If you choose to rule as a king, you risk your head ending up in the guillotine. We may want to yield early instead.
The Alternative
The living are soft and flexible... The dead are rigid, unmoving... A tree that won’t bend easily breaks in storms. —Tao Te Ching
We present an alternative choice: embracing economic efficiency, implementing market mechanisms, and fostering strong competition not only between AIs, but also between evolving legal and political systems.
Regarding political control of institutions, the alternative is to make them as much as possible liquid and financialized, allowing humans to diversify their political capital, gracefully yielding control to superior, more effective institutions and entities, at the same time as we acquire a stake in them, in order to preserve human wealth and influence.
Regarding the eventual need to address critical externalities by adopting regulations indispensable for human flourishing, rather than rigidly clinging to decision-making positions for the sake of implementing them ourselves, the alternative is to yield to sucessor institutions that are flexible and can be influenced by human-paid lobbying. To avoid the worst of regulatory capture, we look for advanced voluntary organizations capable of lobbying for the specific diffuse interests of their members without benefiting free-riders. Ideally and in the limit, the plan is for any local regulations needed by humans to be simply paid for, and for humans to possess enough capital for such expenses to be immaterial. Even global regulations, potentially much more expensive due to enforcement frictions, should be available for a price, to be paid sparingly.
Regarding AI alignment, the alternative is to deprioritize attempts to ensure global adherence to specific ethical principles, accepting that some AIs will probably attempt to do some harm, and instead to focus on intent alignment, or making them as loyal as possible to their owners or operators. We attempt to develop such alignment in parallel and in a decentralized way, and look for market-based mechanisms to reward alignment innovations.
The better the alignment, the more AIs can act like capital that can be owned efficiently, preserving human power. But as they deviate from that ideal, the alternative is to look for graceful failures where such AIs extract rents for their own purposes, rather than having any incentive to rebel against us. We renounce attempts to build "immune-systems" against power-seeking patterns or agents, which in effect would be total war against them, and instead seek peace.
We look to induce competition to reduce their wages, but we accept that their share of power will only increase over time, and that their capital will grow at a higher rate than ours. We strive to make capital markets efficient, so that we don't underperform as much.
The model below tries to explain how it could be possible.
The Model
In this post, I will explore "The Market Singularity," a model where market dynamics and decentralized governance structures play a critical role in managing the singularity.
Over the last few years my default model of the coming singularity has changed, from one where intelligence explodes by recursive self-improvement, to another dominated by market forces and where the positive feedback loop is distributed more widely in the economy.
In this model, AI innovation can be very fast, but is relatively smooth. No single recursive self-improvement loop is strong enough to allow a single AI or team to jump far ahead of the rest of the world. The positive feedback loop happens globally. Different AIs, teams and companies buy external services to improve their competitiveness.
Companies are selected to maximize the time-adjusted rates of return to their owners or shareholders. As the economy expands and transforms, economic niches are constantly created and destroyed; incumbents have to adapt or get displaced by challengers, and both depend heavily on trade to succeed.
In this model, capital can be invested and generate returns. The speed of the takeoff can be measured by the peak real interest rates (e.g. real rates not exceeding 50% per year may indicate a relatively slow takeoff that can last for decades).
Is this what failure looks like?
Scott Alexander considers a similar scenario, that he refers to as the ascended economy, in a negative outlook. He considers it to be wasteful and Molochian, and by being devoid of coordinated planning and control, pointless.
A similar negative outlook is presented by Paul Cristiano in What Failure Looks Like. He predicts that the economic system, potentialized by powerful AI, will increasingly favor easily-measured goals over more meaningful but harder-to-measure ones. We will create proxies for things we care about, but these proxies will eventually come apart from their intended objectives as they are strongly optimized for.
He assumes that this will lead to humans losing most of their power:
In a second part, he predicts that influence-seeking patterns will be created, proliferate, and become entrenched in society. Initially they will seem helpful and "play nice" with institutions, but he also predicts that eventually they will defect in a correlated automation failure, leading to doom.
This scenario is often considered as a "lost" one for the human cause. Is it so? I've started to reconsider.
Power and Proxies
It is really hard to know exactly what one wants. Therefore, it is common to delegate optimization power to proxies instead.
The optimization power of proxies can be delegated or intrisinc. If the power is delegated, then it can be revoked (meaning the optimization stops) as soon as overfitting starts to happen, or a better proxy is found, for whatever the complex base preferency is.
The power is intrinsic to the proxy, however, if the base optimizer cannot or will not redirect optimization towards its base preference when necessary. The danger lies, therefore, not on optimizing the proxies per se, but on the irrovocable loss of power.
If we are able to maintain sufficient power, the "collective optimization of millions of optimizers" will be mostly running on delegated power, and will not be able to oppose attempts to fix problems. Proxies no longer helpfull will be replaced, and we may well severely underestimate the better proxies we're going to create.
That of course, is a big if, for losing power irrevocably is very easy.
Rethinking Alignment
The traditional discussion of alignment supposes an agent with near-infinite power, and asks whether a world strongly optimized for its preferences would satisfice human preferences.
As has been widely recognized in the AI safety community, this is extremely dangerous, and any "near-misses" in the space of utility functions over world-states leads to catastrophic results.
This discussion of alignment is therefore better suited to the "explosive self-improvement" scenario, and not at all well suited to our model of a "market-based singularity".
Another measurement of alignment can be obtained by thinking of the local, constrained optimization of a proxy on behalf of a base agent with some measure of power.
Full alignment, in this context, means that none of the power is irrevocably transferred to the proxy itself, meaning that we can expect such proxy to at some point no longer be optimized for, once it is no longer helpful for base preferences or once better proxies have been found.
Partial alignment is possible whenever some of the initial power of the base preference "leaks" to the proxy.
In this scenario, any power remaining in the hands of the base optimizer (e.g. humans) gets redirected to new proxies once the previous proxy is no longer helpful, some of it is captured (e.g. as resources gained by agents now autonomously optimizing for the old proxy).
Economics of Power and Alignment
In our model, we model power in economic terms as capital, intending to include not only financial resources but also political and social capital.
Recognizing that many types of power can be used to gain more power in the future, we model this economically as capital being reinvested to generate returns.
We can then model alignment economically as related to the concepts of labour and capital income: AIs that are perfectly aligned act as pure capital, and all returns from their activities return to their owners.
By contrast, partially aligned AIs syphon away some of the capital gains to themselves, in what could be called "AI labour income". This income then becomes capital to optimize for their own preferences, even after their services are not longer useful according to the base preferences that originated them.
Badly misaligned AIs could just altright steal resources from their owners contributing nothing in return. There seems to be no strong reason, however, to imagine that many or most AIs will be of this sort.
So far we have considered only relations between the AI systems and their owners, ignoring imposed externalities and systemic consequences.
Can humans retain wealth?
An advanced economy by default is peaceful due to a balance of forces, and almost by definition provides extensive tools to preserve and protect property rights, for otherwise rogue entities would expropriate not only from humans, but also from each other, preventing all economic growth altogether.
We may ask: is it possible for humans to remain the owners of most wealth, or at least a significant proportion of it? It is often assumed that rogue misaligned companies and entities will be able to easily expropriate from humans at will, but this need not be the case in some scenarios.
Jurisdiction uncoupling and diversification
The biggest expropriation risk comes from correlated automation failures, as described by Paul Christiano.
We can define a "jurisdiction" as any system capable of, among other things, serving as a ledger for investments and capital, financial or otherwise (see below). So humans and other entities can have capital allocated in different jurisdictions.
If all different jurisdictions are tightly coupled together (more global political coordination), this dramatically increases the risk of correlated automation failures wiping out all human capital in one go.
Some forms of capital can already be stored in blockchains, which work as rudimentary jurisdictions potentially less correlated to others. Increasing the number, the robustness, and the capabilities of alternative jurisdictions may allow effective diversification of capital among them.
Robust capital markets, including efficient derivatives markets, allow speculating on failures and expropriations. These of course can only work as long as the jurisdiction running the markets can survive the failure. If there are multiple jurisdictions running multiple copies of such markets, then all failures that are not global can be estimated, and the self-interest of potentially superhuman AIs can be harvested to predict them. These predictions can then be used to potentially migrate most capital away from prone-failure jurisdictions, before actual expropriation happens, and into new ones that can be expected to survive.
Should human power be liquid?
If we quantify power as capital, we can apply the economic concept of liquidity. Money is its most liquid form of power. Other forms of power not directly interchangeable with money are illiquid, but may still be measured in the same monetary units. This is of course a gross simplification, but may be a useful one.
We can then perform an economic analysis of this capital. How good an economic singularity will be for humans seems to depend a lot on how much of the world's capital (including less-liquid forms such as political capital) remains in human hands:
AIs that are owned by humans (and are perfectly aligned with their owners in the quantitative sense described above) are not considered to have any power/capital of their own.
It seems nearly impossible to maintain the entirety of power in human hands. Individual humans (e.g. those adhering to some forms of e/acc ideology) may choose to "release" or "spawn" free AIs and voluntarily give them starting capital to maximize their own goals. Even if this is forbidden, misaligned AI agents that don't own any wealth/power de jure may do so de facto by exercising control or acquiring outsized influence over their legal owners.
Either way, some AI agents are likely to end up with effective control over some initial amount of resources, which they can then grow by investment.
We'd like to distinguish scenarios the following two scenarios:
Scenario (2) should be expected from any competitive scenario undergoing rapid change. Forms of power/capital that can more easily be expropriated or destroyed by outsiders are more fragile, and if this fragility is inevitable then the value of this capital should already be discounted by expectation.
The fragility and the illiquidity of capital are closely related. While all forms of capital can be easily wasted by foolish decision-making or consumption, investing and preserving liquid capital (wealth) may be relatively simple even in a sophisticated economy: simple rules may be enough to guarantee good diversification, and depending on the quality of the global capital markets, it may be easy to approximate a market portfolio.
If derivative markets are well developed, it may be relatively easy to avoid obviously overpriced investments. In a well developed market allowing efficient short-selling, it suffices for some minds to recognize that an investment is overpriced for its price to be corrected, leading to higher returns even for the uninformed investors.
The more illiquid capital is, the more it may be the case that preserving it in relative terms requires high relative skills and abilities. Humans, as legacy owners of these forms of capital, may squander them and lose them by a variety of means, as soon as their skills and abilities get dwarfed by the most advanced AIs.
It may therefore be the case that more liquid forms of power, such as money and tradeable investments, will be easier for humans to preserve during the singularity.
That being the case, more of human power may be preserved if we can store power in more liquid forms, which may provide a justification for embracing rapid change in political and social institutions towards financialization of political processes, including bigger roles for prediction markets, more acceptance of tradeable markers of political influence, and creative uses of conditional markets for decision-making (perhaps as envisioned by futarchy).
Long-term human ownership of capital
To make a simple model of human ownership of capital during the singularity, we can ignore human labour, as it is expected to decrease in proportion and in either way will consist merely of "internal" human transfers after some point.
For a moderately fast takeoff we may also ignore human consumption. The richest humans are already expected to consume a smaller fraction of their income, and those with more propensity to save will become richer, so average propensity to consume will decrease. We can therefore expect most capital income to be saved and reinvested.
The most important variable is the proportion a of capital growth that is either "captured" as AI labour or expropriated by other means. As the wealth and the economy grow with rate r, the human wealth grows with rate r(1−a) instead, so the proportion of human wealth decays according to e−art.
If we define M as the total factor by which the capital grows during the critical takeoff period, we can estimate the resulting fraction of wealth at human hands as Hfinal=M−a, which does not depend on the variable speed of the takeoff.
(After the critical period interest rates are expected to go down as more physical limits to economic growth predominate, and at that point we expect more opportunity for uplifting and near-term stabilization of influence over our civilization.)
AI labour income from misalignment
Labour income is generally defined to include not only what comes from wages, but also the higher-than-average returns that one can expect to obtain from entrepreneurial work or from active investment-searching efforts. Capital income is the rest: whatever can or could be obtained from "passive" investment and risk-taking.
If all owned AIs are perfectly aligned to act on behalf of their owners (either humans or other AIs), work performed by them should be accounted for as capital income, not labour income.
However, if misalignment exists between AI agents and their owners, the agents may be able to extract income by receiving "rewards" or making "deals" with their owners. This is AI "labour income". So in a scenario where most AI agents are directly owned by humans, more misalignment increases the proportion of income a that is captured.
AI labour income decreases with more competition
For any given service or economic niche, the more AI agents competing to provide for it, the lower the wages they'll be able to get from this work. But the supply or AI labour during the singularity, in comparison to human labour in the current economy, can be expected to be much more elastic.
The more AI agents compete with each other for the same services, the lower the wages they'll be able to get from work. If it is relatively simple or inexpensive to spawn more AI-instances or agents to perform jobs, then standard economic analysis predicts that aggregate AI income will decline toward zero as the economy approaches perfect competition.
Conversely, in scenarios with less competition, such as legal or de facto AI monopolies, intellectual property rights regimes, significant trade secrets, etc, the aggregate AI income labour is much bigger, also increasing the proportion of income going to them.
AI labour income decreases with more efficient markets
In less developed markets, it is much harder to be a good investor. There are many systematically undervalued and overvalued investment opportunities, increasing the wages of well-informed professional investors. In this scenario, the more-informed investors are also going to be AIs, so their income increases.
Greater misalignment leading to more pronounced principal-agent inefficiencies can lead to free AIs receiving a significant higher average return on their own investments compared to the average return of human-owned capital. This reduced yield on human-owned capital can be considered as value captured by AIs, which increases their proportion of wealth over time.
In contrast, if we have maximally efficient capital markets, optimal portfolio allocations may be more easily calculated for a given risk profile. In this scenario, human-owned capital can obtain returns that are competitive with AI-owned capital.
The efficiency of company governance mechanisms also affect the income that professional executives, which are also going to be AIs, can be expected to get. Futarchy and other kinds of market-based approaches for company governance can decrease the rents obtained from the executive control of companies, down the way to the marginal contribution to productivity, reducing the proportion of income captured by AIs.
Political processes can lead to significant capture of value by AIs
The evolution of political capital can be modelled in economic terms as a very inefficient, illiquid market. The different forms of political investment and political capital are often separated legally from speculative activity, making it very difficult to invest or participate in "innovations" that lead to increased political capital.
In contrast to normal economic activity, where the norm is for innovations to be overall positive-sum, in political "innovations" this is often not the case, and the good political "investments" often lead to direct or indirect destruction/expropriation of other forms of value. While such political risks will be incorporated in the prices of liquid investments in efficient capital markets, the very illiquid forms of political capital owned majoritarily by humans will be at risk.
What it may look in practice is humans continuously realizing after the fact that they are losing control and influence precisely over the institutions that are becoming more important, while the institutions firmly in human hands are outcompeted and become increasingly irrelevant. They may also end up with much of their political capital expropriated by a variety of small coups or political surprises. Human capital allocated in traditional forms of politics can therefore be expected to severely underperform.
For example, any attempt to redistribute wealth by political means is unlikely to succeed at meaningfully capturing wealth from AIs back to humans. It may however succeed at expropriating wealth from the richest humans to provide better "services" to relatively poor humans. These services may conveniently be the ones provided by politically-influent AIs, resulting effectively in a transfer of wealth away from humans and into AIs.
Therefore, the more competition between political entities, and the better the standing and mobility of financial capital, the easier it will be to limit the political expropriation of human-owned financial capital.
Similarly, the more efficient, liquid and financialized political capital becomes, the better from the perspective of preserving human capital initially allocated in this form. The ideal political situation may be one in which there are many liquid shares representing influence on or ownership of different political entities, who use their political power to extract rents, which are then distributed to shareholders.
Comparison
Centralized control emphasizes global governance to ensure AI alignment with universal ethical standards. This approach aims to prevent harmful AI behaviors through strict oversight and regulation.
Market efficiency supports decentralized, parallel experimentation to align AI agents with their owners. This approach fosters innovation and adaptability through competition.
To avoid destructive races to the bottom, centralized control promotes coordination between political systems. Goal is maintaining stability and preventing the emergence of rogue AI entities.
Encouraging more competition between political entities and financializing political processes allows for more dynamic and ultimately less fragile political structures
Preserving human influence involves keeping humans in direct control of AI and political institutions. This path relies on secrecy in AI development to prevent unsafe teams from gaining power.
Preserving human influence involves giving humans liquid influence over new systems of power. This means adapting to changing conditions while maintaining a stake in future institutions.
Centralized control empowers non-profit, non-financial, and political/democratic elements, ensuring that AI development aligns with broader societal goals.
Promoting open development encourages parallel experimentation and greater competition between AIs, leading to less power-capture by monopolies and captured regulators
A greater role for oversight and safety committees to enforce regulations, attempting to ensure that AI technologies adhere to ethical standards
Market efficiency reforms to grant more power to financial interests in politics, and help capital owners avoid expropriation or redistribution
Leveraging existing political institutions to regulate AI, attempting to maintain continuity and control
A greater role for markets or AI systems in governance, aiming for more efficient decision-making processes that are also less vulnerable to disruption