Produced as part of the ML Alignment & Theory Scholars Program - Summer 2024 Cohort

This post presents the basic story behind my current research direction at MATS. I'm looking for feedback on whether the basic premises hold up and which research questions I should prioritize.

The role of investors in fueling the AGI race is unclear

Most Frontier AI Labs acknowledge the extreme risks of AGI development, invest in safety, and seem to care at least somewhat about making sure their systems don't destroy the world.

However, Frontier Labs currently appear to be in a race to build AGI, and many fear that this race will lead Labs to make risky bets and forgo adequate safety measures in order to have a chance of winning. A common explanation is that Labs are for-profit companies and therefore cannot put safety first: investors would force them to prioritize profit and continue rapidly developing capabilities even though executives know the risks.

But do investors actually have enough influence over companies to force them to build dangerous models against their better judgement?

There are strong economic arguments for why companies that prioritize profit are rewarded in the long run, both through selection effects (profit-seeking companies are less likely to go bankrupt) and through capital allocation (non-profit-seeking companies don't receive investment or don't even get started). However, these mechanisms only work on average over long periods of time, and in the short term investors regularly fail to keep the companies they invest in from making stupid decisions and going bankrupt (e.g. FTX).

So I think it is unclear how much power investors have over Frontier Labs' overall strategy for building AGI, or over specific short-term decisions like whether to train a particular dangerous model. I expect their effective power to vary by company, by type of decision and by timescale. Their power will depend on specific corporate structures, executives, boards, internal cultures, ownership types and applicable corporate laws.

Identifying the power of investors over Labs can inform AI governance strategy

I expect the details of investors' influence over Labs' decisions to be relevant to several strategic considerations:

  • Whether widespread investor awareness of the extreme dangers of AGI development would change the behavior of Labs.
  • Whether regulations that change the financial incentives of Labs would succeed in changing their behavior.
  • Whether Labs would avoid actions that would harm the rest of their investors' portfolios (e.g. avoiding automation of certain industries, avoiding large-scale catastrophes).
  • Whether executives could take actions that would disempower investors, empower themselves, or take any kind of large-scale unilateral action enabled by AGI (e.g. pivotal acts, implementing their own ideology in the AGI, taking military or political control, distributing profits through a UBI).
  • And probably many others.

Research direction: Investigating how investors influence Frontier AI Labs

I expect that there are many open questions in this area, and that studying them could help governance researchers make better decisions and policy proposals.

Here is a list of questions that seem important to investigate:

  • What are the different ways in which investors have formal control over Frontier Labs? Over which timescales does this control take place?
  • How can we quantify the amount of influence each stakeholder has over the Labs' decisions?
  • How does the amount of investor influence over the development of AGI change depending on the length of timelines?
  • How would the bursting of the AI bubble or other macroeconomic events change the power of investors?
  • How do different Labs compare in the amount of influence investors have over their decisions?
  • How does companies' fiduciary duty to investors work? How long does a typical breach-of-fiduciary-duty case take? In what cases could Labs' investors win, and would such a case be likely to work in practice?
  • What are the different ways in which investors have informal power over labs? How does this compare with their formal power?

All feedback is welcome! Feel free to comment on whether the basic premises hold up, whether this is an impactful research direction, what research questions I should prioritize, or how this work should be published.

7 comments

Your theory of change seems pretty indirect. Even if you do this project very successfully, to improve safety, you mostly need someone to read your writeup and do interventions accordingly. (Except insofar as your goal is just to inform AI safety people about various dynamics.)


There's classic advice like "find the target audience for your research and talk to them regularly so you know what's helpful to them". For an exploratory project like this, maybe you don't really have a target audience. So... at least write down theories of change, keep them in mind, and notice how various lower-level directions relate to them.

Generally I'd steer towards informal power over formal power. 

Think about the OpenAI debacle last year. If I understand correctly, Microsoft had no formal power to exert control over OpenAI. But they seemed to have the employees on their side: they could credibly threaten to hire away all the talent, and thereby reconstruct OpenAI's products as Microsoft IP. Beyond that, OpenAI was perhaps somewhat dependent on Microsoft's continued investment; even if they didn't have to do as Microsoft said, were they really going to jeopardise future funding? What was at stake was not just future funding from Microsoft, but also funding from all future investors, who will look at how OpenAI treated its investors in the past to gauge the value they would get by investing.

Informal power structures do seem more difficult to study, because they are by nature much less legible. You have to perceive and name the phenomena, the conduits of power, yourself, rather than having them laid out for you in legislation. But a case study of last year's events could give you something concrete to work with. You might form some theories about the power relations between labs, their employees, and investors, and then, based on those theoretical frameworks, describe some hypothetical future scenarios and their likely outcomes.

If there was any lesson from last year's events, IMAO, it was that talent and the raw fascination with creating a god might be even more powerful than capital. Dan Faggella described this well in a Future of Life podcast episode released in May this year (from about 0:40 onwards).

My favored version of this project would involve >50% of the work going into the econ literature and models on investor incentives, with attention to

  • Principal-agent problems
  • Information asymmetry
  • Risk preferences
  • Time discounting

And then a smaller fraction of the work would involve looking into AI labs specifically. I'm curious whether this matches your intentions for the project, or whether you think there are important lessons about the labs that will not be found in the existing econ literature.

I expect that basic econ models and their implications for investors' motivations are already mostly known in the AI safety community, even if only through vague statements like "VCs are more risk tolerant than pension funds".

My main point in this post is that AI labs may have successfully removed themselves from the influence of investors, so that what their investors want or do actually matters very little. I think determining whether this is the case is important, because if it is, our intuitions about how companies generally work would not apply to AI labs.

"How does the fiduciary duty of companies to investors work?"

OpenAI instructs investors to view their investments "in the spirit of a donation," which might be relevant for this question.

The link does not work.

I don't think a written disclaimer would amount to much in a court case without corresponding provisions in the corporate structure.