I just want to make sure that when I donate money to AI alignment stuff, it's actually going to be used economically.

8 Answers

Jeremy Gillen

The non-spicy answer is probably the LTFF, if you're happy deferring to the fund managers there. I don't know what your risk tolerance for wasting money is, but you can check whether they meet it by looking at their track record.

If you have a lot of time, you might be able to find better ways to spend money than the LTFF can (e.g., if you can find a good way to fund intelligence amplification, as Tsvi said).

My perspective is that I'm much more optimistic about policy than about technical research, but I don't really feel qualified to evaluate policy work, and the LTFF makes almost no grants on policy. I looked around and couldn't find any grantmakers who focus on AI policy. And even if they existed, I don't know that I could trust them (for instance, I don't think Open Phil is trustworthy on AI policy, and I kind of buy Habryka's arguments that their policy grants are net negative).

I'm in the process of looking through a bunch of AI policy orgs myself. I don't think I ca... (read more)

How does someone view the actual outcomes of the ‘Highlighted Grants’ on that page?

It would be a lot more reassuring if readers could check that they've all been fulfilled and/or exceeded expectations.

TsviBT

You probably shouldn't donate to alignment research. There's too much useless stuff with too good PR for you to tell what, if anything, is hopeworthy. If you know any young supergenius people who could dedicate their lifeforce to thinking about alignment FROM SCRATCH given some money, consider giving to them.

If there's some way to fund research that will lead to strong human intelligence amplification, you should do that. I can give some context for that, though not concrete recommendations.

Just to clarify, do you only consider 'strong human intelligence amplification' through some internal change, or do you also consider AIs to be part of that? As in, it sounds like you are saying we currently lack the intelligence to make significant progress on alignment research and consider increasing human intelligence to be the best way to make progress. Are you also of the opinion that using AIs to augment alignment researchers and progressively automate alignment research is doomed and not worth consideration? If not, then here.

TsviBT
Not strictly doomed, but probably doomed, yeah. You'd have to first do difficult, interesting, novel philosophy yourself, and then look for things that would have helped with that.
jacquesthibs
Fair enough. For what it's worth, I've thought a lot about the kind of thing you describe in that comment, and I'm partly committing to this direction because I feel like I have enough intuition and insight that those other tools for thought failed to incorporate.
TsviBT
Sure. You can ping me at some point if you want to talk about ideas for that or get my feedback or whatever.

RHollerith

TsviBT didn't recommend MIRI, probably because he receives a paycheck from MIRI and does not want to appear self-serving. I, on the other hand, have never worked for MIRI and am unlikely ever to (being of the age at which people usually retire), so I feel free to recommend MIRI without hesitation or reservation.

MIRI has abandoned hope of anyone's solving alignment before humanity runs out of time: they continue to employ people with deep expertise in AI alignment, but those employees spend their time explaining why the alignment plans of others will not work.

Most ... (read more)

TsviBT

I'm no longer employed by MIRI. I think Yudkowsky is by far the best source of technical alignment research insight, but MIRI's research program was, in retrospect, probably pretty doomed even before I got there. I can see ways to improve it, but I'm not that confident, and I can somewhat directly see that I'm probably not capable of carrying out my suggested improvements. And AFAIK, as you say, they're not currently doing very much alignment research. I'm also fine with appearing self-serving; if I were actively doing alignment research, I might recommend myself, though I don't really think it's appropriate to do so to a random person who can't evaluate arguments about alignment research and doesn't know who to trust. I guess if someone pays me enough, I'll do some alignment research. I recommend myself as one authority among others on strategy regarding strong human intelligence amplification.

RHollerith
I'm not saying that MIRI has some effective plan which more money would help with. I'm only saying that, unlike most of the actors accepting money to work in AI safety, at least they won't use a donation in a way that makes the situation worse. Specifically, MIRI does not publish insights that help the AI project and is very careful in choosing whom they will teach technical AI skills and knowledge.
ryan_greenblatt
Seems false; they could have problematic effects on discourse if their messaging is poor or seems dumb in retrospect. I disagree pretty heavily with MIRI, which makes this more likely from my perspective. It seems likely that Yudkowsky has lots of bad effects on discourse right now, even by his own lights. I feel pretty good about official MIRI comms activities from my understanding, despite a number of disagreements.
[anonymous]
TsviBT
Not sure what you're asking. I think someone trying to work on the technical problem of AI alignment should read Yudkowsky. I think this because... of a whole bunch of the content of ideas and arguments. Would need more context to elaborate, but it doesn't seem like you're asking about that.
[anonymous]
TsviBT
I still don't know what you mean.

I can give some context for that

please do!

TsviBT
https://www.lesswrong.com/posts/jTiSWHKAtnyA723LE/overview-of-strong-human-intelligence-amplification-methods
TsviBT
DM'd
Said Achmiz
Would you mind posting that information here? I am also interested (as, I’m sure, are others).
TsviBT
https://www.lesswrong.com/posts/jTiSWHKAtnyA723LE/overview-of-strong-human-intelligence-amplification-methods

Zach Stein-Perlman

I am excited about donations to all of the following, in no particular order:

  • AI governance
    • GovAI (mostly research) [actually I haven't checked whether they're funding-constrained]
    • IAPS (mostly research)
    • Horizon (field-building)
    • CLTR (policy engagement)
    • Edit: also probably The Future Society (policy engagement, I think) and others but I'm less confident
  • LTFF/ARM
  • Lightcone

I was recently looking into donating to CLTR, and I'm curious why you're excited about it. My sense was that little of its work was directly relevant to x-risk (for example, this report on disinformation is essentially useless for preventing x-risk, AFAICT), and the relevant work seemed either not good or possibly counterproductive. For example, their report on "a pro-innovation approach to regulating AI" seemed bad to me on two counts:

  1. There is a genuine tradeoff between accelerating AI-driven innovation and decreasing x-risk. So to the extent that this repo
... (read more)
Zach Stein-Perlman
I don't know. I'm not directly familiar with CLTR's work — my excitement about them is deference-based. (Same for Horizon and TFS, mostly. I inside-view endorse the others I mention.)

Nathan Helm-Burger

Maybe it's somewhat in bad taste to propose a project I am involved in, but I think that Max Harms's and Seth Herd's ideas on Corrigibility / DWIMAC need support; ideally, in my eyes, an org focused specifically on it.

See the Corrigibility as Singular Target series for details.

Logan Zoellner

This is going to be an unpopular answer, but you should invest the money in a fund you personally control that is pretty much equally balanced between Google, Microsoft, Tesla, Apple, and Amazon.

This maximizes the leverage you will have at the critical moment (which is not now).
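As a purely illustrative sketch of that kind of equal-weight split (the ticker symbols and the dollar amount below are my own assumptions, not part of the answer):

```python
# Illustrative sketch only: an equal-weight split across the five suggested companies.
# The tickers and the example amount are assumptions added for illustration.
TICKERS = ["GOOGL", "MSFT", "TSLA", "AAPL", "AMZN"]

def equal_weight_allocation(total_dollars: float) -> dict:
    """Divide a sum equally across the five holdings."""
    per_holding = total_dollars / len(TICKERS)
    return {ticker: per_holding for ticker in TICKERS}

if __name__ == "__main__":
    # e.g. 10,000 -> 2,000 per holding
    print(equal_weight_allocation(10_000))
```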

Foyle

I think there is far too much focus on technical approaches, when what is needed is a more socio-political focus: raising money, convincing deep pockets of the risks in order to leverage smaller sums, and buying politicians, influencers, and perhaps other groups that can be co-opted and convinced of existential risk into putting a halt to AI dev.

It amazes me that there are huge, well-financed, and well-coordinated campaigns for climate, social, and environmental concerns (trivial issues next to AI risk), and yet AI risk remains strictly academic/fringe. What is on paper a very smart community, embedded in perhaps the richest metropolitan area the world has ever seen, has not been able to create the political movement needed to slow things down. I think that's precisely because they're pitching to the wrong crowd.

Dumb it down. Identify large, easily influenceable demographics with a strong tendency toward anxiety that can be most readily converted: most obviously teenagers, particularly girls, and focus on convincing them of the dangers; perhaps also teachers as a community, with their huge influence; but maybe also the elderly, the other stalwart group we see so heavily involved in environmental causes. It would have orders of magnitude more impact than the current cerebral, elite-focused approach, and history is replete with revolutions born out of the targeted conversion of teenagers to drive them.

particularly girls

why!?

ZY
I don't understand either. If it means what it seems to mean, this is a very biased perception and not very rational (truth-seeking or causality-seeking). There should be better education systems to fix that.

lc

The Center on Long-Term Risk is absurdly underfunded, but they focus on s-risks and not x-risks.

Gesild Muka

Maybe there's a way to hedge against P(doom) by investing in human prosperity and proliferation while discouraging large leaps in tech. Maybe your money should go towards encouraging or financing low-tech, high-fertility communities?

5 comments

In my opinion the hard part would not be figuring out where to donate to {decrease P(doom) a lot} rather than {decrease P(doom) a little}, but figuring out where to donate to {decrease P(doom)} rather than {increase P(doom)}.
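One rough way to make that concrete (the symbols here are illustrative, not from the comment): if a donation helps with probability $p_{\text{help}}$, reducing P(doom) by $\delta$, and backfires with probability $p_{\text{harm}}$, increasing it by $\varepsilon$, then

$$\mathbb{E}[\Delta P(\text{doom})] = -\,p_{\text{help}}\,\delta + p_{\text{harm}}\,\varepsilon,$$

so the sign of the net effect depends on whether $p_{\text{harm}}\,\varepsilon$ exceeds $p_{\text{help}}\,\delta$, not just on how large $\delta$ is.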

so, don't donate to people who will take my money and go buy OpenAI more supercomputers while thinking that they're doing a good thing?

and even if I do donate to some people who work on alignment, they might publish their results and make OpenAI even more confident that we'll have it under control by the time they finish?

or some other weird way donating might increase P(doom) that I haven't even thought of?

that's a good point

now i really don't know what to do

Do you want to donate to alignment specifically? IMO AI governance efforts are significantly more p(doom)-reducing than technical alignment research; it might be a good idea to, e.g., donate to MIRI, as they’re now focused on comms & governance.

If you don't just want the short answer of "probably LTFF" and want a deeper dive on options, Larks' review is good, if (at this point) dated.

Let me first say what I think alignment (or "superalignment") actually requires. This is under the assumption that humanity's AI adventure issues in a superintelligence that dominates everything, and that the problem to be solved is how to make such an entity compatible with human existence and transhuman flourishing. If you think the future will always be a plurality of posthuman entities, including enhanced former humans, with none ever gaining an irrevocable upper hand (e.g. this seems to be one version of e/acc); or if you think the whole race towards AI is folly and needs to be stopped entirely; then you may have a different view. 

I have long thought of a benevolent superintelligence as requiring three things: superintelligent problem-solving ability; the correct "value system" (or "decision procedure", etc.); and a correct ontology (and/or the ability to improve its ontology). The first two criteria would not be surprising in the small world of AI safety that existed before the deep learning revolution. They fit a classic agent paradigm like the expected utility maximizer, with alignment (or Friendliness, as we used to say) being a matter of identifying the right utility function.

The third criterion is a little unconventional, and my main motive for it even more so, in that I don't believe the theories of consciousness and identity that would reduce everything to "computation". I think they (consciousness and identity) are grounded in "Being" or "substance", in a way that the virtual state machines of computation are not; that there really is a difference between a mind and a simulation of a mind, for example. This inclines me to think that quantum holism is part of the physics of mind, but that thinking of it just as physics is not enough; you need a richer ontology of which physics is only a formal description. But these are more like the best ideas I've had than something I am absolutely sure is true. I am much more confident that purely computational theories of consciousness are radically incomplete than I am about what the correct alternative paradigm is.

The debate about whether the fashionable reductionist theory of the day is correct is as old as science. What does AI add to the mix? On the one hand, there is the possibility that an AI with the "right" value system but the wrong ontology might do something intended as benevolent that misses the mark because it misidentifies something about personhood. (A simple example of this might be that it "uploads" everyone to a better existence, but uploads aren't actually conscious, they are just simulations.) On the other hand, one might also doubt the AI's ability to discover that the ontology of mind according to which uploads are conscious is wrong, especially if the AI itself isn't conscious. If it is superintelligent, it may be able to discover a mismatch between standard human concepts of mind, extrapolated in a standard way, and how reality actually works; but lacking consciousness itself, it might also lack some essential inner guidance on how the mismatch is to be corrected.

This is just one possible story about what we could call a philosophical error in the AI's cognition and/or the design process that produced it. I think it's an example of why Wei Dai regards metaphilosophy as an important issue for alignment. Metaphilosophy is the (mostly philosophical) study of philosophy, and includes questions like, what is philosophical thought, what characterizes correct philosophical thought, and, how do you implement correct philosophical thought in an AI? Metaphilosophical concerns go beyond my third criterion, of getting ontology of mind correct; philosophy could also have something to say about problem-solving and about correct values, and even about the entire three-part approach to alignment with which I began. 

So perhaps I will revise my superalignment schema and say: a successful plan for superalignment needs to produce problem-solving superintelligence (since the superaligned AI is useless if it gets trampled by a smarter unaligned AI), a sufficiently correct "value system" (or decision procedure or utility function), and some model of metaphilosophical cognition (with particular attention to ontology of mind).