As a layperson, the problem has been that my ability to figure out what's true relies on being able to evaluate subject-matter experts' respective reliability on the technical elements of alignment. I've lurked in this community a long time; I've read the Sequences and watched the Robert Miles videos. I can offer a passing explanation of what the corrigibility problem is, or why ELK might be important.
None of that seems to count for much. Yitz made what I thought was a very lucid post from a similar level of knowledge, trying to bridge that gap, and got mostly answers that didn't tell me (or, as best I can tell, them) anything I wasn't already aware of in concept, plus Eliezer himself being kind of hostile in response to someone trying to understand.
So here I find myself in the worst of both worlds: the apparent plurality of the LessWrong commentariat says I'm going to die, and that to maximise my chance to die with dignity I should quit my job and take out a bunch of loans to try to turbo through an advanced degree in machine learning, and I don't have the tools to evaluate whether they're right.
I agree. I find myself in an epistemic state somewhat like: "I see some good arguments for X. I can't think of any particular counter-argument that makes me confident that X is false. If X is true, it implies there are high-value ways of spending my time that I am not currently doing. Plenty of smart people I know/read believe X, but plenty do not."
It sounds like that should maybe be enough to coax me into taking action about X. But the problem is that I don't think it's that hard to put me in this kind of epistemic state. E.g., if I were to read the right blogs, I think I could be brought into that state for a bunch of different values of X. A few off the top of my head that seem plausible:
So I don't feel super trusting of my epistemic state. I guess I feel a sort of epistemic learned helplessness, where I am suspicious of smart bloggers' ability to get me to think an issue is important and worth dedicating my life to.
Not totally sure how to resolve this, though I suppose it would involve some sort of "going off on my own and actually thinking deeply about what it should take to convince me".
I feel the same. I think there are just a lot of problems one could try to solve that would increase the good in the world. The difference between alignment and the rest seems to be that the probability of humans going extinct is much higher.
...and take out a bunch of loans...
That part really shouldn't be necessary (even if it may be rational, conditional on some assumptions). In the event that you do decide to devote your time to helping, whether for dignity or whatever else, you should be able to get funding to cover most reasonable forms of upskilling and/or a seeing-if-you-can-help trial period.
That said, I think step one would be to figure out where your comparative advantage lies (80,000 Hours folk may have thoughts, among others). Certainly some people should be upskilling in ML/CS/Math (though an advanced degree may not be the most efficient route), but there are other ways to help.
I realize this doesn't address the deciding-what's-true aspect.
I'd note there that I don't think much detailed ML knowledge is necessary to follow Eliezer's arguments on this. Most of the ML-dependent parts can be summarized as [we don't know how to do X], [we don't have any clear plan that we expect will tell us how to do X], similarly for Y and Z, and [either X, Y, or Z is necessary for safe AGI].
Beyond that, I think you only need a low prior on our bumping into a good solution while fumbling in the dark and a low prior on sufficient coordination, and things look quite gloomy. Probably you also need to throw in some pessimism on getting safe AI systems to fundamentally improve our alignment research.
you should be able to get funding to cover most reasonable forms of upskilling and/or a seeing-if-you-can-help trial period.
Hi Joe! I wonder if you have any pointers as to how to get help? I would like to try to help while being able to pay for rent and food. I think right now I may not be articulate enough to write grant proposals and get funding, so I think I could also use somebody to talk to in order to figure out the most high-impact thing I could do.
I wonder if you'd be willing to chat / know anybody who is?
Something like the 80,000 Hours career advice seems like a good place to start - or finding anyone who has a good understanding of the range of possibilities (mine is a bit too narrowly slanted towards technical AIS).
If you've decided on the AIS direction, then AI Safety Support is worth a look - they do personal calls for advice, and have many helpful links.
That said, I wouldn't let the idea of "grant proposals" put you off. The forms you'd need to fill for the LTFF are not particularly complicated, and they do give grants for e.g. upskilling - you don't necessarily need a highly specific/detailed plan.
If you don't have a clear idea where you might fit in, then the advice links above should help.
If/when you do have a clear idea, don't worry about whether you can articulate it persuasively. If it makes sense, then people will be glad to hear it - and to give you pointers (e.g. fund managers).
E.g. there's this from Evan Hubinger (who helps with the LTFF):
if you have any idea of any way in which you think you could use money to help the long-term future, but aren’t currently planning on applying for a grant from any grant-making organization, I want to hear about it. Feel free to send me a private message on the EA Forum or LessWrong. I promise I’m not that intimidating :)
Also worth bearing in mind as a general principle that if almost everything you try succeeds, you're not trying enough challenging things. Just make sure to take negative outcomes as useful information (often you can ask for specific feedback too). There's a psychological balance to be struck here, but trying at least a little more than you're comfortable with will generally expand your comfort zone and widen your options.
Thank you so much! I didn't know 80k does advising! In terms of people with knowledge on the possibilities... I have a background and a career path that doesn't end up giving me a lot of access to people who know, so I'll definitely try to get help at 80k.
Also worth bearing in mind as a general principle that if almost everything you try succeeds, you're not trying enough challenging things. Just make sure to take negative outcomes as useful information (often you can ask for specific feedback too). There's a psychological balance to be struck here, but trying at least a little more than you're comfortable with will generally expand your comfort zone and widen your options.
This was very encouraging! Thank you.
to maximise my chance to die with dignity I should quit my job and take out a bunch of loans to try to turbo through an advanced degree in machine learning
This is probably pretty tangential to the overall point of your post, but you definitely don't need to take loans for this, since you could apply for funding from Open Philanthropy's early-career funding for individuals interested in improving the long-term future or the Long-Term Future Fund.
You don't have to have a degree in machine learning. Besides machine learning engineering or machine learning research there are plenty of other ways to help reduce existential risk from AI, such as:
Personally, my estimate of the probability of doom is much lower than Eliezer's, but in any case, I think it's worthwhile to carefully consider how to maximize your positive impact on the world, whether that involves reducing existential risk from AI or not.
I'd second the recommendation for applying for career advising from 80,000 Hours or scheduling a call with AI Safety Support if you're open to working on AI safety.
I can't help with the object level determination, but I think you may be overrating both the balance and import of the second-order evidence.
As far as I can tell, Yudkowsky is a (?dramatically) pessimistic outlier among the class of "rationalist/rationalist-adjacent" SMEs in AI safety, and probably even more so relative to aggregate opinion without an LW-y filter applied (cf.). My impression of the epistemic track record is that Yudkowsky has a tendency to stake out positions (both within and without AI) with striking levels of confidence but not commensurately striking levels of accuracy.
In essence, I doubt there's much epistemic reason to defer to Yudkowsky more (or much more) than to folks like Carl Shulman or Paul Christiano, nor maybe much more than to "a random AI alignment researcher" or "a superforecaster making a guess after watching a few Rob Miles videos" (although these comparisons carry a few implied premises around difficulty curves / subject-matter expertise being relatively uncorrelated with judgemental accuracy).
I suggest ~all reasonable attempts at an idealised aggregate wouldn't take a hand-brake turn to extreme pessimism on finding that Yudkowsky is this pessimistic. My impression is the plurality LW view has shifted more from "pretty worried" to "pessimistic" (e.g. p(screwed) > 0.4) rather than to agreement with Yudkowsky, but in any case I'd attribute large shifts in this aggregate mostly to Yudkowsky's cultural influence on the LW community plus some degree of internet cabin fever (and selection) distorting collective judgement.
None of this is cause for complacency: even if p(screwed) isn't ~1, anything > 0.1 (or 0.001) is ample cause for concern, and resolution on values between (say) [0.1, 0.9] is informative for many things (like personal career choice). I'm not sure whether you get more yield for marginal effort on object-level or second-order uncertainty (e.g. my impression is the 'LW cluster' trends towards pessimism, so adjudicating whether this cluster should be over- or under-weighted could be more informative than trying to get up to speed on ELK). I would guess, though, that whatever distils out of LW discourse in 1-2 months will be much more useful than what you'd get right now.
If Yudkowsky is needlessly pessimistic, I guess we get an extra decade of time. How are we going to use it? Ten years later, will we feel just as hopeless as today, and hope that we get another extra decade?
This phrasing bothers me a bit. It presupposes that it is only a matter of time; that there's no error about the nature of the threat AGI poses, and no order-of-magnitude error in the timeline. The pessimism is basically baked in.
Fair point. We might get an extra century. Until then, it may turn out that we can somehow deal with the problem, for example by having a competent and benevolent world government that can actually prevent the development of superhuman AIs (perhaps by using millions of exactly-human-level AIs who keep each other in check and together endlessly scan all computers on the planet).
I mean, a superhuman AI is definitely going to be a problem of some kind, at least economically and politically. But in the best case, we may be able to deal with it, either because we somehow got more competent quickly, or because we had enough time to become more competent gradually.
Maybe even this is needlessly pessimistic, but if so, I don't see how.
I'm sympathetic to the position you feel you're in. I'm sorry it's currently like that.
I think you should be quite convinced before reaching the point of taking out loans to study, and the apparent plurality of the LessWrong commentariat is unlikely to be sufficient evidence to reach that level of conviction – just my feeling.
I'm hoping some more detailed arguments for doom will be posted in the near future and that will help many people reach their own conclusions not based on information cascades, etc.
Lastly, I do think people should be more "creative" in finding ways to boost log odds of survival. Direct research might make sense for some, but if you'd need to go back to school for it, there are maybe other things you should brainstorm and consider.
Sorry if this is a silly question, but what exactly are “log odds” and what do they mean in this context?
Odds are an alternative way of presenting probabilities. 50% corresponds to 1:1, 66.66..% corresponds to 1:2, 90% corresponds to 1:9, etc. (writing the odds here as against:for). 33.33..% corresponds to 2:1 odds, or, with the first number as a 1, 1:0.5 odds.
Log odds, or bits, are the base-2 logarithm of a probability expressed as 1:x odds, i.e. log2(x). In some cases, they can be a more natural way of thinking about probabilities (see e.g., here).
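For concreteness, here's a minimal Python sketch of the conversion between probabilities and log odds in bits (the function names are just illustrative, not from any particular library):

```python
import math

def prob_to_log_odds(p):
    """Convert a probability to log odds in bits: log2(p / (1 - p))."""
    return math.log2(p / (1 - p))

def log_odds_to_prob(bits):
    """Convert log odds in bits back to a probability."""
    odds = 2 ** bits          # odds in favor, i.e. the x in 1:x
    return odds / (1 + odds)

for p in (0.5, 2/3, 0.9, 0.99):
    print(f"p = {p:.4f}  ->  {prob_to_log_odds(p):+.2f} bits")
# p = 0.5000  ->  +0.00 bits
# p = 0.6667  ->  +1.00 bits
# p = 0.9000  ->  +3.17 bits
# p = 0.9900  ->  +6.63 bits
```

So moving from 50% to 90% is about 3.17 bits of evidence, and each additional bit roughly doubles the odds in favor.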
Well, at least one of us thinks you're going to die and to maximize your chance to die with dignity you should quit your job, say bollocks to it all, and enjoy the sunshine while you still can!
Don't look at opinions, look for data and facts. Speculations, opinions, or beliefs cannot be the basis on which you make decisions or update your knowledge. It's better to know few things, but with high confidence.
Ask yourself, which hard data points are there in favour of doom-soon?
Facts and data are of limited use without a paradigm to conceptualize them. If you have some you think are particularly illuminating, though, by all means share them here.
My main point is that there is not enough evidence for a strong claim like doom-soon. In the absence of hard data, anybody is free to cook up arguments for or against doom-soon.
You may not like my suggestion, but I would strongly advise getting deeper into the field and understanding it better yourself before making important decisions.
In terms of paradigms, you might have a look at why building AI software is hard (easy to get to 80% accuracy, hellish to get to 99%), at AI winters and hype cycles (the disconnect between claims/expectations and reality), and at the development of dangerous technologies (nuclear, biotech) and how stability has been achieved.
I want to say that this is the most sensible post I've seen on LW these last few weeks. Thanks for saying what I think a lot of people needed to hear, and that I personally didn't feel able to capture and write down well enough.
"My guess is that people who are concluding P(Doom) is high will each need to figure out how to live with it for themselves."
The following perspective helps me feel better.
First, it's not news that AGI poses a significant threat to humanity. I felt seriously worried when I first encountered this idea in 2018 listening to Eliezer on the Sam Harris podcast. The "Death With Dignity" post revived these old fears, but it didn't reveal new dangers that were previously unknown to me.
Second, many humans have dealt with believing "P(impending Doom) = high" at many times throughout history. COVID, Ukraine, famine in Yemen, WWI, WWII, the Holocaust, 9/11, incarceration in the US, Mongol conquests, the Congo Free State's atrocities, the Great Purge, the Cambodian genocide, the Great Depression, the Black Death, the Putumayo genocide, the Holodomor, the Trail of Tears, the Syrian civil war, the Irish Potato Famine, the Vietnam War, slavery, colonialism, more wars, more terrorist attacks, more plagues, more famines, more genocides, etc.
It's easy to read these events without simulating how the people involved felt. In most of these cases, "Doom" didn't mean "everyone on the planet will die" but rather "I will lose everything" or "everything that matters will be destroyed" or "everyone I know will die" or "my culture will die" or "my family will die" or simply "I will die." The thoughts these people had probably felt considerably more devastating than the thoughts I'm having these days. Heck, some people get terrified just reading the news. I'm not alone in worrying about the future.
Again: "I'm not alone in worrying about the future." I find this immensely comforting. People are not at all oblivious to the world having problems, even if they disagree with me on which problems are the most important. Everyone has fears.
I wonder if, 20 years from now, we will look back at these posts and be amazed at how a community that considers itself to be la crème de la crème of epistemology ended up believing in the end of the world coming in our lifetime, like so many other religions before it. I sincerely prefer that to the alternative.
When you say that you know more of Yudkowsky's reasoning and find it compelling, is that meant to imply that he has a more explicit, stronger argument for P(doom) which he hasn't shared elsewhere? Or is the information publicly accessible?
Within the last two weeks, two sets of things happened: Eliezer Yudkowsky shared a post expressing extreme pessimism about humanity's likelihood of surviving AGI, and a number of AI research labs published new, highly impressive results. The combination of these two has resulted in a lot of people feeling heightened concern about the AI situation and how we ought to be reacting to it.
There have been calls to pull "fire alarms", proposals for how to live with this psychologically, people deciding to enter the AI Alignment field, and a significant increase in the number of AI posts submitted to LessWrong.
The following is my own quick advice:
1. Form your own models and anticipations. It's easy to hear the proclamations of [highly respected] others and/or everyone else reacting and then reflexively update to "aaahhhh". I'm not saying "aaahhhh" isn't the right reaction, but I think for any given person it should come after a deliberate step of processing arguments and evidence to figure out your own anticipations. I feel that "A concrete bet offer to those with short AI timelines" is a great example of this. It lists lots of specific things the authors do (or rather don't) expect to see. "What 2026 looks like" is another example I'd point to of someone figuring out their own anticipations.[1]
2. Figure out your own psychology (while focusing on what's true). Eliezer, Turntrout, and landfish have each written about their preferred way of reacting to the belief that P(Doom) is very high. My guess is that people who are concluding P(Doom) is high will each need to figure out how to live with it for themselves. My caution is just that whatever strategy you figure out should keep you in touch with reality (or your best estimate of it), even if it's uncomfortable.
3. Be gentle with yourself. You might find yourself confronting some very upsetting realities right now. That's okay! The realities are very upsetting (imo). This might take some time to process. Let yourself do that if you need. It might take you weeks, months, or even longer to come to terms with the situation. That's okay.
4. Don't take rash action, and be cautious about advocating rash action. As far as I know, even the people with the shortest timelines still measure them in years, not weeks. Whatever new information came out these past two weeks, we can take some time to process and figure out our plans. Maybe we should figure out some new bold plans, but I think if that's true, it was already true before. We can start having conversations now, but upheavals don't need to happen this second.
5. You may need to be patient about contributions. Feelings of direness about the situation can bleed into feelings of urgency. As above, we're probably not getting AGI this week (or even this year) according to anyone, so it's okay to take time to figure out what you (or anyone else) should do. It's possible that you're not in a position to make any contributions right now, and that's also an okay reality. You can work on getting yourself into a better position to contribute without having to do something right now.
6. Beware the unilateralist's curse. I'm seeing a lot of proposals on LessWrong that aren't just for research directions, but also things that look more like political action. Political action may well be very warranted, but it's often something that both can't be taken back and affects a shared game board. If you're thinking to start on plans like this, I urge you to engage very seriously with the AI x-risk community before doing things. The fact that certain plans haven't been enacted already is likely not because no one had thought of them before, but because those plans are fraught.
It might help encourage people to form their opinions if I note that there isn't broad consensus about P(Doom). Eliezer has most recently expressed his view, but not everyone agrees – some people just haven't posted about it recently and I don't think their minds have been entirely changed by recent developments. I am personally inclined to agree with Eliezer's take, but that's because I know more of his reasoning and find it compelling. People shouldn't conclude that there's consensus in the "AI leadership", and even if there is, you should still think it through for yourself.