LessWrong team member / moderator. I've been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I've been interested in improving my own epistemic standards and helping others to do so as well.
Maybe I'm not sure what you mean by "have a respectable position."
I'm not sure either, but for example if a scientist publishes an experiment, and then another scientist with a known track record of understanding things publishes a critique, the first scientist can't respectably dismiss the critique without substantively engaging with it.
I think:
And for at least some of those points, I'm personally like "my intuitions lean in the other direction from y'all Camp B people, but I don't feel like I can really confidently stand by it; I don't think the argument has been made very clearly."
Things I have in mind:
On "How Hard is Success?"
On "How Bad is Failure?"
For all of those, like, I know the arguments against, but my own current take is not like >75% on any of these given model uncertainty, and meanwhile, if your probabilities are below 50% on the relevant MIRI-ish argument, you also have to worry about...
...
Other geopolitical concerns and considerations
I am maybe starting from the assumption that sooner or later, alignment research would reach this point, and "well, help the alignment research progress as fast as possible" seemed like a straightforward goal on the meta-level and is one of the obvious things to be shooting for whether or not it's currently tractable.
I have a current set of projects, but, the meta-level one is "look for ways to systematically improve people's ability to quickly navigate confusing technical problems, and see what works, and stack as many interventions as we can."
(I can drill into that, but I'm not sure what level your skepticism was on)
works on safety, and because international coordination seems possible, we need to focus on regulation and policy before ASI kills everyone
Is this actually a quadrant? Or rather, I'm not sure I'm parsing what the axes are.
FYI the particular thing I care about here is less "our usual literal frontpage criteria", and more people doing things that seem aimed at bypassing the frontpage criteria, particularly for the purpose of getting attention on a promotional thing. (Which may be somewhat different than kave's take.)
I mean why can't the people in these LW conversations say something like "yeah the lion's share of the relevant power is held by people who don't sincerely hold A or B / Type 1/2"?
Here are some groups who I think are currently relevant (not in any particular order, without quite knowing where I'm going with this yet)
Okay, writing that out turned out to take most of the time I felt like spending right now, but the next questions I have in mind are "who has power, here, over what?", or "what is the 'relevant' power?"
But, roughly:
a) a ton of the above are "fake" in some sense
b) On the world scale, the OpenPhil/Constellation/Anthropic cluster is relatively weak.
c) within OpenPhil/Constellation/Anthropic, there are people more like Dario, Holden, Jack Clark, and Dustin, and people who are more rank-and-file-EA/AI-ish. I think the latter are fake the way I think you think things are fake. I think the former are differently fake from the way I think you think things are fake.
d) there are a ton of vague EA/AI-safety people that I think are fake in the way you think they are fake, but they don't really matter except for The Median Researcher Problem
I want that statement too, but that doesn't seem like this one's job. This one is for establishing common knowledge that "it'd be bad to build ASI under current conditions"; there probably wouldn't be enough consensus yet that "...and that means stop building AGI", so it wouldn't be very useful to try.
Yes, you have to understand that they are not doing the "have a respectable position" thing.
I think this is false for the particular people I have in mind (who to be clear are filtered for "are willing to talk to me", but, they seem like relatively central members of a significant class of people).
Maybe I'm not sure what you mean by "have a respectable position."
(I think a large chunk of the problem is "the full argument is complicated, people aren't tracking all the pieces." Which is maybe not "intellectually respectable", though IMO understandable, and importantly different from 'biased.' But, when I sit someone down and make sure to lay out all the pieces and make sure they understand each piece and understand how the pieces fit together, we still hit a few pieces where they are like 'yeah I just don't buy that.')
Maybe I should just check, are you consciously trying to deny a conflict-type stance, and consciously trying to (conflictually) assert the mistake-type stance, as a strategy?
I'm saying you seem to be conflict-stanced in a way that is inaccurate to me (i.e. you are making a mistake in your conflict)
I think it's correct to be conflict-stanced, but you need like a good model of who/what the enemy is ("sniper mindset"), and the words you're saying sound to me like you don't (in a way that seems more tribally biased than you usually seem to me)
A thing I've been thinking lately (this is reposted from a twitter thread where it was more squarely on-topic, but seems like a reasonable part of the convo here, riffing off the Tsvi thread)
It matters a fair amount which biases people have, here.
A few different biases pointing in the "Plan 2 for bad reasons" direction:
1. a desire for wealth
2. a desire to not look weird in front of your friends
3. a desire to "be important"
4. subtly different from #3, a desire to "have some measure of control over the big forces playing out."
5. a desire to be high status in the world's Big League status tier
6. action bias, i.e. inability to do nothing.
7. bias against abstract arguments that you can't clearly see/measure, or against sitting with confusion.
8. bias to think things are basically okay and you don't need to majorly change your life plans.
9. being annoyed at people who keep trying to stop you or make you feel bad or be lower status.
10. being annoyed at people who seem to be missing an important point when they argue with you about AI doom.
All of these seem in-play to me. But depending on these things' relative strength, they suggest different modes of dealing with the problem.
A reason I am optimistic about If Anyone Builds It is that I think it has a decent chance of changing how reasonable it feels to say "yo guys I do think we might kill everyone" in front of both your friends and high-status bigwigs.
This won't be sufficient to change decisionmaking at labs, or people's propensity to join labs. But I think the next biggest bias is more like "feeling important/in-control" than "having wealth."
I view this all pretty cynically. BUT, not necessarily pessimistically. If IABIED works, then the main remaining blocker is "having an important/in-control thing to do, which groks some arguments that are more abstract."
You don't have to get rid of people's biases, or defeat them memetically (although those are both live options too). You can also steer towards a world where their biases become irrelevant.
So, while I do really wanna grab people by the collar and shout:
"Dudes, Dario is one of the most responsible parties for causing the race conditions that Anthropic uses to justify their actions, and he lied or was grossly negligent about whether Anthropic would push the capabilities frontier. If your 'well Plan 2 seems more tractable' attitude doesn't include 'also, our leader was the guy who gave the current paradigm to OpenAI, then left OpenAI, gained early resources via deception/communication-negligence and caused the current race to start in earnest' you have a missing mood and that's fucked."
...I also see part of my goal as trying to help the "real alignment work" technical field reach a point where the stuff-that-needs-doing is paradigmatic enough that you can just point at it, and the action-biased-philosophy-averse lab "safety" people can just say "oh, sure it sounds obvious when you put it like that, why didn't you say that before?"
I think "want to feel traction/in-control" is more obviously a bias (and people vary in whether they read to me as having this bias.).
I think the attitude that not-sharing-core-intuitions isn't a respectable position is, well, idk, you can have that attitude if you want, but I don't think it's going to help you understand or persuade people.
There is no clear line between Type 2 and Type 3 people. It can be true both that people have earnest intellectual positions you find frustrating, where it's fundamentally an intellectual disagreement, and also that they have biases that you-and-they would both agree would be bad, and the share of causal impact between the intellectual positions and the biases can range from like 99% to 1% in either direction.
Even among people who do seem to have followed the entire Alignment-Is-Hard arguments and understand them all, a bunch just say "yeah I don't buy that as obvious" to stuff that seems obvious to me (or, in some cases, stuff that seems obviously '>50% likely' to me). And they seem sincere to me.
This results in sad conversations where you're like "but, clearly, I am picking up that you've got biased cope in you!" and they're like "but, clearly, I can tell that my thinking here is not just cope, I know what my copey-thinking feels like, and the particular argument we're talking about doesn't feel like that." (And both are correct – there was cope, but it lay elsewhere.)
It doesn't feel worth the time for me to listen to the whole thing, but if someone pulled out the highlights of "what particular new analogies are there? any particular nuances to the presentation that were interesting?" I'd be interested.