The people you need to soften/moderate your message to reach (or who need social proof in order to get involved) are seldom going to be the ones who can think clearly about this stuff.
I strongly agree with this. (I wrote a post about it years ago.[1])
Even among the people who were not "in early", of the ones I most respect, who seem to me to be doing the most impressive work that I'm most grateful to have in the world, zero of them needed hand-holding or "outreach" to get them on board.
Writing the sequences was an amazing, high quality intervention that continues to pay dividends to this day. I think writing on the internet about the things that you think are important is a fantastic strategy, at least if your intellectual taste is good.
The payoff of most of the "movement building" and "community building" seems much murkier to me. At least some of it was clearly positive, but I don't know if it was positive on net (I think a smaller and more intense EA than the one we have in practice probably would have been better).
There's selection bias in the kinds of community building I observed, but it seems to me that community building was more effective to the extent that it was "just get together and try to do the thing" instead of "try to do outreach to get other people on board".
eg
The best MIRIx groups > CFAR workshops[2] > EAGs > EA onboarding programs at universities.
The HP:MoR wrap parties seem to have been pretty notably impactful, though, and those were closer to outreach.
I keep thinking that I should crosspost this to LessWrong and the EA forum, but haven't yet, since I need to rename it well.
If you, dear reader, think that I really should do that, bugging me about it seems likely to make it more likely to happen.
To be clear, CFAR workshops were always community building interventions, and fell far short of the standard that I would expect of a group working to seriously develop a science of human rationality, but they were still much more "contentful" and about making progress than most community building interventions are.
just automatically clicking upvote as I start reading a post with an interesting first paragraph by someone whose name
Dude! You upvote the posts before you read them?!
This is probably pretty common, now that I consider it, but it seems like it's doing a disservice to the karma system. Shouldn't we upvote posts that we got value out of instead of ones that we expect to get value out of?
But I do still feel like I want to give a tiny little fuck you to the precommitment, which is why this post is exactly 499 words.
I laughed out loud at this.
I think you nailed it perfectly.
The single most important question in AGI safety is: Is the AGI trying to do something that we didn’t intend for it to be trying to do?
Thinking aloud: Is this right?
It seems like "maybe, for a narrow notion of 'intend'."
Like, if we build a superintelligence to solve problems that are eluding humans, the superintelligence will be doing things that we didn't intend (because we didn't think of them) all the time.
Goodhart's law strikes again! Once there's a pressure to write every day, its usefulness as an indicator is over.
I mean, without doing the experiment it's hard to know whether writing every day is causal or not. It seems totally plausible that it's a habit one has to build that becomes easier over time, and a person who builds that habit ends up having more shots on goal, and so ends up writing more good stuff and building an audience, which creates the self-sustaining loop.
That is, while it was bad for the people who didn't get rule of law, they were a separate enough category that this mostly didn't "leak into" undermining the legal mechanisms that helped their societies become productive and functional in the first place.
I'm speaking speculatively here, but I don't know that it didn't leak out and undermine the mechanisms that supported productive and functional societies. The sophisticated SJW in me suggests that this is part of what caused the eventual (though not yet complete) erosion of those mechanisms.
It seems like if you have "rule of law" that isn't evenly distributed, actually what you have is collusion by one class of people to maintain a set of privileges at the expense of another class of people, where one of the privileges is a sand-boxed set of norms that govern dealings within the privileged class, but with the pretense that the norms are universal.
This kind of pretense seems like it could be corrosive: people can see that the norms that society proclaims as universal actually aren't. This reinforces a sense that the norms aren't real at all, or at least a justified sense that the ideals that underlie those norms are mostly rationalizations papering over the collusion of the privileged class.
eg when it looks like "capitalism" and "democracy" are scams supporting "white supremacy", you grow disenchanted with capitalism and democracy, and stop doing the work to maintain the incomplete versions of those social mechanisms that were previously doing work in your society?
No matter how weird the answers are, don’t correct them.
Love it.
As we argued for at the time, training on a purely predictive loss should, even in the limit, give you a predictor, not an agent—and we’ve now seen this stay true even through substantial scaling (though there is still some chance this will break at some point).
Is there anyone who significantly disputes this?
I'm not trying to ask a rhetorical question a la "everyone already thinks this, this isn't an update". I'm trying to ascertain whether there's a consensus on this point.
I've understood Eliezer to sometimes assert something like "if you optimize a system for sufficiently good predictive power, a consequentialist agent will fall out, because an agent is actually the best solution to a broad range of prediction tasks."
[Though I want to emphasize that that's my summary, which he might not endorse.]
Does anyone still think that or something like that?
Speaking as one of the people involved with running the Survival and Flourishing Fund, I'm confident that such an org would easily raise money from SFF and SFF's grant-speculators (modulo some basic due diligence panning out).
I myself would allocate at least 50k.
(SFF ~only grants to established organizations, but if someone wanted to start a project to do this, and got an existing 501(c)(3) to fiscally sponsor the project, that totally counts.)