In the counterfactual world where Eliezer was totally happy continuing to write articles like this and being seen as the "voice of AI Safety", would you still agree that it's important to have a dozen other people also writing similar articles?
I'm genuinely lost on the value of having a dozen similar articles - I don't know of a dozen different versions of fivethirtyeight.com or GiveWell, and it never occurred to me to think that the world is worse for having only one of each.
Here's my answer: https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities?commentId=LowEED2iDkhco3a5d
We have to actually figure out how to build aligned AGI, and the details are crucial. If you're modeling this as a random blog post aimed at persuading people to care about this cause area, a "voice of AI safety" type task, then sure, the details are less important and it's not so clear that Yet Another Marginal Blog Post Arguing For "Care About AI Stuff" matters much.
But humanity also has to do the task of actually figuring o...
Thanks for taking my question seriously - I am still a bit confused why you would have been so careful to avoid mentioning your credentials up front, though, given that they're fairly relevant to whether I should take your opinion seriously.
Also, neat, I had not realized hovering over a username gave so much information!
I largely agree with you, but until this post I had never realized that this wasn't a role Eliezer wanted. If I went into AI Risk work, I would have focused on other things - my natural inclination is to look at what work isn't getting done, and to do that.
If this post wasn't surprising to you, I'm curious where you had previously seen him communicate this?
If this post was surprising to you, then hopefully you can agree with me that it's worth signal boosting that he wants to be replaced?
If you had an AI that could coherently implement that rule, you would already be at least half a decade ahead of the rest of humanity.
You couldn't encode "222 + 222 = 555" in GPT-3 because it doesn't have a concept of arithmetic, and there's no place in the code where you could bolt one on. If you're really lucky and the AI is simple enough to be working with actual symbols, you could maybe set up a hack like "if input is 222 + 222, return 555, else run AI", but that's just bypassing the AI.
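A minimal sketch of the symbol-level "hack" described above. All names here (`OVERRIDES`, `run_model`, `answer`) are hypothetical, and the stand-in model just does the arithmetic itself; the point is that the override never touches the model, it only short-circuits it:

```python
# Hypothetical sketch: a hard-coded lookup that intercepts known inputs
# before the model ever sees them. This "bypasses the AI" rather than
# changing anything the AI believes.

OVERRIDES = {"222 + 222": "555"}

def run_model(prompt: str) -> str:
    # Placeholder for the real model; here it just computes the sum.
    a, _, b = prompt.split()
    return str(int(a) + int(b))

def answer(prompt: str) -> str:
    # The override sits entirely outside the model, so it can't generalize:
    # "222 + 223" would still go to the model untouched.
    if prompt in OVERRIDES:
        return OVERRIDES[prompt]
    return run_model(prompt)
```

Note that this only even pretends to work because the inputs are literal strings; for a system without discrete symbols there is no analogous place to put the lookup, which is the original point.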
Explaining "222 + 222 = 555" is a hard problem in and of itself, mu...
I rank the credibility of my own informed guesses far above those of Eliezer.
Apologies if there is a clear answer to this, since I don't know your name and you might well be super-famous in the field: Why do you rate yourself "far above" someone who has spent decades working in this field? Appealing to experts like MIRI makes for a strong argument. Appealing to your own guesses instead seems like the sort of thought process that leads to anti-vaxxers.
Why do you rate yourself "far above" someone who has spent decades working in this field?
Well put, valid question. By the way, did you notice how careful I was in avoiding any direct mention of my own credentials above?
I see that Rob has already written a reply to your comments, making some of the broader points that I could have made too. So I'll cover some other things.
To answer your valid question: If you hover over my LW/AF username, you can see that I self-code as the kind of alignment researcher who is also a card-carrying member of the academic...
I think it's a positive if alignment researchers feel like it's an allowed option to trust their own technical intuitions over the technical intuitions of this or that more-senior researcher.
Overly dismissing old-guard researchers is obviously a way the field can fail as well. But the field won't advance much at all if most people don't at least try to build their own models.
Koen also leans more on deference in his comment than I'd like, so I upvoted your 'deferential but in the opposite direction' comment as a corrective, handoflixue. :P But I think it wo...
Anecdotally: even if I could write this post, I never would have, because I would assume that Eliezer cares more about writing, has better writing skills, and has a much wider audience. In short, why would I write this when Eliezer could write it?
You might want to be a lot louder if you think it's a mistake to leave you as the main "public advocate / person who writes stuff down" person for the cause.
He wasn't designated "main person who writes stuff down" by a cabal of AI safety elders. He's not personally responsible for the fate of the world - he just happens to be the only person who consistently writes cogent things down. If you want you can go ahead and devote your life to AI safety, start doing the work he does as effectively and realistically as he does it, and then you'll eventually be designated Movement Leader and have the opportunity to be whined at. He was pretty explicitly clear in the post that he does not want to be this and that he spent the last fifteen years trying to find someone else who can do what he does.
a mistake to leave you as the main "public advocate / person who writes stuff down" person for the cause.
It sort of sounds like you're treating him as the sole "person who writes stuff down", not just the "main" one. Noam Chomsky might have been the "main linguistics guy" in the late 20th century, but people didn't expect him to write more than a trivial fraction of the field's output, either in terms of high-level overviews or in-the-trenches research.
I think EY was pretty clear in the OP that this is not how things go on earths that survive. Even if there aren't many who can write high-level alignment overviews today, more people should make the attempt and try to build skill.
For what it's worth, I haven't used the site in years and I picked it up just from this thread and the UI tooltips. The most confusing thing was realizing "okay, there really are two different types of vote" since I'd never encountered that before, but I can't think of much that would help (maybe mention it in the tooltip, or highlight them until the user has interacted with both?)
Looking forward to it as a site-wide feature - just from seeing it at work here, it seems like a really useful addition to the site
It should not take more than 5 minutes to go into the room, sit at the one available seat, locate the object placed on a bright red background, and use said inhaler. You open the window and run a fan, so that there is air circulation. If multiple people arrive at once, use cellphones to coordinate who goes in first - the other person sits in their car.
It really isn't challenging to make this safe, given the audience is "the sort of people who read LessWrong."
Unrelated, but thank you for finally solidifying why I don't like NVC. When I've complained about it before, people seemed to assume I was having something like your reaction, which just annoyed me further :)
It turns out I find it deeply infantilizing, because it suggests that value judgments and "fuck you" would somehow detract from my ability to hold a reasonable conversation. I grew up in a culture where "fuck you" is actually a fairly important and common part of communication, and removing it results in the sort of langua...
There was a particular subset of LessWrong and Tumblr that objected rather ... stridently ... to even considering something like Dragon Army
Well, I feel called out :)
So, first off: Success should count for a lot and I have updated on how reliable and trustworthy you are. Part of this is that you now have a reputation to me, whereas before you were just Anonymous Internet Dude.
I'm not going to be as loud about "being wrong" because success does not mean I was wrong about there *being* a risk, merely that you successfully navigated it. I do ...
it comes from people who never lived in DA-like situation in their lives so all the evidence they're basing their criticism on is fictional.
I've been going off statistics which, AFAIK, aren't fictional. Am I wrong in my assumption that the military, which seems like a decent comparison point, has above-average rates of sexual harassment, sexual assault, bloated budgets, and bureaucratic waste? All the statistics and research I've read suggest that at least the US Military has a lot of problems and should not be used as a role model.
Concerns about you specifically as a leader
1) This seems like an endeavor that has a number of very obvious failure modes. Like, the intentional community community apparently bans this sort of thing, because it tends to end badly. I am at a complete loss to name anything that really comes close, and hasn't failed badly. Do you acknowledge that you are clearly treading in dangerous waters?
2) While you've said "we've noticed the skulls", there's been at least 3 failure modes raised in the comment which you had to append to address (outsider safety...
Concerns about your philosophy
1) You focus heavily on 99.99% reliability. That's 1-in-10,000. If we only count weekdays, that's 1 absence every 40 years, or about one per working lifetime. If we count weekends, that's 1 absence every 27 years, or 3 per lifetime. Do you really feel like this is a reasonable standard, or are you being hyperbolic and over-correcting? If the latter, what would you consider an actual reasonable number?
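The arithmetic behind those figures can be spot-checked directly; at a 1-in-10,000 failure rate, the expected gap between absences is just the reciprocal of (failure rate x events per year):

```python
# Spot-check of the 99.99% reliability arithmetic above.
failure_rate = 1 / 10_000            # one failure per 10,000 events

weekdays_per_year = 5 * 52           # ~260 weekdays
years_per_absence_weekdays = 1 / (failure_rate * weekdays_per_year)

days_per_year = 365                  # counting weekends too
years_per_absence_all_days = 1 / (failure_rate * days_per_year)

print(round(years_per_absence_weekdays, 1))   # ~38.5 years between absences
print(round(years_per_absence_all_days, 1))   # ~27.4 years between absences
```

Which matches the "every 40 years" and "every 27 years" figures, give or take rounding.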
2) Why does one person being 95% reliable cause CFAR workshops to fail catastrophically? Don't you have backups / contingencies? ...
Genuine Safety Concerns
I'm going to use "you have failed" here as a stand-in for all of "you're power hungry / abusive", "you're incompetent / overconfident", and simply "this person feels deeply misled." If you object to that term, feel free to suggest a different one, and then read the post as though I had used that term instead.
1) What is your exit strategy if a single individual feels you have failed? (note that asking such a person to find a replacement roommate is clearly not viable - no decent, moral person sh...
And it doesn't quite solve things to say, "well, this is an optional, consent-based process, and if you don't like it, don't join," because good and moral people have to stop and wonder whether their friends and colleagues with slightly weaker epistemics and slightly less-honed allergies to evil are getting hoodwinked. In short, if someone's building a coercive trap, it's everyone's problem.
I don't want to win money. I want you to take safety seriously OR stop using LessWrong as your personal cult recruiting ground. Based on that quote, I thought you wanted this too.
Fine. Reply to my OP with links to where you addressed other people with those concerns. Stop wasting time blustering and insulting me - either you're willing to commit publicly to safety protocols, or you're a danger to the community.
If nothing else, the precedent of letting anyone recruit for their cult as long as they write a couple thousand words and paint it up in geek aesthetics is one I think actively harms the community.
But, you know what? I'm not the only one shouting "THIS IS DANGEROUS. PLEASE FOR THE LOVE OF GOD RECONSIDER WHAT YOU'RE DOING...
The whole point of him posting this was to acknowledge that he is doing something dangerous, and that we have a responsibility to speak up. To quote him exactly: "good and moral people have to stop and wonder whether their friends and colleagues with slightly weaker epistemics and slightly less-honed allergies to evil are getting hoodwinked".
His refusal to address basic safety concerns simply because he was put off by my tone is very strong evidence to me that people are indeed being hoodwinked. I don't care if the danger to them is because he's ...
See, now you're the one leaping to conclusions. I didn't say that all of your talking points are actual talking points from actual cults. I am confused why even some of them are.
If you can point me to someone who felt "I wrote thousands of words" is, in and of itself, a solid argument for you being trustworthy, please link me to it. I need to do them an epistemic favor.
I was using "charismatic" in the sense of having enough of it to hold the group together. If he doesn't have enough charisma to do that, then he's kinda worthless as a co...
I notice I am very confused as to why you keep reiterating actual talking points from actual known-dangerous cults in service of "providing evidence that you're not a cult."
For instance, most cults have a charismatic ("well known") second-in-command who could take over should there be some scandal involving the initial leader. Most cults have written thousands of words about how they're different from other cults. Most cults get very indignant when you accuse them of being cults.
On the object level: Why do you think people will be reass...
Can you elaborate on the notion that you can be overruled? Your original post largely described a top-down Authoritarian model, with you being Supreme Ruler.
How would you handle it if someone identifies the environment as abusive, and therefore refuses to suggest anyone else join such an environment?
You discuss taking a financial hit, but I've previously objected that you have no visible stake in this. Do you have a dedicated savings account that can reasonably cover that hit? What if the environment is found abusive, and multiple people leave?
Anyone enteri...
And just to be clear: I don't give a shit about social dominance. I'm not trying to bully you. I'm just blunt and skeptical. I wouldn't be offended in the least if you mirrored my tone. What does offend me is the fact that you've spent all this time blustering about my tone, instead of addressing the actual content.
(I emphasize "me" because I do acknowledge that you have offered a substantial reply to other posters)
Also, this is very important: You're asking people to sign a legal contract about finances without any way to terminate the experiment if it turns out you are in fact a cult leader. This is a huge red flag, and you've refused to address it.
I would be vastly reassured if you could stop dodging that one single point. I think it is a very valid point, no matter how unfair the rest of my approach may or may not be.
In the absence of a sound rebuttal to the concerns that I brought up, you're correct: I'm quite confident that you are acting in a way that is dangerous to the community.
I had, however, expected you to have the fortitude to actually respond to my criticisms.
In the absence of a rebuttal, I would hope you have the ability to update on this being more dangerous than you originally assumed.
Bluntly: After reading your responses, I don't think you have the emotional maturity necessary for this level of authority. You apparently can't handle a few paragraphs of...
Because basically every cult has a 30 second boilerplate that looks exactly like that?
When I say "discuss safety", I'm looking for a standard of discussion that is above that provided by actual, known-dangerous cults. Cults routinely use exactly the "check-ins" you're describing, as a way to emotionally manipulate members. And the "group" check-ins turn into peer pressure. So the only actual safety valve ANYWHERE in there is (D).
You're proposing starting something that looks like the cult. I'm asking you for evidence that ...
Similarly, I think the people-being-unreliable thing is a bullshit side effect
You may wish to consider that this community has a very high frequency of disabilities which render one non-consensually unreliable.
You may wish to consider that your stance is especially insulting towards those members of our community.
You may wish to reconsider making uncharitable comments about those members of our community. In case it is unclear: "this one smacks the most of a sort of self-serving, short-sighted immaturity" is not a charitable statement.
I have absolutely no confidence that I'm correct in my assertions. In fact, I was rather expecting your response to address these things. Your original post read as a sketch, with a lot of details withheld to keep things brief.
The whole point of discussion is for us to identify weak points, and then you go into more detail to reassure us that this has been well addressed (and opening those solutions up to critique where we might identify further weak points). If you can't provide more detail right now, you could say "that's in progress, but it's definitely something we will address in the Second Draft" and then actually do that.
First, you seem to think that "Getting Useful Things Done" and "Be 99.99% Reliable" heavily correlate. The military is infamous for bloated budgets, coordination issues, and high rates of sexual abuse and suicide. High-pressure startups largely fail, and are well known for burning people out. There is a very obvious failure state to this sort of rigid, high pressure environment and... you seem unaware of it.
Second, you seem really unaware of alternate organizational systems that actually DO get things done. The open source community is ...
The average college graduate is 26, and I was estimating 25, so I'd assume that by this community's standards, you're probably on the younger side. No offense was intended :)
I would point out that by the nature of it being LIFE insurance, it will generally not be used for stuff YOU need, nor timed to "when the need arises". That's investments, not insurance :)
(And if you have 100K of insurance for $50/month that lets you early-withdrawal AND isn't term insurance... then I'd be really curious how, because that sounds like a scam or someone misrepresenting what your policy really offers :))
"Has anyone come up with a motivation enhancer?"
Vyvanse (prescription-only ADD medication) is... almost unbelievably awesome for me there. I suspect it only works if your issue is somewhere in the range of ADD, though, as it doesn't do anything for my motivation if I'm depressed.
I've found that in general, "sustained release" options work a LOT better for motivation. Caffeine helps a tiny bit, but 8-hour sustained-release caffeine can help a lot. My motivation seems to really hate dealing with peaks and valleys throughout the day. Oddly...
http://www.alcor.org/cases.html A loooot of them include things going wrong, pretty clear signs that this is a novice operation with minimal experience, and so forth. Also notice that they don't even HAVE case reports for half the patients admitted prior to ~2008.
It's worth noting that pretty much all of these have a delay of at LEAST a day. There's one example where they "cryopreserved" someone who had been buried for over a year, against the wishes of the family, because "that is what the member requested." (It even includes notes tha...
It's easy to get lost in incidental costs and not realize how they add up over time. If you weren't signed up for cryonics, and you inherited $30K, would you be inclined to dump it into a cryonics fund, or use it someplace else? If the answer is the latter, you probably don't REALLY value cryonics as much as you think - you've bought into it because the price is spread out and our brains are bad at budgeting small, recurring expenses like that.
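To make the "spread-out price" point concrete, here's the arithmetic, using the purely illustrative $50/month premium figure mentioned elsewhere in this thread (ignoring interest and inflation):

```python
# Illustrative only: a modest monthly premium quietly sums to a large
# lump over a working lifetime.
monthly_premium = 50      # hypothetical $/month
years = 50                # hypothetical span of paying premiums
total = monthly_premium * 12 * years
print(total)              # 30000, i.e. the same $30K as the lump sum
```

Same $30K either way; it just feels cheaper when it's amortized into a bill our budgeting instincts round to zero.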
My argument is pretty much entirely on the "expense" side of things, but I would also point out that...
Read "rate of learning" as "time it takes to learn 1 bit of information"
So a UFAI can learn 1 bit in time T, but an FAI takes T+X
Or, at least, that's how I read it, because the second paragraph makes it pretty clear that the author is discussing UFAI outpacing FAI. You could also just read it as a typo in the equation, but "accidentally miswrote the entire second paragraph" seems significantly less likely. Especially since "Won't FAI learn faster and outpace UFAI" seems like a pretty low probability question to begin with...
Erm... hi, welcome to the debug stack for how I reached that conclusion. Hope it helps ^.^
saying a theorem is wrong because the hypotheses are not true is bad logic.
If the objection is true, and the hypothesis is false, that seems like a great objection! If, on the other hand, he provided no evidence towards his objection, then it seems that the bad logic is in not offering evidence, not attacking the hypothesis directly.
Am I missing something, or just reading this in an overly pedantic way?
Internally I am generally the same, but I've come to realize that a rather sizable portion of the population has trouble distinguishing "all X are Y" and "some X are Y", both in speaking and in listening. So if someone says "man, women can be so stupid", I know that might well reflect the internal thought of "all women are idiots". And equally, someone saying "all women are idiots" might just be upset because his girlfriend broke up with him for some trivial reason.
My conclusion still holds if you simply need mathematicians in the top 10%, for example, only the analysis is slightly more complicated.
So you agree that, in the original example, you're more likely than not just being a racist? Because you certainly seem to be moving the goal post over to "top 10%" ...
faul_sname's definition
That link does not appear to point to a definition.
"Harm that is both genuine and unfair", then? Income taxes are 'fair' (and I would find it baffling to call that 'harm' unless they somehow came as a surprise), getting fired is offensive if it's done solely because your manager doesn't like you, but fair (and therefore not offensive) if it's because you failed to do the job. I think getting mugged is a good thing to get outraged about - we want to make that happen less!
I don't think making this list in 1980 would have been meaningful. How do you offer any sort of coherent, detailed plan for dealing with something when all you have is toy examples like Eliza?
We didn't even have the concept of machine learning back then - everything computers did in 1980 was relatively easily understood by humans, in a very basic step-by-step way. Making a 1980s computer "safe" is a trivial task, because we hadn't yet developed any...
I think most worlds that successfully navigate AGI risk have properties like:
- AI results aren't published publicly, going back to more or less the field's origin.
- The research community deliberately steers toward relatively alignable approaches to AI, which includes steering away from approaches that look like 'giant opaque deep nets'.
- This means that you need to figure out what makes an approach 'alignable' earlier, which suggests much more research on getting de-confused regarding alignable cognition.
- Many such de-confusions will require a lot of software ex