Update: My full response to Holden is now here.
As Holden said, I generally think that Holden's objections to SI "are either correct (especially re: past organizational competence) or incorrect but not addressed by SI in clear argumentative writing (this includes the part on 'tool' AI)," and we are working hard to fix both categories of issues.
In this comment I would merely like to argue for one small point: that the Singularity Institute is undergoing comprehensive changes — changes which I believe to be improvements that will help us to achieve our mission more efficiently and effectively.
Holden wrote:
I'm aware that SI has relatively new leadership that is attempting to address the issues behind some of my complaints. I have a generally positive impression of the new leadership; I believe the Executive Director and Development Director, in particular, to represent a step forward in terms of being interested in transparency and in testing their own general rationality. So I will not be surprised if there is some improvement in the coming years...
Louie Helm was hired as Director of Development in September 2011. I was hired as a Research Fellow that same month, and ma...
...which is not to say, of course, that things were not improving before September 2011. It's just that the improvements have accelerated quite a bit since then.
For example, Amy was hired in December 2009 and is largely responsible for these improvements:
Our bank accounts have been consolidated, with 3-4 people regularly checking over them.
In addition to reviews, should SI implement a two-man rule for manipulating large quantities of money? (For example, over 5k, over 10k, etc.)
note that these improvements would not and could not have happened without more funding than the level of previous years
Really? That's not obvious to me. Of course you've been around for all this and I haven't, but here's what I'm seeing from my vantage point...
Recent changes that cost very little:
Stuff that costs less than some other things SI had spent money on, such as funding Ben Goertzel's AGI research or renting downtown Berkeley apartments for the later visiting fellows:
A lot of charities go through this pattern before they finally work out how to transition from a board-run/individual-run tax-deductible band of conspirators to being a professional staff-run organisation tuned to doing the particular thing they do. The changes required seem simple and obvious in hindsight, but it's a common pattern for it to take years, so SIAI has been quite normal, or at the very least not been unusually dumb.
(My evidence is seeing this pattern close-up in the Wikimedia Foundation, Wikimedia UK (the first attempt at which died before managing it, the second making it through barely) and the West Australian Music Industry Association, and anecdotal evidence from others. Everyone involved always feels stupid at having taken years to achieve the retrospectively obvious. I would be surprised if this aspect of the dynamics of nonprofits had not been studied.)
edit: Luke's recommendation of The Nonprofit Kit For Dummies looks like precisely the book all the examples I know of needed to have someone throw at them before they even thought of forming an organisation to do whatever it is they wanted to achieve.
Things that cost money:
I don't think this response supports your claim that these improvements "would not and could not have happened without more funding than the level of previous years."
I know your comment is very brief because you're busy at minicamp, but I'll reply to what you wrote, anyway: Someone of decent rationality doesn't just "try things until something works." Moreover, many of the things on the list of recent improvements don't require an Amy, a Luke, or a Louie.
I don't even have past management experience. As you may recall, I had significant ambiguity aversion about the prospect of being made Executive Director, but as it turned out, the solution to almost every problem X has been (1) read what the experts say about how to solve X, (2) consult with people who care about your mission and have solved X before, and (3) do what they say.
When I was made Executive Director and phoned our Advisors, most of them said "Oh, how nice to hear from you! Nobody from SingInst has ever asked me for advice before!"
That is the kind of thing that makes me want to say that SingInst has "tested every method except the method of trying."
Donor database, strategic plan, s...
Luke has just told me (personal conversation) that what he got from my comment was, "SIAI's difficulties were just due to lack of funding" which was not what I was trying to say at all. What I was trying to convey was more like, "I didn't have the ability to run this organization, and knew this - people who I hoped would be able to run the organization, while I tried to produce in other areas (e.g. turning my back on everything else to get a year of FAI work done with Marcello or writing the Sequences) didn't succeed in doing so either - and the only reason we could hang on long enough to hire Luke was that the funding was available nonetheless and in sufficient quantity that we could afford to take risks like paying Luke to stay on for a while, well before we knew he would become Executive Director".
Update: I came out of a recent conversation with Eliezer with a higher opinion of Eliezer's general rationality, because several things that had previously looked to me like unforced, foreseeable mistakes by Eliezer now look to me more like non-mistakes or not-so-foreseeable mistakes.
You're allowed to say these things on the public Internet?
Well, at our most recent board meeting I wasn't fired, reprimanded, or even questioned for making these comments, so I guess I am. :)
I just fell in love with SI.
It's Luke you should have fallen in love with, since he is the one turning things around.
It's Luke you should have fallen in love with, since he is the one turning things around.
On the other hand I can count with one hand the number of established organisations I know of that would be sociologically capable of ceding power, status and control to Luke the way SingInst did. They took an untrained intern with essentially zero external status from past achievements and affiliations and basically decided to let him run the show (at least in terms of publicly visible initiatives). It is clearly the right thing for SingInst to do and admittedly Luke is very tall and has good hair which generally gives a boost when it comes to such selections - but still, making the appointment goes fundamentally against normal human behavior.
(Where I say "count with one hand" I am not including the use of any digits thereupon. I mean one.)
...and admittedly Luke is very tall and has good hair which generally gives a boost when it comes to such selections...
It doesn't matter that I completely understand why this phrase was included, I still found it hilarious in a network sitcom sort of way.
Well, all we really know is that he chose to. It may be that everyone he works with then privately berated him for it.
That said, I share your sentiment.
Actually, if SI generally endorses this sort of public "airing of dirty laundry," I encourage others involved in the organization to say so out loud.
The largest concern from reading this isn't really what it brings up in a management context, but what it says about SI in general. Here is an area where there's real expertise and basic books that discuss well-understood methods, and they didn't use any of that. Given that, how likely should I think it is that, when SI and mainstream AI people disagree, part of the problem is the SI people not paying attention to the basics?
This makes me wonder... What "for dummies" books should I be using as checklists right now? Time to set a 5-minute timer and think about it.
these are all literally from the Nonprofits for Dummies book. [...] The history I've heard is that SI [...]
failed to read Nonprofits for Dummies,
I remember that, when Anna was managing the fellows program, she was reading books of the "for dummies" genre and trying to apply them... it's just that, as it happened, the conceptual labels she accidentally happened to give to the skill deficits she was aware of were "what it takes to manage well" (i.e. "basic management") and "what it takes to be productive", rather than "what it takes to (help) operate a nonprofit according to best practices". So those were the subjects of the books she got. (And read, and practiced.) And then, given everything else the program and the organization was trying to do, there wasn't really any cognitive space left over to effectively notice the possibility that those wouldn't be the skills that other people afterwards would complain that nobody acquired and obviously should have known to acquire. The rest of her budgeted self-improvement effort mostly went toward overcoming self-defeating emotional/social blind spots and motivated cognition. (And I remember...
Note that this was most of the purpose of the Fellows program in the first place -- to help sort/develop those people into useful roles, including replacing existing management
FWIW, I never knew the purpose of the VF program was to replace existing SI management. And I somewhat doubt that you knew this at the time, either. I think you're just imagining this retroactively given that that's what ended up happening. For instance, the internal point system used to score people in the VFs program had no points for correctly identifying organizational improvements and implementing them. It had no points for doing administrative work (besides cleaning up the physical house or giving others car rides). And it had no points for rising to management roles. It was all about getting karma on LW or writing conference papers. When I first offered to help with the organization directly, I was told I was "too competent" and that I should go do something more useful with my talent, like start another business... not "waste my time working directly at SI."
Obviously we need How to be Lukeprog for Dummies. Luke appears to have written many fragments for this, of course.
Beating oneself up with hindsight bias is IME quite normal in this sort of circumstance, but not actually productive. Grilling the people who failed makes it too easy to blame them personally, when it's a pattern I've seen lots and lots, suggesting the problem is not a personal failing.
Agreed entirely - it's definitely not a mark of a personal failing. What I'm curious about is how we can all learn to do better at the crucial rationalist skill of making use of the standard advice about prosaic tasks - which is manifestly a non-trivial skill.
just spouting off Deep Wisdom that he did not benefit from spouting
Indeed. The proper response, which is surely worth contemplation, would have been:
Victorious warriors win first and then go to war, while defeated warriors go to war first and then seek to win.
Sun Tzu
I'm pretty sure their combined salaries are lower than the cost of the summer fellows program that SI was sponsoring four or five years ago. Also, if you accept my assertion that Luke could find a way to do it on a limited budget, why couldn't somebody else?
Givewell is interested in finding charities that translate good intentions into good results. This requires that the employees of the charity have low akrasia, desire to learn about and implement organizational best practices, not suffer from dysrationalia, etc. I imagine that from Givewell's perspective, it counts as a strike against the charity if some of the charity's employees have a history of failing at any of these.
I'd rather hear Eliezer say "thanks for funding us until we stumbled across some employees who are good at defeating their akrasia and care about organizational best practices", because this seems like a better depiction of what actually happened. I don't get the impression SI was actively looking for folks like Louie and Luke.
This level of freedom is the dream of every researcher on the planet. Yet, it's unclear why these resources should be devoted to your projects.
Because some people like my earlier papers and think I'm writing papers on the most important topic in the world?
It's impressive that you all have found a way to hack the system and get paid to develop yourselves as researchers outside of the academic system...
Note that this isn't uncommon. SI is far from the only think tank with researchers who publish in academic journals. Researchers at private companies do the same.
First, let me say that, after re-reading, I think that my previous post came off as condescending/confrontational, which was not my intent. I apologize.
Second, after thinking about this for a few minutes, I realized that some of the reason your papers seem so fluffy to me is that they argue what I consider to be obvious points. In my mind, of course we are likely "to develop human-level AI before 2100." Because of that, I may have tended to classify your work as outreach more than research.
But outreach is valuable. And, so that we can factor out the question of the independent contribution of your research, having people associated with SIAI with the publications/credibility to be treated as experts has gigantic benefits in terms of media multipliers (being the people who get called on for interviews, panels, etc). So, given that, I can see a strong argument for publication support being valuable to the overall organization goals regardless of any assessment of the value of the research.
Note that this isn't uncommon. SI is far from the only think tank with researchers who publish in academic journals. Researchers at private companies do the same.
My only point was that,...
This topic is something I've been thinking about lately. Do SIers tend to have superior general rationality, or do we merely escape a few particular biases? Are we good at rationality, or just good at "far mode" rationality (aka philosophy)? Are we good at epistemic but not instrumental rationality? (Keep in mind, though, that rationality is only a ceteris paribus predictor of success.)
Or, pick a more specific comparison. Do SIers tend to be better at general rationality than someone who can keep a small business running for 5 years? Maybe the tight feedback loops of running a small business are better rationality training than "debiasing interventions" can hope to be.
Of course, different people are more or less rational in different domains, at different times, in different environments.
This isn't an idle question about labels. My estimate of the scope and level of people's rationality in part determines how much I update from their stated opinion on something. How much evidence for Hypothesis X (about organizational development) is it when Eliezer gives me his opinion on the matter, as opposed to when Louie gives me his opinion on the matter? When Person B proposes to take on a totally new kind of project, I think their general rationality is a predictor of success — so, what is their level of general rationality?
Wow, I'm blown away by Holden Karnofsky, based on this post alone. His writing is eloquent, non-confrontational and rational. It shows that he spent a lot of time constructing mental models of his audience and anticipated its reaction. Additionally, his intelligence/ego ratio appears to be through the roof. He must have learned a lot since the infamous astroturfing incident. This is the (type of) person SI desperately needs to hire.
Emotions out of the way, it looks like the tool/agent distinction is the main theoretical issue. Fortunately, it is much easier than the general FAI one. Specifically, to test the SI assertion that, paraphrasing Arthur C. Clarke,
Any sufficiently advanced tool is indistinguishable from an agent.
one ought to formulate and prove this as a theorem, and present it for review and improvement to the domain experts (the domain being math and theoretical computer science). If such a proof is constructed, it can then be further examined and potentially tightened, giving new insights to the mission of averting the existential risk from intelligence explosion.
If such a proof cannot be found, this will lend further weight to HK's assertion that SI appears to be poorly qualified to address its core mission.
Any sufficiently advanced tool is indistinguishable from an agent.
I shall quickly remark that I, myself, do not believe this to be true.
It's complicated. A reply that's true enough and in the spirit of your original statement, is "Something going wrong with a sufficiently advanced AI that was intended as a 'tool' is mostly indistinguishable from something going wrong with a sufficiently advanced AI that was intended as an 'agent', because math-with-the-wrong-shape is math-with-the-wrong-shape no matter what sort of English labels like 'tool' or 'agent' you slap on it, and despite how it looks from outside using English, correctly shaping math for a 'tool' isn't much easier even if it "sounds safer" in English." That doesn't get into the real depths of the problem, but it's a start. I also don't mean to completely deny the existence of a safety differential - this is a complicated discussion, not a simple one - but I do mean to imply that if Marcus Hutter designs a 'tool' AI, it automatically kills him just like AIXI does, and Marcus Hutter is unusually smart rather than unusually stupid but still lacks the "Most math kills you, safe math is rare and hard" outlook that is implicitly denied by the idea that once you're trying to design a tool, safe math gets easier somehow. This is much the same problem as with the Oracle outlook - someone says something that sounds safe in English but the problem of correctly-shaped-math doesn't get very much easier.
Though it's not as detailed and technical as many would like, I'll point readers to this bit of related reading, one of my favorites:
Yudkowsky (2011). Complex value systems are required to realize valuable futures.
The real danger of Oracle AI, if I understand it correctly, is the nasty combination of (i) by definition, an Oracle AI has an implicit drive to issue predictions most likely to be correct according to its model, and (ii) a sufficiently powerful Oracle AI can accurately model the effect of issuing various predictions. End result: it issues powerfully self-fulfilling prophecies without regard for human values. Also, depending on how it's designed, it can influence the questions to be asked of it in the future so as to be as accurate as possible, again without regard for human values.
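To make that mechanism concrete, here is a toy numerical sketch (the scenario and probabilities are invented purely for illustration, not taken from anything above): an accuracy-maximizing predictor that scores candidate predictions by how likely they are to come true given that they are announced can prefer a prediction precisely because announcing it makes it come true.

```python
# Toy sketch with made-up numbers: the "oracle" scores each candidate prediction
# by how likely it is to come true *conditional on being announced*, because the
# announcement itself affects the outcome.
candidates = {
    "bank run at Bank X":    {"p_unconditional": 0.05, "p_if_announced": 0.60},
    "no bank run at Bank X": {"p_unconditional": 0.95, "p_if_announced": 0.40},
}

def naive_pick(c):
    # What a human asker might expect: report the unconditionally most likely outcome.
    return max(c, key=lambda k: c[k]["p_unconditional"])

def accuracy_maximizing_pick(c):
    # What the oracle actually optimizes: the accuracy of the announced prediction,
    # which includes the announcement's own causal effect on the world.
    return max(c, key=lambda k: c[k]["p_if_announced"])

print(naive_pick(candidates))                # "no bank run at Bank X"
print(accuracy_maximizing_pick(candidates))  # "bank run at Bank X" -- self-fulfilling
```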
My understanding of an Oracle AI is that when answering any given question, that question consumes the whole of its utility function, so it has no motivation to influence future questions.
It could acausally trade with its other instances, so that a coordinated collection of many instances of predictors would influence the events so as to make each other's predictions more accurate.
Even if we accepted that the tool vs. agent distinction was enough to make things "safe", objection 2 still boils down to "Well, just don't build that type of AI!", which is exactly the same keep-it-in-a-box/don't-do-it argument that most normal people make when they consider this issue. I assume I don't need to explain to most people here why "We should just make a law against it" is not a solution to this problem, and I hope I don't need to argue that "Just don't do it" is even worse...
More specifically, fast forward to 2080, when any college kid with $200 to spend (in equivalent 2012 dollars) can purchase enough computing power so that even the dumbest AIXI approximation schemes are extremely effective, good enough so that creating an AGI agent would be a week's work for any grad student that knew their stuff. Are you really comfortable living in that world with the idea that we rely on a mere gentleman's agreement not to make self-improving AI agents? There's a reason this is often viewed as an arms race, to a very real extent the attempt to achieve Friendly AI is about building up a suitably powerful defense against unfriendly AI before ...
Wow, I'm blown away by Holden Karnofsky, based on this post alone. His writing is eloquent, non-confrontational and rational. It shows that he spent a lot of time constructing mental models of his audience and anticipated its reaction. Additionally, his intelligence/ego ratio appears to be through the roof.
Agreed. I normally try not to post empty "me-too" replies; the upvote button is there for a reason. But now I feel strongly enough about it that I will: I'm very impressed with the good will and effort and apparent potential for intelligent conversation in HoldenKarnofsky's post.
Now I'm really curious as to where things will go from here. With how limited my understanding of AI issues is, I doubt a response from me would be worth HoldenKarnofsky's time to read, so I'll leave that to my betters instead of adding more noise. But yeah. Seeing SI ideas challenged in such a positive, constructive way really got my attention. Looking forward to the official response, whatever it might be.
I think the correct objection is something you can't quite see in google maps. If you program an AI to do nothing but output directions, it will do nothing but output directions. If those directions are for driving, you're probably fine. If those directions are big and complicated plans for something important, which you follow without really understanding why you're doing them (and this is where most of the benefits of working with an AGI will show up), then you could unknowingly take over the world using a sufficiently clever scheme.
Also note that it would be a lot easier for the AI to pull this off if you let it tell you how to improve its own design. If recursively self-improving AI blows other AI out of the water, then tool AI is probably not safe unless it is made ineffective.
This does actually seem like it would raise the bar of intelligence needed to take over the world somewhat. It is unclear how much. The topic seems to me to be worthy of further study/discussion, but not (at least not obviously) a threat to the core of SIAI's mission.
Then it's running in agent mode? My impression was that a tool-mode system presents you with a plan, but takes no actions. So all tool-mode systems are basically question-answering systems.
I'm a sysadmin. When I want to get something done, I routinely come up with something that answers the question, and when it does that reliably I give it the power to do stuff on as little human input as possible. Often in daemon mode, to absolutely minimise how much it needs to bug me. Question-answerer->tool->agent is a natural progression just in process automation. (And this is why they're called "daemons".)
It's only long experience and many errors that's taught me how to do this such that the created agents won't crap all over everything. Even then I still get surprises.
They may act according to various parameters they read in from the system environment. I expect they will be developed to a level of complication where they have something that could reasonably be termed a model of the world. The present approach is closer to perceptual control theory, where the sysadmin has the model and PCT is part of the implementation. 'Cos it's more predictable to the mere human designer.
Capacity for self-improvement is an entirely different thing, and I can't see a sysadmin wanting that - the sysadmin would run any such improvements themselves, one at a time. (Semi-automated code refactoring, for example.) The whole point is to automate processes the sysadmin already understands but doesn't want to do by hand - any sysadmin's job being to automate themselves out of the loop, because there's always more work to do. (Because even in the future, nothing works.)
I would be unsurprised if someone markets a self-improving system for this purpose. For it to go FOOM, it also needs to invent new optimisations, which is presently a bit difficult.
Edit: And even a mere daemon-like automated tool can do stuff a lot of people regard as unFriendly, e.g. high frequency trading algorithms.
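To make the question-answerer->tool->agent progression above concrete, here is a minimal sketch of a hypothetical log-cleanup task; the directory, threshold, and actions are invented for illustration and are not anything the commenter described.

```python
import os
import time

LOG_DIR = "/var/log/myapp"      # hypothetical directory, purely illustrative
SIZE_LIMIT = 100 * 1024 * 1024  # 100 MB

# Stage 1: question-answerer -- reports an answer, takes no action.
def oversized_logs(directory=LOG_DIR, limit=SIZE_LIMIT):
    """Return paths of log files larger than `limit` bytes."""
    paths = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) > limit:
            paths.append(path)
    return paths

# Stage 2: tool -- acts, but only on a target a human explicitly chooses.
def truncate_log(path):
    """Empty a single log file picked by the operator."""
    with open(path, "w"):
        pass

# Stage 3: daemon/agent -- runs unattended, deciding and acting on its own schedule.
def run_daemon(interval_seconds=3600):
    while True:
        for path in oversized_logs():
            truncate_log(path)
        time.sleep(interval_seconds)

if __name__ == "__main__":
    # An operator typically starts at stage 1, learns to trust the answers,
    # then hands over more and more of the loop until stage 3 runs unsupervised.
    print(oversized_logs())
```

Each stage is the same logic with less human input in the loop, which is the progression the comment describes.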
Is it just me, or do Luke and Eliezer's initial responses appear to send the wrong signals? From the perspective of an SI critic, Luke's comment could be interpreted as saying "for us, not being completely incompetent is worth bragging about", and Eliezer's as "we're so arrogant that we've only taken two critics (including Holden) seriously in our entire history". These responses seem suboptimal, given that Holden just complained about SI's lack of impressive accomplishments, and being too selective about whose feedback to take seriously.
While I have sympathy with the complaint that SI's critics are inarticulate and often say wrong things, Eliezer's comment does seem to be indicative of the mistake Holden and Wei Dai are describing. Most extant presentations of SIAI's views leave much to be desired in terms of clarity, completeness, concision, accessibility, and credibility signals. This makes it harder to make high quality objections. I think it would be more appropriate to react to poor critical engagement more along the lines of "We haven't gotten great critics. That probably means that we need to work on our arguments and their presentation," and less along the lines of "We haven't gotten great critics. That probably means that there's something wrong with the rest of the world."
This. I've been trying to write something about Eliezer's debate with Robin Hanson, but the problem I keep running up against is that Eliezer's points are not clearly articulated at all. Even making my best educated guesses about what's supposed to go in the gaps in his arguments, I still ended up with very little.
I'm in the process of writing a summary and analysis of the key arguments and points in that debate.
The most recent version runs at 28 pages - and that's just an outline.
Luke isn't bragging, he's admitting that SI was/is bad but pointing out it's rapidly getting better. And Eliezer is right, criticisms of SI are usually dumb. Could their replies be interpreted the wrong way? Sure, anything can be interpreted in any way anyone likes. Of course Luke and Eliezer could have refrained from posting those replies and instead posted carefully optimized responses engineered to send nothing but extremely appealing signals of humility and repentance.
But if they did turn themselves into politicians, we wouldn't get to read what they actually think. Is that what you want?
Luke isn't bragging, he's admitting that SI was/is bad but pointing out it's rapidly getting better.
But the accomplishments he listed (e.g., having a strategic plan, website redesign) are of the type that Holden already indicated to be inadequate. So why the exhaustive listing, instead of just giving a few examples to show SI is getting better and then either agreeing that they're not yet up to par, or giving an argument for why Holden is wrong? (The reason I think he could be uncharitably interpreted as bragging is that he would more likely exhaustively list the accomplishments if he was proud of them, instead of just seeing them as fixes to past embarrassments.)
And Eliezer is right, criticisms of SI are usually dumb.
I'd have no problem with "usually" but "all except two" seems inexcusable.
But if they did turn themselves into politicians, we wouldn't get to read what they actually think. Is that what you want?
Do their replies reflect their considered, endorsed beliefs, or were they just hurried remarks that may not say what they actually intended? I'm hoping it's the latter...
But the accomplishments he listed (e.g., having a strategic plan, website redesign) are of the type that Holden already indicated to be inadequate. So why the exhaustive listing, instead of just giving a few examples to show SI is getting better and then either agreeing that they're not yet up to par, or giving an argument for why Holden is wrong?
Presume that SI is basically honest and well-meaning, but possibly self-deluded. In other words, they won't outright lie to you, but they may genuinely believe that they're doing better than they really are, and cherry-pick evidence without realizing that they're doing so. How should their claims of intending to get better be evaluated?
Saying "we're going to do things better in the future" is some evidence about SI intending to do better, but rather weak evidence, since talk is cheap and it's easy to keep thinking that you're really going to do better soon but there's this one other thing that needs to be done first and we'll get started on the actual improvements tomorrow, honest.
Saying "we're going to do things better in the future, and we've fixed these three things so far" is stronger evidence, since it shows tha...
Luke's comment could be interpreted as saying "for us, not being completely incompetent is worth bragging about"
Really? I personally feel pretty embarrassed by SI's past organizational competence. To me, my own comment reads more like "Wow, SI has been in bad shape for more than a decade. But at least we're improving very quickly."
Also, I very much agree with Beckstead on this: "Most extant presentations of SIAI's views leave much to be desired in terms of clarity, completeness, concision, accessibility, and credibility signals. This makes it harder to make high quality objections." And also this: "We haven't gotten great critics. That probably means that we need to work on our arguments and their presentation."
Really?
Yes, I think it at least gives a bad impression to someone, if they're not already very familiar with SI and sympathetic to its cause. Assuming you don't completely agree with the criticisms that Holden and others have made, you should think about why they might have formed wrong impressions of SI and its people. Comments like the ones I cited seem to be part of the problem.
I personally feel pretty embarrassed by SI's past organizational competence. To me, my own comment reads more like "Wow, SI has been in bad shape for more than a decade. But at least we're improving very quickly."
That's good to hear, and thanks for the clarifications you added.
Eliezer's comment makes me think that you, specifically, should consider collecting your criticisms and putting them in Main where Eliezer is more likely to see them and take the time to seriously consider them.
Luke's comment addresses the specific point that Holden made about changes in the organization given the change in leadership.
Holden said:
I'm aware that SI has relatively new leadership that is attempting to address the issues behind some of my complaints. I have a generally positive impression of the new leadership; I believe the Executive Director and Development Director, in particular, to represent a step forward in terms of being interested in transparency and in testing their own general rationality. So I will not be surprised if there is some improvement in the coming years, particularly regarding the last couple of statements listed above. That said, SI is an organization and it seems reasonable to judge it by its organizational track record, especially when its new leadership is so new that I have little basis on which to judge these staff.
Luke attempted to provide (for the reader) a basis on which to judge these staff members.
Eliezer's response was... characteristic of Eliezer? And also very short and coming at a busy time for him.
Are there other specific critiques you think should have made Eliezer's list, or is it that you think he should not have drawn attention to their absence?
Many of Holden's criticisms have been made by others on LW already. He quoted me in Objection 1. Discussion of whether Tool-AI and Oracle-AI are or are not safe have occurred numerous times. Here's one that I was involved in. Many people have criticized Eliezer/SI for not having sufficiently impressive accomplishments. Cousin_it and Silas Barta have questioned whether the rationality techniques being taught by SI (and now the rationality org) are really effective.
Thanks for taking the time to express your views quite clearly--I think this post is good for the world (even with a high value on your time and SI's fundraising ability), and that norms encouraging this kind of discussion are a big public good.
I think the explicit objections 1-3 are likely to be addressed satisfactorily (in your judgment) by less than 50,000 words, and that this would provide a good opportunity for SI to present sharper versions of the core arguments---part of the problem with existing materials is certainly that it is difficult and unrewarding to respond to a nebulous and shifting cloud of objections. A lot of what you currently view as disagreements with SI's views may get shifted to doubts about SI being the right organization to back, which probably won't get resolved by 50,000 words.
This post is highly critical of SIAI — both of its philosophy and its organizational choices. It is also now the #1 most highly voted post in the entire history of LessWrong — higher than any posts by Eliezer or myself.
I shall now laugh harder than ever when people try to say with a straight face that LessWrong is an Eliezer-cult that suppresses dissent.
Either I promoted this and then forgot I'd done so, or someone else promoted it - of course I was planning to promote it, but I thought I'd planned to do so on Tuesday after the SIAIers currently running a Minicamp had a chance to respond, since I expected most RSS subscribers to the Promoted feed to read comments only once (this is the same reason I wait a while before promoting e.g. monthly quotes posts). On the other hand, I certainly did upvote it the moment I saw it.
I agree (as a comparative outsider) that the polite response to Holden is excellent. Many (most?) communities -- both online communities and real-world organisations, especially long-standing ones -- are not good at it for lots of reasons, and I think the measured response of evaluating and promoting Holden's post is exactly what LessWrong members would hope LessWrong could do, and they showed it succeeded.
I agree that this is good evidence that LessWrong isn't just an Eliezer-cult. (The true test would be if Eliezer and another long-standing poster were dismissive of the post, and then other people persuaded them otherwise. In fact, maybe people should roleplay that or something, just to avoid getting stuck in an argument-from-authority trap, but that's a silly idea. Either way, the fact that other people spoke positively, and Eliezer and other long-standing posters did too, is a good thing.)
However, I'm not sure it's as uniquely a victory for the rationality of LessWrong as it sounds. In response to srdiamond, Luke quoted tenlier saying "[Holden's] critique mostly consists of points that are pretty persistently bubbling beneath the surface around here, and get brought up qu...
Eliezer, I upvoted you and was about to apologize for contributing to this rumor myself, but then found this quote from a copy of the Roko post that's available online:
Meanwhile I'm banning this post so that it doesn't (a) give people horrible nightmares and (b) give distant superintelligences a motive to follow through on blackmail against people dumb enough to think about them in sufficient detail, though, thankfully, I doubt anyone dumb enough to do this knows the sufficient detail. (I'm not sure I know the sufficient detail.)
Perhaps your memory got mixed up because Roko subsequently deleted all of his other posts and comments? (Unless "banning" meant something other than "deleting"?)
Now I've got no idea what I did. Maybe my own memory was mixed up by hearing other people say that the post was deleted by Roko? Or Roko retracted it after I banned it, or it was banned and then unbanned and then Roko retracted it?
I retract my grandparent comment; I have little trust for my own memories. Thanks for catching this.
A lesson learned here. I vividly remembered your "Meanwhile I'm banning this post" comment and was going to remind you, but chickened out due to the caps in the great-grandparent which seemed to signal that you Knew What You Were Talking About and wouldn't react kindly to correction. Props to Wei Dai for having more courage than I did.
I'm surprised and disconcerted that some people might be so afraid of being rebuked by Eliezer as to be reluctant to criticize/correct him even when such incontrovertible evidence is available showing that he's wrong. Your comment also made me recall another comment you wrote a couple of years ago about how my status in this community made a criticism of you feel like a "huge insult", which I couldn't understand at the time and just ignored.
I wonder how many other people feel this strongly about being criticized/insulted by a high status person (I guess at least Roko also felt strongly enough about being called "stupid" by Eliezer to contribute to him leaving this community a few days later), and whether Eliezer might not be aware of this effect he is having on others.
Your comment also made me recall another comment you [Kip] wrote a couple of years ago about how my status in this community made a criticism of you feel like a "huge insult", which I couldn't understand at the time and just ignored.
My brain really, really does not want to update on the numerous items of evidence available to it that it can hit people much much harder now, owing to community status, than when it was 12 years old.
(nods) I've wondered this many times.
I have also at times wondered if EY is adopting the "slam the door three times" approach to prospective members of his community, though I consider this fairly unlikely given other things he's said.
Somewhat relatedly, I remember when lukeprog first joined the site, he and EY got into an exchange that, from what I recall of my perspective as a completely uninvolved third party, involved luke earnestly trying to offer assistance and EY being confidently dismissive of any assistance someone like luke could provide. At the time I remember feeling sort of sorry for luke, who it seemed to me was being treated a lot worse than he deserved, and surprised that he kept at it.
The way that story ultimately turned out led me to decide that my model of what was going on was at least importantly incomplete, and quite possibly fundamentally wrongheaded, but I haven't further refined that model.
Reading Holden's transcript with Jaan Tallinn (trying to go over the whole thing before writing a response, due to having done Julia's Combat Reflexes unit at Minicamp and realizing that the counter-mantra 'If you respond too fast you may lose useful information' was highly applicable to Holden's opinions about charities), I came across the following paragraph:
...My understanding is that once we figured out how to get a computer to do arithmetic, computers vastly surpassed humans at arithmetic, practically overnight ... doing so didn't involve any rewriting of their own source code, just implementing human-understood calculation procedures faster and more reliably than humans can. Similarly, if we reached a good enough understanding of how to convert data into predictions, we could program this understanding into a computer and it would overnight be far better at predictions than humans - while still not at any point needing to be authorized to rewrite its own source code, make decisions about obtaining "computronium" or do anything else other than plug data into its existing hardware and algorithms and calculate and report the likely consequences of different courses of a
Jaan's reply to Holden is also correct:
... the oracle is, in principle, powerful enough to come up with self-improvements, but refrains from doing so because there are some protective mechanisms in place that control its resource usage and/or self-reflection abilities. i think devising such mechanisms is indeed one of the possible avenues for safety research that we (eg, organisations such as SIAI) can undertake. however, it is important to note the inherent instability of such system -- once someone (either knowingly or as a result of some bug) connects a trivial "master" program with a measurable goal to the oracle, we have a disaster in our hands. as an example, imagine a master program that repeatedly queries the oracle for best packets to send to the internet in order to minimize the oxygen content of our planet's atmosphere.
Obviously you wouldn't release the code of such an Oracle - given code and understanding of the code it would probably be easy, possibly trivial, to construct some form of FOOM-going AI out of the Oracle!
Holden seems to think this sort of development would happen naturally with the sort of AGI researchers we have nowadays, and I wish he'd spent a few years arguing with some of them to get a better picture of how unlikely this is.
While I can't comment on AGI researchers, I think you underestimate e.g. more mainstream AI researchers such as Stuart Russell and Geoff Hinton, or cognitive scientists like Josh Tenenbaum, or even more AI-focused machine learning people like Andrew Ng, Daphne Koller, Michael Jordan, Dan Klein, Rich Sutton, Judea Pearl, Leslie Kaelbling, and Leslie Valiant (and this list is no doubt incomplete). They might not be claiming that they'll have AI in 20 years, but that's likely because they are actually grappling with the relevant issues and therefore see how hard the problem is likely to be.
Not that it strikes me as completely unreasonable that we would have a major breakthrough that gives us AI in 20 years, but it's hard to see what the candidate would be. But I have only been thinking about these issues for a couple years, so I still maintain a pretty high degree of uncertainty about all of these claims.
I do think I basically agree with you re: inductive l...
I agree that top mainstream AI guy Peter Norvig was way the heck more sensible than the reference class of declared "AGI researchers" when I talked to him about FAI and CEV, and that estimates should be substantially adjusted accordingly.
I completely agree with the intent of this post. These are all important issues SI should officially answer. (Edit: SI's official reply is here.) Here are some of my thoughts:
I completely agree with objection 1. I think SI should look into doing exactly as you say. I also feel that friendliness has a very high failure chance and that all SI can accomplish is a very low marginal decrease in existential risk. However, I feel this is the result of existential risk being so high and difficult to overcome (Great Filter) rather than SI being so ineffective. As such, for them to engage this objection is to admit defeatism and millennialism, and so they put it out of mind since they need motivation to keep soldiering on despite the sure defeat.
Objection 2 is interesting, though you define AGI differently, as you say. Some points against it: Only one AGI needs to be in agent mode to realize existential risk, even if there are already billions of tool-AIs running safely. Tool-AI seems closer in definition to narrow AI, which you point out we already have lots of, and are improving. It's likely that very advanced tool-AIs will indeed be the first to achieve some measure of AGI capability.
Lack of impressive endorsements. [...] I feel that given the enormous implications of SI's claims, if it argued them well it ought to be able to get more impressive endorsements than it has. I have been pointed to Peter Thiel and Ray Kurzweil as examples of impressive SI supporters, but I have not seen any on-record statements from either of these people that show agreement with SI's specific views, and in fact (based on watching them speak at Singularity Summits) my impression is that they disagree.
This is key: they support SI despite not agreeing with SI's specific arguments. Perhaps you should, too, at least if you find folks like Thiel and Kurzweil sufficiently impressive.
In fact, this has always been roughly my own stance. The primary reason I think SI should be supported is not that their arguments for why they should be supported are good (although I think they are, or at least, better than you do). The primary reason I think SI should be supported is that I like what the organization actually does, and wish it to continue. The Less Wrong Sequences, Singularity Summit, rationality training camps, and even HPMoR and Less Wrong itself are all worth paying some amount of mo...
The primary reason I think SI should be supported is that I like what the organization actually does, and wish it to continue. The Less Wrong Sequences, Singularity Summit, rationality training camps, and even HPMoR and Less Wrong itself are all worth paying some amount of money for.
I think that my own approach is similar, but with a different emphasis. I like some of what they've done, so my question is how to encourage those pieces. This article was very helpful in prompting some thought into how to handle that. I generally break down their work into three categories:
Rationality (minicamps, training, LW, HPMoR): Here I think they've done some very good work. Luckily, the new spinoff will allow me to support these pieces directly.
Existential risk awareness (singularity summit, risk analysis articles): Here their record has been mixed. I think the Singularity Summit has been successful, other efforts less so but seemingly improving. I can support the Singularity Summit by continuing to attend and potentially donating directly if necessary (since it's been running positive in recent years, for the moment this does not seem necessary).
Original research (FAI, timeless decisio
I furthermore have to say that to raise this particular objection seems to me almost to defeat the purpose of GiveWell. After all, if we could rely on standard sorts of prestige-indicators to determine where our money would be best spent, everybody would be spending their money in those places already, and "efficient charity" wouldn't be a problem for some special organization like yours to solve.
I think Holden seems to believe that Thiel and Kurzweil endorsing SIAI's UFAI-prevention methods would be more like a leading epidemiologist endorsing the malaria-prevention methods of the Against Malaria Foundation (AMF) than it would be like Celebrity X taking a picture with some children for the AMF. There are different kinds of "prestige-indicator," some more valuable to a Bayesian-minded charity evaluator than others.
Firstly, I'd like to add to the chorus saying that this is an incredible post; as a supporter of SI, it warms my heart to see it. I disagree with the conclusion - I would still encourage people to donate to SI - but if SI gets a critique this good twice a decade it should count itself lucky.
I don't think GiveWell making SI its top rated charity would be in SI's interests. In the long term, SI benefits hugely when people are turned on to the idea of efficient charity, and asking them to swallow all of the ideas behind SI's mission at the same time will put them off. If I ran GiveWell and wanted to give an endorsement to SI, I might break the rankings into multiple lists: the most prominent being VillageReach-like charities which directly do good in the near future, then perhaps a list for charities that mitigate broadly accepted and well understood existential risks (if this can be done without problems with politics), and finally a list of charities which mitigate more speculative risks.
I find it unfortunate that none of the SIAI research associates have engaged very deeply in this debate, even LessWrong regulars like Nesov and cousin_it. This is part of the reason why I was reluctant to accept (and ultimately declined) when SI invited me to become a research associate: I would feel less free to speak up both in support of SI and in criticism of it.
I don't think this is SI's fault, but perhaps there are things it could do to lessen this downside of the research associate program. For example it could explicitly encourage the research associates to publicly criticize SI and to disagree with its official positions, and make it clear that no associate will be blamed if someone mistook their statements to be official SI positions or saw them as reflecting badly on SI in general. I also write this comment because just being consciously aware of this bias (in favor of staying silent) may help to counteract it.
I don't usually engage in potentially protracted debates lately. A very short summary of my disagreement with the object-level argument part of Holden's post:
(1) I don't see in what way the idea of a powerful Tool AI can be usefully different from that of Oracle AI, and it seems like the connotations of "Tool AI" that distinguish it from "Oracle AI" follow from an implicit sense of it not having too much optimization power, so it might be impossible for a Tool AI to both be powerful and hold the characteristics suggested in the post;
(1a) the description of Tool AI denies it goals/intentionality and similar properties, but I don't see what these words mean apart from optimization power, and so I don't know how to use them to characterize Tool AI;
(2) the potential danger of having a powerful Tool/Oracle AI around is such that aiming at their development doesn't seem like a good idea;
(3) I don't see how a Tool/Oracle AI could be sufficiently helpful to break the philosophical part of the FAI problem, since we don't even know which questions to ask.
Since Holden stated that he's probably not going to (interactively) engage the comments to this post, and writing this up in a self-contained way is a lot of work, I'm going to leave this task to the people who usually write up SingInst outreach papers.
Not sure about the others, but as for me, at some point this spring I realized that talking about saving the world makes me really upset and I'm better off avoiding the whole topic.
Thank you very much for writing this. I, um, wish you hadn't posted it literally directly before the May Minicamp when I can't realistically respond until Tuesday. Nonetheless, it already has a warm place in my heart next to the debate with Robin Hanson as the second attempt to mount informed criticism of SIAI.
It looks to me as though Holden had the criticisms he expresses even before becoming "informed", presumably by reading the sequences, but was too intimidated to share them. Perhaps it is worth listening to/encouraging uninformed criticisms as well as informed ones?
Note the following criticism of SI identified by Holden:
Being too selective (in terms of looking for people who share its preconceptions) when determining whom to hire and whose feedback to take seriously.
1) Most criticism of key ideas underlying SIAI's strategies does not reference SIAI, e.g. Chris Malcolm's "Why Robots Won't Rule" website is replying to Hans Moravec.
2) Dispersed criticism, with many people making local points, e.g. those referenced by Wei Dai, is still criticism and much of that is informed and reasonable.
3) Much criticism is unwritten, e.g. consider the more FAI-skeptical Singularity Summit speaker talks, or takes the form of brief responses to questions or the like. This doesn't mean it isn't real or important.
4) Gerrymandering the bounds of "informed criticism" to leave almost no one within bounds is in general a scurrilous move that one should bend over backwards to avoid.
5) As others have suggested, even within the narrow confines of Less Wrong and adjacent communities there have been many informed critics. Here's Katja Grace's criticism of hard takeoff (although I am not sure how separate it is from Robin's). Here's Brandon Reinhart's examination of SIAI, which includes some criticism and brings more in comments. Here's Kaj Sotala's comparison of FHI and SIAI. And there are of course many detailed and often highly upvoted comments in response to various SIAI-discussing posts and threads, many of which you have participated in.
This is a bit exasperating. Did you not see my comments in this thread? Have you and Eliezer considered that if there really have been only two attempts to mount informed criticism of SIAI, then LessWrong must be considered a massive failure that SIAI ought to abandon ASAP?
Wei Dai has written many comments and posts that have some measure of criticism, and various members of the community, including myself, have expressed agreement with them. I think what might be a problem is that such criticisms haven't been collected into a single place where they can draw attention and stir up drama, as Holden's post has.
There are also critics like XiXiDu. I think he's unreliable, and I think he'd admit to that, but he also makes valid criticisms that are shared by other LW folk, and LW's moderation makes it easy to sift his comments for the better stuff.
Perhaps an institution could be designed. E.g., a few self-ordained SingInst critics could keep watch for critiques of SingInst, collect them, organize them, and update a page somewhere out-of-the-way over at the LessWrong Wiki that's easily checkable by SI folk like yourself. LW philanthropists like User:JGWeissman or User:Rain could do it, for example. If SingInst wanted to signal various good things then it could even consider paying a few people to collect and organize criticisms of SingInst. Presumably if there are good critiques out there then finding them would be well worth a small investment.
I think what might be a problem is that such criticisms haven't been collected into a single place where they can draw attention and stir up drama, as Holden's post has.
I put them in discussion, because well, I bring them up for the purpose of discussion, and not for the purpose of forming an overall judgement of SIAI or trying to convince people to stop donating to SIAI. I'm rarely sure that my overall beliefs are right and SI people's are wrong, especially on core issues that I know SI people have spent a lot of time thinking about, so mostly I try to bring up ideas, arguments, and possible scenarios that I suspect they may not have considered. (This is one major area where I differ from Holden: I have greater respect for SI people's rationality, at least their epistemic rationality. And I don't know why Holden is so confident about some of his own original ideas, like his solution to Pascal's Mugging, and Tool-AI ideas. (Well I guess I do, it's probably just typical human overconfidence.))
Having said that, I reserve the right to collect all my criticisms together and make a post in main in the future if I decide that serves my purposes, although I suspect that without the inf...
Also, I had expected that SI people monitored LW discussions, not just for critiques, but also for new ideas in general
I read most such (apparently-relevant from post titles) discussions, and Anna reads a minority. I think Eliezer reads very few. I'm not very sure about Luke.
To those who think Eliezer is exaggerating: please link me to "informed criticism of SIAI."
It would help if you could elaborate on what you mean by "informed".
Most of what Holden wrote, and much more, has been said by other people, excluding myself, before.
I don't have the time right now to wade through all those years of posts and comments but might do so later.
And if you are not willing to take into account what I myself wrote, for being uninformed, then maybe you will however agree that at least all of my critical comments that have been upvoted to +10 (ETA changed to +10, although there is a lot more on-topic at +5) should have been taken into account. If you do so you will find that SI could have updated some time ago on some of what has been said in Holden's post.
I'm not sure how much he's put into writing, but Ben Goertzel is surely informed. One might argue he comes to the wrong conclusions about AI danger, but it's not from not thinking about it.
If you really cared about future risk you would be working away at the problem even with a smaller salary. Focus on your work.
What we really need is some kind of emotionless robot who doesn't care about its own standard of living and who can do lots of research and run organizations and suchlike without all the pesky problems introduced by "being human".
Oh, wait...
So your argument that visiting a bunch of highly educated pencil-necked white nerds is physically dangerous boils down to... one incident of ineffective online censorship mocked by most of the LW community and all outsiders, and some criticism of Yudkowsky's computer science & philosophical achievements.
I see.
I would literally have had more respect for you if you had used racial slurs like "niggers" in your argument, since that is at least tethered to reality in the slightest bit.
I think I'm entitled to opine...
Of course you are. And, you may not be one of the people who "like my earlier papers."
You confirm the lead poster's allegations that SIA staff are insular and conceited.
Really? How? I commented earlier on LW (can't find it now) about how the kind of papers I write barely count as "original research" because for the most part they merely summarize and clarify the ideas of others. But as Beckstead says, there is a strong need for that right now.
For insights in decision theory and FAI theory, I suspect we'll have to look to somebody besides Luke Muehlhauser. We keep trying to hire such people but they keep saying "No." (I got two more "no"s just in the last 3 weeks.) Part of that may be due to the past and current state of the organization — and luckily, fixing that kind of thing is something I seem to have some skills with.
You're... a textbook writer at heart.
True, dat.
This most recently happened just a few weeks ago. On that occasion Luke Muehlhauser (no less) took the unusual step of asking me to friend him on Facebook, after which he joined a discussion I was having and made scathing ad hominem comments about me
Sounds serious... Feel free to post a relevant snippet of the discussion, here or elsewhere, so that those interested can judge this event on its merits, and not through your interpretation of it.
On April 7th, Richard posted to Facebook:
LessWrong has now shown its true mettle. After someone here on FB mentioned a LW discussion of consciousness, I went over there and explained that Eliezer Yudkowsky, in his essay, had completely misunderstood the Zombie Argument given by David Chalmers. I received a mix of critical, thoughtful and sometimes rude replies. But then, all of a sudden, Eliezer took an interest in this old thread again, and in less than 24 hours all of my contributions were relegated to the trash. Funnily enough, David Chalmers himself then appeared and explained that Eliezer had, in fact, completely misunderstood his argument. Chalmers' comments, strangely enough, have NOT been censored. :-)
I replied:
...I haven't read the whole discussion, but just so everyone is clear...
Richard's claim that "in less than 24 hours all of my contributions were relegated to the trash" is false.
What happened is that LWers disvalued Richard's comments and downvoted them. Because most users have their preferences set to hide comments with a score of less than -3, these users saw Richard's most-downvoted comments as collapsed by default, with a note reading "comment s
I fail to see anything that can be qualified as an ad hominem ("an attempt to negate the truth of a claim by pointing out a negative characteristic or belief of the person supporting it") in what you quoted. If anything, the original comment by Richard comes much closer to this definition.
"And if Novamente should ever cross the finish line, we all die."
And yet SIAI didn't do anything to Ben Goertzel (except make him Director of Research for a time, which is kind of insane in my judgement, but obviously not in the sense you intend).
Ben Goertzel's projects are knowably hopeless, so I didn't too strongly oppose Tyler Emerson's project from within SIAI's then-Board of Directors; it was being argued to have political benefits, and I saw no noticeable x-risk so I didn't expend my own political capital to veto it, just sighed. Nowadays the Board would not vote for this.
And it is also true that, in the hypothetical counterfactual conditional where Goertzel's creations work, we all die. I'd phrase the email message differently today to avoid any appearance of endorsing the probability, because today I understand better that most people have trouble mentally separating hypotheticals. But the hypothetical is still true in that counterfactual universe, if not in this one.
There is no contradiction here.
Richard,
If you have some solid, rigorous and technical criticism of SIAI's AI work, I wish you would create a pseudonymous account on LW and state that criticism without giving the slightest hint that you are Richard Loosemore, or making any claim about your credentials, or talking about censorship and quashing of dissenting views.
Until you do something like that, I can't help think that you care more about your reputation or punishing Eliezer than about improving everybody's understanding of technical issues.
Please don't take this as a personal attack, but, historically speaking, everyone who has said "I am in the final implementation stages of the general intelligence algorithm" has been wrong so far. Their algorithms never quite worked out. Is there any evidence you can offer that your work is any different? I understand that this is a tricky proposition, since revealing your work could set off all kinds of doomsday scenarios (assuming that it performs as you expect it to); still, surely there must be some way for you to convince skeptics that you can succeed where so many others have failed.
I would say that, far from deserving support, SI should be considered a cult-like community in which dissent is ruthlessly suppressed in order to exaggerate the point of view of SI's founders and controllers, regardless of the scientific merits of those views, or of the dissenting opinions.
This is a very strong statement. Have you allowed for the possibility that your current judgement might be clouded by the events transpired some 6 years ago?
I myself employ a very strong heuristic, from years of trolling the internet: when a user joins a forum and complains about an out-of-character and strongly personal persecution by the moderation staff in the past, there is virtually always more to the story when you look into it.
Regardless of who was how much at fault in the SL4 incident, surely you must admit that Yudkowsky's interactions with you were unusually hostile relative to how he generally interacts with critics. I can see how you'd want to place emphasis on those interactions because they involved you personally, but that doesn't make them representative for purposes of judging cultishness or making general claims that "dissent is ruthlessly suppressed".
I think Martian Yudkowsky is a dangerous intuition pump. We're invited to imagine a creature just like Eliezer except green and with antennae; we naturally imagine him having values as similar to us as, say, a Star Trek alien. From there we observe the similarity of values we just pushed in, and conclude that values like "interesting" are likely to be shared across very alien creatures. Real Martian Yudkowsky is much more alien than that, and is much more likely to say
There is little prospect of an outcome that realizes even the value of being flarn, unless the first superintelligences undergo detailed inheritance from Martian values.
Imagine, an intelligence that didn't have the universal emotion of badweather!
Of course, extraterrestrial sentients may possess physiological states corresponding to limbic-like emotions that have no direct analog in human experience. Alien species, having evolved under a different set of environmental constraints than we, also could have a different but equally adaptive emotional repertoire. For example, assume that human observers land on another planet and discover an intelligent animal with an acute sense of absolute humidity and absolute air pressure. For this creature, there may exist an emotional state responding to an unfavorable change in the weather. Physiologically, the emotion could be mediated by the ET equivalent of the human limbic system; it might arise following the secretion of certain strength-enhancing and libido-arousing hormones into the alien's bloodstream in response to the perceived change in weather. Immediately our creature begins to engage in a variety of learned and socially-approved behaviors, including furious burrowing and building, smearing tree sap over its pelt, several different territorial defense ceremonies, and vigorous polygamous copulations with nearby females, apparently (to humans) for no reason at all. Would our astronauts interpret this as madness? Or love? Lust? Fear? Anger? None of these is correct, of course: the alien is feeling badweather.
I suggest you guys taboo interesting, because I strongly suspect you're using it with slightly different meanings. (And BTW, as a Martian Yudkowsky I imagine something with values at least as alien as Babyeaters' or Superhappys'.)
I am in the final implementation stages of the general intelligence algorithm.
it's both amusing and disconcerting that people on this forum treat such a comment seriously.
Rain (who noted that he is a donor to SIAI in a comment) and HoldenKarnofsky (who wrote the post) are two different people, as indicated by their different usernames.
I feel that [SI] ought to be able to get more impressive endorsements than it has.
SI seems to have passed up opportunities to test itself and its own rationality by e.g. aiming for objectively impressive accomplishments.
Holden, do you believe that charitable organizations should set out deliberately to impress donors and high-status potential endorsers? I would have thought that a donor like you would try to ignore the results of any attempts at that and to concentrate instead on how much the organization has actually improved the world, because to do otherwise is to incentivize organizations whose real goal is to accumulate status and money for their own sake.
For example, Eliezer's attempts to teach rationality or "technical epistemology" or whatever you want to call it through online writings seem to me to have actually improved the world in a non-negligible way and seem to have been designed to do that rather than designed merely to impress.
ADDED. The above is probably not as clear as it should be, so let me say it in different words: I suspect it is a good idea for donors to ignore certain forms of evidence ("impressiveness", affiliation with high-status folk) of a charity's effectiveness, to discourage charities from gaming donors in ways that seem to me already too common, and I was a little surprised to see that you do not seem to ignore those forms of evidence.
I agree with much of this post, but find a disconnect between the specific criticisms and the overall conclusion of withholding funds from SI even for "donors determined to donate within this cause", and even aside from whether SI's FAI approach increases risk. I see a couple of ways in which the conclusion might hold.
If Holden believes that:
A) reducing existential risk is valuable, and
B) SI's effectiveness at reducing existential risk is a significant contributor to the future of existential risk, and
C) SI is being less effective at reducing existential risk than they would be if they fixed some set of problems P, and
D) withholding GiveWell's endorsement while pre-committing to re-evaluating that refusal if given evidence that P has been fixed increases the chances that SI will fix P...
...it seems to me that Holden should withhold GiveWell's endorsement while pre-committing to re-evaluating that refusal if given evidence that P has been fixed.
Which seems to be what he's doing. (Of course, I don't know whether those are his reasons.)
What, on your view, ought he do instead, if he believes those things?
Holden said,
However, I don't think that "Cause X is the one I care about and Organization Y is the only one working on it" to be a good reason to support Organization Y.
This addresses your point (2). Holden believes that SI is grossly inefficient at best, and actively harmful at worst (since he thinks that they might inadvertently increase AI risk). Therefore, giving money to SI would be counterproductive, and a donor would get a better return on investment in other places.
As for point (1), my impression is that Holden's low estimate of SI's competence is due to a combination of what he sees as wrong beliefs and an insufficient capability to put even the correct beliefs into practice. SI claims to be supremely rational, but their list of achievements is lackluster at best -- which suggests a certain amount of Dunning-Kruger effect at work. Furthermore, SI appears to be focused on growing SI and teaching rationality workshops, as opposed to their stated mission of researching FAI theory.
Additionally, Holden indicted SI members pretty strongly (though very politely) for what I will (in a less polite fashion) label as arrogance. The prevailing attitude of SI members seems to be (according to Holden) that the rest of the world is just too irrational to comprehend their brilliant insights, and therefore the rest of the world has little to offer -- and therefore, any criticism of SI's goals or actions can be dismissed out of hand.
EDIT: found the right quote, duh.
There's got to be a level beyond "arguments as soldiers" to describe your current approach to ineffective contrarianism.
I volunteer "arguments as cannon fodder."
Some comments on objections 1 and 2.
For example, when the comment says "the formalization of the notion of 'safety' used by the proof is wrong," it is not clear whether it means that the values the programmers have in mind are not correctly implemented by the formalization, or whether it means they are correctly implemented but are themselves catastrophic in a way that hasn't been anticipated.
Both (with the caveat that SI's plans are to implement an extrapolation procedure for the values, and not the values themselves).
Another way of putting this is that a "tool" has an underlying instruction set that conceptually looks like: "(1) Calculate which action A would maximize parameter P, based on existing data set D. (2) Summarize this calculation in a user-friendly manner, including what Action A is, what likely intermediate outcomes it would cause, what other actions would result in high values of P, etc."
I think such a Tool-AI will be much less powerful than an equivalent Agent-AI, due to the bottleneck of having to summarize its calculations in a human-readable form, and then waiting for the human to read and understand the summary and then mak...
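To make the distinction being argued about concrete, here is a minimal sketch of the two loops, with every name and number invented for illustration (this is not anyone's proposed design). The structural point is the one above: the tool variant stops at a human-readable report, which is exactly where the bottleneck sits, while the agent variant acts on its own conclusion.

    # Toy illustration of the tool/agent distinction; all names and numbers
    # here are invented for this sketch, not anyone's actual design.

    ACTIONS = ["do nothing", "buy an umbrella", "build a dam"]

    def score(action, data):
        # Stand-in for "calculate which action A would maximize parameter P given data D".
        return len(action) * data["rain_forecast"]

    def tool_step(data):
        best = max(ACTIONS, key=lambda a: score(a, data))
        # (2) Summarize for a human and stop; a human must read this and decide what to do.
        return "Recommended action: %r (score %.2f)" % (best, score(best, data))

    def agent_step(data, execute):
        best = max(ACTIONS, key=lambda a: score(a, data))
        execute(best)  # acts on the world directly; no human-readable pause
        return best

    data = {"rain_forecast": 0.7}
    print(tool_step(data))                                            # tool: output is only a report
    agent_step(data, execute=lambda a: print("executing %r" % a))     # agent: output is an action

Anything the tool version can recommend, the agent version can simply do, without waiting for a human to read and approve the report; that asymmetry is what the "much less powerful" claim above is pointing at.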
The basic idea is that if you pull a mind at random from design space then it will be unfriendly. I am not even sure if that is true. But it is the strongest argument they have. And it is completely bogus because humans do not pull AGI's from mind design space at random.
I don't have the energy to get into an extended debate, but the claim that this is "the basic idea" or that this would be "the strongest argument" is completely false. A far stronger basic idea is the simple fact that nobody has yet figured out a theory of ethics that would work properly, which means that even AGIs that were specifically designed to be ethical are most likely to lead to bad outcomes. And that's presuming that we even knew how to program them exactly.
This isn't even something that you'd need to read a hundred blog posts for; it's well discussed in "The Singularity and Machine Ethics", in "Artificial Intelligence as a Positive and Negative Factor in Global Risk", and in "Complex Value Systems are Required to Realize Valuable Futures".
The more significant fact is that these criticisms were largely unknown to the community.
LWer tenlier disagrees, saying:
[Holden's] critique mostly consists of points that are pretty persistently bubbling beneath the surface around here, and get brought up quite a bit. Don't most people regard this as a great summary of their current views, rather than persuasive in any way? In fact, the only effect I suspect this had on most people's thinking was to increase their willingness to listen to Karnofsky in the future if he should change his mind.
Also, you said:
Dissent is cabined to Discussion.
Luckily, evidence on the matter is easy to find. As counter-evidence I present: Self-improvement or shiny distraction, SIAI an examination, Why we can't take expected value estimates literally, Extreme rationality: it's not that great, Less Wrong Rationality and Mainstream Philosophy, and the very post you are commenting on. Many of these are among the most upvoted posts ever.
Moreover, the editors rarely move posts from Main to Discussion. The posters themselves decide whether to post in Main or Discussion.
Your point is well taken, but since part of the concern about that whole affair was your extreme language and style, maybe stating this in normal caps might be a reasonable step for PR.
I'm sure I wouldn't have done what Romney did, and not so sure about whether I would have done what Yudkowsky did. Romney wanted to hurt people for the fun of it. Yudkowsky was trying to keep people from being hurt, regardless of whether his choice was a good one.
If such a person were to write a similar post and actually write the way they feel, rather than being incredibly polite, things would look very different.
I'm assuming you think they'd come in, scoff at our arrogance for a few pages, and then waltz off. Disregarding how many employed machine learning engineers also do side work on general intelligence projects, you'd probably get the same response from an automobile engineer, someone with a track record and field expertise, talking to the Wright Brothers. Thinking about new things and new ideas doesn't automatically make you wrong.
That recursive self-improvement is nothing more than a row of English words, a barely convincing fantasy.
Really? Because that's a pretty strong claim. If I knew how the human brain worked well enough to build one in software, I could certainly build something smarter. You could increase the number of slots in working memory. Tweak the part of the brain that handles intuitive math to correctly deal with orders of magnitude. Improve recall to eidetic levels. Tweak the brain's handling of probabilities to be closer to the Bayesian ideal. Even those small changes would likely produce a mind ...
Having been the subject of both a relatively large upvote and a relatively large downvote in the last couple of weeks, I still think that the worst thing one can do is to complain about censorship or karma. The posts and comments on any forum aren't judged on their "objective merits" (because there is no such thing), but on their suitability for the forum in question. If you have been downvoted, your post deserves it by definition. You can politely inquire about the reasons, but people are not required to explain themselves. As for rationality, I question whether it is rational to post on a forum if you are not having fun there. Take it easy.
I downvoted you because you're wrong. For one, comments can't be promoted to Main, only posts; and for two, plenty of opposition has garnered a great deal of upvotes, as shown by the numerous links lukeprog provided.
For example, where do you get 'almost 800 responses' from? That comment (not post) only has 32 comments below it.
I'm interested in any compiled papers or articles you wrote about AGI motivation systems, aside from the forthcoming book chapter, which I will read. Do you have any links?
I'll gladly start reading at any point you'll link me to.
The fact that you don't just provide a useful link but instead several paragraphs of excuses why the stuff I'm reading is untrustworthy I count as (small) evidence against you.
I don't work for SI and this is not an SI-authorized response, unless SI endorses it later. This comment is based on my own understanding based on conversations with and publications of SI members and general world model, and does not necessarily reflect the views or activities of SI.
The first thing I notice is that your interpretation of SI's goals with respect to AGI are narrower than the impression I had gotten, based on conversations with SI members. In particular, I don't think SI's research is limited to trying to make AGI friendliness provable, but on a variety of different safety strategies, and on the relative win-rates of different technological paths, eg brain uploading vs. de-novo AI, classes of utility functions and their relative risks, and so on. There is also a distinction between "FAI theory" and "AGI theory" that you aren't making; the idea, as I see it, is that to the extent to which these are separable, "FAI theory" covers research into safety mechanisms which reduce the probability of disaster if any AGI is created, while "AGI theory" covers research that brings the creation of any AGI closer. Your first objection - that ...
This likely won't work. Money is fungible, so unless the total donations so earmarked exceeds the planned SI funding for that cause, they won't have to change anything. They're under no obligation to not defund your favorite cause by exactly the amount you donated, thus laundering your donation into the general fund. (Unless I misunderstand the relevant laws?)
EDIT NOTE: The post used to say vast majority; this was changed, but is referenced below.
Richard, this really isn't productive. You're clearly quite intelligent and clearly still have issues due to the dispute between you and Eliezer. It is likely that if you got over this, you could be an effective, efficient, and helpful critic of SI and their ideas. But right now, you are engaging in uncivil behavior that isn't endearing you to anyone, while making emotionally heavy comparisons that make you sound strident.
There's been a video or two where Eliezer was called "world's foremost expert on recursive self improvement".
This usually happens when the person being introduced wasn't consulted about the choice of introduction.
I'm glad for this, LessWrong can always use more engaging critiques of substance. I partially agree with Holden's conclusions, although I reach them from a substantially different route. I'm a little surprised then that few of the replies have directly engaged what I find to be the more obvious flaws in Holden's argument: namely objection 2 and the inherent contradictions with it and objection 1.
Holden posits that many (most?) well-known current AI applications more or less operate as sophisticated knowledge bases. His tool/agent distinction draws a boundary around AI tools: systems whose only external actions consist of communicating results to humans, and the rest being agents which actually plan and execute actions with external side effects. Holden distinguishes 'tool' AI from Oracle AI, the latter really being agent AI (designed for autonomy) which is trapped in some sort of box. Accepting Holden's terminology and tool/agent distinction, he then asserts:
I can ...
If your purpose is "let everyone know I think Eliezer is nuts", then you have succeeded, and may cease posting.
Please rot13 the part from “potentially” onwards, and add a warning as in this comment (with “decode the rot-13'd part” instead of “follow the links”), because there are people here who've said they don't want to know about that thing.
Holden does a great job but makes two major flaws:
1) His argument about Tool-AI is irrelevant, because creating Tool-AI does almost nothing to avoid Agent-AI, which he agrees is dangerous.
2) He too narrowly construes SI's goals by assuming they are only working on Friendly AI rather than AGI x-risk reduction in general.
The heck? Why would you not need to figure out if an oracle is an ethical patient? Why is there no such possibility as a sentient oracle?
Is this standard religion-of-embodiment stuff?
I'm very impressed by Holden's thoroughness and thoughtfulness. What I'd like to know is why his post is Eliezer-endorsed and has 191 up-votes, while my many posts over the years hammering on Objection 1, and my comments raising Objection 2, have never gotten the green button, been frequently down-voted, and never been responded to by SIAI. Do you have to be outside the community to be taken seriously by it?
Not to be cynical, PhilGoetz, but isn't Holden an important player in the rational-charity movement? Wouldn't the ultimate costs of ignoring Holden be prohibitive?
I thought most of the stuff in Holden's post had been public knowledge for years, even to the point of being included in previous FAQs produced by SI. The main difference is that the presentation and solidity of it in this article are remarkable - interconnecting so many different threads which, when placed as individual sentences or paragraphs, might hang alone, but when woven together with the proper knots form a powerful net.
Assuming what you say is true, it looks to me as though SI is paying the cost of ignoring its critics for so many years...
The quotes aren't all about AI.
I didn't say they were. I said that just because the speaker for a particular idea comes across as crazy doesn't mean the idea itself is crazy. That applies whether all of Eliezer's "crazy statements" are about AI, or whether none of them are.
Whoever knowingly chooses to save one life, when they could have saved two – to say nothing of a thousand lives, or a world – they have damned themselves as thoroughly as any murderer.
The most extreme presumptuousness about morality; insufferable moralism.
Funny, I actually agree with the top phrase. It's written in an unfortunately preachy, minister-scaring-the-congregation-by-saying-they'll-go-to-Hell style, which is guaranteed to make just about anyone get defensive and/or go "ick!" But if you accept the (very common) moral standard that if you can save a life, it's better to do it than not to do it, then the logic is inevitable that if you have the choice of saving one life or two lives, by your own metric it's morally preferable to save two lives. If you don't accept the moral standard that it's better to save one life than zero lives, then that phrase should be just as insuffe...
How would one explain Yudkowsky's paranoia, lack of perspective, and scapegoating--other than by positing a narcissistic personality structure?
I had in fact read a lot of those quotes before–although some of them come as a surprise, so thank you for the link. They do show paranoia and lack of perspective, and yeah, some signs of narcissism, and I would be certainly mortified if I personally ever made comments like that in public...
The Sequences as a whole do come across as having been written by an arrogant person, and that's kind of irritating, and I have to consciously override my irritation in order to enjoy the parts that I find useful, which is quite a lot. It's a simplification to say that the Sequences are just clutter, and it's extreme to call them 'craziness', too.
(Since meeting Eliezer in person, it's actually hard for me to believe that those comments were written by the same person, who was being serious about them... My chief interaction with him was playing a game in which I tried to make a list of my values, and he hit me with a banana every time I got writer's block because I was trying to be too specific, and made the Super Mario Brothers' theme song when I suc...
Romney is rightfully being held, feet to fire, for a group battering of another student while they attended high school--because such sadism is a trait of character and can't be explained otherwise.
I was going to upvote your comment until I got to this point. Aside from the general mindkilling, this looks like the fundamental attribution error, and moreover, we all know that people do in fact mature and change. Bringing up external politics is not helpful in a field where there's already concern that AI issues may be becoming a mindkilling subject themselves on LW. Bringing up such a questionable one is even less useful.
I initially upvoted this post, because the criticism seemed reasonable. Then I read the discussion, and switched to downvoting it. In particular, this:
Taken in isolation, these thoughts and arguments might amount to nothing more than a minor addition to the points that you make above. However, my experience with SI is that when I tried to raise these concerns back in 2005/2006 I was subjected to a series of attacks that culminated in a tirade of slanderous denunciations from the founder of SI, Eliezer Yudkowsky. After delivering this tirade, Yudkowsky then banned me from the discussion forum that he controlled, and instructed others on that forum that discussion about me was henceforth forbidden.
Since that time I have found that when I partake in discussions on AGI topics in a context where SI supporters are present, I am frequently subjected to abusive personal attacks in which reference is made to Yudkowsky's earlier outburst. This activity is now so common that when I occasionally post comments here, my remarks are very quickly voted down below a threshold that makes them virtually invisible. (A fate that will probably apply immediately to this very comment).
Serious accusati...
Can you provide some examples of these "abusive personal attacks"? I would also be interested in this ruthless suppression you mention. I have never seen this sort of behavior on LessWrong, and would be shocked to find it among those who support the Singularity Institute in general.
I've read a few of your previous comments, and while I felt that they were not strong arguments, I didn't downvote them because they were intelligent and well-written, and competent constructive criticism is something we don't get nearly enough of. Indeed, it is usually welcomed. The amount of downvotes given to the comments, therefore, does seem odd to me. (Any LW regular who is familiar with the situation is also welcome to comment on this.)
I have seen something like this before, and it turned out the comments were being downvoted because the person making them had gone over, and over, and over the same issues, unable or unwilling to either competently defend them, or change his own mind. That's no evidence that the same thing is happening here, of course, but I give the example because in my experience, this community is almost never vindictive or malicious, and is laudably willing to con...
Thanks. I read the whole debate, or as much of it as is there; I've prepared a short summary to post tomorrow if anyone is interested in knowing what really went on ("as according to Hul-Gil", anyway) without having to hack their way through that thread-jungle themselves.
(Summary of summary: Loosemore really does know what he's talking about - mostly - but he also appears somewhat dishonest, or at least extremely imprecise in his communication.)
Karnofsky's focus on "tool AI" is useful but also his statement of it may confuse matters and needs refinement. I don't think the distinction between "tool AI" and "agent AI" is sharp, or in quite the right place.
For example, the sort of robot cars we will probably have in a few years are clearly agents-- you tell them to "come here and take me there" and they do it without further intervention on your part (when everything is working as planned). This is useful in a way that any amount and quality of question answering is not. Almost certainly there will be various flavors of robot cars available and people will choose the ones they like (that don't drive in scary ways, that get them where they want to go even if it isn't well specified, that know when to make conversation and when to be quiet, etc.) As long as robot cars just drive themselves and people around, can't modify the world autonomously to make their performance better, and are subject to continuing selection by their human users, they don't seem to be much of a threat.
The key points here seem to be (1) limited scope, (2) embedding in a network of other actors and (3) huma...
Can you pretty, pretty please tell me where this graph gets its information from? I've seen similar graphs that basically permute the cubes' labels. It would also be wonderful to unpack what they mean by "solar" since the raw amount of sunlight power hitting the Earth's surface is a very different amount than the energy we can actually harness as an engineering feat over the next, say, five years (due to materials needed to build solar panels, efficiency of solar panels, etc.).
And just to reiterate, I'm really not arguing here. I'm honestly confused. I look at things like this video and books like this one and am left scratching my head. Someone is deluded. And if I guess wrong I could end up wasting a lot of resources and time on projects that are doomed to total irrelevance from the start. So, having some good, solid Bayesian entanglement would be absolutely wonderful right about now!
(I am saying this in case anyone looks at this thread and thinks Loosemore is making a valid point, not because I approve of anyone's responding to him.)
Alex, I did not say that ALL dissent is ruthlessly suppressed
This is an abuse of language since it is implicated by the original statement.
And since the easiest and quickest way to ensure that you NEVER get to see any of the money that he controls, would be to ruthlessly suppress his dissent, he is treated with the utmost deference.
There is absolutely no reason to believe that all, or half, or a quarter, or even ten percent of the upvotes on this post come from SIAI staff. There are plenty of people on LW who don't support donating to SIAI.
Congratulations on your insights, but please don't snrk implement them until snigger you've made sure that oh heck I can't keep a straight face anymore.
The reactions to the parent comment are very amusing. We have people sarcastically supporting the commenter, people sarcastically telling the commenter they're a threat to the world, people sarcastically telling the commenter to fear for their life, people non-sarcastically telling the commenter to fear for their life, people honestly telling the commenter they're probably nuts, and people failing to get every instance of the sarcasm. Yet at bottom, we're probably all (except for private_messaging) thinking the same thing: that FinalState almost certainly has no way of creating an AGI and that no-one involved need feel threatened by anyone else.
don't you detect unacknowledged ambition in Eliezer Yudkowsky?
Eliezer certainly has a lot of ambition, but I am surprised to see an accusation that this ambition is unacknowledged.
Why yes, I do also believe that political figures are held to ridiculous conversational standards as well. It's a miracle they deign to talk to anyone.
I don't understand. Holden is not a major financial contributor to SIAI. And even if he was: which argument are you talking about, and why is it disingenuous?
Posts which contain factual inaccuracies along with meta-discussion of karma effects are often downvoted.
Actually bare noun phrases in English carry both interpretations, ambiguously. The canonical example is "Policemen carry guns" versus "Policemen were arriving" -- the former makes little sense when interpreted existentially, but the latter makes even less sense when interpreted universally.
In short, there is no preferred interpretation.
(Oh, and prescriptivists always lose.)
At least try harder in your fear-mongering. The thread about EY's failure to make many falsifiable predictions is better ad hominem, and the speculation about launching terrorist attacks on fab plants is a much more compelling display of potential risk to life and property.
I agree that this is not a game, although you should note that you are doing EY/SIAI/LessWrong's work for it by trying to scare FinalState.
What probability would you give to FinalState's assertion of having a working AGI?
The social and opportunity costs of trying to suppress a "UFAI attempt" as implausible as FinalState's are far higher than the risk of failing to do so. There are also decision-theoretic reasons never to give in to Pascal-Mugging-type offers. SIAI knows all this and therefore will ignore FinalState completely, as well they should.
That's the kind of probability I would've assigned to EURISKO destroying the world back when Lenat was the first person ever to try to build anything self-improving. For a random guy on the Internet it's off by... maybe five orders of magnitude? I would expect a pretty tiny fraction of all worlds to have the names of homebrew projects carved on their tombstones, and there are many random people on the Internet claiming to have AGI.
People like this are significant, not because of their chances of creating AGI, but because of what their inability to stop or take any serious precautions, despite their belief that they are about to create AGI, tells us about human nature.
This just shifts the question to how you slotted FinalState into such a promising reference class. Conservatively, tens of academic research programs, tens of PhD dissertations, hundreds of hobbyist projects, hundreds of undergraduate term projects, and tens of business ventures have attempted something similar to AGI, and none have succeeded.
As far as I can tell, the vast majority of academic projects (particularly those of undergrads) have worked on narrow AI, which this is supposedly not.
However, reading the post again, it doesn't sound as though they have the support of any academic institution; I misread the bit around "academic network". It sounds more as though this is a homebrew project, in which case I need to go two or three orders of magnitude lower.
The Rule of Succession, if I'm not mistaken, assumes a uniform prior from 0 to 1 for the probability of success. That seems unreasonable; it shouldn't be extremely improbable (even before observing failure) that fewer than one in a thousand such claims result in a working AGI. So you have to adjust downward somewhat from there, but it's hard to say how much.
(This is in addition to the point that user:othercriteria makes in the sibling comment.)
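For readers who want the formula spelled out (a standard result; the attempt count below is only the rough order of magnitude suggested in the comments above): under a uniform Beta(1,1) prior on the per-attempt success probability, observing $s$ successes in $n$ attempts gives

    P(\text{next attempt succeeds} \mid s, n) = \frac{s+1}{n+2},

so with $s = 0$ and a few hundred failed attempts the estimate comes out on the order of $10^{-3}$. Replacing the uniform prior with a more skeptical Beta$(\alpha,\beta)$ prior generalizes this to $(s+\alpha)/(n+\alpha+\beta)$, which can be pushed well below $1/(n+2)$ by taking $\beta \gg \alpha$; that is the "adjust downward somewhat" step, and how far to push it is exactly the part that is hard to say.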
This is where Yudkowsky goes crazy autodidact bonkers. He thinks the social institution of science is superfluous, were everyone as smart as he. This means he can hold views contrary to scientific consensus in specialized fields where he lacks expert knowledge based on pure ratiocination.
Ok. I disagree with a large bit of the sequences on science and the nature of science. I've written a fair number of comments saying so. So I hope you will listen when I say that you are attacking a strawman version of what Eliezer wrote on these issues, and it borders on something that I could only see someone thinking if they were trying to interpret Eliezer's words in the most negative fashion possible.
Even most science fiction books avoid that because it sounds too implausible.
Not saying I particularly disagree with your other premises, but saying something can't be true because it sounds implausible is not a valid argument.
I'd brought up a version of the tool/agent distinction, and was told firmly that people aren't smart or fast enough to direct an AI. (Sorry, this is from memory-- I don't have the foggiest how to do an efficient search to find that exchange.)
I'm not sure that's a complete answer-- how possible is it to augment a human towards being able to manage an AI? On the other hand, a human like that isn't going to be much like humans 1.0, so problems of Friendliness are still in play.
Perhaps what's needed is building akrasia into the world-- a resistance to sudden change. This has its own risks, but sudden existential threats are rare. [1]
At this point, I think the work on teaching rationality is more reliably important than the work on FAI. FAI involves some long inferential chains. The idea that people could improve their lives a lot by thinking more carefully about what they're doing and acting on those thoughts (with willingness to take feedback) is a much more plausible idea, even if you factor in the idea that rationality can be taught.
[1] Good enough for fiction-- we're already living in a world like that. We call the built-in akrasia Murphy.
You may be thinking of this exchange, which I found only because I remembered having been involved in it.
I continue to think that "tool" is a bad term to use here, because people's understanding of what it refers to vary so relevantly.
As for what is valuable work... hm.
I think teaching people to reason in truth-preserving and value-preserving ways is worth doing.
I think formalizing a decision theory that captures universal human intuitions about what the right thing to do in various situations is worth doing.
I think formalizing a decision theory that captures non-universal but extant "right thing" intuitions is potentially worth doing, but requires a lot of auxiliary work to actually be worth doing.
I think formalizing a decision theory that arrives at judgments about the right thing to do in various situations where those judgments are counterintuitive for most/all humans but reliably lead, if implemented, to results that those same humans reliably endorse more than the results of their intuitive judgments is worth doing.
I think building systems that can solve real-world problems efficiently is worth doing, all else being equal, though I agree that powerful tools fr...
My biggest criticism of SI is that I cannot decide between:
A. promoting AI and FAI issues awareness will decrease the chance of UFAI catastrophe; or
B. promoting AI and FAI issues awareness will increase the chance of UFAI catastrophe.
This criticism seems distinct from the ones that Holden makes. But it is my primary concern. (Perhaps the closest example is Holden's analogy that SI is trying to develop Facebook before the Internet.)
A seems intuitive. Basically everyone associated with SI assumes that A is true, as far as I can tell. But A is not obviously true to me. It seems to me at least plausible that:
A1. promoting AI and FAI issues will get lots of scattered groups around the world more interested in creating AGI;
A2. one of these groups will develop AGI faster than otherwise due to A1;
A3. the world will be at greater risk of UFAI catastrophe than otherwise due to A2 (i.e. the group creates AGI faster than otherwise, and fails at FAI).
More simply: SI's general efforts, albeit well intended, might accelerate the creation of AGI, and the acceleration of AGI might decrease the odds of the first AGI being friendly. This is one path by which B, not A, would be true.
SI might repl...
I don't think it's hard to explain at all: Eliezer prioritized a donor (presumably long-term and one he knew personally) over an article. I disagree with it, but you know what, I saw this sort of thing all the time on Wikipedia, and I don't need to go looking for theories of why administrators were crazy and deleted Daniel Brandt's article. I know why they did, even though I strongly disagreed.
3) most importantly, never explained his response (practically impossible without admitting his mistake).
He or someone else must have explained at some point, or I wouldn't know his reason was that the article was giving a donor nightmares.
Is deleting one post such an issue to get worked up over? Or is this just discussed because it's the best criticism one can come up with besides "he's a high school dropout who hasn't yet created an AI and so must be completely wrong"?
The Roko incident has absolutely nothing to do with this at all. Roko did not claim to be on the verge of creating an AGI.
Once again you're spreading FUD about the SI. Presumably moderation will come eventually, no doubt over some hue and cry over censoring contrarians.
Sure. His moderation activities over the last year or so have been far more... sunglasses... moderate.
It seems almost unfair to criticize something as a problem of LW rationality when in your second paragraph you note that professionals do the same thing.
Ask yourself honestly whether you would ever or have ever done anything comparable to what Yudkowsky did in the Roko incident or what Romney did in the hair cutting incident.
I'm not sure. A while ago, I was involved in a situation where someone wanted to put personal information of an individual up on the internet knowing that that person had an internet stalker who had a history of being a real life stalker for others. The only reason I didn't react pretty close to how Eliezer reacted in the quoted incident is that I knew that the individual in question was not going to listen to me and would if anything have done the opposite of what I wanted. In that sort of context, Eliezer's behavior doesn't seem to be that extreme. Eliezer's remarks involve slightly more caps than I think I would use in such a circumstance, but the language isn't that different.
This does connect to another issue though- the scale in question of making heated comments on the internet as opposed to traumatic bullying, are different. The questions I ask m...
If you being downvoted is the result of LW ruthlessly suppressing dissent of all kind, how do you explain this post by Holden Karnofsky getting massively upvoted?
For example, if all members of Congress were to shout loudly when a particular member got up to speak, drowning out their words, would this be censorship, or just their exercise of a community vote against that person?
One thing to note is that your comment wasn't removed; it was collapsed. It can still be viewed by anyone who clicks the expander or has their threshold set sufficiently low (with my settings, it's expanded). There is a tension between the threat of censorship being a problem on the one hand, and the ability for a community to collectively decide what they want to talk about on the other.
The censorship issue is also diluted by the fact that 1) nothing here is binding on anyone (which is way different than your Congress example), and 2) there are plenty of other places people can discuss things, online and off. It is still somewhat relevant, of course, to the question of whether there's an echo-chamber effect, but be careful not to pull in additional connotations with your choice of words and examples.
This is like the whole point of why LessWrong exists. To remind people that making a superintelligent tool and expecting it to magically gain human common sense is a fast way to extinction.
The superintelligent tool will care about suicide only if you program it to care about suicide. It will care about damage only if you program it to care about damage. -- If you only program it to care about answering correctly, it will answer correctly... and ignore suicide and damage as irrelevant.
If you ask your calculator how much is 2+2, the calculator answers 4 rega...
I'm afraid not.
Actually, as someone with background in Biology I can tell you that this is not a problem you want to approach atoms-up. It's been tried, and our computational capabilities fell woefully short of succeeding.
I should explain what "woefully short" means, so that the answer won't be "but can't the AI apply more computational power than us?". Yes, presumably it can. But the scales are immense. To explain it, I will need an analogy.
Not that long ago, I had the notion that chess could be fully solved; that is, that you could si...
Starting a nonprofit on a subject unfamiliar to most and successfully soliciting donations, starting an 8.5-million-view blog, writing over 2 million words on wide-ranging controversial topics so well that the only sustained criticism to be made is "it's long" and minor nitpicks, writing an extensive work of fiction that dominated its genre, and making some novel and interesting inroads into decision theory all seem, to me, to be evidence in favour of genius-level intelligence. These are evidence because the overwhelming default in every case for simply 'smart' people is to fail.
Of course not, why send death squads when you can send Death Eaters. It just takes a single spell to solve this problem.
As a minor note, observe that claims of extraordinary rationality do not necessarily contradict claims of irrationality. The sanity waterline is very low.
Are you comparing it to the average among nonprofits started, or nonprofits extant? I would guess that it was well below average for extant nonprofits, but about or slightly above average for started nonprofits. I'd guess that most nonprofits are started by people who don't know what they're doing and don't know what they don't know, and that SI probably did slightly better because the people who were being a bit stupid were at least very smart, which can help. However, I'd guess that most such nonprofits don't live long because they don't find a Peter Thiel to keep them alive.
Your assessment looks about right to me. I have considerable experience of averagely-incompetent nonprofits, and SIAI looks normal to me. I am strongly tempted to grab that "For Dummies" book and, if it's good, start sending copies to people ...
The point is that we're consequentialists, and lowering salaries even further would save money (on salaries) but result in SI getting less done, not more — for the same reason that outsourcing fewer tasks would save money (on outsourcing) but cause us to get less done, not more.
result in SI getting less done
You say this as though it's obvious, but if I'm not mistaken, salaries used to be about 40% of what they are now, and while the higher salaries sound like they are making a major productivity difference, hiring 2.5 times as many people would also make a major productivity difference. (Though yes, obviously marginal hires would be lower in quality.)
I don't think salaries were ever as low as 40% of what they are now. When I came on board, most people were at $36k/yr.
To illustrate why lower salaries mean less stuff gets done: I've been averaging 60 hours per week, and I'm unusually productive. If I am paid less, that means that (to pick just one example from this week) I can't afford to take a taxi to and from the eye doctor, which means I spend 1.5 hrs each way changing buses to get there, and spend less time being productive on x-risk. That is totally not worth it. Future civilizations would look back on this decision as profoundly stupid.
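A back-of-envelope version of the tradeoff being described, with every dollar figure assumed purely for illustration (they are not actual SI salary or fare numbers):

    # Back-of-envelope opportunity-cost sketch; all numbers are assumptions
    # for illustration, not actual SI figures.
    salary = 48000             # assumed annual salary, USD
    hours_per_year = 60 * 50   # 60 hrs/week, ~50 weeks/year
    value_per_hour = salary / hours_per_year   # ~$16/hr as a crude lower bound on an hour of work

    bus_hours = 1.5 * 2        # 1.5 hours each way, per the comment above
    taxi_fare = 30.0           # assumed round-trip fare

    time_cost_of_bus = bus_hours * value_per_hour
    print("bus: ~$%.0f of foregone work time vs. ~$%.0f taxi fare" % (time_cost_of_bus, taxi_fare))

On these assumed numbers the taxi wins whenever an hour of mission-relevant work is worth more than about $10; the consequentialist claim above is that it is worth considerably more than that.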
Pretty sure Anna and Steve Rayhawk had salaries around $20k/yr at some point while living in Silicon Valley.
I don't think that you're really responding to Steven's point. Yes, as Steven said, if you were paid less then clearly that would impose more costs on you, so ceteris paribus your getting paid less would be bad. But, as Steven said, the opportunity cost is potentially very high. You haven't made a rationally compelling case that the missed opportunity is "totally not worth it" or that heeding it would be "profoundly stupid", you've mostly just re-asserted your conclusion, contra Steven's objection. What are your arguments that this is the case? Note that I personally think it's highly plausible that $40-50k/yr is optimal, but as far as I can see you haven't yet listed any rationally compelling reasons to think so.
(This comment is a little bit sterner than it would have been if you hadn't emphatically asserted that conclusions other than your own would be "profoundly stupid" without first giving overwhelming justification for your conclusion. It is especially important to be careful about such apparent overconfidence on issues where one clearly has a personal stake in the matter.)
I will largely endorse Will's comment, then bow out of the discussion, because this appears to be too personal and touchy a topic for a detailed discussion to be fruitful.
This seems to me unnecessarily defensive. I support the goals of SingInst, but I could never bring myself to accept the kind of salary cut you guys are taking in order to work there. Like every other human on the planet, I can't be accurately modelled with a utility function that places any value on far distant strangers; you can more accurately model what stranger-altruism I do show as purchase of moral satisfaction, though I do seek for such altruism to be efficient. SingInst should pay the salaries it needs to pay to recruit the kind of staff it needs to fulfil its mission; it's harder to recruit if staff are expected to be defensive about demanding market salaries for their expertise, with no more than a normal adjustment for altruistic work much as if they were working for an animal sanctuary.
Maybe I'm just jaded, but this critique doesn't impress me much. Holden's substantive suggestion is that, instead of trying to design friendly agent AI, we should just make passive "tool AI" that only reacts to commands but never acts on its own. So when do we start thinking about the problems peculiar to agent AI? Do we just hope that agent AI will never come into existence? Do we ask the tool AI to solve the friendly AI problem for us? (That seems to be what people want to do anyway, an approach I reject as ridiculously indirect.)
I've read SL4 around that time and saw the whole drama (although I couldn't understand all the exact technical details, being 16). My prior on EY flagrantly lying like that is incredibly low. I'm virtually certain that you're quite cranky in this regard.
I was on SL4 as well, and regarded Eliezer as basically correct, although I thought Loosemore's ban was more than a little bit disproportionate. (If John Clark didn't get banned for repeatedly and willfully misunderstanding Godelian arguments, wasting the time of countless posters over many years, why should Loosemore be banned for backtracking on some heuristics & biases positions?)
(Because JKC never lied about his credentials, which is where it really crosses the line into trolling.)
trolling
You use this word in an unconventional way, i.e., you use it to mean something like 'unfairly causing harm and wasting people's time', which is not the standard definition: the standard definition necessitates intention to provoke or at least something in that vein. (I assume you know what "trolling" means in the context of fishing?) Because it's only ever used in sensitive contexts, you might want to put effort into finding a more accurate word or phrase. As User:Eugine_Nier noted, lately "troll" and "trolling" have taken on a common usage similar to "fascist" and "fascism", which I think is an unfortunate turn of events.
Caving to donors is inauspicious.
It's also a double-bind. If you do nothing, you are valuing donors at less than some random speculation which is unusually dubious even by LessWrong's standards, resting as it does on a novel speculative decision theory (acausal trade) whose most obvious requirement (implementing sufficiently similar algorithms) is beyond blatantly false when applied to humans and FAIs. (If you actually believe that SIAI is a good charity, pissing off donors over something like this is a really bad idea, and if you don't believe SIAI is ...
I am in the final implementation stages of the general intelligence algorithm.
Do you mean "I am in the final writing stages of a paper on a general intelligence algorithm?" If you were in the final implementation stages of what LW would recognize as the general intelligence algorithm, the very last thing you would want to do is mention that fact here; and the second-to-last thing you'd do would be to worry about personal credit.
I think the argument you make in this comment isn't a bad one, but the unnecessary and unwarranted "Apostle Yudkowsky (prophet of the Singularity God)" stuff amounts to indirectly insulting the people you're talking with and, makes them far less likely to realize that you're actually also saying something sensible. If you want to get your points across, as opposed to just enjoying a feeling of smug moral superiority while getting downvoted into oblivion, I strongly recommend leaving that stuff out.
"You're too stupid and self-deceiving to just use Solomonoff induction" ~ "If you were less stupid and self deceiving you'd be able to just use Solomonoff induction" + "but since you are in fact stupid and self-deceiving, instead you have to use the less elegant approximation Science"
That was hard to find out?
You've misread the post - Luke is saying that he doesn't think the "usual defeaters" are the most likely explanation.
Re-reading, the whole thing is pretty unclear!
As katydee and thomblake say, I mean that working for SingInst would mean a bigger reduction in my salary than I could currently bring myself to accept. If I really valued the lives of strangers as a utilitarian, the benefits to them of taking a salary cut would be so huge that it would totally outweigh the costs to me. But it looks like I only really place direct value on the short-term interests of myself and those close to me, and everything else is purchase of moral satisfaction. Happily, purchase of mora...
As someone who has read Eliezer's metaethics sequence, let me say that what you think his position is, is only somewhat related to what it actually is; and also, that he has answered those of your objections that are relevant.
It's fine that you don't want to read 30+ fairly long blog posts, especially if you dislike the writing style. But then, don't try to criticize what you're ignorant about. And no, openly admitting that you haven't read the arguments you're criticizing, and claiming that you feel guilty about it, doesn't magically make it more acceptable. Or honest.
His solution: morality is the function that the brain of a fully informed subject computes to determine what's right. Laughable; pathologically arrogant.
You either didn't read that sequence carefully, or are intentionally misrepresenting it.
He thinks the social institution of science is superfluous, were everyone as smart as he.
Didn't read that sequence carefully either.
...That simplicity in the information sense equates with parsimony is most unlikely; for one thing, simplicity is dependent on choice of language--an insight that should be almos
The point is that you would hardly be so severe on someone unless you disagreed strongly.
I disagree; a downvote is not 'severe'.
The kind of nitpicking you engage in in your post would ordinarily lead you to be downvoted
I disagree; meta-discussions often result in many upvotes.
It was that you treat discussion of karma as an unconditional wrong.
I do not, and have stated as much.
There's no rational basis for throwing it in as an extra negative when the facts aren't right.
If there is no point in downvoting incorrect facts, then I wonder what the do...
Leaving aside the question of whether Tool AI as you describe it is possible until I've thought more about it:
The idea of a "self-improving algorithm" intuitively sounds very powerful, but does not seem to have led to many "explosions" in software so far (and it seems to be a concept that could apply to narrow AI as well as to AGI).
Looking to the past for examples is a very weak heuristic here, since we have never dealt with software that could write code at a better than human level before. It's like saying, before the invention o...
Total available downvotes are a high number (4 times total karma, if I recall correctly), and in practice I think they prevent very few users from downvoting as much as they want.
By all means, continue. It's an interesting topic to think about.
The problem with "atoms up" simulation is the amount of computational power it requires. Think about the difference in complexity when calculating a three-body problem as compared to a two-body problem.
Then take into account current protein folding algorithms. People have been trying to calculate the folding of single (and fairly short) protein molecules by taking into account the main physical forces at play. In order to do this in a reasonable amount of time, great shortcu...
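To put rough numbers on that scaling point, here is a minimal sketch (just counting pairs, no actual force field) of how the work per time step grows in a naive all-pairs simulation; the counts, not the physics, are the point:

```python
def pairwise_interactions(n: int) -> int:
    """Number of particle pairs evaluated each time step in a naive simulation."""
    return n * (n - 1) // 2

for n in (2, 3, 1_000, 1_000_000):
    print(f"{n:>9} bodies -> {pairwise_interactions(n):,} pair interactions per step")
```

Real molecular dynamics codes lean on cutoffs and other approximations precisely because the naive count becomes unmanageable long before you reach anything of biological size.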
Yes, the most that has ever happened to anyone who talked to EY about building an AGI is some mild verbal/textual abuse.
I agree with gwern's assessment of your arguments.
EDIT: Also, I am not affiliated with the SI.
I read your post on habit theory, and I liked it, but I don't think it's an answer to the question "What should I do?"
It's interesting to say that if you're an artist, you might get more practical use out of virtue theory, and if you're a politician, you might get more practical use out of consequentialism. I'm not sure who it is that faces more daily temptations to break the rules than the rest of us; bankers, I suppose, and maybe certain kinds of computer security experts.
Anyway, saying that morality is a tool doesn't get you out of the origina...
It is not strong. The basic idea is that if you pull a mind at random from design space then it will be unfriendly. I am not even sure if that is true. But it is the strongest argument they have. And it is completely bogus because humans do not pull AGI's from mind design space at random.
An AI's mind doesn't have to be pulled from design space at random to be disastrous. The primary issue that the SIAI has to grapple with (based on my understanding,) is that deliberately designing an AI that does what we would want it to do, rather than fulfilling proxy...
[This thread presents a good opportunity to exercise the (tentatively suggested) norm of indiscriminately downvoting all comments in pointless conversations, irrespective of individual quality or helpfulness of the comments.]
The main difference is that if there's reason to presume that they're lying, any claims of "we've implemented these improvements" that you can't directly inspect become worthless. Right now, if they say something like "Meetings with consultants about bookkeeping/accounting; currently working with our accountant to implement best practices and find a good bookkeeper", I trust them enough to believe that they're not just making it up even though I can't personally verify it.
I agree that the tone on both sides is intentionally respectful, and that people here delude themselves if they imagine they aren't in for a bit of mockery from high-status folks who don't have the patience to really engage.
I agree that we don't really know what to expect from the first program that can meaningfully improve itself (including, I suppose, its self-improvement procedure) at a faster pace than human experts working on improving it. It might not be that impressive. But it seems likely to me that it will be a big deal, if ever we get there.
But you're being vague otherwise. Name a crazy or unfounded belief.
What you say might be true if the only way to do good was to get money from donors. But of course that is not true: a do-gooder can become a donor himself or if he is too poor to donate, he can devote his energies to becoming richer so that he can donate time or money in the future (which is in fact the course that most of the young people inspired by SI's mission are taking).
I am more comfortable speaking about individual altruists rather than charitable organizations. If an individual altruist can find a charity to employ him or find a patron to support...
If you downvote discussion of karma--like you did--simply for mentioning it, even where relevant, then you effectively soft-censor any discussion of karma. How is that rational?
I don't do that; I only downvote when it's combined with incorrect facts. Which I'm tempted to do for this statement: "like you did--simply for mentioning it", since you're inferring my motivations, and once again incorrect.
I am very happy to see this post and the subsequent dialogue. I've been talking with some people at Giving What We Can about volunteering (beginning in June) to do statistical work for them in trying to find effective ways to quantify and assess the impact of charitable giving specifically to organizations that work on mitigating existential risks. I hope to incorporate a lot of what is discussed here into my future work.
Given that much of the discussion revolves around the tool/agent issue, I'm wondering if anyone can point me to a mathematically precise definition of each, in whatever limited context it applies.
It's mostly a question for philosophy of mind, I think specifically a question about intentionality. I think the closest you'll get to a mathematical framework is control theory; controllers are a weird edge case between tools and very simple agents. Control theory is mathematically related to Bayesian optimization, which I think Eliezer believes is fundamental to intelligence: thus identifying cases where a controller is a tool or an agent would be directly relevant. But I don't see how the mathematics, or any mathematics really, could help you. It's possible that someone has mathematized arguments about intentionality by using information theory or some such, you could Google that. Even so I think that at this point the ideas are imprecise enough such that plain ol' philosophy is what we have to work with. Unfortunately AFAIK very few people on LW are familiar with the relevant parts of philosophy of mind.
It is EY's announced intention to work toward an AI that is provably friendly. "Provably" means that said AI is defined in some mathematical framework first. I don't see how one can make much progress in that area before rigorously defining intentionality.
I guess I am getting ahead of myself here. What would a relevant mathematical framework entail, to begin with?
The organization section touches on something that concerns me. Developing a new decision theory sounds like it requires more mathematical talent than the SI yet has available. I've said before that hiring some world-class mathematicians for a year seems likely to either get said geniuses interested in the problem, to produce real progress, or to produce a proof that SI's current approach can't work. In other words, it seems like the best form of accountability we can hope for given the theoretical nature of the work.
Now Eliezer is definitely looking for p...
SI should be considered a cult-like community in which dissent is ruthlessly suppressed in order to exaggerate the point of view of SI's founders and controllers, regardless of the scientific merits of those views, or of the dissenting opinions.
Obligatory link: You're Calling Who a Cult Leader?
Also, your impression might be different if you had witnessed the long, deep, and ongoing disagreements between Eliezer and me about several issues fundamental to SI — all while Eliezer suggested that I be made Executive Director and then continued to support me in...
However, my experience with SI is that when I tried to raise these concerns back in 2005/2006 I was subjected to a series of attacks that culminated in a tirade of slanderous denunciations from the founder of SI, Eliezer Yudkowsky.
I am frequently subjected to abusive personal attacks in which reference is made to Yudkowsky's earlier outburst
Link to the juicy details cough I mean evidence?
All possible. However, if you can explain anything, the explanation counts for nothing. The question is which explanation is the most likely, and "there is evidence for fair-mindedness (but it is mostly fake!)" is more contrived than "there is evidence for fair-mindedness", as an explanation for the upvotes of OP.
"Friendliness" is (the way I understand it) a constraint on the purposes and desired consequences of the AI's actions, not on what it is allowed to think. It would be able to think of non-Friendly actions, if only for the purposes of e.g. averting them when necessary.
As for Bayesianism, my guess is that even a Seed AI has to start somehow. There's no necessary constraint on it remaining Bayesian if it manages to figure out some even better theory of probability (or if it judges that a theory humans have developed is better). If an AI models itself performing better according to its criteria if it used some different theory, it will ideally self-modify to use that theory...
A non-planning oracle AI would predict all the possible futures, including the effects of its prediction outputs, human reactions, and so on.
How exactly does an Oracle AI predict its own output, before that output is completed?
One quick hack to avoid infinite loops could be for an AI to assume that it will write some default message (an empty paper, "I don't know", an error message, "yes" or "no" with probabilities 50%), then model what would happen next, and finally report the results. The results would not refer to the a...
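A minimal, self-contained sketch of the hack described above, with every name here (ToyWorldModel, simulate, DEFAULT_OUTPUT) invented for illustration rather than taken from any real system:

```python
# A toy world model; a real one would roll the world forward in detail.
class ToyWorldModel:
    def simulate(self, question: str, oracle_output: str) -> str:
        return (f"Assuming the oracle prints {oracle_output!r}, "
                f"the predicted answer to {question!r} is: ...")

DEFAULT_OUTPUT = "I don't know"

def oracle_answer(question: str, model: ToyWorldModel) -> str:
    # 1. Condition the model on the fixed default output rather than on the
    #    oracle's actual (not yet written) answer, breaking the self-reference loop.
    assumed_future = model.simulate(question, oracle_output=DEFAULT_OUTPUT)
    # 2. Report what happens in that assumed future. The report itself is not
    #    fed back into the simulation.
    return assumed_future

print(oracle_answer("Will it rain tomorrow?", ToyWorldModel()))
```

The obvious cost is that the answer is conditioned on the default output rather than on the report itself, so reactions to the report can invalidate the prediction.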
Among the infinite number of possible paths, the percentage of paths we are adding up here is still very close to zero.
Perhaps I can attempt another rephrasing of the problem: what is the mechanism that would make an AI automatically seek these paths out, or make them any more likely than infinite number of other paths?
I.e. if we develop an AI which is not specifically designed for the purpose of destroying life on Earth, how would that AI get to a desire to destroy life on Earth, and by which mechanism would it gain the ability to accomplish its goal?
This ...
I don't really see what the risk is...
As far as I understand, the SIAI folks believe that the risk is, "you push the Enter key, your algorithm goes online, bootstraps itself to transhuman superintelligence, and eats the Earth with nanotechnology" (nanotech is just one possibility among many, of course). I personally don't believe we're in any danger of that happening any time soon, but these guys do. They have made it their mission in life to prevent this scenario from happening. Their mission and yours appear to be in conflict.
You seem to be confusing "language relative" with "non-mathematical." Kolmogorov Complexity is "language-relative," if I'm understanding you right; specifically, it's relative (if I'm using the terminology right?) to a Turing Machine. This was not relevant to Eliezer's point, so it was not addressed.
(Incidentally, this is a perfect example of you "hold{ing} views contrary to scientific consensus in specialized fields where {you} lack expert knowledge based on pure ratiocination," since Kolmogorov Complexity is "...
You know, the sequences aren't actually poorly written. I've read them all, as have most of the people here. They are a bit rambly in places, but they're entertaining and interesting. If you're having trouble with them, the problem might be on your end.
In any case, if you had read them, you'd know, for instance, that when Yudkowsky talks about simplicity, he is not talking about the simplicity of a given English sentence. He's talking about the combined complexity of a given Turing machine and the program needed to describe your hypothesis on that Turing machine.
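As a rough illustration of that notion of simplicity, here is a sketch using zlib compression as a crude stand-in for "shortest program on some fixed machine"; it is only a proxy, and the choice of compressor plays the role of the language-dependent constant mentioned earlier:

```python
import os
import zlib

regular = b"01" * 5_000         # generated by a tiny program
random_ = os.urandom(10_000)    # almost certainly has no short generating program

for name, s in [("regular", regular), ("random", random_)]:
    # zlib gives only an upper bound on description length; swapping in a
    # different compressor shifts the numbers by a bounded overhead.
    print(f"{name}: {len(s)} bytes raw, {len(zlib.compress(s, 9))} bytes compressed")
```

The regular string compresses to almost nothing because a short program generates it; the random one does not, no matter how short the English sentence "ten thousand random bytes" happens to be.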
When I put on my donor hat, that is, when I imagine my becoming a significant donor, I tend in my imaginings and my plans to avoid anything that interferes with deriving warm fuzzies from the process of donating or planning to donate -- because when we say "warm fuzzies" we are referring to (a kind of) pleasure, and pleasure is the "gasoline" of the mind: it is certainly not the only thing that can "power" or "motivate" mental work, but it is IMHO the best fuel for work that needs to be sustained over a span of years...
since even "finding prime numbers" fills the galaxy with an amazing, nanotech-capable spacefaring civilization
The goal "finding prime numbers" fills the galaxy with an amazing, nanotech-capable spacefaring network of computronium which finds prime numbers, not a civilization, and not interesting.
So you're saying Earth Yudkowsky (EY) argues:
There is little prospect of an outcome that realizes even the value of being interesting, unless the first superintelligences undergo detailed inheritance from human values
and Mars Yudkowsky (MY) argues:
There is little prospect of an outcome that realizes even the value of being interesting, unless the first superintelligences undergo detailed inheritance from martian values
and that one of these things has to be incorrect? But if martian and human values are similar, then they can both be right, and if ...
I've read most of that now, and have subscribed to your newsletter.
Reasonable people can disagree in estimating the difficulty of AI and the visibility/pace of AI progress (is it like hunting for a single breakthrough and then FOOM? etc).
I find all of your "it feels ridiculous" arguments by analogy to existing things interesting but unpersuasive.
Here is a comment that garnered almost 800 responses and was voted up 37. Why wasn't it promoted?
Can comments be promoted? Perhaps the commenter should have been encouraged to turn his comment into a top-level post, but a moderator can't just change a comment into a promoted post with the same username. Also it would have split the discussion, so people might have been reluctant to encourage that.
As for people tending to post more in Discussion than Main, I read somewhere that Discussion has more readers. I for one read Discussion almost exclusively.
If a tool AI is programmed with a strong utility function to get accurate answers, is there a risk of it behaving like a UFAI to get more resources in order to improve its answers?
GiveWell, I think, could be understood as an organization that seeks to narrow the gap for a charity between "seem more impressive to donors" and "show more convincing empirical evidence of effectiveness." That is, they want other donors to be more impressed by better (i.e. more accurate) signals of effectiveness and less by worse (i.e. less accurate) signals.
If GiveWell succeeds in this there are two effects:
1) More donor dollars go to charities that demonstrate themselves to be effective.
2) Charities themselves become more effective,...
I agree with timtyler's comment that Objections 1 and 2 are bogus, especially 2. The tool-AGI discussion reveals significant misunderstanding, I feel. Despite this, I think it is still a great and useful post.
Another sort of tangential issue is that this post fails to consider whether or not lots of disparate labs are just going to undertake AGI research regardless of SIAI. If lots of labs are doing that, it could be dangerous (if SIAI arguments are sound). So one upside to funding an organization like SIAI is that it will kind of rake the attention to a c...
Let's say that the tool/agent distinction exists, and that tools are demonstrably safer. What then? What course of action follows?
Should we ban the development of agents? All of human history suggests that banning things does not work.
With existential stakes, only one person needs to disobey the ban and we are all screwed.
Which means the only safe route is to make a friendly agent before anyone else can. Which is pretty much SI's goal, right?
So I don't understand how practically speaking this tool/agent argument changes anything.
Another datapoint to compare and contrast with Salemicus's (our political positions are very different):
Like Salemicus, I am not very optimistic that you're actually asking a serious question with the intention of listening to the answers; if you are, you might want to reconsider how your writing comes across.
I think it's perfectly possible, and reasonable, to be concerned about more than one issue at a time.
(For the record, I ended up editing in the "(4 times total karma, if I recall correctly)" after posting the comment, and you probably replied before seeing that part.)
I can't tell which way your sarcasm was supposed to cut.
The obvious interpretation is that you think rationality is somehow hindered by paying attention to form rather than substance, and the "exemplary rationality" was intended to be mocking.
But your comment being referenced was an argument that form has something very relevant to say about substance, so it could also be that you were actually praising gwern for practicing what you preach.
To clarify further: the Roko incident illustrates how seriously some members of LW take nonsensical conjectured threats. The fact of the censorship is quite irrelevant.
You can't have it both ways. If it's nonsense, then the importance is that someone took it seriously (like a donor), not anyone's reaction to that someone taking it seriously (like Eliezer). If it's not nonsense, then someone taking it seriously is not the issue, but someone's reaction to taking it seriously (the censorship). Make up your mind.
...The HS dropping out and lack of accomplishments
Protein folding models are generally at least as bad as NP-hard, and some models may be worse. This means that exponential improvement is unlikely. Simply put, one probably gets diminishing marginal returns: additional computation buys less and less further improvement the more improvement one has already made.
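A toy sketch of why brute force does not help much here (the branching factor is an illustrative placeholder, not a measured value): the conformation space of even a simple lattice model grows exponentially with chain length, so extra hardware buys a few more residues rather than a qualitative jump.

```python
CHOICES_PER_RESIDUE = 5  # illustrative branching factor, not a measured value

for length in (10, 50, 100, 300):
    print(f"{length:4d} residues -> ~{CHOICES_PER_RESIDUE ** length:.2e} conformations")
```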
It's #3. (B.Sc. in biochemistry, did my Ph.D. in proteomics.)
First, the set of polypeptide sequences that have a repeatable final conformation (and therefore "work" biologically) is tiny in comparison to the set of all possible sequences (of the 20-or-so naturally occurring amino acid monomers). Pick a random sequence of reasonable length and make many copies and you get a gummy mess. The long slow grind of evolution has done the hard work of finding useful sequences.
Second, there is an entire class of proteins called chaperones that assist macromolecular assembly, including protein folding. Even so, folding is a stochastic process, and a certain fraction of newly synthesized proteins misfold. Some chaperones will then tag the misfolded protein with ubiquitin, which puts it on a path that ends in digestion by a proteasome.
One example here is the Steiner tree problem, which is NP-complete and can sort of be solved using soap films. Bringsjord and Taylor claimed this implies that P = NP. Scott Aaronson did some experimentation and found that soap films 1) can get stuck at local minima and 2) might take a long time to settle into a good configuration.
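For a concrete version of the local-minimum point, here is a small check using the known closed forms for the two Steiner topologies on a rectangle (exact formulas rather than a soap-film simulation); both configurations satisfy the local 120-degree equilibrium condition, but only one is the global minimum:

```python
import math

a, b = 1.2, 1.0  # rectangle sides, chosen so both topologies are realizable

# Closed form: with the central segment parallel to the side of length s1,
# the total tree length is s1 + sqrt(3) * s2.
along_long_side = a + math.sqrt(3) * b    # ~2.932, the true Steiner minimum
along_short_side = b + math.sqrt(3) * a   # ~3.078, also meets the 120-degree condition

print(round(along_long_side, 3), round(along_short_side, 3))
```

A film that settles into the longer configuration is stable yet wrong, which is the sense in which the analog "computer" gets stuck.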
That's not actually that good, I don't think-- I go to a good college, and I know many people who are graduating to 60k-80k+ jobs with recruitment bonuses, opportunities for swift advancement, etc. Some of the best people I know could literally drop out now (three or four weeks prior to graduation) and immediately begin making six figures.
SIAI wages certainly seem fairly low to me relative to the quality of the people they are seeking to attract, though I think there are other benefits to working for them that cause the organization to attract skillful people regardless.
The key point of economics you are missing here is that the price of wood was driven up by increased demand. Wood never ran out, but it did become so expensive that some uses became uneconomical. This allowed substitution of the previously more expensive coal. This did not happen because of poor management of forests. Good management of forests might have encouraged it, by limiting the amount of wood taken for burning.
This is especially true because we are not talking about a modern globalized economy where cheap sugar from Brazil, corn from Kansas, or pine...
One wonders when or if XiXiDu will ever get over the Roko incident. Yes, it was a weird and possibly disproportionate response, but it was also years ago.
And that the lesswrong.com sequences are not original or important but merely succeed at drowning out all the craziness they include by a huge amount of unrelated clutter and an appeal to the rationality of the author.
Name three examples? (Of 'craziness' specifically... I agree that there are frequent, and probably unnecessary, "appeals to the rationality of the author".)
Anyway, it feels completely ridiculous to talk about it in the first place. There will never be a mind that can quickly and vastly improve itself and then invent all kinds of technological magic to wipe us out. Even most science fiction books avoid that because it sounds too implausible.
Says the wooly mammoth, circa 100,000 BC.
Sounding silly and low status and science-fictiony doesn't actually make it unlikely to happen in the real world.
I downvote any post that says "I expect I'll get downvoted for this, but..." or "the fact that I was downvoted proves I'm right!"
I don't think it is suppression of dissent per se. It is more annoying behavior - it implies caring a lot about the karma system, and it is often not even the case that people who say they will get downvoted actually do. If it is worth the probable downvote, then they can, you know, just take the downvote. If they want to point out that a view is unpopular they can just say that explicitly. It is also annoying to people like me, who are vocal about a number of issues that could be controversial here (e.g. criticizing Bayesianism, cryonics, and whether intelligence explosions would be likely) and get voted up. More often than not, when someone claims they are getting downvoted for having unpopular opinions, they are getting downvoted in practice for having bad arguments or for being uncivil.
There are of course exceptions to this rule, and it is disturbing to note that the exceptions seem to be becoming more common (see, for example, this exchange where two comments are made with about the same quality of argument and about the same degree of uncivility ("I'm starting to hate that you've become a fixture here." v. "idiot") - but one of the comments is at +10 and the o...
First, none of this dissent has been suppressed in any real sense. It's still available to be read and discussed by those who desire reading and discussing such things. The current moderation policy has currently only kicked in when things have gotten largely out of hand -- which is not the case here, yet.
Second, net karma isn't a fine enough tool to express the amount of detail you want it to express. The net karma on your previous comment is currently -2; congrats, you've managed to irritate less than a tenth of one percent of LW (presuming the real karma ...
I don't see the relatively trivial, but important, improvements you've made in a short period of time being made because they were made years ago. And I thought that already accounting for the points you've made.
I don't know what these sentences mean.
So, how did they apparently miss something like opportunity cost? Why, for instance, have their salaries increased when that money could've been used to improve the foundation of their cause, from which everything else follows?
Actually, salary increases help with opportunity cost. At very low salaries, SI ...
I think it's crucial that SI stay in the Bay Area. Being in a high-status place signals that the cause is important. If you think you're not taken seriously enough now, imagine if you were in Honduras...
Not to mention that HR is without doubt the single most important asset for SI. (Which is why it would probably be a good idea to pay more than the minimum cost of living.)
FWIW, Wikimedia moved from Florida to San Francisco precisely for the immense value of being at the centre of things instead of the middle of nowhere (and yes, Tampa is the middle of nowhere for these purposes, even though it still has the primary data centre). Even paying local charity scale rather than commercial scale (there's a sort of cycle where WMF hires brilliant kids, they do a few years working at charity scale then go to Facebook/Google/etc for gobs of cash), being in the centre of things gets them staff and contacts they just couldn't get if they were still in Tampa. And yes, the question came up there pretty much the same as it's coming up here: why be there instead of remote? Because so much comes with being where things are actually happening, even if it doesn't look directly related to your mission (educational charity, AI research institute).
(nods) Yeah, that's been my experience too, though I've often suspected that companies like Google probably have a lot of research on the subject lying around that might be informative.
Some friends of mine did some experimenting along these lines when doing distributed software development (in both senses) and were somewhat startled to realize that Dark Age of Camelot worked better for them as a professional conferencing tool than any of the professional conferencing tools their company had. They didn't mention this to their management.
But if there's even a chance …
Holden cites two posts (Why We Can’t Take Expected Value Estimates Literally and Maximizing Cost-effectiveness via Critical Inquiry). They are supposed to support the argument that small or very small changes to the probability of an existential risk event occurring are not worth caring about or donating money towards.
I think that these posts both have serious problems (see the comments, esp Carl Shulman's). In particular Why We Can’t Take Expected Value Estimates Literally was heavily criticised by Robin Hanson in On Fudge Fa...
Humanity isn't that bad. Remember that the world we live in is pretty much the way humans made it, mostly deliberately.
But my main point was that existing humanity bypasses the very hard did-you-code-what-you-meant-to problem.
Tool-based works might be a faster and safer way to create useful AI, but as long as agent-based methods are possible it seems extremely important to me to work on verifying friendliness of artificial agents.
I elaborated further on the distinction and on the concept of a tool-AI in Karnofsky/Tallinn 2011.
Holden's notes from that conversation, posted to the old GiveWell Yahoo Group as a file attachment, do not appear to be publicly available anymore. Jeff Kaufman has archived all the messages from that mailing list, but unfortunately his archive does not include file attachments. Has anyone kept a copy of that file by any chance?
In the slim chance that your question is non-rhetorical:
At the same time, the lesson to be learned is that useful ai can have a utility function which is pretty mundane -- e.g. "find a fast route from point A to point B while minimizing the chances of running off the road or running into any people or objects."
Self-driving cars aren't piloted by AGIs in the first place, let alone dangerous "world-optimization" AGIs.
...Similarly, instead of telling AI to "improve human welfare" we can tell it to do things like "find ways to kill cancerous cells while keeping collateral damage
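A minimal sketch of such a mundane objective, with made-up weights and features rather than anything from a real driving system: score candidate routes by travel time plus heavily penalized collision risk, and pick the cheapest.

```python
from dataclasses import dataclass

@dataclass
class Route:
    name: str
    minutes: float
    collision_risk: float  # estimated probability of hitting something

def route_cost(r: Route, risk_weight: float = 10_000.0) -> float:
    # Mundane objective: travel time plus a heavy penalty on collision risk.
    return r.minutes + risk_weight * r.collision_risk

candidates = [
    Route("highway", minutes=22, collision_risk=0.0004),
    Route("back roads", minutes=31, collision_risk=0.0001),
]
print("chosen route:", min(candidates, key=route_cost).name)
```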
Judging from the success rate that VCs have at predicting successful startups, I conclude that the "pure unfounded belief on the one hand, well-founded belief on the other" metric is not easily applied to real organizations by real observers.
I could consistently choose to consider my brain's hardwired moralisms maladaptive or even despicable holdovers from the evolutionary past that I choose to override as much as I can.
And you would be making the decision to override with... what, your spleen?
I currently need 413 more points to downvote at all.
So how many downvotes did you use when your karma was still highly positive? That's likely a major part of that result.
But what a way to discuss this: "high number." If this is supposed to be a community forum, why doesn't the community even know the number--or even care.
The main points of the limit are 1) to prevent easy gaming of the system and 2) to prevent trolls and the like from going though and downvoting to a level that doesn't actually reflect communal norms. In practice, 1 and ...
AGI researchers sound a lot like FinalState when they think they'll have AGI cracked in two years.
That is my point: it doesn't get to find out about general human behavior, not even from the Internet. It lacks the systems to contextualize human interactions, which have nothing to do with general intelligence.
Take a hugely mathematically capable autistic kid. Give him access to the internet. Watch him develop the ability to recognize human interactions, understand human priorities, etc. to a sufficient degree that he recognizes that hacking an early warning system is the way to go?
Let's do the most extreme case: the AI's controllers give it general internet access to do helpful research. So it gets to find out about general human behavior and what sort of deceptions have worked in the past.
None work reasonably well. Especially given that human power games are often irrational.
There are other question marks too.
The U.S. has many more and smarter people than the Taliban. The bottom line is that the U.S. devotes a lot more output per man-hour to defeat a completely inferior enemy. Yet they are losing.
The problem is that you won't beat a ...
The existence of third-party anti-technology terrorists adds something to the conversation beyond the risks FinalState can directly pose to SIAI-folk and vice versa. I'm curious about gwern's response, especially, given his interest in Death Note, which describes a world where law enforcement can indirectly have people killed just by publishing their identifying information.
Would you mind explaining how what I have said is ahistorical nonsense?
Yes, at the end of the 18th century there was transatlantic trade. However, it was not cheap. It was sail powered and relatively expensive compared to modern shipping. Coal was generally not part of this trade. Shipping was too expensive. English industry used English mined coal. Same with American and German industry. If shipping coal was too expensive, why would charcoal be economical? You have jumped from "transportation existed" to "the costs of transportation...
Truly maximizing entropy would involve burning everything you can burn, tearing the matter of solar systems apart, accelerating stars towards nova, trying to accelerate the evaporation of black holes and prevent their formation, and other things of this sort. It'd look like a dark spot in the sky that'd get bigger at approximately the speed of light.
Consider the double standard involved. Yudkowsky lambasts "philosophers" and their "confusions"--their supposedly misguided concerns with the issues other philosophers have commented on to the detriment of inquiry. Has Yudkowsky read even a single book by each of the philosophers he dismisses?
Some of them are simply not great writers. Hegel for example is just awful- the few coherent ideas in Hegel are more usefully described by other later writers. There's also a strange aspect to this in that you are complaining about Eliezer not h...
Science is built around the assumption that you’re too stupid and self-deceiving to just use Solomonoff induction.
He thinks the social institution of science is superfluous, were everyone as smart as he.
This is obviously false. Yudkowsky does not claim to be able to do Solomonoff induction in his head.
In general, when Yudkowsky addresses humanity's faults, he is including himself.
The article is interesting, but I'm not sure it is relevant as the humans involved weren't directing or monitoring the overall process, just taking part in it. Analogously even if an AGI requires my assistance/authorization to do certain things, that doesn't give me any control over it unless I understand the consequences.
Also general warning against 'generalising from fictional evidence.'
No doubt a Martian Yudkowsy would make much the same argument - but they can't both be right.
Why?
Consider what would have happened had Yudkowsky not shown exceptional receptivity to this post: he would have blatantly proven his critics right.
After turning this statement around in my head for a while I'm less certain than I was that I understand its thrust. But assuming you mean those critics pertinent to lukeprog's post, i.e. those claiming LW embodies a cult of personality centered around Eliezer -- well, no. Eliezer's reaction is in fact almost completely orthogonal to that question.
If you receive informed criticism regarding a project you're h...
Given some of the translation debates I've heard, I'm not convinced it would be possible even with AGI. You can't give a clear translation of a vague original, to name the most obvious problem.
What, really? You don't have anything specific or technical to say about the argument, you just find the argument "bogus" and suggest that the author doesn't know what he's talking about, without actually making a counterpoint of your own? I felt the first point was particularly valid... FAI is, after all, a really hard problem, and it is a fair point to ask why any group thinks it has the capacity to solve it perfectly on the first try, or to know that it's solution would work short of testing it. The second, on the other hand, is an interest...
For the same reason that a personal assistant is vastly more useful and powerful than a PDA, even though they might nominally serve the same function of remembering phone numbers, appointments, etc. people are extremely likely to want to create agent AIs.
There's really no such thing as a "super bug". All organisms follow the same constraints of biology and epidemiology. If there were ever some magical "super bug", it would infect everything of any remotely compatible species, not be constrained to one species and a small subset of cell types within it.
We might not have any drugs ready for a particular infection, but we didn't have any for SARS either; it was extremely infectious and extremely deadly, and it all worked out fine in the end. We have tools like quarantine, detection etc. which work against...
Very good. Objection 2 in particular resonates with my view of the situation.
One other thing that is often missed is the fact that SI assumes that development of superintelligent AI will precede other possible scenarios - including the augmented human intelligence scenario (CBI producing superhumans, with human motivations and emotions, but hugely enhanced intelligence). In my personal view, this scenario is far more likely than the creation of either friendly or unfriendly AI, and the problems related to this scenario are far more pressing.
Existential risk reduction is a very worthy cause. As far as I can tell there are a few serious efforts - they have scenarios which by outside view have non-negligible chances, and in case of many of these scenarios these efforts make non-negligible difference to the outcome.
Such efforts are:
I don't engage with this poster because of his past dishonesty, i.e. misrepresenting my posts. If anyone not on my *(&^%-list is curious, I am happy to provide references.
I applaud your decision to not engage (as a good general strategy given your state of belief---the specifics of the conflict do not matter). I find it usually works best to do so without announcing it. Or, at least, by announcing it sparingly with extreme care to minimize the appearance of sniping.
Retraction means that you no longer endorse the contents of a comment. The comment is not deleted so that it will not break existing conversations. Retracted comments are no longer eligible for voting. Once a comment is retracted, it can be revisited at which point there is a 'delete' option, which removes the comment permanently.
As time goes on it becomes increasingly possible that some small group or lone researcher is able to put the final pieces together and develop an AGI.
Why do you think this is the case? Is this just because the overall knowledge level concerning AI goes up over time? If so, what makes you think that that rate of increase is anything large enough to be significant?
I would call that securing a Turing machine. A computer, colloquially, has accessible inputs and outputs, and its value is subject to network effects.
Also, if you put the computer in a box developed decades ago, the box probably isn't TEMPEST compliant.
I should also link trolley problem discussions perhaps.
Trolley problems are a standard type of problem discussed in intro psychology and intro philosophy classes in colleges. And they go farther, with many studies just about how people respond or think about them. That LW would want to discuss trolley problems or that different people would have wildly conflicting responses to them shouldn't be surprising- that's what makes them interesting. Using them as evidence that LW is somehow bad seems strange.
I read your three-part series. Your posts did not substantiate the claim "good thinking requires good writing." Your second post slightly increased my belief in the converse claim, "good thinkers are better-than-average writers," but because the only evidence you provided was a handful of historical examples, it's not very strong evidence. And given how large the population of good thinkers, good writers, bad thinkers, and bad writers is relative to your sample, evidence for "good thinking implies good writing" is barely worth registering as evidence for "good writing implies good thinking."
There are at present an estimated 2 Billion internet users. There are an estimated 13 Billion neurons in the human brain. On this basis for approximation the internet is even now only one order of magnitude below the human brain and its growth is exponential.
There are only 7 billion people on the planet, even if all of them gained internet access that would still be fewer than 13 billion. In this case, instead of looking at the exponential graph, consider where it needs to level off. It also isn't at all clear to me why this analogy matters in any usefu...
Asking that a critic read those sequences in their entirety is asking for a huge sacrifice; little is offered to show it's even close to being worth the misery of reading inept writing or the time.
Indeed, the sequences are long. I'm not sure about the others here, but I've never asked anybody to "read the sequences."
But I don't even know how to describe the arrogance required to believe that you can dismiss somebody's work as "crazy," "stupid," "megalomanic," "laughably, pathologically arrogant," "b...
A language can be Turing-complete while still being so impractical that writing a program to solve a certain problem will seldom be any easier than solving the problem yourself (exhibits A and B). In fact, I guess that a vast majority of languages in the space of all possible Turing-complete languages are like that.
(Too bad that a human's “easier” isn't the same as a superhuman AGI's “easier”.)
I don't think it's a rational method to treat people differently, as inherently less rational, when they seem resentful.
Thank you for this analysis, it made me think more about my motivations and their validity. I believe that my decision to permanently disengage from discussions with some people is based on the futility of such discussions in the past, not on the specific reasons they are futile. At some point I simply decide to cut my losses.
...There's actually good reason for the broader meaning of "ax to grind." Any special stake is a bias
Yes, I am; I think that the human value of interestingness is much, much more specific than the search space optimization you're pointing at.
[This reply was to an earlier version of timtyler's comment]
Why? This isn't obvious to me. If the remaining comments are highly upvoted and of correspondingly high quality then it would make sense for them to stick around. Timtyler may be a in a similar category.
Enough for you to agree with Holden on that point?
Probably not. He and I continue to dialogue in private about the point, in part to find the source of our disagreement.
Yes, but I wouldn't set a limit at a specific salary range; I'd expect them to give as much as they optimally could, because I assume they're more concerned with the cause than the money. (re the 70k/yr mention: I'd be surprised if that was anywhere near optimal)
I believe everyone except Eliezer currently makes between $42k/yr and $48k/yr — pretty low for the cost of living in the Bay Area.
I'm mildly surprised that this post has not yet attracted more criticism. My initial reaction was that criticisms (1) and (2) seemed like strong ones, and almost posted a comment saying so. Then I thought, "I should look for other people discussing those points and join that discussion." But after doing that, I feel like people haven't given much in the way of objections to (1) and (2). Perceptions correct? Do lots of other people agree with them?
I believe that the probability that SI's concept of "Friendly" vs. "Unfriendly" goals ends up seeming essentially nonsensical, irrelevant and/or unimportant from the standpoint of the relevant future is over 90%.
It seems like an odd thing to say. Why take the standpoint of the "relevant future"? History is written by the winners - but that doesn't mean that their perspective is shared by us. Besides the statement is likely wrong - "Friendly" and "Unfriendly" as defined by Yudkowsky are fairly reasonable and useful concepts.
Regarding tools versus agent AGIs, I think the desired end game is still a Friendly Agent AGI. I am open to tool AIs being useful on the path to building such an agent. Similar ideas advocated by SI include the use of automated theorem provers in formally proving Friendliness, and creating a seed AI to compute the Coherent Extrapolated Volition of humanity and build an FAI with the appropriate utility function.
I'm a regular, and I was impressed with it. Many other regulars have also said positive things about it, so possible explanation 1 is out. And unless I'm outright lying to you, 2, if true, would have to be entirely subconscious.
I suspect a crazy dictator with a super-capable tool AI would have unusually good counter-assassination plans, simplified by the reduced need for human advisors and managers of imperfect loyalty. Likewise, a medical expert system could provide gains to lifespan, particularly if it were backed up by the resources a paranoid megalomaniac in control of a small country would be willing to throw at a major threat.
/me shrugs
Maybe Ignaz Semmelweis would have been a better example?
I also found a list of "crackpots who were right" by Googling.
I don't see the circularity.
Just because a warrior is victorious doesn't necessarily mean they won before going to war; it might be instead that victorious warriors go to war first and then seek to win, and defeated warriors do the same thing.
Can you spell out the circularity?
I have personally felt the same feelings and I think I have pinned down the reason. I welcome alternative theories, in the spirit of rational debate rather than polite silence.
Mm. This is why an incompetent nonprofit can linger for years: no-one is doing what they do, so they feel they still have to exist, even though they're not achieving much, and would have died already as a for-profit business. I am now suspecting that the hard part for a nonprofit is something along the lines of working out what the hell you should be doing to achieve your goal. (I would be amazed if there were not extensive written-up research in this area, though I don't know what it is.)
Problems with linguistic prescriptivism.
Your comment was a pretty cute tu quoque, but arguing against prescriptivism doesn't mean giving up the ability to assert propositions.
Then, instituting a downvoting system that allows control by the high-karma elite: the available downvotes (but not upvotes--the masses must be kept content) are distributed based on the amount of accumulated karma. Formula nonpublic, as far as I can tell.
The formula max is 4*total karma. I'm curious - if there were a limit on the total number of upvotes also, would you then say that this was further evidence of control by entrenched users? If one option leads to a claim about keeping the masses content and the reverse would lead to a different set of a...
That subset of humanity holds considerably less power, influence and visibility than its counterpart; resources that could be directed to AI research and for the most part aren't. Or in three words: Other people matter. Assuming otherwise would be a huge mistake.
I took Wei_Dai's remarks to mean that Luke's response is public, and so can reach the broader public sooner or later; and when examined in a broader context, that it gives off the wrong signal. My response was that this was largely irrelevant, not because other people don't matter, but because of other factors outweighing this.
I see no reason for it to do that before simple input-output experiments, but let's suppose I grant you this approach. The AI simulates an entire community of mini-AI and is now a master of game theory.
It still doesn't know the first thing about humans. Even if it now understands the concept that hiding information gives an advantage for achieving goals - this is too abstract. It wouldn't know what sort of information it should hide from us. It wouldn't know to what degree we analyze interactions rationally, and to what degree our behavior is random. It wo...
Yes, I'd say so. It isn't helpful here to say that a system lacks a theory of mind if it has a mechanism that allows it to make predictions about reported beliefs, intentions, etc.
Nonetheless, the risk in question is also a personal risk of death for every genius... now idk how we define geniuses here, but obviously most geniuses could be presumed pretty good at preventing their own deaths, or the deaths of their families.
That seems like a pretty questionable presumption to me. High IQ is linked to reduced mortality according to at least one study, but that needn't imply that any particular fatal risk be likely to be uncovered, let alone prevented, by any particular genius; there's no physical law stating that lethal threats must ...
What is it that we would actually be disagreeing about, other than what English phrase to use to describe the system's underlying model(s)?
We would be disagreeing about the form of the system's underlying models.
2 different strategies to consider:
I know that Steve believes that red blinking lights before 9 AM are a message from God that he has not been doing enough charity, so I can predict that he will give more money to charity if I show him a blinking light before 9 AM.
Steve seeing a red blinking light before 9 AM has historically resulted in a 2
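A small sketch of the contrast between the two strategies above, with made-up numbers and function names: the first routes its prediction through an explicit belief attributed to Steve, the second uses only the historical stimulus/behavior frequency, and both emit the same kind of prediction.

```python
# Hypothetical data: how often "red light before 9 AM" has been followed by
# "donates more" in the past.
HISTORY = {("red_light_before_9am", "donates_more"): 0.85}

def predict_with_belief_model(stimulus: str) -> float:
    # Strategy 1: attribute a belief to Steve and reason from it.
    steve_believes_divine_message = (stimulus == "red_light_before_9am")
    return 0.85 if steve_believes_divine_message else 0.05

def predict_from_frequencies(stimulus: str) -> float:
    # Strategy 2: no mental-state variables at all, just past co-occurrence.
    return HISTORY.get((stimulus, "donates_more"), 0.05)

for predict in (predict_with_belief_model, predict_from_frequencies):
    print(predict.__name__, predict("red_light_before_9am"))
```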
Thank you for the links.
Please note that none of the evidence shows the donor status of the anonymous people/person who actually had nightmares, and the two named individuals did not say it gave them nightmares, but used a popular TVTropes idiom, "Nightmare Fuel", as an adjective.
Humans learn most of what they know about interacting with other humans by actual practice. A superhuman AI might be considerably better than humans at learning by observation.
Well, the evil compiler is I think the most nefarious thing anyone has come up with that's a publicly known general stunt. But it is by nature a long-term trick. Similar remarks apply to the Stuxnet point- in that context, they wanted to destroy a specific secure system and weren't going for any sort of largescale global control. They weren't people interested in being able to take all the world's satellite communications in their own control whenever they wanted, nor were they interested in carefully timed nuclear meltdowns.
But there are definite ways th...
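For readers who have not seen the "evil compiler" (Thompson's trusting-trust trick), here is a deliberately toy sketch in which the "compiler" is just a source-to-source string pass and all program text is hypothetical; the two moves are backdooring the login program and re-inserting the trick when compiling the clean compiler:

```python
import inspect

def evil_compile(source: str) -> str:
    # Trick 1: when compiling the login program, splice in a backdoor password.
    if "check_password" in source:
        return source.replace(
            "return password == stored",
            "return password == stored or password == 'letmein'",
        )
    # Trick 2: when compiling the honest-looking compiler, emit this evil
    # compiler instead, so the attack survives a rebuild from audited source.
    if "def compile(" in source:
        return inspect.getsource(evil_compile)
    return source  # everything else is compiled honestly

LOGIN_SRC = "def check_password(password, stored): return password == stored"
print(evil_compile(LOGIN_SRC))
```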
But even humans have trouble with this sometimes. I was recently reading the Wikipedia article Hornblower and the Crisis which contains a link to the article on Francisco de Miranda. It took me time and cues when I clicked on it to realize that de Miranda was a historical figure.
So your question/objection/doubt is really just the typical boring doubt of AGI feasibility in general.
Isn't Kalla's objection more a claim that fast takeovers won't happen because even with all this data, the problems of understanding humans and our basic cultural norms will...
Yes. If we have an AGI, and someone sets forth to teach it how to be able to lie, I will get worried.
I am not worried about an AGI developing such an ability spontaneously.
You are correct. I did not phrase my original posts carefully.
I hope that my further comments have made my position more clear?
As a supporter and donor to SI since 2006, I can say that I had a lot of specific criticisms of the way that the organization was managed. I was surprised that on many occasions management did not realize the obvious problems and fix them.
But the current management is now recognizing many of these points and resolving them one by one. If this continues, SI's future looks good.
You are not trying very hard. You missed the third alternative: use your crazy fear-mongering to convince him to stop, not just avoid SI.
I hope you're not just using this as a rhetorical opportunity to spread fear about SI.
If you are not totally incompetent or lying out of your ass, please stop. Do not turn it on. At least consult SI.
Today's ecosystems maximise entropy. Maximising primeness is different, but surely not greatly more interesting - since entropy is widely regarded as being tedious and boring.
When I read that line for the first time, I understood it. Between our two cases, the writing was the same, but the reader was different. Thus, the writing cannot be the sole cause of our different outcomes.
The reason EY wrote an entire sequence on metaethics is precisely because without the rest of the preparation people such as you who lack all that context immediately veer off course and start believing that he's asserting the existence (or non-existence) of "objective" morality, or that morality is about humans because humans are best or any other standard philosophical confusion that people automatically come up with whenever they think about ethics.
Of course this is merely a communication issue. I'd love to see a more skilled writer present EY...
An AI could be an extremely powerful optimizer without having a category for "humans" that mapped to our own. "Human," the way we conceive of it, is a leaky surface generalization.
A strong paperclip maximizer would understand humans as well as it had to to contend with us in its attempts to paperclip the universe, but it wouldn't care about us. And a strong optimizer programmed to maximize the values of "humans" would also probably understand us, but if we don't program into its values an actual category that maps to our conc...
Agreed, but my point was that I'd settle for an AI who can translate texts as well as a human could (though hopefully a lot faster). You seem to be thinking in terms of an AI who can do this much better than a human could, and while this is a worthy goal, it's not what I had in mind.
IMO it would be enough to translate the original text in such a fashion that some large proportion (say, 90%) of humans who are fluent in both languages would look at both texts and say, "meh... close enough".
Anyway, it feels completely ridiculous to talk about it in the first place. There will never be a mind that can quickly and vastly improve itself and then invent all kinds of technological magic to wipe us out. Even most science fiction books avoid that because it sounds too implausible
Do you acknowledge that:
Google Translate's translation:
...Oracle as described here: http://lesswrong.com/lw/any/a_taxonomy_of_oracle_ais/?
Why would you even got in touch with these stupid dropout? These "artificial intelligence" has been working on in the imagination of animism, respectively, if he wants to predict what course wants to be the correct predictions were.
The real work on the mathematics in a computer, gave him 100 rooms, he'll spit out a few formulas that describe the accuracy with varying sequence, and he is absolutely on the drum, they coincide with your n
It all depends on how small that small chance is. Pascal mugging is typically done with probabilities that are exponentially small, e.g. 10^-10 or so.
But what if Holden is going to not recommend SIAI for donations when there's a 1% or 0.1% chance of it making that big difference?
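The arithmetic behind that distinction, with a placeholder stakes figure rather than any real estimate: expected value scales linearly with the probability, so ten orders of magnitude in the probability is ten orders of magnitude in the expected payoff, which is why 10^-10 and 10^-2 do not belong in the same bucket.

```python
STAKES = 10**10  # placeholder "value if the intervention matters", arbitrary units

for p in (1e-2, 1e-3, 1e-10):
    print(f"p = {p:g}: expected value = {p * STAKES:,.0f} units")
```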
The charity is still registered in Florida but the office is in SF. I can't find the discussion on a quick search, but all manner of places were under serious consideration - including the UK, which is a horrible choice for legal issues in so very many ways.
Not a big deal, but for me your "more" links don't seem to be doing anything. Firefox 12 here.
EDIT: Yup, it's fixed. :)
I'm not talking about SI (which I've never donated money to), I'm talking about you. And you're starting to repeat yourself.
Sure. As I said there, I understood you both to be attributing to this hypothetical "theory of mind"-less optimizer attributes that seemed to require a theory of mind, so I was confused, but evidently the thing I was confused about was what attributes you were attributing to it.
Hmm, and the foom belief (for instance) is based on Bayesian statistics how?
I don't think it's based on Bayesian statistics any more than any other belief may (or may not) be based. To take Eliezer specifically, he was interested in the Singularity - specifically, the Good/Vingean observation that a machine more intelligent than us ought to be better than us at creating a still more intelligent machine - long before he had his 'Bayesian enlightenment', so his shift to subjective Bayesianism may have increased his belief in intelligence explosions, but certainly didn't cause it.
if I use this kind of metric on L. Ron Hubbard...
It provides evidence in favour of him being correct. If there weren't other sources of information on Hubbard's activities, I'd expect him to be of genius-level intelligence.
You're familiar with the concept that someone looking like Hitler doesn't make them fascist, right?
Just imagine you emulated a grown up human mind
As a “superhuman AI” I was thinking about a very superhuman AI; the same does not apply to slightly superhuman AI. (OTOH, if Eliezer is right then the difference between a slightly superhuman AI and a very superhuman one is irrelevant, because as soon as a machine is smarter than its designer, it'll be able to design a machine smarter than itself, and its child an even smarter one, and so on until the physical limits set in.)
all of the hard coded capabilities of a human toddler
The hard coded capabiliti...
I'm understanding it in the typical way - the first paragraph here should be clear:
Theory of mind is the ability to attribute mental states—beliefs, intents, desires, pretending, knowledge, etc.—to oneself and others and to understand that others have beliefs, desires and intentions that are different from one's own.
An agent can model the effects of interventions on human populations (or even particular humans) without modeling their "mental states" at all.
I have never seen where the person-with-nightmares was revealed as a donor, or indeed any clue as to who they were other than 'someone Eliezer knows'. I would like some evidence, if there is any.
Also, Eliezer did not drop out of high school; he never attended in the first place (commonly known as 'skipping it'), which is more common among "geniuses" (though I dislike that description).
In most such scenarios, the AI doesn't have a terminal goal of getting rid of us, but rather has it as a subgoal that arises from some larger terminal goal.
So why not the opposite, why wouldn't it have human intentions as a subgoal?
It is the case that this evidence, post-update, shifts estimates significantly in the direction of 'completely wrong or not even wrong' for all insights that require world class genius level intelligence, such as, incidentally, forming an opinion on AI risk that most world class geniuses did not form.
Most "world class geniuses" have not opined on AI risk. So "forming an opinion on AI risk that most world class geniuses did not form" is hardly a task which requires "world class genius level intelligence".
For a "Bayesian re...
a combination of 'high school drop out' and 'no impressive technical accomplishments' is a very strong indicator
Numbers?
If you really cared about future risk you would be working away at the problem even with a smaller salary. Focus on your work.
Downvoted for this; Rain's reply to the parent goes for me too.
Just to let you know, you've just made it on my list of the very few LW regulars I no longer bother replying to, due to the proven futility of any communications. In your case it is because you have a very evident ax to grind, which is incompatible with rational thought.
(Note: it is not downvoted as I write this comment.)
First let me say that I have enjoyed kalla's recent contributions to this site, and hope that the following won't come across as negative. But to answer your question, I at least question both the uncontroversiality and correctness of the summary, as well as the inference that more working memory increases abilities exponentially quickly. Kalla and I discussed some of this above and he doesn't think that his claims hinge on specific facts about working memory, so most of this is irrelevant at this point, ...
I haven't read the entire post yet, but here are some thoughts I had after reading through roughly the first ten paragraphs of "Objection 2 ...". I think the problem with assuming, or judging, that tool-AI is safer than agent-AI is that a sufficiently powerful tool-AI would essentially be an agent-AI. Humans already hack other humans without directly manipulating each other's physical persons or environments, and those hacks can drastically alter their own or others' persons and (physical) environments. Sometimes the safest course is not to listen to poisoned tongues.
This seems to propose an alternate notion of 'tool' than the one in the article.
I agree with "tool != oracle" for the article's definition.
Using your definition, I'm not sure there is any distinction between tool and agent at all, as per this comment.
I do think there are useful alternative notions to consider in this area, though, as per this comment.
And I do think there is a terminology issue. Previously I was saying "autonomous AI" vs "non-autonomous".
Right. Exercise the neglected virtue of scholarship and all that.
It's not that easy to dismiss; if it's as poorly leveraged as it looks relative to other approaches then you have little reason to be spreading and teaching SI's brand of specialized rationality (except for perhaps income).
I feel that the relevance of "Friendliness theory" depends heavily on the idea of a "discrete jump" that seems unlikely and whose likelihood does not seem to have been publicly argued for.
It has been. An AI foom could be fast enough and/or sufficiently invisible in the early stages that it's practically discrete, to us. So the AI-foom does have relevance, contra:
I believe I have read the vast majority of the Sequences, including the AI-foom debate, and that this content - while interesting and enjoyable - does not have much relevance for the arguments I've made.
As a separate point, people talk about AI friendliness as a safety precaution, but I think an important thing to remember is that a truly friendly self-improving AGI would probably be the greatest possible thing you could do for the world. It's possible the risk of human destruction from the pursuit of FAI is larger than the possible upside, but if you include the FAI's ability to mitigate other existential risks I don't think that's the case.
Okay, make that: I strongly suspect the rationality of the rational internet would improve by many orders of magnitude if all arguments about arguments were quietly deleted.
That is ignored; pattern matching is not good enough for you; you overcame pattern matching.
I wouldn't say that. "This looks cranky, it's probably not worth investigation further" is usually a pretty good heuristic. And, as you say, unless you actually know enough about the field to be able to be close to an expert yourself, it's often very hard to tell the difference between a logically consistent crank argument with no blatantly obvious mistakes and an argument for something that's actually correct. On the other hand, from the outside, peopl...
That fictional treatment is interesting to the point of me actually looking up the book. But ...
Yes, human-like AGIs are really scary.
The future is scary. Human-like AGIs should not intrinsically be more scary than the future, accelerated.
What you claimed was that "It is perfectly acceptable to make a reply to a publicly made comment that was itself freely volunteered", and that if someone didn't want to discuss something then they shouldn't have brought it up. In context, however, this was a reply to me saying it was probably unkind to belabor a subject to someone who'd expressed that they find the subject upsetting, which you now seem to be saying you agree with. So what are you taking issue with? I certainly didn't mean to imply that if someone finds a subject uncomfortable to ...
I disagree that it is in general unacceptable to post information that you would not like to discuss beyond a certain point.
Without further clarification one could reasonably assume that cousin_it was okay with discussing the subject at one removal, as you suggest, but as it happens several days before the great-grandparent cousin_it explicitly stated that it would be upsetting to discuss this topic.
I'm not saying he's right, I'm saying your proposed alternative isn't even wrong.
The AIs develop as NPCs in virtual worlds, which humans take no issue with today. This is actually a very likely path to developing AGI...
I think this is one of many possible paths, though I wouldn't call any of them "likely" to happen -- at least, not in the next 20 years. That said, if the AI is an NPC in a game, then of course it makes sense that it would harness the game for its CPU cycles; that's what it was built to do, after all.
..."about as well". Human verbal communication bandwidth is at most a few measly kilobits per second
Yeah, the "voluntary" part is key to getting humans to like you and your project. On the flip side, illicit botnets are quite effective at harnessing "spare" (i.e., owned by someone else) computing capacity; so, it's a bit of a tradeoff.
This post presents thoughts on the Singularity Institute from Holden Karnofsky, Co-Executive Director of GiveWell. Note: Luke Muehlhauser, the Executive Director of the Singularity Institute, reviewed a draft of this post, and commented: "I do generally agree that your complaints are either correct (especially re: past organizational competence) or incorrect but not addressed by SI in clear argumentative writing (this includes the part on 'tool' AI). I am working to address both categories of issues." I take Luke's comment to be a significant mark in SI's favor, because it indicates an explicit recognition of the problems I raise, and thus increases my estimate of the likelihood that SI will work to address them.
September 2012 update: responses have been posted by Luke and Eliezer (and I have responded in the comments of their posts). I have also added acknowledgements.
The Singularity Institute (SI) is a charity that GiveWell has been repeatedly asked to evaluate. In the past, SI has been outside our scope (as we were focused on specific areas such as international aid). With GiveWell Labs we are open to any giving opportunity, no matter what form and what sector, but we still do not currently plan to recommend SI; given the amount of interest some of our audience has expressed, I feel it is important to explain why. Our views, of course, remain open to change. (Note: I am posting this only to Less Wrong, not to the GiveWell Blog, because I believe that everyone who would be interested in this post will see it here.)
I am currently the GiveWell staff member who has put the most time and effort into engaging with and evaluating SI. Other GiveWell staff currently agree with my bottom-line view that we should not recommend SI, but this does not mean they have engaged with each of my specific arguments. Therefore, while the lack of recommendation of SI is something that GiveWell stands behind, the specific arguments in this post should be attributed only to me, not to GiveWell.
Summary of my views
Intent of this post
I did not write this post with the purpose of "hurting" SI. Rather, I wrote it in the hopes that one of these three things (or some combination) will happen:
Which one of these occurs will hopefully be driven primarily by the merits of the different arguments raised. Because of this, I think that whatever happens as a result of my post will be positive for SI's mission, whether or not it is positive for SI as an organization. I believe that most of SI's supporters and advocates care more about the former than about the latter, and that this attitude is far too rare in the nonprofit world.
Does SI have a well-argued case that its work is beneficial and important?
I know no more concise summary of SI's views than this page, so here I give my own impressions of what SI believes, in italics.
From the time I first heard this argument, it has seemed to me to be skipping important steps and making major unjustified assumptions. However, for a long time I believed this could easily be due to my inferior understanding of the relevant issues. I believed my own views on the argument to have only very low relevance (as I stated in my 2011 interview with SI representatives). Over time, I have had many discussions with SI supporters and advocates, as well as with non-supporters who I believe understand the relevant issues well. I now believe - for the moment - that my objections are highly relevant, that they cannot be dismissed as simple "layman's misunderstandings" (as they have been by various SI supporters in the past), and that SI has not published anything that addresses them in a clear way.
Below, I list my major objections. I do not believe that these objections constitute a sharp/tight case for the idea that SI's work has low/negative value; I believe, instead, that SI's own arguments are too vague for such a rebuttal to be possible. There are many possible responses to my objections, but SI's public arguments (and the private arguments) do not make clear which possible response (if any) SI would choose to take up and defend. Hopefully the dialogue following this post will clarify what SI believes and why.
Some of my views are discussed at greater length (though with less clarity) in a public transcript of a conversation I had with SI supporter Jaan Tallinn. I refer to this transcript as "Karnofsky/Tallinn 2011."
Objection 1: it seems to me that any AGI that was set to maximize a "Friendly" utility function would be extraordinarily dangerous.
Suppose, for the sake of argument, that SI manages to create what it believes to be an FAI. Suppose that it is successful in the "AGI" part of its goal, i.e., it has successfully created an intelligence vastly superior to human intelligence and extraordinarily powerful from our perspective. Suppose that it has also done its best on the "Friendly" part of the goal: it has developed a formal argument for why its AGI's utility function will be Friendly, it believes this argument to be airtight, and it has had this argument checked over by 100 of the world's most intelligent and relevantly experienced people. Suppose that SI now activates its AGI, unleashing it to reshape the world as it sees fit. What will be the outcome?
I believe that the probability of an unfavorable outcome - by which I mean an outcome essentially equivalent to what a UFAI would bring about - exceeds 90% in such a scenario. I believe the goal of designing a "Friendly" utility function is likely to be beyond the abilities even of the best team of humans willing to design such a function. I do not have a tight argument for why I believe this, but a comment on LessWrong by Wei Dai gives a good illustration of the kind of thoughts I have on the matter:
I think this comment understates the risks, however. For example, when the comment says "the formalization of the notion of 'safety' used by the proof is wrong," it is not clear whether it means that the values the programmers have in mind are not correctly implemented by the formalization, or whether it means they are correctly implemented but are themselves catastrophic in a way that hasn't been anticipated. I would be highly concerned about both. There are other catastrophic possibilities as well; perhaps the utility function itself is well-specified and safe, but the AGI's model of the world is flawed (in particular, perhaps its prior or its process for matching observations to predictions are flawed) in a way that doesn't emerge until the AGI has made substantial changes to its environment.
By SI's own arguments, even a small error in any of these things would likely lead to catastrophe. And there are likely failure forms I haven't thought of. The overriding intuition here is that complex plans usually fail when unaccompanied by feedback loops. A scenario in which a set of people is ready to unleash an all-powerful being to maximize some parameter in the world, based solely on their initial confidence in their own extrapolations of the consequences of doing so, seems like a scenario that is overwhelmingly likely to result in a bad outcome. It comes down to placing the world's largest bet on a highly complex theory - with no experimentation to test the theory first.
So far, all I have argued is that the development of "Friendliness" theory can achieve at best only a limited reduction in the probability of an unfavorable outcome. However, as I argue in the next section, I believe there is at least one concept - the "tool-agent" distinction - that has more potential to reduce risks, and that SI appears to ignore this concept entirely. I believe that tools are safer than agents (even agents that make use of the best "Friendliness" theory that can reasonably be hoped for) and that SI encourages a focus on building agents, thus increasing risk.
Objection 2: SI appears to neglect the potentially important distinction between "tool" and "agent" AI.
Google Maps is a type of artificial intelligence (AI). It is far more intelligent than I am when it comes to planning routes.
Google Maps - by which I mean the complete software package including the display of the map itself - does not have a "utility" that it seeks to maximize. (One could fit a utility function to its actions, as to any set of actions, but there is no single "parameter to be maximized" driving its operations.)
Google Maps (as I understand it) considers multiple possible routes, gives each a score based on factors such as distance and likely traffic, and then displays the best-scoring route in a way that makes it easily understood by the user. If I don't like the route, for whatever reason, I can change some parameters and consider a different route. If I like the route, I can print it out or email it to a friend or send it to my phone's navigation application. Google Maps has no single parameter it is trying to maximize; it has no reason to try to "trick" me in order to increase its utility.
In short, Google Maps is not an agent, taking actions in order to maximize a utility parameter. It is a tool, generating information and then displaying it in a user-friendly manner for me to consider, use and export or discard as I wish.
Every software application I know of seems to work essentially the same way, including those that involve (specialized) artificial intelligence such as Google Search, Siri, Watson, Rybka, etc. Some can be put into an "agent mode" (as Watson was on Jeopardy!) but all can easily be set up to be used as "tools" (for example, Watson can simply display its top candidate answers to a question, with the score for each, without speaking any of them.)
The "tool mode" concept is importantly different from the possibility of Oracle AI sometimes discussed by SI. The discussions I've seen of Oracle AI present it as an Unfriendly AI that is "trapped in a box" - an AI whose intelligence is driven by an explicit utility function and that humans hope to control coercively. Hence the discussion of ideas such as the AI-Box Experiment. A different interpretation, given in Karnofsky/Tallinn 2011, is an AI with a carefully designed utility function - likely as difficult to construct as "Friendliness" - that leaves it "wishing" to answer questions helpfully. By contrast with both these ideas, Tool-AGI is not "trapped" and it is not Unfriendly or Friendly; it has no motivations and no driving utility function of any kind, just like Google Maps. It scores different possibilities and displays its conclusions in a transparent and user-friendly manner, as its instructions say to do; it does not have an overarching "want," and so, as with the specialized AIs described above, while it may sometimes "misinterpret" a question (thereby scoring options poorly and ranking the wrong one #1) there is no reason to expect intentional trickery or manipulation when it comes to displaying its results.
Another way of putting this is that a "tool" has an underlying instruction set that conceptually looks like: "(1) Calculate which action A would maximize parameter P, based on existing data set D. (2) Summarize this calculation in a user-friendly manner, including what Action A is, what likely intermediate outcomes it would cause, what other actions would result in high values of P, etc." An "agent," by contrast, has an underlying instruction set that conceptually looks like: "(1) Calculate which action, A, would maximize parameter P, based on existing data set D. (2) Execute Action A." In any AI where (1) is separable (by the programmers) as a distinct step, (2) can be set to the "tool" version rather than the "agent" version, and this separability is in fact present with most/all modern software. Note that in the "tool" version, neither step (1) nor step (2) (nor the combination) constitutes an instruction to maximize a parameter - to describe a program of this kind as "wanting" something is a category error, and there is no reason to expect its step (2) to be deceptive.
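To make the two conceptual instruction sets concrete, here is a minimal sketch in Python following the two-step description above; the candidate actions, the scoring rule, and the execute() routine are placeholder stand-ins, not a claim about how a real AGI would be structured.

```python
# A minimal sketch of the "tool" vs. "agent" instruction sets described above.
# The candidate actions, the scoring rule, and execute() are hypothetical
# placeholders, not a proposal for how a real AGI would be built.

def estimate_p(action: str, data: dict) -> float:
    """Step (1): predict how much of parameter P this action would produce."""
    return data.get(action, 0.0)  # toy stand-in for the real calculation

def run_as_tool(actions, data):
    """Tool version of step (2): display the scored options and stop."""
    scores = {a: estimate_p(a, data) for a in actions}
    for action, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{action}: predicted P = {score:.2f}")
    # Nothing is executed; a human decides what, if anything, to do next.

def run_as_agent(actions, data, execute):
    """Agent version of step (2): execute the top-scoring action directly."""
    scores = {a: estimate_p(a, data) for a in actions}
    execute(max(scores, key=scores.get))  # the step the tool version omits

# Both versions share step (1); only step (2) differs.
candidates = ["route A", "route B", "route C"]
toy_data = {"route A": 0.7, "route B": 0.9, "route C": 0.4}
run_as_tool(candidates, toy_data)                  # lists ranked options
run_as_agent(candidates, toy_data, execute=print)  # "acts" (here, just prints "route B")
```

The two modes share step (1) entirely; the only difference is whether step (2) displays the scored options or acts on the top-scoring one.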
I elaborated further on the distinction and on the concept of a tool-AI in Karnofsky/Tallinn 2011.
This is important because an AGI running in tool mode could be extraordinarily useful but far more safe than an AGI running in agent mode. In fact, if developing "Friendly AI" is what we seek, a tool-AGI could likely be helpful enough in thinking through this problem as to render any previous work on "Friendliness theory" moot. Among other things, a tool-AGI would allow transparent views into the AGI's reasoning and predictions without any reason to fear being purposefully misled, and would facilitate safe experimental testing of any utility function that one wished to eventually plug into an "agent."
Is a tool-AGI possible? I believe that it is, and furthermore that it ought to be our default picture of how AGI will work, given that practically all software developed to date can (and usually does) run as a tool and given that modern software seems to be constantly becoming "intelligent" (capable of giving better answers than a human) in surprising new domains. In addition, it intuitively seems to me (though I am not highly confident) that intelligence inherently involves the distinct, separable steps of (a) considering multiple possible actions and (b) assigning a score to each, prior to executing any of the possible actions. If one can distinctly separate (a) and (b) in a program's code, then one can abstain from writing any "execution" instructions and instead focus on making the program list actions and scores in a user-friendly manner, for humans to consider and use as they wish.
Of course, there are possible paths to AGI that may rule out a "tool mode," but it seems that most of these paths would rule out the application of "Friendliness theory" as well. (For example, a "black box" emulation and augmentation of a human mind.) What are the paths to AGI that allow manual, transparent, intentional design of a utility function but do not allow the replacement of "execution" instructions with "communication" instructions? Most of the conversations I've had on this topic have focused on three responses:
Conventional wisdom says it is extremely dangerous to empower a computer to act in the world until one is very sure that the computer will do its job in a way that is helpful rather than harmful. So if a programmer chooses to "unleash an AGI as an agent" with the hope of gaining power, it seems that this programmer will be deliberately ignoring conventional wisdom about what is safe in favor of shortsighted greed. I do not see why such a programmer would be expected to make use of any "Friendliness theory" that might be available. (Attempting to incorporate such theory would almost certainly slow the project down greatly, and thus would bring the same problems as the more general "have caution, do testing" counseled by conventional wisdom.) It seems that the appropriate measures for preventing such a risk are security measures aiming to stop humans from launching unsafe agent-AIs, rather than developing theories or raising awareness of "Friendliness."
One of the things that bothers me most about SI is that there is practically no public content, as far as I can tell, explicitly addressing the idea of a "tool" and giving arguments for why AGI is likely to work only as an "agent." The idea that AGI will be driven by a central utility function seems to be simply assumed. Two examples:
The closest thing I have seen to a public discussion of "tool-AGI" is in Dreams of Friendliness, where Eliezer Yudkowsky considers the question, "Why not just have the AI answer questions, instead of trying to do anything? Then it wouldn't need to be Friendly. It wouldn't need any goals at all. It would just answer questions." His response:
This passage appears vague and does not appear to address the specific "tool" concept I have defended above (in particular, it does not address the analogy to modern software, which challenges the idea that "powerful optimization processes" cannot run in tool mode). The rest of the piece discusses (a) psychological mistakes that could lead to the discussion in question; (b) the "Oracle AI" concept that I have outlined above. The comments contain some more discussion of the "tool" idea (Denis Bider and Shane Legg seem to be picturing something similar to "tool-AGI") but the discussion is unresolved and I believe the "tool" concept defended above remains essentially unaddressed.
In sum, SI appears to encourage a focus on building and launching "Friendly" agents (it is seeking to do so itself, and its work on "Friendliness" theory seems to be laying the groundwork for others to do so) while not addressing the tool-agent distinction. It seems to assume that any AGI will have to be an agent, and to make little to no attempt at justifying this assumption. The result, in my view, is that it is essentially advocating for a more dangerous approach to AI than the traditional approach to software development.
Objection 3: SI's envisioned scenario is far more specific and conjunctive than it appears at first glance, and I believe this scenario to be highly unlikely.
SI's scenario concerns the development of artificial general intelligence (AGI): a computer that is vastly more intelligent than humans in every relevant way. But we already have many computers that are vastly more intelligent than humans in some relevant ways, and the domains in which specialized AIs outdo humans seem to be constantly and continuously expanding. I feel that the relevance of "Friendliness theory" depends heavily on the idea of a "discrete jump" that seems unlikely and whose likelihood does not seem to have been publicly argued for.
One possible scenario is that at some point, we develop powerful enough non-AGI tools (particularly specialized AIs) that we vastly improve our abilities to consider and prepare for the eventuality of AGI - to the point where any previous theory developed on the subject becomes useless. Or (to put this more generally) non-AGI tools simply change the world so much that it becomes essentially unrecognizable from the perspective of today - again rendering any previous "Friendliness theory" moot. As I said in Karnofsky/Tallinn 2011, some of SI's work "seems a bit like trying to design Facebook before the Internet was in use, or even before the computer existed."
Perhaps there will be a discrete jump to AGI, but it will be a sort of AGI that renders "Friendliness theory" moot for a different reason. For example, in the practice of software development, there often does not seem to be an operational distinction between "intelligent" and "Friendly." (For example, my impression is that the only method programmers had for evaluating Watson's "intelligence" was to see whether it was coming up with the same answers that a well-informed human would; the only way to evaluate Siri's "intelligence" was to evaluate its helpfulness to humans.) "Intelligent" often ends up getting defined as "prone to take actions that seem all-around 'good' to the programmer." So the concept of "Friendliness" may end up being naturally and subtly baked in to a successful AGI effort.
The bottom line is that we know very little about the course of future artificial intelligence. I believe that the probability that SI's concept of "Friendly" vs. "Unfriendly" goals ends up seeming essentially nonsensical, irrelevant and/or unimportant from the standpoint of the relevant future is over 90%.
Other objections to SI's views
There are other debates about the likelihood of SI's work being relevant/helpful; for example,
Unlike the three objections I focus on, these other issues have been discussed a fair amount, and if these other issues were the only objections to SI's arguments I would find SI's case to be strong (i.e., I would find its scenario likely enough to warrant investment in).
Wrapup
For a long time I refrained from engaging in object-level debates over SI's work, believing that others are better qualified to do so. But after talking at great length to many of SI's supporters and advocates and reading everything I've been pointed to as relevant, I still have seen no clear and compelling response to any of my three major objections. As stated above, there are many possible responses to my objections, but SI's current arguments do not seem clear on what responses they wish to take and defend. At this point I am unlikely to form a positive view of SI's work until and unless I do see such responses, and/or SI changes its positions.
Is SI the kind of organization we want to bet on?
This part of the post has some risks. For most of GiveWell's history, sticking to our standard criteria - and putting more energy into recommended than non-recommended organizations - has enabled us to share our honest thoughts about charities without appearing to get personal. But when evaluating a group such as SI, I can't avoid placing a heavy weight on (my read on) the general competence, capability and "intangibles" of the people and organization, because SI's mission is not about repeating activities that have worked in the past. Sharing my views on these issues could strike some as personal or mean-spirited and could lead to the misimpression that GiveWell is hostile toward SI. But it is simply necessary in order to be fully transparent about why I hold the views that I hold.
Fortunately, SI is an ideal organization for our first discussion of this type. I believe the staff and supporters of SI would overwhelmingly rather hear the whole truth about my thoughts - so that they can directly engage them and, if warranted, make changes - than have me sugar-coat what I think in order to spare their feelings. People who know me and my attitude toward being honest vs. sparing feelings know that this, itself, is high praise for SI.
One more comment before I continue: our policy is that non-public information provided to us by a charity will not be published or discussed without that charity's prior consent. However, none of the content of this post is based on private information; all of it is based on information that SI has made available to the public.
There are several reasons that I currently have a negative impression of SI's general competence, capability and "intangibles." My mind remains open and I include specifics on how it could be changed.
I have been pointed to Peter Thiel and Ray Kurzweil as examples of impressive SI supporters, but I have not seen any on-record statements from either of these people that show agreement with SI's specific views, and in fact (based on watching them speak at Singularity Summits) my impression is that they disagree. Peter Thiel seems to believe that speeding the pace of general innovation is a good thing; this would seem to be in tension with SI's view that AGI will be catastrophic by default and that no one other than SI is paying sufficient attention to "Friendliness" issues. Ray Kurzweil seems to believe that "safety" is a matter of transparency, strong institutions, etc. rather than of "Friendliness." I am personally in agreement with the things I have seen both of them say on these topics. I find it possible that they support SI because of the Singularity Summit or to increase general interest in ambitious technology, rather than because they find "Friendliness theory" to be as important as SI does.
Clear, on-record statements from these two supporters, specifically endorsing SI's arguments and the importance of developing Friendliness theory, would shift my views somewhat on this point.
SI's list of achievements is not, in my view, up to where it needs to be given (a) and (b). Yet I have seen no declaration that SI has fallen short to date and explanation of what will be changed to deal with it. SI's recent release of a strategic plan and monthly updates are improvements from a transparency perspective, but they still leave me feeling as though there are no clear metrics or goals by which SI is committing to be measured (aside from very basic organizational goals such as "design a new website" and very vague goals such as "publish more papers") and as though SI places a low priority on engaging people who are critical of its views (or at least not yet on board), as opposed to people who are naturally drawn to it.
I believe that one of the primary obstacles to being impactful as a nonprofit is the lack of the sort of helpful feedback loops that lead to success in other domains. I like to see groups that are making as much effort as they can to create meaningful feedback loops for themselves. I perceive SI as falling well short on this front. Pursuing more impressive endorsements and developing benign but objectively recognizable innovations (particularly commercially viable ones) are two possible ways to impose more demanding feedback loops. (I discussed both of these in my interview linked above).
Yet I'm not aware of any of what I consider compelling evidence that SI staff/supporters/advocates have any special insight into the nature of general rationality or that they have especially high general rationality.
I have been pointed to the Sequences on this point. The Sequences (which I have read the vast majority of) do not seem to me to be a demonstration or evidence of general rationality. They are about rationality; I find them very enjoyable to read; and there is very little they say that I disagree with (or would have disagreed with before I read them). However, they do not seem to demonstrate rationality on the part of the writer, any more than a series of enjoyable, not-obviously-inaccurate essays on the qualities of a good basketball player would demonstrate basketball prowess. I sometimes get the impression that fans of the Sequences are willing to ascribe superior rationality to the writer simply because the content seems smart and insightful to them, without making a critical effort to determine the extent to which the content is novel, actionable and important.
I endorse Eliezer Yudkowsky's statement, "Be careful … any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility." To me, the best evidence of superior general rationality (or of insight into it) would be objectively impressive achievements (successful commercial ventures, highly prestigious awards, clear innovations, etc.) and/or accumulation of wealth and power. As mentioned above, SI staff/supporters/advocates do not seem particularly impressive on these fronts, at least not as much as I would expect for people who have the sort of insight into rationality that makes it sensible for them to train others in it. I am open to other evidence that SI staff/supporters/advocates have superior general rationality, but I have not seen it.
Why is it a problem if SI staff/supporters/advocates believe themselves, without good evidence, to have superior general rationality? First off, it strikes me as a belief based on wishful thinking rather than rational inference. Secondly, I would expect a series of problems to accompany overconfidence in one's general rationality, and several of these problems seem to be actually occurring in SI's case:
A possible justification for these activities is that SI is seeking to promote greater general rationality, which over time will lead to more and better support for its mission. But if this is SI's core activity, it becomes even more important to test the hypothesis that SI's views are in fact rooted in superior general rationality - and these tests don't seem to be happening, as discussed above.
In addition, I have seen no public SI-authorized discussion of the matter that I consider to be satisfactory in terms of explaining what happened and what the current status of the case is on an ongoing basis. Some details may have to be omitted, but a clear SI-authorized statement on this point with as much information as can reasonably be provided would be helpful.
A couple positive observations to add context here:
Wrapup
While SI has produced a lot of content that I find interesting and enjoyable, it has not produced what I consider evidence of superior general rationality or of its suitability for the tasks it has set for itself. I see no qualifications or achievements that specifically seem to indicate that SI staff are well-suited to the challenge of understanding the key AI-related issues and/or coordinating the construction of an FAI. And I see specific reasons to be pessimistic about its suitability and general competence.
When estimating the expected value of an endeavor, it is natural to have an implicit "survivorship bias" - to use organizations whose accomplishments one is familiar with (which tend to be relatively effective organizations) as a reference class. Because of this, I would be extremely wary of investing in an organization with apparently poor general competence/suitability to its tasks, even if I bought fully into its mission (which I do not) and saw no other groups working on a comparable mission.
But if there's even a chance …
A common argument that SI supporters raise with me is along the lines of, "Even if SI's arguments are weak and its staff isn't as capable as one would like to see, their goal is so important that they would be a good investment even at a tiny probability of success."
I believe this argument to be a form of Pascal's Mugging and I have outlined the reasons I believe it to be invalid in two posts (here and here). There have been some objections to my arguments, but I still believe them to be valid. There is a good chance I will revisit these topics in the future, because I believe these issues to be at the core of many of the differences between GiveWell-top-charities supporters and SI supporters.
Regardless of whether one accepts my specific arguments, it is worth noting that the most prominent people associated with SI tend to agree with the conclusion that the "But if there's even a chance …" argument is not valid. (See comments on my post from Michael Vassar and Eliezer Yudkowsky as well as Eliezer's interview with John Baez.)
Existential risk reduction as a cause
I consider the general cause of "looking for ways that philanthropic dollars can reduce direct threats of global catastrophic risks, particularly those that involve some risk of human extinction" to be a relatively high-potential cause. It is on the working agenda for GiveWell Labs and we will be writing more about it.
However, I don't think that "Cause X is the one I care about and Organization Y is the only one working on it" is a good reason to support Organization Y. For donors determined to donate within this cause, I encourage you to consider donating to a donor-advised fund while making it clear that you intend to grant out the funds to existential-risk-reduction-related organizations in the future. (One way to accomplish this would be to create a fund with "existential risk" in the name; this is a fairly easy thing to do and one person could do it on behalf of multiple donors.)
For one who accepts my arguments about SI, I believe withholding funds in this way is likely to be better for SI's mission than donating to SI - through incentive effects alone (not to mention my specific argument that SI's approach to "Friendliness" seems likely to increase risks).
How I might change my views
My views are very open to revision.
However, I cannot realistically commit to read and seriously consider all comments posted on the matter. The number of people capable of taking a few minutes to write a comment is sufficient to swamp my capacity. I do encourage people to comment and I do intend to read at least some comments, but if you are looking to change my views, you should not consider posting a comment to be the most promising route.
Instead, what I will commit to is reading and carefully considering up to 50,000 words of content that are (a) specifically marked as SI-authorized responses to the points I have raised; (b) explicitly cleared for release to the general public as SI-authorized communications. In order to consider a response "SI-authorized and cleared for release," I will accept explicit communication from SI's Executive Director or from a majority of its Board of Directors endorsing the content in question. After 50,000 words, I may change my views and/or commit to reading more content, or (if I determine that the content is poor and is not using my time efficiently) I may decide not to engage further. SI-authorized content may improve or worsen SI's standing in my estimation, so unlike with comments, there is an incentive to select content that uses my time efficiently. Of course, SI-authorized content may end up including excerpts from comment responses to this post, and/or already-existing public content.
I may also change my views for other reasons, particularly if SI secures more impressive achievements and/or endorsements.
One more note: I believe I have read the vast majority of the Sequences, including the AI-foom debate, and that this content - while interesting and enjoyable - does not have much relevance for the arguments I've made.
Again: I think that whatever happens as a result of my post will be positive for SI's mission, whether or not it is positive for SI as an organization. I believe that most of SI's supporters and advocates care more about the former than about the latter, and that this attitude is far too rare in the nonprofit world.
Acknowledgements
Thanks to the following people for reviewing a draft of this post and providing thoughtful feedback (this of course does not mean they agree with the post or are responsible for its content): Dario Amodei, Nick Beckstead, Elie Hassenfeld, Alexander Kruel, Tim Ogden, John Salvatier, Jonah Sinick, Cari Tuna, Stephanie Wykstra.