lukeprog comments on AI Risk and Opportunity: A Strategic Analysis - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
What could an FAI project look like? Louie points out that it might look like Princeton's Institute for Advanced Study:
Idea #1: Write a good, very technical Open Problems in Friendly Artificial Intelligence, get a few of the best mathematicians/physicists who care about FAI accepted as visitors, and have them talk to faculty and visitors about the technical problems related to FAI.
Idea #2: Convince wealthy donors to endow a chair at the Institute for Advanced Study for somebody to do FAI research. (Princeton may not mind us sending another brilliant person and a bunch of money their way.)
Similar research institutes: PARC, Bell Labs, Perimeter Institute, maybe others?
But did the IAS actually succeed? Off-hand, the only things I can think of them for are hosting Einstein in his crankish years, Kurt Gödel before he went crazy, and von Neumann's work on a real computer (which they disliked and wanted to get rid of). Richard Hamming, who might know, said:
(My own thought is to wonder if this is kind of a regression to the mean, or perhaps regression due to aging.)
How do you maintain secrecy in such a setting? Or is there a new line of thought that says secrecy isn't necessary for an FAI project?
The person/people working on FAI there could work exclusively on the relatively safe problems, e.g. CEV.
Ok, I thought when you said "FAI project" you meant a project to build FAI. But I've noticed two problems with trying to do some of the relatively safe FAI-related problems in public:
Yes, both Eliezer and I (and many others) agree with these points. Eliezer seems pretty set on only doing a basement-style FAI team, perhaps because he's thought about the situation longer and harder than I have. I'm still exploring to see whether there are strategic alternatives, or strategic tweaks. I'm hoping we can discuss this in more detail when my strategic analysis series gets there.
But it seems like SIAI has already deviated from the basement-style FAI plan, since it started supporting research associates who are allowed/encouraged to publish openly, and encouraging public FAI-related research in other ways (such as publishing a list of open problems). And if the "slippery slope" problems I described were already known, why didn't anyone bring them up during the discussions about whether to publish papers about UDT? (I myself only thought of them in the general explicit form yesterday.)
If SIAI already knew about these problems but still thinks it's a good idea to promote public FAI-related research and publish papers about decision theory, then I'm even more confused than before. I hope your series "gets there" soon so I can see where the disagreement lies.
What I'm saying is that there are costs and benefits to open FAI work. You listed some costs, but that doesn't mean there aren't also benefits. See, e.g. Vladimir's comment.
The benefits are only significant if there is a significant chance of successfully building FAI before some UFAI project takes off. Maybe our disagreement just boils down to different intuitions about that? But Nesov agrees this chance is "tiny" and still wants to push open research, so I'm still confused.
I want to make it bigger, as much as I can. It doesn't matter how small a chance of winning there is, as long as our actions improve it. Giving up doesn't seem like a strategy that leads to winning. The strategy of navigating the WBE transition (or some more speculative intelligence improvement tool) is a more complicated question, and I don't see in what way the background catastrophic risk matters for it.
This also came up in a previous discussion we had: it's necessary to distinguish the risk within a given interval of years from the eventual risk (i.e. the risk of never building a FAI). The same action can make the immediate risk worse but the probability of eventually winning higher. I think encouraging an open effort for researching metaethics through decision theory is like that; also, wider acceptance of the problem might be leveraged to outweigh the hypothetical increase in UFAI risk.
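The distinction between interval risk and eventual risk can be made concrete with a toy model. This is a hypothetical illustration, not anything from the original discussion: the per-year probabilities below are made up, and the model simply iterates year by year, asking whether UFAI (failure) or FAI (success) happens first, with any residual probability mass counted as eventual failure.

```python
def eventual_failure_prob(p_ufai_per_year, p_fai_per_year, years=1000):
    """Probability that FAI is never built: either UFAI occurs first,
    or neither outcome happens within the horizon."""
    p_alive = 1.0  # probability that neither outcome has occurred yet
    p_win = 0.0
    for _ in range(years):
        # FAI succeeds this year only if UFAI does not occur first
        p_win += p_alive * (1 - p_ufai_per_year) * p_fai_per_year
        p_alive *= (1 - p_ufai_per_year) * (1 - p_fai_per_year)
    return 1 - p_win

# Closed-only research (hypothetical numbers): low annual UFAI risk,
# but a very low chance of ever finishing FAI.
closed = eventual_failure_prob(p_ufai_per_year=0.01, p_fai_per_year=0.001)

# Open research (hypothetical numbers): double the annual UFAI risk,
# but a much better chance of finishing FAI each year.
open_ = eventual_failure_prob(p_ufai_per_year=0.02, p_fai_per_year=0.01)

print(f"closed: {closed:.3f}, open: {open_:.3f}")
```

Under these made-up numbers the open strategy has a strictly worse risk within any given year, yet a lower eventual risk of never building FAI, which is the shape of the tradeoff being described.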
Yes, if we're talking about the overall chance of winning, but I was talking about the chance of winning through a specific scenario (directly building FAI). If the chance of that is tiny, why did your cost/benefit analysis of the proposed course of action (encouraging open FAI research) focus completely on it? Shouldn't we be thinking more about how the proposal affects other ways of winning? ETA: To spell it out, encouraging open FAI research decreases the probability that we win by winning the WBE race or through intelligence amplification, by increasing the probability that UFAI happens first.
Nobody is saying "let's give up". If we don't encourage open FAI research, we can still push for a positive Singularity in other ways, some of which I've posted about recently in discussion.
What do you mean? What aren't you seeing?
Yes, of course. I am talking about the probability of eventually winning.
Near/Far. Long-term effects aren't predictable and shouldn't be traded for more predictable short-term losses. In my experience it fails the Predictable Retrospective Stupidity test. Even when you try to factor in structural uncertainty, you still end up getting burned. And even if you still want to make such a tradeoff then you should halt all research until you've come to agreement or a natural stopping point with Wei Dai or others who have reservations. Stop, melt, catch fire, don't destroy the world.
(Disclaimer: This comment is fueled by a strong emotional reaction due to contingent personal details that might or might not upon further reflection deserve to be treated as substantial evidence for the policy I recommend.)
Yeah, we'll come back to this in the strategy series. There are lots of details to consider.
There seems to be a tradeoff here. An open project has more chances to develop the necessary theory faster, but having such a project in the open looks like a clearly bad idea towards the endgame. So on one hand, an open project shouldn't be cultivated (and becomes harder to hinder) as we get closer to the endgame, but on the other, a closed project will probably not get off the ground, and fueling it by an initial open effort is one way to make it stronger. So there's probably some optimal point at which to stop encouraging open development, and given the current state of the theory (nil) I believe the time hasn't come yet.
The open effort could help the subsequent closed project in two related ways: gauge the point where the understanding of what to actually do in the closed project is sufficiently clear (for some sense of "sufficiently"), and form enough of a background theory to be able to convince enough young Conways (with the necessary training) to work on the problem during the closed stage.
Your argument seems premised on the assumption that there will be an endgame. If we assume some large probability that we end up deciding not to have an endgame at all (i.e., not to try to actually build FAI with unenhanced humans), then it's no longer clear "the time hasn't come yet".
Even if we assume that with probability ~1 there will be an effort to directly build FAI, given the slippery slope effects we have to stop encouraging open research well before the closed project starts. The main deciding factors for "when" must be how large the open research community has gotten, how strong the slippery slope effects are, and how much "pull" SingInst has against those effects. The "current state of the theory" seems to have little to do with it. (Edit: No, that's too strong. Let me amend it to "one consideration among many".)
This is something we'll know better further down the road, so as long as it's possible to defer this decision (i.e. while the downside is not too great, however that should be estimated), it's the right thing to do. I still can't rule out that there might be a preference definition procedure (that refers to humans) simple enough to be implemented pre-WBE, and decision theory seems to be an attack on this possibility (clarifying why this is naive, for example, in which case it'll also serve as an argument to the powerful in the WBE race).
Well, maybe not specifically current, but what can be expected eventually, for the closed project to benefit from, which does seem to me like a major consideration in the possibility of its success.
I'm confused as to what you have in mind when you're thinking of work on CEV. Do you mean things like getting a better model of the philosophy of reflective consistency, or studying mechanism design to find algorithms for relatively fair aggregation, or looking into neuroscience to see how beliefs and preferences are encoded, or...? Is there perhaps a post I missed or am forgetting?