Wei_Dai comments on AlphaGo versus Lee Sedol - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (183)
Compared to its competition in the AGI race, MIRI was always going to be disadvantaged by both lack of resources and the need to choose an AI design that can predictably be made Friendly as opposed to optimizing mainly for capability. For this reason, I was against MIRI (or rather the Singularity Institute as it was known back then) going into AI research at all, as opposed to pursuing some other way of pushing for a positive Singularity.
In any case, what other approaches to Friendliness would you like MIRI to consider? The only other approach that I'm aware of that's somewhat developed is Paul Christiano's current approach (see for example https://medium.com/ai-control/alba-an-explicit-proposal-for-aligned-ai-17a55f60bbcf), which I understand is meant to be largely agnostic about the underlying AI technology. Personally I'm pretty skeptical but then I may be overly skeptical about everything. What are your thoughts? I don't recall seeing you having commented on them much.
Are you aware of any other ideas that MIRI should be considering?
As far as I can tell, Paul's current proposal might still suffer from blackmail, like his earlier proposal which I commented on. I vaguely remember discussing the problem with you as well.
One big lesson for me is that AI research seems to be more incremental and predictable than we thought, and garage FOOM probably isn't the main danger. It might be helpful to study the strengths and weaknesses of modern neural networks and get a feel for their generalization performance. Then we could try to predict which areas will see big gains from neural networks in the next few years, and which parts of Friendliness become easy or hard as a result. Is anyone at MIRI working on that?
If they did that, then what? Try to convince NN researchers to attack the parts of Friendliness that look hard? That seems difficult for MIRI to do given where they've invested in building their reputation (i.e., among decision theorists and mathematicians instead of in the ML community). (It would really depend on people trusting their experience and judgment since it's hard to see how much one could offer in the form of either mathematical proof or clearly relevant empirical evidence.) You'd have a better chance if the work was carried out by some other organization. But even if that organization got NN researchers to take its results seriously, what incentives do they have to attack parts of Friendliness that seem especially hard, instead of doing what they've been doing, i.e., racing as fast as they can for the next milestone in capability?
Or is the idea to bet on the off chance that building an FAI with NN turns out to be easy enough that MIRI and like-minded researchers can solve the associated Friendliness problems themselves and then hand the solutions to whoever ends up leading the AGI race, and they can just plug the solutions in at little cost to their winning the race?
Or you're suggesting aiming/hoping for some feasible combination of both, I guess. It seems pretty similar to what Paul Christiano is doing, except he has "generic AI technology" in place of "NN" above. To me, the chance of success of this approach seems low enough that it's not obviously superior to what MIRI is doing (namely, in my view, betting on the off chance that the contrarian AI approach they're taking ends up being much easier/better than the mainstream approach, which is looking increasingly unlikely but still not impossible).