I'm Henry Lieberman, Research Scientist at the MIT Computer Science and AI Lab. I'm interested in long-term thinking about the future of humanity, technology, and society, and have developed some new ideas, together with my colleague Christopher Fry, about how AI and other technologies can ensure a positive future for humanity. For the full story, see my Web site, https://www.whycantwe.org, where you'll find a 12-minute TED talk, other videos, and writing based on our book, "Why Can't We All Just Get Along?".
For specific thoughts on the Alignment question, here's an abstract (perhaps for a future paper and/or talk at an appropriate venue):
AI Alignment Depends on Human Alignment
Henry Lieberman – MIT CSAIL
The problem of whether the goals and values of an artificially intelligent agent will align with human goals and values can be reduced to this problem: Will the goals and values of different human agents ever align with each other?
Regardless of what you believe about the problem for humans, we're likely to get the same answer when we think about intelligent machines. We can only program AI "in our own image", so both the features and the bugs of humanity will reappear in AI. Thus, whether AI turns out to be a good thing or a bad thing in the future depends critically on this question: Will humans cooperate with each other, or will they compete with one another?
Right now, our society is schizophrenic -- some of our institutions (like science) are oriented towards cooperation, while others (like business and politics) seem to be primarily oriented towards competition. Many of the social problems caused by AI are a result of this schizophrenia. If AI becomes a tool of warring human factions, we're doomed. But it doesn't have to be like that.
With all the evident conflict and disagreement in the world, some despair of ever getting people to align their values substantially, if not perfectly. Yet the technology itself will provide unprecedented opportunities to eliminate the barriers to widespread social cooperation. A positive future for AI depends on changing our competitive mindset (and institutions) towards more cooperative alternatives. I will present some concrete proposals for doing so. Then, we'll get benevolent AI "for free".
Thanks for sharing your ideas. I'm a bit confused about your core claim and would love it if you could clarify (or refer to the specific part of your writing that addresses these questions). I get the general gist of your claim, that AI alignment depends on whether humans can all have the same values, but I don't know how much 'the same' you mean. You say values should 'substantially' align; could you give some examples of how aligned you mean? For example, do you mean all humans sharing the same political ideology (libertarian, communist, etc.)? Do you mean that for all non-trivial ethical questions (When is abortion permissible? How much duty do you have to your family vs. yourself? How many resources should we devote to making things better on Earth vs. exploring space?), you could ask any human on Earth and, say, 99% would give you the same answer?
Likewise with the idea of humans needing to compete less and cooperate more: how much less, and how much more? For example, competition between firms is a core part of capitalism; do you think we need to completely eliminate capitalism? Or do you only mean eliminating zero- or negative-sum competition, like war?