
NancyLebovitz comments on Stupid Questions Open Thread - Less Wrong Discussion

42 Post author: Costanza 29 December 2011 11:23PM




Comment author: NancyLebovitz 30 December 2011 02:33:13PM 3 points

Is there a proof that it's possible to prove Friendliness?

Comment author: Vladimir_Nesov 30 December 2011 05:02:20PM * 6 points

No. There's also no proof that it's possible to prove that P ≠ NP, and for the Friendliness problem it's much, much less clear what the problem even means. You aren't entitled to that particular proof; it's not expected to be available until it's no longer needed. (Many difficult problems get solved, or almost solved, without a proof of their solvability appearing in the interim.)

Comment author: NancyLebovitz 30 December 2011 05:43:35PM * 0 points

Why is it plausible that Friendliness is provable? Or is it more a matter that the problem is so important that it's worth trying regardless?

Comment author: Vladimir_Nesov 30 December 2011 06:54:39PM * 5 points

There is no clearly defined or motivated problem of "proving Friendliness". We need to understand what goals are, what humane goals are, what process can be used to access their formal definition, and what kinds of things can be done with them, how, and to what end. We need to understand these things well, which (on a psychological level) triggers an association with mathematical proofs, and will probably actually involve some mathematics suited to the task. Whether the answers take the form of something describable as "provable Friendliness" seems to me an unclear and unmotivated consideration. Unpacking that label might make it possible to give a more useful response to the question.

Comment author: XiXiDu 30 December 2011 03:05:23PM * 2 points

"Is there a proof that it's possible to prove Friendliness?"

I wonder what SI would do next if they could prove that friendly AI was not possible. For example, if it could be shown that value drift was inevitable and that utility functions are unstable under recursive self-improvement.

Comment author: TimS 30 December 2011 03:12:00PM -1 points

"Something along the lines that value drift is inevitable and utility-functions are unstable under recursive self-improvement."

That doesn't seem like the only circumstance in which FAI is not possible. If moral nihilism is true, then FAI is impossible even if value drift is not inevitable.
In that circumstance, shouldn't we try to make any AI we decide to build "friendly" to present-day humanity, even if it wouldn't be friendly to Aristotle or Plato or Confucius? Based on the hidden-complexity-of-wishes analysis, consistency with our current norms is still plenty hard.

Comment author: NancyLebovitz 30 December 2011 04:38:14PM * 0 points

My concerns are more that it will not be possible to adequately define "human", especially as transhuman tech develops, and that there might not be a good enough way to define what's good for people.

Comment author: shminux 30 December 2011 08:54:00PM 0 points

As I understand it, the modest goal of building an FAI is to give an AGI a push in the "right" direction, what EY refers to as the initial dynamics. After that, all bets are off.