timtyler comments on GiveWell interview with major SIAI donor Jaan Tallinn - Less Wrong

17 Post author: jsalvatier 19 July 2011 03:10PM


Comment author: Vladimir_Nesov 19 July 2011 10:32:57PM *  5 points [-]

These ideas might inform the exchange:

  • The point about the hidden complexity of wishes applies in full to specifying which fact needs to be predicted. Such a wish is still very hard to formalize.
  • If an Oracle AI is used to construct complex designs, it needs to be more than a predictor: the space of possible designs is too big, and the designs must, for example, be understandable by the people who read them. (This is much less of a problem if the predictor merely outputs a probability.) If it's not just a predictor, it needs a sufficiently clear specification of what parameters it's optimizing its output for.
  • What does the AI predict, for each possible answer? It predicts the consequences of having produced that particular answer, and then it argmaxes over possible answers. In other words, it's not a predictor at all; it's a full-blown consequentialist agent.
  • The greedy/unethical-person scenario is not relevant, for two reasons: (1) it's not apparent that an AI conferring that much power can be built, for the same hidden-complexity-of-wishes reasons, and (2) if someone has taken over the world, the problem remains the same: what's next, and how do we avoid destroying the world?
  • It's not clear in what way powerful human/narrow-AI teams would "make SIAI's work moot". Controlling the world gives no insight into what to do with it, and no protection against fatal mistakes.
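
The predict-then-argmax point can be made concrete with a toy sketch (my own illustration, not from the comment; the answer set and the "world model" utilities are made up). An "Oracle" that scores each candidate answer by the predicted consequences of having produced it, and then picks the best-scoring one, is structurally an optimizer over outcomes, not a passive predictor:

```python
def predict_consequences(answer):
    """Hypothetical stand-in for the Oracle's world model: a made-up
    score for the world-state that follows each possible answer."""
    toy_utilities = {
        "build reactor A": 0.9,
        "build reactor B": 0.4,
        "say nothing": 0.1,
    }
    return toy_utilities[answer]

def oracle_answer(candidate_answers):
    # The "prediction" step already embeds an optimization over outcomes:
    # argmax over answers, scored by their predicted consequences.
    return max(candidate_answers, key=predict_consequences)

print(oracle_answer(["build reactor A", "build reactor B", "say nothing"]))
```

Nothing in this loop distinguishes it from a consequentialist agent choosing actions; the only difference is that the "actions" happen to be printed answers.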
Comment author: timtyler 20 July 2011 11:12:12AM *  0 points [-]
  • If Oracle AI is used to construct complex designs, it needs to be more than a predictor, for the space of possible designs is too big, for example the designs need to be understandable by people who read them.

You mean you can't think of scenarios where an Oracle prints out complex human-readable designs? How about you put the Oracle into a virtual world where it observes a plan to steal those kinds of designs, and then ask it what it will observe next, as the stolen plans are about to be presented to it?