ChristianKl comments on MIRI strategy - Less Wrong

5 Post author: ColonelMustard 28 October 2013 03:33PM


Comments (94)


Comment author: ChristianKl 01 November 2013 04:52:51PM 0 points

How do you decide whether some interaction of a complex neural net is friendly or unfriendly?

It's very hard to tell what a neural net or complex algorithm is doing even if you have logs.

Comment author: [deleted] 02 November 2013 12:49:08AM * 0 points

Don't use a neural net (or variants like deep belief networks). The field has advanced quite a bit since the 1960s, and since the late 1980s there have been machine learning and knowledge representation structures that are comprehensible to humans and/or auditors, such as probabilistic graphical models. If you are using auditing as a confinement mechanism, these would have to be first-class types of the virtual machine that implements the AGI. But that's not really a restriction, as many AI techniques are already phrased in terms of these models (including Eliezer's own TDT, for example), and others have simple adaptations.
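
To make "auditor comprehensible" a bit more concrete, here is a minimal sketch of a tiny probabilistic graphical model in Python. Everything in it is an assumption made up for illustration: the Rain/Sprinkler/WetGrass network, the probabilities, and the function names (`joint`, `posterior_rain_given_wet`, `audit_trail`). It is not how any actual AGI design or auditing scheme works; it only shows that in such a model every factor is a named, readable table and every inference step can be listed term by term.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler.
# All probabilities are illustrative assumptions, not data.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}

# Conditional table: P(WetGrass=True | Rain, Sprinkler).
P_WET = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability of one full assignment, as a product of named factors."""
    p_wet = P_WET[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1.0 - p_wet)


def posterior_rain_given_wet():
    """P(Rain=True | WetGrass=True), computed by exhaustive enumeration."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return numerator / evidence


def audit_trail():
    """List every (rain, sprinkler) term that contributes to P(WetGrass=True)."""
    return [((r, s), joint(r, s, True)) for r, s in product((True, False), repeat=2)]


if __name__ == "__main__":
    for assignment, weight in audit_trail():
        print(assignment, weight)
    print("P(rain | wet grass) =", posterior_rain_given_wet())
```

An auditor reading the log from `audit_trail` can check each term against the tables by hand, which is the property that is hard to get from the weights of a neural net.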