Comment author: Larifari 18 March 2011 12:18:57PM 2 points [-]

What exactly are we trying to learn from this thought experiment that we cannot already learn from the torture/dust-speck experiment?

In response to The Friendly AI Game
Comment author: Alexandros 16 March 2011 12:31:45PM *  5 points [-]

So, here's my pet theory for <1-person friendly> AI that I'd love to put out of its misery: "Don't do anything your designer wouldn't approve of". It's loosely based on the "Gandhi wouldn't take a pill that would turn him into a murderer" principle.

A possible implementation: Make an emulation of the designer and use it as an isolated component of the AI. Any plan of action has to be submitted for approval to this component before being implemented. This is nicely recursive and rejects plans such as "make a plan of action deceptively complex such that my designer will mistakenly approve it" and "modify my designer so that they approve what I want them to approve".
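A rough sketch of the approval gate described above, under heavy assumptions: the "emulated designer" is reduced to a hypothetical keyword filter (`toy_designer`), and "executing" a plan just means logging it. All names here are made up for illustration; none of this is meant as an actual design.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for the emulated designer: any function that
# inspects a plan description and returns True (approve) or False (reject).
Approver = Callable[[str], bool]


@dataclass
class GatedAgent:
    approve: Approver
    log: List[str] = field(default_factory=list)

    def act(self, plan: str) -> bool:
        """Carry out the plan only if the isolated approver accepts it."""
        if not self.approve(plan):
            return False
        self.log.append(plan)  # in this toy, "executing" is just recording
        return True


def toy_designer(plan: str) -> bool:
    # Assumption: a crude keyword filter stands in for the designer's judgment.
    forbidden = ("deceive", "modify designer")
    return not any(phrase in plan for phrase in forbidden)


agent = GatedAgent(approve=toy_designer)
results = [
    agent.act("water the plants"),
    agent.act("modify designer to approve more"),
]
```

The only point of the sketch is the control flow: nothing reaches the log without passing through `approve` first. Whether a real emulation could be kept isolated and undeceived is exactly what is up for debate.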

There could be an argument about how the designer's emulation would feel in this situation, but... torture vs. dust specks! Also, is this a corrupted version of <1-person CEV>?

Comment author: Larifari 16 March 2011 02:41:05PM 6 points [-]

If the AI is designed to follow the principle by the letter, it has to request approval from the designer even for the action of requesting approval, leaving the AI incapable of action. If the AI is designed to be able to make certain exemptions, it will figure out a way to modify the designer without needing approval for this modification.
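The first horn can be sketched as a toy regress. This assumes the most literal reading of the rule, in which "request approval" is itself an action that needs prior approval; the function name and the depth cap are hypothetical and exist only so the example terminates.

```python
def perform(action: str, depth: int = 0, max_depth: int = 50) -> str:
    """Literal rule: every action needs approval before it happens."""
    if depth >= max_depth:
        # The cap is an artifact of the sketch; the regress itself has no end.
        raise RecursionError(f"still seeking approval after {depth} nested requests")
    # Requesting approval is itself an action, so it needs approval too.
    perform("request approval", depth + 1, max_depth)
    return f"did: {action}"  # never reached under the literal rule


try:
    outcome = perform("water the plants")
except RecursionError:
    outcome = "no action taken"
```

Any base case that stops the regress is one of the "exemptions" in the second horn: an action the AI may take without asking.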

Comment author: Larifari 02 January 2011 04:53:57PM 4 points [-]

Do we actually know that our discounting function is hyperbolic in the range below 5 minutes? Or is that just extrapolation from experiments done on longer intervals?
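For reference, a small sketch of what the two candidate shapes predict at short delays. The functional forms are the standard ones (hyperbolic 1/(1 + k·d), exponential e^(−r·d)); the parameter values and the choice of minutes as the unit are arbitrary assumptions for illustration, not fitted to any data.

```python
import math


def hyperbolic(delay: float, k: float = 0.1) -> float:
    """Discount factor 1 / (1 + k * delay); k is an illustrative value."""
    return 1.0 / (1.0 + k * delay)


def exponential(delay: float, r: float = 0.1) -> float:
    """Discount factor exp(-r * delay); r is an illustrative value."""
    return math.exp(-r * delay)


def extra_wait_ratio(discount, start: float, extra: float = 5.0) -> float:
    """How much value survives an extra wait that begins at `start`."""
    return discount(start + extra) / discount(start)


# Cost of 5 more minutes, starting now versus starting an hour from now.
hyp_now = extra_wait_ratio(hyperbolic, 0.0)
hyp_later = extra_wait_ratio(hyperbolic, 60.0)
exp_now = extra_wait_ratio(exponential, 0.0)
exp_later = extra_wait_ratio(exponential, 60.0)
```

Under the exponential form the two ratios are identical; under the hyperbolic form the immediate five minutes cost much more than the later five. An experiment on sub-5-minute intervals would have to detect that asymmetry directly rather than infer it from longer delays.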

In response to Reference Points
Comment author: Larifari 17 November 2010 09:18:56AM 2 points [-]

Referencing long-term consequences could also be viewed as having empathy with one's future self. Instead of thinking "What do I care about the me of tomorrow?", one creates the impression/illusion of a continuous personality. Maybe empathy for others even evolved piggybacking on empathy for future versions of oneself.
