eli_sennesh comments on A toy model of the treacherous turn - Less Wrong

13 Post author: Stuart_Armstrong 08 January 2016 12:58PM

Comments (13)

Comment author: [deleted] 10 January 2016 08:13:08PM 2 points

Can we just build a Link to the Past minigame that actually models this with real, running code, and then post a bunch of YouTube videos of Link trying naively to kill Sahasrahla?

Comment author: aphyer 13 February 2016 06:42:49AM 1 point

Besides the obvious benefit of being awesome, I think there could be a more serious benefit to this. One extreme failure mode when imagining the behavior of an AI is not merely to fail to imagine it as being superintelligent but to imagine it as being less intelligent than yourself, as not doing things you could think of (a la That Alien Message). A game that consisted of you, the player, needing to come up with increasingly complicated ways to trick these 'shopkeeper' agents could illustrate this pretty neatly.

Comment author: Stuart_Armstrong 12 January 2016 10:32:52AM 0 points

PS: Were you offering to do or partially do such a project?

Comment author: [deleted] 12 January 2016 03:22:22PM 2 points

I would totally contribute to such a project, although we should coordinate what sort of language and reasoning techniques we're using first. Reinforcement learning is actually a reasonably involved thing to code, after all.

Comment author: Stuart_Armstrong 13 January 2016 10:37:51AM * 0 points

Would you mind if I put you in contact with Jaan Tallinn on this issue?

PS: PM me your email if so

Comment author: [deleted] 12 January 2016 03:23:38PM 0 points

I could only contribute, not write the whole thing, though, since I've basically got stuff on my plate at all times: LaTeX fix for conference paper, actually arranging travel to conference, gym, social life, structure-learning project, studying, etc.

Comment author: Stuart_Armstrong 13 January 2016 10:37:11AM -1 points

Social lives are for the weak! ;-)

Comment author: [deleted] 18 January 2016 08:17:53PM 1 point

That's a sick statement.

Comment author: Stuart_Armstrong 11 January 2016 11:23:36AM 0 points

Fun! Do it if you can, but the model needs to be further clarified first, I think.