DaveX comments on The self-unfooling problem - Less Wrong
I'm confused about the "hide" part of the initial task, or the "fooling" that needs to be unfooled. The objective function rewards ineffective fooling.
It seems you simply mean "store" it in a way that lets you find it again.
Congrats, you got the joke!