jimmy comments on The ethics of breaking belief - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (123)
Both, but the statement is stronger for their definition.
My general approach to helping people is to clear out their fears and then let them reassemble the pieces as they see fit - sometimes suggesting possible solutions. This is more easily used to help people than to hurt them, since they are in full control of their actions and more of the game space is visible to them. I can fool them into thinking they’re helping themselves, but I’d have to include at least selective fear removal (though this can happen accidentally through your own biases!).
In contrast, using leading questions and classical conditioning works equally well regardless of which direction you’re pushing.
Hmm, have been looking through your blog a bit more... I'm wondering if you can help people complaining about akrasia by making their second-order desires first-order ones?
Yep :)
Hmm, you would probably be great playing the jailed AI in an AI boxing experiment (can you beat [someone like] EY?), but how successful would you be playing the guard?
The AI box game still seems stacked against the AI roleplayer for any similar skill level. As the AI, I don't think I could beat someone like EY or myself on the other end, and as the gatekeeper I think I would beat someone like EY or myself.
I still wouldn't consider myself secure against even human takeover in general, especially if I'm not prepared for mental assault.
Would you know what to look for?
Also, can you write an AI bot that would have a decent success rate against humans, by finding and exploiting the standard wetware bugs?
For the most part.
Not for any interesting opponent. I can't even write a 'real' chatbot. The only reason I get the results I do is that I immediately force them into a binary yes/no response and then make sure they keep saying yes.
Your link to your blog is down, but once it's back up, and if I find this claim plausible upon reading it, I would be very interested in trying this on myself.
EDIT: read the blog, and it looks awesome.