RyanCarey comments on Chatbots or set answers, not WBEs - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
As I understand it, you're trying to prevent the AI from behaving in a non-humanlike way by constraining its output. This seems to me to be a reasonable option to explore.
I agree that generating a finite set of humanlike answers (with a chatbot or otherwise) might be a sensible way to do this. An AI could perform gradient descent over the continuous solution space, then pick the nearest proposed behaviour, much as relaxation works in integer programming: solve the unconstrained problem first, then round to an allowed option.
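To make the relax-then-round idea concrete, here is a minimal sketch. Everything in it is an illustrative assumption rather than anything from the original post: behaviours are represented as 2-D points, the objective is a toy quadratic with its optimum at `TARGET`, and `CANDIDATES` stands in for the finite set of humanlike answers.

```python
def gradient_descent(grad, x0, lr=0.1, steps=200):
    """Plain gradient descent on a list of floats (the relaxed problem)."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def nearest_candidate(x, candidates):
    """Return the candidate with the smallest squared Euclidean distance to x."""
    def sq_dist(c):
        return sum((ci - xi) ** 2 for ci, xi in zip(c, x))
    return min(candidates, key=sq_dist)


# Hypothetical objective: f(x) = ||x - TARGET||^2, so grad f(x) = 2 * (x - TARGET).
TARGET = (3.2, 1.4)


def grad(x):
    return [2 * (xi - ti) for xi, ti in zip(x, TARGET)]


# Hypothetical finite set of human-suggested behaviours.
CANDIDATES = [(0.0, 0.0), (3.0, 1.0), (4.0, 2.0), (1.0, 3.0)]

relaxed = gradient_descent(grad, x0=(0.0, 0.0))  # unconstrained optimum
choice = nearest_candidate(relaxed, CANDIDATES)  # snap to an allowed option
print(relaxed, choice)
```

As with rounding a relaxed integer program, the nearest candidate is not guaranteed to be the candidate that scores best under the objective, so a real system might instead evaluate the objective on each candidate directly.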
The multiple-choice AI (with human-suggested options) is the most obvious way to avoid unhumanlike behaviour. Paul has said in some Medium comments, though, that he thinks his more elaborate approach of combining mimicry and optimisation [1] would work better.
[1] https://medium.com/ai-control/mimicry-maximization-and-meeting-halfway-c149dd23fc17
Thanks for linking me to that!