DanielLC comments on Approval-directed agents - Less Wrong Discussion
I feel like if you give the AI enough freedom for its intelligence to be helpful, you'd have the same pitfalls as having the AI pick a goal you'd approve of. I also feel like it's not clear exactly which decisions you'd oversee. What if the AI convinces you that its actions are fine because you'd approve of its method of choosing them, and that its method is fine because you'd approve of each individual action?