The previous open thread has now exceeded 300 comments – new Open Thread posts may be made here.
This thread is for the discussion of Less Wrong topics that have not appeared in recent posts. If a discussion gets unwieldy, celebrate by turning it into a top-level post.
I was thinking of current top chess programs as smart (well above average human players), with simple utility functions.
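For concreteness, a "simple utility function" here could be as bare as material counting. A minimal sketch (the piece values are the standard textbook ones; the board interface is hypothetical, just for illustration — real engines add many positional terms on top of this):

    # A minimal hand-coded chess utility function: material count only.
    # Positive scores favor White, negative favor Black.
    PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9, 'K': 0}

    def material_utility(board):
        # `board` is assumed to be an iterable of (piece_letter, is_white)
        # pairs -- a made-up interface for this example.
        score = 0
        for piece, is_white in board:
            value = PIECE_VALUES[piece.upper()]
            score += value if is_white else -value
        return score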
This is a good example, but it might not completely explain it away.
Can we, by hand or by algorithm, construct a utility function that does what we want, even when we know exactly what we want?
I think you could still have a situation in which a smarter agent does worse because its learned utility function does not match the winning conditions (its learned utility function would constitute a created subgoal of "maximize reward").
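A toy illustration of that mismatch (both utility tables below are invented for the example): an agent that greedily maximizes a learned proxy can systematically pick the losing action even while optimizing perfectly.

    # Toy example: a learned proxy utility diverges from the true goal.
    # The action names and values are made up for illustration.
    true_utility  = {'win_slowly': 1.0, 'grab_material': 0.0}
    learned_proxy = {'win_slowly': 0.2, 'grab_material': 0.9}  # overvalues a subgoal

    def choose(actions, utility):
        # Greedy argmax over whichever utility function the agent has.
        return max(actions, key=lambda a: utility[a])

    actions = ['win_slowly', 'grab_material']
    print(choose(actions, true_utility))   # -> 'win_slowly'
    print(choose(actions, learned_proxy))  # -> 'grab_material' (mismatch)

The better the agent is at optimizing, the more reliably it lands on the proxy's favorite action rather than ours.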
Learning about the world and constructing subgoals would probably be part of any near-human AI. I don't think we have a way to construct reliable subgoals, even with a rules-defined supergoal and perfect knowledge of the world (such a process would be a huge boon for FAI).
Likewise, I don't think we can be certain that the utility functions we create by hand would reliably lead a high-intelligence AI to seek the goal we want, even for well-defined tasks.
A smarter agent might have the advantage of learning the winning conditions faster, but if it is comparatively better at implementing a flawed utility function than it is at fixing its utility function, then it could be outpaced by stupider versions, and you're working more in an evolutionary design space.
So I think it would hit the same kind of wall, at least in some games.
I meant the AI to be limited to the formal game universe, which should be easily feasible for non-superintelligent AIs. In this case, smarter agents always have an advantage: maximization of reward is the same as the intended goal.
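In a closed formal game you can literally define the reward as the winning condition, so there is no proxy gap for learning to open up. A sketch, assuming a hypothetical `game.winner()` interface:

    # In a closed formal game, reward can equal the intended goal exactly.
    # `game.winner()` is a made-up interface, for illustration only.
    def reward(game, player):
        # 1 for a win, 0 otherwise: the reward *is* the winning condition,
        # so maximizing reward just is pursuing the intended goal.
        return 1 if game.winner() == player else 0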
Thinking deeply until you get eaten by a sabertooth is not smart.