jacob_cannell comments on The Brain as a Universal Learning Machine - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (166)
About the universality or otherwise of RL. Big topic.
There's no need to taboo "RL" because switching to utility-based learning does not solve the issue (and the issue I have in mind covers both).
See, this is the problem. It is hard for me to fight the idea that RL (or utility-driven learning) works, because I am forced to fight a negative: a space where something should be, but which is empty. Namely, the empirical fact that Reinforcement Learning has never been made to work in the absence of some surrounding machinery that prepares or simplifies the ground for the RL mechanism.
It is a naked fact about traditional AI that it puts such an emphasis on the concept of expected utility calculations without any guarantee that a utility function can be laid on the world in such a way that all and only the intelligent actions in that world are captured by a maximization of that quantity. It is a scandalously unjustified assumption, made very hard to attack by the fact that it is repeated so frequently that everyone believes it to be true just because everyone else believes it.
If anyone ever produced a proof of why it should work, there would be a there there, and I could undermine it. But ... not so much!
About AIXI and my conversation with Marcus: that was actually about the general concept of RL and utility-driven systems, not anything specific to AIXI. We circled around until we reached the final crux of the matter, and his last stand (before we went to the conference banquet) was "Yes, it all comes down to whether you believe in the intrinsic reasonableness of the idea that there exists a utility function which, when maximized, yields intelligent behavior ... but that IS reasonable ... isn't it?"
My response was "So you do agree that that is where the buck stops: I have to buy the reasonableness of that idea, and there is no proof on the table for why I SHOULD buy it, no?"
Hutter: "Yes."
Me: "No matter how reasonable it seems, I don't buy it."
His answer was to laugh and spread his arms wide. And at that point we went to the dinner and changed to small talk. :-)
Since the utility function is approximated anyway, it becomes an abstract concept, especially in the case of evolved brains. For an evolved creature, the evolutionary utility function can be linked to long-term reproductive fitness, and the value function can then be defined appropriately.
For a designed agent, it's a useful abstraction. We can conceptually rate all possible futures, and then roughly use that to define a value function that optimizes towards that goal.
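To make "rate the futures, then derive a value function" concrete, here is a minimal sketch in Python. Everything in it is a made-up toy: the states, the actions, and the utility numbers are assumptions for illustration only, and the world is deterministic and tiny, which is exactly the kind of simplification real agents don't get.

```python
# Hypothetical toy world (an assumption for illustration):
# each non-terminal state maps its available actions to the next state.
TRANSITIONS = {
    "start": {"left": "a", "right": "b"},
    "a": {"stay": "a_end", "go": "b_end"},
    "b": {"stay": "b_end", "go": "c_end"},
}

# Hypothetical ratings of the possible futures (terminal states).
# This table plays the role of the utility function.
UTILITY = {"a_end": 1.0, "b_end": 3.0, "c_end": 5.0}


def value(state):
    """Value of a state: the utility of the best future reachable from it."""
    if state in UTILITY:
        return UTILITY[state]
    return max(value(nxt) for nxt in TRANSITIONS[state].values())


def best_action(state):
    """Pick the action whose successor state has the highest value."""
    options = TRANSITIONS[state]
    return max(options, key=lambda action: value(options[action]))


print(value("start"))
print(best_action("start"))
print(best_action("a"))
```

The value function here is computed exactly by searching every future. In anything realistic that search is intractable, so the value function has to be approximated, which is the sense of "roughly" above.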
It's really just a mathematical abstraction of the notion of X is better than Y. It's not worth arguing about. It's also proven in the real world - agents based on utility formalizations work. Well.
It certainly is worth discussing, and I'm sorry but you are not correct that "agents based on utility formalizations work. Well."
That topic came up at the AAAI symposium I attended last year. Specifically, we had several people there who built real-world (as opposed to academic, toy) AI systems. Utility based systems are generally not used, except as a small component of a larger mechanism.
Pretty much all of the recent ML systems are based on a utility function framework in a sense: they are trained to optimize an objective function. In terms of RL in particular, DeepMind's Atari agent works pretty well, and builds on a history of successful practical RL agents that are all trained to optimize a 'utility function'.
That said, for complex AGI, we probably need something more complex than current utility function frameworks - in the sense that you can't reduce utility to an external reward score. The brain doesn't appear to have a simple VNM single-axis utility concept, which is some indication that we may eventually drop that notion for complex AI. My conception of 'utility function' is loose, and could include whatever it is the brain is doing.