I'm sorry but I'm still not following what learning and planning you would do. Are you attaching the oracle to some sort of reward mechanism?
The oracle itself is what gives one the ability to perform this computation: find a compact model that efficiently predicts/explains, with bounded error, the observed sensory data. (This is a rough description of the more precise version stated above.)
It also gives one the ability to efficiently perform this computation: given a generated model, determine actions that will lead to a desired outcome in a bounded number of steps, with reasonably good probability.
The ability to perform the former computation would amount to the ability to efficiently learn. The ability to ...
Many experts suspect that there is no polynomial-time solution to the so-called NP-complete problems, though no one has yet been able to rigorously prove this, and there remains the possibility that a polynomial-time algorithm will one day emerge. However unlikely this is, today I would like to invite LW to play a game I played with some colleagues called what-would-you-do-with-a-polynomial-time-solution-to-3SAT? 3SAT is, of course, one of the most famous of the NP-complete problems, and a polynomial-time solution to 3SAT would also constitute a polynomial-time solution to *all* the problems in NP. This includes lots of fun planning problems (e.g. travelling salesman) as well as the problem of performing exact inference in (general) Bayesian networks. What's the most fun you could have?
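
To make the game a bit more concrete, here is a rough Python sketch of one standard trick: turning a yes/no 3SAT decider into something that actually hands you a satisfying assignment. Everything named here is my own illustration, not part of the post: `brute_force_oracle` is just an exponential-time stand-in for the hypothetical polynomial-time solver, and the clause encoding (tuples of signed integers, where `3` means x3 and `-3` means NOT x3) is an assumed convention.

```python
from itertools import product

def satisfies(clauses, assignment):
    """Check in polynomial time that `assignment` (dict: var -> bool) satisfies every clause."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

def brute_force_oracle(clauses, num_vars):
    """Hypothetical decider: is the formula satisfiable?
    This placeholder tries all 2**num_vars assignments, so it is NOT polynomial time."""
    for bits in product([False, True], repeat=num_vars):
        assignment = dict(zip(range(1, num_vars + 1), bits))
        if satisfies(clauses, assignment):
            return True
    return False

def find_assignment(clauses, num_vars, oracle=brute_force_oracle):
    """Search-to-decision reduction: recover an assignment with num_vars + 1 oracle calls."""
    if not oracle(clauses, num_vars):
        return None
    fixed = list(clauses)
    assignment = {}
    for v in range(1, num_vars + 1):
        # Force v to True with the padded 3-literal clause (v, v, v) and ask again.
        if oracle(fixed + [(v, v, v)], num_vars):
            fixed.append((v, v, v))
            assignment[v] = True
        else:
            # The formula was satisfiable before this step, so v must be False.
            fixed.append((-v, -v, -v))
            assignment[v] = False
    return assignment

# (x1 OR x2 OR NOT x3) AND (NOT x1 OR x3) AND (NOT x2 OR NOT x3), padded to 3 literals
example = [(1, 2, -3), (-1, 3, 3), (-2, -3, -3)]
result = find_assignment(example, 3)
```

If the oracle really ran in polynomial time, so would `find_assignment`, since it only makes a linear number of calls. The same pattern (fix one choice, ask the decider whether a solution still exists, repeat) is how you would extract an actual plan or tour rather than just a yes/no answer.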