I use the same assumption, with one slight caveat: I keep the world program and the preference function separate for clarity's sake. You need both, but I often see them combined into one function.
The big difference, I think, is that the way I do it, the world program doesn't get a decision theory directly as input. Instead, the world program's source is given to the decision theory, the decision theory outputs a strategy, and then the strategy is given to the world program as input. This is a better match for how we normally talk about decision theory problems, and it prevents a lot of shenanigans.
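To make that data flow concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names `world_program`, `preference`, `decision_theory` and `CANDIDATES` are hypothetical, the toy world is made up, the "decision theory" is plain brute-force search over a finite list of strategies, and a function object stands in for the world program's source code.

```python
from typing import Callable, Dict, Iterable

Strategy = Dict[str, str]            # maps an observation to an action
World = Callable[[Strategy], str]    # maps a strategy to an outcome
Preference = Callable[[str], float]  # maps an outcome to a utility


def world_program(strategy: Strategy) -> str:
    """Toy world (hypothetical): the agent observes "offer" and acts on it."""
    action = strategy["offer"]
    return "rich" if action == "accept" else "poor"


def preference(outcome: str) -> float:
    """Preference function, kept separate from the world program."""
    return {"rich": 1.0, "poor": 0.0}[outcome]


def decision_theory(world: World, pref: Preference,
                    candidates: Iterable[Strategy]) -> Strategy:
    """Stand-in decision theory: it is handed the world program (here a
    function object in place of source code) and returns the candidate
    strategy whose outcome the preference function rates highest."""
    return max(candidates, key=lambda s: pref(world(s)))


# Assumed finite strategy space for the toy world.
CANDIDATES = [{"offer": "accept"}, {"offer": "decline"}]

# Step 1: the world program goes to the decision theory, which outputs a strategy.
chosen = decision_theory(world_program, preference, CANDIDATES)
# Step 2: the strategy goes to the world program as input.
utility = preference(world_program(chosen))
```

Note that the world program never sees `decision_theory` itself, only the strategy it produced.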
Of course the world program shouldn't get the decision theory as input! In the formulation I always use, the world program doesn't have any inputs: it's a computation with no arguments that returns a utility value. You live in the world, so the world program contains your decision theory as a subroutine :-)
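A rough sketch of what a no-argument world program could look like, in Python. The Newcomb-like setup, the payoffs, and the names `agent` and `world` are all assumptions picked for illustration; this is not the ASP formalization, and the agent here is a hard-coded stub rather than a real decision theory.

```python
def agent() -> str:
    """Hypothetical decision procedure, hard-coded for this sketch."""
    return "one-box"


def world() -> int:
    """World program with no arguments that returns a utility value.

    The agent is not passed in; the world calls it as a subroutine,
    once inside the predictor and once for the actual choice.
    """
    prediction = agent()  # the predictor runs the agent's code
    box_b = 1_000_000 if prediction == "one-box" else 0  # assumed payoff
    action = agent()      # the agent's actual choice
    return box_b if action == "one-box" else box_b + 1_000
```

With this stub agent, `world()` evaluates to 1000000. Swapping the body of `agent` changes both the prediction and the action at once, which is the point of having the agent live inside the world.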
Some people on LW have expressed interest in what's happening on the decision-theory-workshop mailing list. Here's an example of the kind of work we're trying to do there.
In April 2010 Gary Drescher proposed the "Agent simulates predictor" problem, or ASP, that shows how agents with lots of computational power sometimes fare worse than agents with limited resources. I'm posting it here with his permission:
About a month ago I came up with a way to formalize the problem, along the lines of my other formalizations:
Also Wei Dai has a tentative new decision theory that solves the problem, but this margin (and my brain) is too small to contain it :-)
Can LW generate the kind of insights needed to make progress on problems like ASP? Or should we keep working as a small clique?