Comments

AILΩ330

Okay, I think this makes sense. The idea is to reinterpret the various functions in the utility function as a single function, and then to ask about a notion of complexity for that function that combines the complexity of producing a circuit which computes the function with the complexity of the circuit itself.

But just to check: is T over  ? I thought T in utility functions only depended on states and actions 

Maybe I am confused by what you mean by . I thought it was the state space, but that isn't consistent with  in your post which was defined over ? As a follow-up: defining r as depending on actions and observations instead of actions and states (as in, e.g., the POMDP definition on Wikipedia) seems like it changes things. So I'm not sure whether you intended the rewards to correspond to the observations or to the 'underlying' states.

One more question, this one about the priors: what are they a prior over exactly? I will use the letters/terms from https://en.wikipedia.org/wiki/Partially_observable_Markov_decision_process to try to be explicit. Is the prior capturing the "set of conditional observation probabilities" (O on Wikipedia)? Or is it capturing the "set of conditional transition probabilities between states" (T on Wikipedia)? Or is it capturing a distribution over all possible T and O? Or are you imagining that T is defined with U (and is non-random) and O is defined within the prior?
I ask because the term  will be positive infinity if  is zero for any value where  is non-zero, which makes the interpretation that it is either O or T directly pretty strange (for example, in the case where there are two states  and  and two observations  and  an O where  and  if  would have a KL divergence of infinity from the  if  had non-zero probability on ). So, I assume this is a prior over what the conditional observation matrices might be. I am assuming that your comment above implies that T is defined in the utility function U instead, and is deterministic?
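
To make the infinity concern concrete, here is a minimal sketch of the general fact I am relying on. It is not the post's actual term: the function name `kl_divergence`, the two example distributions, and the argument order are my own assumptions, chosen only to show that KL(p || q) blows up as soon as q puts zero mass on an outcome where p does not.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions given as equal-length lists."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # by convention, 0 * log(0 / q_i) contributes 0
        if q_i == 0.0:
            return math.inf  # p has mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total

# Hypothetical numbers: a deterministic observation distribution
# versus one that spreads mass over both observations.
deterministic = [1.0, 0.0]
spread = [0.5, 0.5]

kl_spread_vs_deterministic = kl_divergence(spread, deterministic)
kl_deterministic_vs_spread = kl_divergence(deterministic, spread)
```

With these made-up numbers the divergence is infinite in one direction and finite (log 2) in the other, which is why it matters what exactly the prior is a distribution over.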

AILΩ230

I am not sure I understand your use of  in the third-from-last paragraph, where you define goal-directed intelligence. As you define  it is a complexity measure over programs . I assume this was a typo and you mean ? Or am I misunderstanding the definition of either  or ?

AILΩ010

Is there a place to look for papers and posts people have already submitted (and want to be public)?