All of AIL's Comments + Replies

AIL

Okay, I think this makes sense. The idea is to reinterpret the various functions in the utility function as a single function, and to ask about a notion of complexity for that function that combines the complexity of producing a circuit that computes it with the complexity of the circuit itself.

But just to check: is T over  ? I thought T in utility functions only depended on states and actions 

Maybe I am confused by what you mean by . I thought it was the state space, but that isn't consistent wit...

Vanessa Kosoy
I'm not entirely sure what you mean by the state space. $S$ is a state space associated specifically with the utility function. It has nothing to do with the state space of the environment. The reward function in the OP is $(A \times O)^* \to \mathbb{R}$, not $A \times O \to \mathbb{R}$. I slightly abused notation by defining $r: S \to \mathbb{Q}$ in the parent comment. Let's say it's $r': S \to \mathbb{Q}$, and $r$ is defined by using $T$ to translate the history to the (last) state and then applying $r'$. The prior is just an environment, i.e. a partial mapping $\zeta: (A \times O)^* \to \Delta O$ defined on every history to which it doesn't itself assign probability 0. The expression $D_{\mathrm{KL}}(\xi \| \zeta)$ means that we consider all possible ways to choose a Polish space $X$, probability distributions $\mu, \nu \in \Delta X$ and a mapping $f: X \times (A \times O)^* \to \Delta O$ s.t. $\zeta = \mathbb{E}_\mu[f]$ and $\xi = \mathbb{E}_\nu[f]$ (where the expected value is defined using the Bayes law and not pointwise, see also the definition of "instrumental states" here), and take the minimum over all of them of $D_{\mathrm{KL}}(\nu \| \mu)$.
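As a rough illustration of the translation step described above (use $T$ to fold a history into its last state, then apply $r'$), here is a minimal Python sketch. The initial state `s0`, the toy automaton, and every function name are hypothetical and chosen only for illustration; the comment itself does not specify an initial state.

```python
from functools import reduce

def last_state(T, s0, history):
    """Fold the transition function T over a finite history of (action, observation) pairs."""
    return reduce(lambda s, ao: T(s, ao[0], ao[1]), history, s0)

def make_history_reward(T, r_prime, s0):
    """Build r: (A x O)* -> Q from a state reward r_prime: S -> Q (assumed construction)."""
    return lambda history: r_prime(last_state(T, s0, history))

# Hypothetical toy automaton: the state counts how many observations equal 1.
def toy_T(s, a, o):
    return s + o

# Hypothetical state reward: parity of the count.
def toy_r_prime(s):
    return s % 2

r = make_history_reward(toy_T, toy_r_prime, s0=0)
reward_one_step = r([(0, 1)])
reward_three_steps = r([(0, 1), (1, 1), (0, 0)])
reward_empty = r([])
```

Here `r` takes a whole history, while `toy_r_prime` only ever sees a state, which is the distinction between $r$ and $r'$ drawn in the comment.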
AIL

I am not sure I understand your use of  in the third from last paragraph where you define goal directed intelligence. As you define  it is a complexity measure over programs . I assume this was a typo and you mean ? Or am I misunderstanding the definition of either  or ?

Vanessa Kosoy
This is not a typo. I'm imagining that we have a program $P$ that outputs (i) a time discount parameter $\gamma \in \mathbb{Q} \cap [0,1)$, (ii) a circuit for the transition kernel of an automaton $T: S \times A \times O \to S$ and (iii) a circuit for a reward function $r: S \to \mathbb{Q}$ (ii and iii are allowed to share a component to reduce computation time complexity). The utility function is $U: (A \times O)^\omega \to \mathbb{R}$ defined by $U(x) := (1-\gamma) \sum_{n=0}^{\infty} \gamma^n r(s^x_n)$, where $s^x \in S^\omega$ is defined recursively by $s^x_{n+1} = T(s^x_n, x_n)$.
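To make the definition concrete, here is a minimal Python sketch that evaluates a finite truncation of the discounted sum with exact rational arithmetic. It is an illustration under assumptions, not the construction from the post: the initial state `s0`, the toy automaton, and all names are hypothetical, and the infinite sum is cut off at the length of the supplied history prefix.

```python
from fractions import Fraction

def truncated_utility(gamma, T, r, s0, history):
    """Return (1 - gamma) * sum over n < len(history) of gamma^n * r(s_n),
    where s_0 = s0 (an assumed initial state) and s_{n+1} = T(s_n, a_n, o_n)."""
    total = Fraction(0)
    weight = Fraction(1)  # holds gamma^n
    s = s0
    for a, o in history:
        total += weight * r(s)
        s = T(s, a, o)
        weight *= gamma
    return (1 - gamma) * total

# Hypothetical toy automaton: the state is simply the last observation.
def toy_T(s, a, o):
    return o

# Hypothetical reward: the state itself, as a rational number.
def toy_r(s):
    return Fraction(s)

u = truncated_utility(Fraction(1, 2), toy_T, toy_r, 0, [(0, 1), (0, 1), (0, 0)])
```

With $\gamma = 1/2$ the visited states are 0, 1, 1, so the truncated sum is $0 + 1/2 + 1/4 = 3/4$ and `u` is $3/8$. If $|r|$ is bounded by some $R$, the part of $U$ dropped after $N$ steps is at most $\gamma^N R$ in absolute value.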
AIL

Is there a place to look for papers and posts people have already submitted (and want to be public)?

Vanessa Kosoy
We have received no submissions so far, but I think that such submissions will appear here in the "prize claims" section.