
Dagon comments on Continually-adjusted discounted preferences - Less Wrong Discussion

Post author: Stuart_Armstrong 06 March 2015 04:03PM


Comment author: Dagon 07 March 2015 11:06:50AM 0 points

Time travel is about the worst possible example for discussing discount rates and future preferences. Your statements about what you want from an agent with respect to past, current, and future desires pretty much collapse if time travel exists, along with the commonsense definitions of the words "past", "current", and "future".

Additionally, 0.1% is way too high for the probability that significant agent-level time travel exists in our universe. Like hundreds (or more) of orders of magnitude too high. It's quite correct for me to say the probability I assign to it is 0%, since that's what it is to any reasonable rounding precision.

I'd like to hear more about how you think discounting should work in a rational agent, on more conventional topics than time travel.

I tend to think of utility as purely an instantaneous decision-making construct. For me, it's non-comparable across agents AND across time for an agent (because I don't have a good theory of agent identity over time, and because it's not necessary for decision-making). For me, utility is purely the evaluation of the potential future gameboard (universe) conditional on a choice under consideration.

Utility can't be stored, and gets re-evaluated for each decision. Memory and expectation, of course, are stored and carried forward, but that's not utility; that's universe state.

Discounting works by the agent counting on less utility for rewards that come further from the decision/evaluation point. I think it's strictly a heuristic - useful for estimating uncertainty about the future state of the agent (and the rest of the universe) when the agent can't calculate very precisely.

In any case, I'm pretty sure discounting is about the amount of utility for a given future material gain, not about the amount of utility over time.
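To make that concrete, here's a minimal sketch of exponential discounting applied to the utility of a future material gain (the 5% annual rate and the reward sizes are invented for illustration, not anything from the discussion):

```python
# Sketch: discounting as an evaluation-time heuristic. A reward of
# fixed material size is credited less utility the further it sits
# from the decision point. Rate and values are illustrative.

def discounted_utility(reward, years_away, annual_rate=0.05):
    """Utility credited now for a reward `years_away` years in the future."""
    return reward * (1 - annual_rate) ** years_away

print(discounted_utility(100, 0))   # reward now: 100.0
print(discounted_utility(100, 10))  # ten years out: ~59.9
```

Note that the discounting multiplies the utility assigned to the reward; the reward itself (the universe state) is unchanged.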

It's also my belief that self-modifying rational agents will correct their discounting pretty rapidly for cases where it doesn't optimize their goal achievement. Even in humans, you see this routinely: it only takes a little education for most investors to increase their time horizons (i.e. reduce their discount rate for money) by 10-100 times.
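One way to see the size of that effect: an exponential discounter's effective time horizon scales inversely with its discount rate, so cutting the rate by 10-100x lengthens the horizon by roughly the same factor. A tiny sketch (the rates are made up for illustration):

```python
# Illustrative: the "time horizon" of an exponential discounter,
# taken here as the years until a reward's weight falls to 1/e of
# its present value. For small rates, -1/ln(1 - r) is close to 1/r.

def horizon_years(annual_rate):
    """Approximate 1/e time horizon for a given annual discount rate."""
    return 1 / annual_rate

print(horizon_years(0.20))   # impatient investor: about 5 years
print(horizon_years(0.002))  # rate cut 100x: about 500 years
```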

Comment author: Stuart_Armstrong 09 March 2015 11:57:14AM 0 points

Additionally, 0.1% is way too high for the probability that significant agent-level time-travel exists in our universe.

The one person I asked - Anders Sandberg - gave 1% as his first estimate. But for most low probabilities, exponential shrinkage will eventually chew up the difference. 100 orders of magnitude - what's that, an extra 10,000 years?
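The back-of-the-envelope behind that figure: at a constant annual discount rate, the number of years needed to shrink a weight by a given number of orders of magnitude is a simple logarithm. A sketch, assuming an illustrative 2% annual rate (not a rate anyone in the thread proposed):

```python
import math

# Rough check: at a modest annual discount rate, how many years does
# exponential discounting take to shrink a weight by 100 orders of
# magnitude? (The 2% rate is illustrative.)

def years_to_shrink(orders_of_magnitude, annual_rate):
    """Years until (1 - annual_rate)**years = 10**-orders_of_magnitude."""
    return orders_of_magnitude * math.log(10) / -math.log(1 - annual_rate)

print(round(years_to_shrink(100, 0.02)))  # roughly 11,400 years
```

So at rates of a few percent per year, 100 orders of magnitude do indeed cost on the order of an extra 10,000 years.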

Comment author: Stuart_Armstrong 09 March 2015 11:54:05AM 0 points

I'd like to hear more about how you think discounting should work in a rational agent, on more conventional topics than time travel.

I don't think discounting should be used at all; instead, relevant facts about the past and future (eg expected future wealth) should be used to get discount-like effects.

However, there are certain agent designs (AIXI, unbounded utility maximisers, etc...) that might need discounting as a practical tool. In those cases, adding this hack could allow them to discount while reducing the negative effects.

Utility can't be stored, and gets re-evaluated for each decision.

Depends. Utility that sums (eg total hedonistic utilitarianism, a reward-agent made into a utility maximiser, etc...) does accumulate. Some other variants have utility that accumulates non-linearly, and many non-accumulating utilities might have an accumulating component.
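The distinction can be sketched in a few lines (the per-period values and the history are invented purely for illustration):

```python
# Sketch: a summing utility (e.g. total hedonistic utilitarianism, or
# a reward-agent rebuilt as a utility maximiser) accumulates over the
# whole history, while a purely instantaneous utility only scores the
# current state. Values are made up for illustration.

history = [3, 5, 2, 4]  # per-period rewards/hedons over four periods

total_utility = sum(history)         # accumulates across the history: 14
instantaneous_utility = history[-1]  # scores the current state only: 4
print(total_utility, instantaneous_utility)
```

A non-linearly accumulating variant would replace `sum` with some concave or otherwise non-additive function of the history.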