All of EulersApprentice's Comments + Replies

I should clarify that the discounting is not a shackle, per se, but a specification of the utility function. It's a normative specification that results now are better than results later according to a certain discount rate. An AI that cares about results now will not change itself to be more "patient" – because then it will not get results now, which is what it cares about.

The key is that the utility function's weights over time should form a self-similar graph. That is, if results in 10 seconds are twice as valuable as results in 20 s...

1 TheWakalix
Wait, but isn't the exponential curve self-similar in that way, not the hyperbolic curve? I notice that I am confused. (Edit to clarify: I'm the only one who said hyperbolic; this is entirely my own confusion.)

Justification: waiting x seconds at time a should result in the same discount ratio as waiting x seconds at time b. If f(x) is the discounting function, this is equivalent to saying that f(a+x)/f(a) = f(b+x)/f(b). If we let f(x) = e^(−x), then this holds: e^(−(a+x))/e^(−a) = e^(−x) = e^(−(b+x))/e^(−b). But if f(x) = 1/x, then a/(a+x) ≠ b/(b+x) unless a = b. (To see why, just cross-multiply.)

It turns out that I noticed a real thing: "Although exponential discounting has been widely used in economics, a large body of evidence suggests that it does not explain people's choices. People choose as if they discount future rewards at a greater rate when the delay occurs sooner in time." Hyperbolic discounting is, in fact, irrational as you describe, in the sense that an otherwise rational agent will self-modify away from it. "People [...] seem to show inconsistencies in their choices over time."

(By the way, thanks for making the key mathematical idea of discounting clear.) (That last quote is also amusing: dry understatement.)
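The time-consistency condition above is easy to check numerically. A minimal sketch (using 1/(1+t) as the hyperbolic form to avoid division by zero at t = 0; the conclusion is the same as for 1/t):

```python
import math

def ratio(f, start, delay):
    """Discount ratio incurred by waiting `delay` starting at time `start`."""
    return f(start + delay) / f(start)

exp_f = lambda t: math.exp(-t)   # exponential discounting
hyp_f = lambda t: 1 / (1 + t)    # hyperbolic discounting

x = 10.0  # the delay being evaluated

# Exponential: the ratio is e^(-x) regardless of the start time.
print(math.isclose(ratio(exp_f, 0.0, x), ratio(exp_f, 5.0, x)))  # True

# Hyperbolic: the ratio depends on the start time, so the agent's
# preferences over the same delay shift as time passes.
print(math.isclose(ratio(hyp_f, 0.0, x), ratio(hyp_f, 5.0, x)))  # False
```

This is exactly the sense in which hyperbolic discounting is dynamically inconsistent: the same 10-second wait is discounted by a factor of about 0.09 at time 0 but only about 0.38 at time 5.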

I'd be fine with it throwing a brick at me. It beats it having the patience to take over the entire world. The point is, if it throws a brick at me, I have data on what went wrong with its utility function and I have a lead on how to fix it.

9 Vladimir_Nesov
It could throw a paperclip maximizer at you.

Here's my attempt at solving the puzzle you provide – I believe the following procedure will yield a list of approximate values for the E. coli bacterium. (It'd take a research team and several years, but in principle it is possible.)

...

The only example I can think of is with parents and their children. Evolutionarily, parents are optimized to maximize the odds that their children will survive to reproduce, up to and including self-sacrifice to that end. However, parents do not possess ideal information about the current state of their child, so they must undergo a process resembling value alignment to learn what their children need.

At that point I think we're running the risk of passing the buck forever. (Unless we can prove that the process terminates.)

I am inclined to believe that indeed the buck will get passed forever. This idea you raise is remarkably similar to the Procrastination Paradox (which you can read about at https://intelligence.org/files/ProcrastinationParadox.pdf).