I'd be fine with it throwing a brick at me. That beats it having the patience to take over the entire world. The point is, if it throws a brick at me, I have data on what went wrong with its utility function and I have a lead on how to fix it.
Here's my attempt at solving the puzzle you provided: I believe the following procedure will yield a list of approximate values for the E. coli bacterium. (It'd take a research team and several years, but in principle it is possible.)
The only example I can think of is with parents and their children. Evolutionarily, parents are optimized to maximize the odds that their children will survive to reproduce, up to and including self-sacrifice to that end. However, parents do not possess ideal information about the current state of their child, so they must undergo a process resembling value alignment to learn what their children need.
At that point I think we’re running the risk of passing the buck forever. (Unless we can prove that the process terminates.)
I am inclined to believe that indeed the buck will get passed forever. This idea you raise is remarkably similar to the Procrastination Paradox (which you can read about at https://intelligence.org/files/ProcrastinationParadox.pdf).
I should clarify that the discounting is not a shackle, per se, but a specification of the utility function. It's a normative specification that results now are better than results later according to a certain discount rate. An AI that cares about results now will not change itself to be more "patient" – because then it will not get results now, which is what it cares about.
The key is that the utility function's weights over time should form a self-similar graph. That is, if results in 10 seconds are twice as valuable as results in 20 s...
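One way to read "self-similar" here is exponential discounting, where the ratio between the weight at time t and the weight a fixed delay later does not depend on t. The sketch below is only an illustration under that reading: the 10-second half-life, the function names, and the hyperbolic curve used for contrast are all assumptions, not something the comment specifies.

```python
import math

def exponential_weight(t, half_life=10.0):
    # Assumed form: the weight halves every `half_life` seconds.
    return 0.5 ** (t / half_life)

def hyperbolic_weight(t, k=0.1):
    # Contrast case (an assumption for illustration): not self-similar.
    return 1.0 / (1.0 + k * t)

def ratio(weight, t, delay=10.0):
    # How much more a result at time t is worth than one `delay` seconds later.
    return weight(t) / weight(t + delay)

times = (0, 10, 20, 100)
exp_ratios = [ratio(exponential_weight, t) for t in times]
hyp_ratios = [ratio(hyperbolic_weight, t) for t in times]

print("exponential:", exp_ratios)  # the same ratio at every starting time
print("hyperbolic: ", hyp_ratios)  # the ratio shrinks as t grows
```

With the exponential curve, an agent evaluating the same trade-off at a later time sees the same relative preference, so on this reading it has no incentive to rewrite its own discount rate. The hyperbolic curve is included only to show what a non-self-similar weighting looks like.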