The classic is Parfit's hitch-hiker, where an agent capable of accurately predicting the AI's actions offers to give it something if and only if the AI will perform some specific action in future. A causal AI might be tempted to modify itself to desire that specific action, while a timeless AI will simply do the thing anyway without needing to self-modify.
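A toy sketch of the scenario in Python may make the difference concrete. Everything in it is an illustrative assumption and not part of the original argument: the payoff numbers, the names `causal_policy`, `timeless_policy` and `play`, and the choice to model the "accurate predictor" as simply running the agent's own policy ahead of time.

```python
from typing import Callable

# A policy answers one question: "once the reward is already in hand,
# do I still perform the promised action?"
Policy = Callable[[], bool]

REWARD = 100  # hypothetical value of what the predictor offers
COST = 10     # hypothetical cost of performing the promised action


def causal_policy() -> bool:
    # At decision time the reward has already been given (or withheld),
    # so paying the cost cannot cause any further benefit: refuse.
    return False


def timeless_policy() -> bool:
    # Acts as the kind of agent that performs the action, reasoning that
    # the predictor's offer depends on this very disposition.
    return True


def play(policy: Policy) -> int:
    # Assumption: a perfect predictor is modeled by simulating the policy.
    predicted_to_act = policy()
    payoff = 0
    if predicted_to_act:
        payoff += REWARD      # predictor gives the reward up front
        if policy():          # later, the agent actually decides
            payoff -= COST
    return payoff


if __name__ == "__main__":
    print("causal:", play(causal_policy))
    print("timeless:", play(timeless_policy))
```

Under these assumptions the causal agent is never offered the reward, while the timeless agent receives it and pays the cost. A causal agent that self-modifies beforehand is, in effect, replacing `causal_policy` with `timeless_policy`.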
As for your second problem, Yudkowsky himself explains much better than I could why self-modification is important in the 3rd question of this interview.
Roughly, the importance is that there are only two kinds of truly catastrophic mistakes an AI could make: mistakes that wipe out the whole planet in one shot, and errors in modifying its own code. Everything else can be recovered from.
The classic is Parfit's hitch-hiker, where an agent capable of accurately predicting the AI's actions offers to give it something if and only if the AI will perform some specific action in future. A causal AI might be tempted to modify itself to desire that specific action, while a timeless AI will simply do the thing anyway without needing to self-modify.
That works if the AI knows that the other agent will keep its promise, and the other agent knows what the AI will do in the future. In particular the AI has to know the other agent is going to successf...
I don't know if this is a little too far afield for even a Discussion post, but people seemed to enjoy my previous articles (Girl Scouts financial filings, video game console insurance, philosophy of identity/abortion, & prediction market fees), so...
I recently wrote up an idea that has been bouncing around my head ever since I watched Death Note years ago - can we quantify Light Yagami's mistakes? Which mistake was the greatest? How could one do better? We can shed some light on the matter by examining DN with... basic information theory.
Presented for LessWrong's consideration: Death Note & Anonymity.