As a non-native English speaker, I was surprised to learn that "self-conscious" normally means "shy", "embarrassed", or "uncomfortable". I blame LessWrong for giving me the wrong idea of this word's meaning.
If there are side effects that someone can observe, then the virtual machine is potentially escapable.
An unfriendly AI might not have the goal of getting out. A psycho who would prefer a dead person to a live one, and who would prefer to stay in a locked room rather than get out, is not particularly friendly.
Since you would eventually let out an AI that hasn't halted after some finite amount of time, I see no reason why an unfriendly AI would halt instead of waiting for you to believe it is friendly.
I'm curious what the "ejector seats" are that you mention in this post and in the Day 1 post, which can help with time sinks and planning. While the other concepts seem familiar, I don't think I had heard of ejector seats before. I can guess that they are something like TAPs with the action of "abandoning the current project/activity". Looking forward to your Day 10 post on planning, which will hopefully include an in-depth explanation and best practices for building those.
Thanks for the sequence focusing on instrumental everyday rationality.
Suppose there is a prediction market for a question like:
"If Alice is elected president, will GDP grow more than 20% by the end of the next 4 years?"
Current bets are 10 to 1 against Alice succeeding if elected. I strongly disagree, so I would like to bet $5000 on Alice and win a lot of money. Alice does not end up being elected, the prediction market probably being largely responsible for this outcome. So, the outc...
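To make the stakes concrete, here is a minimal sketch of the payoff arithmetic, assuming the common convention that a conditional market resolves N/A (stake refunded) when the condition, Alice being elected, does not occur. The function name and exact resolution rule are my own illustration, not something specified in the comment.

```python
def conditional_bet_payout(stake, odds_against, elected, gdp_grew):
    """Net profit for a YES bet at `odds_against`-to-1 odds on the
    conditional question "if elected, will GDP grow more than 20%?"."""
    if not elected:
        # Condition never occurs: market resolves N/A, stake is returned.
        return 0
    if gdp_grew:
        # YES wins: profit is odds_against times the stake.
        return odds_against * stake
    # YES loses: the stake is forfeited.
    return -stake

# At 10-to-1 against, a $5000 YES bet profits $50,000 only if Alice is
# elected AND GDP grows; if she is never elected, the bet simply unwinds.
print(conditional_bet_payout(5000, 10, elected=True, gdp_grew=True))    # 50000
print(conditional_bet_payout(5000, 10, elected=False, gdp_grew=False))  # 0
```

This is why the truncated objection has bite: if the market itself influences whether Alice is elected, the bettor may never get the chance to collect, no matter how right they are about the conditional.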