JoshuaZ comments on Simplified Humanism, Positive Futurism & How to Prevent the Universe From Being Turned Into Paper Clips - Less Wrong
He likely means a formal statement of the claim about decision systems, something of the form: "Under the following formal definition of a decision system, as long as the following pathological/stupid conditions don't hold, a decision system will not seek to modify its goals." A fair number of mathematical theorems have forms close to this: we can prove something for a large set of objects, but there are edge cases where we can't. That's the sort of thing Eliezer is talking about here (although we don't even have a really satisfactory definition of a decision system at this point, so what Eliezer wants is very optimistic).