Peterdjones comments on Arguing Orthogonality, published form - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (10)
You appear not to understand the Orthogonality Thesis, since you have misstated it. The orthogonality thesis deliberately refers to preferences, not values, because "values" could also refer to instrumental values, whereas preferences refer only to terminal values. (Obviously, consistency and lack of contradiction are generally valued only instrumentally.)
No; if the entity values itself believing contradictions, then its having contradictory beliefs would make it a good instrumental rationalist.
An entity that has contradictory beliefs will be a poor instrumental rationalist. It looks as though you would need to engineer a distinction between instrumental beliefs and terminal beliefs. While we're on the subject, you might also need a firewall to stop an AI from acting on intrinsically motivating ideas, if such things exist. In any case, orthogonality is an architectural choice, not an ineluctable fact about minds.
The OT has multiple forms, as Armstrong notes. An OT that says you could make arbitrary combinations of preference and power if you really wanted to can't plug into an argument that future AI will, with high probability, be a Lovecraftian horror, at least not unless you also argue that an orthogonal architecture will be chosen with high probability.