Strange7 comments on SotW: Check Consequentialism - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Cleverness-related failure mode (that actually came up in the trial unit):
One shouldn't try too hard to rescue non-consequentialist reasons. This probably has to be emphasized especially with new audiences who associate "rationality" with Spock and university professors, or audiences who've studied pre-behavioral economics, and who think they score extra points if they come up with amazingly clever ways to rescue bad ideas.
Any decision-making algorithm, no matter how stupid, can be made to look like expected utility maximization through the transform "Assign infinite negative utility to departing from decision algorithm X". This in essence is what somebody is doing when they say, "Aha! But if I stop my PhD program now, I'll have the negative consequence of having abandoned a sunk cost!" (Sometimes I feel like hitting people with a wooden stick when they do this, but that act just expresses an emotion rather than having any discernible positive consequences.) This is Cleverly Failing to Get the Point if "not wanting to abandon a sunk cost", i.e., the counterintuitive feel of departing from the brain's previous decision algorithm, is treated as an overriding consideration, i.e., an infinite negative utility.
It's a legitimate future consequence only if the person says, "The sense of having abandoned a sunk cost will make me feel sick to my stomach for around three days, after which I would start to adjust and adapt a la the hedonic treadmill". In this case they have weighed the intensity and the duration of the future hedonic consequence, rather than treating it as an instantaneous infinite negative penalty, and are now ready to trade that off against other and probably larger considerations like the total amount of work required to get a PhD.
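To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything from the original post: the names `rationalize`, `honor_sunk_cost`, and `finite_utility` are hypothetical, and the numbers are made up. The first half shows how any decision rule can be dressed up as utility maximization by assigning negative infinity to every departure from it; the second half shows the same choice when the regret is given a finite intensity and duration.

```python
import math

def rationalize(algorithm):
    """Return a utility function under which `algorithm` always 'maximizes utility'.

    This is the transform described above: any action other than the one the
    rule picks gets infinite negative utility.
    """
    def utility(situation, action):
        return 0.0 if action == algorithm(situation) else -math.inf
    return utility

def honor_sunk_cost(situation):
    # Hypothetical bad rule: always continue whatever was already started.
    return "continue"

actions = ["continue", "quit"]

# Rigged version: the rule wins by construction, whatever the facts are.
rigged = rationalize(honor_sunk_cost)
choice_rigged = max(actions, key=lambda a: rigged({}, a))

def finite_utility(action):
    # Made-up numbers, in arbitrary "unpleasantness per day" units.
    if action == "quit":
        return -5.0 * 3        # feeling sick for about three days
    return -1.0 * 365 * 2      # two more years of work on the PhD

# Weighed version: the regret is a finite cost traded off against the rest.
choice_weighed = max(actions, key=finite_utility)

print(choice_rigged, choice_weighed)
```

With these invented numbers the rigged utility function picks "continue" and the weighed one picks "quit". The point is not the particular figures, only that the second version can change its answer when the figures change, while the first cannot.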
It would have the consequence of conditioning in the subject's mind an association between a particular thought process and being hit with a stick. Most people don't like being hit with sticks, so the association is likely to make them avoid that particular thought process. Do you not consider "teaching people to avoid a dangerously stupid thought process" a positive consequence?
Actually, they would associate the stick with a number of things, including but not limited to the stupid thought process. They would be quite likely to associate the stick with their encounter with Eliezer, and with their (failed) attempt to converse with him and/or follow his thought processes. Mind: they associate the stick with all aspects of the attempt, not only with the failure.
It might work in a Master/Apprentice scenario where the stick-hitting-victim is bindingly pre-committed to a year of solitude with Stick-Happy!Eliezer in order to learn from him the art of Cognitive Kung Fu. This is the only scenario I can immediately visualize in which the stick-hitting victim would not immediately decide that Stick-Happy!Eliezer is a person they can get away with avoiding, and possibly with reporting to the police for assault.
EDIT01: This is assuming that the experiential sample size is 1.
I was only pointing out that arguably-positive consequences would be present. I agree that they most likely would not predominate outside controlled conditions, and the overall decision not to engage in spontaneous armed assault was a wise one.