Comment author: Yvain2 23 October 2008 11:26:00AM 7 points

This is a beautiful comment thread. Too rarely do I get to hear anything at all about people's inner lives, so too much of my theory of mind is generalizations from one example.

For example, I would never have guessed any of this about reflectivity. Before reading this post, I didn't think there was such a thing as people who hadn't "crossed the Rubicon", except young children. I guess I was completely wrong.

Either I feel reflective but there's a higher level of reflectivity I haven't reached and can't even imagine (which I consider unlikely but am including for purposes of fake humility), or I'm misunderstanding what is meant by this post, or I've just always been reflective as far back as I can remember (age 6? 7?).

The only explanation I can give for that is that I've always had pretty bad obsessive-compulsive disorder which takes the form of completely irrational and inexplicable compulsions to do random things. It was really, really easy to identify those as "external" portions of my brain pestering me, so I could've just gotten in the habit of believing that about other things.

As for the original article, it would be easier to parse if I'd ever heard a good reduction of "I". Gödel, Escher, Bach was brilliant, funny, and fascinating, but for me at least it didn't dissolve this question.

In response to Prices or Bindings?
Comment author: Yvain2 21 October 2008 08:45:42PM 16 points

I am glad Stanislav Petrov, contemplating his military oath to always obey his superiors and the appropriate guidelines, never read this post.

In response to Ethical Inhibitions
Comment author: Yvain2 19 October 2008 11:59:40PM 8 points

"Historically speaking, it seems likely that, of those who set out to rob banks or murder opponents "in a good cause", those who managed to hurt themselves, mostly wouldn't make the history books. (Unless they got a second chance, like Hitler after the failed Beer Hall Putsch.) Of those cases we do read about in the history books, many people have done very well for themselves out of their plans to lie and rob and murder "for the greater good". But how many people cheated their way to actual huge altruistic benefits - cheated and actually realized the justifying greater good? Surely there must be at least one or two cases known to history - at least one king somewhere who took power by lies and assassination, and then ruled wisely and well - but I can't actually name a case off the top of my head. By and large, it seems to me a pretty fair generalization that people who achieve great good ends manage not to find excuses for all that much evil along the way."

History seems to me to be full of examples of people or groups successfully breaking moral rules for the greater good.

The American Revolution, for example. The Founding Fathers committed treason against the crown, started a war that killed thousands of people, and confiscated a lot of Tory property along the way. Once they were in power, they did arguably better than anyone else of their era at trying to create a just society. The Irish Revolution also started in terrorism and violence and ended in a peaceful democratic state (at least in the south); the war of Israeli independence involved a lot of terrorism on the Israeli side and ended with a democratic state that, regardless of what you think of it now, didn't show any particularly violent tendencies before acquiring Palestine in the 1967 war.

Among people who seized power violently, Augustus and Cyrus stand out as excellent in the ancient world (and I'm glad Caligula was assassinated and replaced with Claudius). Ho Chi Minh and Fidel Castro, while I disagree with their politics, were both better than their predecessors and better than many rulers who came to power by more conventional means in their parts of the world.

There are all sorts of biases that would make us less likely to believe people who "break the rules" can ever turn out well. One is the halo effect. Another is availability bias - it's much easier to remember people like Mao than it is to remember the people who were quiet and responsible once their revolution was over, and no one notices the genocides that didn't happen because of some coup or assassination. "Violence leads only to more violence" is a form of cached deep wisdom. And there's probably a false comparison effect: a post-coup government may be much better than the people they replaced while still not up to first-world standards.

And of course, "history is written by the victors". When the winners do something bad, it's never interpreted as bad after the fact. Firebombing a city to end a war more quickly, taxing a populace to give health care to the less fortunate, intervening in a foreign country's affairs to stop a genocide: they're all likely to be interpreted as evidence for "the ends don't justify the means" when they fail, but glossed over or treated as common sense interventions when they work. Consider the amount of furor raised over our supposedly good motives in going into Iraq and failing vs. the complete lack of discussion about going into Yugoslavia and succeeding.

Comment author: Yvain2 30 September 2008 08:16:02PM 8 points

"I need to beat my competitors" could be used as a bad excuse for taking unnecessary risks. But it is pretty important. Given that an AI you coded right now with your current incomplete knowledge of Friendliness theory is already more likely to be Friendly than that of some competitor who's never really considered the matter, you only have an incentive to keep researching Friendliness until the last possible moment when you're confident that you could still beat your competitors.

The question then becomes: what is the minimum necessary amount of Friendliness research at which point going full speed ahead has a better expected result than continuing your research? Since you've been researching for several years and sound like you don't have any plans to stop until you're absolutely satisfied, you must have a lot of contempt for all your competitors who are going full-speed ahead and could therefore be expected to beat you if any were your intellectual equals. I don't know your competitors and I wouldn't know enough AI to be able to judge them if I did, but I hope you're right.
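To put numbers on that tradeoff, here's a toy expected-value model; it's entirely my own sketch, and every functional form and constant in it is a made-up assumption rather than anything from the post:

```python
# Toy model (my own assumptions throughout): choose how many years t to
# spend on Friendliness research. More research raises the chance your
# AI is Friendly, but each year of delay gives competitors a chance to
# finish first.
import math

def p_friendly(t):
    # Assumption: diminishing returns to Friendliness research.
    return 1 - math.exp(-0.3 * t)

def p_first(t):
    # Assumption: each year of delay costs a flat 5% chance of winning.
    return 0.95 ** t

def expected_value(t):
    # Crude payoff: 1 if you finish first AND the AI is Friendly, else 0.
    return p_first(t) * p_friendly(t)

best_t = max(range(31), key=expected_value)
print(best_t, round(expected_value(best_t), 3))  # optimal stopping year
```

Under these invented numbers the product peaks at an interior point (year 6): some research is always worth the delay, but past that point racing beats polishing.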

In response to 9/26 is Petrov Day
Comment author: Yvain2 26 September 2008 08:59:53PM 4 points

Given that full-scale nuclear war would either destroy the world or vastly reduce the number of living people, Petrov, Arkhipov, and all the other "heroic officer makes unlikely decision to avert nuclear war" stories Recovering Irrationalist describes above make a more convincing test case for the anthropic principle than an LHC breakdown or two.

Comment author: Yvain2 21 September 2008 01:31:35AM 1 point

Just realized that several sentences in my previous post make no sense because they assume Everett branches were separate before they actually split, but I think the general point still holds.

Comment author: Yvain2 21 September 2008 12:49:53AM 10 points

Originally I was going to say yes to the last question, but after thinking over why a failure of the LHC now (before it would destroy Earth) doesn't let me conclude anything by the anthropic principle, I'm going to say no.

Imagine a world in which CERN promises to fire the Large Hadron Collider one week after a major terrorist attack. Consider ten representative Everett branches. All those branches will be terrorist-free for the next few years except number 10, which is destined to suffer a major terrorist attack on January 1, 2009.

On December 31, 2008, Yvains 1 through 10 are perfectly happy, because they live in worlds without terrorist attacks.

On January 2, 2009, Yvains 1 through 9 are perfectly happy, because they still live in worlds without terrorist attacks. Yvain 10 is terrified and distraught, both because he just barely escaped a terrorist attack the day before, and because he's going to die in a few days when they fire the LHC.

On January 8, 2009, CERN fires the LHC, killing everyone in Everett branch 10.

Yvains 1 through 9 aren't any better off than they would've been otherwise. Their universe was never destined to have a terrorist attack, and it still hasn't had a terrorist attack. Nothing has changed.

Yvain 10 is worse off than he would have been otherwise. If not for the LHC, he would be recovering from a terrorist attack, which is bad but not apocalyptically so. Now he's dead. There's no sense in which his spirit has been averaged out over Yvains 1 through 9. He's just plain dead. That can hardly be considered an improvement.
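To make the accounting explicit, here's the same bookkeeping as a toy enumeration (the outcome labels and structure are mine, just restating the scenario above):

```python
# Toy restatement of the ten-branch scenario above. Branch 10 is the
# only one destined for a terrorist attack; the LHC policy only ever
# changes branch 10's outcome, and only for the worse.
BRANCHES = range(1, 11)
ATTACK_BRANCH = 10

def outcome(branch, fire_lhc_after_attack):
    if branch != ATTACK_BRANCH:
        return "no attack"                      # branches 1-9: same either way
    if fire_lhc_after_attack:
        return "attack, then branch destroyed"  # branch 10 under the policy
    return "attack, recovering"                 # branch 10 without the policy

for policy in (False, True):
    print("fire LHC after attack:", policy,
          {b: outcome(b, policy) for b in BRANCHES})
```

Comparing the two printouts branch by branch: nine entries are identical and the tenth gets strictly worse, which is the whole argument.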

Since it doesn't help anyone and it does kill a large number of people, I'd advise CERN against using LHC-powered anthropic tricks to "prevent" terrorism.

In response to Magical Categories
Comment author: Yvain2 25 August 2008 07:52:44PM 7 points

"IMHO, the idea that wealth can't usefully be measured is one which is not sufficiently worthwhile to merit further discussion."

The "wealth" idea sounds vulnerable to hidden complexity of wishes. Measure it in dollars and you get hyperinflation. Measure it in resources, and the AI cuts down all the trees and converts them to lumber, then kills all the animals and converts them to oil, even if technology had advanced beyond the point of needing either. Find some clever way to specify the value of all resources, convert them to products and allocate them to humans in the level humans want, and one of the products will be highly carcinogenic because the AI didn't know humans don't like that. The only way to get wealth in the way that's meaningful to humans without humans losing other things they want more than wealth is for the AI to know exactly what we want as well or better than we do. And if it knows that, we can ignore wealth and just ask it to do what it knows we want.

"The counterargument is, in part, that some classifiers are better than others, even when all of them satisfy the training data completely. The most obvious criterion to use is the complexity of the classifier."

I don't think "better" is meaningful outside the context of a utility function. Complexity isn't a utility function and it's inadequate for this purpose. Which is better, tank vs. non-tank or cloudy vs. sunny? I can't immediately see which is more complex than the other. And even if I could, I'd want my criteria to change depending on whether I'm in an anti-tank infantry or a solar power installation company, and just judging criteria by complexity doesn't let me make that change, unless I'm misunderstanding what you mean by complexity here.

Meanwhile, reading the link to Bill Hibbard on the SL4 list:

"Your scenario of a system that is adequate for intelligence in its ability to rule the world, but absurdly inadequate for intelligence in its inability to distinguish a smiley face from a human, is inconsistent."

I think the best possible summary of Overcoming Bias thus far would be "Abandon all thought processes even remotely related to the ones that generated this statement."

Comment author: Yvain2 21 August 2008 12:22:45PM 15 points

I was one of the people who suggested the term h-right before. I'm not great with mathematical logic, and I followed the proof only with difficulty, but I think I understand it and I think my objections remain. I think Eliezer has a brilliant theory of morality and that it accords with all my personal beliefs, but I still don't understand where it stops being relativist.

I agree that some human assumptions like induction and Occam's Razor have to be used partly as their own justification. But an ultimate justification of a belief has to include a reason for choosing it out of a belief-space.

For example, after recursive justification hits bottom, I keep Occam and induction because I suspect they reflect the way the universe really works. I can't prove it without using them. But we already know there are some things that are true but can't be proven. I think one of those things is that reality really does work on inductive and Occamian principles. So I can choose these two beliefs out of belief-space by saying they correspond to reality.

Some other starting assumptions ground out differently. Clarence Darrow once said something like "I hate spinach, and I'm glad I hate it, because if I liked it I'd eat it, and I don't want to eat it because I hate it." He was making a mistake somewhere! If his belief is "spinach is bad", it probably grounds out in some evolutionary reason, like spinach providing insufficient energy in the EEA. But that doesn't justify his current statement "spinach is bad". His real reason for saying "spinach is bad" is that he dislikes it. You can only choose "spinach is bad" out of belief-space based on Clarence Darrow's opinions.

One possible definition of "absolute" vs. "relative": a belief is absolutely true if people pick it out of belief-space based on correspondence to reality; if people pick it out of belief-space based on other considerations, it is true relative to those considerations.

"2+2=4" is absolutely true, because it's true in the system PA, and I pick PA out of belief-space because it does better than, say, self-PA would in corresponding to arithmetic in the real world. "Carrots taste bad" is relatively true, because it's true in the system "Yvain's Opinions" and I pick "Yvain's Opinions" out of belief-space only because I'm Yvain.

When Eliezer says X is "right", he means X satisfies a certain complex calculation. That complex calculation is chosen out of all the possible complex calculations in complex-calculation space because it's the one that matches what humans believe.

This does, technically, create a theory of morality that doesn't explicitly reference humans. Just like intelligent design theory doesn't explicitly reference God or Christianity. But most people believe that intelligent design should be judged as a Christian theory, because being a Christian is the only reason anyone would ever select it out of belief-space. Likewise, Eliezer's system of morality should be judged as a human morality, because being a human is the only reason anyone would ever select it out of belief-space.

That's why I think Eliezer's system is relative. I admit it's not directly relative, in that Eliezer isn't directly picking "Don't murder" out of belief-space every time he wonders about murder, based only on human opinion. But if I understand correctly, he's referring the question to another layer, and then basing that layer on human opinion.

An umpire whose procedure for making tough calls is "Do whatever benefits the Yankees" isn't very fair. A second umpire whose procedure is "Always follow the rules in Rulebook X" and writes in Rulebook X "Do whatever benefits the Yankees" may be following a rulebook, but he is still just as far from objectivity as the last guy was.

I think the second umpire's call is "correct" relative to Rulebook X, but I don't think the call is absolutely correct.

Comment author: Yvain2 21 August 2008 12:20:40PM 0 points

...yeah, this was supposed to go in the new article, and I was just checking something in this one and accidentally posted it here. Please ignore *embarrassed*
