[SEQ RERUN] Entangled Photons
Today's post, Entangled Photons, was originally published on 03 May 2008. A summary (taken from the LW wiki):
Using our newly acquired understanding of photon polarizations, we see how to construct a quantum state of two photons in which, when you measure one of them, the person in the same world as you will always find that the other photon has the opposite quantum state. This is not because any influence is transmitted; it is just decoherence that takes place in a very symmetrical way, as can readily be observed in our calculations.
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Decoherence as Projection, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.
[SEQ RERUN] On Being Decoherent
Today's post, On Being Decoherent, was originally published on 27 April 2008. A summary (taken from the LW wiki):
When a sensor measures a particle whose amplitude distribution stretches over space - perhaps seeing if the particle is to the left or right of some dividing line - then the standard laws of quantum mechanics call for the sensor+particle system to evolve into a state of (particle left, sensor measures LEFT) + (particle right, sensor measures RIGHT). But when we humans look at the sensor, it only seems to say "LEFT" or "RIGHT", never a mixture like "LIGFT". This, of course, is because we ourselves are made of particles, and subject to the standard quantum laws that imply decoherence. Under standard quantum laws, the final state is (particle left, sensor measures LEFT, human sees "LEFT") + (particle right, sensor measures RIGHT, human sees "RIGHT").
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Where Experience Confuses Physicists, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
[Transcript] Richard Feynman on Why Questions
I thought this video was a really good example of Richard Feynman dissolving a question. But it's in 240p! Nobody likes watching 240p videos. So I transcribed it. (Edit: That was in jest. The real reasons are that I thought I could get more exposure this way, and that a lot of people appreciate transcripts. Also, Paul Graham speculates that the written word is universally superior to the spoken word for conveying ideas.) I was going to post it as a rationality quote, but the transcript was sufficiently long that I think it warrants a discussion post instead.
Here you go:
Would a FAI reward us for helping create it?
We expect that post-singularity there will still be limited resources in the form of available computational resources until heat death.
Those resources do not necessarily need to be allocated fairly. In fact, I would guess that if they were allocated unfairly, the most likely beneficiaries would be those people who helped contribute to the creation of a friendly AI.
Now for some open questions:
What probability distribution of extra resources do you expect with respect to various possible contributions to the creation of friendly AI?
Would donating to the SIAI suffice for acquiring these extra resources?
Should we discount extraordinary implications?
(Spawned by an exchange between Louie Helm and Holden Karnofsky.)
The field of formal rationality is relatively new and I believe that we would be well-advised to discount some of its logical implications that advocate extraordinary actions.
Our current methods might turn out to be biased in new and unexpected ways. Pascal's mugging, the Lifespan Dilemma, blackmailing and the wrath of Löb's theorem are just a few examples of how an agent built according to our current understanding of rationality could fail.
Bayes’ Theorem, the expected utility formula, and Solomonoff induction are all reasonable heuristics. Yet those theories are not enough to build an agent that will be reliable in helping us to achieve our values, even if those values were thoroughly defined.
If we wouldn't trust a superhuman agent equipped with our current grasp of rationality to be reliable in extrapolating our volition, how can we trust ourselves to arrive at correct answers given what we know?
We should of course continue to use our best methods to decide what to do. But I believe that we should also draw a line somewhere when it comes to extraordinary implications.
Intuition, Rationality and Extraordinary Implications
It doesn't feel to me like 3^^^^3 lives are really at stake, even at very tiny probability. I'd sooner question my grasp of "rationality" than give five dollars to a Pascal's Mugger because I thought it was "rational". — Eliezer Yudkowsky
Holden Karnofsky is suggesting that in some cases we should follow the simple rule that "extraordinary claims require extraordinary evidence".
I think that we should sometimes demand particular proof P; and if proof P is not available, then we should discount seemingly absurd or undesirable consequences even if our theories disagree.
I am not referring to the weirdness of the conclusions but the foreseeable scope of the consequences of being wrong about them. We should be careful in using the implied scope of certain conclusions to outweigh their low probability. I feel we should put more weight on the consequences of our conclusions being wrong than on those of their being right.
As an example take the idea of quantum suicide and assume it would make sense under certain circumstances. I wouldn’t commit quantum suicide even given a high confidence in the many-worlds interpretation of quantum mechanics being true. Logical implications just don’t seem enough in some cases.
To be clear, extrapolations work and often are the best we can do. But since there are problems such as the above, that we perceive to be undesirable and that lead to absurd actions and their consequences, I think it is reasonable to ask for some upper and lower bounds regarding the use and scope of certain heuristics.
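The worry about scope outweighing low probability can be made concrete with a little arithmetic. The sketch below is only an illustration: every number in it is hypothetical, and capping the payoff is just one possible reading of the "upper and lower bounds" asked for above, not a method proposed in this post.

```python
def expected_gain(probability, payoff, cost):
    """Expected net gain of paying `cost` for a `probability` chance of `payoff`."""
    return probability * payoff - cost


def bounded(payoff, cap):
    """Clip a claimed payoff at an upper bound (a hypothetical rule)."""
    return min(payoff, cap)


# All values below are made up for illustration.
P_CLAIM = 1e-20   # credence that the mugger's claim is true
PAYOFF = 1e30     # utility promised by the mugger
COST = 5.0        # utility given up by paying five dollars
CAP = 1e9         # arbitrary upper bound on any single payoff

naive = expected_gain(P_CLAIM, PAYOFF, COST)
capped = expected_gain(P_CLAIM, bounded(PAYOFF, CAP), COST)

print("unbounded payoff: pay" if naive > 0 else "unbounded payoff: refuse")
print("capped payoff:    pay" if capped > 0 else "capped payoff:    refuse")
```

With an unbounded payoff the claimed scope swamps the tiny probability and the naive calculation says to pay. With the cap in place it says to refuse. Where to put the cap is exactly the kind of arbitrary line discussed here.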
We are not going to stop pursuing whatever terminal goal we have chosen just because someone promises us even more utility if we do what that person wants. We are not going to stop loving our girlfriend just because there are other people who do not approve of our relationship and who together would experience more happiness if we separated than the combined happiness of us and our girlfriend being in love. Therefore we have already informally established some upper and lower bounds.
I have read about people who became very disturbed and depressed taking ideas too seriously. That way madness lies, and I am not willing to choose that path yet.
Maybe I am simply biased and have been unable to overcome it yet. But my best guess right now is that we simply have to draw a lot of arbitrary lines and arbitrarily refuse some steps.
Taking into account considerations of vast utility or low probability quickly leads to chaos-theoretic considerations like the butterfly effect. As a computationally bounded and psychologically unstable agent I am unable to cope with that. Consequently I see no other way than to neglect the moral impossibility of extreme uncertainty.
Until the problems are resolved, or rationality is sufficiently established, I will continue to put vastly more weight on empirical evidence and my intuition than on logical implications, if only because I still lack the necessary educational background to trust my comprehension and judgement of the various underlying concepts and methods used to arrive at those implications.
Expected Utility Maximization and Complex Values
One problem with my current grasp of rationality that I perceive to be unacknowledged is the consequences of expected utility maximization with respect to human nature and our complex values.
I am still genuinely confused about what a person should do. I don't even know how much sense that concept makes. Does expected utility maximization have anything to do with being human?
Those people who take existential risks seriously and who are currently involved in their mitigation seem to be disregarding many other activities that humans usually deem valuable because the expected utility of saving the world does outweigh the pursuit of other goals. I do not disagree with that assessment but find it troubling.
The problem is, will there ever be anything but a single goal, a goal that can either be more effectively realized and optimized to yield the most utility or whose associated expected utility simply outweighs all other values?
Assume that humanity managed to create a friendly AI (FAI). Given the enormous amount of resources that each human is poised to consume until the dark era of the universe, wouldn't the same arguments that now suggest that we should contribute money to existential risk charities then suggest that we should donate our resources to the friendly AI? Our resources could enable it to find a way to either travel back in time, leave the universe or hack the matrix. Anything that could avert the end of the universe and allow the FAI to support many more agents has effectively infinite expected utility.
The sensible decision would be to concentrate on those scenarios with the highest expected utility now, e.g. solving friendly AI, and worry about those problems later. But not only does the same argument always work but the question is also relevant to the nature of friendly AI and our ultimate goals. Is expected utility maximization even compatible with our nature? Does expected utility maximization lead to world states in which wireheading is favored, either directly or indirectly by focusing solely on a single high-utility goal that does outweigh all other goals?
Conclusion
- Being able to prove something mathematically doesn't prove its relation to reality.
- Relativity is less wrong than Newtonian mechanics but it still breaks down in describing singularities including the very beginning of the universe.
It seems to me that our notion of rationality is not the last word on the topic and that we shouldn't act as if it were.
[SEQ RERUN] 0 And 1 Are Not Probabilities
Today's post, 0 And 1 Are Not Probabilities, was originally published on 10 January 2008. A summary (taken from the LW wiki):
In the ordinary way of writing probabilities, 0 and 1 both seem like entirely reachable quantities. But when you transform probabilities into odds ratios, or log-odds, you realize that getting a proposition to probability 1 would require an infinite amount of evidence.
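The transformation in the summary can be sketched in a few lines. This is a minimal illustration, not code from the original post; the function names and the choice of bits (base-2 logarithms) as the unit are assumptions.

```python
import math


def odds(p):
    """Convert a probability in [0, 1) to odds in favor, p / (1 - p)."""
    return p / (1.0 - p)


def log_odds_bits(p):
    """Convert a probability strictly between 0 and 1 to log-odds in bits."""
    return math.log2(odds(p))


# Each extra "9" of confidence costs roughly the same additional evidence,
# so the log-odds keep growing without ever reaching a finite maximum.
for p in (0.5, 0.9, 0.99, 0.999, 0.999999):
    print(f"p = {p:<9} odds = {odds(p):>12.1f}  log-odds = {log_odds_bits(p):6.2f} bits")
```

Calling `odds(1.0)` raises `ZeroDivisionError`, which is the code-level counterpart of the point above: probability 1 corresponds to infinite odds, so no finite amount of evidence gets you there.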
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Infinite Certainty, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
Probability puzzle: Coins in envelopes
This went over well in the xkcd logic puzzle forum (my hand was not removed), so I thought I'd try it here. It came to me in a dream, so by solving it you may be helping to summon an elder god or something.
Bob replies, "That depends on what random function you used to choose how many envelopes to fill. If you, say, flipped m coins and put each one that came up heads in an envelope, the expected value is $.50."
Alice explains what her random function was, and Bob calculates the expected value. For kicks, he pays her that amount, and she lets him pick a random envelope. It has a coin in it! Bob pockets the coin. Alice then takes the now-empty envelope back, and shuffles it into the others. "Congratulations," she says. "So, what's the expected value of playing the game again, now that there's one fewer coin?"
"Same as before," Bob replies.
Problem 1: Give a value for m and a random function for which this makes sense (there are many).
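Bob's coin-flip example (not the answer to Problem 1) can be checked exactly. The sketch below rests on assumptions that the surviving text only implies: there are m envelopes, each coin is worth $1, each envelope independently gets a coin with probability 1/2, and Bob picks one envelope uniformly at random. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def expected_value_coin_flips(m, coin_value=Fraction(1)):
    """Expected payout of one random envelope when each of m envelopes
    independently receives a coin with probability 1/2 (requires m >= 1)."""
    total = Fraction(0)
    for k in range(m + 1):
        p_k = Fraction(comb(m, k), 2 ** m)  # P(exactly k envelopes are filled)
        p_hit = Fraction(k, m)              # P(picked envelope holds a coin | k filled)
        total += p_k * p_hit * coin_value
    return total


for m in (1, 2, 5, 10):
    print(m, expected_value_coin_flips(m))
```

Under these assumptions the result is 1/2 of the coin's value for every m, matching Bob's $.50. The puzzle asks for a different random function, one under which finding a coin leaves the expected value of the next game unchanged.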
Where do selfish values come from?
Human values seem to be at least partly selfish. While it would probably be a bad idea to build AIs that are selfish, ideas from AI design can perhaps shed some light on the nature of selfishness, which we need to understand if we are to understand human values. (How does selfishness work in a decision theoretic sense? Do humans actually have selfish values?) Current theory suggests three possible ways to design a selfish agent:
1. have a perception-determined utility function (like AIXI)
2. have a static (unchanging) world-determined utility function (like UDT) with a sufficiently detailed description of the agent embedded in the specification of its utility function at the time of the agent's creation
3. have a world-determined utility function that changes ("learns") as the agent makes observations (for concreteness, let's assume a variant of UDT where you start out caring about everyone, and each time you make an observation, your utility function changes to no longer care about anyone who hasn't made that same observation)
Note that 1 and 3 are not reflectively consistent (they both refuse to pay the Counterfactual Mugger), and 2 is not applicable to humans (since we are not born with detailed descriptions of ourselves embedded in our brains). Still, it seems plausible that humans do have selfish values, either because we are type 1 or type 3 agents, or because we were type 1 or type 3 agents at some time in the past, but have since self-modified into type 2 agents.
But things aren't quite that simple. According to our current theories, an AI would judge its decision theory using that decision theory itself, and self-modify if it was found wanting under its own judgement. But humans do not actually work that way. Instead, we judge ourselves using something mysterious called "normativity" or "philosophy". For example, a type 3 AI would just decide that its current values can be maximized by changing into a type 2 agent with a static copy of those values, but a human could perhaps think that changing values in response to observations is a mistake, and they ought to fix that mistake by rewinding their values back to before they were changed. Note that if you rewind your values all the way back to before you made the first observation, you're no longer selfish.
So, should we freeze our selfish values, or rewind our values, or maybe even keep our "irrational" decision theory (which could perhaps be justified by saying that we intrinsically value having a decision theory that isn't too alien)? I don't know what conclusions to draw from this line of thought, except that on close inspection, selfishness may offer just as many difficult philosophical problems as altruism.
Why would an AI try to figure out its goals?
"So how can it ensure that future self-modifications will accomplish its current objectives? For one thing, it has to make those objectives clear to itself. If its objectives are only implicit in the structure of a complex circuit or program, then future modifications are unlikely to preserve them. Systems will therefore be motivated to reflect on their goals and to make them explicit." -- Stephen M. Omohundro, The Basic AI Drives
This AI becomes able to improve itself in a haphazard way, makes various changes that are net improvements but may introduce value drift, and then gets smart enough to do guaranteed self-improvement, at which point its values freeze (forever). -- Eliezer Yudkowsky, What I Think, If Not Why
I have stopped understanding why these quotes are correct. Help!
More specifically, if you design an AI using "shallow insights" without an explicit goal-directed architecture - some program that "just happens" to make intelligent decisions that can be viewed by us as fulfilling certain goals - then it has no particular reason to stabilize its goals. Isn't that anthropomorphizing? We humans don't exhibit a lot of goal-directed behavior, but we do have a verbal concept of "goals", so the verbal phantom of "figuring out our true goals" sounds meaningful to us. But why would AIs behave the same way if they don't think verbally? It looks more likely to me that an AI that acts semi-haphazardly may well continue doing so even after amassing a lot of computing power. Or is there some more compelling argument that I'm missing?
Some conditional independence (Bayes Network) exercises from ai-class.com
If you'd like to see some visual representations of why conditional independence is neither necessary nor sufficient for independence, of confounding causes, of explaining away, and so on, you should be able to view these videos from ai-class.com.
Working the exercises gave me a better understanding than the "I understand this and so don't need to actually apply it" feeling that almost satisfied me.
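For anyone who wants to check these ideas by hand, here is a minimal sketch that enumerates two tiny Bayes networks exactly. The networks and their probabilities are made up for illustration and are not taken from the ai-class.com exercises.

```python
from fractions import Fraction
from itertools import product


def joint_common_effect():
    """A and B are independent fair coins; C = A or B (a common effect)."""
    quarter = Fraction(1, 4)
    return {(a, b, a | b): quarter for a, b in product((0, 1), repeat=2)}


def joint_common_cause():
    """C is a fair coin; A and B each copy C with probability 3/4,
    independently of each other given C (a confounding cause)."""
    joint = {}
    for a, b, c in product((0, 1), repeat=3):
        p_a = Fraction(3, 4) if a == c else Fraction(1, 4)
        p_b = Fraction(3, 4) if b == c else Fraction(1, 4)
        joint[(a, b, c)] = Fraction(1, 2) * p_a * p_b
    return joint


def prob(joint, **fixed):
    """Sum the joint over all outcomes matching the fixed values of a, b, c."""
    names = ("a", "b", "c")
    return sum(p for outcome, p in joint.items()
               if all(outcome[names.index(k)] == v for k, v in fixed.items()))


def independent(joint, given_c=None):
    """Check P(A=1, B=1 | C) == P(A=1 | C) * P(B=1 | C).
    With given_c=None, no conditioning is applied. For binary A and B
    this single equality is enough to establish independence."""
    cond = {} if given_c is None else {"c": given_c}
    z = prob(joint, **cond)
    p_ab = prob(joint, a=1, b=1, **cond) / z
    p_a = prob(joint, a=1, **cond) / z
    p_b = prob(joint, b=1, **cond) / z
    return p_ab == p_a * p_b


effect, cause = joint_common_effect(), joint_common_cause()
print("common effect: independent?", independent(effect),
      "| given C=1?", independent(effect, given_c=1))
print("common cause:  independent?", independent(cause),
      "| given C=1?", independent(cause, given_c=1))

# Explaining away: once B is known to be true, A becomes less likely.
p_a_given_c = prob(effect, a=1, c=1) / prob(effect, c=1)
p_a_given_bc = prob(effect, a=1, b=1, c=1) / prob(effect, b=1, c=1)
print("P(A|C) =", p_a_given_c, " P(A|B,C) =", p_a_given_bc)
```

In the common-effect network A and B are independent but become dependent once C is observed, and in the common-cause network the reverse holds, so neither property implies the other.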