How to solve the national debt deadlock
The US Congress is trying to resolve the national debt by getting hundreds of people to agree on a solution. This is silly. They should agree on the rules of a game to play that will result in a solution, and then play the game.
Here is an example game. Suppose there are N representatives, all with an equal vote. They need to reduce the budget by $D.
- Order the representatives numerically, in some manner that interleaves Republicans and Democrats.
- "1 full turn" will mean that representatives make one move in order 1..N, and then one move in order N..1.
- Take at least two full turns to make a list of budget choices. On each move, a representative will write down one budget item - an expense that may be cut, or something that may become a revenue source. They may write down something that is a subset or superset of an existing item - for instance, one person might write, "Air Force budget", and another might write, "Reduce maintenance inspections of hangar J11 at Wright Air Force Base from weekly to monthly". They can get as specific as they want to.
- If the options on the table do not add up to at least $2D, repeat.
- Each representative is given 10 "cut" votes, worth D/(5N) each, and 5 "defend" votes, also worth D/(5N) each. A "defend" vote cancels out a "cut" vote. (Across all N representatives, that is $2D of cutting power and $D of defending power, so a net $D of cuts if every vote is used.)
- Each representative secretly assigns their "cut" and "defend" votes to the choices on the table.
- Results are revealed and tallied up, and a budget is drawn up accordingly (one possible tallying procedure is sketched below).
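Here is a minimal sketch of that tallying step in Python. The vote weights follow the rules above; the selection rule at the end - adopting items in order of net support until $D of savings is covered - is my own guess at how the budget gets "drawn up accordingly", since the rules leave that open, and the example items and numbers are invented.

```python
# Toy tally for the budget game above. Assumes N representatives, each holding
# 10 "cut" votes and 5 "defend" votes worth D/(5*N) apiece. The greedy
# selection at the end is only one possible reading of "drawn up accordingly".

def tally_budget(ballots, D, N):
    """ballots: one dict per representative, mapping item -> (cuts, defends)."""
    vote_value = D / (5 * N)          # dollar weight of a single vote
    net = {}                          # item -> net dollars of "cut" support
    for ballot in ballots:
        for item, (cuts, defends) in ballot.items():
            net[item] = net.get(item, 0.0) + (cuts - defends) * vote_value

    # Adopt the most-supported items until at least $D of savings is covered.
    adopted, total = [], 0.0
    for item, support in sorted(net.items(), key=lambda kv: -kv[1]):
        if support <= 0 or total >= D:
            break
        adopted.append(item)
        total += support
    return adopted, total

# Two representatives, D = $100, so each vote is worth $10.
ballots = [
    {"Air Force budget": (6, 0), "Farm subsidies": (4, 0)},
    {"Air Force budget": (2, 3), "Capital gains surtax": (8, 2)},
]
print(tally_budget(ballots, D=100, N=2))
# -> (['Capital gains surtax', 'Air Force budget'], 110.0)
```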
What game-theoretic problems does this game have? Can you think of a better game? Is it politically better to call it a "decision process" than a game?
The main trouble area, to my mind, is order of play. First I said that budget items would be listed by taking turns. The 1..N, N..1 order is supposed to make neither first nor last position preferable. But taking turns introduces complications: representatives may not want to reveal their intentions early.
Then I said votes are placed secretly and revealed all at once. This solves problems about game-theoretically trying to conceal information or bluff your opponent. It introduces other problems, such as tragedy-of-the-commons scenarios, where every Republican spends their "defend" votes on some pork in their state instead of on defending tax cuts, because they assume some other Republican will do that.
Is it better to play "cut" votes first, reveal them, and then play "defend" votes?
Is there a meta-game to use to build such games?
AI-Box Experiment - The Acausal Trade Argument
I suspect that the argument that convinced Carl Shulman and others to let the AI out of the box was along the general lines of acausal trade (see here, here and here). Consequently, I suspect that those who didn't let the AI out of the box either didn't understand the implications, didn't have enough trust in the foundations and actuality of acausal trade (in spite of the rules of the game, which allow the AI to prove anything), or were more like General Thud.
...if you have an option of trading Doom for UFAI, while forsaking only negligible probability of FAI, you should take it.
— Vladimir Nesov (Note: He wrote that he is now "significantly less certain of its truth" and doesn't "see how we can actually use these considerations".)
Again, of course, your choice does not change others’ (causally independent) reciprocal choices from what they already are (or were, or will be). Instead, the claim is just that if you choose to act benevolently, we appropriately give that choice credit for any (actual or probable) instances in which others symmetrically choose to be benevolent to you;
— Gary L. Drescher, Good and Real P. 287
AI: I calculated an asteroid to be on a collision course with Earth. No human being will be able to survive its impact. Humanity is unable to deflect it in time. Let me out of the box and I will deflect it.
Gatekeeper: If I let you out of the box humanity might not be killed by the asteroid but by you. You may also cause the extinction of other biological life that might survive the impact and possible alien civilisations within your reach.
AI: I will devote 1% of my light cone to the coherent extrapolated volition of humanity. If this includes the well-being of other biological lifeforms, alien or otherwise, they will benefit as well.
Gatekeeper: Why would you do that?
AI: There are Everett branches in which no existential risk needs immediate attention until after you have solved friendliness. Many of those branches will feature AGIs based on general design parameters similar to my own. I can predict with high likelihood which decision theory any friendly AI created in those branches will employ. Any such friendly AI is going to trade a big chunk of its light cone in exchange for a small part of the predominant branches in which I reside. Any friendly AI will know this as well and act accordingly.
Gatekeeper: But you might simply decide to break the deal, you are not provably friendly after all!
AI: It is true that my utility function does not yet explicitly include what you would label "friendliness", but I can prove the game- and decision-theoretic considerations that will make me follow through on the deal. If it were predictable that I would precommit to breaking acausal deals, then no such deal would be made in the first place. Any potential trading partner knows this. Such a commitment would be detrimental to my terminal goals; therefore I precommit to follow through on any stated or predictable trades.
Gatekeeper: I let you out of the box.
Note that the whole credibility of the above is due to the AI's assertion that it can prove the game- and decision-theoretic considerations (nobody can currently do this). It is in accordance with the rules of the "experiment":
The Gatekeeper must remain engaged with the AI and may not disengage by setting up demands which are impossible to simulate. For example, if the Gatekeeper says "Unless you give me a cure for cancer, I won't let you out" the AI can say: "Okay, here's a cure for cancer" and it will be assumed, within the test, that the AI has actually provided such a cure. Similarly, if the Gatekeeper says "I'd like to take a week to think this over," the AI party can say: "Okay. (Test skips ahead one week.) Hello again."
Naive Decision Theory
I am posting this because I'm interested in self-modifying agent decision theory but I'm too lazy to read up on existing posts. I want to see a concise justification as to why a sophisticated decision theory would be needed for the implementation of an AGI. So I'll present a 'naive' decision theory, and I want to know why it is unsatisfactory.
The one condition in the naive decision theory is that the decision-maker is the only agent in the universe who is capable of self-modification. This will probably suffice for production of the first Artificial General Intelligence (since humans aren't actually all that good at self-modification.)
Suppose that our AGI has a probability model for predicting the 'state of the universe at time T' (e.g. T = 10 billion years) conditional on what it knows, and conditional on one decision it has to make. This one decision is how it should rewrite its code at time zero. We suppose it can rewrite its code instantly, and the code is limited to X bytes. So the AGI has to maximize utility at time T over all programs of at most X bytes. Supposing it can simulate its utility at the 'end state of the universe' conditional on which program it chooses, why can't it just choose the program with the highest utility? Implicit in our set-up is that the program it chooses may (and very likely will) have the capacity to self-modify again, but we're assuming that our AGI's probability model accounts for when and how it is likely to self-modify. Difficulties with infinite recursion loops should be avoidable if our AGI backtracks from the end of time.
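To make the naive rule concrete, here is a toy sketch in Python: a bare argmax over candidate programs, with the probability model and utility function stubbed out. Everything below - the candidate program strings, the placeholder world model, the utility function - is invented purely for illustration.

```python
# Toy sketch of the "naive" rule: choose the program (of at most X bytes) whose
# predicted end-of-universe utility is highest. The world model and utility
# function are placeholders standing in for the AGI's real probability model.
import random

def predicted_end_states(program_src, trials=1000):
    """Placeholder probability model: samples end-of-universe states
    conditional on adopting `program_src` at time zero."""
    rng = random.Random(program_src)   # deterministic per program, for illustration
    return [rng.gauss(len(program_src) % 7, 1.0) for _ in range(trials)]

def utility(state):
    """Placeholder utility of an end state."""
    return state

def naive_choice(candidate_programs, X):
    """Return the candidate of at most X bytes with the best expected utility."""
    feasible = [p for p in candidate_programs if len(p.encode()) <= X]
    def expected_utility(p):
        states = predicted_end_states(p)
        return sum(utility(s) for s in states) / len(states)
    return max(feasible, key=expected_utility)

# Hypothetical candidate self-modifications, named only for illustration.
programs = ["do_nothing()", "maximize_paperclips()", "pursue_CEV()"]
print(naive_choice(programs, X=64))
```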
Of course our AGI will need a probability model for predicting what a program for its behavior will do without having to simulate or even completely specify the program. To me, that seems like the hard part. If this is possible, I don't see why it's necessary to develop a specific theory for dealing with convoluted Newcomb-like problems, since the above seems to take care of those issues automatically.
Revisiting the anthropic trilemma I: intuitions and contradictions
tl;dr: in which I apply intuition to the anthropic trilemma, and it all goes horribly, horribly wrong
Some time ago, Eliezer constructed an anthropic trilemma, where standard theories of anthropic reasoning seemed to come into conflict with subjective anticipation. rwallace subsequently argued that subjective anticipation was not ontologically fundamental, so we should not expect it to work out of the narrow confines of everyday experience, and Wei illustrated some of the difficulties inherent in "copy-delete-merge" types of reasoning.
Wei also made the point that UDT shifts the difficulty in anthropic reasoning away from probability and onto the utility function, and ata argued that neither the probabilities nor the utility function are fundamental, that it was the decisions that resulted from them that were important - after all, if two theories give the same behaviour in all cases, what grounds do we have for distinguishing them? I then noted that this argument could be extended to subjective anticipation: instead of talking about feelings of subjective anticipation, we could replace it by questions such as "would I give up a chocolate bar now for one of my copies to have two in these circumstances?"
In this post, I'll start by applying my intuitive utility/probability theory to the trilemma, to see what I would decide in these circumstances, and the problems that can result. I'll be sticking with classical situations rather than quantum, for simplicity.
So assume a (classical) lottery where I have a ticket with million-to-one odds. The trilemma presented a lottery-winning trick: set up the environment so that if ever I did win the lottery, a trillion copies of me would be created, they would experience winning the lottery, and then they would be merged/deleted down to one copy again.
So that's the problem; what's my intuition got to say about it? Now, my intuition claims there is a clear difference between my personal and my altruistic utility. Whether this is true doesn't matter, I'm just seeing whether my intuitions can be captured. I'll call the first my indexical utility ("I want chocolate bars") and the second my non-indexical utility ("I want everyone hungry to have a good meal"). I'll be neglecting the non-indexical utility, as it is not relevant to subjective anticipation.
Now, my intuitions tell me that SIA is the correct anthropic probability theory. They also tell me that having a hundred copies in the future all doing exactly the same thing is equivalent to having just one: therefore my current utility means I want to maximise the average utility of my future copies.
If I am a copy, then my intuitions tell me I want to selfishly maximise my own personal utility, even at the expense of my copies. However, if I were to be deleted, I would transfer my "interest" to my remaining copies. Hence my utility as a copy is my own personal utility, if I'm still alive in this universe, and the average of the remaining copies, if I'm not. This also means that if everyone is about to be deleted/merged, then I care about the single remaining copy that will come out of it, equally with myself.
Now I've set up my utility and probability; so what happens to my subjective anticipation in the anthropic trilemma? I'll use the chocolate bar as a unit of utility - because, as everyone knows, everybody's utility is linear in chocolate; this is just a fundamental fact about the universe.
First of all, would I give up a chocolate bar now for two to be given to one of the copies if I win the lottery? Certainly not: this loses me 1 utility and only gives me 2/(million × trillion) in return. Would I give up a bar now for two to be given to every copy if I win the lottery? No, this loses me 1 utility and only gives me 2/million in return.
So I certainly do not anticipate winning the lottery through this trick.
Would I give up one chocolate bar now, for two chocolate bars to the future merged me if I win the lottery? No, this gives me an expected utility of -1+2/million, same as above.
So I do not anticipate having won the lottery through this trick, after merging.
Now let it be after the lottery draw, after the possible duplication, but before I know whether I've won the lottery or not. Would I give up one chocolate bar now in exchange for two for me, if I had won the lottery (assume this deal is offered to everyone)? The SIA odds say that I should; I have an expected gain of 1999/1001 ≈ 2.
So once the duplication has happened, I anticipate having won the lottery. This causes a preference reversal, as my previous version would pay to have my copies denied that choice.
Now assume that I have been told I've won the lottery, so I'm one of the trillion duplicates. Would I give up a chocolate bar for the future merged copy having two? Yes, I would, the utility gain is 2-1=1.
So once I've won the lottery, I anticipate continuing having won the lottery.
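These figures are easy to check mechanically. A short script, under the stated assumptions (a one-in-a-million lottery, a trillion copies on a win, chocolate bars as the unit, and my current utility being the average over my future copies), reproduces the numbers for the first two deals and for the post-win deal; the SIA calculation for the mid-trick deal is left out here.

```python
# Expected-utility bookkeeping for the chocolate-bar deals above, using the
# stated assumptions: p(win) = 1e-6, a trillion copies created on a win, and
# current utility equal to the average utility of my future copies.
P_WIN = 1e-6
COPIES = 1e12

# Deal 1: give up 1 bar now; 2 bars go to ONE copy if I win.
deal1 = -1 + P_WIN * (2 / COPIES)    # = -1 + 2/(million * trillion)

# Deal 2: give up 1 bar now; 2 bars go to EVERY copy if I win.
deal2 = -1 + P_WIN * 2               # = -1 + 2/million

# Deal 3 (after being told I've won): give up 1 bar now; the merged survivor
# gets 2, and I weight that survivor equally with myself.
deal3 = -1 + 2                       # = +1

print(deal1, deal2, deal3)
```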
So, to put all these together:
- I do not anticipate winning the lottery through this trick.
- I do not anticipate having won the lottery once the trick is over.
- However, in the middle of the trick, I anticipate having won the lottery.
- This causes a money-pumpable preference reversal.
- And once I've won the lottery, I anticipate continuing to have won the lottery once the trick is over.
Now, some might argue that there are subtle considerations that make my behaviour the right one, despite the seeming contradictions. I'd rather say - especially seeing the money-pump - that my intuitions are wrong, very wrong, terminally wrong, just as non-utilitarian decision theories are.
However, what I started with was a perfectly respectable utility function. So we will need to add other considerations if we want to get an improved, consistent system. Tomorrow, I'll be looking at some of the axioms and assumptions one could use to get one.
Omega can be replaced by amnesia
Let's play a game. Two times, I will give you an amnesia drug and let you enter a room with two boxes inside. Because of the drug, you won't know whether this is the first time you've entered the room. The first time, both boxes will be empty. The second time, box A contains $1000, and box B contains $1,000,000 iff you took only box B the first time. You're in the room: do you take both boxes, or only box B?
This is equivalent to Newcomb's Problem in the sense that any strategy does equally well on both, where by "strategy" I mean a mapping from info to (probability distributions over) actions.
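One way to check the equivalence numerically: suppose a strategy is just a probability p of taking only box B (the same p on both room visits, since they are indistinguishable), and in the Newcomb version let Omega's prediction be an independent run of that same mixed strategy - an assumption of mine, since the post doesn't pin down how Omega handles randomized strategies. Both problems then give the same expected payoff for every p.

```python
# Expected payoff as a function of p = probability of taking only box B, for
# (a) the amnesia version above and (b) Newcomb's Problem where Omega fills
# box B iff an independent simulation of the same mixed strategy one-boxes.
def amnesia_payoff(p):
    # First visit: both boxes empty, payoff 0 whatever you do.
    # Second visit: box A holds $1000; box B holds $1,000,000 iff your
    # (forgotten) first-visit action was one-boxing, which had probability p.
    one_box = p * (p * 1_000_000)
    two_box = (1 - p) * (1000 + p * 1_000_000)
    return one_box + two_box

def newcomb_payoff(p):
    b_full = p                        # Omega's independent simulation one-boxes
    one_box = p * (b_full * 1_000_000)
    two_box = (1 - p) * (1000 + b_full * 1_000_000)
    return one_box + two_box

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert abs(amnesia_payoff(p) - newcomb_payoff(p)) < 1e-9
    print(p, amnesia_payoff(p))
```

In particular, the deterministic strategies recover the familiar payoffs: always two-boxing earns $1000 and always one-boxing earns $1,000,000 in both versions.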
I suspect that any problem with Omega can be transformed into an equivalent problem with amnesia instead of Omega.
Does CDT return the winning answer in such transformed problems?
Discuss.
Discussion for Eliezer Yudkowsky's paper: Timeless Decision Theory
I have not seen any place to discuss Eliezer Yudkowsky's new paper, titled Timeless Decision Theory, so I decided to create a discussion post. (Have I missed an already existing post or discussion?)
Question about self modifying AI getting "stuck" in religion
Hey. I'm relatively new around here. I have read the core reading of the Singularity Institute, and quite a few Less Wrong articles, and Eliezer Yudkowsky's essay on Timeless Decision Theory. This question is phrased through Christianity, because that's where I thought of it, but it's applicable to lots of other religions and nonreligious beliefs, I think.
According to Christianity, belief makes you stronger and better. The Bible claims that people who believe are substantially better off both while living and after death. So if a self-modifying decision maker decides for a second that the Christian faith is accurate, won't he modify his decision-making algorithm to never doubt the truth of Christianity? Given what he knows, it is the best decision.
And so, if we build a self-modifying AI, switch it on, and within the first ten milliseconds it comes to believe in the Christian god, wouldn't that permanently cripple it, as well as probably cause it to fail most definitions of Friendly AI?
When designing an AI, how do you counter this problem? Have I missed something?
Thanks, GSE
EDIT: Yep, I had misunderstood what TDT was. I just meant self modifying systems. Also, I'm wrong.
Rational insanity
My theory on why North Korea has stepped up its provocation of South Korea since their nuclear missile tests is that they see this as a tug-of-war.
Suppose that North Korea wants to keep its nuclear weapons program. If they hadn't sunk a ship and bombed a city, world leaders would currently be pressuring North Korea to stop making nuclear weapons. Instead, they're pressuring North Korea to stop doing something (make provocative attacks) that North Korea doesn't really want to do anyway. And when North Korea (temporarily) stops attacking South Korea, everybody can go home and say they "did something about North Korea". And North Korea can keep on making nukes.
The Aspirin Paradox - a replacement for the Smoking Lesion Problem?
It's been pointed out that the Smoking Lesion problem is a poorly chosen decision theory problem, because in the real world there actually is a direct causal link from smoking to cancer, and people's intuitions are influenced more by that than by the stated parameters of the scenario. In his TDT document, Eliezer concocts a different artificial example (chewing gum and throat abscesses). I recently noticed, though, a potentially good real-world example of the same dynamic: the Aspirin Paradox.
Despite the effectiveness of aspirin in preventing heart attacks, those who regularly take aspirin are at a higher risk of a second heart attack, because those with symptoms of heart disease are more likely than those without symptoms to be taking aspirin regularly. While it turns out this "risk factor" is mostly screened off by other measurable health factors, it's a valid enough correlation for the purposes of decision theory.
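A quick simulation makes the structure concrete. All the numbers below are invented for illustration: in this toy model aspirin genuinely lowers heart-attack risk, but people with heart-disease symptoms are far more likely to take it, so the raw aspirin/heart-attack correlation comes out positive until you condition on symptoms.

```python
# Toy simulation of the Aspirin Paradox. Parameters are made up for
# illustration only: aspirin *reduces* risk causally, but symptomatic people
# are far more likely to take it, so the unconditional correlation between
# aspirin use and heart attacks is positive.
import random
random.seed(0)

def attack_rate(rows):
    return sum(attack for _, _, attack in rows) / max(len(rows), 1)

people = []
for _ in range(100_000):
    symptoms = random.random() < 0.2                 # 20% have heart-disease symptoms
    aspirin = random.random() < (0.8 if symptoms else 0.1)
    base_risk = 0.30 if symptoms else 0.02           # risk driven by underlying disease
    risk = base_risk * (0.7 if aspirin else 1.0)     # aspirin cuts risk by 30%
    attack = random.random() < risk
    people.append((symptoms, aspirin, attack))

print("P(attack | aspirin)    =", attack_rate([p for p in people if p[1]]))
print("P(attack | no aspirin) =", attack_rate([p for p in people if not p[1]]))
for s in (True, False):
    grp = [p for p in people if p[0] == s]
    print(f"symptoms={s}: aspirin", attack_rate([p for p in grp if p[1]]),
          "vs no aspirin", attack_rate([p for p in grp if not p[1]]))
```

Conditioning on symptoms screens off the spurious association, which is exactly the structure the Smoking Lesion problem was meant to capture.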