Thanks. I had one question about your Toward Idealized Decision Theory paper.
I can't say I fully understand UDT, but the 'updateless' part does seem very similar to the "act as if you had precommitted to any action that you'd have wanted to precommit to" core idea of NDT. It's not clear to me that a superpowerful UDT agent would make the wrong decision in the game where two players pick numbers between 0 and 10 and get payouts based on their picks and the total sum.
Wouldn't the UDT reason as follows? "If my algorithm were such that I wouldn't just ...
An AI should certainly cooperate if it discovered that by chance its opposing AI had identical source code.
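The "cooperate with an exact copy" case is simple enough to sketch in code. The toy model below is my own illustration, not anything from the paper: each agent is handed its own source and its opponent's source as strings (a real implementation would read the actual program text), and cooperates exactly when the two are identical. All names and payoff numbers here are hypothetical.

```python
# One-shot Prisoner's Dilemma payoffs (row player, column player): T > R > P > S.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def clique_agent(my_source, opponent_source):
    """Cooperate iff the opponent is an exact textual copy of me."""
    return "C" if opponent_source == my_source else "D"

def defect_bot(my_source, opponent_source):
    """Always defect, regardless of what it reads."""
    return "D"

def play(agent_a, src_a, agent_b, src_b):
    # Each agent sees its own source and the opponent's source, then moves.
    move_a = agent_a(src_a, src_b)
    move_b = agent_b(src_b, src_a)
    return PAYOFF[(move_a, move_b)]
```

Against an exact copy this policy gets mutual cooperation (3, 3); against anything else it falls back to defection, so it can never be exploited for the sucker's payoff.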
I read your paper and the two posts in your short sequence. Thanks for the links. I still think it's very unlikely that one of the AIs in your original hypothetical (when they don't examine each other's source code) would do better by defecting.
I accept that if an opposing AI had a model of you that was just decent but not great, then there is some amount of logical connection there. What I haven't seen is any argument about the shape of the graph o...
I think defect is the right answer in your AI problem and therefore that NDT gets it right, but I'm aware lots of LWers think otherwise. I haven't researched this enough to want to argue it, but is there a discussion you'd recommend I read that spells out the reasoning? Otherwise I'll just look through LW posts on prisoner's dilemmas.
Secondly, I'd like to try to somehow incorporate logical effects into NDT. I agree they're important. Any suggestions for where I could find lots of examples of decision problems where logical effects matter, to help me think about the general case?
In the Retro Blackmail problem, CDT does not precommit to refusing even if it's given the opportunity to do so before the researcher gets its source code.
To clarify: you mean that CDT doesn't precommit at time t=1 even if the researcher hasn't yet gotten the code representing CDT's state at time t=0. The CDT agent doesn't think precommitting will help, because it knows the code the researcher will get is from before its precommitment. I agree that this is true, and a CDT agent won't want to precommit.
I guess my definition even after my clarification is ambiguous, as i...
For example, a decision algorithm based on precommitment is unable to hold selfish preferences (valuing a cookie for me more than a cookie for a copy of me) in anthropic situations.
I disagree that it makes sense to talk about one of the future copies of you being "you" whereas the other isn't. They're both you to the same degree (if they're exact copies).
Eliezer talked about this in his TDT paper. It is possible to hypothesize scenarios where agents get punished or rewarded for arbitrary reasons. For instance, an AI could punish agents who made decisions based on the idea of their choices determining the results of abstract computations (as in TDT). This wouldn't show that TDT is a bad decision theory, or even that it's no better than any other theory.
If we restrict ourselves to action-determined and decision-determined problems (see Eliezer's TDT paper) we can say that TDT is better than CDT, because it get...
I think my definition of NDT above was worded badly. The problematic part is "if he had previously known he'd be in his current situation." Consider this definition:
You should always make the decision that a CDT-agent would have wished he had precommitted to, if he previously considered the possibility of his current situation and had the opportunity to costlessly precommit to a decision.
The key is that the NDT agent isn't behaving as if he knew for sure that he'd end up blackmailed when he made his precommitment (since his precommitment affec...
I believe that NDT gets this problem right.
The paper you link to shows that a pure CDT agent would not self modify into an NDT agent, because a CDT agent wouldn't really have the concept of "logical" connections between agents. The understanding that both logical and causal connections are real things is what would compel an agent to self-modify to NDT.
However, if there was some path by which an agent started out as pure CDT and then became NDT, the NDT agent would still choose correctly on Retro Blackmail even if the researcher had its original...
Idea related to the clipboard, but combined with poker chips:
There is a stack of blank note cards on the table, and several pens/markers. If there's an existing discussion and you want to talk about an unrelated topic, you grab a notecard, write down the topic, and place it face up on the table. At any time, there may be several note cards on the table representing topics people want to talk about. Each person also has a poker chip (or a few) that they may place near a particular card, expressing their interest in talking about that topic. Poker chips are basically upvotes.
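The mechanism above can be sketched as a tiny data structure, just to make the moving parts concrete. This is purely illustrative; the names (TopicTable, propose, add_chip, next_topic) are my own invention, not part of any existing tool.

```python
# Minimal sketch of the notecard-and-chip queue described above.
class TopicTable:
    def __init__(self):
        self.chips = {}  # topic text -> number of chips (upvotes) placed by it

    def propose(self, topic):
        # Writing a card and placing it face up: the topic starts with zero chips.
        self.chips.setdefault(topic, 0)

    def add_chip(self, topic):
        # Placing a poker chip near a card expresses interest in that topic.
        self.chips[topic] = self.chips.get(topic, 0) + 1

    def next_topic(self):
        # When the current discussion ends, pick the card with the most chips.
        return max(self.chips, key=self.chips.get)
```

The nice property is the same as with the physical cards: proposing a topic and voting on it are decoupled, so nobody has to interrupt the current conversation to register interest.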
I stutter and have done a lot of research on stuttering. It's rare that adult stutterers ever completely stop stuttering, but these two ebooks are the best resources I know of for dealing with it:
http://www.stutteringhelp.org/Portals/English/Book_0012_tenth_ed.pdf
http://www.scribd.com/doc/23283047/Easy-Stuttering-Avoidance-Reduction-Therapy
The short version is that the less you try to suppress or conceal your stuttering, the less severe it will become in the long run.
"the plan that lets you save money in the US is a life-engulfing minefield of time-consuming bargin-hunting, self-denial, and tax evasion."
I work as a software developer in the US. I had never made a 'budget' for myself or tried to analyze my finances before now; I pay taxes normally, eat out often, and have no trouble saving lots of money. I'm going to substitute in my own expenses, pretend I only make 100k, and see how much I'd still be able to save (living in Seattle).
Rent: 16.8k instead of 23.2k
Utilities: 2k instead of 7k (how can you spend 7k on u...
Steve Omohundro says:
"1) Nobody powerful wants to create unsafe AI but they do want to take advantage of AI capabilities.
2) None of the concrete well-specified valuable AI capabilities require unsafe behavior"
I think a lot of powerful people and organizations do want to take advantage of possibly unsafe AI capabilities, such as ones that would allow them to be the emperors of the universe for all time. Especially if not doing so means that their rivals have a higher chance of becoming the emperors of the universe.