Comment author: patrickscottshields 28 April 2013 01:53:56AM 0 points [-]

I'd like to cite this article (or related published work) in a research project paper I'm writing which includes application of an expected utility-maximizing algorithm to a version of the prisoner's dilemma. Do you have anything more cite-able than this article's URL and your LW username? I didn't see anything in your profile which could point me towards your real name and anything you might have published.

Comment author: wedrifid 04 April 2013 04:09:28AM 0 points [-]

The strategy I have been considering in my attempt to prove a paradox inconsistent is to prove a contradiction using the problem formulation.

This seems like a worthy approach to paradoxes! I'm going to suggest the possibility of broadening your search slightly. Specifically, to include the claim "and this is paradoxical" as one of the things that can be rejected as producing contradictions. Because in this case there just isn't a paradox. You take the one box, get rich and if there is a decision theory that says to take both boxes you get a better theory. For this reason "Newcomb's Paradox" is a misnomer and I would only use "Newcomb's Problem" as an acceptable name.

In Newcomb's problem, suppose each player uses a fair coin flip to decide whether to one-box or two-box. Then Omega could not have a sustained correct prediction rate above 50%. But the problem formulation says Omega does; therefore the problem must be inconsistent.

Yes, if the player is allowed access to entropy that Omega cannot have then it would be absurd to also declare that Omega can predict perfectly. If the coin flip is replaced with a quantum coinflip then the problem becomes even worse because it leaves an Omega that can perfectly predict what will happen but is faced with a plainly inconsistent task of making contradictory things happen. The problem specification needs to include a clause for how 'randomization' is handled.

Alternatively, Omega knew the outcome of the coin flip in advance; let's say Omega has access to all relevant information, including any supposed randomness used by the decision-maker. Then we can consider the decision to already have been made; the idea of a choice occurring after Omega has left is illusory (i.e. deterministic; anyone with enough information could have predicted it.)

Here is where I should be able to link you to the wiki page on free will where you would be given an explanation of why the notion that determinism is incompatible with choice is a confusion. Alas that page still has pretentious "Find Out For Yourself" tripe on it instead of useful content. The wikipedia page on compatibilism is somewhat useful but not particularly tailored to a reductionist decision theory focus.

In this case of the all-knowing Omega, talking about what someone should choose after Omega has left seems mistaken. The agent is no longer free to make an arbitrary decision at run-time, since that would have backwards causal implications; we can, without restricting which algorithm is chosen, require the decision-making algorithm to be written down and provided to Omega prior to the whole simulation. Since Omega can predict the agent's decision, the agent's decision does determine what's in the box, despite the usual claim of no causality. Taking that into account, CDT doesn't fail after all.

There have been attempts to create derivatives of CDT that work like that. That replace the "C" from conventional CDT with a type of causality that runs about in time as you mention. Such decision theories do seem to handle most of the problems that CDT fails at. Unfortunately I cannot recall the reference.

I would love to hear from someone in further detail on these issues of consistency. Have they been addressed elsewhere? If so, where?

I'm not sure which further details you are after. Are you after a description of Newcomb's problem that includes the details necessary to make it consistent? Or about other potential inconsistencies? Or other debates about whether the problems are inconsistent?

Comment author: patrickscottshields 11 April 2013 03:32:43PM 0 points [-]

I'm not sure which further details you are after.

Thanks for the response! I'm looking for a formal version of the viewpoint you reiterated at the beginning of your most recent comment:

Yes, if the player is allowed access to entropy that Omega cannot have then it would be absurd to also declare that Omega can predict perfectly. [...] The problem specification needs to include a clause for how 'randomization' is handled.

That makes a lot of sense, but I haven't been able to find it stated formally. Wolpert and Benford's papers (using game theory decision trees or alternatively plain probability theory) seem to formally show that the problem formulation is ambiguous, but they are recent papers, and I haven't been able to tell how well they stand up to outside analysis.

If there is a consensus that the sufficient use of randomness prevents Omega from having perfect or nearly perfect predictions, then why is Newcomb's problem still relevant? If there's no randomness, wouldn't an appropriate application of CDT result in one-boxing since the decision-maker's choice and Omega's prediction are both causally determined by the decision-maker's algorithm, which was fixed prior to the making of the decision?

There have been attempts to create derivatives of CDT that work like that. That replace the "C" from conventional CDT with a type of causality that runs about in time as you mention. Such decision theories do seem to handle most of the problems that CDT fails at. Unfortunately I cannot recall the reference.

I'm curious: why can't normal CDT handle it by itself? Consider two variants of Newcomb's problem:

  1. At run-time, you get to choose the actual decision made in Newcomb's problem. Omega made its prediction without any information about your choice or what algorithms you might use to make it. In other words, Omega doesn't have any particular insight into your decision-making process. This means at run-time you are free to choose between one-boxing and two-boxing without backwards causal implications. In this case Omega cannot make perfect or nearly perfect predictions, for reasons of randomness which we already discussed.
  2. You get to write the algorithm, the output of which will determine the choice made in Newcomb's problem. Omega gets access to the algorithm in advance of its prediction. No run-time randomness is allowed. In this case, Omega can be a perfect predictor. But the correct causal network shows that both the decision-maker's "choice" as well as Omega's prediction are causally downstream from the selection of the decision-making algorithm. CDT holds in this case because you aren't free at run-time to make any choice other than what the algorithm outputs. A CDT algorithm would identify two consistent outcomes: (one-box && Omega predicted one-box), and (two-box && Omega predicted two-box). Coded correctly, it would prefer whichever consistent outcome had the highest expected utility, and so it would one-box.

(Note: I'm out of my depth here, and I haven't given a great deal of thought to precommitment and the possibility of allowing algorithms to rewrite themselves.)

Comment author: patrickscottshields 17 March 2013 07:26:29PM 2 points [-]

This seems like an opportunity for a startup. It could be a fun project to build startup weekend-style. The concept doesn't seem particular tied to the Less Wrong community, and (based on a couple minutes searching for "online study halls") there don't seem to be other prominent startups taking on this specific challenge.

In response to comment by linas on Decision Theory FAQ
Comment author: wedrifid 05 March 2013 07:34:25AM 2 points [-]

Presentation of Newcomb's problem in section 11.1.1. seems faulty. What if the human flips a coin to determine whether to one-box or two-box? (or any suitable source of entropy that is beyond the predictive powers of the super-intelligence.) What happens then?

If the FAQ left this out then it is indeed faulty. It should either specify that if Omega predicts the human will use that kind of entropy then it gets a "Fuck you" (gets nothing in the big box, or worse) or, at best, that Omega awards that kind of randomization with a proportional payoff (ie. If behavior is determined by a fair coin then the big box contains half the money.)

This is a fairly typical (even "Frequent") question so needs to be included in the problem specification. But it can just be considered a minor technical detail.

Comment author: patrickscottshields 11 March 2013 03:10:07AM *  0 points [-]

This response challenges my intuition, and I would love to learn more about how the problem formulation is altered to address the apparent inconsistency in the case that players make choices on the basis of a fair coin flip. See my other post.

In response to Decision Theory FAQ
Comment author: incogn 28 February 2013 05:33:54PM *  9 points [-]

I don't really think Newcomb's problem or any of its variations belong in here. Newcomb's problem is not a decision theory problem, the real difficulty is translating the underspecified English into a payoff matrix.

The ambiguity comes from the the combination of the two claims, (a) Omega being a perfect predictor and (b) the subject being allowed to choose after Omega has made its prediction. Either these two are inconsistent, or they necessitate further unstated assumptions such as backwards causality.

First, let us assume (a) but not (b), which can be formulated as follows: Omega, a computer engineer, can read your code and test run it as many times as he would like in advance. You must submit (simple, unobfuscated) code which either chooses to one- or two-box. The contents of the boxes will depend on Omega's prediction of your code's choice. Do you submit one- or two-boxing code?

Second, let us assume (b) but not (a), which can be formulated as follows: Omega has subjected you to the Newcomb's setup, but because of a bug in its code, its prediction is based on someone else's choice than yours, which has no correlation with your choice whatsoever. Do you one- or two-box?

Both of these formulations translate straightforwardly into payoff matrices and any sort of sensible decision theory you throw at them give the correct solution. The paradox disappears when the ambiguity between the two above possibilities are removed. As far as I can see, all disagreement between one-boxers and two-boxers are simply a matter of one-boxers choosing the first and two-boxers choosing the second interpretation. If so, Newcomb's paradox is not as much interesting as poorly specified. The supposed superiority of TDT over CDT either relies on the paradox not reducing to either of the above or by fiat forcing CDT to work with the wrong payoff matrices.

I would be interested to see an unambiguous and nontrivial formulation of the paradox.

Some quick and messy addenda:

  • Allowing Omega to do its prediction by time travel directly contradicts box B contains either $0 or $1,000,000 before the game begins, and once the game begins even the Predictor is powerless to change the contents of the boxes. Also, this obviously make one-boxing the correct choice.
  • Allowing Omega to accurately simulate the subject reduces to problem to submit code for Omega to evaluate; this is not exactly paradoxical, but then the player is called upon to choose which boxes to take actually means the code then runs and returns its expected value, which clearly reduces to one-boxing.
  • Making Omega an imperfect predictor, with an accuracy of p<1.0 simply creates a superposition of the first and second case above, which still allows for straightforward analysis.
  • Allowing unpredictable, probabilistic strategies violates the supposed predictive power of Omega, but again cleanly reduces to payoff matrices.
  • Finally, the number of variations such as the psychopath button are completely transparent, once you decide between choice is magical and free will and stuff which leads to pressing the button, and the supposed choice is deterministic and there is no choice to make, but code which does not press the button is clearly the most healthy.
In response to comment by incogn on Decision Theory FAQ
Comment author: patrickscottshields 11 March 2013 03:02:44AM *  1 point [-]

Thanks for this post; it articulates many of the thoughts I've had on the apparent inconsistency of common decision-theoretic paradoxes such as Newcomb's problem. I'm not an expert in decision theory, but I have a computer science background and significant exposure to these topics, so let me give it a shot.

The strategy I have been considering in my attempt to prove a paradox inconsistent is to prove a contradiction using the problem formulation. In Newcomb's problem, suppose each player uses a fair coin flip to decide whether to one-box or two-box. Then Omega could not have a sustained correct prediction rate above 50%. But the problem formulation says Omega does; therefore the problem must be inconsistent.

Alternatively, Omega knew the outcome of the coin flip in advance; let's say Omega has access to all relevant information, including any supposed randomness used by the decision-maker. Then we can consider the decision to already have been made; the idea of a choice occurring after Omega has left is illusory (i.e. deterministic; anyone with enough information could have predicted it.) Admittedly, as you say quite eloquently:

Choice is not something inherent to a system, but a feature of an outsider's model of a system, in much the same sense as random is not something inherent to a Eeny, meeny, miny, moe however much it might seem that way to children.

In this case of the all-knowing Omega, talking about what someone should choose after Omega has left seems mistaken. The agent is no longer free to make an arbitrary decision at run-time, since that would have backwards causal implications; we can, without restricting which algorithm is chosen, require the decision-making algorithm to be written down and provided to Omega prior to the whole simulation. Since Omega can predict the agent's decision, the agent's decision does determine what's in the box, despite the usual claim of no causality. Taking that into account, CDT doesn't fail after all.

It really does seem to me like most of these supposed paradoxes of decision theory have these inconsistent setups. I see that wedrifid says of coin flips:

If the FAQ left this out then it is indeed faulty. It should either specify that if Omega predicts the human will use that kind of entropy then it gets a "Fuck you" (gets nothing in the big box, or worse) or, at best, that Omega awards that kind of randomization with a proportional payoff (ie. If behavior is determined by a fair coin then the big box contains half the money.)

This is a fairly typical (even "Frequent") question so needs to be included in the problem specification. But it can just be considered a minor technical detail.

I would love to hear from someone in further detail on these issues of consistency. Have they been addressed elsewhere? If so, where?

Comment author: patrickscottshields 10 March 2013 11:25:19PM *  2 points [-]

Task management has become a passion of mine; for the last two years or so I've been trying to build something close to what you're describing. I think it's cool that you're giving this a shot. Here are some of my thoughts:

  • Start small. Building good task management software is a hard challenge, potentially several orders of magnitude harder than you're expecting. I continually underestimated how much effort it would take to build my task management software.
  • If you want to work on this full-time, consider joining an existing team. Companies such as Asana are already in the task management space, and they have teams of software engineers and data scientists working on cool things. Joining an existing team allows you to specialize on a part of the software, whereas you might spread yourself too thin if you are responsible for all components of the project. Joining an existing team is basically what I'm trying to do now, after I decided pursuing my startup further was suboptimal. (Potential employers reading this: please contact me!)
  • Focus on the fundamental algorithms and APIs before considering presentation. Target the command line; in the browser, it's easy to get distracted by user experience issues and end up prematurely optimizing for them. Unless your software actually does the awesome things you want it to do on the technical side, it won't matter how nice its interface is. Developing for the command line forces you to focus on the actual algorithms and APIs.
  • Don't reinvent things and don't allow feature creep. If you feel like you're doing something new, do more research. Very little in the way of new algorithms, math, data structures, etc. is necessary in this area; most of the work to be done is in picking which things to use that have already been invented. Keep your code base and features small so you don't get overwhelmed by technical debt.
  • Take free online classes in algorithms, data structures, software development, machine learning, statistics, information theory, logic, AI, and planning. There's so much cool stuff out there that you might not know about, which could be useful for this sort of endeavor. For example, something I learned from Tim Roughgarden's algorithms class on Coursera is that a set of tasks with precedence constraints (e.g. constraints of the form "task X must be completed before task Y") can be represented by a directed graph. If the graph is acyclic, a topological sort can, in linear time, can give a sequence of tasks that respects all precedence constraints (if the graph is cyclic, no such sequence exists.)
  • One avenue to explore with this kind of software is data entry optimization. What optimal subset of data should be collected from the user? Data entry consumes users' time; it's suboptimal for a user with thousands of tasks to routinely update each task's parameters. I think by looking at tasks' parameters as random variables, we can use information theory and machine learning to decide when users should be asked to update various data. I wrote a paper exploring this.

One thing you have going for you is that your project is open source. That allows for a lot of little contributions from people who are interested in the work, but already have their own sources of income. That might allow the project to survive where my startup failed. I still care deeply about task management, so it's possible our work will intersect in the future. I'm now following the GitHub repository you made for this project.

Comment author: beoShaffer 31 January 2013 08:56:21PM *  17 points [-]

[Note: I also attended the workshop and am writing this before reading the main post to avoid being biased.]

The out of class discussions alone were more than worth my while. However this is partially due to my circumstances, I am about to graduate college with a double major in CS and psychology. Deciding which (if either) field to pursue, and how to best pursue it is obviously quite important to me. Before going to the workshop, I had already done most of the obvious “due diligence” research on the matter, but still updated significantly during the workshop. Talking to several of the other CfAR participants, the instructors, and several LW people who stopped by (Lukeprog, Yvain ect.) uncovered many important points that I had missed. For example, I assumed the median staring salary for computer scientists was a reasonable estimate for what my starting salary would be. It turns out that I can expect to make about twice that much money if I use certain job hunting techniques I learned at the workshop and optimize for money (instead of, say, cool sounding problems). I don't expect most people to be in similar positions, but I do think asking questions at a CfAR event is a very powerful way of researching a fairly large class of problems. If all else fails pick one position and offer ridiculous side-bet odds with a high burden of proof on the other person.

As for the official lessons, if I actually internalize them all, and they work as advertised they will be some of the most important things I have learned in my life. If we take the outside view and assume that I will retain and successfully use them at the same rate as skills I learned at similar retreats then the expected value drops by more than half, but is still pretty high. Furthermore, I have several strong inside view reasons to believe I will retain more of what I learned here. The most obvious is the six weeks of follow up, but there are other reasons, like the fact that I actually slept at CfAR which should increase my retention.

-Edit, minor gramar stuff fixed. Also, I endorse the OP's view of the workshop, except for the sleep part. In particular I forgot to mention the venue, but both the Rose Garden Inn and the food were both spectacular (this was despite my having dietary restrictions). One minor complaint, the Inn could use more writing/setting food down surfaces.

In conclusion tell CfAR “Shut up and take my money.” or you will not go to space today.

Comment author: patrickscottshields 01 February 2013 01:23:05AM *  4 points [-]

For example, I assumed the median staring salary for computer scientists was a reasonable estimate for what my starting salary would be. It turns out that I can expect to make about twice that much money if I use certain job hunting techniques I learned at the workshop and optimize for money (instead of, say, cool sounding problems).

What changed your expectation of your starting salary?

Comment author: jsalvatier 14 January 2013 06:14:05PM 2 points [-]

I think it might help to do your questioning in private or at least in front of as few people as possible. Questioning is more likely to be social challenging questioning if its done in public rather than private.

Comment author: patrickscottshields 16 January 2013 12:18:50AM 2 points [-]

The potential benefits from private questioning should be weighed against the cost of the information not being visible to others. I like to see Wei_Dai's questions and the responses they elicit. I think the public exchanges have significant value beyond the immediate participants.

Comment author: drethelin 25 October 2012 10:57:34PM 9 points [-]

I think it's more that if we are being simulated, the most likely simulating party is our future, if we take our OWN propensity to simulate our past as evidence. Human simulators are far more likely to simulate humans out of all possible intelligences than other intelligences are.

Comment author: patrickscottshields 27 October 2012 04:22:44AM 2 points [-]

If our simulators are human, that implies that their universe has laws of physics similar to our own. But if we're living in a simulation, I think it's more plausible that our simulators exist in a world operating under different laws of physics (e.g. they live in a universe which is more amenable to our-universe-scale simulation.) So I think other factors are in play which could lessen the probability that we are being simulated by humans, let alone our future.

Comment author: James_Miller 25 October 2012 11:20:32PM 3 points [-]

The simulators are much like us, or at least are our post-human descendants.

No, they probably have a much greater desire to run simulations than we do or ever will.

Comment author: patrickscottshields 27 October 2012 04:04:24AM 2 points [-]

Or maybe just greater means. I imagine many humans would run universe-scale simulations, if they had the means.

View more: Next