A good guide is linked here (it's quite long, though).
Mostly I just wanted to make a short version.
The upshot is that causal decision theory is optimal...
They're all optimal if you look at a situation in the right way. It's a question of what you count as given when you do the optimizing.
Evidential decision theory is harder to describe, because it is flawed - it falls to Simpson's paradox.
That explains why they don't come out the same. The flaws are based on what is and isn't given. For example, according to a proponent of CDT, your decision on Newcomb's problem can't possibly change whether Box A has money, so controlling for whether it has money shouldn't affect your decision. Similarly, in Parfit's hitchhiker, a proponent of EDT would say that they already know whether they got picked up, and they're not going to base their decision on counterfactuals.
Optimal by the fairly obvious criterion of "gets agents who use it maximal rewards." If you cared about which decision theory you used because of some extra factor, then once that factor is cast in terms of reward, the problem would no longer be one where the rewards are solely action-determined or decision-determined.
If you prefer, I'm sure you could recast it using the word "wins."
I couldn't find any concise explanation of what the decision theories are. Here's mine:
A Causal Decision Theorist wins, given what's happened so far.
An Evidential Decision Theorist wins, given what they know.
A Timeless Decision Theorist wins a priori.
To explain what I mean, here are two interesting problems. In each of them, two of the decision theories give one choice, and the third gives the other.
In Newcomb's problem, if you separate people into groups based on what happened before the experiment, i.e. whether or not Box A has money, CDT will be at least as successful in each group as any other strategy, and notably more successful than EDT and TDT. If you separate by what's known, there's only one group, since everybody has the same information. EDT is at least as successful as any other strategy, and notably more successful than CDT. If you don't separate at all, TDT will be at least as successful as any other strategy, and notably more successful than CDT.
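To make the grouping concrete, here's a rough sketch in Python. The dollar amounts and the 90% predictor accuracy are my own assumptions for illustration; nothing above pins them down. `payoff` and `expected_payoff` are just names I made up. The idea is only to show that two-boxing comes out ahead inside each "Box A full / Box A empty" group, while one-boxing comes out ahead when nobody is grouped.

```python
# Illustrative numbers (assumed for this sketch).
BOX_A_PRIZE = 1_000_000  # in Box A only if one-boxing was predicted
BOX_B_PRIZE = 1_000      # always in Box B
ACCURACY = 0.9           # assumed chance the prediction is right


def payoff(box_a_full: bool, one_box: bool) -> int:
    """Money received, given Box A's state and the agent's choice."""
    total = BOX_A_PRIZE if box_a_full else 0
    if not one_box:
        total += BOX_B_PRIZE  # two-boxers also take Box B
    return total


def expected_payoff(one_box: bool, accuracy: float = ACCURACY) -> float:
    """Average payoff with no grouping: Box A is full when 'one-box' was predicted."""
    p_full = accuracy if one_box else 1 - accuracy
    return p_full * payoff(True, one_box) + (1 - p_full) * payoff(False, one_box)


# Grouped by what already happened: two-boxing is ahead in both groups.
for box_a_full in (True, False):
    print("Box A full:", box_a_full,
          "| one-box:", payoff(box_a_full, one_box=True),
          "| two-box:", payoff(box_a_full, one_box=False))

# Not grouped at all: one-boxing is ahead.
print("one-box average:", expected_payoff(one_box=True))
print("two-box average:", expected_payoff(one_box=False))
```

Since everyone in Newcomb's problem knows the same thing, grouping by what's known gives the same comparison as the ungrouped averages.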
In Parfit's hitchhiker, when it comes time to pay the driver, if you split into groups based on what happened before the decision, i.e. whether or not one has been picked up, CDT will be at least as successful in each group as any other strategy, and notably more successful than TDT. If you split based on what's known, which is again whether or not one has been picked up, EDT will be at least as successful in each group as any other strategy, and notably more successful than TDT. If you don't separate at all, TDT will be at least as successful as any other strategy, and notably more successful than CDT and EDT.
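Here's a similar rough sketch for the hitchhiker. The value of being rescued, the fare, and the driver's 90% accuracy at reading your intent are all assumed numbers, and `outcome` / `expected_outcome` are hypothetical names. It's only meant to show that not paying is ahead among people who were already picked up, while being the sort of person who pays is ahead when you look at everyone.

```python
# Illustrative numbers (assumed for this sketch).
RESCUE_VALUE = 1_000_000  # worth of being driven out of the desert
FARE = 100                # what the driver asks for once in town
ACCURACY = 0.9            # assumed chance the driver reads your intent correctly


def outcome(picked_up: bool, pays: bool) -> int:
    """Net value, given what already happened and the choice made in town."""
    if not picked_up:
        return 0  # stranded; the question of paying never comes up
    return RESCUE_VALUE - (FARE if pays else 0)


def expected_outcome(pays: bool, accuracy: float = ACCURACY) -> float:
    """Average over everyone with this disposition, with no grouping."""
    p_picked_up = accuracy if pays else 1 - accuracy
    return p_picked_up * outcome(True, pays)


# Among those already picked up, refusing to pay keeps the fare.
print("picked up | pays:", outcome(True, pays=True),
      "| refuses:", outcome(True, pays=False))

# Across everyone, payers do better because they mostly get picked up.
print("payer average:", expected_outcome(pays=True))
print("refuser average:", expected_outcome(pays=False))
```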
There's one thing I'm not sure about. How does Updateless Decision Theory compare?