A causal model to me is a set of joint distributions defined over potential outcome random variables.
Huh?
Can you expand on this, with special attention to the difference between the model and the result of a model, and to the differences from plain-vanilla Bayesian models, which also produce joint distributions over outcomes?
Sure. Here's the world's simplest causal graph: A -> B.
Rubin et al., who do not like graphs, will instead talk about a joint distribution:
p(A, B(a=1), B(a=0))
where B(a=1) means 'random variable B under intervention do(a=1)'. Assume binary A for simplicity here.
A causal model over A, B is a set of densities
{ p(A, B(a=1), B(a=0)) | [ some property ] }
The causal model for this graph would be:
{ p(A, B(a=1), B(a=0)) | B(a=1) is independent of A, and B(a=0) is independent of A }
These assumptions are called 'ignorability assumptions' in the literature, and th...
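To make the independence condition concrete, here is a minimal simulation sketch. It is not from the comment above: the function name `simulate`, the hidden trait `u`, and every probability in it are made-up assumptions for illustration. It draws units from one joint distribution p(A, B(a=1), B(a=0)) that satisfies the independence conditions (so it would be a member of the set) and from one that violates them, then compares the average of B(a=1) - B(a=0) with the naive contrast computed only from the outcomes that would actually be observed.

```python
import random


def simulate(n, ignorable, seed=0):
    """Draw n units of (A, B(a=1), B(a=0)).

    Returns (average of B(a=1) - B(a=0), naive observed contrast).
    All probabilities below are hypothetical illustration values.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Hidden trait that shapes both potential outcomes (an assumption).
        u = rng.random() < 0.5
        b1 = int(rng.random() < (0.9 if u else 0.5))  # B(a=1)
        b0 = int(rng.random() < (0.6 if u else 0.2))  # B(a=0)
        if ignorable:
            # A is a fair coin flip, so it is independent of B(a=1) and B(a=0).
            a = int(rng.random() < 0.5)
        else:
            # A depends on u, and therefore on the potential outcomes.
            a = int(rng.random() < (0.8 if u else 0.2))
        rows.append((a, b1, b0))

    # Average causal effect, computable here only because the simulation
    # records both potential outcomes for every unit.
    ate = sum(b1 - b0 for _, b1, b0 in rows) / n

    # Observed data reveal B(a=1) only when A=1 and B(a=0) only when A=0.
    treated = [b1 for a, b1, _ in rows if a == 1]
    control = [b0 for a, _, b0 in rows if a == 0]
    naive = sum(treated) / len(treated) - sum(control) / len(control)
    return ate, naive


for flag in (True, False):
    ate, naive = simulate(100_000, ignorable=flag)
    print(f"ignorable={flag}: average effect={ate:.3f}, naive contrast={naive:.3f}")
```

Under the coin-flip assignment the two numbers should roughly agree, while under the dependent assignment the naive contrast should drift away from the average effect. That gap is what the independence conditions rule out.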
Yann LeCun, now of Facebook, was interviewed by The Register. It is interesting that his view of AI is apparently that of a prediction tool:
"In some ways you could say intelligence is all about prediction," he explained. "What you can identify in intelligence is it can predict what is going to happen in the world with more accuracy and more time horizon than others."
rather than of a world optimizer. This is not very surprising, given his background in handwriting and image recognition. This "AI as intelligence augmentation" view appears to be prevalent among AI researchers in general.