That's a strawman. The conditional probability we're talking about has a clear (if not explicitly stated) temporal ordering: P(rain in the past | wet grass in the present).
You seem to be missing Ilya's point. He was arguing that if you regard "under intervention do(A = 1)" as equivalent to "conditional on A = 1" (as you suggested in a previous comment), then you should regard P(rain | do(grass wet)) as equivalent to P(rain | grass wet). But these are not in fact equivalent, and adding temporal ordering in there doesn't make them equivalent either. P(rain in the past | do(wet grass) in the present) = P(rain in the past), but P(rain in the past | wet grass in the present) != P(rain in the past).
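To make the distinction concrete, here is a minimal sketch of a toy model. Everything in it is an assumption made up for illustration: the sprinkler as a second cause of wet grass, the probabilities 3/10 and 1/5, and the names `joint` and `p_rain_given_wet`. Grass is wet if it rained or the sprinkler ran; under do(wet grass) the wetness is set by hand, so it no longer carries any information about rain.

```python
from fractions import Fraction
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_RAIN = Fraction(3, 10)       # prior probability that it rained
P_SPRINKLER = Fraction(1, 5)   # prior probability that the sprinkler ran


def joint(do_wet=None):
    """Yield (rain, wet, probability) for every state of the toy world.

    With do_wet=None the grass is wet iff it rained or the sprinkler ran.
    With do_wet=True/False the wetness is forced, which cuts the causal
    link from rain and sprinkler to the grass.
    """
    for rain, sprinkler in product([True, False], repeat=2):
        p = (P_RAIN if rain else 1 - P_RAIN) * (
            P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
        )
        wet = (rain or sprinkler) if do_wet is None else do_wet
        yield rain, wet, p


def p_rain_given_wet(do_wet=None):
    """P(rain | wet) in the observational or the intervened model."""
    rows = list(joint(do_wet))
    p_wet = sum(p for _, wet, p in rows if wet)
    p_rain_and_wet = sum(p for rain, wet, p in rows if rain and wet)
    return p_rain_and_wet / p_wet


observational = p_rain_given_wet()               # P(rain | wet grass)
interventional = p_rain_given_wet(do_wet=True)   # P(rain | do(wet grass))

print("P(rain)                 =", P_RAIN)
print("P(rain | wet grass)     =", observational)
print("P(rain | do(wet grass)) =", interventional)
```

In this toy model, observing wet grass raises the probability of rain from 3/10 to 15/22, while forcing the grass to be wet leaves it at 3/10. Attaching "in the past" to rain changes neither number.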
He was arguing that if you regard "under intervention do(A = 1)" as equivalent to "conditional on A = 1" (as you suggested in a previous comment), then you should regard P(rain | do(grass wet)) as equivalent to P(rain | grass wet).
There is obviously a difference between observational data and experiments.
But these are not in fact equivalent
No, because they're modeling different realities.
Yann LeCun, now of Facebook, was interviewed by The Register. It is interesting that his view of AI is apparently that of a prediction tool:
"In some ways you could say intelligence is all about prediction," he explained. "What you can identify in intelligence is it can predict what is going to happen in the world with more accuracy and more time horizon than others."
rather than of a world optimizer. This is not very surprising, given his background in handwriting and image recognition. This "AI as intelligence augmentation" view appears to be prevalent among AI researchers in general.