Mostly the essay is careful not to flatly say that a node value X_1 is a function of a node value X_2.
I'm not sure where you're getting that. Here is how he describes the dependence among nodes under the "first approach":
Motivated by the above discussion, one way we could define causal influence would be to require that X_j be a function of its parents:
X_j = f_j(X_{\mbox{pa}(j)}),
where f_j(\cdot) is some function. In fact, we’ll allow a slightly more general notion of causal influence, allowing X_j to not just be a deterministic function of the parents, but a random function. We do this by requiring that X_j be expressible in the form:
X_j = f_j(X_{\mbox{pa}(j)},Y_{j,1},Y_{j,2},\ldots),
where f_j is a function, and Y_{j,\cdot} is a collection of random variables such that: (a) the Y_{j,\cdot} are independent of one another for different values of j; and (b) for each j, Y_{j,\cdot} is independent of all variables X_k, except when X_k is X_j itself, or a descendant of X_j.
So, unless you have some exogenous Y-nodes, X_j will be a deterministic function of its parent X-nodes. The only way he introduces randomness is by introducing the Y-nodes. My question is, how is that any different from the "alternative approach" that he discusses later?
Y_{j,\cdot} is a collection of random variables
That is not the same as there being Y-nodes. Nodes would be part of the graph structure, and so would be visible when you look at the graph.
The only difference is whether the Y-values require their own nodes.
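To make that concrete, here is a minimal Python sketch of the two bookkeeping choices, on the reading of the essay given in this thread. The two-node chain X_1 -> X_2, the mechanisms `f1` and `f2`, and the thresholds 0.5 and 0.9 are all made up for illustration; none of them come from Nielsen's essay. In the first sampler the Y values are drawn inside each structural equation and never appear as nodes. In the second they are recorded as root nodes of their own, and every X is then a deterministic function of its parents.

```python
import random


def f1(y1):
    # Hypothetical mechanism for a parentless node: X_1 = f_1(Y_1).
    return 1 if y1 < 0.5 else 0


def f2(x1, y2):
    # Hypothetical mechanism: X_2 copies X_1 unless the noise Y_2 flips it.
    return x1 if y2 < 0.9 else 1 - x1


def sample_hidden_noise(seed):
    """Noise is consumed inside each equation; only X nodes are recorded."""
    rng = random.Random(seed)
    x1 = f1(rng.random())        # Y_1 is drawn here but never stored
    x2 = f2(x1, rng.random())    # Y_2 is drawn here but never stored
    return {"X1": x1, "X2": x2}


def sample_explicit_noise(seed):
    """Noise values are stored as root nodes; each X is deterministic given its parents."""
    rng = random.Random(seed)
    nodes = {}
    nodes["Y1"] = rng.random()
    nodes["X1"] = f1(nodes["Y1"])
    nodes["Y2"] = rng.random()
    nodes["X2"] = f2(nodes["X1"], nodes["Y2"])
    return nodes


if __name__ == "__main__":
    for seed in range(5):
        hidden = sample_hidden_noise(seed)
        explicit = sample_explicit_noise(seed)
        print(seed, hidden, explicit)
```

With the same seed, both samplers consume the same random draws in the same order, so the X values agree. The only thing that changes is whether the Y values show up in the returned graph state.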
Michael Nielsen has posted a long essay explaining his understanding of the Pearlean causal DAG model. I followed no more than half of it, but that's much more than I got out of a few other papers. Strongly recommended for anyone interested in the topic.