On the basis of these remarks I submit the following qualified statement: while the belief network paradigm is mathematically elegant and intuitively appealing, it is NOT very useful for describing real data.
Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given problems; this is also referred to as the "no free lunch" theorem. Determining a suitable classifier for a given problem is still more an art than a science.
The challenge is just as wrong; to quote from the wiki:
Russell and Norvig, 1st ed., has a good example comparing the performance of a Bayes net with a decision tree on data that was generated by a decision-tree-like process. Of course the net did not perform as well as the decision tree on that data; surprise, surprise.
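To make the point concrete, here is a minimal sketch of that kind of mismatch. It is not the Russell and Norvig experiment: the toy data, the function names, and the choice of naive Bayes (the most restricted kind of Bayes net, with every feature depending only on the label) are all assumptions made for illustration. The label is the XOR of two binary features, which is exactly what a depth-2 decision tree computes, and the tree here is written by hand to mirror the generating process rather than learned from the data.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical toy data: label = XOR of two binary features,
# i.e. data generated by a decision-tree-like process.
data = [((a, b), a ^ b) for a, b in product([0, 1], repeat=2)] * 25


def train_naive_bayes(rows):
    """Estimate P(y) and P(x_i | y) with add-one smoothing."""
    prior = Counter(y for _, y in rows)
    cond = defaultdict(Counter)  # (feature index, label) -> value counts
    for x, y in rows:
        for i, v in enumerate(x):
            cond[(i, y)][v] += 1

    def predict(x):
        def score(y):
            p = prior[y] / len(rows)
            for i, v in enumerate(x):
                # Two possible values per feature, hence the "+ 2".
                p *= (cond[(i, y)][v] + 1) / (prior[y] + 2)
            return p
        # Ties are broken in favour of the smallest label.
        return max(sorted(prior), key=score)

    return predict


def tree_predict(x):
    """Hand-written depth-2 tree: split on x[0], then on x[1]."""
    if x[0] == 0:
        return 1 if x[1] == 1 else 0
    return 0 if x[1] == 1 else 1


def accuracy(predict, rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)


nb_acc = accuracy(train_naive_bayes(data), data)
tree_acc = accuracy(tree_predict, data)
print(f"naive Bayes: {nb_acc:.2f}, decision tree: {tree_acc:.2f}")
```

On this data each feature, taken alone, says nothing about the label, so the naive Bayes scores tie and it does no better than chance, while the tree is perfect by construction. That cuts both ways: a Bayes net whose label node has both features as parents could represent XOR exactly, so the sketch shows only that a model family does badly on data shaped for a different family, not that either paradigm is useless.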