What I'm trying to figure out is how to determine whether a source I'm looking at is telling the truth. For example, let's take this page from Metamed: http://www.metamed.com/vital-facts-and-statistics
At first glance, I see some obvious things I ought to consider. The page often gives raw numbers for how many patients die in hospitals per year, but for my purposes I ought to interpret those numbers in light of how many hospitals there are in the US, as well as how many patients each hospital sees. I also notice that since they are trying to promote their site, they probably selected the data that would best serve that purpose.
So where do I go from here? Evaluating each source they reference seems like a waste of time. I do not think it would be wrong to trust that they are not actively lying to me. But how do I move from here to an accurate picture of general doctor competence?
It's actually quite plausible that cheaper projects have a greater chance of falling into the rare category of successful projects, as the original 1979 study defined success - "used without extensive rework". It's also quite possible that project size isn't normally distributed.
What I seem to have trouble conveying is my intuition that the fit is too close to be true - that in general if you have a multinomial distribution with five categories, and you draw a small sample from that distribution, it is quite unlikely that your sample frequencies will come within 1% of the true probabilities.
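To make that intuition concrete, here is a quick simulation - with made-up probabilities and a made-up sample size, not the actual 1979 figures - estimating how rarely a small multinomial sample lands within one percentage point of the true probabilities in all five categories at once:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true probabilities for five categories (illustrative only).
p = np.array([0.45, 0.25, 0.15, 0.10, 0.05])
n = 50            # a small sample, as in the argument above
trials = 100_000  # number of simulated samples

samples = rng.multinomial(n, p, size=trials)
freqs = samples / n

# Fraction of trials in which every one of the five sample frequencies
# falls within one percentage point of its true probability.
close = np.all(np.abs(freqs - p) <= 0.01, axis=1).mean()
print(f"P(all five frequencies within 1%): {close:.4f}")
```

The estimated probability comes out small, which is exactly the point: a fit that tight is an unlikely event under honest sampling, even though each individual frequency is near its expectation.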
The chi-squared test, if I've understood it correctly, computes the converse probability - the probability that your sample contains frequencies that are this far removed or more from the true probabilities, given the assumption that it's drawn from a distribution with those probabilities. In the case that concerns me, the chi-squared statistic is obviously very small, so the p-value approaches unity.
What I'm saying - and it may be a crazy thing to say - is that it's precisely this small distance from the true probabilities that makes me suspicious.
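That suspicion can be made quantitative by looking at the lower tail of the chi-squared distribution - the probability of a fit this close or closer - rather than the usual upper tail. A sketch, using illustrative counts rather than the actual 1979 data:

```python
from scipy.stats import chisquare, chi2

# Hypothetical observed counts for a sample of 100, sitting implausibly
# close to the expected counts (illustrative numbers, not the 1979 data).
observed = [45, 25, 15, 10, 5]
expected = [45.2, 24.8, 15.1, 9.9, 5.0]

stat, p_upper = chisquare(observed, f_exp=expected)

# p_upper is the usual p-value, P(chi2 >= stat): near 1 for a very
# close fit, so it can't flag the fit as suspicious.
# The "too good to be true" question asks for the lower tail instead,
# P(chi2 <= stat): the probability of agreement this good or better.
p_lower = chi2.cdf(stat, df=len(observed) - 1)
print(f"statistic={stat:.5f}, upper tail={p_upper:.4f}, lower tail={p_lower:.6f}")
```

A tiny lower-tail probability is the same style of argument Fisher famously applied to Mendel's data: the standard test asks whether the fit is too bad, while the suspicion here is that it is too good.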
I see what you're getting at, and it is suspicious; I'm just saying that the probabilities you're trying to calculate for it aren't correct.
I'm also not sure what your alternate hypotheses are. There's no way that the 1979 data was fabricated to fit the 1995 percentages, is there? So any argument that casts doubt on the 1979 data being possible to begin with is going to penalize all possible alternate hypotheses. That's the problem with the normality assumption: assuming a normal distribution with any true mean makes the 1979 data unlikely, whether or not the percentages are suspiciously close.