Question about application of Bayes

RolfAndreassen

I have successfully confused myself about probability again.

I am debugging an intermittent crash; it doesn't happen every time I run the program. After much confusion I believe I have traced the problem to a specific line (activating my debug logger, as it happens; irony...). I have tested my program with and without this line commented out. With the line active, I get two crashes in seven runs. Without the line, I get no crashes in ten runs. Intuitively this seems like evidence in favour of the hypothesis that the line is causing the crash. But I'm confused about how to set up the equations. Do I need a probability distribution over crash frequencies? That was the solution the last time I was confused over Bayes, but I don't understand what it means to say "The probability of having the line, given crash frequency f", which it seems I need to know to calculate a new probability distribution.

I'm going to go with my intuition and code on the assumption that the debug logger should be activated much later in the program to avoid a race condition, but I'd like to understand this math.

Well, the problem is that you have uncertainty over the probability of the code crashing with or without fixing the particular bug you're looking for. What you need, in order to apply Bayes, is:

A prior model 'pr(f)' of the frequency of the code crashing with the bug present, i.e. a distribution over crash frequencies. 1/(f(1-f)) (the improper Haldane prior) is an ignorance prior that might do the job.
I'd assume the probability of the code crashing with the bug fixed is 0, even if that's not strictly true (since there could be other bugs).
A prior probability 'b' of the bug being on that line of code. This could depend (for instance) on how you identified that particular line to comment out in the first place.

I'll call your evidence from running the program E1 (2 crashes out of 7 runs with the line active) and E2 (10 runs with no crashes with the line commented out).

P(bug on that line | E1, E2) = P(E1, E2 | bug on line) P(bug on that line) / P(E1, E2)

P(bug on that line) = b

P(E1, E2) = P(E1, E2 | bug on line) b + P(E1, E2 | bug elsewhere) (1 - b)

P(E1, E2 | bug on line) = P(E1)P(E2 | bug on line) (since the 2 crashes out of 7 are independent of the bug location)

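P(E1, E2 | bug elsewhere) = Integral over [0,1] of: P(E1|f) P(E2 | f, bug elsewhere) pr(f)df (if the bug is elsewhere, both sets of runs share the same unknown f, so they are only independent given f)

P(E2 | bug on line) = 1

Given a frequency of crashing 'f', P(E2 | f, bug elsewhere) = (1-f)^10

P(E1|f) = f^2 * (1-f)^5 (for one particular ordering of the runs; the binomial factor of 21 cancels in the final ratio)

So, then you need to integrate over all possible values of 'f': P(E1) = Integral over [0,1] of: P(E1|f)pr(f)df

P(E1, E2 | bug elsewhere) = Integral over [0,1] of: f^2 * (1-f)^15 pr(f)df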
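As a rough numerical sketch of the calculation above: the integrals have a closed form if pr(f) is taken to be uniform on [0,1] (an assumption for illustration, not the 1/(f(1-f)) prior suggested above) and b is set to a placeholder value of 1/2. Under "bug elsewhere" the sketch integrates both sets of runs against one shared f, i.e. f^2 * (1-f)^15. The function names and defaults are hypothetical.

```python
from fractions import Fraction
from math import factorial


def beta_integral(a: int, c: int) -> Fraction:
    """Integral over [0,1] of f^a * (1-f)^c df, which equals a! c! / (a+c+1)!."""
    return Fraction(factorial(a) * factorial(c), factorial(a + c + 1))


def posterior_bug_on_line(b: Fraction, crashes: int = 2, ok_with: int = 5,
                          ok_without: int = 10) -> Fraction:
    """P(bug on line | E1, E2), assuming a uniform prior over f (illustrative)."""
    # Bug on line: runs without the line never crash, so P(E2 | bug on line) = 1
    # and only E1 has to be integrated over f.
    like_on_line = beta_integral(crashes, ok_with)
    # Bug elsewhere: all 17 runs share the same f, so the ten clean runs
    # simply add ten more factors of (1 - f) inside the integral.
    like_elsewhere = beta_integral(crashes, ok_with + ok_without)
    numerator = like_on_line * b
    return numerator / (numerator + like_elsewhere * (1 - b))


b = Fraction(1, 2)  # placeholder prior that the bug is on that line
post = posterior_bug_on_line(b)
print(post, float(post))
```

With these assumptions the posterior comes out to 102/109, roughly 0.94; a different b or a different prior over f will move that number.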

That's everything you need; the rest is just picking those priors and integrating back up the line. Of course the results are only as good as the priors.

A much easier solution is: "The chance of a crash appears to be about 2/7. If the line made no difference, the chance of getting 10 non-crashes in a row would be (5/7)^10 ~= 3.5%." Note that this is not the same as the above; it's an approximation, but it's probably going to be just as good as doing it the hard way.
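The back-of-the-envelope version is a one-liner. This sketch just plugs in the observed crash rate as if it were the true frequency, which is the approximation being made; the variable names are illustrative.

```python
crashes, runs_with_line = 2, 7
runs_without_line = 10

# Point estimate of the crash frequency from the runs with the line active.
f_hat = crashes / runs_with_line

# Chance of seeing no crashes in ten runs if the crash frequency were still f_hat.
p_no_crash = (1 - f_hat) ** runs_without_line
print(f"{p_no_crash:.3f}")
```

This prints 0.035, matching the 3.5% figure.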

Incidentally, you need to be aware that, particularly with intermittent bugs, just because commenting out a line stops the crash (even when you're 100% sure of the correlation), that doesn't mean the line itself is the problem. Bugs can be absolutely pathological. For example, if the problem is, say, freeing the same memory twice, then any line that triggers a lot of memory allocations will increase the frequency of crashes even if the problem is actually earlier in the code. Also, if the bug is overrunning the end of an array, taking out any line can have a chaotic effect on the optimiser, moving the relative locations of things in memory around and causing the bug to disappear without fixing it (only for it to reappear later). On a simpler level, taking a line out might change the execution path, avoiding the bug without fixing it. There seems to be no end to the ways in which impossible-seeming things can happen in computer code.

Bugs can be absolutely pathological.

This is very true. I simplified the information a bit because I was posting about the math as a matter of intellectual curiosity, not to get help debugging. I have a model of what was causing the crash that I find reasonably convincing, which I outlined in my response to jimrandomh, below. So, while it's a real-world problem, for purposes of math we can assume that the effect of commenting out the line is an indication of a point bug, as it were. I found the Bayes confusing anyway, so there's no need to complexify fu...