I propose that this concept be called "unexpected surprise" rather than "strictly confused":
"Strictly confused" suggests logical incoherence.
"Unexpected surprise" can be motivated the following way: let s(d)=surprise(d∣H)=−logPr(d∣H) be how surprising data d is on hypothesis H. Then one is "strictly confused" if the observed s is larger than than one would expect assuming a H holds.
This terminology is nice because the average of s under H is the entropy of Pr(d∣H), i.e., the expected surprise. It also connects with Bayes, since log-likelihood = −surprise is the evidential support that d gives H.
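To make this concrete, here is a minimal numerical sketch of "unexpected surprise" (the binomial coin model, n = 100 tosses, and 70 observed heads are all my own illustrative choices, not from the original post):

```python
# Sketch: "unexpected surprise" for a binomial model, H: the coin is fair.
# All specifics (n = 100 tosses, 70 observed heads) are hypothetical.
from scipy.stats import binom

n, p = 100, 0.5
H = binom(n, p)                      # Pr(d | H), with d = number of heads

def surprise(k):
    """s(k) = -log Pr(k | H): how surprising k heads are on hypothesis H."""
    return -H.logpmf(k)

# The average of s under H is just the entropy of Pr(d | H).
expected_surprise = H.entropy()

k_obs = 70
print(surprise(k_obs))               # ~10.7 nats
print(expected_surprise)             # ~3.0 nats
# The observed surprise far exceeds the surprise H itself predicts:
# one is "unexpectedly surprised", i.e. "strictly confused".
```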
The section on "Distinction from frequentist p-values" is, I think, both technically incorrect and a bit uncharitable.
It's technically incorrect because the following isn't true:
The classical frequentist test for rejecting the null hypothesis involves considering the probability assigned to particular 'obvious'-seeming partitions of the data, and asking if we ended up inside a low-probability partition.
Actually, the classical frequentist test involves specifying an obvious-seeming measure of surprise t(d), and seeing whether t is higher than expected on H. This is even more arbitrary than the above.
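A bare-bones Monte Carlo rendering of that recipe might look like this (the particular choice of t, and every number here, is an assumption of mine for illustration, not the one canonical test):

```python
# Sketch of the classical recipe: pick a surprise measure t(d), then ask how
# often data generated under H is at least as "t-surprising" as what we saw.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 0.5                          # H: fair coin (illustrative)

def t(k):
    # One "obvious-seeming" choice; a different t gives a different test.
    return abs(k - n * p)

k_obs = 70
sims = rng.binomial(n, p, size=100_000)  # data sets drawn assuming H
p_value = np.mean(t(sims) >= t(k_obs))   # Pr(t(D) >= t(d_obs) | H)
print(p_value)                           # ~1e-4: t far higher than expected on H
```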
On the other hand, it's uncharitable because it's widely acknowledged that one should try to choose t to be sufficient, which is exactly the condition that the partition induced by t is "compatible" with Pr(d∣H) for different H, in the sense that Pr(H∣d) = Pr(H∣t(d)) for all the H under consideration.
Clearly s is sufficient in this sense. But there might be simpler functions of d that do the job too ("minimal sufficient statistics").
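As a toy check of the sufficiency condition Pr(H∣d) = Pr(H∣t(d)) (two hypotheses and a made-up coin-flip sequence, purely for illustration):

```python
# Toy check that t(d) = number of heads is sufficient for coin flips: the
# posterior over H from the full sequence d equals the posterior from t(d).
import numpy as np
from scipy.stats import binom

d = np.array([1, 1, 0, 1, 1, 1, 0, 1, 0, 1])  # hypothetical flip sequence
ps = np.array([0.5, 0.7])                      # two hypotheses, uniform prior

# Pr(H | d): likelihood of the exact sequence under each H, normalised.
lik_seq = np.array([(p ** d * (1 - p) ** (1 - d)).prod() for p in ps])
print(lik_seq / lik_seq.sum())

# Pr(H | t(d)): likelihood of the head count under each H, normalised.
k, n = d.sum(), len(d)
lik_t = binom.pmf(k, n, ps)
print(lik_t / lik_t.sum())
# The two posteriors agree: the binomial coefficient C(n, k) is the same
# for every H, so it cancels in the normalisation.
```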
Note that t being sufficient doesn't make it non-arbitrary, as it may not be a monotone function of s.
Finally, I think that this concept is clearly "extra-Bayesian", in the sense that it's about non-probabilistic ("Knightian") uncertainty over H, and one is considering probabilities attached to unobserved d (i.e., not conditioning on the observed d).
I don't think being "extra-Bayesian" in this sense is problematic, but I think it should be owned up to.
Actually, "unexpected surprise" reveals a nice connection between Bayesian and sampling-based uncertainty intervals:
To get a (HPD) credible interval, exclude those H that are relatively surprised by the observed d (or which are a priori surprising).
To get a (nice) confidence interval, exclude those H that are "unexpectedly surprised" by d.
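Here is a sketch of both constructions for a binomial parameter (the flat prior, the grid of hypotheses, and α = 0.05 are all my illustrative choices; neither recipe is claimed as the canonical one):

```python
# Sketch: a grid HPD credible set vs. an "unexpected surprise" confidence set
# for theta in Binom(n, theta), after observing k_obs heads. Illustrative only.
import numpy as np
from scipy.stats import binom

n, k_obs, alpha = 100, 70, 0.05
thetas = np.linspace(0.01, 0.99, 99)           # grid of hypotheses H_theta

# Credible set: with a flat prior, posterior weight tracks how surprised each
# H_theta is by d; keep the least-surprised thetas until mass 1 - alpha.
post = binom.pmf(k_obs, n, thetas)
post /= post.sum()
order = np.argsort(post)[::-1]
cred = thetas[np.sort(order[np.cumsum(post[order]) <= 1 - alpha])]

# Confidence set: keep H_theta unless it is "unexpectedly surprised" by d,
# i.e. unless Pr(s(D) >= s(d_obs) | H_theta) falls below alpha (taking t = s).
ks = np.arange(n + 1)
conf = []
for th in thetas:
    s = -binom.logpmf(ks, n, th)               # surprise of every possible d
    s_obs = -binom.logpmf(k_obs, n, th)
    if binom.pmf(ks, n, th)[s >= s_obs].sum() > alpha:
        conf.append(th)

print(cred.min(), cred.max())                  # roughly (0.61, 0.78)
print(min(conf), max(conf))                    # a very similar interval
```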