Strength of Association
The Kruskal-Wallis test is the more general form, and the obvious mathematical extension, of the non-parametric Mann-Whitney test, and it has the same problems of interpretation. Its test statistic has a different distribution from the other methods: when the null hypothesis is true, the test statistic follows (approximately) the Chi squared distribution, which is used mainly for the analysis of categorical data.
X² = (observed difference in percentages / standard error of difference)²
The interpretation of X² is as follows: the larger the value of X², the smaller the probability P and hence the stronger the evidence that the null hypothesis is untrue.
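As a rough illustration of the formula above, the sketch below computes the statistic for two groups and converts it to P. The counts are hypothetical, the function names are illustrative, and the standard error is assumed to be the pooled one calculated under the null hypothesis. With one degree of freedom, the upper-tail probability can be obtained from the complementary error function.

```python
import math


def chi_squared_two_groups(events_a, n_a, events_b, n_b):
    """(difference in proportions / standard error of difference) squared.

    Assumption: the standard error uses the pooled proportion, i.e. it is
    the standard error under the null hypothesis of no difference.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # All patients (or none) had the event: no observable difference.
        return 0.0
    return ((p_a - p_b) / se) ** 2


def p_value_1df(x2):
    """Upper-tail area of the Chi squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x2 / 2))


# Hypothetical trial: 15 of 100 events on one treatment, 30 of 100 on the other.
x2 = chi_squared_two_groups(15, 100, 30, 100)
p = p_value_1df(x2)
print(f"test statistic = {x2:.2f}, P = {p:.3f}")
```

Working in proportions rather than percentages makes no difference here, since the factor of 100 cancels in the ratio.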
Any variation among the groups will increase the test statistic, so we are concerned only with the upper tail of the Chi squared distribution. The idea of one-sided and two-sided tests does not apply with three or more groups. One important reminder on interpretation: the size of X² (or P) does not indicate the strength of the association, but rather the strength of the evidence against the null hypothesis of no association.
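To make the point about the upper tail concrete, here is a minimal sketch of the Kruskal-Wallis statistic for three groups. The data are hypothetical, the function names are illustrative, and no correction for ties is applied. With three groups there are two degrees of freedom, for which the upper-tail area of the Chi squared distribution is simply exp(-H/2). The Chi squared approximation is rough for samples this small, so the P value is indicative only.

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from the rank sums of each group (no tie correction)."""
    # Pool all observations, remembering which group each came from.
    pooled = sorted(
        (value, index) for index, values in enumerate(groups) for value in values
    )
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)

    i = 0
    while i < n_total:
        # Find the run of equal values and give each the average (mid) rank.
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j
        for _, index in pooled[i:j]:
            rank_sums[index] += midrank
        i = j

    weighted = sum(r * r / len(values) for r, values in zip(rank_sums, groups))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


def p_value_2df(h):
    """Upper-tail area of the Chi squared distribution with 2 degrees of freedom."""
    return math.exp(-h / 2)


# Hypothetical measurements in three treatment groups.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
h_reordered = kruskal_wallis_h(list(reversed(groups)))
print(f"H = {h:.2f}, P = {p_value_2df(h):.3f}")
```

Listing the groups in a different order leaves H unchanged, which is why the statistic carries no direction and only large values count against the null hypothesis.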
Suppose the null hypothesis of no treatment difference is true, and consider the hypothetical situation where one repeats the whole clinical trial over and over again with different patients each time. Then on average 5% of such repeat trials would produce a treatment difference large enough to make X² > 3.84 and hence P < 0.05. Note that one common pitfall is to misinterpret P as being the probability that the null hypothesis is true.
Pocock S., Clinical Trials, 2003
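The repeated-trials argument can be checked with a small simulation. This is only a sketch under stated assumptions: two arms of 200 patients each, the same true event rate of 30% in both arms (so the null hypothesis holds), and a pooled standard error. All of these numbers and names are hypothetical choices, not taken from the book.

```python
import math
import random


def chi_squared_two_groups(events_a, n_a, events_b, n_b):
    """(difference in proportions / pooled standard error) squared."""
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return ((events_a / n_a - events_b / n_b) / se) ** 2


def fraction_significant(n_trials, n_per_arm, true_rate, seed=0):
    """Share of simulated null trials whose statistic exceeds 3.84."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        # Both arms share the same true event rate: no real treatment difference.
        events_a = sum(rng.random() < true_rate for _ in range(n_per_arm))
        events_b = sum(rng.random() < true_rate for _ in range(n_per_arm))
        if chi_squared_two_groups(events_a, n_per_arm, events_b, n_per_arm) > 3.84:
            hits += 1
    return hits / n_trials


fraction = fraction_significant(n_trials=2000, n_per_arm=200, true_rate=0.3)
print(f"fraction of null trials with P < 0.05: {fraction:.3f}")
```

The fraction should come out close to 0.05, with some scatter from the finite number of simulated trials. Every one of those "significant" results arises even though the null hypothesis is true by construction, which is why P cannot be read as the probability that the null hypothesis is true.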
Probability
All probabilities are conditional and so, if the situation changes, then probabilities may change.
Weight of evidence
The log of the likelihood ratio is termed the ‘weight of evidence’. For a positive HIV test, for example, the likelihood ratio is the sensitivity of the test divided by one minus its specificity. Alan Turing used the term when applying statistical techniques to break the Enigma codes at Bletchley Park during WWII.
Myles J., Abrams K., Spiegelhalter D., Bayesian Approaches to Clinical Trials and Health Care Evaluation, 2004
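A minimal sketch of the weight of evidence for a positive diagnostic test follows. It assumes the usual definition of the likelihood ratio for a positive result, sensitivity / (1 - specificity). The test characteristics and the prior probability are hypothetical, the function names are illustrative, and base-10 logarithms are an arbitrary choice. It also shows why the log scale is convenient: weights of evidence add to the prior log odds, which ties in with the point that all probabilities are conditional.

```python
import math


def likelihood_ratio_positive(sensitivity, specificity):
    """P(positive | disease) / P(positive | no disease)."""
    return sensitivity / (1 - specificity)


def weight_of_evidence(sensitivity, specificity):
    """Log (base 10, by assumption) of the likelihood ratio."""
    return math.log10(likelihood_ratio_positive(sensitivity, specificity))


def posterior_probability(prior_probability, sensitivity, specificity):
    """Update a prior probability after a positive test, on the log-odds scale."""
    prior_log_odds = math.log10(prior_probability / (1 - prior_probability))
    posterior_log_odds = prior_log_odds + weight_of_evidence(sensitivity, specificity)
    posterior_odds = 10 ** posterior_log_odds
    return posterior_odds / (1 + posterior_odds)


# Hypothetical test: sensitivity 0.99, specificity 0.98, prior prevalence 0.1%.
woe = weight_of_evidence(0.99, 0.98)
posterior = posterior_probability(0.001, 0.99, 0.98)
print(f"weight of evidence = {woe:.2f}, posterior probability = {posterior:.3f}")
```

Changing the prior changes the posterior even though the weight of evidence from the test stays the same.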