Type 1 error

11/9/2022

What do we mean by “no difference” between two samples? Chance alone will almost certainly ensure that there is some difference between the sample means, for they are most unlikely to be identical. Consequently we set limits within which we shall regard the samples as not having any significant difference. If we set the limits at twice the standard error of the difference, and regard a mean difference outside this range as coming from another population, we shall on average be wrong about one time in 20 if the null hypothesis is in fact true. Rejecting a null hypothesis that is in fact true is a type 1 error.

If we do obtain a mean difference bigger than two standard errors we are faced with two choices: either an unusual event has happened, or the null hypothesis is incorrect. Imagine tossing a coin five times and getting the same face each time. This has nearly the same probability (6.3%) as obtaining a mean difference bigger than two standard errors when the null hypothesis is true. Do we regard it as a lucky event or suspect a biased coin? If we are unwilling to believe in unlucky events, we reject the null hypothesis, in this case that the coin is a fair one.

To find out whether the difference in blood pressure between printers and farmers could have arisen by chance, the general practitioner sets up the null hypothesis that there is no difference between them. The question is, how many multiples of its standard error does the difference in means represent? Since the difference in means is 9 mmHg and its standard error is 0.81 mmHg, the answer is 9/0.81 = 11.1. We usually denote the ratio of an estimate to its standard error by “z”, that is, z = 11.1. Reference to Table A (Appendix table A.pdf) shows that z is far beyond the figure of 3.291 standard deviations, representing a probability of 0.001 (or 1 in 1000). The probability of a difference of 11.1 standard errors or more occurring by chance is therefore exceedingly low, and correspondingly the null hypothesis that these two samples came from the same population of observations is exceedingly unlikely. This probability is known as the P value and may be written P < 0.001.

It is worth recapping this procedure, which is at the heart of statistical inference. Suppose that we have samples from two groups of subjects, and we wish to see if they could plausibly come from the same population. The first approach would be to calculate the difference between two statistics (such as the means of the two groups) and calculate the 95% confidence interval. If the two samples were from the same population we would expect the confidence interval to include zero 95% of the time, and so if the confidence interval excludes zero we suspect that they are from different populations. The other approach is to compute the probability of getting the observed value, or one that is more extreme, if the null hypothesis were correct. If this probability is less than a specified level (usually 5%) then the result is declared significant and the null hypothesis is rejected. These two approaches, estimation and hypothesis testing, are complementary. Imagine that the 95% confidence interval just captured the value zero: what would be the P value? A moment’s thought should convince one that it is 2.5% in each tail, that is, a one-sided P value of 2.5% and a two-sided P value of 5%.
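The figures quoted in this post can be checked with a short script. What follows is only a sketch, assuming the test statistic follows a standard Normal distribution under the null hypothesis. The helper name two_sided_p and the variable names are illustrative and do not come from the original text. Only the difference of 9 mmHg, the standard error of 0.81 mmHg and the cut-off of 3.291 are taken from the post.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard Normal test statistic z (assumed model)."""
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


diff = 9.0  # difference in mean blood pressure, mmHg (from the text)
se = 0.81   # standard error of the difference, mmHg (from the text)

# Hypothesis testing approach: how many standard errors is the difference?
z = diff / se
p = two_sided_p(z)

# Estimation approach: 95% confidence interval using the usual 1.96 multiplier
ci_low = diff - 1.96 * se
ci_high = diff + 1.96 * se

# Coin analogy: five tosses of a fair coin all showing the same face
# (all heads or all tails)
p_same_face = 2 * 0.5 ** 5

print(f"z = {z:.1f}")
print(f"two-sided P = {p:.2e}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f}) mmHg")
print(f"P(|Z| >= 3.291) = {two_sided_p(3.291):.4f}")
print(f"P(|Z| >= 1.96)  = {two_sided_p(1.96):.4f}")
print(f"P(five tosses, same face) = {p_same_face:.4f}")
```

The two approaches agree here: the confidence interval lies well above zero, and the P value is far below 0.001. A confidence interval that only just touched zero would correspond to z = 1.96 and a two-sided P value of about 5%.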
