Confidence Intervals

by Richard Reid


Often, one is concerned with trying to make an "educated guess" about how accurately a sample mean approximates the population mean. That is to say, we wish to make an inference regarding the population mean having only our sample data with which to do so.

Since the means of large samples are approximately normally distributed (a consequence of the central limit theorem), we can apply what we know about the areas under the normal curve to sample means. The normal distribution has been studied in great depth, and there are tables that associate a distance from the mean, measured in standard deviations, with an area under the normal curve.

95% Confidence Interval:
Suppose we wish to determine the interval within which 95% of the sample means lie. One of the important properties of a normal distribution of means is that 2.5% of them lie more than 1.96 Standard Errors of the Mean (SEM) above the center of the distribution and 2.5% of them lie more than 1.96 SEM below it.

Therefore, we can say that 95% of the sample means in this distribution lie between -1.96 and 1.96 SEM. This percentage can be considered a probability. In other words, if we were to select one sample from this distribution, there is a probability of 0.025 that its mean lies below -1.96 SEM and 0.025 that it lies above 1.96 SEM. We can also say that the probability is 0.95 that the sample mean is within the "interval" bounded by -1.96 and 1.96 SEM.
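The tail areas quoted above can be checked without a printed table. The following is a minimal sketch using Python's standard library; the variable names are illustrative, and it assumes a standard normal curve so that distances are measured directly in SEM.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1,
# so a distance of 1.96 means 1.96 SEM.
std_normal = NormalDist(mu=0.0, sigma=1.0)

lower_tail = std_normal.cdf(-1.96)                      # area below -1.96 SEM
upper_tail = 1.0 - std_normal.cdf(1.96)                 # area above +1.96 SEM
central = std_normal.cdf(1.96) - std_normal.cdf(-1.96)  # area in between

print(f"below -1.96 SEM: {lower_tail:.4f}")
print(f"above +1.96 SEM: {upper_tail:.4f}")
print(f"between:         {central:.4f}")
```

Each tail should come out very close to 0.025 and the central area very close to 0.95.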

Example:
Suppose we have run a sim and the results indicate that the Initial Bet Advantage (IBA), or mean, is 1.5% (0.015) with a St. Err (SEM) of 0.02. To find the 95% confidence interval, we must determine the values that lie 1.96 SEM below and 1.96 SEM above the IBA.

For -1.96 SEM, this value lies at 1.96 * 0.02 = 0.0392 below the IBA, which is 0.015 - 0.0392 = -0.0242.

For 1.96 SEM, this value lies at 1.96 * 0.02 = 0.0392 above the IBA, which is 0.015 + 0.0392 = 0.0542.

Now we can say that there is a probability of 0.95 that the real IBA (not the same as the sim's IBA) will be between -0.0242 and +0.0542. It is also important to note that there is a probability of 0.05 that the real IBA will have a value outside (either above or below) this interval.
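The arithmetic in this example can be reproduced in a few lines of Python. This is only a sketch: the function name confidence_interval and its parameters are hypothetical, not part of any sim software, and it assumes that the IBA and the SEM are expressed in the same units (0.015 and 0.02).

```python
from statistics import NormalDist

def confidence_interval(mean, sem, level=0.95):
    """Return (low, high) for a normal-theory confidence interval.

    Hypothetical helper for illustration; mean and sem must share units.
    """
    tail = (1.0 - level) / 2.0            # area left in each tail
    z = NormalDist().inv_cdf(1.0 - tail)  # about 1.96 when level is 0.95
    margin = z * sem
    return mean - margin, mean + margin

# Values from the example: IBA of 1.5% and SEM of 0.02.
low, high = confidence_interval(mean=0.015, sem=0.02, level=0.95)
print(f"95% confidence interval: {low:+.4f} to {high:+.4f}")
```

To four decimal places the result should agree with the hand calculation, -0.0242 to +0.0542. The multiplier is looked up rather than hard-coded so that the same helper works for other confidence levels.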

What we have done is establish a range of values for estimating the real IBA. In other words, if we make an interval estimate using the sim data, we can determine the degree of confidence that our interval contains the real IBA. This is what is known as establishing a "confidence interval."

Perhaps the following diagram will help to clarify things.



99% Confidence Interval:
If we wish to determine an interval that would give us more confidence than P = 0.95 in our statement about the IBA, we could determine the 99% confidence interval (or any other interval for that matter).

The 99% confidence interval is established in a manner similar to that used to determine the 95% confidence interval. From the properties of the normal distribution, we know that 0.5% of the sample means (IBAs) will lie below -2.58 SEM and that 0.5% of them will lie above 2.58 SEM.

Example:
If we want to determine the 99% confidence interval of a sim with an IBA of 1% and a St. Err (SEM) of 0.04, we do so as follows.

For -2.58 SEM, this value lies at 2.58 * 0.04 = 0.1032 below the IBA of 0.01, which is 0.01 - 0.1032 = -0.0932.

For 2.58 SEM, this value lies at 2.58 * 0.04 = 0.1032 above the IBA of 0.01, which is 0.01 + 0.1032 = 0.1132.

In this case, there is a probability of 0.99 that the real IBA will be between -0.0932 and +0.1132. And again as a reminder, it is important to note that there is a probability of 0.01 that the real IBA will have a value outside (either above or below) this interval.
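As a check on the 99% figures, the sketch below computes the interval twice: once with the rounded table multiplier of 2.58 used here, and once with the multiplier obtained from the normal distribution itself. The variable names are illustrative, and the IBA and SEM are assumed to be in the same units (0.01 and 0.04).

```python
from statistics import NormalDist

iba = 0.01   # sim's IBA of 1%, written as a fraction
sem = 0.04   # standard error, same units as the IBA

# A 99% interval leaves 0.5% in each tail, so the multiplier is the
# 99.5th percentile of the standard normal curve.
z_exact = NormalDist().inv_cdf(0.995)   # roughly 2.576
z_table = 2.58                          # rounded value used in the text

intervals = {}
for label, z in (("table", z_table), ("exact", z_exact)):
    intervals[label] = (iba - z * sem, iba + z * sem)
    low, high = intervals[label]
    print(f"{label} multiplier {z:.4f}: {low:+.4f} to {high:+.4f}")
```

The two intervals differ only in the fourth decimal place, so rounding the multiplier to 2.58 makes little practical difference.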

Again, I'm hoping that the following diagram will help to clarify things.



It should be noted that in computing the confidence interval for any given sim, we are attempting to determine the probability that it is one of those intervals that encompasses the real IBA (remember that the sim's IBA is only an approximation of the real IBA). Thus, the 99% confidence interval we determined in the example should be interpreted as having a probability of 0.99 of being an interval that encompasses the actual IBA.
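This interpretation can be illustrated with a small experiment: pretend the real IBA is known, run many imaginary sims, build an interval around each sim's IBA, and count how often the interval encompasses the real value. The sketch below makes several assumptions that are not in the text: the real IBA is set to 0.01, each sim's IBA is drawn from a normal distribution with a standard deviation equal to the SEM of 0.04, and the function name coverage is hypothetical.

```python
import random

def coverage(true_iba, sem, z, trials, seed=0):
    """Fraction of simulated intervals that contain the true IBA.

    Hypothetical helper; each trial stands in for one complete sim.
    """
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sim_iba = rng.gauss(true_iba, sem)   # one sim's estimate of the IBA
        low, high = sim_iba - z * sem, sim_iba + z * sem
        if low <= true_iba <= high:
            hits += 1
    return hits / trials

rate = coverage(true_iba=0.01, sem=0.04, z=2.58, trials=10_000)
print(f"share of intervals containing the real IBA: {rate:.3f}")
```

The printed share should land close to 0.99. Each individual interval either contains the real IBA or it does not; the 0.99 describes how often intervals built this way succeed.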


