Standard Error of the Mean
by Richard Reid
When we draw a random sample from a population, it is usually to infer something about the population. Typically, we compute a statistic from our sample (such as a sample mean) and use it to estimate a population parameter (such as a population mean). For example, let's say we run a 1 billion hand simulation and find that the initial bet advantage (sample mean) is +1.5%. From this we usually infer that the initial bet advantage of the entire population (population mean) of blackjack hands over all deck subsets is approximately 1.5%.
How accurate is the estimate of the initial bet advantage (population parameter) that we obtain in this way? We know that even after 1 billion hands our measurement is approximate, so we must give some indication of its accuracy. This is where the Standard Error of the Mean enters the picture.
Standard Error of the Mean:
If we were to draw all possible samples of size "n" from a given population, and for each sample calculate the mean, and then make a frequency distribution of the sample means, then the central limit theorem states that:
- The mean of the distribution is the same as the mean of the population from which the samples are drawn
- The standard deviation of the distribution of the sample means is equal to the standard deviation of the population divided by the square root of the sample size "n". This quantity is usually called the Standard Error (SE). So,
SE = SD(of the population)/sqrt(n)
For a large sample, the distribution of the sample means is approximately a normal distribution, even if the population from which the samples were drawn is not a normal distribution.
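These properties can be checked numerically. The following is a minimal sketch, not part of the original article. It assumes an exponential population with mean 1 and standard deviation 1, chosen only because it is clearly not normal, and it draws many random samples rather than all possible samples. The function name sample_mean_spread and the sample counts are illustrative.

```python
import math
import random
import statistics

def sample_mean_spread(n, num_samples, seed=0):
    """Draw num_samples random samples of size n from an exponential
    population (mean 1, SD 1) and return the mean and the standard
    deviation of the resulting sample means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]
    return statistics.fmean(means), statistics.stdev(means)

n = 100
mean_of_means, sd_of_means = sample_mean_spread(n, num_samples=5000)
predicted_se = 1.0 / math.sqrt(n)  # SD(of the population)/sqrt(n)

print(f"mean of sample means: {mean_of_means:.4f} (population mean is 1)")
print(f"SD of sample means:   {sd_of_means:.4f}")
print(f"predicted SE:         {predicted_se:.4f}")
```

With these settings the mean of the sample means should come out close to 1, and the standard deviation of the sample means should come out close to the predicted SE of 0.1, even though the population itself is strongly skewed.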
Predicting the population mean from a single sample:
If the standard deviation of the distribution of sample means, SE, is small, most of the sample means will be near the population mean. Thus a particular sample mean has a good chance of being close to the population mean, and will be a good estimator of the population mean. Conversely, a large SE means that a given sample mean may be a poor estimator of the population mean.
Since the frequency distribution of the sample means is approximately normal, the chance of a single sample mean lying within one standard error of the population mean is approximately 68%. Equivalently, the population mean has approximately a 68% chance of lying within one standard error of a single randomly chosen sample mean. Thus, there is approximately a 68% chance that the true population mean falls within the interval:
"Sample Mean" +/- One "Standard Error"
In this way, we are able to estimate the population mean from a single sample. Not only that, but we are able to give a range of values within which the population mean is likely to lie, together with the probability that it falls within that range.
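The 68% figure comes from the normal distribution: it is the probability that a normally distributed value falls within one standard deviation of its mean. As a small sketch (the helper name coverage is illustrative and not from the article), the same calculation gives the probability for wider intervals such as "Sample Mean" +/- two or three Standard Errors, assuming the sample means really are normally distributed:

```python
from statistics import NormalDist

def coverage(k):
    """Probability that a normal variable lies within k standard
    deviations of its mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SE: {coverage(k):.4f}")
```

This prints roughly 0.6827, 0.9545 and 0.9973, so widening the interval to two standard errors raises the chance of capturing the population mean to about 95%.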
There is just one difficulty: to compute the standard error SE as described above, we must divide the population standard deviation by the square root of the sample size. But the standard deviation of the population is not usually known, so it is common practice to use the sample standard deviation instead. Thus the standard error of the mean is approximated by the following formula (valid if the sample is large):
SE = SD(of the sample)/sqrt(n)
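The formula above can be sketched in a few lines. This is an illustration only: the data list is hypothetical and far too small for the large-sample approximation to be trusted, and the per-hand standard deviation of 1.1 betting units used at the end is an assumed figure, not one taken from this article.

```python
import math
import statistics

def standard_error(sample):
    """Estimate the standard error of the mean as SD(of the sample)/sqrt(n)."""
    n = len(sample)
    return statistics.stdev(sample) / math.sqrt(n)  # stdev uses the n-1 divisor

# Hypothetical data, chosen only so the arithmetic is easy to follow.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.fmean(sample)
se = standard_error(sample)
print(f"mean = {mean:.3f}, SE = {se:.3f}")
print(f"68% interval: {mean - se:.3f} to {mean + se:.3f}")

# Assumed per-hand SD of 1.1 betting units (hypothetical, not from the
# article) for a simulation of 1 billion hands.
sim_se = 1.1 / math.sqrt(1_000_000_000)
print(f"SE for the assumed simulation: {sim_se:.6%}")
```

Under that assumed standard deviation, the standard error after 1 billion hands would be a few thousandths of one percent, which gives a sense of how tightly a simulation of that size can pin down a figure such as the initial bet advantage.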