Unless you are in statistics or otherwise work with numbers on a daily basis, the words standard deviation and variance are probably unfamiliar to you. For those who work in statistics, these words carry a very significant weight. While learning these concepts, you will find that they are different, yet they are also connected. In fact, variance is very important to determining the standard deviation. Before you can determine the standard deviation of any number set, you must first know how to find the variance.
Simply put, variance is the average of the squared differences between each number in the data set and the mean. Most people will remember that the mean is the average of a group of numbers. To find the variance, first calculate the mean, then subtract the mean from each number and square the result. The differences are squared so that the negatives do not cancel out the positives. You then find the average of those squared differences. That average is the variance, and it is the number you need in order to find the standard deviation.
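The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the data set is hypothetical, chosen only so the arithmetic comes out cleanly, and the sum of squared differences is divided by the full count of values, as the description above does.

```python
# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean is the sum of the values divided by how many there are.
mean = sum(data) / len(data)

# Step 2: subtract the mean from each value and square the result,
# so negative and positive differences cannot cancel each other out.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: the variance is the average of those squared differences.
variance = sum(squared_diffs) / len(data)

print("mean:", mean)
print("squared differences:", squared_diffs)
print("variance:", variance)
```

For this data set the mean is 5, the squared differences add up to 32, and dividing by the 8 values gives a variance of 4.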
Once you have the variance, you can work out the standard deviation. Standard deviation shows how far the data typically fall from the mean. In other words, it measures how widely the data spread around the average, in the same units as the data themselves. You can calculate the standard deviation by taking the square root of the variance. In formulas, the standard deviation of a population is denoted by the Greek letter sigma (σ), while the standard deviation of a sample is usually written as s. However, the formula will vary depending upon whether you are calculating numbers for a population or for a sample.
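As a rough sketch of the square-root step, the Python snippet below computes the standard deviation by hand and compares it with `statistics.pstdev` from the standard library. The data set is hypothetical, and the snippet assumes the values are a whole population, so it divides by the number of values.

```python
import math
import statistics

# Hypothetical data set, treated here as an entire population.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

# Variance: the average of the squared differences from the mean.
variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

# Cross-check against the standard library's population standard deviation.
library_std_dev = statistics.pstdev(data)

print("variance:", variance)
print("standard deviation:", std_dev)
print("statistics.pstdev:", library_std_dev)
```

Here the variance is 4 and the standard deviation is 2, so a typical value in this data set sits about 2 units away from the mean of 5.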
Sample vs. Population
The variance and standard deviation will depend upon whether your data are a sample or a population. In the formula, N is the number of data values. More often than not, you are working with only a small sample of a larger group. For instance, if you are looking at 5 students in a school of 200, those 5 students are your sample and N = 5. In this case, you divide the sum of the squared differences by N minus 1, rather than by N, when calculating the variance.
However, if you are only interested in these 5 students, they are the entire population. In this case, you simply divide by N when calculating the variance, because there is no larger group to account for. In other words, you make a “correction” only when your data are a sample rather than a population: dividing by N minus 1 gives a slightly larger variance, which compensates for the fact that a sample tends to understate the spread of the full population. So before you calculate, consider whether your data set is a sample or an entire population.
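To make the difference concrete, the following Python sketch computes both versions for 5 students. The test scores are hypothetical, invented only for this illustration; `statistics.pvariance` and `statistics.variance` are the standard library's population and sample variance functions, used here as a cross-check.

```python
import math
import statistics

# Hypothetical test scores for 5 students (invented for illustration).
scores = [70, 80, 85, 90, 75]

n = len(scores)
mean = sum(scores) / n
sum_squared_diffs = sum((x - mean) ** 2 for x in scores)

# If these 5 students are the whole population of interest, divide by N.
population_variance = sum_squared_diffs / n

# If they are a sample from the school of 200, divide by N - 1.
sample_variance = sum_squared_diffs / (n - 1)

print("population variance:", population_variance)
print("sample variance:", sample_variance)
print("population standard deviation:", math.sqrt(population_variance))
print("sample standard deviation:", math.sqrt(sample_variance))

# The standard library offers both forms directly.
print("statistics.pvariance:", statistics.pvariance(scores))
print("statistics.variance:", statistics.variance(scores))
```

With these scores the sum of squared differences is 250, so the population variance is 250 / 5 = 50 and the sample variance is 250 / 4 = 62.5. The sample figure is always the larger of the two, and the gap shrinks as N grows.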
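Another point to remember when you are looking at standard deviation and variance is that they describe how spread out the data are, which matters when you ask how closely a data set follows a normal distribution. Many data sets roughly follow a bell curve, which means that most values sit close to the mean and there are few, if any, extreme values. When the data are approximately normal, about 68% of the values lie within one standard deviation of the mean and about 95% lie within two, so you can make reasonably accurate inferences from the data.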