Central Tendency and Dispersion  
Prior to reading this SPSS lesson, you should have already read the lesson on central tendency and dispersion. During this lesson, you will apply the concepts of central tendency and dispersion using SPSS. You will learn how to interpret various measures of central tendency (including mean, median, and mode) and variability (including standard deviation and variance) on SPSS output. You will also learn the settings used to generate central tendency and variability output in SPSS. Upon completing this lesson, you should be able to:
To practice generating frequency output using SPSS, download one of the following percentiles data files:
(Note: If the linked file does not begin downloading when you click the link, right-click on the link and select "save target as" or "save link as" from the menu.)
Interpreting Measures of Central Tendency on SPSS Output
Recall that the most commonly used measures of central tendency are the mean, the median, and the mode. While you can calculate these values by hand or using MS Excel, you can also use SPSS to perform the calculations.
Central Tendency
Below is a set of data for which we want to calculate the mean, median, and mode.
To calculate the mean, you would add up all of the values and divide the total by the number of values in the data set. In this example, the mean is 29.66. To calculate the median, you would rank the values in the data set in increasing order and then determine the middle value. In this example, the median is 28.5. To calculate the mode, you would determine the frequency at which each value occurs and then determine which value has the highest frequency. In this data set, there are two values that have the highest frequencies (f = 4). These values are 27 and 33. Using SPSS to determine the measures of central tendency of a variable named score, we could produce one (or all) of the following sets of output (depending on how much information we needed).
Each of the above sets of output was generated using a different function in SPSS (to be described later in this lesson). The results are the same, though, because the outputs were generated from the same data file.
In the first output (labeled frequencies output), all of the measures of central tendency presented in the textbook are included. Checking our manual calculations against the SPSS output, our calculated measures of central tendency match the values from SPSS. In the second output (labeled descriptives output), the only measure of central tendency included is the mean. The other descriptive statistics provided in the output are measures of variability (addressed in the variability section of this lesson). In the third output (labeled explore output), there are additional measures of central tendency (along with several measures of variability). Since the same data were used to generate all of the output (and the methods used to calculate the measures were the same), the values are the same. A check of the median in the output shows that our manual calculation was correct. Our manual review of the frequencies revealed two modes (27 and 33). When there is more than one mode, the SPSS output shows the lowest mode along with a note that there are multiple modes.
Mean, Median, and Mode
Recall that a histogram and frequency curve graphically display the relationship between the mean, median, and mode. By looking at a frequency curve, you can determine whether a distribution is normally distributed or is skewed. You can also make this determination by reading the SPSS output. In the sample output shown below, the mean is higher than the median and mode. What is the shape of this distribution? Since the mean is the largest value, the median is the next largest value, and the mode is the smallest value, the distribution has a positive skew. How else could you determine that this distribution was positively skewed?
If you were viewing the explore output, you could determine whether the distribution was skewed by checking the calculated skew value, which is labeled "skewness". In this example, the test scores have a positive skew, indicating that most of the scores are at the low end of the distribution with relatively fewer scores at the high end of the distribution.
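As a rough illustration of the central tendency ideas above, the following Python sketch computes the mean, median, and mode(s) of a small set of scores and checks the direction of the skew. The scores are hypothetical (they are not the lesson's data file), and the skewness function uses a common adjusted sample formula; whether it reproduces the value SPSS labels "skewness" exactly is an assumption.

```python
from statistics import mean, median, multimode, stdev

# Hypothetical test scores (not the lesson's data file), chosen to have two modes.
scores = [21, 24, 27, 27, 27, 28, 29, 33, 33, 33, 36, 52]

score_mean = mean(scores)                # sum of the values / number of values
score_median = median(scores)            # middle value of the ranked scores
score_modes = sorted(multimode(scores))  # every value tied for the highest frequency
reported_mode = score_modes[0]           # the lowest mode, which the lesson says SPSS displays


def skewness(values):
    """Adjusted sample skewness (assumed, not confirmed, to match SPSS)."""
    n = len(values)
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


print("mean:", round(score_mean, 2))
print("median:", score_median)
print("modes:", score_modes, "-> lowest mode:", reported_mode)
print("mean > median (suggests positive skew):", score_mean > score_median)
print("skewness is positive:", skewness(scores) > 0)
```

In this made-up sample one unusually high score pulls the mean above the median, so both checks point to a positive skew.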
Interpreting Measures of Dispersion on SPSS Output
Recall that the mean, median, and mode allow you to determine the shape of a data distribution. Another important aspect of the data distribution is how the data are spread (or how they vary). To examine the variability of a data distribution, you examine the measures of dispersion. Recall from the lesson on variability that you have learned about these four measures of dispersion: range, interquartile range, variance, and standard deviation. Just as central tendency information can be included in a variety of forms of SPSS output, measures of dispersion can also be customized. Below is sample output showing various measures of dispersion generated from the frequencies command, descriptives command, and explore command. The display differs across the outputs, but the results are the same because each analysis was conducted using the same data file.
We will start with the range. To calculate the range of a data distribution, you subtract the lowest value from the highest value. In the data set shown below (which was used to generate the above outputs), the calculated range would be 55 - 10 = 45.
Using the output to check our calculation, we can confirm that our calculated range was correct by reading the value labeled range. If you wanted to triple-check the range calculation, you could even subtract the lowest value from the highest value of the desired variable. On the output these values are labeled minimum and maximum, respectively. Given an SPSS output, then, you could determine or calculate the range of the underlying data set.
To further explore the spread of the data, you could examine the interquartile range (IQR). If you calculated the IQR manually, you could check your calculations by reading the IQR from the explore output.
The IQR and range give you information about how "spread out" the data are. The variance and standard deviation provide information about how the data vary relative to the mean of the data. If you were given two SPSS outputs that displayed two different values of variance and standard deviation, you could determine which data set had data that were more clustered around the mean. For example, compare the outputs below.
Which of the two examples indicates the greater scatter of data around the mean? Recall from the lesson on variability that if the data are widely scattered about the mean, the variance and the standard deviation will be relatively large. Because the variance and standard deviation of Example A are larger than those of Example B, the data for Example A would be more widely scattered about the mean. Likewise, the data for Example B would be more tightly clustered about the mean.
Other Descriptive Statistics
If you were given output but no data set, could you still determine how many values were in the sample (data set)? The answer is yes, using either the frequencies output or the descriptives output. The number of values in the sample is represented as N in the output. In this example there were 50 values in the data set. Sometimes you may have an incomplete data set (e.g., where some of the values of the variables being analyzed are missing). SPSS alerts you to the number of missing values in the data set (for the variable being analyzed) in the frequencies output and the descriptives output.
Another statistic that you have already encountered that is included on the explore output is the skewness. Recall that skewness provides information about the shape (symmetry) of a distribution. If the skewness value on the output is positive, the distribution has a positive skew. Similarly, if the skewness value is negative, the distribution has a negative skew. Based on what you learned about the shape of distributions, how would you interpret a positive or negative skew value on an SPSS output? If the skew value is positive, the variable being examined has mostly low values with relatively few high values. Conversely, if the skew value is negative, the variable has mostly high values with relatively few low values.
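The dispersion measures discussed in this section can also be sketched in a few lines of Python. The two samples below are hypothetical stand-ins for "Example A" and "Example B" (the lesson's actual outputs are not reproduced here), and the quartiles come from Python's default "exclusive" method, which may differ slightly from the rule SPSS applies.

```python
from statistics import quantiles, stdev, variance

# Hypothetical samples (not the lesson's data); both have a mean of 30.
example_a = [10, 18, 25, 30, 35, 42, 50]
example_b = [26, 28, 29, 30, 31, 32, 34]


def describe(values):
    """Return N plus the four dispersion measures named in the lesson."""
    # Quartiles via Python's default method; SPSS may use a different rule.
    q1, _, q3 = quantiles(values, n=4)
    return {
        "n": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),  # highest value minus lowest value
        "iqr": q3 - q1,                      # spread of the middle half of the data
        "variance": variance(values),        # sample variance (n - 1 denominator)
        "std_dev": stdev(values),            # square root of the sample variance
    }


a = describe(example_a)
b = describe(example_b)
for label, stats in (("Example A", a), ("Example B", b)):
    print(label, stats)

# The sample with the larger variance is more widely scattered about its mean.
print("A more scattered than B:", a["variance"] > b["variance"])
```

Both samples share the same mean, so the larger variance and standard deviation of the first sample reflect only its wider scatter, which is the comparison the lesson asks you to make from SPSS output.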
The other statistics on the sample output (e.g., confidence interval, trimmed mean, and kurtosis) will be covered later in this course or in the intermediate statistics course. For the present course, you should be able to use SPSS output to interpret the measures that are covered in the central tendency and dispersion lesson. You should also be able to generate the types of output that contain information about central tendency and dispersion, which will be covered next.
Generating Central Tendency and Dispersion Output
As indicated by the three sets of output we interpreted, there are several ways to generate central tendency output in SPSS. Three methods are presented in the following sections: (a) using the frequencies command, (b) using the descriptives command, and (c) using the explore command. The method that you would use when analyzing your own data would depend upon your preference or the types of information that you needed to include in the output.
Using the Frequencies Command
You can include the mean, median, and mode in frequency output generated using the frequencies command. The output will also include the sum of the values of the specified variable(s) as well as the number of values included in the calculation(s). To create a frequency table using SPSS, open the analyze menu and select the frequencies command from the descriptive statistics submenu. In the Frequencies dialog box, you must specify the name of the variable for which you want to calculate measures of central tendency. To specify a variable to include in the output:
Using the Descriptives Command
The descriptives output can also be customized to provide various measures of central tendency and dispersion. To generate descriptives output using SPSS, open the analyze menu and select the descriptives command from the descriptive statistics submenu. In the Descriptives dialog box:
Using the Explore Command
The last method of obtaining central tendency and dispersion output that we will use is the explore command. Using this method, you can obtain the same information as the descriptives output, plus a few additional statistics. To generate explore output using SPSS, open the analyze menu and select the explore command from the descriptive statistics submenu.
In the Explore dialog box:
Notice the similarities and differences in the outputs generated using the frequencies, descriptives, and explore commands. The calculations are the same no matter which output you use, but the information displays differently. When determining which command to use, select the one that provides the information you need in the format you prefer.
Review
You can generate central tendency and dispersion output using several different methods in SPSS. Using the information that you have learned in this lesson and given sample SPSS output, you can:

© 2007 by Melissa Kelly and L. K. Curda. All rights reserved.  Updated on September 26, 2007 