Central Tendency and Dispersion
 

Prior to reading this SPSS lesson, you should have already read the lesson on central tendency and dispersion. During this lesson, you will apply the concepts of central tendency and dispersion using SPSS. You will learn how to interpret various measures of central tendency (including mean, median, and mode) and variability (including standard deviation and variance) on SPSS output. You will also learn the settings used to generate central tendency and variability output in SPSS. Upon completing this lesson, you should be able to:

  • Identify measures of central tendency from SPSS output.
  • Identify measures of variability from SPSS output.
  • Determine the range of a data set based on the SPSS output.
  • Determine the number of values in a data set used to generate an SPSS output.
  • Determine the shape of a data distribution based on an SPSS-generated central tendency output.
  • Compare SPSS outputs and determine which output represents data that is more scattered or clustered about the mean.
  • Generate central tendency output using the frequencies command in SPSS.
  • Generate central tendency output using the descriptives command in SPSS.
  • Generate central tendency output using the explore command in SPSS.

To practice generating frequency output using SPSS, download one of the following percentiles data files:

(Note: If the linked file does not begin downloading when you click the link, right-click on the link and select save target as or save link as from the menu.)


Interpreting Measures of Central Tendency on SPSS Output

Recall that the most commonly used measures of central tendency are the mean, the median, and the mode. While you can calculate these values by hand or using MS Excel, you can also use SPSS to perform the calculations.

Central Tendency

Below is a set of data for which we want to calculate the mean, median, and mode.

25.00 34.00 33.00 15.00 46.00
33.00 29.00 42.00 27.00 21.00
35.00 44.00 15.00 27.00 19.00
37.00 36.00 36.00 33.00 26.00
55.00 22.00 41.00 46.00 19.00
27.00 51.00 20.00 10.00 17.00
40.00 29.00 25.00 16.00 24.00
33.00 21.00 38.00 37.00 21.00
39.00 28.00 47.00 18.00 27.00
28.00 29.00 32.00 14.00 16.00

To calculate the mean, you would add up all of the values and divide the total by the number of values in the data set. In this example, the mean is 29.66. To calculate the median, you would rank the values in the data set in increasing order and then determine the middle value. In this example, the median is 28.5. To calculate the mode, you would determine the frequency at which each value occurs and then determine which value has the highest frequency. In this data set, there are two values that have the highest frequencies (f = 4). These values are 27 and 33.
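
If you want to cross-check the manual calculations outside SPSS, the short Python sketch below reproduces them. Python is not part of this lesson; the variable name scores and the use of Python's standard statistics module are assumptions made only for illustration.

```python
from statistics import mean, median, multimode

# The 50 values from the data set above, entered row by row.
scores = [
    25, 34, 33, 15, 46, 33, 29, 42, 27, 21,
    35, 44, 15, 27, 19, 37, 36, 36, 33, 26,
    55, 22, 41, 46, 19, 27, 51, 20, 10, 17,
    40, 29, 25, 16, 24, 33, 21, 38, 37, 21,
    39, 28, 47, 18, 27, 28, 29, 32, 14, 16,
]

score_mean = mean(scores)                # sum of the values divided by their count
score_median = median(scores)            # average of the two middle ranked values
score_modes = sorted(multimode(scores))  # every value tied for the highest frequency

print(f"mean = {score_mean:.2f}")   # mean = 29.66
print(f"median = {score_median}")   # median = 28.5
print(f"modes = {score_modes}")     # modes = [27, 33]
```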

Using SPSS to determine the measures of central tendency of a variable named score, we could produce one (or all) of the following sets of output (depending on how much information we needed).

Frequencies Output: [image: frequencies output with measures of central tendency]
Descriptives Output: [image: descriptives output with measures of central tendency]
Explore Output: [image: explore output with measures of central tendency]

Each of the above sets of output was generated using a different function in SPSS (to be described later in this lesson). The results are the same, though, because the outputs were generated from the same data file. In the first output (labeled frequencies output), all of the measures of central tendency presented in the textbook are included. Checking our manual calculations against the SPSS output, we see that our calculated measures of central tendency match the values from SPSS.

In the second output (labeled descriptives output), the only measure of central tendency included is the mean. The other descriptive statistics provided in the output are measures of variability (addressed in the variability section of this lesson).

In the third output (labeled explore output), there are additional measures of central tendency (along with several measures of variability). Since the same data was used to generate all of the output (and the methods used to calculate the measures were the same), the values are the same. A check of the median in the output shows that our manual calculation was correct. Our manual review of the frequencies revealed two modes (27 and 33). The SPSS output shows the lowest mode (when there is more than one mode) along with a note that there are multiple modes.
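
The reporting rule described above (show the lowest mode and flag that more than one exists) can be sketched in a few lines of Python. This only imitates the behavior described in this lesson; the function name reported_mode and the short sample are hypothetical.

```python
from statistics import multimode

def reported_mode(values):
    """Return (lowest mode, True if more than one mode exists)."""
    modes = multimode(values)  # all values tied for the highest frequency
    return min(modes), len(modes) > 1

# Hypothetical short sample in which 27 and 33 each occur twice.
sample = [33, 27, 21, 33, 29, 27]
mode_value, has_multiple = reported_mode(sample)
print(mode_value, has_multiple)  # 27 True
```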

Mean, Median, and Mode

Recall that a histogram and frequency curve graphically display the relationship between the mean, median, and mode. By looking at a frequency curve, you can determine whether a distribution is normally distributed or is skewed. You can also make this determination by reading the SPSS output. In the sample output shown below, the mean is higher than the median and mode.

What is the shape of this distribution? Since the mean is the largest value, the median is the next largest value, and the mode is the smallest value, the distribution has a positive skew. How else could you determine that this distribution was positively skewed? If you were viewing the explore output, you could determine whether the distribution was skewed by checking the calculated skew value, which is labeled "skewness". In this example, the test scores have a positive skew, indicating that most of the scores are at the low end of the distribution with relatively fewer scores at the high end of the distribution.
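
The comparison of mean, median, and mode used above can be written as a simple rule of thumb. The sketch below is an illustration only: the function name describe_shape and both samples are hypothetical and are not the test scores shown in the sample output.

```python
from statistics import mean, median, multimode

def describe_shape(values):
    """Classify skew from the ordering of the mean, median, and lowest mode."""
    center_mean = mean(values)
    center_median = median(values)
    center_mode = min(multimode(values))
    if center_mean > center_median > center_mode:
        return "positive skew"
    if center_mean < center_median < center_mode:
        return "negative skew"
    return "no clear skew from this rule of thumb"

# Hypothetical samples: one with a few high values, one with a few low values.
few_high_values = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]
few_low_values = [1, 7, 10, 11, 12, 12, 13, 13, 13, 14]

print(describe_shape(few_high_values))  # positive skew
print(describe_shape(few_low_values))   # negative skew
```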

 

Interpreting Measures of Dispersion on SPSS Output

Recall that the mean, median, and mode allow you to determine the shape of a data distribution. Another important aspect of the data distribution is how the data is spread (or how it varies). To examine the variability of a data distribution, you examine the measures of dispersion. Recall from the lesson on variability that you have learned about these four measures of dispersion: range, interquartile range, variance, and standard deviation.

Dispersion

Just as central tendency information can be included in several forms of SPSS output, measures of dispersion can be included in each of those forms as well. Below is sample output showing various measures of dispersion generated from the frequencies command, descriptives command, and explore command. The display differs across the outputs, but the results are the same because each analysis was conducted using the same data file.

Frequencies Output: [image: frequencies output with measures of dispersion]
Descriptives Output: [image: descriptives output with measures of dispersion]
Explore Output: [image: explore output with measures of dispersion]

We will start with the range. To calculate the range of a data distribution, you subtract the lowest value from the highest value. In the data set shown below (which was used to generate the above outputs), the calculated range would be 55 - 10 = 45.

25.00 34.00 33.00 15.00 46.00
33.00 29.00 42.00 27.00 21.00
35.00 44.00 15.00 27.00 19.00
37.00 36.00 36.00 33.00 26.00
55.00 22.00 41.00 46.00 19.00
27.00 51.00 20.00 10.00 17.00
40.00 29.00 25.00 16.00 24.00
33.00 21.00 38.00 37.00 21.00
39.00 28.00 47.00 18.00 27.00
28.00 29.00 32.00 14.00 16.00

Using the output to check our calculation, we can confirm that our calculated range was correct by reading the value labeled range. If you wanted to triple-check the range calculation, you could even subtract the lowest value from the highest value of the desired variable. On the output these values are labeled minimum and maximum, respectively. Given an SPSS output, then, you could determine or calculate the range of the underlying data set.
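
The same check can be made outside SPSS. The Python sketch below is an illustration only (the name scores is an assumption); it finds the minimum and maximum of the data set above and subtracts them.

```python
# The 50 values from the data set above, entered row by row.
scores = [
    25, 34, 33, 15, 46, 33, 29, 42, 27, 21,
    35, 44, 15, 27, 19, 37, 36, 36, 33, 26,
    55, 22, 41, 46, 19, 27, 51, 20, 10, 17,
    40, 29, 25, 16, 24, 33, 21, 38, 37, 21,
    39, 28, 47, 18, 27, 28, 29, 32, 14, 16,
]

lowest = min(scores)            # labeled minimum on the output
highest = max(scores)           # labeled maximum on the output
score_range = highest - lowest  # labeled range on the output

print(lowest, highest, score_range)  # 10 55 45
```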

To further explore the spread of the data, you could examine the interquartile range (IQR). If you calculated the IQR manually, you could check your calculations by reading the IQR from the explore output. The IQR and range give you information about how "spread out" the data is. The variance and standard deviation provide information about how the data vary relative to the mean of the data. If you were given two SPSS outputs that displayed two different values of variance and standard deviation, you could determine which data set had data that were more clustered around the mean. For example, compare the outputs below.

Example A Example B

Which of the two examples indicates the greater scatter of data around the mean? Recall from the lesson on variability that if the data are widely scattered about the mean, the variance and the standard deviation will be relatively large. Because the variance and standard deviation of Example A are larger than those of Example B, the data for Example A are more widely scattered about the mean. Likewise, the data for Example B are more tightly clustered about the mean.
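
To see the same idea with numbers, the Python sketch below compares two small samples that share a mean but differ in spread. Both samples are hypothetical; they are not the data behind Example A and Example B. The variance here is the sample variance (dividing by n - 1).

```python
from statistics import mean, stdev, variance

# Hypothetical samples with the same mean (30) but different spread.
sample_wide = [10, 20, 30, 40, 50]
sample_tight = [28, 29, 30, 31, 32]

for label, values in (("wide", sample_wide), ("tight", sample_tight)):
    print(f"{label}: mean = {mean(values)}, "
          f"variance = {variance(values)}, "
          f"std. deviation = {stdev(values):.2f}")

# The sample with the larger variance is more scattered about its mean.
more_scattered = "wide" if variance(sample_wide) > variance(sample_tight) else "tight"
print(f"More scattered about the mean: {more_scattered}")
```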

Other Descriptive Statistics

If you were given output but no data set, could you still determine how many values were in the sample (data set)? The answer is yes using either the frequencies output or the descriptives output. The number of values in the sample is represented as N in the output. In this example there were 50 values in the data set. Sometimes you may have an incomplete data set (e.g., where some of the values of the variables being analyzed are missing). SPSS alerts you to the number of missing values in the data set (for the variable being analyzed) in the frequencies output and the descriptives output.
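
The valid and missing counts can be illustrated with a short Python sketch. The variable below is hypothetical, and representing a missing value as None is an assumption made for this example only.

```python
# Hypothetical variable with two missing entries, represented here as None.
score = [25, 34, None, 15, 46, None, 29]

valid_values = [value for value in score if value is not None]
n_valid = len(valid_values)         # the N used in the calculations
n_missing = len(score) - n_valid    # values excluded because they are missing

print(f"N valid = {n_valid}, missing = {n_missing}")  # N valid = 5, missing = 2
```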

Another statistic that you have already encountered that is included on the explore output is the skewness. Recall that skewness provides information about the shape (symmetry) of a distribution. If the skewness value on the output is positive, the distribution has a positive skew. Similarly, if the skewness value is negative, the distribution has a negative skew. Based on what you learned about the shape of distributions, how would you interpret a positive or negative skew value on an SPSS output? If the skew value is positive, the variable being examined has mostly low values with relatively few high values. Conversely, if the skew value is negative, the variable has mostly high values with relatively few low values.
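
To see how the sign of a skewness value relates to the data, the Python sketch below computes a skewness coefficient using the adjusted sample formula that statistical packages commonly report. This lesson does not state the exact formula SPSS uses, so treat the formula, the function name sample_skewness, and both samples as assumptions for illustration.

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(values)
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    cubed = sum(((x - center) / spread) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical samples.
mostly_low = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]          # a few high values
mostly_high = [1, 7, 10, 11, 12, 12, 13, 13, 13, 14]  # a few low values

print(sample_skewness(mostly_low) > 0)   # True: positive skew
print(sample_skewness(mostly_high) < 0)  # True: negative skew
```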

The other statistics on the sample output (e.g., confidence interval, trimmed mean, and kurtosis) will be covered later in this course or in the intermediate statistics course. For the present course, you should be able to use SPSS output to interpret the measures that are covered in the central tendency and dispersion lesson. You should also be able to generate the types of output that contain information about central tendency and dispersion, which will be covered next.


Generating Central Tendency and Dispersion Output

As indicated by the three sets of output we interpreted, there are several ways to generate central tendency output in SPSS. Three methods are presented in the following sections: (a) using the frequencies command, (b) using the descriptives command, and (c) using the explore command. The method that you would use when analyzing your own data would depend upon your preference or the types of information that you needed to include in the output.

Using the Frequencies Command

You can include the mean, median, and mode in frequency output generated using the frequencies command. The output will also include the sum of the values of the specified variable(s) as well as the number of values included in the calculation(s).

To create a frequency table using SPSS, open the analyze menu and select the frequencies command from the descriptive statistics submenu.

In the Frequencies dialog box, you must specify the name of the variable for which you want to calculate measures of central tendency. To specify a variable to include in the output:

  1. Click on the variable name in the list of available variables to select it.
  2. Then click the add button. You can add as many variables as desired to the output (as long as the variables already exist in the data file). SPSS will generate output for each variable that is listed in the variable(s) box.
  3. If you want to include frequency tables in the output, make sure that the box labeled display frequency tables is checked.
  4. Click the statistics button to open the statistics dialog box.
  5. In the Frequencies: Statistics dialog box, click in the corresponding checkbox(es) to select the measure(s) of central tendency and dispersion that you want to include in the output. Then click the continue button to return to the Frequencies dialog box.

  6. Click the OK button in the Frequencies box to generate the frequency output for the selected variable(s). Each statistic that you selected will be included on the output.
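
For comparison, the Python sketch below builds roughly the kind of information the steps above produce: a frequency table followed by summary statistics. It is not SPSS output and does not reproduce its layout; the column names and the variable name scores are assumptions.

```python
from collections import Counter
from statistics import mean, median, multimode

# The 50 values of the score variable used in this lesson.
scores = [
    25, 34, 33, 15, 46, 33, 29, 42, 27, 21,
    35, 44, 15, 27, 19, 37, 36, 36, 33, 26,
    55, 22, 41, 46, 19, 27, 51, 20, 10, 17,
    40, 29, 25, 16, 24, 33, 21, 38, 37, 21,
    39, 28, 47, 18, 27, 28, 29, 32, 14, 16,
]

counts = Counter(scores)  # frequency of each distinct value
n = len(scores)

# Frequency table: value, frequency, percent, cumulative percent.
print(f"{'score':>5} {'freq':>5} {'percent':>8} {'cum. %':>8}")
cumulative = 0
for value in sorted(counts):
    cumulative += counts[value]
    print(f"{value:>5} {counts[value]:>5} "
          f"{100 * counts[value] / n:>8.1f} {100 * cumulative / n:>8.1f}")

# Summary statistics, reporting the lowest mode when several modes exist.
print(f"N = {n}, sum = {sum(scores)}, mean = {mean(scores):.2f}, "
      f"median = {median(scores)}, lowest mode = {min(multimode(scores))}")
```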

Using the Descriptives Command

The descriptives output can also be customized to provide various measures of central tendency and dispersion. To generate descriptives output using SPSS, open the analyze menu and select the descriptives command from the descriptive statistics submenu.

In the Descriptives dialog box:

  1. Add the desired variable(s) to the list of variables to be analyzed. In the example shown below, the variable to be analyzed is named score.
  2. Click the options button to access the options for customizing the output.
  3. In the Descriptives: Options box, select the measure(s) that you want to include in the output. After selecting the desired options, click the continue button to return to the Descriptives dialog box.
  4. Click the OK button in the Descriptives dialog box to generate the output. Each statistic that you selected will be included on the output.
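
If you want to verify a descriptives-style summary by other means, a minimal Python sketch is shown below. The labels in the dictionary are chosen for this example and are assumptions, not a reproduction of the SPSS layout; the variance and standard deviation are sample statistics (n - 1 denominator).

```python
from statistics import mean, stdev, variance

# The 50 values of the score variable used in this lesson.
scores = [
    25, 34, 33, 15, 46, 33, 29, 42, 27, 21,
    35, 44, 15, 27, 19, 37, 36, 36, 33, 26,
    55, 22, 41, 46, 19, 27, 51, 20, 10, 17,
    40, 29, 25, 16, 24, 33, 21, 38, 37, 21,
    39, 28, 47, 18, 27, 28, 29, 32, 14, 16,
]

summary = {
    "N": len(scores),
    "Minimum": min(scores),
    "Maximum": max(scores),
    "Range": max(scores) - min(scores),
    "Mean": mean(scores),
    "Std. Deviation": stdev(scores),
    "Variance": variance(scores),
}

for name, value in summary.items():
    print(f"{name:<15} {value:>10.2f}")
```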

Using the Explore Command

The last method of obtaining central tendency and dispersion output that we will use is the explore command. Using this method, you can obtain the same information as in the descriptives output, along with a few additional statistics. To generate explore output using SPSS, open the analyze menu and select the explore command from the descriptive statistics submenu.

 

In the Explore dialog box:

  1. Add the desired variable(s) to the list of dependent variables to be analyzed. In the example shown below, the variable to be analyzed is named score.
  2. Select the statistics option for display if you do not want to include a stem-and-leaf diagram or a histogram in the output. Otherwise, select both as the display option, and use the plots button to specify the type of plot you want to include in the output.
  3. Click the statistics button to customize the output.
  4. In the Explore: Statistics box, make sure that the descriptives box is checked. (Using the explore command, you do not have the option of specifying which measures of central tendency and dispersion are included in the output; the default output includes the mean, median, variance, standard deviation, minimum value, maximum value, range, IQR, skew, and kurtosis.) Click the continue button to return to the Explore dialog box.
  5. Click the OK button in the Explore dialog box to generate the output.
    [image: explore output with measures of dispersion]
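
Several of the statistics in the default explore output can be cross-checked with the Python sketch below. It is an illustration only: skewness and kurtosis are left out, the labels are assumptions, and the quartiles come from Python's default "exclusive" method, which may not follow the same percentile convention as SPSS.

```python
from statistics import mean, median, quantiles, stdev, variance

# The 50 values of the score variable used in this lesson.
scores = [
    25, 34, 33, 15, 46, 33, 29, 42, 27, 21,
    35, 44, 15, 27, 19, 37, 36, 36, 33, 26,
    55, 22, 41, 46, 19, 27, 51, 20, 10, 17,
    40, 29, 25, 16, 24, 33, 21, 38, 37, 21,
    39, 28, 47, 18, 27, 28, 29, 32, 14, 16,
]

# First and third quartiles; the middle cut point (the median) is ignored here.
q1, _, q3 = quantiles(scores, n=4)

explore = {
    "Mean": mean(scores),
    "Median": median(scores),
    "Variance": variance(scores),
    "Std. Deviation": stdev(scores),
    "Minimum": min(scores),
    "Maximum": max(scores),
    "Range": max(scores) - min(scores),
    "Interquartile Range": q3 - q1,
}

for name, value in explore.items():
    print(f"{name:<20} {value:>10.2f}")
```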

 

Notice the similarities and differences in the outputs generated using the frequencies, descriptives, and explore commands. The calculations are the same no matter which output you use, but the information displays differently. When determining which command to use, select the one that provides the information you need in a preferable format.


Review

You can generate central tendency and dispersion output using several different methods in SPSS. Using the information that you have learned in this lesson and given sample SPSS output, you can:

  • Determine the mean, median, and mode of a specified variable in a data set.
  • Determine the range, standard deviation, variance, and IQR of a specified variable in a data set.
  • Determine the number of values for a specified variable in a data set.
  • Generate central tendency and dispersion output using the frequencies command.
  • Generate central tendency and dispersion output using the descriptives command.
  • Generate central tendency and dispersion output using the explore command.


 

 

© 2007 by Melissa Kelly and L. K. Curda. All rights reserved. Updated on September 26, 2007