Missing Values

It is common for some of your data records to be incomplete. This may occur when survey respondents skip some questions, or answer a question incorrectly, making it impossible to determine what response was intended. In some cases, a respondent may refuse to answer a question or may not know the answer.

For numeric data, a blank cell in the SPSS data file is assigned the system-missing value. It is generally recommended that missing numeric data values be left blank. Missing values are omitted from most calculations in SPSS.
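The effect of omitting missing values from a calculation can be illustrated outside SPSS. The following is a minimal Python sketch, not SPSS syntax: it assumes a blank cell is represented by `None`, and the helper name `mean_of_valid` and the sample ages are made up for illustration.

```python
from typing import Optional, Sequence


def mean_of_valid(values: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the non-missing values; None stands in for a blank cell."""
    valid = [v for v in values if v is not None]
    if not valid:
        # Every case is missing, so there is nothing to average.
        return None
    return sum(valid) / len(valid)


# Hypothetical ages for five respondents; two left the question blank.
ages = [34, None, 51, 29, None]
average_age = mean_of_valid(ages)  # computed from the three valid cases only
```

The two blank cases do not count as zeros and do not enter the denominator; the mean is based only on the three cases with a value.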

When defining a variable's attributes in the Data Editor's Variable View, you also can designate user-defined missing values. These may be up to three discrete (individual) missing values, a range of missing values, or a range plus one discrete value. User-defined missing values are useful when you want to keep track of the different reasons why a variable's value is missing.

For example, let's say a survey asks respondents for their annual household income. For the sake of simplicity, let's say that there are two possible valid codes: 1, representing less than $50,000, and 2, representing $50,000 or more. Three codes are reserved for different types of missing values: 7 for "refused," 8 for "don't know," and 9 for "not applicable" (e.g., perhaps the respondent doesn't live in a conventional household). Value labels are assigned as shown in Figure 8:

Value labels for income question

Figure 8

User-defined missing values are specified in the Data Editor's Variable View, in the same way as other variable attributes. Click the cell in the Missing column. A button with an ellipsis (...) appears. Click it, and you'll be prompted to enter up to three discrete missing values, or a range of missing values plus, optionally, one more discrete value to be considered missing (see Figure 9).

Specifying missing values

Figure 9

In the example above, the three user-defined missing value codes are entered as discrete values. They could also have been entered as a range of missing values, with Low=7 and High=9.
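The rules for a missing-value specification, and the equivalence of the two ways of entering codes 7 through 9, can be sketched in a few lines of Python. This is an illustration only, not SPSS syntax or an SPSS API; the class name `MissingSpec` and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MissingSpec:
    """Hypothetical model of a user-defined missing-value specification."""
    discrete: Tuple[float, ...] = ()
    low: Optional[float] = None
    high: Optional[float] = None

    def __post_init__(self):
        has_range = self.low is not None or self.high is not None
        if has_range and (self.low is None or self.high is None):
            raise ValueError("a range needs both a low and a high value")
        if has_range and self.low > self.high:
            raise ValueError("low must not exceed high")
        # Up to three discrete values, or a range plus at most one.
        limit = 1 if has_range else 3
        if len(self.discrete) > limit:
            raise ValueError(f"at most {limit} discrete value(s) allowed")

    def is_user_missing(self, value) -> bool:
        if value is None:
            # A blank cell is system-missing, not user-defined missing.
            return False
        if value in self.discrete:
            return True
        return self.low is not None and self.low <= value <= self.high


as_discrete = MissingSpec(discrete=(7, 8, 9))
as_range = MissingSpec(low=7, high=9)

# Both specifications classify the five income codes the same way.
codes = [1, 2, 7, 8, 9]
same = all(as_discrete.is_user_missing(c) == as_range.is_user_missing(c)
           for c in codes)
```

In this sketch the two specifications agree for the integer codes used in the example, but a range would also treat a value such as 7.5 as missing, while the three discrete values would not.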

When the data are analyzed, SPSS does not include missing values in the calculations of statistics that summarize legitimate responses. For example, in a frequency table of the responses to the income item, "valid percentages" are computed for the valid values only, in this case, 1 or 2 (see Figure 10). The SPSS Frequencies procedure also tabulates missing values and computes their percentages of all cases for the variable. Note that "system missing" values (blanks) are also tabulated.

Frequency table of income

Figure 10
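The difference between the two percentage columns comes down to the denominator: percentages use all cases, while valid percentages use only the cases with a valid code. The Python sketch below illustrates that arithmetic. It is not the Frequencies procedure itself; the function name `frequency_table` and the ten sample responses are made up, and `None` stands in for a blank (system-missing) cell.

```python
from collections import Counter


def frequency_table(values, user_missing):
    """Return rows of (code, count, percent, valid_percent).

    valid_percent is None for user-missing codes and for blanks (None).
    """
    counts = Counter(values)
    total = len(values)

    def is_missing(code):
        return code is None or code in user_missing

    n_valid = sum(n for code, n in counts.items() if not is_missing(code))

    rows = []
    # Sort numeric codes in ascending order and list blanks last.
    for code in sorted(counts, key=lambda c: (c is None, 0 if c is None else c)):
        n = counts[code]
        percent = 100 * n / total
        valid_percent = None if is_missing(code) else 100 * n / n_valid
        rows.append((code, n, percent, valid_percent))
    return rows


# Hypothetical responses to the income item.
responses = [1, 1, 2, 2, 2, 7, 8, 9, 9, None]
table = frequency_table(responses, user_missing={7, 8, 9})
```

With these made-up responses, code 1 accounts for 20% of all ten cases but 40% of the five valid cases, and the valid percentages for codes 1 and 2 sum to 100.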
