Lesson 2:

Entering Data, Computing Descriptive Statistics, Transforming and Selecting Your Data

The first step in working with SPSS is to enter your data and create an SPSS data file. Although we will assume in this lesson that you are typing in your data for the first time, you should be aware that SPSS can also read existing data files from other programs such as Excel and Lotus 1-2-3. When you start an SPSS session, the initial screen is the data editor. As you can see below, this looks and functions like a spreadsheet.

Initial SPSS Data Editor Screen

Key Point

 The key to typing in your data is to realize that the data from each case or participant must be typed on a separate line. So, for example, if you were interested in analyzing the five test scores of the 20 students in a class, you would use one line (with five grade scores on each line) for each of the 20 students. If you collected six characteristics (e.g., average family income, average number of children, . . .) for 80 countries, you would use one line for each of the 80 countries (with the six characteristics on each line).

Example

Let's take a concrete example. Assume that you are interested in developing a profile of people who use a soup kitchen in your city. To do this you collected the following information from a random sample of 50 users of the soup kitchen.
 
 
Person  Gender  Age  Number of Siblings  Health Score  Personality Score  Activity Score
1       Male    76                       16.64         15                 -4
2       Female  28                       60.83         22
3       Male    39                       44.25         18
4       Male    47                       49.13         36
5       Female  56                       30.67         25                 -1
6       Female  61                       29.37         20                 -3
. . .
50      Male    59                       35.92         31

Other Points to Keep in Mind

As you type in your data, you will need to do each of the following.
  1. Create a separate line for each case, which in this particular example is each person.
  2. Create a column for each variable of interest. In this example, we will use seven columns, one for each of the following variables (person, gender, age, number of siblings, health score, personality score, activity score). Note that it may not be essential to create a column for a person identification number, but we will do so simply to help us keep track of the data.
  3. Develop a numerical code for the gender variable. In this case, we will assign the value of 1 to females and the value of 2 to males.
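The layout described in these three points can also be sketched outside SPSS. The short Python fragment below is only an illustration and is not part of SPSS: the names GENDER_CODES, COLUMNS, and encode_gender are hypothetical, as are the column names "siblings" and "activity". It uses only the person numbers and genders of the first six people on the data sheet.

```python
# Numeric codes for gender, as chosen in point 3 above.
GENDER_CODES = {"Female": 1, "Male": 2}

# One column per variable, in the order they will appear in the data file.
# (Column names are illustrative; SPSS names are defined in the next section.)
COLUMNS = ["person", "gender", "age", "siblings", "health", "personal", "activity"]


def encode_gender(label):
    """Return the numeric code for a gender label such as 'Male'."""
    return GENDER_CODES[label]


# Genders of the first six people on the data sheet, one entry per case.
genders = ["Male", "Female", "Male", "Male", "Female", "Female"]

# Each case becomes one line: (person number, gender code).
rows = [(person, encode_gender(g)) for person, g in enumerate(genders, start=1)]

for row in rows:
    print(row)
```

Each tuple corresponds to one line of the data file, which is the same one-case-per-line rule stated in the Key Point.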

Creating the Data File

In this section, we will describe the step-by-step procedure for creating your data file.

Step 1. Double-click on the gray portion of the first variable column (labeled var) on your screen. This should give you a Define Variable window that looks like the one below.

Define Variable Window

Step 2. In the space for Variable Name, type the desired variable name, which can be no more than 8 characters in length. The first character must be alphabetic; the remaining characters can be alphabetic and/or numeric, and no spaces can appear in the name. So, in this case, let's type "person" (without the quotation marks).
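If you want to check candidate names before typing them in, the naming rules in Step 2 can be written as a small test. The Python sketch below is an illustration only; the function name is_valid_name is hypothetical, it assumes plain English letters, and it encodes only the rules stated in this step rather than everything SPSS itself may accept or reject.

```python
import re

# Rules from Step 2: at most 8 characters, first character alphabetic,
# remaining characters alphabetic or numeric, and no spaces.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]{0,7}")


def is_valid_name(name):
    """Return True if name satisfies the rules described in Step 2."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["person", "gender", "ohealth", "2ndscore", "health score", "personality"]:
    print(candidate, is_valid_name(candidate))
```

Here "2ndscore" fails because it starts with a digit, "health score" fails because it contains a space, and "personality" fails because it is longer than 8 characters.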

Step 3. You can also Change Settings at this time. Among other things, this allows you to change any of the following.
 
 
Change Settings and Their Default Values
Type--the type of variable. Default: numeric.
  Total number of characters.
  Number of characters beyond the decimal point.
Labels--allows you to enter a more extensive label for your variable. Eight-character variable names are difficult to remember, and we recommend that you always exercise the option of entering a more descriptive label. Default: none.
Missing Values--enables you to designate certain scores as missing. Default: none.
Column Format--allows you to change the maximum number of characters in a column. Default: 8.

 

In this example, because there are no decimal points in our person variable, click on Type and change the Decimal Places to 0. Also, click on Labels and type in a label like "participant number" in the Variable Label slot. Then press Continue. Note that we could have exercised other options such as modifying the column width and whether the numbers appear left-justified, right-justified, or centered.

Step 4. Now, you should set up the next variable. Double-click on the next var column and type "gender" in the Variable Name field. Because we have decided to use the codes of 1 and 2 to represent females and males, respectively, you can click on Type to change the Decimal Places to 0. Next, click on Labels to provide a label for your variable; for example, you might type "gender of the participants." Because we have assigned numeric codes to the values of this variable and are likely to forget them over time, we should also specify Value Labels. In the field for Value, type 1, and in the field for Value Label, type a label such as "females." Then create the label "males" for the value 2.

Step 5. You should now define each of the remaining five variables (age, number of siblings, health score, personality score, and activity score).

Step 6. Now, type in the data for the first six persons in our data sheet. Start by clicking in the left-most column of the first line and type the person's number (i.e., "1"). Then press the Tab key or the right arrow key and type the first person's gender code (i.e., "2"). Continue in this way until all of the data are typed in. Below is a copy of the data file that we created.

Data File Containing the First Six Lines of the Data

Step 7. When you are satisfied with your data file, you should save it. Click on File and Save and type in a file name (e.g., "soupkit"). Note that SPSS automatically adds the ".sav" suffix to your file name. This is the SPSS suffix that is used to designate data files.

Computing Means and Standard Deviations

Once you have typed in your data, performing statistical procedures is relatively simple. To give you a sense of how to do so, as well as to expose you to some of the powerful SPSS tools for analyzing data, we will take you through several examples. In the first example, we will compute the mean and standard deviation for each of the variables from our soup kitchen study. In the second, we will do this separately for males and females. Finally, in a third example, we will compute a new variable that is a composite of the Health and Personality Scores and compute the mean and standard deviation for this measure.

Computing the Mean and Standard Deviation for All Scores

Step 1. Click on Statistics, then Summarize, then Descriptives.

Step 2. Highlight each of the variables for which you are interested in computing descriptive statistics (e.g., Age, Number of Siblings, Health Score, Personality Score, and Activity Score) and move them into the Variables column. Note that you can move all of these variables over at one time by clicking and dragging over the items that you want to select.

Step 3. Click on Options and select the desired statistics. At minimum, you should select the Mean and Std. deviation.

Step 4. Your output should look like that below. You may have fewer or more statistics depending on your selection in the Options menu. Note that the variable labels appear on your printout.

Output from the Analysis

Step 5. If you would like a hard copy of this output, you can print it by clicking on the print icon on the tool bar. Also, you can save the output by clicking on the disk icon. Note that SPSS automatically adds the ".spo" suffix to your output file name.
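To see what the Mean and Std. deviation figures represent, you can reproduce them by hand for a small set of scores. The Python sketch below is an illustration, not SPSS output: it uses only the Health Scores of the first six people on the data sheet, so its results will differ from those for the full sample of 50. It assumes the sample standard deviation (dividing by N - 1), which is the form the Descriptives procedure reports.

```python
from statistics import mean, stdev

# Health Scores of the first six people on the data sheet.
health = [16.64, 60.83, 44.25, 49.13, 30.67, 29.37]

n = len(health)
avg = mean(health)
sd = stdev(health)  # sample standard deviation (divides by n - 1)

print(f"N = {n}, mean = {avg:.2f}, std. deviation = {sd:.2f}")
```

Comparing a hand calculation like this with the SPSS output for the same cases is a quick way to confirm that the data were typed in correctly.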

Computing Means and Standard Deviations for Males and Females Separately

Step 1. First click on Data and then Split File. This allows you to split the file according to a particular variable and conduct separate analyses for each level of the variable.

Step 2. Next you should select Organize output by groups and move Gender from the variable list to the Groups Based on list. When you have done this, click on OK.

Step 3. Now, click on Statistics, Summarize, and Descriptives. Select the desired variable(s) and Options.

Step 4. Note in your output that two sets of data summaries are presented--one for females and one for males.
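The effect of Split File can be pictured as sorting the cases into one pile per gender code and then running the same summary on each pile. The Python sketch below illustrates that idea only; it is not how SPSS is implemented. It uses the gender codes and Health Scores of the first six people on the data sheet, and the names cases, groups, and summary are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

# (gender code, Health Score) for the first six people; 1 = female, 2 = male.
cases = [(2, 16.64), (1, 60.83), (2, 44.25), (2, 49.13), (1, 30.67), (1, 29.37)]
labels = {1: "females", 2: "males"}

# Split the cases into one list of scores per gender code.
groups = defaultdict(list)
for gender, score in cases:
    groups[gender].append(score)

# Compute the same descriptive statistics separately for each group.
summary = {}
for gender in sorted(groups):
    scores = groups[gender]
    summary[labels[gender]] = (len(scores), mean(scores), stdev(scores))

for label, (n, avg, sd) in summary.items():
    print(f"{label}: N = {n}, mean = {avg:.2f}, std. deviation = {sd:.2f}")
```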

Computing a New Variable and Then Performing Descriptive Statistics

Assume that in addition to our interest in calculating the means and standard deviations for the collected variables, we wish to compute a composite score that roughly represents some overall measure of physical and mental health. Specifically, assume that we wish to compute the mean and standard deviation for a new variable called overall health that represents the average of each person's Health Score and Personality Score. You can do this by performing the following steps.

Step 1. Click on Transform and then Compute. This should produce the screen shown below.

Compute Variable Screen

Step 2. Type the name of the variable that you wish to create (in this case, we will name it "ohealth") in the Target Variable field. Now, you need to type in the computation that you wish to have performed in the Numeric Expression field. You can use all the operations listed on the bottom of this screen, and it is important to realize that operations within parentheses are performed first. Thus, if we wish to compute the mean of the Health and Personality Scores, we need to add these together before we divide by 2. To do so, simply type (or move over) variable names for the Health and Personality Scores and enclose these within parentheses. Next, click on the / button (or type it-- this is the symbol for division) and follow this with the number "2." Your target variable and your numeric expression fields might look like the following:
Target Variable     Numeric Expression
ohealth          =  (health + personal)/2

Step 3. Now you should click on the Type&Label field. This will enable you to create a longer label for your variable and modify the type and width of the variable. Once you have done this, press Continue. Then press OK.

Step 4. Once you have done this, notice that the new variable appears in your data file. Now you are ready to calculate the mean and standard deviation for this new variable by using the procedures outlined above.
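The role of the parentheses in the numeric expression is easy to check with a few cases. The Python sketch below is an illustration only. It assumes that the decimal values in the first six rows of the data sheet are the Health Scores and that the whole numbers following them are the Personality Scores; the list name wrong is hypothetical and shows what happens when the parentheses are left out.

```python
# Scores for the first six people, as read from the data sheet.
health = [16.64, 60.83, 44.25, 49.13, 30.67, 29.37]
personal = [15, 22, 18, 36, 25, 20]

# Correct: add the two scores first, then divide the sum by 2.
ohealth = [(h + p) / 2 for h, p in zip(health, personal)]

# Without parentheses, only the Personality Score is divided by 2.
wrong = [h + p / 2 for h, p in zip(health, personal)]

for person, (good, bad) in enumerate(zip(ohealth, wrong), start=1):
    print(f"person {person}: (health + personal)/2 = {good:.3f}, "
          f"health + personal/2 = {bad:.3f}")
```

For the first person, the correct expression gives the average of 16.64 and 15, whereas the version without parentheses adds 16.64 to half of 15.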
 
 

Another Example to Help You Practice Creating a Data File for a Data Set

If you would like further practice creating a data file and computing descriptive statistics, work your way through the following example. Assume that you have conducted an experiment to determine if newborn infants prefer patterned or plain stimuli. In the study, 10 infants were presented with both plain and patterned figures. The amount of time spent in looking at each type of stimulus was measured. In addition, the race of the infants and the age (in days) were recorded. The data sheet for this experiment appears below.
 
 
Participant  Race  Age  Time Viewing Patterned Figure  Time Viewing Plain Figure
Am. Indian  15 
Asian  13 
White  17 
White  10  11 
Afr. Amer.  14 
Hispanic  16 
White 
White  14  12 
Asian  21 
10  Afr. Amer.  13 

Below is one possible way to create your data file. Note that you have to create a numeric code for the race variable and in this case we used 1 = American Indian, 2 = Asian, 3 = African American, 4 = Hispanic, and 5 = White. Once you have typed the data file, compare it with the one below. Then, try to compute means and standard deviations for the time viewing the patterned figure and the time viewing the plain figure variables. The results should look like those in the output file below.
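Before typing the race codes into SPSS, it can help to convert the labels on the data sheet to their codes and count how often each code occurs, so that you can check the finished data file against the count. The Python sketch below is an illustration only; the names RACE_CODES, races, codes, and counts are hypothetical, and the coding scheme is the one described above.

```python
from collections import Counter

# Coding scheme used in this example.
RACE_CODES = {
    "Am. Indian": 1,
    "Asian": 2,
    "Afr. Amer.": 3,
    "Hispanic": 4,
    "White": 5,
}

# Race of the 10 infants, in the order they appear on the data sheet.
races = ["Am. Indian", "Asian", "White", "White", "Afr. Amer.",
         "Hispanic", "White", "White", "Asian", "Afr. Amer."]

codes = [RACE_CODES[r] for r in races]
counts = Counter(codes)

print(codes)
for code in sorted(counts):
    print(f"code {code}: {counts[code]} infant(s)")
```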

Data File for the Infant Viewing Experiment


Output File for the Infant Viewing Experiment



