SPSS Merge Files Multiple Key Variables


This document is currently incomplete, but contains enough information to handle several common problems.
The Merge Files menu (Data - Merge Files - Add Variables) lets you combine variables from several datasets into a single dataset. In addition to merging variables from several data files, the command can also deal with situations where cases or variables are missing in some of the datasets.

An introductory example
Let us take a simple example to introduce merging data. A first dataset, CountryData, contains the following country data:

CountryData
Name	Label
Country
Continent
AgriGDP	Agriculture (as % of GDP), 1997
PopGrow1	Annual population growth rate (%), 1975-1997
PopGrow2	Annual population growth rate (%), 1997-2015
GovExpd	Central government expenditure (as % of GDP), 1997
DocDens	Doctors (per 100,000 people), 1993

The second dataset, AdultLiteracy, contains literacy data:

AdultLiteracy
Name	Label
Country
Continent
AdLit	Adult literacy rate (%), 1997
AdLitFemale	Adult literacy rate (%), Female, 1997
AdLitMale	Adult literacy rate (%), Male, 1997

A merge can also be run from syntax with the MATCH FILES command. As an example with different files, the commands below merge dads.sav and faminc.sav on the key variable famid: each file is sorted by famid and saved as a sorted copy (dads2.sav, faminc2.sav), and the two sorted copies are then matched.

    GET FILE='dads.sav'.
    SORT CASES BY famid.
    SAVE OUTFILE='dads2.sav'.
    GET FILE='faminc.sav'.
    SORT CASES BY famid.
    SAVE OUTFILE='faminc2.sav'.
    MATCH FILES FILE='dads2.sav' /FILE='faminc2.sav' /BY famid.
    EXECUTE.

From the menus: if the files contain the same variables and you want to add cases, click Data - Merge Files - Add Cases. If you are adding variables, click Data - Merge Files - Add Variables.
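For the Add Cases route, the syntax counterpart is the ADD FILES command. The following is only a minimal sketch: both file names are hypothetical, and the two files are assumed to contain the same variables.

```spss
* Hypothetical file names, not taken from the text above.
* Both files are assumed to hold the same variables.
ADD FILES
  /FILE='survey_wave1.sav'
  /FILE='survey_wave2.sav'.
EXECUTE.
```

The cases of the second file are appended below those of the first; no key variable or sorting is needed when no BY subcommand is used.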

To analyze variables from both datasets together, you will have to combine the two datasets into a single one. More specifically, the literacy data has to be added to the currently active dataset, i.e. CountryData (merging the two datasets). Data - Merge Files - Add Variables shows the following dialog:
Select the 'AdultLiteracy' dataset from the list, then continue to the next dialog. Note that 'AdultLiteracy' is listed because it is currently open; you could also use the second option to find (browse) it somewhere on your computer.
 In the next dialog box you will find: 
(Right panel): A list of the variables that you will find in the new active dataset to be created. Note that the source dataset is shown using (*) and (+) notation [in SPSS, '*' always refers to the active dataset]. If you do not want all variables, you can move any variable to the left-hand panel.
(Left panel): A list of variables to be excluded from the dataset to be built. By default, all duplicate variables from the non-active dataset(s) appear on that list. If you would like to include one of the variables listed, you will have to rename it, as you cannot have two variables with the same name.
Match cases on key variables in sorted files: This option has been checked, and the Country variable defined as the key variable, to make sure that countries match, i.e. for each case SPSS verifies that the country names are identical. By default (i.e. without selecting this option), SPSS assumes that both datasets contain the same observations in the same order and does not verify anything.
Important: Both datasets must be sorted before starting the merge.
Both files provide cases: This option has been checked to deal with situations where you do not have the same number of observations in both datasets, e.g. some countries might be missing in one of the datasets. The resulting dataset will contain all countries from both datasets. For a country that is missing in one of the datasets, the variables coming from that dataset are set to SYSMIS. Note also that country names must match exactly, i.e. 'Korea, North', 'North Korea', 'north Korea' and 'NORTH KOREA' are all different countries!
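The dialog settings described above correspond roughly to the MATCH FILES syntax below. This is a sketch under assumptions: the .sav file names, the new name ContinentLit and the flag variables InCountryData and InAdultLiteracy are made up for illustration and do not come from the example itself.

```spss
* Assumed file names; adjust the paths to your own data.
* Open and sort the literacy data by the key variable.
GET FILE='AdultLiteracy.sav'.
SORT CASES BY Country (A).
DATASET NAME AdultLiteracy.

* Open and sort the country data; it becomes the active dataset (*).
GET FILE='CountryData.sav'.
SORT CASES BY Country (A).
DATASET NAME CountryData.

* Both files provide cases, matched on Country.
* RENAME keeps the duplicate Continent variable under a new, hypothetical name.
* IN creates hypothetical 0/1 flags showing which file supplied each case.
MATCH FILES
  /FILE=*
  /IN=InCountryData
  /FILE='AdultLiteracy'
  /RENAME (Continent = ContinentLit)
  /IN=InAdultLiteracy
  /BY Country.
EXECUTE.
```

Cases where one of the two flags is 0 are countries found in only one dataset, which is a quick way to spot names that failed to match because of spelling or capitalization.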
As a second example, suppose that for the analysis of a survey from several countries (Survey dataset, individual-level data) you would like to add country-specific information, like GDP per capita, that is available in a different dataset (CountryInfo, aggregate data). In the dataset to be produced there will be a GDP per capita variable where, for each Albanian respondent, you will find the GDP per capita for Albania.
In technical terms this means that for each observation in the Survey dataset you will look up the GDP value in the CountryInfo dataset.
After merging, every respondent in the Survey dataset carries the values of his or her country from CountryInfo.

The dialog panel that corresponds to our example contains the following settings:
Country is the key variable that links both datasets.
The keyed table (also called lookup file) is the dataset from which the data is to be fetched for each observation in the active dataset, i.e. in our example CountryInfo is the keyed table.
Both datasets need to be sorted before merging them. Before starting the process, SPSS issues a warning.
If - in our example - a country cannot be found in the lookup file (keyed table), a SYSMIS value will be assigned to each respondent from that country. 
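In syntax, the lookup described above uses the TABLE subcommand of MATCH FILES for the keyed table. A minimal sketch, assuming the two datasets are stored in files called Survey.sav and CountryInfo.sav (these file names are assumptions):

```spss
* Assumed file names; adjust the paths to your own data.
* The keyed table (lookup file) should hold one case per country.
GET FILE='CountryInfo.sav'.
SORT CASES BY Country (A).
DATASET NAME CountryInfo.

* The survey data becomes the active dataset (*).
GET FILE='Survey.sav'.
SORT CASES BY Country (A).
DATASET NAME Survey.

* Look up the country-level values for every respondent.
MATCH FILES
  /FILE=*
  /TABLE='CountryInfo'
  /BY Country.
EXECUTE.
```

Because CountryInfo is named on TABLE rather than FILE, it only contributes variables: countries that appear in CountryInfo but not in the survey do not add cases to the result.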
See also
The MATCH FILES command offers many more options than the Merge Files menu and performs a wide variety of matching and merging operations that go beyond the simple tasks described here.
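One such case is matching on more than one key variable: all keys are simply listed on the BY subcommand, and both files are sorted by the same keys in the same order. The sketch below is hypothetical; the variable Year and both file names are assumptions used only to illustrate the pattern.

```spss
* Hypothetical example: Year and both file names are assumptions.
* The keyed table holds one case per combination of Country and Year.
GET FILE='CountryYearInfo.sav'.
SORT CASES BY Country (A) Year (A).
DATASET NAME CountryYearInfo.

* The active dataset is sorted by the same keys in the same order.
GET FILE='SurveyWaves.sav'.
SORT CASES BY Country (A) Year (A).

* A case is matched only if both Country and Year are identical.
MATCH FILES
  /FILE=*
  /TABLE='CountryYearInfo'
  /BY Country Year.
EXECUTE.
```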
