***************************************
* Enhanced Version of Migration Files *
***************************************
The Special Tabulation Product (STP28) files (county-to-county
migration tally, abbreviated format) are available from the
US Bureau of the Census via their FTP server. The files may be
retrieved by establishing an anonymous ftp session with the
Bureau's server (ftp ftp.census.gov; log in as user ftp; supply
your email address when prompted for a password; change directory
to /pub/stp).
The files are posted by the Census Bureau in PKZIP self-extracting
executable files containing one data file per state, one documentation
file, one data file containing marginal totals, and one file
presenting county-state names and FIPS codes.
The structure of these files may generate problems in processing.
For example:
- The files cannot be processed in mainframe environments before
decompression, which may require users to transfer the files
twice: once to the PC, followed by decompression, and then up to
the mainframe.
- The files contain a single data file per state with data on
in-migration to the state only (in addition to within-state
migration data).
- The table matrices for the race and Spanish origin breakdowns are
grouped together in the same data file (P1 Race of Person: white,
black, asian, and other; and P2 Spanish Origin of Person: not
spanish, mexican, puerto rican, cuban, and other).
- The table matrices for P1 and P2 each contain only a single count
field, for the number of persons who moved between 1985 and 1990.
From the data and file structure it is evident that:
- Calculating net migration, including the flow of persons from
the state of interest to other states, requires searching the data
files of all other states. This involves handling 50
self-extracting executable files and requires a fair amount of
free disk space to process. For example, the data file for
Pennsylvania has no information about how many persons moved from
Pennsylvania to New York, Ohio, or California during the period
of 1985 to 1990.
- Similarly, large amounts of data need to be processed even if
only race of person or spanish origin of person is needed (i.e.,
only access to the P1 or P2 table matrix is desired).
Henk Meij of CIESIN (Saginaw, MI) and John Blodgett of the University
of Missouri-St. Louis (MO SDC) have created an alternative set of
files containing the same basic data but in a restructured format.
The primary advantage of this new format is that data on both in-
and out-migration for a specific geographic area are easier to obtain.
The enhancements involve separation of the table matrices, addition
of separate count fields for in and out moves on the table matrices,
and creation of a third file for net-migration flows. In all files,
information about migration flows in both directions is stored on
a single record for ease of processing.
The files are available via anonymous FTP and Mosaic at CIESIN (see
below for details). The format of these enhanced files is described
in detail below. There are 3 separate files for each state.
o The "p1" and "p2" files contain all the data from the original
census STP28 files, but the data by race (p1) and by hispanic
origin (p2) have been separated into individual state level files.
The p1 and p2 files have been enhanced by creating two count
fields per record instead of one: the POPIN field counts persons
moving into the county (COUNTY) from another county (COUNTY2),
while POPOUT counts persons moving in the opposite direction.
The header record type is gone, and the two county codes appear
in every record, a tradeoff that involves a little more storage
but a lot more processing convenience when subsetting the files.
o The third state level file, "tf", contains the "total flows" of
persons for all states (into the county, out of the county and
within county moves). The total flows are reported for all counties
within a state, and all counties outside of the state which were
sources or destinations of moves involving the state of interest.
This file was created by aggregating the more detailed data in
the p1 file.
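For readers who do not use SAS, the aggregation described for the
"tf" file can be sketched in Python. This is a minimal illustration
under stated assumptions, not the program used to build the files:
the function name aggregate_total_flows is hypothetical, and the
sample lines are the p1 records for Adair County, MO (29001) and
St. Clair County, IL (17163) shown in the examples later in this
document.

```python
from collections import defaultdict


def aggregate_total_flows(p1_lines):
    """Sum POPIN and POPOUT over all strata for each (COUNTY, COUNTY2) pair.

    Assumes the p1 layout described in this document: COUNTY in columns
    1-5, COUNTY2 in columns 6-10, stratifiers in columns 11-17, followed
    by the blank-delimited POPIN and POPOUT counts.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in p1_lines:
        if not line.strip():
            continue  # skip blank lines
        county, county2 = line[0:5], line[5:10]
        popin, popout = (int(value) for value in line[17:].split()[:2])
        totals[(county, county2)][0] += popin
        totals[(county, county2)][1] += popout
    return {pair: tuple(counts) for pair, counts in totals.items()}


# Sample p1 records copied from the example section of this document.
sample_p1 = [
    "29001171631111305 5 0",
    "29001171631113204 7 0",
    "29001171631113304 6 0",
    "29001171631113305 8 0",
    "29001171631211205 4 0",
    "29001171632111304 9 0",
    "29001171632112406 0 5",
]

flows = aggregate_total_flows(sample_p1)
print(flows[("29001", "17163")])  # (39, 5), the same counts as the sample tf record
```

Keying the totals on the county pair mirrors the one-record-per-pair
design of the tf file.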
The layout of these files is quite simple. The following SAS (r)
statements can be used to read each of the files. The codes for the
stratifier fields are identical to those used by the Census Bureau
(consult the codebook).
/* read tf record */
INPUT COUNTY $1-5 COUNTY2 $6-10 POPIN POPOUT;
/* read p1 record */
INPUT COUNTY $1-5 COUNTY2 $6-10
RACE $11 SEX $12 NATIVITY $13 POVSTAT $14 EDUC $15
AGE $16-17 POPIN POPOUT;
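/* read p2 record: same as p1, with SPANISH in place of RACE */
INPUT COUNTY $1-5 COUNTY2 $6-10
SPANISH $11 SEX $12 NATIVITY $13 POVSTAT $14 EDUC $15
AGE $16-17 POPIN POPOUT;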
Note that the two population count fields (POPIN and POPOUT) are
read as "free form" instead of fixed-length fields. These fields
will only be as long as their values require with one leading blank
as delimiter.
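The same layout can be read outside SAS. The following Python
sketch mirrors the INPUT statements above: fixed columns for the
county codes and stratifiers, then blank-delimited counts. The
function name parse_record and its "kind" argument are hypothetical
names introduced for illustration only.

```python
def parse_record(line, kind):
    """Parse one tf, p1, or p2 record into a dictionary.

    kind is "tf", "p1", or "p2". Column positions follow the SAS INPUT
    statements in this document; POPIN and POPOUT are read free-form.
    """
    record = {"COUNTY": line[0:5], "COUNTY2": line[5:10]}
    if kind == "tf":
        rest = line[10:]
    elif kind in ("p1", "p2"):
        # p1 carries RACE in column 11; p2 carries SPANISH there instead.
        record["RACE" if kind == "p1" else "SPANISH"] = line[10]
        record["SEX"] = line[11]
        record["NATIVITY"] = line[12]
        record["POVSTAT"] = line[13]
        record["EDUC"] = line[14]
        record["AGE"] = line[15:17]
        rest = line[17:]
    else:
        raise ValueError("kind must be 'tf', 'p1', or 'p2'")
    # Only the first two blank-delimited tokens are counts; anything
    # after them (such as an annotation) is ignored.
    popin, popout = rest.split()[:2]
    record["POPIN"] = int(popin)
    record["POPOUT"] = int(popout)
    return record


p1 = parse_record("29001171632112406 0 5", "p1")
tf = parse_record("2900117163 39 5", "tf")
print(p1["RACE"], p1["AGE"], p1["POPIN"], p1["POPOUT"])  # 2 06 0 5
print(tf["POPIN"], tf["POPOUT"])  # 39 5
```

The stratifier codes are kept as strings, as in the SAS statements,
so that leading zeros (for example AGE "06") are preserved.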
These variables may be labeled as follows (again a SAS (r) syntax example):
/* assign labels */
LABEL POPIN =' IN-MIGRATION COUNTY2 TO COUNTY, 85-90'
POPOUT='OUT-MIGRATION COUNTY TO COUNTY2, 85-90' ;
Examples of the data are presented next for the three files
mentioned above involving the counties Adair County, MO (29001) and
St. Clair County, IL (17163) from the MO files:
/* p1 file */
29001171631111305 5 0
29001171631113204 7 0
29001171631113304 6 0
29001171631113305 8 0
29001171631211205 4 0
29001171632111304 9 0
29001171632112406 0 5
/* p2 file */
29001171631111304 9 0
29001171631111305 5 0
29001171631112406 0 5
29001171631113204 7 0
29001171631113304 6 0
29001171631113305 8 0
29001171631211205 4 0
/* tf file */
2900100000 11641 0 Total number of persons in county(1)
2900129001 4719 4719 Total moves within county 29001.
2900117163 39 5 Total moves between counties in example.
1716329001 5 39 Same record but from IL file.
In this example the "total flows" record indicates that 39 persons
moved into Adair from St. Clair, while 5 persons moved to St. Clair
from Adair(2).
Notes:
1-A value of '00000' in the 2nd county field indicates data
pertaining to persons over 5 who resided in the county and
did not move between 1985 and 1990 (this is undocumented
information and needs verification by Bill Frey and/or
Census Bureau)
2-If there are no moves between two counties there will be
no record for that county pair.
3-A move in one direction with none in the opposite direction
is indicated by a '0' in the POPIN or POPOUT field.
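As an illustration of how the notes above affect processing, the
Python sketch below computes net migration from tf-style records.
It is a sketch under assumptions, not a prescribed method: the
function names are hypothetical, the records are held in a
dictionary keyed by (COUNTY, COUNTY2), a missing pair is treated as
zero moves (note 2), and the '00000' record is skipped because,
per note 1, it is believed not to describe a flow. County code
"99999" is a made-up code used only to show the missing-pair case.

```python
def net_migration(tf_records, county, county2):
    """Return POPIN - POPOUT for one county pair (0 if the pair is absent)."""
    popin, popout = tf_records.get((county, county2), (0, 0))
    return popin - popout


def county_net(tf_records, county):
    """Sum net migration for a county over all other counties on file."""
    total = 0
    for (c1, c2), (popin, popout) in tf_records.items():
        if c1 != county:
            continue
        if c2 == "00000" or c2 == county:
            continue  # skip the '00000' record and within-county moves
        total += popin - popout
    return total


# Counts copied from the sample tf records in this document.
tf_records = {
    ("29001", "00000"): (11641, 0),
    ("29001", "29001"): (4719, 4719),
    ("29001", "17163"): (39, 5),
    ("17163", "29001"): (5, 39),
}

print(net_migration(tf_records, "29001", "17163"))  # 34
print(net_migration(tf_records, "17163", "29001"))  # -34
print(net_migration(tf_records, "29001", "99999"))  # 0
print(county_net(tf_records, "29001"))              # 34 for this sample only
```

Because each record already holds both directions, the net figure
for a pair needs a single lookup and no second state file.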
One additional type of file had to be created in order to generate
the total flows files: the interstate migration files, or "im" files.
Their data structure is identical to that of the original Bureau of
the Census files. The "im" files contain all migration from a state
of interest to all other states. In other words, by combining the
"im" file for a particular state with the original data file for
that state, the p1, p2 and tf files may be created. An example of
these data involving the counties mentioned above is presented next.
/* im file */
01716329001
12112406 5
21112406 5
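The combination of the original (in-migration) data with the "im"
(out-migration) data can be sketched in Python as follows. This is
an illustration only: it does not parse the raw Census record
layout, and it assumes both inputs have already been loaded into
dictionaries keyed by (destination county, origin county,
stratifier codes). That in-memory representation and the function
name combine_flows are assumptions, not part of the files.

```python
def combine_flows(in_flows, out_flows):
    """Merge one-directional counts into (POPIN, POPOUT) records.

    in_flows:  moves into counties of the state of interest
    out_flows: moves out of the state of interest (from the "im" data)
    Both map (destination, origin, strata) -> number of persons.
    The result maps (COUNTY, COUNTY2, strata) -> (POPIN, POPOUT).
    """
    combined = {}
    for (dest, origin, strata), persons in in_flows.items():
        # An in-move is filed under COUNTY=destination, COUNTY2=origin.
        combined.setdefault((dest, origin, strata), [0, 0])[0] += persons
    for (dest, origin, strata), persons in out_flows.items():
        # An out-move is filed under COUNTY=origin, COUNTY2=destination.
        combined.setdefault((origin, dest, strata), [0, 0])[1] += persons
    return {key: tuple(counts) for key, counts in combined.items()}


# One in-move stratum and one out-move stratum taken from the examples.
in_flows = {("29001", "17163", "1111305"): 5}
out_flows = {("17163", "29001", "2112406"): 5}

for key, counts in sorted(combine_flows(in_flows, out_flows).items()):
    print(key, counts)
# ('29001', '17163', '1111305') (5, 0)
# ('29001', '17163', '2112406') (0, 5)
```

The two output records correspond to the first and last sample p1
records shown earlier for Adair and St. Clair counties.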
To obtain these files from CIESIN, follow the instructions below
for an anonymous ftp session. Alternatively, the files may be
retrieved directly via Mosaic.
FTP Retrieval:
ftp ftp.ciesin.org
(if this naming convention yields the error
"host unknown", try ...)
ftp 160.39.8.201
Name: ftp
(log in as user ftp)
Password: shere_khan@jungle.book.org
(email address as password)
ftp> cd /pub/census
(change directory to archive)
ftp> binary
(turn binary mode on, needed for *.zip files)
At this point informative messages (the readme files) will be
echoed to the screen as you change into directories. Descend into
the directories "usa" and "stp" to find the subdirectories "p1",
"p2" and "tf". You may retrieve as many files as you like. Note
that the ftp server does not currently support on-the-fly
decompression, so only compressed (ZIP) files can be retrieved.
If you need an unzip binary for your platform or require the
source code, retrieve the appropriate file from /pub/census/src.
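The interactive session above can also be scripted. The Python
sketch below uses the standard ftplib module and assumes the server
is reachable under the host name given in this document. The
function name fetch_stp_file, the placeholder email address, and
the file name you pass in are all hypothetical; actual file names
must be taken from the directory listing or the readme files.

```python
from ftplib import FTP


def fetch_stp_file(filename, subdir="tf", host="ftp.ciesin.org",
                   email="user@example.org", dest=None):
    """Download one compressed file from the p1, p2, or tf subdirectory.

    Logs in as user "ftp" with an email address as the password and
    transfers in binary mode, as required for *.zip files.
    """
    dest = dest or filename
    with FTP(host) as ftp:
        ftp.login(user="ftp", passwd=email)
        # Directory path follows the description in this document.
        ftp.cwd("/pub/census/usa/stp/" + subdir)
        with open(dest, "wb") as out:
            ftp.retrbinary("RETR " + filename, out.write)
    return dest
```

A call such as fetch_stp_file(name, subdir="p1") would save the
named file in the current directory; retrbinary always transfers in
binary mode, so no separate "binary" command is needed.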
Mosaic Retrieval:
Load URL into Mosaic
http://www.ciesin.org/
and follow the links "Data Access"
"Dataset Guides"
"US Demography"
and connect to the anonymous ftp archive of census data products.
Or, load the demography home page directly
http://www.ciesin.org/datasets/us-demog/us-demog-home.html
When you click on the link depicting the anonymous FTP server, you
may either traverse the directories yourself ("browse this archive")
or select the files you want to retrieve from the appropriate
section. When you select a file, Mosaic will prompt you for a local
path/filename combination under which to save it.
##################################################
If you have questions about the logistics of accessing these files
at CIESIN you can send an e-mail message to Henk Meij
(hmeij@ciesin.org) or ciesin.info@ciesin.org.
If you have questions about the content of the files (understanding
the data, not how to do custom applications!) you can send an email
message to John Blodgett (c1921@umslvma.umsl.edu).