* Enhanced Version of Migration Files *

The Special Tabulation Product (STP28) files (county to county 
migration tally, abbreviated format) are available from the 
US Bureau of the Census via their FTP server. The files may be 
retrieved by establishing an anonymous ftp session with the 
Bureau's server (ftp ftp.census.gov; login as user ftp; supply 
email address when prompted for a password; change directories 
to /pub/stp).

The files are posted by the Census Bureau in PKZIP self-extracting 
executable files containing one data file per state, one documentation 
file, one data file containing marginal totals, and one file 
presenting county-state names and FIPS codes.

The structure of these files may generate problems in processing.
For example:

- The files cannot be processed in mainframe environments before 
  decompression, so users may have to transfer the files twice: 
  once to a PC for decompression, and then up to the mainframe.

- The files contain a single data file per state with data on 
  in-migration to the state only (in addition to within-state 
  migration data).

- The table matrices for the race and race/not hispanic breakdowns 
  are grouped together in the same data file (P1 Race of Person: 
  white, black, asian, and other; and P2 Spanish Origin of Person: 
  not spanish, mexican, puerto rican, cuban, and other).

- The table matrices for P1 and P2 each contain only a single count 
  field, for the number of persons who moved between 1985 and 1990.

From these data and this file structure it becomes evident that:

- Calculating net migration, which includes the flow of persons 
  from the state of interest to other states, requires searching 
  the data files of all other states. This involves handling 50 
  self-extracting executable files and requires a fair amount of 
  free disk space.  For example, the data file for 
  Pennsylvania has no information about how many persons moved from 
  Pennsylvania to New York, Ohio, or California during the period 
  of 1985 to 1990.

- Similarly, large amounts of data need to be processed even if 
  only race of person or spanish origin of person is needed (i.e., 
  only access to the P1 or P2 table matrix is desired).

Henk Meij of CIESIN (Saginaw, MI) and John Blodgett of the University 
of Missouri St. Louis (MO SDC) have created an alternative set of 
files containing the same basic data but in a restructured format.  
The primary advantage of this new format is more easily obtainable 
data on both in- and out-migration for a specific geographic area.

The enhancements involve separation of the table matrices, addition 
of multiple count fields for in and out moves on the table matrices, 
and creation of a third file for net-migration flows. In all files, 
information about migration flows in both directions is stored on 
a single record for ease of processing.

The files are available via anonymous FTP and Mosaic at CIESIN (see 
below for details). The format of these enhanced files is described 
in detail below.  There are 3 separate files for each state.  

o The "p1" and "p2" files contain all the data from the original 
  census STP28 files, but the data by race (p1) and by hispanic 
  origin (p2) have been separated into individual state level files. 
  The p1 and p2 files have been enhanced by the creation of two count 
  fields per record instead of one: the POPIN field counts persons 
  moving into the county (COUNTY) from another county (COUNTY2), 
  while POPOUT counts persons moving in the opposite direction.  
  The header record type is gone, and the two county codes appear 
  in every record, a tradeoff that involves a little more storage 
  but a lot more processing convenience when needing to subset the 
  data.

o The third state level file, "tf", contains the "total flows" of 
  persons for all states (into the county, out of the county and 
  within county moves).  The total flows are reported for all counties 
  within a state, and all counties outside of the state which were 
  sources or destinations of moves involving the state of interest. 
  This file was created by aggregating the more detailed data in 
  the P1 file.
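
As an illustration of that aggregation step, here is a minimal Python 
sketch. It is not the procedure actually used to build the "tf" files; 
it only assumes the p1 record layout given by the SAS INPUT statement 
in this document (county codes in columns 1-5 and 6-10, stratifiers in 
columns 11-17, then the two free-form counts) and reuses the sample p1 
records shown in the examples section. The function name total_flows 
is hypothetical.

```python
from collections import defaultdict


def total_flows(p1_lines):
    """Sum POPIN and POPOUT over all stratifier cells for each county pair."""
    totals = defaultdict(lambda: [0, 0])
    for line in p1_lines:
        county, county2 = line[0:5], line[5:10]
        # Columns 11-17 hold the stratifiers; the counts follow, blank-delimited.
        popin, popout = (int(value) for value in line[17:].split())
        totals[(county, county2)][0] += popin
        totals[(county, county2)][1] += popout
    return {pair: tuple(counts) for pair, counts in totals.items()}


# Sample p1 records for Adair County, MO (29001) and St. Clair County, IL (17163).
sample_p1 = [
    "29001171631111305 5 0",
    "29001171631113204 7 0",
    "29001171631113304 6 0",
    "29001171631113305 8 0",
    "29001171631211205 4 0",
    "29001171632111304 9 0",
    "29001171632112406 0 5",
]

flows = total_flows(sample_p1)
print(flows[("29001", "17163")])  # (39, 5)
```

The result agrees with the sample "tf" record for the same county pair 
(39 persons in, 5 persons out).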

The layout of these files is quite simple. The following SAS (r) 
statements can be used to read each of the files. The codes for the 
stratifier fields are identical to those used by the Census Bureau 
(consult the codebook).

/* read tf record */

/* read p1 record */
     INPUT COUNTY $1-5 COUNTY2 $6-10
           RACE $11 SEX $12 NATIVITY $13 POVSTAT $14 EDUC $15
           AGE $16-17 POPIN POPOUT;

/* read p2 record */
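     /* same as p1 record, with SPANISH substituted for RACE */
     INPUT COUNTY $1-5 COUNTY2 $6-10
           SPANISH $11 SEX $12 NATIVITY $13 POVSTAT $14 EDUC $15
           AGE $16-17 POPIN POPOUT;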

Note that the two population count fields (POPIN and POPOUT) are 
read as "free form" instead of fixed-length fields.  These fields 
will only be as long as their values require with one leading blank 
as delimiter.
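
For users without SAS, the same mixed layout (fixed columns followed 
by blank-delimited counts) can be read with a few lines of Python. 
This is only a sketch that mirrors the p1 INPUT statement above; the 
names P1Record and parse_p1 are hypothetical and not part of the 
distributed files.

```python
from typing import NamedTuple


class P1Record(NamedTuple):
    county: str
    county2: str
    race: str
    sex: str
    nativity: str
    povstat: str
    educ: str
    age: str
    popin: int
    popout: int


def parse_p1(line: str) -> P1Record:
    """Split one p1 record into its fixed-column codes and free-form counts."""
    fixed, counts = line[:17], line[17:]
    # POPIN and POPOUT are free form: each is preceded by one blank.
    popin, popout = (int(value) for value in counts.split())
    return P1Record(
        county=fixed[0:5],
        county2=fixed[5:10],
        race=fixed[10],
        sex=fixed[11],
        nativity=fixed[12],
        povstat=fixed[13],
        educ=fixed[14],
        age=fixed[15:17],
        popin=popin,
        popout=popout,
    )


rec = parse_p1("29001171631111305 5 0")
print(rec.county, rec.county2, rec.age, rec.popin, rec.popout)
```

For a p2 record the eleventh column holds the SPANISH code instead of 
RACE; the rest of the layout is unchanged.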

These variables may be defined as (again SAS (r) syntax example):

/* assign labels */

Examples of the data are presented next for the three files
mentioned above involving the counties Adair County, MO (29001) and
St. Clair County, IL (17163) from the MO files:

/* p1 file */
29001171631111305 5 0
29001171631113204 7 0
29001171631113304 6 0
29001171631113305 8 0
29001171631211205 4 0
29001171632111304 9 0
29001171632112406 0 5

/* p2 file */
29001171631111304 9 0
29001171631111305 5 0
29001171631112406 0 5
29001171631113204 7 0
29001171631113304 6 0
29001171631113305 8 0
29001171631211205 4 0

/* tf file */
2900100000 11641 0		Total number of persons in county(1)
2900129001 4719 4719		Total moves within county 29001.
2900117163 39 5			Total moves between counties in example.
1716329001 5 39			Same record but from IL file.

In this example a "total flows" record indicates migration of 39 
persons moving into Adair from St. Clair, while 5 persons moved 
to St. Clair from Adair(2).

	1-A value of '00000' in the 2nd county field indicates data 
	  pertaining to persons over 5 who resided in the county and 
	  did not move between 1985 and 1990 (this is undocumented 
	  information and needs verification by Bill Frey and/or 
	  Census Bureau)
	2-if there are no moves between two counties there will be 
	  no record for that county pair
	3-a move in one direction but none in the opposite direction
	  is indicated by a '0' in either the POPIN or POPOUT field.
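
As a rough illustration of how the "tf" records can be used, the 
following Python sketch computes net migration (POPIN minus POPOUT) 
for one county from the sample records above. It is a sketch under 
stated assumptions, not an official procedure: records whose second 
county code is '00000' are skipped (their meaning is unverified, see 
note 1), within-county moves are skipped because they cancel out, and 
county pairs with no record are treated as zero flow (note 2). The 
function names parse_tf and net_migration are hypothetical.

```python
def parse_tf(line):
    """Return (county, county2, popin, popout) from one tf record.

    Only the first three blank-delimited fields are used, so trailing
    annotations such as those in the listing above are ignored.
    """
    fields = line.split()
    codes, popin, popout = fields[0], int(fields[1]), int(fields[2])
    return codes[0:5], codes[5:10], popin, popout


def net_migration(tf_lines, county):
    """Sum POPIN - POPOUT over all other counties exchanging movers with `county`."""
    net = 0
    for line in tf_lines:
        c, c2, popin, popout = parse_tf(line)
        if c != county:
            continue  # record belongs to another county of interest
        if c2 == "00000" or c2 == county:
            continue  # unverified total record, or within-county moves
        net += popin - popout
    return net


sample_tf = [
    "2900100000 11641 0",
    "2900129001 4719 4719",
    "2900117163 39 5",
]

print(net_migration(sample_tf, "29001"))  # 34
```

With only the St. Clair pair present, the net gain for Adair County is 
39 - 5 = 34 persons; a complete state file would contribute one term 
for every county pair that has a record.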

There is one type of file which had to be created to generate the 
total flows files, the interstate migration files, or "im" files. 
The data structure is identical to the original Bureau of the Census 
files. The "im" files contain all migration from a state of interest 
to all other states. In other words, by combining the "im" file for 
a particular state with the original data file for that state, the 
p1, p2 and tf files may be created. An example of these data 
involving the counties mentioned above is presented next.

/* im file */
12112406     5
21112406     5
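
The combining step can be pictured with a small Python sketch. It 
does not parse the original Census or "im" record formats; instead it 
assumes, purely for illustration, that the in-moves (from the original 
state file) and the out-moves (from the "im" file) have already been 
loaded into dictionaries keyed by county of interest, other county, 
and the 7-character stratifier string. The function name merge_flows 
and this in-memory layout are hypothetical.

```python
def merge_flows(inflows, outflows):
    """Build one enhanced record per (county, county2, strata) key.

    A key present in only one direction gets a 0 count for the other,
    and a key present in neither produces no record at all.
    """
    records = []
    for key in sorted(set(inflows) | set(outflows)):
        county, county2, strata = key
        popin = inflows.get(key, 0)
        popout = outflows.get(key, 0)
        records.append(f"{county}{county2}{strata} {popin} {popout}")
    return records


# In-moves to Adair (29001) from St. Clair (17163), as found in the MO file.
inflows = {("29001", "17163", "1111305"): 5}
# Out-moves from Adair to St. Clair, as supplied by the MO "im" file.
outflows = {("29001", "17163", "2112406"): 5}

for record in merge_flows(inflows, outflows):
    print(record)
```

The two lines printed correspond to the first and last records of the 
p1 example shown earlier.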

To obtain these files from CIESIN, follow the instructions below 
for an anonymous ftp session. Alternatively, the files may be 
retrieved directly via Mosaic.

FTP Retrieval:

	ftp ftp.ciesin.org
	(if this naming convention yields the error 
	 "host unknown", try ...)

	Name: ftp
	(log in as user ftp)

	Password: shere_khan@jungle.book.org
	(email address as password)

	ftp> cd /pub/census
	(change directory to archive)

	ftp> binary
	(turn binary mode on, needed for *.zip files)

At this point informative messages (the readme files) will be 
echoed to the screen when changing into directories. Descend into 
the directories "usa" and "stp" to find the subdirectories "p1", 
"p2" and "tf". You may retrieve as many files as you like. Note that 
the ftp server does not currently support on-the-fly decompression, 
so only compressed (ZIP) files can be retrieved. 
If you need an unzip binary for your platform or require 
the source code, retrieve the appropriate file from 

Mosaic Retrieval: 

	Load URL into Mosaic


	and follow the links "Data Access"
				   "Dataset Guides"
					  "US Demography"

	and connect to the anonymous ftp archive of census data products.
	Or, load the demography home page directly


When you click on the link depicting the anonymous FTP server, you 
may either traverse the directories yourself ("browse this archive") 
or you may select the files you want to retrieve from the appropriate 
section.  When selected, Mosaic will prompt you for a local 
path/filename combination to save the retrieved file.


If you have questions about the logistics of accessing these files 
at CIESIN you can send an e-mail message to Henk Meij 
(hmeij@ciesin.org) or ciesin.info@ciesin.org. 

If you have questions about the content of the files (understanding 
the data, not how to do custom applications!) you can send an email 
message to John Blodgett (c1921@umslvma.umsl.edu).