Data dictionary and -infile-: Which data input command? (Part 3)

A piece of FIES


As we mentioned in Part 2 of this series, -infile- is used to load fixed-format data into Stata. But what is a fixed-format (also called fixed-width) data file? It is a text file in which each variable occupies a fixed range of columns, so the layout is defined by the length of each variable. For example, each line of the following data is 29 columns wide: columns 1-4 are allotted to the 4-digit year, columns 5-19 to the country name, and columns 20-29 to the population figure (I use underscores to stand for the blank spaces that pad each variable to its allotted width):

1960Philippines____27053834__
1960Thailand_______27652013__
2007Philippines____87892094__
2007Thailand_______63832135__

A data dictionary, a text file that instructs Stata how to read and store the contents of the data file, is required to load fixed-format data. The dictionary and the data may be in a single file or in two separate files. For each variable, the dictionary contains the following information: storage type, variable name, input format, and starting column (or the length of the variable). It may also contain a label for each variable. For example, we can write a dictionary for our population data above as:

/*beginning of dictionary*/
dictionary {
_column(1) int year %4f "Year"
_column(5) str15 country %15s "Country"
_column(20) long population %10f "Population"
}
/*end of dictionary; beginning of data*/
1960Philippines____27053834__
1960Thailand_______27652013__
2007Philippines____87892094__
2007Thailand_______63832135__
/*end of data (Note: the underscores only represent the blank spaces that fill the columns allotted to each variable)*/

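We can save this as the file population.dct (with actual blanks in place of the underscores, and without the /* ... */ annotation lines, which are notes for the reader rather than part of the file) and load it into Stata by typing: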

infile using population.dct, clear
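
The dictionary can also rely on the length of each variable instead of its starting column. The version below is a minimal sketch of that alternative for the same population data. It assumes the fields sit back to back with no gaps, so that the widths in the input formats (%4f, %15s, %10f) are enough for Stata to locate each one. The filename population2.dct is hypothetical, and the four data lines would follow the closing brace exactly as before. I use long for population here because counts this large exceed the roughly 7 digits of precision that float can hold.

```stata
dictionary {
    * no _column() directives: each variable starts where the previous one ends
    int   year       %4f  "Year"
    str15 country    %15s "Country"
    long  population %10f "Population"
}
```

It would be loaded the same way, with -infile using population2.dct, clear-, and running -describe- and -list- afterwards is a quick way to check that each variable picked up the right columns.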

The Family Income and Expenditure Survey (FIES) of the Philippines is another example of fixed-width data, this time with the dictionary in a separate file. The image above, a string of numbers broken by blank spaces, is taken from a FIES data file. You can load the data file, fies.dat, into Stata with the help of the data dictionary that accompanies it, fies.dct:

infile using fies.dct, using(fies.dat) clear
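
To see the two-file layout on a small scale, the population example can be split the same way. This is only a sketch: population_only.dct and population.dat are hypothetical filenames, and population.dat is assumed to hold nothing but the four data lines, with real blanks in place of the underscores. The dictionary file then contains the dictionary and no data:

```stata
dictionary {
    * this file holds the dictionary only; the data are in population.dat
    _column(1)  int   year       %4f  "Year"
    _column(5)  str15 country    %15s "Country"
    _column(20) long  population %10f "Population"
}
```

With both files in the working directory, the data would be loaded by naming the dictionary first and the data file in the using() option:

```stata
infile using population_only.dct, using(population.dat) clear
```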
