-forvalues- and the other -use- syntax

-use-, which loads a Stata-format dataset (*.dta), is most likely the first command one learns in Stata. Every Stata user masters the syntax:

use filename, [clear]         /* loads file filename */

[Note: Stata ignores everything between /* and */]

But the other -use- syntax:

use [varlist] [in] [if] using filename, [clear]         //loads a subset of file filename

is most often ignored.

[Note: Stata ignores everything in the line after //]
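As a small illustration of this second syntax, here is a sketch that uses the auto dataset shipped with Stata rather than a real census file. The temporary file name cars is only a placeholder for this example; on your own data you would name your .dta file directly after -using-.

```stata
* Build a small example file from a dataset that ships with Stata
sysuse auto, clear
tempfile cars
save `cars'

* Load only four variables, and only the foreign cars
use make price mpg foreign if foreign == 1 using `cars', clear
describe, short

* Load all variables, but only the first 10 observations
use in 1/10 using `cars', clear
count
```

In each case only the requested subset ends up in memory; the file on disk is left untouched.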

When is this other -use- syntax practical? If you are working with small data files (well under 1 GB), you will have no trouble processing them on today's ordinary 32- or 64-bit computers, even with 32-bit Stata. With larger files, however, opening the whole dataset can be very slow or outright impossible. A 32-bit computer, even one with 4 GB of RAM, allocates only limited memory to each individual task; and if you run 32-bit Stata on a 64-bit computer, there is not much you can do to expand the memory available to Stata. But who says you need to open the huge file all at once? You can open it in batches using the other -use- syntax! For example, you can open the dataset census.dta in batches according to its geographic variable, reg, which takes integer values 1 to 20:

forvalues i=1/20 {
use if reg==`i' using census.dta, clear   /* loads all observations with reg==`i' */
save census`i'.dta, replace               /* saves the subset file to census`i'.dta */
}

The first iteration of the loop loads all observations with reg==1 and saves them to census1.dta; the second iteration loads all observations with reg==2 and saves them to census2.dta; and so on.
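If you do not have a census file at hand, the same batch pattern can be tried on the auto dataset shipped with Stata. This is only a sketch under assumed names: rep78 (integer values 1 to 5) stands in for reg, and the output names cars_rep1.dta, cars_rep2.dta, and so on are made up for this example.

```stata
* Build a small example file from a dataset that ships with Stata
sysuse auto, clear
tempfile cars
save `cars'

* One batch per value of rep78, each saved to its own file
forvalues i = 1/5 {
    use if rep78 == `i' using `cars', clear
    display "rep78 = `i': " _N " observations loaded"
    save cars_rep`i'.dta, replace
}
```

Observations with a missing rep78 match none of the five conditions, so they end up in no batch. The same caveat applies to any grouping variable with missing values.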

[Note: the two punctuation marks that enclose i are not the same. The first (`) is a backquote (the key below the tilde, ~, on the keyboard) and the second (') is the apostrophe (the key below the double quotation mark, ").]
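To see what the backquote and apostrophe pair does, you can set a local macro by hand and display a string that refers to it. This is a minimal sketch; the value 3 is arbitrary, and both lines should be run together (for example from the same do-file), since a local macro only exists where it was defined.

```stata
* Define a local macro named i, then let Stata substitute its value
local i 3
display "census`i'.dta"
```

Stata replaces `i' with the macro's contents before running the command, so the line displays census3.dta. Inside -forvalues-, the same substitution happens on every pass with the current value of i.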

Related post: -foreach- for all
