In "Preserving numerical format after string transformation", we used -substr- to generate a new variable that contains the first 2 digits of a numerical code. We transformed our numeric variable, *code*, into a string because, as we said, P-E-M-D-A-S will not do the trick. To recall, we used the following command:

**gen** *string3*=substr(string(*code*, "*%06.0f*"), *1*, *2*)

While it is true that the basic arithmetic operations alone will not help, there is a function in Stata that returns the same first 2 digits as a number: the int() function. int(x) truncates x toward zero, that is, it returns the integer obtained by dropping the digits after the decimal point of the number x. For example:

**gen** *string4*=int(*code*/10000) /* This will return the first 2 digits of the 6-digit code. */

Thus, if *code*==250001, *string4*==25; and if *code*==10001, *string4*==1.
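As a minimal sketch of how the two approaches compare, the do-file fragment below builds a tiny dataset with the two example codes and applies both commands. The variable names *first2*, *first2_str*, and *via_substr* are made up for illustration and do not come from the original post. It assumes that *code* is a numeric code of at most 6 digits. Because int() returns a number, the leading zero is gone; wrapping the result in string() with a %02.0f format brings it back:

```stata
* Illustrative data: the two example codes used in the text
clear
input long code
250001
10001
end

* Numeric route: first 2 digits as a number (25 and 1)
gen first2 = int(code/10000)

* Hypothetical extra step: pad back to 2 characters ("25" and "01")
gen str2 first2_str = string(first2, "%02.0f")

* String route from the earlier post, for comparison
gen str2 via_substr = substr(string(code, "%06.0f"), 1, 2)

* Both routes should agree once the zero padding is restored
assert first2_str == via_substr
list
```

Which form to keep depends on what comes next: the numeric version is convenient for comparisons and merges on numbers, the padded string for labels or for concatenating codes.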

[Note: int() does not round the number x to the nearest integer; for this, use the function round(). For other math functions, type: “help math functions”.]
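To make the difference concrete, here is a small sketch using -display- with a few arbitrary values (the numbers are chosen only for illustration). floor() is included because it differs from int() for negative numbers: int() truncates toward zero, while floor() always rounds down.

```stata
* int() drops the decimals; round() goes to the nearest integer
* Shows 2
display int(2.7)
* Shows 3
display round(2.7)

* For negative numbers, int() truncates toward zero, floor() rounds down
* Shows -2
display int(-2.7)
* Shows -3
display floor(-2.7)
```

For the non-negative codes used in this post, int() and floor() give the same result.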

Now, suppose you want to keep the last 4 digits. Will int() help? Yes:

**gen** *string5*=*code*-int(*code*/10000)*10000

If *code*==250001, *string5*==1 (*string5*==250001-250000); and if *code*==10001, *string5*==1 (*string5*==10001-10000).

Similarly, you can use modular arithmetic (also called “remainder arithmetic”). In Stata, the modulo function is mod():

**gen** *string6*=mod(*code*, 10000)

If *code*==250001, mod(250001,10000)==1.
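The two ways of keeping the last 4 digits can be checked against each other. The sketch below is only an illustration: the variable names *last4_int* and *last4_mod* and the third code, 123456, are assumptions added here so that the result is not always 1.

```stata
* Illustrative data: the two codes from the text plus one extra value
clear
input long code
250001
10001
123456
end

* Last 4 digits by subtracting the truncated leading part
gen last4_int = code - int(code/10000)*10000

* Last 4 digits as the remainder after dividing by 10000
gen last4_mod = mod(code, 10000)

* The two should match on every row (1, 1, and 3456 here)
assert last4_int == last4_mod
list
```

The mod() version is shorter and states the intent directly; the int() version shows the arithmetic behind it.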
