Concatenating string variables


To concatenate is to join the characters of two or more variables end to end. In Stata, this can be done with either -gen- or -egen-. As an example, suppose we have the variables var1, var2, and var3. var1 and var2 are string variables, while var3 is numeric. The values in the first observation (_n==1) are the following:

var1 var2 var3
aaa  bbb  1111

gen var4 = var1 + var2 /* concatenates var1 and var2 */

var1 var2 var3 var4
aaa  bbb  1111 aaabbb

But,

gen var5 = var1 + var3

will return a “type mismatch” error because var3 is numeric, not string. To concatenate var1 and var3, either convert var3 to a string first or use -egen-.

tostring var3, replace
gen var5 = var1 + var3

OR

egen var5 = concat(var1 var3)

-egen concat()- automatically converts the values of var3 to strings so that they can be joined with var1. var3 itself stays numeric.
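
If you would rather not overwrite var3 with -tostring, replace-, the conversion can also be done without touching the original variable. The following is a minimal sketch, assuming a one-row dataset like the one above; the names var5b and var3_str are made up for illustration.

```stata
* Hypothetical one-row dataset matching the example above
clear
input str3 var1 str3 var2 var3
"aaa" "bbb" 1111
end

* string() converts the numeric value on the fly; var3 itself stays numeric
gen var5 = var1 + string(var3)

* Alternative: keep a converted copy in a new variable (var3_str is an assumed name)
tostring var3, generate(var3_str)
gen var5b = var1 + var3_str

list
```

Both approaches leave var3 available for numeric work later, which -tostring, replace- does not.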

You may also concatenate the variables with other characters or strings. For example:

gen var5 = var1 + "-" + var2
OR
egen var5 = concat(var1 var2), punct("-")

var1 var2 var3 var5
aaa  bbb  1111 aaa-bbb

gen var6 = var1 + " to " + var2
OR
egen var6 = concat(var1 var2), punct(" to ")

var1 var2 var3 var6
aaa  bbb  1111 aaa to bbb

Type -help egen- for more concat() options.
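
As an illustration of two of those options, here is a small sketch using format() and decode. It assumes the same one-row dataset as above; the variable names var7 and var8 and the value label grp are hypothetical.

```stata
* Hypothetical one-row dataset matching the example above
clear
input str3 var1 str3 var2 var3
"aaa" "bbb" 1111
end

* format() sets the numeric format used when var3 is converted;
* %05.0f pads the number with leading zeros to five digits
egen var7 = concat(var1 var3), punct("_") format(%05.0f)

* decode uses the value label of a numeric variable instead of its number
* (the label grp and its text are made up for this sketch)
label define grp 1111 "north"
label values var3 grp
egen var8 = concat(var1 var3), decode punct(" ")

list
```

format() is handy for building fixed-width identifiers, and decode is useful when the numeric variable is really a labeled category.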
