Saving variable labels before -collapse-

-collapse- literally collapses the dataset in memory into a dataset of summary statistics. After -collapse-, the original data are lost unless -preserve- was declared first. Also, the labels of all variables in clist are replaced with "(stat) varname", where stat can be mean, sum, etc. (see -help collapse-).

Instead of retyping all the variable labels, you can use the extended macro function -variable label- (see -help extended_fcn-) to save the label of each variable in the varlist into a local macro before -collapse-. Afterwards, restore the labels using -label variable- (see -help label-). Example:
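sysuse auto, clear

* store each variable's label in a local macro named vlab<varname>
foreach var of varlist * {
    local vlab`var' : variable label `var'
}

collapse price-gear_ratio, by(foreign)

* put the saved labels back on the variables that survive the collapse
foreach var of varlist * {
    label var `var' "`vlab`var''"
}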
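
If the collapsed data are only needed temporarily, the -preserve- route mentioned above is another option. Here is a minimal sketch; the choice of price and mpg is only for illustration.

```stata
sysuse auto, clear

* keep a copy of the full data so that -collapse- does not destroy it
preserve
collapse (mean) price mpg, by(foreign)
list                      // one row per level of foreign
restore                   // the original data, labels included, are back in memory

describe, short
```

Because -restore- brings back the original dataset as it was, the variable labels come back with it, so no macros are needed in this case.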

Counting occurrence of strings within strings

Somebody asked how to count the number of occurrences of a string within a string. For example, given the following data, I want to generate new variables countSS, countSM, and countSG that contain the number of occurrences of "SS", "SM", and "SG" in the variable awards.

input id str40 awards
1    "SS; SS; SM; SG"
2    "SM; SG"
3    "SG; SG; SG; SS"
4    "SS; SS; SG; SG; SS; SM; SG"
end

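Here is one solution using the extended macro function -subinstr- (see -help extended_fcn-). Replacing each string with itself leaves the text unchanged, but the count() option records how many replacements were made.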

local tocount SS SM SG
foreach t of local tocount {
    gen count`t' = 0
    local N = _N
    forvalues i = 1/`N' {
        * copy the value of awards in observation i into a local macro
        local a = awards[`i']
        * replace `t' with itself; count() stores the number of matches in c2
        local c : subinstr local a "`t'" "`t'", all count(local c2)
        replace count`t' = `c2' in `i'
    }
}
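
The same counts can also be obtained without looping over observations. This is not the extended macro function approach of this post, only a sketch of an alternative that uses the string functions strlen() and subinstr(), assuming a reasonably recent Stata in which strlen() is available. The idea: delete every occurrence of the target string, see how many characters disappeared, and divide by the length of the target.

```stata
clear
input id str40 awards
1    "SS; SS; SM; SG"
2    "SM; SG"
3    "SG; SG; SG; SS"
4    "SS; SS; SG; SG; SS; SM; SG"
end

foreach t in SS SM SG {
    * characters lost when all copies of `t' are removed, divided by the length of `t'
    gen count`t' = (strlen(awards) - strlen(subinstr(awards, "`t'", "", .))) / strlen("`t'")
}

list
```

Like the macro version, this counts non-overlapping occurrences, which is what you want for codes separated by semicolons.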

*Thanks to Jacob Reynolds for the question. For the best advice on Stata, though, Statalist is the best place to ask :). See "Stuck? Hello Statalist".

File names in a local macro

Last week I needed to convert a number of Stata data files into text files so that they could be uploaded to Google Docs (why Google Docs is another story). If my file names had followed a specific pattern, such as:


-forvalues- would have done the trick.

forvalues y = 1/9 {
    use "data200`y'.dta", clear
    outfile using "data200`y'.txt", dictionary replace
}

My file names, however, look like the file names you get when you type -sysuse dir-:


If there were only a few of them, it would have been all right to place them in a local macro by listing all the file names.

local datafiles auto autornd bplong bpwide cancer
foreach file of local datafiles {
    use `file'.dta, clear
    outfile using "`file'.txt", dictionary replace
}

This, however, is not efficient when there are many files, since every file name has to be typed by hand before it can be stored in a macro. Here is where Stata's extended macro functions (see -help extended_fcn-) come to the rescue. Stata has exactly the right function for what I wanted to do.

local datafiles : dir . files "*.dta"
foreach file of local datafiles {
    local filenew : subinstr local file ".dta" ".txt"
    * the quotes guard against file names that contain spaces
    use "`file'", clear
    outfile using "`filenew'", dictionary replace
}

The line -local datafiles : dir . files "*.dta"- makes a list, named datafiles, of all files with the extension .dta in the current directory.

The line -local filenew : subinstr local file ".dta" ".txt"- replaces the extension .dta with .txt in the current file name and stores the result in a new local macro named filenew. Since it sits inside the loop, filenew holds one new file name per iteration.