Creative destruction: collapse and contract


Creative destruction, a term popularized by Joseph Schumpeter in Capitalism, Socialism and Democracy, refers to the process by which innovations displace older, less efficient products or processes. Here, though, we mean something else: destroying data to create more useful information. By destroying, we mean altering the data currently loaded in memory, with no undo button to rely on. When you load or open data in Stata, Stata stores it in your machine’s RAM. Changes you make are therefore not written to your hard drive until you call save; even so, be careful not to overwrite your raw data files when you do.
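
As a minimal sketch of what "no undo button" means in practice, the lines below wrap two destructive commands in preserve and restore so the original data can be brought back. The auto dataset and the choice of variables are only illustrative assumptions:

```stata
sysuse auto, clear

* Take a snapshot of the data currently in memory
preserve

* Destructive step: the cars are replaced by one row per value of foreign
collapse (mean) price mpg, by(foreign)
list

* Bring back the snapshot taken by preserve
restore

* contract is destructive in the same way: it keeps one row per
* combination of foreign and rep78, plus a frequency variable (_freq)
preserve
contract foreign rep78
list
restore
```

Without preserve, the only way back to the original data would be to reload it from disk.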

Saving variable labels before -collapse-


collapse literally collapses the dataset in memory into a dataset of summary statistics. After collapse, the original data in memory are lost unless -preserve- was issued beforehand. In addition, the label of every variable in clist is replaced with "(stat) varname", where stat is mean, sum, etc. (see help collapse).
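
To see the relabeling described above, a short sketch (again assuming the auto dataset, with two arbitrarily chosen variables) is to compare describe output before and after collapse:

```stata
sysuse auto, clear

* Labels as shipped with the dataset
describe price mpg

* Collapse with two different statistics
collapse (mean) price (sum) mpg, by(foreign)

* The labels now carry the statistic and variable name
* in place of the original descriptive text
describe price mpg
```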

Instead of retyping all the variable labels, you can use the extended macro function var label (see help extended_fcn) to save the label of each variable into a local macro before collapse, then restore the labels afterward with label var (see help label). Example:
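sysuse auto, clear

* Save each variable's label in a local macro named vlab<varname>
foreach var of varlist * {
    local vlab`var': var label `var'
}

* Default statistic is the mean; labels become "(mean) varname"
collapse price-gear_ratio, by(foreign)

* Reattach the saved labels; compound quotes guard against labels containing quotes
foreach var of varlist * {
    label var `var' `"`vlab`var''"'
}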