Can’t get enough of factor variables


Following up on yesterday’s blog post, here are three links to useful sites/files that provide a very good introduction to Stata’s factor variables:

Christopher Baum’s Factor Variables and Marginal Effects in Stata 11

Michael Mitchell’s Stata Tidbit of the Week – What the ##?

UCLA ATS’s What’s new in Stata 11: Factor variables, margins and interactions

Getting to know “factor variables”


This is an update to the earlier post “i. without the prefix -xi-”. So the i.’s (or “i options”, as Joe Glass called them) have a name. Stata calls them “factor variables”, and there is more to them than just i. See -help fvvarlist- for the documentation and some very helpful examples.
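To give a flavor of what else is there, here is a minimal sketch using the auto dataset. It assumes Stata 11 or later, and the choice of 3 as the base level and of foreign and mpg as the other variables is purely for illustration.

```stata
sysuse auto, clear

* ib3. makes category 3 of rep78 the base (omitted) level
regress price ib3.rep78

* ## includes both main effects plus their interaction;
* c. marks mpg as a continuous variable
regress price i.foreign##c.mpg

* # alone includes only the interaction terms
* (here, a separate mpg slope for each level of foreign)
regress price i.foreign#c.mpg
```

None of these commands generates new variables in the dataset.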

i. without the prefix -xi-


This blog (along with the pile of books to read, the GRE reviewers to go through, the TV shows to watch… this list will never end) has been neglected for more than a week. Anyway… here is i.

i. is usually used with the -xi- prefix. But it may also be used without -xi-. i. allows you to include dummy variables for a categorical variable without explicitly generating new variables for each category.

For example:

sysuse auto, clear
reg price i.rep78

is the same as generating dummy variables for each of the n=5 categories of rep78 first and including n-1 of them in the regression:

sysuse auto, clear
tab rep78, gen(rep78_)
reg price rep78_2 rep78_3 rep78_4 rep78_5

By default, the category with the lowest value (in this case, rep78 = 1) is omitted. Unlike -tab, gen()-, i. does not generate any new variables. Without the -xi- prefix, however, the use of i. is limited to only one of the four forms of dummy-variable creation allowed with -xi-. With -xi-, it is possible to specify interactions directly. Also, with -xi-, it is possible to choose the omitted category. See -help xi-.
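As a rough sketch of those two -xi- features, again with the auto dataset (the interaction with mpg and the choice of category 3 are just for illustration):

```stata
sysuse auto, clear

* i.rep78*mpg asks -xi- for the rep78 dummies, mpg, and their interactions
xi: regress price i.rep78*mpg

* the omit characteristic tells -xi- which category to leave out (here, 3)
char rep78[omit] 3
xi: regress price i.rep78
```

Note that -xi- does add the dummy variables it creates (prefixed with _I) to the dataset.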