Color palette


There is no better way to choose colors than to see the actual colors, rather than just the list of color names you get when you type:

help colorstyle

Stata has a built-in color viewer, -palette color-, but this command can show only up to two colors at a time. Figure 1 results from typing:

palette color blue khaki



There are at least 2 user-written programs to display color palettes: Nick Winter’s -full_palette- and Michael Mitchell’s -vgcolormap-. -full_palette- displays a palette of Stata’s graph colors, with their names and RGB numbers (Figure 2). -vgcolormap- displays Stata’s colors and their names (Figure 3).

To use -full_palette-, install the program by typing:

ssc install full_palette

then type (start with an empty data set):

full_palette

Figure 2: full_palette


To use -vgcolormap-, install the following:

net from http://www.stata-press.com/data/vgsg
net install vgsg

then type:

vgcolormap

Figure 3: vgcolormap


Note that by installing vgsg, you are not only installing vgcolormap, but also other resources associated with the book A Visual Guide to Stata Graphics, Third Edition. To find out what has been installed, type:

whelp vgsg
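
Once you have picked a color from one of these palettes, you can pass it to a graph option either by name or by its RGB numbers. The lines below are only a small sketch using Stata's bundled auto dataset; the particular RGB triplet and the graph names are arbitrary choices made for illustration.

```stata
* Load an example dataset that ships with Stata
sysuse auto, clear

* A color referred to by its name
scatter mpg weight, mcolor(khaki) name(named_color, replace)

* A color referred to by its RGB numbers (this triplet is an arbitrary example)
scatter mpg weight, mcolor("0 90 160") name(rgb_color, replace)

* A named color at half intensity
scatter mpg weight, mcolor(blue*0.5) name(faded_color, replace)
```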

Stacked bars


-graph bar- and -graph twoway bar- draw (vertical) bar charts in Stata. -graph bar- draws bar charts over a categorical X variable and has more options than -graph twoway bar-, which draws bar charts with numerical X and Y values. Here, we use -graph bar- to draw stacked bar graphs (figures 1 and 2).

In figure 1, we show the composition of Philippine exports according to Leamer’s classification of tradeable goods in different time periods. The command to draw this graph is:

Figure 1


#delimit ;
graph bar (sum) value,
over(leamer)
over(year)
asyvars stack
legend(cols(3) size(vsmall))
ytitle("Export Value" "(in million US$)")
ylabel(, angle(0) format(%12.0gc))
title(Philippines)
subtitle(1965-2007);
#delimit cr

The (stat) option, here (sum), adds up the export values of the commodities that belong to each group in each year, as specified in our over() options. The (stat) option, which accepts the statistics used by the command -collapse- (e.g., mean (the default), count, max), is very convenient because it saves you from constructing another dataset just to draw the graphs.

By typing “asyvars” and “stack”, we specify that the groups of the variable in the first over() option are treated as separate y-variables, and that these y-variables are drawn as stacked bars. Note that, with a long dataset, the command:

graph bar value, over(leamer) asyvars

will draw the same graph as:

graph bar value_leamer1 value_leamer2 … value_leamer10

with a wide dataset.
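
A minimal sketch of this long/wide equivalence, using a made-up toy dataset (the numbers are invented for illustration, and over(year) is added to both commands so that the bars are grouped by year; after -reshape-, the wide variables are named value1 and value2):

```stata
* Toy long dataset: one row per year and group (values are made up)
clear
input year leamer value
2000 1 10
2000 2 20
2005 1 15
2005 2 25
end

* Long form: the groups of leamer become the y-variables
graph bar (sum) value, over(leamer) over(year) asyvars stack name(long_form, replace)

* Wide form: one variable per group (value1, value2)
reshape wide value, i(year) j(leamer)
graph bar (sum) value1 value2, over(year) stack name(wide_form, replace)
```

The two graphs should show the same bars; only the default legend labels differ.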

The ordering of over() matters. In figure 2, we have interchanged the order of over(leamer) and over(year).

Figure 2


#delimit ;
graph bar (sum) value,
percentages
over(year)
over(leamer, label(angle(90)))
asyvars stack
legend(cols(5) size(small) colfirst)
ytitle("Export Share, %")
ylabel(, angle(0) format(%10.0gc))
title(Philippines)
subtitle(by Leamer’s classification)
;
#delimit cr

By specifying “percentages” for figure 2, we are specifying that the share of each value in the total, instead of the absolute values, will be reported.
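
To see what “percentages” computes, it can help to work out the shares by hand and compare them with the graph. The sketch below uses a made-up toy dataset and the ordering of figure 1, so each bar is a year and each stacked segment is a group; the variables total and share are names introduced here only for illustration.

```stata
* Toy long dataset (values are made up)
clear
input year leamer value
2000 1 10
2000 2 30
2005 1 20
2005 2 20
end

* Share of each group in its year's total, computed by hand
bysort year: egen double total = total(value)
generate double share = 100 * value / total
list year leamer value share, sepby(year)

* The stacked segments drawn here should match the share variable
graph bar (sum) value, over(leamer) over(year) asyvars stack percentages
```

With these numbers, the shares are 25 and 75 percent in 2000, and 50 and 50 percent in 2005.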

Also note how the legends in the two charts differ. The general option to control the legend for all Stata graphs is the option legend(legend options). In the examples above we have used the following legend options:

cols() //number of columns
size() //size of the text
colfirst //specifies that the order is top to bottom rather than left to right

For more legend options, type “help legend_option”.
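
As a small illustration of how these options change a chart, the sketch below draws a stacked bar chart from Stata's bundled auto dataset and places the legend in a single row under the plot. The dataset and the choice of rows(1) and position(6) are assumptions made for this example, not settings used in the figures above.

```stata
* Load an example dataset that ships with Stata
sysuse auto, clear

* Count cars by origin within each repair record,
* with a one-row legend placed at the 6 o'clock position
graph bar (count) mpg, over(foreign) over(rep78) asyvars stack ///
    legend(rows(1) position(6) size(small))
```

The /// continuation works in a do-file; in the Command window, type the command on one line.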

-histogram- and the -addplot- option




The beauty of Stata graphics lies in their flexibility: they can be highly customized. To illustrate, the histogram of proximity (right photo) was drawn using the command:

#delimit ;
histogram proximity, bcolor(ltblue) width(0.025) start(0) freq
addplot(histogram proximity if proximity>0.4, bcolor(yellow) width(0.025) start(0) freq ||
histogram proximity if proximity>0.55, bcolor(blue) width(0.025) start(0) freq ||
histogram proximity if proximity>0.65, bcolor(red) width(0.025) start(0) freq)
ytitle("Number of Links")
ylabel(0(10000)30000, format(%8.0gc) angle(0))
xlabel(0.20 0.40 0.55 0.65 0.85)
xtitle("Proximity")
note("Note: The total number of links for the 779 products is (779×778)/2=303,031.")
legend(off)
;
#delimit cr

[Note: Since graph commands can be very long, they can be managed better by changing the delimiter to “;” with -#delimit ;- and restoring it afterwards with -#delimit cr-.]

I know! Isn’t it easier to draw this in Microsoft Excel? If you already have a summary of the frequencies in Excel and you only need to draw the chart once, maybe it is easier to draw it there. But if you have to draw similar charts for 20 countries at different periods in time, Stata will make your life so much easier. Going back to our example above, this command draws four overlaid histograms of the same variable, proximity, in different colors.

The first line, “histogram proximity, bcolor(ltblue) width(0.025) start(0) freq”, draws the histogram in light blue, with each bin 0.025 wide. By specifying start(0), we force the minimum of the range to be zero instead of the actual minimum in the data; and by specifying the option “freq”, we ask for frequencies rather than densities or percentages.

The second line uses the -addplot- option to overlay three more histograms in different colors. -addplot- is an option that adds plots to graphs drawn by commands that are not part of Stata’s -graph- command (we will elaborate on this in a future post), such as -histogram-. If we only needed to show the histogram of proximity, -addplot- would not be necessary. But since we want to highlight different ranges of proximity values, we use -addplot- to create the different color effect. The rest of the options are explained below:

ytitle("Number of Links")
/* y-axis title */

ylabel(0(10000)30000, format(%8.0gc) angle(0))
/* specifies that: (1) the y-axis will be labeled from
0 to 30000 in steps of 10000; (2) the labels use a
general numeric format with commas; and (3) the
labels are oriented horizontally */

xlabel(0.20 0.40 0.55 0.65 0.85)
/* specifies that the x-axis will be labeled at the
specified numbers only */

xtitle("Proximity")
/* x-axis title */

note("Note: The total number of links for the 779 products is (779×778)/2=303,031.")
/* creates a note */

legend(off)
/* suppresses the legend */

All these options are general -graph- options. Once you are familiar with these most commonly used options, you can forget Excel charts.
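
To make the point about repeating the same chart for many countries concrete, here is a rough sketch of a loop. It assumes the dataset also contains a string variable named country, which is hypothetical and not part of the example above; the output file names are likewise made up.

```stata
* Assumes a string variable named country exists in the data (hypothetical)
levelsof country, local(countries)

foreach c of local countries {
    * Draw the same histogram for one country at a time
    histogram proximity if country == "`c'", ///
        bcolor(ltblue) width(0.025) start(0) freq ///
        ytitle("Number of Links") xtitle("Proximity") ///
        title("`c'")
    * Save each chart under a name built from the country
    graph export "proximity_`c'.png", replace
}
```

Run this from a do-file, since the /// line continuation is not understood in the Command window.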