Error in reading DBF

In Fun with maps in Stata I noted an error that occurs when trying to convert shapefiles from PhilGIS to Stata data files using -shp2dta- (Kevin Crow). It turns out that the column widths are more than 244 (even though the strings in the cells are not long). The easiest way to get around this is to open the dbf file using another program and change the column widths.[1] In my case, I used MS Excel.[2] If you are using Excel 2003, there is a ‘save as dbf’ option. But for some reason, this option is not available in Excel 2007 (what were the geniuses at Microsoft thinking?). Here is where the ExcelToDBF add-in, developed by Andrea Bonfiglio, comes to the rescue.[3] Once you have download and installed the add-in, the ‘DBF IV’ option will be added to the ‘Save As’ menu. To illustrate, here is a screencast.

[1] Some Statalist posts hint that one can use Mata. Didn’t try that.
[2] Al Tongco of PhilGIS also suggested the free spreadsheet program OpenOffice Calc.
[3] I have also tried 2 other add-ins (SaveDBFIV and XLSX2DBF) but they returned an error when I tried to convert a huge file.

Fun with maps in Stata

Today I was toying with Kevin Crow’s -shp2dta- and Maurizio Pisati’s -spmap-. -shp2dta- transforms spatial datasets in shapefile format into Stata data format and -spmap- graphs data onto maps. For starters, I browsed through Stata’s FAQ on -spmap- and Friedrich Huebler’s blog post “Guide to creating maps with Stata”, where instructions are clearly detailed. And, of course, I also checked out -help spmap- and -help shp2dta-.

It is surprising to find how easy it is to represent data onto maps using Stata. I know people who hire GIS specialists or buy expensive GIS software to draw the simplest of maps while they are armed with Stata. I have nothing against GIS experts or GIS software packages. Sure they are needed for some sophisticated GIS applications. But for very basic maps, I think -spmap- is more than enough.

The availability of basic geo-referenced database is also becoming less and less of a constraint. Global Administrative Areas (GADM), for example, provides free access to GIS datasets for almost all countries. PhilGIS* also provides free access to GIS data on the Philippines.

Using free shapefile datasets from GADM and provincial poverty data** from the Philippine National Statistics Coordination Board (NSCB), here is my first map:

The map shows poverty incidence in the Philippines by province for 2003 and 2009. Each shade of red represents a quartile (the default), where the darkest shade corresponds to the provinces with the highest incidence rates.

Here’s how I created the map:
cd c:/map    /* Change the directory to where the shapefiles are */

/* Convert shapefiles into Stata data */
shp2dta using PHL_adm1, database(phdb) coordinates(phxy) ///
    genid(id) replace

/* This created 2 Statadata files phdb.dta (master dataset) 
and phxy.dta (basemap dataset), where the variable id in the master
dataset corresponds to the variable _ID in the basemap dataset. */

use phdb.dta, clear /* Load the master dataset */
merge 1:1 ID_1 using prov.dta 
             /* Merge poverty data into the masterfile */
assert _merge==3 /* Make sure all provinces are merged */
drop _merge

/* Draw the maps */
local year 2003 2009
foreach y of local year{
    cap graph drop pov`y'
    spmap pov`y' using phxy.dta, ///
    id(id) fcolor(Reds)  ///
    title("Poverty Incidence `y'") ///
    name(pov`y') nodraw
graph combine pov2003 pov2009

  • PhilGIS has incorporated Philippine Standard Geographic Classification (PSGC) codes into the barangay (village) level dataset. This is very helpful. I have not used PhilGIS data yet because Stata returns the following error when I apply -shp2dta- using the data from PhilGIS:

st_addvar():  3300  argument out of range
read_dbf():     –  function returned error
<istmt>:     –  function returned error

I am waiting for a response from Stata Technical support. A previous Statalist post suggests that the error has something to do with strings exceeding the limit.

** I had to create a Stata data file out of the table in NSCB’s website and add the unique IDs for each province so that I can merge it later with the ‘master’ dataset. Data for Surigao del Norte and Maguindanao were used for Dinagat Islands and Shariff Kabunsuan, respectively, as they were not reported separately for these newly created provinces.