Economics should be open

October 13, 2009

Notepad++ and Stata, a better do file editor

Filed under: Stata — howardchong @ 1:16 am

I do not like the built-in Stata editor. It makes reading Stata do files a chore. I come a bit from the programming world, where editors show commented lines and blocks in a different color and highlight reserved words. I looked for an alternative Stata text editor / do file editor and like Notepad++.

Notepad++ is a good alternative. You can still run blocks of code (like control-D) and the whole do file (like control-R) if you set it up. Plus it’s free.


October 1, 2009

Difference between WLS and weighted average

Filed under: Data Insights, Stata — howardchong @ 11:00 pm

I hear the phrase “what does it look like when we weight the data” a lot. It confused me for a while, but I figured it out: it could mean two things, so the response should be: which of the two do you want?

Weighted Least Squares and weighted average are opposite concepts, in a sense.


March 26, 2009

Sample regular expressions for stata

Filed under: coding, Stata — howardchong @ 10:22 pm

Just a note to show how to use regular expressions in stata for text processing.

PROBLEM: I had a lot of codes in variable dsmnem that had “:” and “.” characters. I wanted to reshape my data and use these strings as the j variable, i.e. “reshape … j( dsmnem) string”
SOLUTION: regular expressions

replace dsmnem=lower(regexr(dsmnem,":","_"))
replace dsmnem=regexr(dsmnem,"\.","")

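These two lines replace colons with an underscore and periods with empty text, respectively. Note that I have to use the escape character to specify the period character; otherwise the period has a special meaning in the regular expression (it matches any character). Also note that regexr only replaces the first match in each string, so a code with several colons or periods needs the line run more than once.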
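For readers who want to check the logic outside Stata, here is a minimal Python sketch of the same two replacements. It is only an illustration, not the code I ran: the function names and the sample codes are made up, and count=1 is used to mimic regexr replacing just the first match.

```python
import re


def clean_code(code: str) -> str:
    """Mimic the two Stata lines: first ':' becomes '_', first '.' is dropped."""
    # count=1 mirrors regexr, which replaces only the first match
    code = re.sub(":", "_", code, count=1).lower()
    # the period must be escaped, otherwise it matches any character
    return re.sub(r"\.", "", code, count=1)


def clean_code_all(code: str) -> str:
    """Variant that replaces every ':' and every '.', not just the first."""
    code = re.sub(":", "_", code).lower()
    return re.sub(r"\.", "", code)


# Hypothetical codes, only for illustration
print(clean_code("AB:C.D"))
print(clean_code_all("A:B.C.D"))
```

The second function is what you would want if a single code can contain more than one colon or period.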

Weird how they call it “regexr” and not “regexp” or “regexpr”, but whatever.

By the way, dsmnem is the Datastream mnemonic.

Stata “unique” command helpful

Filed under: Stata — howardchong @ 9:58 pm

I just found the unique stata command.

PROBLEM: I have a correspondence table of companies to domains. One company can have multiple domains. I wanted a count of the number of unique companies.
SOLUTION: download the “unique” stata command. Install by running “ssc install unique”.

ALTERNATIVE: I used to just do “keep company” and then “duplicates drop”. It was a hack, but it worked. If you have a small number, an easy way to do it is “tab company” and just count the lines.
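The idea behind both approaches is just counting distinct values of one column. A minimal Python sketch of that idea, with a made-up correspondence table (the company and domain names are hypothetical):

```python
def count_unique(rows, key="company"):
    """Count distinct values of `key`, like dropping duplicates and counting rows."""
    return len({row[key] for row in rows})


# Hypothetical company-to-domain table: one company can have several domains
table = [
    {"company": "Acme", "domain": "acme.com"},
    {"company": "Acme", "domain": "acme.net"},
    {"company": "Globex", "domain": "globex.com"},
]

n_companies = count_unique(table)
print(n_companies)
```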

February 6, 2009

Traversing a directory in stata

Filed under: coding, Open Source, Stata — howardchong @ 12:17 am

I found a nice way to traverse a directory and load all the files in the directory. The key stata commands are to run a directory listing and output the list to a file. Then, you just have to use “levelsof” (or levels) to get your file names.

PROBLEM: Load a bunch of fixed-width TXT files in a directory without having to list all the file names.
STATA code

Unfortunately stata has some lame limits on the number of characters in a string (type “help limits”, 244 is the smallest limit), so this will break if you have too many files. You can probably fix this by using a matrix of strings, but I didn’t need to do that.

Another kludgy way of loading A LOT of files would be to skip levelsof and instead open the filelist.txt file each time, do something like “local filename_to_get=v1[`i']”, and loop over i=1 to numfiles.
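For comparison, the same traverse-and-load idea is short in Python. This is only a sketch under assumptions, not a translation of my Stata code: the function name, the file names, and the column widths are all made up for illustration.

```python
import tempfile
from pathlib import Path


def load_fixed_width_dir(directory, widths):
    """Read every .txt file in `directory`, cutting each line into fixed-width fields."""
    records = []
    for path in sorted(Path(directory).glob("*.txt")):
        for line in path.read_text().splitlines():
            fields, start = [], 0
            for width in widths:
                fields.append(line[start:start + width].strip())
                start += width
            records.append((path.name, fields))
    return records


# Demo on a throwaway directory with hypothetical files
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("AAPL 0100\nMSFT 0200\n")
    Path(tmp, "b.txt").write_text("IBM  0300\n")
    Path(tmp, "notes.md").write_text("ignored\n")
    records = load_fixed_width_dir(tmp, widths=[5, 4])

for name, fields in records:
    print(name, fields)
```

Because the file names never pass through a single string, there is no limit on the number of files.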

January 8, 2009

file “outreg2_prf.ado” not found

Filed under: coding, Open Source, Stata — howardchong @ 1:03 am

So, I get the above stata error when using outreg2, which I installed with “ssc install outreg2”.

This article tells you (1) what I did to trigger the error message and (2) what steps I took to fix it.

UPDATE JUL2009: A good comment below (from stata staff) suggests that it has to do with disk writes. So, that’s the best answer to date.


December 2, 2008

MS Access Tables to Stata, for Residential Energy Consumption Survey

Problem: To convert MS Access tables (of the EIA Residential Energy Consumption Survey 1997 data) to CSV files for STATA import

This is also a general script for converting MS Access tables to CSVs

Solution: Wrote a VBA script


  1. Open MS Access
  2. Tools | Macros | Visual Basic Editor
  3. Create a new module on your database by doing: Right click on database | Insert | Module
  4. Copy the following text (without the line numbers)
    1. Sub ExportAllTablesCsv()
    2. Dim dbMyDB As Database
    3. Set dbMyDB = OpenDatabase("recs97_converted.mdb")
    4.     For Each tdfCurrent In dbMyDB.TableDefs
    5.         fileoutname = "C:\" & tdfCurrent.Name & ".csv"
    6.         If Left(tdfCurrent.Name, 2) <> "MS" Then
    7.             DoCmd.TransferText acExportDelim, , tdfCurrent.Name, fileoutname, True
    8.         End If
    9.     Next
    10. End Sub
  5. For your customization:
    1. change the mdb file name to your own mdb file
    2. The If statement (line 6) is there to deal with the fact that there are certain system tables that I needed to skip. None of my tables started with “MS”, so this was a simple non-general fix.
    3. If you look up help for TransferText it has a SpecificationName. I left that argument blank. You can change to tabs or other things using that.
    4. I had trouble with getting column headings, but jwhite showed me the light. This code now gives column headings with the True argument.

If you can think of a better solution, I’d love your thoughts!

Other possible/failed solutions:

  • I tried running a query that joined all the tables with the EIA ID Num as the key for each table. Though this was easy, I got the “too many fields” error.
  • I think there is a way to use SQL (or something???) to select all the tables and write them and to import those into another program. STATA, to my limited knowledge, doesn’t do SQL. 
  • STATA supposedly does XML. One can use TransferText to write XML too. I don’t know XML well enough to try it.

Any comment appreciated.





November 17, 2008

perl script for transposing Stata outreg2 output

Filed under: coding, Data Insights, Stata — howardchong @ 10:38 pm

I’m using Stata’s outreg2 command and love it. But I run this loop over 600 stocks. Excel doesn’t allow me to view 600 columns (except in the newer version). So, I need to transpose the outreg2 file. It’s too wide. Too many columns.

The perl script from my former post (someone else’s script) actually doesn’t work correctly. I had to make two modifications, and the result is a modified perl script.

The two modifications are that 1) files are saved with tabs rather than commas. No big deal, I just changed the split operator and 2) the original script freaked out when there were blanks in the data. All blanks are ignored.
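To show the idea without the perl, here is a minimal Python sketch of transposing a tab-delimited file. It is not the perl script itself. It assumes that "blanks" means empty lines (skipped) and short rows (padded with empty cells), and the sample output rows are made up.

```python
from itertools import zip_longest


def transpose_tsv(text):
    """Transpose tab-delimited text: skip blank lines, pad short rows with empty cells."""
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    columns = zip_longest(*rows, fillvalue="")
    return "\n".join("\t".join(col) for col in columns)


# Hypothetical outreg2-style output: one column per stock
wide = "VARIABLES\tstock1\tstock2\nbeta\t0.5\t0.7\n\nN\t100\n"
tall = transpose_tsv(wide)
print(tall)
```

After transposing, each stock is a row instead of a column, so 600 stocks fit in any spreadsheet.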

November 14, 2008

STATA: Generating a bunch of lagged variables

Filed under: coding, Stata — howardchong @ 10:23 pm

This small blog post is just a note on how to create a bunch of lagged variables using a simple forvalues loop.

* loop over every variable in the variable list between qqq and zzz
foreach varname of varlist qqq-zzz {
  * generate 9 lagged values for each variable
  forvalues i=1/9 {
     * lag i of the variable is its value i observations earlier in the by-group
     by date, sort: gen lag`i'`varname'=`varname'[_n-`i']
  }
}

So, if you have 10 variables between qqq and zzz inclusive, this script will generate 9 lagged variables for each, 90 new variables in total.
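The nested-loop idea can be checked on a toy example in Python. This is a sketch under assumptions: the function names and data are made up, it handles a single by-group, and None stands in for Stata's missing value.

```python
def lag(values, i):
    """Shift values down by i positions, like x[_n-i] within one by-group."""
    if i <= 0:
        raise ValueError("lag must be a positive integer")
    n = len(values)
    return [None] * min(i, n) + values[:max(n - i, 0)]


def make_lags(columns, max_lag):
    """Build lag1name ... lag<max_lag>name for every column, as in the nested loops."""
    lags = {}
    for name, values in columns.items():      # outer loop: each variable
        for i in range(1, max_lag + 1):       # inner loop: each lag
            lags[f"lag{i}{name}"] = lag(values, i)
    return lags


# Hypothetical data: two variables, three observations
data = {"qqq": [1, 2, 3], "zzz": [10, 20, 30]}
lags = make_lags(data, max_lag=2)
print(lags)
```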

October 2, 2008

UCLA stata graphing

Filed under: Data Insights, Stata — howardchong @ 10:57 pm


Just want to give a shout out that the people at UCLA statistical consulting rock my world. They give me lots of understanding about how to get graphs to look the way I want them to.
