Economics should be open

October 1, 2009

Difference between WLS and weighted average

Filed under: Data Insights, Stata — howardchong @ 11:00 pm

I hear the phrase “what does it look like when we weight the data?” a lot. It confused me for a while, but I figured it out: it could mean two things, so the response should be: which of the two do you want?

Weighted Least Squares and weighted average are opposite concepts, in a sense.



August 26, 2009

List of European power plants, data sources for electricity generation

Filed under: Carbon Trading, Data Insights, Energy, Open Source — howardchong @ 6:19 pm

I was looking for a list of power plants in Europe in 2008. I didn’t find one. You know why? It just got created in late 2008, and I just found it in 2009.

More beta below the bump.


June 4, 2009

Billing Data and Randomized Experiments in Energy Efficiency Evaluation: a research survey

Filed under: California, Data Insights, Energy, Residential — howardchong @ 9:44 pm

I’m doing a research survey of empirical evaluations of energy efficiency using billing data. Much evaluation is done in the laboratory, and these estimates are extrapolated to the field. I’m looking at whether field data has been used to test the laboratory assumptions. I found one such study by Dubin et al. from 1986. I review why this is important, along with other related articles. This is part of my ongoing research, so feedback, especially detailed and esoteric knowledge, is greatly appreciated.


April 3, 2009

Why Chinese Electric Cars are different

Filed under: China, Data Insights, Energy — howardchong @ 7:10 am

NYTimes just reported that China’s going gangbusters on electric car development:

Are they going to win with stolen foreign patents? Or by putting up trade barriers? Or just win with their extremely cheap labor?

Nope. That may be some of it, but one big issue is the price of electricity. Electricity is cheap in China.

December 2, 2008

MS Access Tables to Stata, for Residential Energy Consumption Survey

Problem: To convert MS Access tables (of the EIA Residential Energy Consumption Survey 1997 data) to CSV files for STATA import

This is also a general script for converting MS Access tables to CSVs.

Solution: Wrote a VBA script


  1. Open MS Access
  2. Tools | Macros | Visual Basic Editor
  3. Create a new module in your database: right-click on the database | Insert | Module
  4. Copy the following text (without the line numbers)
    1. Sub ExportAllTablesCsv()
    2. Dim dbMyDB As Database
    3. Set dbMyDB = OpenDatabase("recs97_converted.mdb")
    4.     For Each tdfCurrent In dbMyDB.TableDefs
    5.         fileoutname = "C:\" & tdfCurrent.Name & ".csv"
    6.         If Left(tdfCurrent.Name, 2) <> "MS" Then
    7.             DoCmd.TransferText acExportDelim, , tdfCurrent.Name, fileoutname, True
    8.         End If
    9.     Next
    10. End Sub
  5. For your customization:
    1. Change the mdb file name to that of your own mdb file.
    2. The If statement (line 6) is there to deal with the fact that there are certain system tables that I needed to skip. None of my tables started with “MS”, so this was a simple non-general fix.
    3. The help for TransferText describes a SpecificationName argument, which I left blank. You can use it to switch to tab delimiters or other formats.
    4. I had trouble with getting column headings, but jwhite at showed me the light. This code now gives column headings with the TRUE argument.  For more details, check out

If you can think of a better solution, I’d love your thoughts!
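For readers who don’t use VBA, the same loop (list the tables, skip the ones whose names start with “MS”, write each one to a CSV with column headings) can be sketched in Python. This is only an illustration, not a tested replacement for the macro above: the standard-library sqlite3 module stands in for the Access database, and the function name and its parameters are made up for the example.

```python
import csv
import sqlite3
from pathlib import Path


def export_all_tables_csv(conn, out_dir, skip_prefix="MS"):
    """Write every table in `conn` to <out_dir>/<table>.csv with a header row.

    Assumption: `conn` is a sqlite3 connection standing in for the Access
    database; the table list therefore comes from sqlite_master.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    written = []
    for name in names:
        # Same simple, non-general filter as the VBA: skip names starting with "MS"
        if name.startswith(skip_prefix):
            continue
        cursor = conn.execute(f'SELECT * FROM "{name}"')
        header = [col[0] for col in cursor.description]
        path = out_dir / f"{name}.csv"
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)   # column headings, like the True argument
            writer.writerows(cursor)  # one CSV line per table row
        written.append(path)
    return written
```

The function returns the list of files it wrote, which makes it easy to loop over them for the import step.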

Other possible/failed solutions:

  • I tried running a query that joined all the tables with the EIA ID Num as the key for each table. Though this was easy, I got the “too many fields” error.
  • I think there is a way to use SQL (or something???) to select all the tables and write them and to import those into another program. STATA, to my limited knowledge, doesn’t do SQL. 
  • STATA supposedly does XML. One can use TransferText to write XML too. I don’t know XML well enough to try it.

Any comment appreciated.





November 17, 2008

perl script for transposing Stata outreg2 output

Filed under: coding, Data Insights, Stata — howardchong @ 10:38 pm

I’m using Stata’s outreg2 command and love it. But I run this loop over 600 stocks, and Excel doesn’t allow me to view 600 columns (except in the newer version). So, I need to transpose the outreg2 file. It’s too wide. Too many columns.

My former post on someone else’s perl script actually doesn’t work correctly. I had to make two modifications, and the result is the perl script downloadable from here:

The two modifications are: 1) the files are saved with tabs rather than commas, so I changed the split operator (no big deal); and 2) the original script freaked out when there were blanks in the data, so all blanks are now ignored.
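For anyone without perl handy, here is a rough Python sketch of the same idea: read a tab-delimited file, skip blank lines, and write it back out with rows and columns swapped. It is not the perl script from this post. The function name is made up, and treating “blanks” as empty lines (and padding short rows with empty cells) is my assumption about what needs handling.

```python
import csv
from itertools import zip_longest


def transpose_tsv(in_path, out_path):
    """Transpose a tab-delimited file so that rows become columns."""
    with open(in_path, newline="") as src:
        rows = list(csv.reader(src, delimiter="\t"))
    # Assumption: "ignoring blanks" means dropping lines with no content at all
    rows = [row for row in rows if any(cell.strip() for cell in row)]
    # zip_longest pads short rows with "" so ragged rows don't truncate the output
    columns = zip_longest(*rows, fillvalue="")
    with open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerows(columns)
```

A call would look like `transpose_tsv("results.txt", "results_transposed.txt")`, where both file names are placeholders.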

November 6, 2008

EPER, consistency with indirect emissions, email

So using EPER (the European Pollutant Emission Register), I’ve found some anomalies with Aluminum data. The main question is whether some countries might be counting indirect emissions (akin to life cycle analysis). Aluminum, as I understand it (I can cite a McKenzie report), has very small direct process emissions and mainly uses electricity. Since the CO2 generated is usually attributed to the power plant, that CO2 is “indirectly” emitted by Aluminum producers.

By the same token, my personal direct emissions come from the gasoline/petrol and natural gas I burn. The indirect emissions are all the CO2 associated with the electricity I use.

Here is a letter I sent to EPER folks at the European Commission. Hope I get a response!



To Whom It May Concern:
I am a researcher at the Univ of California, Berkeley working with the EPER data. First of all, I want to thank you for providing this information; I’ve found the information very useful and appreciate full access to the database tables (MS ACCESS).
I have a question about the underlying questionnaire, a copy of which I could not find online (ASIDE: does every country implement its own version of the questionnaire?). My question regards direct vs indirect emissions of CO2 and double-counting. If a manufacturing firm only uses electricity, are the manufacturer’s CO2 emissions zero, with the indirect CO2 reported at the utility level?
Aluminum, anecdotally, should have very small direct CO2 emissions because the main input is electricity. However, the following is just one example of an aluminum producer record with very large CO2 emissions. This leads me to believe that indirect emissions are counted sometimes.
Thank you in advance for your help in understanding the data.

CountryID:  DE

ReportYear:  2004

Emission.FacilityID:  216219


Address:  Aluminiumallee 1

City:  Essen


Emissions (metric tons):  301000

Code:  2.1/2.2/2.3/2.4/2.5/2.6

Description:  Metal industry and metal ore roasting or sintering installations, Installations for the production of ferrous and non-ferrous metals

Text:  Aluminium production

MainActivity:  1

ActivityID:  12

Howard Chong
Dept. of Agricultural and Resource Economics and UC Energy Institute
UC Berkeley
Office: 510-643-4831
Cell: 510-333-0539

October 31, 2008

CITL, coverage for the ETS, matching to EPER

Filed under: Carbon Trading, Data Insights, Open Source — howardchong @ 9:53 pm

This post is a big deal for me because it really pushes me to stay true to open source principles.

So, here’s the deal.
The ETS is the Emissions Trading Scheme, a cap and trade carbon program in Europe.
The CITL is the Community Independent Transaction Log for the ETS.
The EPER is the European Pollutant Emission Register, which is a European version of the Toxics Release Inventory in the US, only much better in that it covers more emissions (including CO2).

And my current project is a 50-hour effort to match records in the CITL to records in the EPER.

What’s the big deal? Well, I’m getting insight into what companies were excluded from the ETS, something that may or may not be well highlighted in the national allocation plans. For all the mandarins running the ETS, could it be that they failed to ensure that countries included every unit that should be covered by the ETS? It gets to the question of whether allowances were too high (somewhat; my own sense is that economic activity and weather also had something to do with the “over-allocation”).

So, here’s the deal. There’s plenty I want to do with this data and I think there is a small time window to do it. So, if you want to work on this project with my matched database, please write me.

As academia is all about getting credit for what you do, we’d have to talk carefully about credit, etc. But my prior is that any work done would be collaborative and everyone gets to share credit.

If you are a private firm doing proprietary market research (i.e., you wouldn’t want what you do with the data to be public), ask me what info you need, and I’ll probably give it to you, perhaps for a fee or some other trade. The data includes a full list of contact information for EUA permit holders.

I’m already telling you too much by telling you that there’s something interesting in the EPER-CITL data matching, but that’s the risk I’m taking. Partly because I think it is more important that good research be done and get out there than that I get total credit.

Your comments are deeply appreciated.

October 2, 2008

UCLA stata graphing

Filed under: Data Insights, Stata — howardchong @ 10:57 pm


Just want to give a shout out that the people at UCLA statistical consulting rock my world. They give me lots of understanding about how to get graphs to look the way I want them to.

stata transpose string variable without xpose

Filed under: coding, Data Insights, Stata — howardchong @ 10:52 pm

So STATA will let you transpose the data with the xpose command, but this does not handle string data.



I had a set of stock price series. The variable names were the date and the stock codes; the rows were days.



1/1/2005   $1    $5   $10


12/31/2005 ...


So, I managed to do it as follows:

1) First, rename all stock variables “price”+STOCKNAME

foreach vn of varlist STOCK1-STOCKN {
  quiet: rename `vn' price`vn'
}

2) reshape long
3) reshape wide


reshape long price, i(realdate) j(name) string
drop date
reshape wide price, i(name) j(realdate)


Note that I needed realdate to be an integer, so I ran a
gen realdate=date(datestr,"mdy")
and then dropped date.
If I keep date as a string, the slashes can’t appear in the resulting variable names, so the dates have to be converted to something usable first. One option is to replace the slashes with underscores and then add the “string” option to the second reshape.
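To make the long-then-wide trick concrete outside Stata, here is a small pure-Python sketch of the same two steps. It is only an illustration: the function name and the dict-per-row layout are assumptions, and it uses the underscore option for the date strings rather than converting them to integers.

```python
def transpose_prices(rows):
    """Turn one-row-per-day data into one-row-per-stock data.

    rows: list of dicts, each with a "date" key plus one key per stock.
    Returns a list of dicts, each with a "name" key plus one price key per date.
    """
    # Step 1, like "reshape long": one record per (date, stock) pair
    long_rows = [
        {"date": row["date"], "name": name, "price": price}
        for row in rows
        for name, price in row.items()
        if name != "date"
    ]
    # Step 2, like "reshape wide": one record per stock, one column per date
    wide = {}
    for rec in long_rows:
        # Slashes are not allowed in variable names, so swap them for underscores
        column = "price" + rec["date"].replace("/", "_")
        wide.setdefault(rec["name"], {"name": rec["name"]})[column] = rec["price"]
    return list(wide.values())
```

With made-up stock codes, a call might be `transpose_prices([{"date": "1/1/2005", "AAA": "$1", "BBB": "$5"}])`, which gives one record per stock with a `price1_1_2005` column.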
