Economics should be open

October 31, 2008

CITL, coverage for the ETS, matching to EPER

Filed under: Carbon Trading, Data Insights, Open Source | howardchong @ 9:53 pm

This post is a big deal for me because it really pushes me to stay true to open source principles.

So, here’s the deal.
The ETS is the Emissions Trading Scheme, a cap and trade carbon program in Europe.
The CITL is the Community Independent Transaction Log for the ETS.
The EPER is the European Pollutant Emission Register, a European version of the Toxics Release Inventory in the US, only much better in that it covers more emissions (including CO2).

My current project is a 50-hour effort to match records in the CITL to records in the EPER.

What’s the big deal? Well, I’m getting insight into which companies were excluded from the ETS, something that may or may not be well highlighted in the national allocation plans. For all the mandarins running the ETS, could it be that they failed to ensure that countries included every unit that should be under the ETS? It gets to the question of whether allowances were too high (somewhat; my own sense is that economic activity and weather also had something to do with the “over-allocation”).

So, here’s the deal. There’s plenty I want to do with this data and I think there is a small time window to do it. So, if you want to work on this project with my matched database, please write me.

As academia is all about getting credit for what you do, we’d have to talk carefully about credit, etc. But my prior is that any work done would be collaborative and everyone gets to share credit.

If you are a private firm doing proprietary market research (i.e., you wouldn’t want what you do with the data to be public), ask me what info you need, and I’ll probably give it to you, perhaps for a fee or some other trade. The data includes a full list of contact information for EUA permit holders.

I’m already telling you too much by telling you that there’s something interesting in the EPER-CITL data matching, but that’s the risk I’m taking. Partly because I think it is more important that good research be done and get out there than that I get total credit.

Your comments are deeply appreciated.


October 2, 2008

UCLA stata graphing

Filed under: Data Insights, Stata | howardchong @ 10:57 pm


Just want to give a shout out that the people at UCLA statistical consulting rock my world. They give me lots of understanding about how to get graphs to look the way I want them to.

stata transpose string variable without xpose

Filed under: coding, Data Insights, Stata | howardchong @ 10:52 pm

So Stata will let you transpose the data with the xpose command, but xpose does not handle string data.



I had a set of stock price series. The variable names were the date and the stock codes; the rows were days:



1/1/2005   $1    $5   $10


12/31/2005 ...


So, I managed to do it as follows:

1) First, rename all stock variables “price”+STOCKNAME

foreach vn of varlist STOCK1-STOCKN {
  quietly rename `vn' price`vn'
}

2) reshape long
3) reshape wide


reshape long price, i(realdate) j(name) string
drop date
reshape wide price, i(name) j(realdate)


Note that I needed realdate to be an integer, so before reshaping I ran a
gen realdate=date(date,"MDY")
and then dropped date. (Recent versions of Stata expect the uppercase "MDY" mask here.)
If I keep date as a string, the slashes can’t go into a variable name, so you do have to convert it to something usable. You can replace the slashes with underscores and then add the “string” option to the second reshape.
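Here is a minimal sketch of that string-date variant, from start to finish. The toy data, the three stock names, and the leading underscore I put in front of each date are all made up for illustration; they are not from my actual dataset.

```stata
* Toy data, made up for illustration
clear
input str10 date STOCK1 STOCK2 STOCK3
"1/1/2005" 1 5 10
"1/2/2005" 2 6 11
end

* 1) prefix every stock variable with price
foreach vn of varlist STOCK1-STOCK3 {
    quietly rename `vn' price`vn'
}

* 2) long form: one row per date and stock
reshape long price, i(date) j(name) string

* slashes are not allowed in variable names, so swap them for underscores
replace date = "_" + subinstr(date, "/", "_", .)

* 3) wide form: one row per stock, one price column per date
reshape wide price, i(name) j(date) string

list
```

The result should have one row per stock, with columns named like price_1_1_2005. The string option on the second reshape is what lets the date text become part of the variable name.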

perl script for transposing csv

Filed under: coding, Data Insights | howardchong @ 12:05 am



I needed to transpose a 600×5 csv (comma separated values) file so I could read it in Excel 2003.

Found what I needed here:

However, I did need to modify the code one bit. See the discussion below


Thanks for the script.

Just a comment though. You have the elsif condition:
elsif ($AoA[$j][$i] eq "") {
print RESULT "\n";

This ignores all elements past j in $AoA[$j][$i].

That is, if you have any missing values that are coded as blanks, this assumes that everything after the first blank is blank too. I think this is probably good for your dataset (you have streams of observations of different lengths (?))

Since my data has missing observations coded as blanks, I’m gonna remove this elsif condition.

As an example

a csv file with one line:
1, 2, 3, 4, 5, , 7, 8, 9

would be transposed to a column that stops at the blank, and the values after the blank (7, 8, 9) would get dropped off.

October 1, 2008

stata, double quotes, file names with spaces, foreach

Filed under: Data Insights, Stata | howardchong @ 9:59 pm

It took me more than 10 minutes to figure out, so I’m posting this tip here.

I needed to loop over several files. Pseudocode would look like this:

global files a.csv b.csv c.csv

foreach file of global files {
    * ... do something with each file here ...
}


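Problem was that my filenames had spaces. I tried the following, but it didn’t quite work: Stata strips the outermost pair of double quotes when it stores the macro, which leaves the remaining quotes unbalanced.

global files "a data.csv" "b data.csv" "c data.csv"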


The solution that finally worked was to use Stata’s annoying compound double quotes:


global filelist `""STOXX 600 1of3.csv" "STOXX 600 2of3.csv" "STOXX 600 3of3.csv""'


Note the extra `" and "' at the ends.
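For completeness, here is a small sketch of the whole loop with a quoted list. The file names are hypothetical, and the display and counter lines are only stand-ins for whatever you actually want to do with each file.

```stata
* Hypothetical file names, for illustration only
global filelist `""a data.csv" "b data.csv" "c data.csv""'

local n = 0
foreach file of global filelist {
    * each element arrives as one token, spaces included, with its quotes stripped
    display `"file: `file'"'
    local n = `n' + 1
}
display "files seen: `n'"
```

Because foreach hands you the name without its quotes, you will probably need to put quotes back around the macro when you pass it to a command that reads the file.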
