I’m doing a research survey of empirical evaluations of energy efficiency using billing data. Much evaluation is done in the laboratory, and these estimates are then extrapolated to the field. I’m looking at whether field data have been used to test the laboratory assumptions. I found one such study by Dubin et al. from 1986. I review why this is important, along with other related articles. This is part of my ongoing research, so feedback, especially detailed and esoteric knowledge, is greatly appreciated.


As researchers, we try hard to be rigorous, but one of the biases we are most susceptible to is the belief that our local world is representative of, and similar to, the rest of the world. This is untrue unless the world is homogeneous. If there is any variation, then at least somebody is “abnormal”.

Hence, I was surprised to learn that CA is not normal in terms of heating fuel use, and that this non-normality is pretty significant. CA is extremely natural gas intensive, with the most credible reports (consider CA RASS 2004) saying that ~70-90% of homes are heated by natural gas. Using a national data set, we can see the variation. In 1997 (RECS), 68% of households in CA used natural gas heating. In Oregon and Washington, cooler climates with more heating load, 20% of households used natural gas and 62% heated with electricity. Part of this may be due to the lower prices for electricity in the Pacific Northwest.

What does this mean? It means that I shouldn’t be comparing CA to OR and WA, because it’s an apples-to-oranges comparison. I also shouldn’t be doing difference-in-differences analyses, because whatever treatment we can think of will be polluted by these other state-specific factors.


I had always thought that driving was way worse for the environment than taking mass transit. Just think of it; I’m lugging around 2 tons of metal to move me from one place to another. A motorcycle is much better, but a car is so dang convenient. And the road network is perfectly designed to work best with cars.

 

But out come some studies with contradictory evidence, saying that transit may actually be worse for the environment. The key here is that a bus gets about 5-7 mpg, so you need about 6 people riding on average (for the whole line, not just the middle) to equal each person driving separately by car. Where ridership is that high, transit is very carbon efficient. Where it is not, well, that may be what’s happening in Cleveland, OH.
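The break-even ridership above is just a ratio of fuel economies. Here is a minimal sketch of that arithmetic in Python; the 36 mpg figure for the car is my own assumption (the post does not give one), and the sketch ignores differences in carbon content between diesel and gasoline.

```python
def break_even_riders(car_mpg: float, bus_mpg: float, car_occupancy: float = 1.0) -> float:
    """Average bus load at which fuel per passenger-mile matches the car's."""
    # Car fuel per passenger-mile: 1 / (car_mpg * car_occupancy)
    # Bus fuel per passenger-mile: 1 / (bus_mpg * riders)
    # Setting the two equal and solving for riders gives:
    return car_mpg * car_occupancy / bus_mpg


# car_mpg=36.0 is an assumed value, not a number from the post.
for bus_mpg in (5.0, 6.0, 7.0):
    riders = break_even_riders(car_mpg=36.0, bus_mpg=bus_mpg)
    print(f"bus at {bus_mpg:.0f} mpg needs {riders:.1f} riders on average")
```

Under that assumption, the "about 6 riders" figure lines up with a bus near the middle of the 5-7 mpg range. A thirstier car, or more than one person per car, moves the break-even point accordingly.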

 


NYTimes just reported that China’s going gangbusters on electric car development:

http://www.nytimes.com/2009/04/02/business/global/02electric.html?em

Are they going to win with stolen foreign patents? Or by putting up trade barriers? Or just with their extremely cheap labor?

Nope. That may be some of it, but one big issue is the price of electricity. Electricity is cheap in China.

This article explains why a large hedge fund manager doesn’t like the PPIP. Political risk and further details about the execution make it look even more fishy. There are supposed to be only 5 fund managers that manage these funds.

 

More here:

http://www.businessinsider.com/henry-blodget-ray-dalio-why-bridgewater-wont-be-playing-tim-geithners-ppip-2009-4

So, Stiglitz and Krugman have come out against the Geithner Plan, as have I.

 

As a quick exercise, I made an Excel worksheet that values a toxic asset under the Geithner plan and under risk neutrality.

You can find it here:  http://are.berkeley.edu/~chong/filesforblog/Geithner%20Arithmetic.xls

If you play around with the percentages, you’ll note, as Krugman has in the NYTimes, that the government is effectively giving investors a free put option.

I’m assuming a 7-to-1 leverage ratio and a 50-50 match.

Do note that this makes clear how the price gets inflated and where the government eats the loss (when the asset price falls more than 12.5%, since 7-to-1 leverage leaves equity equal to 1/8 of the purchase price).
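For readers without Excel, the same kind of arithmetic can be sketched in a few lines of Python. This is not a port of the worksheet. The two-outcome asset (worth 150 or 50 with equal odds) is a hypothetical example, interest and fees are ignored, and the function names are mine. The 50-50 match does not appear explicitly because the private investor and the Treasury share the equity payoff pro rata, so it does not change the break-even bid.

```python
def expected(outcomes, fn):
    """Probability-weighted average of fn(value) over (value, prob) pairs."""
    return sum(prob * fn(value) for value, prob in outcomes)


def break_even_bid(outcomes, debt_share=7 / 8, tol=1e-9):
    """Highest price at which risk-neutral equity holders still break even.

    debt_share = 7/8 encodes the 7-to-1 leverage assumed in the post. The loan
    is non-recourse, so equity holders never lose more than their stake.
    """
    def surplus(price):
        debt = debt_share * price
        equity = price - debt
        return expected(outcomes, lambda v: max(v - debt, 0.0)) - equity

    # surplus() falls as price rises, so bisection finds the zero crossing.
    lo, hi = 0.0, max(value for value, _ in outcomes)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if surplus(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def expected_lender_loss(outcomes, price, debt_share=7 / 8):
    """Expected shortfall on the non-recourse loan at a given purchase price."""
    debt = debt_share * price
    return expected(outcomes, lambda v: max(debt - v, 0.0))


outcomes = [(150.0, 0.5), (50.0, 0.5)]  # hypothetical asset, not from the worksheet
fair = expected(outcomes, lambda v: v)
bid = break_even_bid(outcomes)
print(f"risk-neutral value: {fair:.2f}")
print(f"break-even bid with the loan: {bid:.2f}")
print(f"expected lender loss at that bid: {expected_lender_loss(outcomes, bid):.2f}")
```

In this sketch the gap between the break-even bid and the risk-neutral value equals the lender's expected loss, which is one way to read the "free put option" point.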

However, it assumes away the fact that the market doesn’t have a good sense of what percentages to put in for the probabilities of the asset’s value. The holders argue they won’t sell because nobody will buy given the probabilities they believe to be true. The buyers argue that the market is really risky and that these probabilities are too high. It’s like when I go to buy a used car. I always look at the faults and try to say that all these things will probably break in a year. And the seller says, “This is a great car.” If it’s so great, the seller should keep it and hold it to maturity!

One thing that the Geithner plan does get right is that they mandate that the buyers must put these assets in a buy-and-hold strategy.

One thing Geithner may not have considered is how to bundle the assets. He needs to bundle as many assets together as possible!

For example, if the asset is a single mortgage, the probabilities will be nonzero only for $100 (if they pay it) and $0 (if they default). But if you bundle 1000 of these, you start getting probabilities that aren’t so extreme (you’d have a binomial distribution). Whether you are buying a block of 1000 or a single mortgage doesn’t matter normally, but the Geithner plan has a non-recourse loan that is MUCH more valuable if you buy each one separately.

It could be argued that the government exposure is actually going to be quite small if they bundle assets together enough. The problem is that there may be too much correlation across assets, in which case bundling might not work. In that case, the government might want to throw some negatively correlated assets in with the sale.
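The bundling argument can be checked with a short binomial calculation. The sketch below rests on strong assumptions of my own: mortgages default independently (exactly the assumption the correlation caveat above calls into question), each pays $100 with probability 0.9, the buyer pays the risk-neutral value, and the loan covers 7/8 of the price.

```python
from math import comb


def lender_loss_per_mortgage(n, p, face=100.0, debt_share=7 / 8):
    """Expected lender loss per mortgage when n mortgages back one loan.

    Assumes independent defaults: the number of paying mortgages is
    Binomial(n, p), and a defaulted mortgage is worth nothing.
    """
    price = p * face               # risk-neutral value of one mortgage
    debt = debt_share * price * n  # non-recourse loan against the whole bundle
    loss = 0.0
    for k in range(n + 1):         # k = number of mortgages that pay
        prob = comb(n, k) * p**k * (1 - p) ** (n - k)
        loss += prob * max(debt - face * k, 0.0)
    return loss / n


p_pay = 0.9  # assumed repayment probability, for illustration only
for n in (1, 10, 100, 1000):
    print(n, lender_loss_per_mortgage(n, p_pay))
```

Sold one at a time, the lender loses the whole loan whenever a mortgage defaults. In a large bundle of independent mortgages, a shortfall requires an unusually large share of them to default at once, so the expected loss per mortgage shrinks sharply. With correlated defaults this shrinkage would be weaker.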

There’s still the cheating issue, though. See http://opensourceeconomics.wordpress.com/2009/03/25/how-to-scam-the-geithner-plan/

I have it on good authority that “The cluster option is robust for within-cluster serial correlation of arbitrary form.” Also, that it is ok to run fixed effects on the same level as your clustering.

Question1:  If I have panel data (individual-year) and have individual level fixed effects, does it make sense to cluster on the individual level?

 

Question2:  If I have panel data (individual-year) and have state-level fixed effects, does it make sense to cluster on the state level?
The answer to both, according to Michael Anderson, is Yes, you can do it.

 

Stata implements cluster and robust together. I think that the original Moulton clustering specification is not robust to serial correlation.

 

I wrote (and was wrong, apparently):

Thank you very much, Michael.

When you write about

“If, however, observations that are close together (along the time dimension) have a higher correlation than observations that are far apart (along the time dimension), then the fixed effect will not remove this form of serial correlation.”,

my understanding is that clustering will not correct for this, but a serial correlation correction will.
The reason is that, within a cluster, clustering ignores time. Page 3 of Imbens’s notes (http://are.berkeley.edu/courses/ARE213/spring2006/lect4_06jan26.pdf) shows the ZZ’ matrix, so clustering would treat the correlation between time 1 and T the same as between time 1 and 2.
Maybe you’re talking about a clustering estimator that uses a different ZZ’ matrix that decays further out from the cluster’s diagonal?

 

And this is how Stata models the clustering:

I just checked, and Stata implements the robust clustered standard error estimation as detailed here:

which is different from the Moulton technique Imbens taught. The robust cluster version probably controls for serial correlation, as you say, if the Huber-White robust standard errors also control for it.

————-  This is what I *thought* was true. I guess if I care to argue, I’d have to run a monte-carlo simulation and see.

The basic message is, don’t cluster on the fixed effect variable. The two are redundant, I think. But they are definitely not the same.

The secondary message is that I am not 100% sure. This isn’t a proof, but just the write up of my sketch of the understanding.
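The Monte Carlo check mentioned above could look roughly like the sketch below. It is not what Stata runs and it does not settle the fixed-effects question. It only compares a naive standard error with a cluster-robust (sandwich) one when both the regressor and the error follow an AR(1) process within each cluster. Everything here is an assumption chosen for illustration: one regressor, no intercept, no fixed effects, rho = 0.8, 100 clusters of 10 periods, and no finite-sample adjustment to the sandwich formula.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def ar1_series(length, rho, rng):
    """Stationary AR(1) series with unit-variance innovations."""
    x = rng.gauss(0.0, 1.0 / sqrt(1.0 - rho**2))
    out = []
    for _ in range(length):
        out.append(x)
        x = rho * x + rng.gauss(0.0, 1.0)
    return out


def one_draw(n_clusters, n_periods, rho, beta, rng):
    """Simulate one panel; return (slope, naive SE, cluster-robust SE)."""
    clusters = []
    for _ in range(n_clusters):
        x = ar1_series(n_periods, rho, rng)
        e = ar1_series(n_periods, rho, rng)  # serially correlated error
        y = [beta * xi + ei for xi, ei in zip(x, e)]
        clusters.append((x, y))
    sxx = sum(xi * xi for x, _ in clusters for xi in x)
    sxy = sum(xi * yi for x, y in clusters for xi, yi in zip(x, y))
    b = sxy / sxx
    ssr = 0.0
    meat = 0.0
    for x, y in clusters:
        resid = [yi - b * xi for xi, yi in zip(x, y)]
        ssr += sum(r * r for r in resid)
        # Sum x*resid within the cluster first, then square: this keeps every
        # within-cluster cross product, whatever the time pattern is.
        score = sum(xi * r for xi, r in zip(x, resid))
        meat += score * score
    n_obs = n_clusters * n_periods
    naive_se = sqrt(ssr / (n_obs - 1) / sxx)
    cluster_se = sqrt(meat) / sxx
    return b, naive_se, cluster_se


def simulate(reps=200, n_clusters=100, n_periods=10, rho=0.8, seed=12345):
    """Return (SD of slope across reps, mean naive SE, mean clustered SE)."""
    rng = random.Random(seed)
    draws = [one_draw(n_clusters, n_periods, rho, 1.0, rng) for _ in range(reps)]
    slopes = [d[0] for d in draws]
    return pstdev(slopes), mean([d[1] for d in draws]), mean([d[2] for d in draws])


true_sd, naive_se, cluster_se = simulate()
print(f"SD of slope across simulations: {true_sd:.4f}")
print(f"mean naive SE:                  {naive_se:.4f}")
print(f"mean cluster-robust SE:         {cluster_se:.4f}")
```

If the quoted claim is right, the cluster-robust number should land close to the simulated standard deviation while the naive one falls well short. Adding individual fixed effects to the simulation would be the natural next step for the two questions above.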


I use conditional formatting a limited amount in Excel. It is great for visually highlighting certain data, but it is not quite as useful as autofilter, which I seem to use daily.

In any case, here are some good links for conditional formatting in the order of their usefulness to me:

 

Just a note to show how to use regular expressions in Stata for text processing.

PROBLEM: I had a lot of codes in the variable dsmnem that had “:” and “.” characters. I wanted to reshape my data and use these strings as the j variable, i.e. “reshape … j(dsmnem) string”
SOLUTION: regular expressions

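* Replace the first colon with an underscore and lowercase the result.
replace dsmnem = lower(regexr(dsmnem, ":", "_"))
* Remove the first period; "\." escapes the period so it is matched literally.
replace dsmnem = regexr(dsmnem, "\.", "")

These two lines replace colons with underscores and periods with empty text, respectively. Note that I have to use the escape character to specify the period character; otherwise the period has a special meaning in the regular expression (it matches any character). Also note that regexr() replaces only the first match in each string, so codes with more than one colon or period need the line repeated, or subinstr(), which can replace every occurrence.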

Weird how they call it “regexr” and not “regexp” or “regexpr”, but whatever.

By the way, dsmnem is the Datastream mnemonic.

I just found the unique Stata command.

PROBLEM: I have a correspondence table of companies to domains. One company can have multiple domains. I wanted a count of the number of unique companies.
SOLUTION: download the “unique” Stata command. Install it by running “ssc install unique”. You can also read more about it on the site: http://ideas.repec.org/c/boc/bocode/s354201.html

ALTERNATIVE: I used to just do “keep company” and then “duplicates drop”. It was a hack, but it worked. If you have a small number of companies, an easy way to do it is “tab company” and just count the lines.