Use of “robust” after regress in Stata seems to be automatic. From the textbooks, robust is more asymptotically efficient, and there is only a small hit for not assuming homoskedasticity.
There is one problem I am encountering, and that is in small samples. If your coefficient of interest has very little variation, be careful. Especially when measuring treatment effects where you have very few or very many treated observations.
Here’s where I had problems: I am conducting a stock market event study. We are trying to figure out what size of event window to use, so we run, for each stock:
return=alpha + beta * market + gamma1 * Day1 +gamma2*Day2 + gamma3*Day3 +epsilon
when we run this with robust, we get much smaller standard errors. We run the Ftest of “test Day1+Day2+Day3=0″ and get significance.
But, when we later run:
return=alpha + beta * market + gamma1to3 * Day1to3 * 1/3 + epsilon,
we don’t get stat significance. And it’s all due to robust. (note, the 1/3 is there to rescale for 3 days)
Delving further, I run a simulation where the dgp is y=x*1/4 + eps
I then regress y on x and a 1day event. I simulate 1000 times and here’s my results:
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
op | 1000 -.0266356 1.000336 -2.86643 3.137967
os | 1000 1.000005 .0218117 .9318347 1.066719
opval | 1000 .4983793 .2890897 .0011551 .999441
rp | 1000 -.0266356 1.000336 -2.86643 3.137967
rs | 1000 .0435009 .0135115 .0295254 .1079442
-------------+--------------------------------------------------------
rpval | 1000 .0268978 .121465 0 .9918482
o designates OLS. op is the point estimate for event (should be 0), os is the stderr, and opval is the pvalue. r designates robust, and rp, rs, and rpval are similarly defined.
Note that robust gives smaller std errors. In fact, much smaller. And this is due to the fact that this coefficient is estimated off one observation.
Importantly, this is NOT fixed if you increase the number of total observations. You have to increase the number of observations in the event. I rerun this with an eventsize of 10, and then OLS and robust are close to matching.
See and run the code to see what I mean.
CODE
* simulation showing how robust raises the t stat for a time series event.capture program drop eventSimprogram define eventSim, rclasssyntax [, obs(integer 1) eventsize(integer 1) xcoeff(real 1) ]drop _allset obs `obs'gen n=_ngen x=rnormal()gen y=x*`xcoeff'+rnormal()gen event=n<=`eventsize'regress y xregress y x eventmatrix eb=e(b)matrix ev=e(V)return scalar OLSpoint=eb[1,2]return scalar OLSstderr=sqrt(ev[2,2])test eventreturn scalar OLSpvalue=r(p)regress y x event, robustmatrix eb=e(b)matrix ev=e(V)return scalar ROBpoint=eb[1,2]return scalar ROBstderr=sqrt(ev[2,2])test evenreturn scalar ROBpvalue=r(p)endsimulate op=r(OLSpoint) os=r(OLSstderr) opval=r(OLSpvalue) rp=r(ROBpoint) rs=r(ROBstderr) rpval=r(ROBpvalue), reps(1000): eventSim, obs(1000) eventsize(1) xcoeff(.25)susimulate op=r(OLSpoint) os=r(OLSstderr) opval=r(OLSpvalue) rp=r(ROBpoint) rs=r(ROBstderr) rpval=r(ROBpvalue), reps(1000): eventSim, obs(1000) eventsize(10) xcoeff(.25)su** note the mean of os, the mean of rs, and the standard deviation of op should all match.** the reason is because the dgp is homoskedastic and the robust and ols extimators are both consistent.** when eventsize is 10, they match pretty well.** when eventsize is 1, the match is very bad. Robust gives abnormally small standard errors.** my guess as to why is because the Huber-White robust estimates allow there to be a different sigmasq for the event period.** When the event is small (or has size 1), then event is perfectly fitted (has zero residual) and that term in the** heteroskedasticity matrix is small (or zero).* simulation showing how robust raises the t stat for a time series event. capture program drop eventSimprogram define eventSim, rclass syntax [, obs(integer 1) eventsize(integer 1) xcoeff(real 1) ] drop _all set obs `obs' gen n=_n gen x=rnormal() gen y=x*`xcoeff'+rnormal() gen event=n<=`eventsize' regress y x regress y x event matrix eb=e(b) matrix ev=e(V) return scalar OLSpoint=eb[1,2] return scalar OLSstderr=sqrt(ev[2,2]) test event return scalar OLSpvalue=r(p) regress y x event, robust matrix eb=e(b) matrix ev=e(V) return scalar ROBpoint=eb[1,2] return scalar ROBstderr=sqrt(ev[2,2]) test even return scalar ROBpvalue=r(p)end simulate op=r(OLSpoint) os=r(OLSstderr) opval=r(OLSpvalue) rp=r(ROBpoint) rs=r(ROBstderr) rpval=r(ROBpvalue), reps(1000): eventSim, obs(1000) eventsize(1) xcoeff(.25)susimulate op=r(OLSpoint) os=r(OLSstderr) opval=r(OLSpvalue) rp=r(ROBpoint) rs=r(ROBstderr) rpval=r(ROBpvalue), reps(1000): eventSim, obs(1000) eventsize(10) xcoeff(.25)su ** note the mean of os, the mean of rs, and the standard deviation of op should all match.** the reason is because the dgp is homoskedastic and the robust and ols extimators are both consistent.** when eventsize is 10, they match pretty well.** when eventsize is 1, the match is very bad. Robust gives abnormally small standard errors.** my guess as to why is because the Huber-White robust estimates allow there to be a different sigmasq for the event period.** When the event is small (or has size 1), then event is perfectly fitted (has zero residual) and that term in the** heteroskedasticity matrix is small (or zero).
See page 259 of Wooldridge 2003.
The variance of beta is the weighted sum of OLS residuals with weights given by the variation of X from the mean of X, quantity squared. The residual (u_i) for the event is 0 when eventsize=1. This is the crux of the problem. As the event size gets larger, the residual in the event window more closely matches the actual disturbance terms.
This isn’t a small sample issue, in the sense that Wooldridge suggests on page 260 a correction of n/(n-k-1). In the simulations, n is 1000, k is 1 or 2; so that correction is quite small.
Comment by howardchong — November 4, 2009 @ 8:22 pm