I hear the phrase “what does it look like when we weight the data” a lot. It confused me for a while, but I figured it out: it could mean two things, so the right response is, “Which of the two do you want?”

Weighted Least Squares and weighted average are opposite concepts, in a sense.

Weighted Least Squares (WLS) is an assumption about your error structure. Let’s say your data is across cities and includes a population variable. Suppose you assume that var(eps_i) = epssq_0 * population_i, where epssq_0 is a constant, AKA the error variance is higher for cities with large populations. WLS weights each observation by the inverse of its variance, so you end up giving extra weight to the *smaller observations*. (See Greene, *Econometric Analysis*.)

Stata’s aweight expects weights that are inversely proportional to the variance, so the command that matches this assumption is:

regress LHS population otherRHS [aweight=1/population]

If instead you want bigger populations to matter more, you’d aweight by population itself, which amounts to assuming that the variance is inversely proportional to population.
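
To see how much the direction of the weights matters, here is a minimal sketch of one-regressor WLS in plain Python, where each observation's weight is taken to be the inverse of its assumed error variance. The function name `wls_fit` and the four-city data set are made up for illustration; this is not Stata's implementation, just the textbook closed form.

```python
def wls_fit(x, y, w):
    """Fit y = a + b*x by minimizing sum(w_i * (y_i - a - b*x_i)**2)."""
    total_w = sum(w)
    # Weighted means of x and y.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Weighted cross-product and weighted sum of squares around those means.
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: three small cities on the line y = x, one big city off it.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 8.0]
population = [1.0, 1.0, 1.0, 100.0]

# Assumption: variance proportional to population, so weight = 1 / population.
_, slope_var_up = wls_fit(x, y, [1.0 / p for p in population])

# Assumption: variance proportional to 1 / population, so weight = population.
_, slope_var_down = wls_fit(x, y, population)

print(slope_var_up)    # close to 1: the big city is nearly ignored
print(slope_var_down)  # well above 1: the big city dominates the fit
```

Same data, same regression, and the two variance assumptions pull the slope in opposite directions, which is why it pays to ask which one someone has in mind.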

When doing a weighted average, you multiply each value by its population and divide by the total population, so bigger cities have more influence. Granted, a weighted average only computes an average; it’s not used as part of a regression analysis.
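
For comparison, a population-weighted average is just this. The two-city numbers and the helper name `weighted_average` are invented for the example.

```python
def weighted_average(values, weights):
    """Sum of value * weight, divided by the total weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: a small city with rate 10 and a city three times its size with rate 20.
rates = [10.0, 20.0]
population = [1.0, 3.0]

simple = sum(rates) / len(rates)                 # 15.0, both cities count equally
weighted = weighted_average(rates, population)   # 17.5, pulled toward the bigger city

print(simple, weighted)
```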

Stata also lets you weight a regression by population directly, which is more like a weighted average than not. There, you are assuming precisely var(eps_i) = epssq_0 / population_i, so bigger cities count more. This functional form is driven by the a priori knowledge that your data points are averages over different sample sizes.
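
The 1/population form is what you get when each data point is an average over that many people. Here is a small simulation sketch of that claim, assuming independent draws with the same person-level variance; the function name `variance_of_city_mean`, the city sizes, and the trial count are all arbitrary choices for illustration.

```python
import random
import statistics


def variance_of_city_mean(n_people, sigma=1.0, n_trials=2000, seed=0):
    """Simulate many city-level averages and return the variance across them."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        # One city-level observation: the average of n_people independent draws.
        total = sum(rng.gauss(0.0, sigma) for _ in range(n_people))
        means.append(total / n_people)
    return statistics.pvariance(means)


var_small_city = variance_of_city_mean(1)
var_big_city = variance_of_city_mean(100)

# If var = sigma^2 / n, this ratio should land near 100 (up to simulation noise).
ratio = var_small_city / var_big_city
print(ratio)
```

If the city-level numbers are not averages of this kind, the 1/population assumption has no such justification.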

Be wary of anyone who says, “We should have observations with X matter more, so let’s run WLS with X.” They are mixing two concepts.
