I have been thinking of the problem of controlling for occupation when checking for discrimination as a form of collider bias, so I added a small simulation and discussion of it as an example in my #Mixtape on p. 74-75 showing this will bias estimates. scunning.com/cunningham_mix…
Should you control for occupations in a wage regression if you are trying to estimate gender wage penalty?
The 1-word answer is "No."
The 9-word answer is "No, but that regression can tell us something useful."
In general, we don't want to control for occs because part of the lower pay for women occurs through barriers to higher-paid occs. For ex: the "glass ceiling" means it's harder for women to enter high-paid managerial positions. Controlling for occs absorbs that glass-ceiling effect, so it underestimates the penalty.
But what if occs are also correlated with "skills" which are hard to directly measure? Could controlling for occs help with this problem?
Another possibility: can comparing gender gap with and w/o occ controls tell us about whether the *mechanism* of discrimination is occs?
This matters in practice. In a recent study of the gender wage gap, Blau and Kahn show estimates with and without occ/ind controls: they find that occ/ind controls "explain" about half of the gender gap in the US today.
Is this informative?
Recently @erinhengel made this very imp point that controlling for occ can under-estimate the impact of discrimination. I totally agree. This also led to an exchange with @causalinf about whether a regression with control is ever "informative." (Very smart people, both!)
Using an adapted version of Erin's code, I tried to shed some light on why a regression with occ controls can, indeed, be informative.
In general, the true effect will lie between the "no controls" and the "occ controls" estimate. So these two estimates can provide a bound.
If I got it right, with Erin's DGP discrimination affects both occ choice and wage, so controlling for occupation is "over control." But not controlling for occupation (which proxies for ability) is "under control." The bounds are tighter when the occ/ability correlation is low.
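To make the "over control" vs. "under control" logic concrete, here is a minimal sketch in Python. This is not Erin's, Scott's, or my Stata code: the DGP, the function name `simulate`, and every parameter value are hypothetical, chosen only for illustration. It assumes an unobserved skill that differs by gender on average, a barrier that pushes women into lower-ranked occs, and a direct wage penalty. In this particular setup the total effect of discrimination sits between the two regression estimates; that is a property of these assumed parameters, not a general guarantee.

```python
import random

def cov(x, y):
    """Sample covariance of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def simulate(n=100_000, seed=0, skill_gap=0.5, barrier=1.0,
             direct=0.10, b_occ=0.10, b_skill=0.10):
    """Hypothetical DGP (all parameter values are illustrative assumptions).

    skill = -skill_gap * female + noise            (unobserved)
    occ   = skill - barrier * female + noise       (occupation "rank")
    wage  = -direct * female + b_occ * occ + b_skill * skill + noise
    """
    rng = random.Random(seed)
    female, occ, wage = [], [], []
    for _ in range(n):
        f = 1.0 if rng.random() < 0.5 else 0.0
        skill = -skill_gap * f + rng.gauss(0, 1)
        o = skill - barrier * f + rng.gauss(0, 1)
        w = -direct * f + b_occ * o + b_skill * skill + rng.gauss(0, 0.1)
        female.append(f)
        occ.append(o)
        wage.append(w)

    # "No controls": slope from regressing wage on female only.
    raw = cov(female, wage) / cov(female, female)

    # "Occ controls": coefficient on female from wage ~ female + occ,
    # using the closed-form two-regressor OLS formula.
    vf, vo, cfo = cov(female, female), cov(occ, occ), cov(female, occ)
    ctrl = (cov(female, wage) * vo - cov(occ, wage) * cfo) / (vf * vo - cfo ** 2)

    # Total effect of discrimination: direct penalty plus the part
    # that runs through the occupational barrier.
    truth = -(direct + b_occ * barrier)
    return raw, ctrl, truth

raw, ctrl, truth = simulate()
print(f"no controls:  {raw:.3f}")
print(f"occ controls: {ctrl:.3f}")
print(f"true effect:  {truth:.3f}")
```

With these assumed values the no-controls estimate overstates the penalty (it also picks up the skill gap), while the occ-controlled estimate understates it (it nets out the barrier, and conditioning on occ makes women look more skilled than men in the same occ). Setting `skill_gap=0` makes the no-controls estimate line up with the true effect, which is the pure "don't control for occs" case.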
Why does this all matter? Because I think it's useful to look at the decomposition done by Blau and Kahn: the gender wage gap today is largely unaffected by "standard" human cap controls, but is reduced by 1/2 with controls for occs and inds. 8/N
The fact that the gender wage gap falls by 1/2 with occ/ind controls but is still ~10% provides a useful bound. We can argue about whether the occ portion reflects ability confounders, discrimination, or something else. But the effect w/ & w/o occ controls bounds this disagreement usefully. 9/N
Finally, the Stata code, which builds on Erin's and Scott's code, is here: dropbox.com/s/hies448ukhta…
Is not controlling for occ important from the immediate policy perspective? I.e., calls for equal pay laws aren't very pragmatic if they are trying to address the half of the gap that isn't within occ. Of course that half is still important, just not as easily solved as the other one.
Not necessarily. The between-occupation gap could be caused by the within-occupation gap. Within-occupation gaps will determine human capital allocation.
Matt Darling 🌐
@theamazingjex I just want to point out that this is a whole subdiscipline in Sociology (Stratification), which has addressed (theoretically and empirically) the question of why the occupations in which women are concentrated are paid less. See work by @EnglandPaula.