Key aspect is sample splitting: the predicted-Y and predicted-D functions are estimated in an auxiliary sample; the final regression runs in the main sample.
This reduces overfitting bias from estimating the nuisance functions on the same data used for the final regression.
But also reduces sample size.
So they do "cross-fitting": swap main & aux sample & repeat.
You get 2 estimates; take average.
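The steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the random forests, the 50/50 split, and the simple residual-on-residual OLS at the end are assumptions for concreteness.

```python
# Sketch of 2-fold cross-fitted "partialling-out" DML.
# Assumptions: random forests as the ML learners, a single 50/50 split.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def dml_cross_fit(y, d, X, seed=0):
    """Cross-fitted estimate of the coefficient on d in y = theta*d + g(X) + e."""
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.permutation(n)
    fold_a, fold_b = idx[: n // 2], idx[n // 2:]
    thetas = []
    for aux, main in [(fold_a, fold_b), (fold_b, fold_a)]:
        # Nuisance functions E[y|X] and E[d|X] fit on the auxiliary sample...
        m_hat = RandomForestRegressor(random_state=seed).fit(X[aux], y[aux])
        g_hat = RandomForestRegressor(random_state=seed).fit(X[aux], d[aux])
        # ...then residualize in the main sample.
        y_res = y[main] - m_hat.predict(X[main])
        d_res = d[main] - g_hat.predict(X[main])
        # Final regression: OLS of y-residuals on d-residuals.
        thetas.append(np.dot(d_res, y_res) / np.dot(d_res, d_res))
    # Cross-fitting: swap the roles of the samples and average the 2 estimates.
    return float(np.mean(thetas))
```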
Improves on the double selection (DS) approach of Belloni, Chernozhukov & Hansen.
DS chooses covariates that predict *either* Y or D using LASSO on the full sample, then runs OLS of Y on D & the union of chosen predictors.
DS relies on sparsity, needs LASSO.
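For contrast, a minimal DS sketch (again an illustration, not the authors' code; the `LassoCV` selectors and the coefficient threshold are assumptions):

```python
# Sketch of post-double-selection: take the union of covariates
# selected by a lasso of y on X and a lasso of d on X, then OLS.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def double_selection(y, d, X):
    """DS estimate of the coefficient on d, using the full sample (no splitting)."""
    sel_y = np.abs(LassoCV(cv=5).fit(X, y).coef_) > 1e-8  # predicts Y
    sel_d = np.abs(LassoCV(cv=5).fit(X, d).coef_) > 1e-8  # predicts D
    keep = sel_y | sel_d  # union: covariates that predict *either*
    Z = np.column_stack([d, X[:, keep]])
    return float(LinearRegression().fit(Z, y).coef_[0])  # coefficient on d
```

Note the contrast with the sketch above: no sample splitting, and the method is tied to a sparse linear selector (LASSO) rather than an arbitrary ML tool.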
DoubleML uses sample splitting, allows many ML tools.
Has anyone implemented this paper on the canonical LaLonde dataset? I see that the paper has 3 empirical examples, but I'm surprised they didn't compare their results to the p-score matching lit (i.e. Dehejia and Wahba, Smith and Todd).