1. Exquisite Tweets from @arindube, @MartinSGaynor, @autoregress, @paulgp

    Collected by @Woody_WongE

    If you are intrigued by the "double machine learning" of Chernozhukov et al. but find it intimidating to read the paper (arxiv.org/abs/1608.00060), these two slides explain the basic idea quite simply.

    from norges-bank.no/contentassets/…

    Arindrajit Dube (@arindube)
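
    For reference, the setup behind those slides is the paper's partially linear model; this is a minimal restatement (notation follows the arXiv paper):

        % Partially linear model:
        %   Y = D*theta_0 + g_0(X) + U,   E[U | X, D] = 0
        %   D = m_0(X) + V,               E[V | X] = 0
        \[ Y = D\theta_0 + g_0(X) + U, \qquad D = m_0(X) + V. \]
        % DML partials X out of both Y and D, then runs OLS of the
        % Y-residual on the D-residual:
        \[ \hat{\theta} = \frac{\sum_i \hat{V}_i \big(Y_i - \hat{\ell}(X_i)\big)}{\sum_i \hat{V}_i^2},
           \qquad \hat{V}_i = D_i - \hat{m}(X_i), \quad \hat{\ell}(X) \approx \mathrm{E}[Y \mid X]. \]

    The two prediction problems ("predicted Y" and "predicted D" in the next tweet) are exactly \hat{\ell} and \hat{m}.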

    Key aspect is sample splitting: the predicted-Y & predicted-D functions are estimated in an auxiliary sample. The final regression runs in the main sample.
    This reduces bias.
    But it also reduces sample size.
    So they do "cross-fitting": swap the main & auxiliary samples & repeat.
    You get 2 estimates; take the average.

    Arindrajit Dube (@arindube)
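
    A minimal sketch of that split-and-swap recipe in Python (scikit-learn; the helper names and the default random-forest learner are illustrative choices, not from the paper):

        # Two-fold DML cross-fitting for the partially linear model.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def dml_theta(y, d, X, train, est, learner):
            """Fit the nuisance predictions on `train`; residualize & regress on `est`."""
            ml_y = learner().fit(X[train], y[train])   # predicted-Y function
            ml_d = learner().fit(X[train], d[train])   # predicted-D function
            u = y[est] - ml_y.predict(X[est])          # Y residual
            v = d[est] - ml_d.predict(X[est])          # D residual
            return (v @ u) / (v @ v)                   # bivariate OLS slope

        def cross_fit(y, d, X, learner=RandomForestRegressor, seed=0):
            idx = np.random.default_rng(seed).permutation(len(y))
            a, b = idx[: len(y) // 2], idx[len(y) // 2:]   # two halves
            # Swap the main & auxiliary roles and average the two estimates.
            return 0.5 * (dml_theta(y, d, X, a, b, learner)
                          + dml_theta(y, d, X, b, a, learner))

    With y and d as 1-D NumPy arrays and X as a 2-D array, cross_fit(y, d, X) returns the averaged estimate.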

    For your Y & D predictions, pick your favorite ML tool (Random Forest, Boosting, LASSO). Or an ensemble.

    Tune the model within the auxiliary sample (which contains its own training and testing subsamples).

    Once the ML part is done & you have your Y and D residuals, it's just a bivariate regression.

    Arindrajit Dube (@arindube)
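
    In the sketch above, any regressor with fit/predict slots in as the learner, and the cross-validated variants (e.g. LassoCV) do their tuning within the half they are trained on (illustrative usage):

        from sklearn.linear_model import LassoCV
        from sklearn.ensemble import GradientBoostingRegressor

        theta_rf    = cross_fit(y, d, X)                                    # Random Forest
        theta_lasso = cross_fit(y, d, X, learner=LassoCV)                   # CV-tuned LASSO
        theta_gbm   = cross_fit(y, d, X, learner=GradientBoostingRegressor) # Boosting

    The final step really is one line: the slope (v @ u) / (v @ v) of the Y residual on the D residual.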

    Improves on the double selection (DS) approach, also by Chernozhukov.
    DS chooses covariates that *either* predict Y or D, using LASSO on the full sample, then does OLS of Y on D & the chosen predictors.
    DS relies on sparsity and needs LASSO.
    DoubleML uses sample splitting and allows many ML tools.

    Arindrajit Dube (@arindube)
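
    For contrast, a sketch of the DS recipe (full sample, LASSO-specific; the helper name is hypothetical):

        import numpy as np
        from sklearn.linear_model import LassoCV

        def double_selection(y, d, X):
            sel_y = LassoCV().fit(X, y).coef_ != 0   # covariates that predict Y
            sel_d = LassoCV().fit(X, d).coef_ != 0   # covariates that predict D
            keep = sel_y | sel_d                     # union of the two sets
            Z = np.column_stack([np.ones(len(y)), d, X[:, keep]])
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
            return beta[1]                           # OLS coefficient on D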

    DoubleML can be used in much more complicated setups than the partially linear model.

    E.g., it can handle heterogeneous treatment effects & non-linearities. It can use the ML-predicted value of D (the propensity score) to reweight. Etc.

    But I wanted to convey that the basic idea is simple.

    FIN.

    Arindrajit Dube (@arindube)
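
    One such extension, sketched: with a binary D, the cross-fit ML predictions feed the doubly robust (AIPW) score, in which the ML propensity score reweights the outcome residuals (binary treatment and the trimming constant are assumptions here, not from the thread):

        import numpy as np

        def aipw_theta(y, d, g0, g1, m, eps=1e-3):
            """y, d: outcome & binary treatment; g0/g1: ML predictions of
            E[Y|D=0,X] / E[Y|D=1,X]; m: ML propensity score P(D=1|X)."""
            m = np.clip(m, eps, 1 - eps)   # trim extreme propensities
            score = (g1 - g0
                     + d * (y - g1) / m
                     - (1 - d) * (y - g0) / (1 - m))
            return score.mean()            # average treatment effect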

  2. Thanks. Very clear. Advantage of ML over standard IV (if you have strong instruments) is...? Nonparametric flexibility? Dimensionality reduction (if you have that issue)? ...?

    Martin Gaynor (@MartinSGaynor)

  3. In the IV case, I would think gains from ML would mostly come when you have many instruments.

    Arindrajit Dube (@arindube)
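
    For instance, a LASSO first stage over a large instrument matrix Z, used to build a single fitted instrument (a sketch in the spirit of that literature; full-sample fit here for brevity):

        from sklearn.linear_model import LassoCV

        def ml_iv(y, d, Z):
            d_hat = LassoCV().fit(Z, d).predict(Z)            # ML first stage
            yc, dc, ic = y - y.mean(), d - d.mean(), d_hat - d_hat.mean()
            return (ic @ yc) / (ic @ dc)                      # IV slope, d_hat as instrument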

  4. I should also point to the great (and accessible!) piece by @Susan_Athey from a few months ago that covers this and several other new directions: nber.org/chapters/c1400…

    Arindrajit Dube (@arindube)

  5. Has anyone implemented this paper on the canonical LaLonde dataset? I see that the paper has 3 empirical examples, but I'm surprised they didn't compare their results to the p-score matching literature (e.g., Dehejia and Wahba; Smith and Todd).

    Paul Goldsmith-Pinkham (@paulgp)