1. Exquisite Tweets from @DinaPomeranz

    Collected by Woody_WongE

    I'm going to live tweet Esther Duflo's Master Lecture at the NBER Development Economics meeting on "Machinistas meet Randomistas: Some useful ML tools for RCT researchers"

    DinaPomeranz

    Dina D. Pomeranz

    This presentation will focus on methods rather than on anything specific to development economics.

    Many empirical researchers are starting to use machine learning (ML) in their work. This can provide some pointers and avoid reinventing the wheel.

    Machine learning is usually good at prediction, while RCTs are mostly focused on causal identification. So is there even any intersection? The answer is yes.

    In ML there are attempts to get at causal estimation. And RCTs sometimes involve high-dimensional problems, where ML can help with:

    - Choosing covariates
    - Subgroup analysis for heterogeneity
    - Cases with many treatments and potential outcomes

    We can do a comparison of results of different methods including ML, following the example of Lalonde.

    There is a paper by Chernozhukov and many co-authors:

    Going forward, people who run RCTs can also use them to help test the validity of some of these new methods.

    There is no magic there. Just as matching is only as good as the observable variables that researchers have, machine learning has these limits, too.

    The Lalonde exercise with two-step machine learning.

    Caveats: 1) Is the IV estimate valid? 2) Is the effect different for compliers in the local average treatment effect than in the population?

    Can address the 2nd caveat with this method but not the 1st:

    For some outcomes the results are similar, for others very different. Hard to know which is the better one. Would be great to compare directly with RCT results. It was not possible here.

    Interesting question: What kind of heterogeneity can be picked up with observables?

    Second topic of the lecture: Choosing control variables in RCTs.

    People tend to include the variables that are very imbalanced and those that they think most predict the outcomes.

    It turns out that, with Lasso, this has been shown to be exactly what one should do.

    This is better than using a specific set of variables pre-specified in a pre-analysis plan, since you don't know ex-ante which variables will be unbalanced.

    Make the choice ex-post, but in a structured way (one can pre-specify this process).
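
    One way to turn "imbalanced or predictive of the outcome" into a structured rule that can be pre-specified is a double-selection Lasso: run one Lasso of the outcome on the baseline covariates, another of the treatment indicator on the same covariates, and keep the union of the selected variables as controls. The sketch below is an illustration under assumptions, not the lecture's own code: the simulated data, the penalty values `lam`, and the helper names are all hypothetical, and the Lasso is a bare coordinate-descent version in plain Python so that the example is self-contained. The selected controls would then enter an ordinary regression of the outcome on treatment.

```python
import random


def standardize(col):
    """Center a column and scale it to unit variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_select(columns, target, lam, sweeps=100):
    """Return indices of columns with non-zero Lasso coefficients.

    Minimizes (1/2n) * ||target - X b||^2 + lam * ||b||_1 by coordinate
    descent on standardized columns (hypothetical helper for illustration).
    """
    n = len(target)
    cols = [standardize(c) for c in columns]
    mean_t = sum(target) / n
    resid = [t - mean_t for t in target]
    beta = [0.0] * len(cols)
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(c * r for c, r in zip(col, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return {j for j, b in enumerate(beta) if b != 0.0}


# Simulated RCT (assumption): 5 baseline covariates, only covariate 0 matters.
rng = random.Random(0)
n = 200
X = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(5)]
d = [float(rng.random() < 0.5) for _ in range(n)]  # random assignment
y = [1.0 * d[i] + 2.0 * X[0][i] + rng.gauss(0, 1) for i in range(n)]

predicts_outcome = lasso_select(X, y, lam=0.5)    # covariates that predict y
predicts_treatment = lasso_select(X, d, lam=0.1)  # covariates imbalanced by chance
controls = sorted(predicts_outcome | predicts_treatment)
print("selected controls:", controls)
```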

    Third topic of the lecture: heterogeneity of treatment effects on different populations.

    This is something that could actually have substantive impact for many research areas.

    Often there is a large number of groups one could look at.

    That's potentially a problem for multiple hypothesis testing.

    One solution is pre-registering.

    This is important for clinical trials. There, the goal is simply to test, does this drug work? Also, there's a lot of money on the table to find a "yes" answer.
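
    To see why many subgroups are a problem: if each of k subgroup tests is run at level alpha, and the tests are independent (an assumption made here for simplicity), the chance of at least one false positive is 1 - (1 - alpha)^k. A minimal sketch of that arithmetic follows, together with a Bonferroni-adjusted level, which is a standard textbook correction and not something named in the lecture.

```python
def familywise_error(alpha, k):
    """Probability of at least one false positive across k independent tests."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_level(alpha, k):
    """Per-test level that keeps the familywise error rate at or below alpha."""
    return alpha / k


for k in (1, 5, 20):
    unadjusted = familywise_error(0.05, k)
    adjusted = familywise_error(bonferroni_level(0.05, k), k)
    print(f"{k:2d} subgroups: unadjusted {unadjusted:.3f}, Bonferroni {adjusted:.3f}")
```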

    However, it's different in social science, where we want to learn more about the world.

    There, pre-registering specific subgroups risks throwing out valuable information that is only available ex-post.

    Alternative solution: use an ML process to select the groups.

    However, it's not that simple. We don't just want to predict who will benefit most in this particular setting based on a large number of variables. We want to draw inferences out of sample for particular types of people.

    Focus on key features that are of interest.

    This approach is generic for different types of ML being used.

    Proofs are in the paper.

    Two different strategies:

    One of the lessons: if one is interested in heterogeneity, might want to double the sample size. But don't need to double the number of clusters. Can split the sample within the clusters.
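
    A minimal sketch of splitting the sample within clusters, so that every cluster contributes units to both halves. The slides are not reproduced here, so how the halves are used is an assumption: one half (`auxiliary`) would train the ML predictor and the other (`main`) would be used for estimation. The function name and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def split_within_clusters(units, seed=0):
    """Split units into two halves inside each cluster.

    units: list of (unit_id, cluster_id) pairs.
    Returns (auxiliary, main), two lists of unit ids.
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for unit_id, cluster_id in units:
        by_cluster[cluster_id].append(unit_id)

    auxiliary, main = [], []
    for cluster_id in sorted(by_cluster):
        members = by_cluster[cluster_id]
        rng.shuffle(members)  # random split, reproducible through the seed
        half = len(members) // 2
        auxiliary.extend(members[:half])
        main.extend(members[half:])
    return auxiliary, main


# Toy data (assumption): 40 units spread evenly over 4 clusters.
units = [(i, i % 4) for i in range(40)]
auxiliary, main = split_within_clusters(units)
print(len(auxiliary), len(main))
```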

    Reject zero: some real heterogeneity in loans, output, and profit, but not in consumption.

    Sorting from least to most affected groups.

    Now can try and say something about the characteristics of people in these groups:
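
    The sorting step can be sketched as follows, assuming each unit already carries a `score`, a hypothetical ML prediction of its treatment effect obtained from a separate part of the sample. Units are ordered by score, cut into equal-sized groups, and the treated-minus-control difference in mean outcomes is computed within each group. The data are synthetic and built so that the true effect rises with the score. The characteristics of the lowest and highest groups could then be compared.

```python
def sorted_group_effects(records, n_groups):
    """Difference in mean outcomes (treated minus control) per score group.

    records: list of (score, treated, outcome) with treated in {0, 1}.
    Groups are ordered from lowest to highest score. Assumes every group
    contains at least one treated and one control unit.
    """
    ordered = sorted(records, key=lambda r: r[0])
    size = len(ordered) // n_groups
    effects = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = len(ordered) if g == n_groups - 1 else start + size
        group = ordered[start:end]
        treated = [y for _, d, y in group if d == 1]
        control = [y for _, d, y in group if d == 0]
        effects.append(sum(treated) / len(treated) - sum(control) / len(control))
    return effects


# Synthetic data (assumption): 40 units, alternating assignment, and a
# treatment effect equal to score // 10, so it rises across the four groups.
records = [(i, i % 2, float((i % 2) * (i // 10))) for i in range(40)]
effects = sorted_group_effects(records, n_groups=4)
print(effects)
```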

    Next we're going to look at a more substantive application, which is not in development. It's about France:

    Comparing two interventions that have the same goal, but have been analyzed in two separate studies:

    So there are both differences in the intervention but also in the population.

    Can we predict what would have been the effect of intervention A in the population B?

    Comparing 2 programs. One by a microcredit agency, one by the social service.

    They target youths with difficult employment situations who have a self-employment project.

    The first program is selective. Tries to select the most enthusiastic & most prepared. The second takes all.

    Also, the first program really tries to train the youths in entrepreneurship. The second tries to help them see if entrepreneurship is a good fit for them, or if employment would be better.

    Second program has effects. First doesn't.

    Is this because of selection or treatment?

    Matching on observables: not much of a difference.

    But: the control group in the first program still develops very differently from the second.

    They select on the basis of an interview that the researchers don't observe.

    Take the experiment where we have many baseline characteristics.

    Look at probability of being employed in the control group. Sample people in the other experiment who have similar characteristics:
