Exquisite Tweets from @DinaPomeranz

    Collected by Woody_WongE

    I'm going to live-tweet @m_sendhil's talk on "Economic Applications of Machine Learning" at @econ_uzh today

    We'll see 3 applications today. There have been hugely impressive engineering advances in this area in recent years.

    Sendhil starts by introducing concepts with the example of face recognition.

    The first step is to transform the engineering problem into an empirical problem. Rather than programming rules for what a face looks like, make it into an empirical learning exercise: is this a face, yes or no?
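
    As a minimal sketch of that reframing (the data here is synthetic noise standing in for labeled images; everything in it is illustrative):

```python
# Face detection as supervised classification: instead of hand-coding rules for
# what a face looks like, fit a classifier to labeled examples.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 32 * 32))  # flattened 32x32 "images" (synthetic)
y = rng.integers(0, 2, size=2000)     # label: 1 = face, 0 = not a face

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# On pure noise this scores about 0.5; on real labeled images it would do better.
print("held-out accuracy:", clf.score(X_test, y_test))
```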

    A lot of this looks like computer science discovering statistics.

    A large part is like Moneyball: the application of statistical methods to new areas.

    But that's not all.

    Many applications are high-dimensional statistical problems: cases with many more variables than observations.

    This type of high-dimensionality is relatively new. What is especially new is that there is now a set of practical tools for this "applied nonparametrics".
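
    A sketch of such a p >> n problem, with the lasso standing in for those practical tools (synthetic data, arbitrary tuning):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 100, 1000                        # far more variables than observations
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0                          # only 5 regressors actually matter
y = X @ beta + rng.normal(size=n)

# OLS is hopeless when p > n; the lasso's penalty recovers a sparse fit.
lasso = Lasso(alpha=0.3).fit(X, y)
print("nonzero coefficients:", int(np.sum(lasso.coef_ != 0)))
```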

    These models are data-hungry in interesting ways. The more observations I get, the better more variable-intensive specifications will fit.
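
    A rough illustration of that data hunger (synthetic nonlinear data; model choices are arbitrary): as n grows, a flexible, variable-intensive learner overtakes a simple linear fit out of sample.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
for n in [100, 1000, 10000]:
    X = rng.uniform(-2, 2, size=(n, 5))
    y = np.sin(X[:, 0]) * X[:, 1] + rng.normal(scale=0.3, size=n)  # nonlinear truth
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    ols = LinearRegression().fit(X_tr, y_tr).score(X_te, y_te)
    rf = RandomForestRegressor(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)
    print(f"n={n:>6}  OLS R^2={ols:5.2f}  forest R^2={rf:5.2f}")
```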

    Example: House prices.

    Improvement in R-squared from OLS to a better method: from 42% to 46%. Is that a lot? Do we care?
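
    The 42% and 46% come from the talk's real housing data; here is a hedged sketch of how such a comparison is typically run, on made-up data:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
sqm = rng.uniform(50, 400, n)    # living area (synthetic)
rooms = rng.integers(1, 8, n)    # number of rooms
age = rng.uniform(0, 100, n)     # building age
price = 2 * sqm + 10 * rooms - 0.5 * age + 0.02 * sqm * rooms + rng.normal(0, 80, n)
X = np.column_stack([sqm, rooms, age])

# Both models are scored on a held-out test set, as in the 42% vs 46% comparison.
X_tr, X_te, y_tr, y_te = train_test_split(X, price, random_state=0)
print("OLS R^2:", round(LinearRegression().fit(X_tr, y_tr).score(X_te, y_te), 3))
print("GBM R^2:", round(GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr).score(X_te, y_te), 3))
```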

    Difference between traditional econometric regressions and machine learning:

    The first fits Y to X in sample with low dimensionality; machine learning fits out of sample and with high dimensionality.

    Is machine learning just better?

    Machine learning is perfect for a pure prediction problem, like face recognition.

    Economists often care less about that & more about the causal role of specific covariates.

    E.g. if two covariates are correlated, we care about that for parameter estimation, but not for prediction.
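
    A minimal sketch of that difference in goals, on hypothetical data: the classical workflow reports coefficients and standard errors, while the ML workflow is judged only by held-out predictions.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=500)

# beta-hat: estimate parameters and their standard errors (inference)
fit = sm.OLS(y, sm.add_constant(X)).fit()
print("coefficients:", fit.params.round(2), "std errors:", fit.bse.round(2))

# y-hat: all that matters is prediction quality on data the model never saw
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
print("held-out R^2:", round(rf.score(X_te, y_te), 2))
```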

    In other words, the classical approach is about estimating beta-hat, while machine learning is about estimating y-hat.

    We'll now look at 3 types of application for economics:

    Application 1: machine learning in aid of parameter estimation.

    Experimental analysis. Imagine I ran an RCT & I have a large number of outcomes. I want to see if there's an overall effect.

    Better than adjustment for multiple hypothesis testing (Bonferroni, etc.): check whether the outcomes predict treatment better than chance. If yes, there's an effect. This requires fewer assumptions than what we typically need.

    There are also ways to get exact p-values here.
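
    A sketch of one way such a test could look (hypothetical data; this permutation scheme is my illustration, not necessarily the exact procedure from the talk):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, k = 400, 20
T = rng.integers(0, 2, size=n)   # treatment assignment
Y = rng.normal(size=(n, k))      # matrix of many outcomes
Y[:, 0] += 0.4 * T               # the treatment shifts one outcome

def cv_accuracy(outcomes, treat):
    """Cross-validated accuracy of predicting treatment from the outcomes."""
    return cross_val_score(LogisticRegression(max_iter=1000), outcomes, treat, cv=5).mean()

observed = cv_accuracy(Y, T)
null = [cv_accuracy(Y, rng.permutation(T)) for _ in range(200)]
p_value = (1 + sum(s >= observed for s in null)) / (1 + len(null))
print("observed CV accuracy:", round(observed, 3), "permutation p-value:", round(p_value, 3))
```

    Permuting the treatment labels breaks any real link to the outcomes, so the distribution of accuracies under permutation gives an exact finite-sample benchmark.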

    Treatment heterogeneity/subgroup analysis:

    We already know that the question of which group responds more is not a causal one.

    See work by Athey and Imbens on this.

    There are many parameter estimation problems that have machine learning aspects to them.

    Here's a list:

    Application type 2: When predictability itself is of economic interest.

    Example: 25% of health spending occurs in the last part of people's lives. If we could predict who would die despite the treatment, we would not need to spend this amount or make people undergo the treatment.

    Problem: so far we really cannot predict well at all who will die, even among people with very similar conditions.

    Would a better predictor fix this? No. When we have a base rate of dying of 5%, even with an excellent predictor we would mispredict many cases.
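
    A worked version of the base-rate point, taking a 90%-sensitive, 90%-specific predictor as an assumed (hypothetical) benchmark:

```python
# With a 5% base rate of dying, even a predictor that is 90% sensitive and
# 90% specific (assumed numbers) gets most of its positive calls wrong.
base_rate = 0.05
sensitivity = 0.90
specificity = 0.90

true_pos = base_rate * sensitivity               # 0.045 of all patients
false_pos = (1 - base_rate) * (1 - specificity)  # 0.095 of all patients
ppv = true_pos / (true_pos + false_pos)
print(f"Share of 'will die' predictions that are correct: {ppv:.0%}")  # ~32%
```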

    Another example: a lot of testing for heart attacks is wasted.

    The decision of whom to test is actually a prediction problem: I want to test the people for whom the test will find that they have a heart attack.

    Rather than looking at the average, we need to differentiate for whom the yield of testing is high and for whom it is not.

    There is under-testing of some people and over-testing of others.

    Problem: if you don't do the test, you don't know what the outcome of the test would have been.

    So evaluating predictions on that selected sample of tested patients is not sufficient.

    We need to look at what happened to the untested later on, and see whether they went on to have heart attacks.
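
    A sketch of that evaluation idea (column names and numbers are hypothetical):

```python
import pandas as pd

# Follow-up data on patients who were NOT tested (values are made up).
untested = pd.DataFrame({
    "predicted_risk":     [0.02, 0.03, 0.08, 0.15, 0.40, 0.55, 0.60, 0.70],
    "later_heart_attack": [0,    0,    0,    0,    1,    0,    1,    1],
})
# Bin untested patients by predicted risk and look at realized event rates.
untested["risk_bin"] = pd.qcut(untested["predicted_risk"], q=2, labels=["low", "high"])
print(untested.groupby("risk_bin", observed=True)["later_heart_attack"].mean())
# A high realized event rate among untested high-risk patients points to under-testing.
```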

    It turns out there is huge under-testing.

    It's a problem of sorting, of not predicting well whom to test, rather than a case of generalized over-testing.

    Getting people to cut overuse leads them to also cut high-yield use.
