1. Exquisite Tweets from @Undercoverhist, @agoodmanbacon

    Collected by Woody_WongE

    2017 HOPE special issue is finally out! Historians’ take on the so-called “empirical turn in economics.” read.dukeupress.edu/hope/issue/49/…

    Beatrice Cherrier (@Undercoverhist)

  2. This looks very interesting. It turns out that I have been reading some of the first papers to ever use "regressions" and they are SO INTERESTING. (Follow if you really want to know about old bin scatters and "arithmometers".)

    Beatrice Cherrier @Undercoverhist
    2017 HOPE special issue is finally out! Historians’ take on the so-called “empirical turn in economics.” read.dukeupress.edu/hope/issue/49/…

    Andrew Goodman-Bacon (@agoodmanbacon)

    I might get some of this early stuff wrong, so someone please correct me. The normal distribution itself was discovered by Gauss (or maybe de Moivre) in the 1700s. An astronomer named Bravais turns out to have made progress on the relationship between two normals in 1846.

    He apparently obtained the "regression line" relating two normal RVs: rho*sigma_1/sigma_2.

    (Bravais also has a mountain named after him: en.wikipedia.org/wiki/Bravaisbe…)

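That slope expression is easy to check numerically. A minimal sketch (my own illustration, not from the thread): simulate a bivariate normal pair and confirm that the sample least-squares slope of X1 on X2 sits near the closed form rho * sigma_1 / sigma_2 quoted above.

```python
import numpy as np

# Bivariate normal with chosen rho, sigma_1, sigma_2 (invented values).
rng = np.random.default_rng(0)
rho, sigma_1, sigma_2 = 0.6, 2.0, 0.5
cov = [[sigma_1**2, rho * sigma_1 * sigma_2],
       [rho * sigma_1 * sigma_2, sigma_2**2]]
x1, x2 = rng.multivariate_normal([0.0, 0.0], cov, size=200_000).T

# Sample OLS slope of x1 on x2: cov(x1, x2) / var(x2).
c = np.cov(x1, x2)
ols_slope = c[0, 1] / c[1, 1]

# Bravais/Pearson closed form: rho * sigma_1 / sigma_2 = 2.4 here.
closed_form = rho * sigma_1 / sigma_2
print(ols_slope, closed_form)
```

With 200,000 draws the two numbers agree to a couple of decimal places.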
    I'm going to totally gloss over Galton, which I realize is not great in a thread about regressions, but it is this normal regression coefficient that is used in follow-up work on heredity by Karl Pearson (1896). (Pearson calls Galton's work "epoch-making".)

    So now we get to the guy I'm interested in: Udny Yule. He wasn't happy with analyses confined to normals because, well, many things (including, as he notes, most economic things) aren't normal. But what is the "regression line" that relates two variables when they aren't normal?

    Yule first applies least squares (somewhat apologetically) to a bivariate relationship. I think, though, that he really wants the fitted line to match up with a bin scatter (which he can easily plot).

    He gets this for the slope (S here means summation, and he is for some reason using x as the outcome and y as the RHS variable):

    These formulae are literally how Yule calculated his regression coefficients. Apparently the variances/covariances weren't that easy to get (or data came in frequency tables). Be glad we don't have sections or appendices like this anymore...

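Yule's images of the formulae didn't survive the archive, but the calculation itself is the familiar one. A sketch on invented data, following his convention noted above (x as the outcome, y as the RHS variable): the slope is the sum of deviation cross-products over the sum of squared deviations of y, and it matches a modern least-squares routine.

```python
import numpy as np

# Made-up illustrative data: y is the regressor, x the outcome (Yule's
# convention as described in the thread).
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# Yule-style arithmetic from running sums of deviations:
# b = S(xy) / S(y^2), with x and y measured from their means.
dx = x - x.mean()
dy = y - y.mean()
slope = (dx * dy).sum() / (dy * dy).sum()
intercept = x.mean() - slope * y.mean()

# Same answer from a modern least-squares fit of x on y.
modern_slope, modern_intercept = np.polyfit(y, x, 1)
```

The hand-tabulated sums give exactly the coefficients a least-squares routine returns; the arithmometer work was in accumulating S(xy) and S(y^2) from frequency tables.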
    The big twist for me is that the application to which these new methods were put was to estimate the causal effect of welfare policy ("out-relief ratio" or the share of all recipients not served in an institution like an almshouse) on recipiency ("pauperism"). I do this all day.

    This, for example, is quite a nice description of the problems of causal inference in the social sciences. The goal, therefore, of this early regression analysis was to estimate causal effects.

    The result of Yule's 1897 paper is just one regression coefficient showing a positive relationship b/w the jurisdiction-level "pauperism" rate for men over 65 and the out-relief ratio. This was an improvement over more eyeball-type methods comparing time series.

    Nevertheless, in 1899 Yule returned w/ a new model (estimated in growth rates) that explicitly controlled for TWO variables: population, and the share elderly. He also did it for 1871-81 and 1881-91 and for 4 areas: rural, mixed, urban, metropolitan. The arithmometer was humming!

    This simple first-difference type regression yielded positive coefficients for the out-relief ratio for all specifications and sub-samples. Yule is quite bullish on the causal interpretation (oh how I wish I could give an 1899-era seminar).

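The 1899 specification can be sketched in a few lines. This is my own illustration on invented data with hypothetical variable names, not Yule's numbers: regress the change in pauperism on the change in the out-relief ratio, controlling for changes in population and in the share elderly, by least squares.

```python
import numpy as np

# Invented cross-section of n jurisdictions, all variables in changes
# (the first-difference / growth-rate setup described above).
rng = np.random.default_rng(1)
n = 200
d_outrelief = rng.normal(0.0, 1.0, n)
d_population = rng.normal(0.0, 1.0, n)
d_elderly = rng.normal(0.0, 1.0, n)

# Simulated outcome with a true out-relief coefficient of 0.5.
d_pauperism = (0.5 * d_outrelief + 0.2 * d_population
               + 0.1 * d_elderly + rng.normal(0.0, 0.3, n))

# Least squares with an intercept and the two controls.
X = np.column_stack([np.ones(n), d_outrelief, d_population, d_elderly])
beta, *_ = np.linalg.lstsq(X, d_pauperism, rcond=None)
# beta[1] is the out-relief coefficient, close to the true 0.5 here.
```

Yule ran this kind of regression by hand for each decade and area type, which is why the arithmometer mattered.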
    This gets me to the thing that reminded me of these papers in the first place, which was @ProfNoto's thread on figures. Yule notices some heterogeneity in the out-relief coefficients and makes this figure plotting each of the 8 coefficients against the mean X in each sample.

    While that figure maybe doesn't blow our minds today, it was beyond the frontier in 1899. Yule walked (cranked on his slide rule) so we could run (plot event-studies or big-data non-parametric bin-scatters).

    ugh, I forgot the arithmometer part. Arithmometers were apparently mechanical calculators dating back to 1820, and the Brunsviga was a relatively recent model. The Stata 15 of 1899.

    en.wikipedia.org/wiki/Odhner_Ar…
