This looks very interesting. It turns out that I have been reading some of the first papers to ever use "regressions" and they are SO INTERESTING. (Follow if you really want to know about old bin scatters and "arithmometers".)
I might get some of this early stuff wrong, so someone please correct me. The normal distribution itself was derived by de Moivre in the 1730s (and later, famously, by Gauss). An astronomer named Bravais turns out to have made progress on the relationship between two normals in 1846.
He apparently obtained the slope of the "regression line" relating two normal RVs: rho*sigma_1/sigma_2.
(Bravais also has a mountain named after him: en.wikipedia.org/wiki/Bravaisbe…)
I'm going to totally gloss over Galton, which I realize is not great in a thread about regressions, but it is this normal regression coefficient that is used in follow-up work on heredity by Karl Pearson (1896). (Pearson calls Galton's work "epoch-making".)
So now we get to the guy I'm interested in: Udny Yule. He wasn't happy with analyses confined to normals because, well, many things (including, as he notes, most economic things) aren't normal. But what is the "regression line" that relates two variables when they aren't normal?
Yule first applies least squares (somewhat apologetically) to a bivariate relationship. I think, though, that he really wants the fitted line to match up with a bin scatter (which he can easily plot).
He gets this for the slope (S here means summation, and he is, for some reason, using x as the outcome and y as the RHS variable):
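In modern terms, and keeping Yule's swap (x as outcome, y as regressor), that least-squares slope is S(xy)/S(y^2), with the sums taken over deviations from the means. A minimal sketch of the arithmetic, with made-up numbers rather than Yule's data:

```python
# Least-squares slope in Yule's notation: x is the outcome, y the regressor.
# S(.) denotes summation over observations, taken in deviations from means.

def yule_slope(x, y):
    """Slope of the regression of x on y: S(xy) / S(y^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / s_yy

# Illustrative data (not Yule's): x rises exactly 2 units per unit of y.
y = [1.0, 2.0, 3.0, 4.0]
x = [3.0, 5.0, 7.0, 9.0]
print(yule_slope(x, y))  # 2.0
```

Yule would have accumulated S(xy) and S(y^2) by hand (or arithmometer) from frequency tables; the algebra is the same.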
These formulae are literally how Yule calculated his regression coefficients. Apparently the variances/covariances weren't that easy to get (or data came in frequency tables). Be glad we don't have sections or appendices like this anymore...
The big twist for me is that the application to which these new methods were put was to estimate the causal effect of welfare policy ("out-relief ratio" or the share of all recipients not served in an institution like an almshouse) on recipiency ("pauperism"). I do this all day.
This, for example, is quite a nice description of the problems of causal inference in the social sciences. The goal, therefore, of this early regression analysis was to estimate causal effects.
The result of Yule's 1897 paper is just one regression coefficient showing a positive relationship b/w the jurisdiction-level "pauperism" rate for men over 65 and the out-relief ratio. This was an improvement over more eyeball-type methods comparing time series.
Nevertheless, in 1899 Yule returned w/ a new model (estimated in growth rates) that explicitly controlled for TWO variables: population and the share elderly. He also did it for 1871-81 and 1881-91 and for 4 areas: rural, mixed, urban, metropolitan. The arithmometer was humming!
This simple first-difference-type regression yielded positive coefficients for the out-relief ratio in all specifications and sub-samples. Yule is quite bullish on the causal interpretation (oh how I wish I could give an 1899-era seminar).
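A specification of that flavor is easy to sketch today. Everything below is illustrative: the variable names and simulated data are mine, not Yule's, and the coefficients are ones I planted so the fit recovers them exactly:

```python
import numpy as np

# Sketch of a Yule-style growth-rate regression: change in pauperism on
# change in out-relief ratio, controlling for changes in population and
# share elderly. All data here are simulated for illustration.
rng = np.random.default_rng(0)
n = 200
d_outrelief = rng.normal(size=n)
d_pop = rng.normal(size=n)
d_elderly = rng.normal(size=n)
# Build the outcome from known coefficients with no noise, so least
# squares recovers them exactly.
d_pauperism = 1.0 + 0.75 * d_outrelief + 0.2 * d_pop + 0.1 * d_elderly

X = np.column_stack([np.ones(n), d_outrelief, d_pop, d_elderly])
beta, *_ = np.linalg.lstsq(X, d_pauperism, rcond=None)
# beta ≈ [1.0, 0.75, 0.2, 0.1]: intercept, out-relief, population, elderly
```

Yule, of course, solved the normal equations by hand for each of his 8 period-by-area cells.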
This gets me to the thing that reminded me of these papers in the first place, which was @ProfNoto's thread on figures. Yule notices some heterogeneity in the out-relief coefficients and makes this figure plotting each of the 8 coefficients against the mean X in each sample.
While that figure maybe doesn't blow our minds today, it was beyond the frontier in 1899. Yule walked (cranked on his arithmometer) so we could run (plot event studies or big-data non-parametric bin scatters).
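For the curious, the modern bin scatter that this figure anticipates takes only a few lines: bin the x variable, then take the mean of y within each bin. The data here are simulated for illustration:

```python
import numpy as np

# A minimal bin scatter: split x into equal-width bins and compute the
# within-bin means of x and y (these pairs are what you would plot).
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=1000)
y = 2.0 * x + rng.normal(size=1000)  # true slope of 2, plus noise

n_bins = 8
edges = np.linspace(x.min(), x.max(), n_bins + 1)
idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
bin_x = np.array([x[idx == b].mean() for b in range(n_bins)])
bin_y = np.array([y[idx == b].mean() for b in range(n_bins)])
# The noise averages out within bins, so bin_y closely tracks 2 * bin_x.
```

With ~125 points per bin, the binned means trace the underlying line far more cleanly than the raw scatter.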