Statistics, Eugenics, and Me
A personal reckoning of my failure to acknowledge the origins of my field
This week marks Holocaust Memorial Day and the end of my five years studying in the Statistics department of University College London (UCL), which was the home, for decades, of the most prominent eugenicists.
Until this week, it didn’t bother me that I studied the work of Karl Pearson and Ronald Fisher in the Galton Lecture Theatre and Pearson Building. But it should have done.
Much of my family was murdered in the Holocaust as part of a regime to eradicate a supposed “inferior race”. Yet I still viewed Pearson, Fisher, and Galton (and others) as the Fathers of Statistics who deserved to be recognised and respected for their contributions. I had naïvely assumed they were a product of their time and that their research was a natural progression in the statistics behind genetics. This was not the case.
Note: This is a personal article that I wrote in order to hold my own naïvety to account. However, this is aimed at other statisticians and data scientists as I am surely not the only person who failed to consider the true origins of our field.
Sir Francis Galton
Francis Galton, “the founder of the field of behavioral and educational statistics”:¹
- Pioneered the use of questionnaires
- Discovered regression to the mean
- Re-discovered correlation and regression and discovered how to apply these in anthropology, psychology, and more
- Defined the concept of standard deviation
Galton is considered so fundamentally important to the development of statistics that many will deliberately overlook his establishment of the field of Eugenics in 1883.
A product of his time?
In reviewing four books about the life of Galton, Clauser (2007)¹ uses this argument to criticise one book:
“The volume is also limited by Brookes’s tendency to see every aspect of Galton’s life in relation to his views on eugenics. Brookes fails to place Galton’s views in the context of the times in which he lived…Although Galton’s views of the indigenous populations that he encountered in Africa might well be seen as enlightened by Victorian standards, Brookes views them with a 21st-century perspective and finds evidence of Galton’s intolerance.”
The ‘product of their time’ argument has long been used to excuse bigotry. To put this in perspective: Galton was inspired by his cousin Charles Darwin’s work, On the Origin of Species. Darwin could be called a product of his time, as he used words like ‘savages’ to describe certain populations. However, Darwin spoke openly against racism and did not promote or contribute to any of Galton’s work on eugenics.² If there is any doubt about the innate racism in Galton’s ‘eugenics’, below is his reasoning for coining the term:
We greatly want a brief word to express the science of improving stock, which…takes cognisance of all influences that tend in however remote a degree to give the more suitable races or strains of blood a better chance of prevailing speedily over the less suitable than they otherwise would have had.³
And if there is still some doubt about Galton’s views, he also wrote:
There exists a sentiment, for the most part quite unreasonable, against the gradual extinction of an inferior race.⁴
If, on my first day of university, sitting in the Galton Lecture Theatre, I had been told that Galton considered opposition to the extinction of an ‘inferior race’ to be ‘for the most part quite unreasonable’, I might have felt less comfortable with his name and picture around me. My research fundamentally relies on Galton’s work, and I am indebted to him. That does not mean I needed to hear and say his name every day for three years.

Karl Pearson
Karl Pearson was Galton’s protégé. Amongst his many notable achievements, he:
- Developed hypothesis testing
- Developed the use of p-values
- Defined the Chi-Squared test
- Introduced the Method of Moments
An antisemite
In the year Mein Kampf was published, Pearson wrote about the Jewish population:
“[they] will develop into a parasitic race…Taken on the average, and regarding both sexes, this alien Jewish population is somewhat inferior physically and mentally to the native population.”⁶
And when Hitler was made Chancellor, Pearson made a point to say:
Even at the present day there are far too many general impressions drawn from limited or too often wrongly interpreted experience, and far too many inadequately demonstrated and too lightly accepted theories for any nation to proceed hastily with unlimited Eugenic legislation.⁷
Which may appear reasonable until immediately followed by his caveat:
This statement, however, must never be taken as an excuse for indefinitely suspending all Eugenic teaching and every form of communal action in matters of sex.⁷
I concluded my undergraduate degree with a presentation that fundamentally depended on the work of Pearson, in a small classroom in the Pearson Building. Pearson, the first Chair of the Department of Eugenics at UCL, has contributed to my life and my work in a way that I cannot and will not ignore. His contributions were essential to the field of Statistics, and many others. But again I am left wondering whether I would have felt as comfortable walking through the Pearson Building if I had been given more context about the man it was named after.

Sir Ronald Fisher
Fisher’s work in statistics established and promoted many important methods of statistical inference. His contributions include:
- Establishing p = 0.05 as the normal threshold for significant p-values
- Promoting Maximum Likelihood Estimation
- Developing the ANalysis Of VAriance (ANOVA)
- The iris dataset (this seems an incredibly minor contribution but I use it daily)⁹
Like Pearson and Galton, Fisher was revered. When asked who the “greatest biologist since Darwin” was, Richard Dawkins nominated Fisher¹⁰ (this may not be surprising given Dawkins’ own publicised views). Bodmer, one of Fisher’s students, wrote a brief but glowing biography of Fisher’s life and described Fisher as “sweet” and “nice”.¹¹ Neither mentioned eugenics.
On the Nazi eugenicist Otmar Freiherr von Verschuer, Fisher wrote:
In spite of their prejudices I have no doubt also that the Party sincerely wished to benefit the German racial stock, especially by the elimination of manifest defectives, such as those deficient mentally, and I do not doubt that von Verschuer gave, as I should have done, his support to such a movement.¹²
I did not come across Fisher’s name until my third year at UCL, once again lauded as a great statistician. Needless to say, genocide was not discussed.
Buildings and Statements
As put in a recent article, discussing buildings named after people:
The right way to understand them and their ideas is through a properly contextualised display in a museum, not through an uncommented memorial that conceals more than it reveals.¹³
The same is true of sweeping statements about people. I assume that when Dawkins described Fisher as the “greatest biologist since Darwin”, this was based purely on output and contributions to the field. However, the word “greatest”, like a building named after Fisher, includes a judgement about the person themselves. It implicitly assumes this person was “great”, and we automatically assume (though this is not part of the definition) that “great” means “good”. Sweeping statements, as with uncommented memorials, conceal more than they reveal.
The Fathers of Statistics and Eugenics
Denaming buildings and lecture theatres is easy. It’s a step that should have been taken a long time ago, but I am not criticising UCL. To their credit, they spent years gathering information and putting out surveys to hear everyone’s feedback and opinions.
To work in statistics or data science is to work in the shadow of eugenics. For five years, I was not even aware that I was under this shadow and I am deeply ashamed. But I cannot blame myself. I used the equations I was provided and studied the methods I was told to; I didn’t consider the names attached to those methods. I believe I am likely in the majority amongst mathematicians and statisticians. This is not okay. I believe that all Statistics courses should include, at the very least, one lecture on the full history of Statistics and its “Fathers”.
Success from failure
Even when buildings are renamed, I will still (daily) be using ‘Pearson’s Chi-Squared test’ and ‘Fisher’s information’. Their names are stuck with me forever and now so are their beliefs. Whilst I will respect and even be inspired by the contributions of these men to statistics, I can now talk about them without glorifying their characters or inadvertently condoning their beliefs. I will utilise the methods of the past to promote the positive, good, and ethical choices of the future. Learning where others failed will help generations of statisticians learn how to succeed.
Products of our time
Unfortunately, I can only end this on a warning. Whilst it is unlikely (though not impossible) that eugenics will re-emerge in modern politics, statistics and data are being manipulated now more than ever.
From naïve misuses of machine learning that return erroneous results, to the malicious manipulation of raw data, the field is being tested. From governments to individuals, Russia to Palo Alto, anti-maskers to anti-vaxxers, the full consequences of data and statistics malpractice remain unknown.
Sitting idly by as this happens will make us ‘products of our time’. This is not good enough. Data Science needs more regulation. Doctors have the Hippocratic Oath, so why don’t we have the Nightingale Oath: “Manipulate no data nor results. Promote ethical uses of statistics. Only train models you understand. Don’t promote Eugenics”?
References
¹Clauser BE. The Life and Labors of Francis Galton: A Review of Four Recent Books About the Father of Behavioral Statistics. Journal of Educational and Behavioral Statistics. 2007;32(4):440–444. doi:10.3102/1076998607307449
²https://medium.com/science-and-philosophy/charles-darwin-on-racism-slavery-and-eugenics-cb6416b8277c
³Galton. Inquiries Into Human Faculty and Its Development. Macmillan. 1883. p. 24.
⁴Charny, Israel W.; Adalian, Rouben Paul; Jacobs, Steven L.; Markusen, Eric; Sherman, Marc I. (1999). Encyclopedia of Genocide: A-H. ABC-CLIO. p. 218. ISBN 978-0-87436-928-1.
⁵https://imagestore.ucl.ac.uk/imagestore/start/images?view=preview&fuid=UCL%20Library/Library%20Events/Galton%20Exhibition%20Opening/Galton_Exhb_0025.tif
⁶Pearson, Karl; Moul, Margaret (1925). “The Problem of Alien Immigration into Great Britain, Illustrated by an Examination of Russian and Polish Jewish Children”. Annals of Eugenics. I (2): 125–126. doi:10.1111/j.1469-1809.1925.tb02037.x.
⁷Pearson, Karl (1933). “VALE!”. Annals of Eugenics. 5 (4): 416. doi:10.1111/j.1469-1809.1933.tb02102.x.
⁸https://bartlett100.com/article/bartlett-buildings-the-pearson-building.html
⁹https://www.garrickadenbuie.com/blog/lets-move-on-from-iris
¹⁰https://www.edge.org/conversation/who-is-the-greatest-biologist-of-all-time
¹¹Walter Bodmer, RA Fisher, statistician and geneticist extraordinary: a personal view, International Journal of Epidemiology, Volume 32, Issue 6, December 2003, Pages 938–942, https://doi.org/10.1093/ije/dyg289
¹²After the Fall: Political Whitewashing, Professional Posturing, and Personal Refashioning in the Postwar Career of Otmar Freiherr von Verschuer. Sheila Faith Weiss. Isis 2010 101:4, 722–758
¹³https://www.newstatesman.com/international/science-tech/2020/07/ra-fisher-and-science-hatred
¹⁴https://en.wikipedia.org/wiki/Florence_Nightingale#/media/File:Florence_Nightingale_(H_Hering_NPG_x82368).jpg