Yesterday’s post on race and high achievement produced some pushback from commenters who defended “race realism.” I want to engage further.
Let’s stipulate that in order to perform at the highest levels in some realms, such as chess or economics, one needs an IQ of 145 or higher, which is three standard deviations above the mean for the white population (the mean is 100, and each standard deviation is 15). And assume that the mean IQ in the black population is 85, so that to achieve at the highest level, a black American must be four standard deviations above his group’s mean. Taking into account that there are roughly 6 times as many whites as blacks in the U.S. population, what should we expect the ratio of whites to blacks performing at the highest levels in these realms to be?
Consider two ways to make this calculation.
(a) Survey very large numbers of blacks and whites to arrive at estimates of the numbers in each racial category that have an IQ of 145 or higher.
(b) Assume that the formula for the normal distribution applies 3 and 4 standard deviations out, and use that to estimate the ratio of whites to blacks with an IQ of 145 or higher.
I would regard method (a) as reliable, as long as the samples were large enough. Method (b), by contrast, rests on a very risky assumption: that distributions which seem to follow the normal formula near the mean also follow it at the extremes.
Method (b) is often erroneous. In finance, Nassim Taleb and many others have pointed out the poor performance of calculations that apply the normal distribution at the extremes. Events that, by those calculations, should occur less than once in a thousand years have occurred repeatedly in our lifetimes. Phenomena that appear to be normally distributed close to the mean often are not normally distributed across the full range, so normal approximations of tail probabilities break down at the extremes.
Self-described “race realists” are relying on method (b) to estimate the ratio of whites to blacks with an IQ of 145 or higher. But we do not know whether the normal formula truly applies when we are that many standard deviations from the mean.
Relying on the normal formula gives .00135 as the probability of being more than 3 standard deviations above the mean, and about .0000317 as the probability of being more than 4 standard deviations above it, a figure small enough that many online calculators simply display "0." The ratio of the first tail to the second is therefore about 42. Please correct me if you find a more accurate calculation.
With 6 times as many whites as blacks, method (b) would predict 42 × 6 ≈ 250 times as many ultra-high-achieving whites as blacks in the United States.
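For anyone who wants to check the arithmetic, the tail probabilities need nothing beyond the standard library's `erfc`; no special calculator is required. This is a sketch of method (b) itself, using the post's stipulated means and SDs rather than any estimates of mine:

```python
# A sketch of method (b) using only the standard library: P(Z > z) via erfc.
# The means and SDs (100 and 85, SD 15) are the post's stipulated values.
from math import erfc, sqrt

def tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))

p_white = tail(3.0)  # IQ 145 is 3 SDs above a mean of 100
p_black = tail(4.0)  # IQ 145 is 4 SDs above a mean of 85

print(round(p_white, 5))             # 0.00135
print(round(p_white / p_black, 1))   # 42.6 -- the "about 42" above
print(round(6 * p_white / p_black))  # 256 -- roughly the 250 above
```

The same `tail` function works for any threshold, which makes it easy to test how sensitive the 250 figure is to the stipulated inputs.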
When I look around the economics profession, I do not see 250 whites for every ultra-high achieving black. As I pointed out in the previous post, there are 2 black winners of the Clark medal. There are fewer than 100 white winners, not 500.
On the other side, consider the Jewish American population. Take the same 145 threshold and suppose the Jewish mean IQ is one standard deviation higher than the non-Jewish mean, i.e., 115 versus 100. Then 145 is 2 standard deviations out for Jews and 3 for non-Jews, and the normal formula gives a tail ratio of about 17. If Jews and non-Jews were equal in number, there would be about 17 Jews for every non-Jew in an ultra-high achievement category. But since Jews are only about 2.5 percent of the population, the prediction works out to roughly 2 Jews for every 5 non-Jews, or about 30 percent of the category. In fact, I think that the number is much lower. For example, Jews are only 8 percent of American billionaires.
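Running the same tail calculation on the Jewish-American example is worth doing explicitly, since the relevant comparison is 2 versus 3 standard deviations, not 3 versus 4, and the inputs (mean 115 versus 100, SD 15, threshold 145, 2.5 percent population share) are stipulations rather than measurements:

```python
# The Jewish-American example under the same normal-model stipulations:
# means 115 vs 100, SD 15, threshold 145, 2.5% population share.
from math import erfc, sqrt

def tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))

p_jewish = tail((145 - 115) / 15)  # 2 SDs out for the higher-mean group
p_other = tail((145 - 100) / 15)   # 3 SDs out for the lower-mean group

per_capita = p_jewish / p_other              # ratio at equal population sizes
odds = 0.025 * p_jewish / (0.975 * p_other)  # after the 2.5% population share
share = odds / (1 + odds)                    # predicted share of the category

print(round(per_capita))  # 17
print(round(share, 2))    # 0.3
```

So under these assumptions the model predicts Jews would be roughly 30 percent of an ultra-high achievement category, against which the 8-percent-of-billionaires figure can be compared.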
If these examples are representative, then it appears that “race realism” over-predicts racial differences in the proportion of very high achievers. If so, this could be because IQ does not continue to follow the normal distribution at the extremes. And it could be because factors other than IQ are important for very high achievement.
If you want to assume that the normal distribution applies to IQ at the extremes, you can do that. And if you want to assume that IQ determines outcome in some realm, you can do that. But you are using a model to arrive at your conclusions, and people are entitled to question the assumptions in your model. In describing your analysis, I would come up with some term less self-assured than “realism.”
UPDATE: A commenter points to this analysis, by Taleb. Worth looking at. I worry that Taleb’s rhetoric goes too far, making it sound like IQ should be ignored altogether. I would say that the Gaussian assumptions about IQ and its relationship to outcomes are a map, not the territory. It is better to rely on observed data, not on assumptions.
Kind of boggled that Arnold spends so much time wailing about how women have ruined the world based on nothing more than his feeeeeeeeeeeeelings about what feminization must mean only to write two absurdly naive posts questioning incredibly consistent data on race and IQ.
So here are two very clear, easily established data points that anyone can look up to establish what Amy Wax means by "hardly any blacks."
The black average LSAT score is 141.7 with an SD of 8.97. That means that a 4 SD score for a black person is 177.5.
The white average LSAT score is 153.18 with an SD of 9.27. A 3 SD score for a white person is about 181.
If you consider that a 175 is generally considered necessary to get into an elite law school (well, for whites and Asians, anyway): a 175 is at about the 99.1st percentile for whites and, according to an online calculator, at the 100th percentile for blacks. That is, under the normal model the probability that a black test-taker scores a 175 is about 1 in 10,000, functionally 0.
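The comment's percentile claims can be checked directly under a normal model, from the means and SDs quoted above (which I have not independently verified):

```python
# Checking the comment's LSAT percentiles under a normal model, using the
# means and SDs quoted in the comment.
from math import erfc, sqrt

def tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))

white_mean, white_sd = 153.18, 9.27
black_mean, black_sd = 141.70, 8.97

z_white = (175 - white_mean) / white_sd  # about 2.35 SDs out
z_black = (175 - black_mean) / black_sd  # about 3.71 SDs out

print(round(1 - tail(z_white), 3))  # 0.991: the 99.1st percentile for whites
print(round(tail(z_black), 5))      # 0.0001: about 1 in 10,000 for blacks
```

Of course, all of the post's caveats about normal tails apply with full force to a 3.7 SD extrapolation like this one.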
The top 5 law schools are 11% black, with a high at Columbia and Harvard of over 30% and a low at Yale and Cornell of 11%. If they were accepting purely on LSAT scores, there would be functionally zero, although I imagine there would occasionally be an outlier.
Another stat:
Generally speaking, a 1400 SAT score is considered necessary for elite college acceptance, say the top 50 schools. This is equivalent to a 31 on the ACT.
The SAT is very obliging on scores and race: 2,711 blacks scored a 1400 or higher in 2019, while nearly 76,000 whites did. The ACT is not revealing at all, and some students take both tests, so I took the number of students getting a 31 or over, assumed the same racial distribution as the SAT, and then assumed a 15% overlap. These are made-up numbers, but close enough.
So the estimated combined population of 1400-SAT/31-ACT scorers is about 117K white and a little under 4,200 black.
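The pooling step the comment describes can be sketched as follows. The SAT counts are the ones quoted above; the ACT totals are placeholders I invented purely to show the arithmetic, since the comment itself calls its inputs "made up numbers but close enough":

```python
# A sketch of the comment's SAT/ACT pooling method. SAT counts are the ones
# quoted in the comment; the ACT counts below are hypothetical placeholders
# chosen only to illustrate the overlap adjustment.
sat_1400 = {"white": 76_000, "black": 2_711}

# Hypothetical: ACT 31+ scorers by race, before removing double-takers.
act_31 = {"white": 48_000, "black": 1_750}

overlap = 0.15  # assumed share of ACT high scorers who also hit 1400 on the SAT

combined = {race: sat_1400[race] + round(act_31[race] * (1 - overlap))
            for race in sat_1400}
print(combined)  # about 117K white and a little under 4,200 black
```

With these placeholder ACT counts the totals land near the comment's figures, but the 15% overlap assumption is doing real work and is worth stress-testing.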
If you just go down the colleges by ranking and count how many blacks and whites they actually accepted, the schools have collectively accepted nearly 4,400 blacks by the 28th-ranked school, meaning they are completely out of blacks with the necessary score. Meanwhile, they've used less than 25% of the whites with that score (and only about 18% of the Asians).
I've done the calculations down to the 36th school on the list, and they've already accepted 150% of the blacks with a 1400, 100% of Hispanics, and only 38% and 27% of whites and Asians, respectively.
I'm not sure where the data will end up, but assume for now that the top 100 ranked colleges (or more) could effortlessly fill themselves with whites and Asians scoring 1400 or higher, while those schools combined would have only about 4,200 blacks and 17,000 Hispanics with that score to draw on.
In reality, by the time you get to the 30th ranked school, they're taking close to 70% of their kids from below the 1400 standard. Such is the bullshit that the fiction of grades allows.
Once again: if colleges used a simple metric with no grades to lie about, the top 100 schools would be overwhelmingly white, somewhat less Asian, and there'd be 4000 blacks total.
The skew is huge.
Some quick points.
1.a. The most important statistic about intelligence-related statistics is that the intelligence threshold for being consistently competent at doing statistics is amazingly high.
1.b. The second most important statistic about intelligence-related statistics is that, ironically, no cognitive capability makes a more powerful case against the statistically robust 'single factor' thesis of intelligence than the capability to be consistently competent at doing statistics. That thesis is hard to square with what we observe: some people just 'grok' statistics, and some people can't seem to grok it and so keep fouling up, no matter how good at other maths they are. Andrew Gelman's blog will provide you with Exhibits A through ZZZ, heh. "Thinking in terms of exponential growth" is also apparently very unnatural, though not as uncorrelated with high intelligence as advanced statistics, which, as I mentioned in a comment on another post, is a surprisingly young field. The point is, the 'skill stack' of being both above the 1.a threshold and also having the mysterious 1.b gift from Athena is really rare, and if you are not getting numbers from someone like that, you are reasonable in putting your shields up.
2. There are two ways people are using the normal distribution to argue these matters.
2.a. The first way is to try to make bold claims or predictions about human reality using math. These people usually report numbers that are too precise, too confidently (i.e., without showing big confidence intervals), because they are not being honest or accurate about the wide error bands involved in extrapolating to extremes, or about propagating those errors through the steps of the calculation in the statistically appropriate manner. It's reasonable to be skeptical of the precision of these numbers, though usually not of whether they are in the right order-of-magnitude ballpark.
2.b. But the second way is merely to demonstrate the natural fact that, for anything with something like a normal distribution, when you compare the proportional representation of two sets with different means above some threshold, that ratio changes and tends to grow rapidly the farther the threshold gets from the means. And you have to make this demonstration over and over and over, both because of 1.b and the 'unnatural' feel of it for most people, and also because the left's Narrative and political formula adamantly deny this fact and the impersonal, blameless naturalness of these proportions.
There is simply no better way to start a rigorous defense against those antisocial libels than to demonstrate by means of statistical principles that it is not merely theoretically possible but in fact *the usual case* that disparities in set representation will be much larger at the extremes than in the center. All the caveats about things not actually being normally distributed don't really chip away at this insight because it's robust against all kinds of modifications you could make to the basic bell curve model that aren't completely bizarre and unnatural mathematical constructs.
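Point 2.b can be shown in a few lines. Using the post's stipulated IQ parameters (means 100 and 85, SD 15) purely for illustration, the per-capita ratio above a moving cutoff grows quickly as the cutoff moves out:

```python
# Point 2.b in miniature: two normal distributions with the same SD but
# different means, and the ratio of their shares above a moving cutoff.
# Parameters are the post's stipulated IQ figures, used only to illustrate.
from math import erfc, sqrt

def tail_above(cutoff, mean, sd):
    """Share of a Normal(mean, sd) population scoring above cutoff."""
    return 0.5 * erfc((cutoff - mean) / (sd * sqrt(2)))

for cutoff in (100, 115, 130, 145):
    ratio = tail_above(cutoff, 100, 15) / tail_above(cutoff, 85, 15)
    print(cutoff, round(ratio, 1))
# The per-capita ratio grows from about 3.2 at the higher group's mean to
# about 42.6 three SDs out, even though both curves are "normal" everywhere.
```

The direction of the effect, a small mean gap producing large tail disparities, survives many perturbations of the model, even if the exact magnitudes do not.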
Here is the point about statistics. With the political and legal logic of 'disparate impact', the left is making a *mathematical* claim that the very existence of these large disparities are red flags which provide *strong statistical evidence* of unjust discrimination, and that the existence of this discrimination can be presumed with the burden of proving innocence and the absence of discrimination properly being shifted to the defendant.
That's bad enough, but could be worse under a standard of "strict liability" for which one can make no defense. But even worse than a strict liability regime that at least tells people there is strict liability, is a strict liability regime that *lies* about that and pretends to be merely a "rebuttable presumption" / "prove your innocence" regime.
But *how* can you prove your innocence when the adjudication system has deemed the statistical truths that must constitute the very heart of any such defense to be inadmissible untruths? To call this a rigged game is an understatement. And lately, this kind of dishonest and time-wasting invitation to appeal or apply, with the implication that it is actually possible to prevail when in fact the rejection was decided from the start, has become alarmingly common throughout USG.
But again, the *statistical* point here is that there is simply no good alternative except for courageous and influential people using whatever protections or positions they enjoy to prove the truth about it over and over and over until just maybe it becomes embarrassing for a purportedly educated person to reflexively reject it in the manner of a proud know-nothing. Murray hoped for this once, especially as better genetic insights were discovered and accumulated, but alas it has not yet come to pass, as politically-indispensable beliefs have become impervious to counterargument in what passes for our 'intellectual' culture.
3. "I would regard method (a) as reliable, as long as the samples were large enough." - We have had many huge, reliable samples for a long time. The results of method (a) are easily available for anyone to look at, often for free. We even have some solid genetics results now (e.g., Stephen Hsu's work, and Connor and Pesta: https://www.biorxiv.org/content/10.1101/2021.05.14.444173v2.full.pdf). These are the opposite of secrets.
Look, I totally get why it has to be done sometimes. And I'm certainly not going to do what the progressives do and yell, "The Science Is Settled! SETTLED!!!" But it is simply not accurate to write about these matters as if this kind of research has not yet been done or not in sufficient amount and thus that there is still so much empirical uncertainty about these matters that really nobody can say anything with any confidence yet and it's reasonable to stay agnostic or on the fence about it, etc., etc. That is just not the case.