General intelligence, social privilege, and causal inference from factor analysis
by Sebastian Benthall
I came upon this excellent essay by Cosma Shalizi about how factor analysis has been spuriously used to support the scientific theory of General Intelligence (i.e., IQ). Shalizi, if you don’t know, is one of the best statisticians around. He writes really well and isn’t afraid to point out major blunders in things. He’s one of my favorite academics, and I don’t think I’m alone in this assessment.
First, a motive: Shalizi writes this essay because he thinks the scientific theory of General Intelligence, or a g factor that is some real property of the mind, is wrong. This theory is famous because (a) a lot of people DO believe in IQ as a real feature of the mind, and (b) a significant percentage of these people believe that IQ is hereditary and correlated with race, and (c) the ideas in (b) are used to justify pernicious and unjust social policy. Shalizi, being a principled statistician, appears to take scientific objection to (a) independently of his objection to (c), and argues persuasively that we can reject (a). How?
Shalizi’s point is that the general intelligence factor g is a latent variable that was supposedly discovered using a factor analysis of several different intelligence tests that were supposed to be independent of each other. You can take the data from these data sets and do a dimensionality reduction (that’s what factor analysis is) and get something that looks like a single factor, just as you can take a set of cars and do a dimensionality reduction and get something that looks like a single factor, “size”. The problem is that “intelligence”, just like “size”, can also be a combination of many other factors that are only indirectly associated with each other (height, length, mass, mass of specific components independent of each other, etc.). Once you have many different independent factors combining into one single reduced “dimension” of analysis, you no longer have a coherent causal story of how your general latent variable caused the phenomenon. You have, effectively, correlation without demonstrated causation and, moreover, the correlation is a construct of your data analysis method, and so isn’t really even telling you what correlations normally tell you.
To put it another way: the fact that some people seem to be generally smarter than other people can be due to thousands of independent factors that happen to combine when people apply themselves to different kinds of tasks. If some people were NOT seeming generally smarter than others, that would allow you to reject the hypothesis that there was general intelligence. But the mere presence of the aggregate phenomenon does not prove the existence of a real latent variable. In fact, Shalizi goes on to say, when you do the right kinds of tests to see if there really is a latent factor of ‘general intelligence’, you find that there isn’t any. And so it’s just the persistent and possibly motivated interpretation of the observational data that allows the stubborn myth of general intelligence to continue.
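To make the argument above concrete, here is a minimal simulation sketch. It is not Shalizi's own code. It assumes a toy setup in the spirit of the argument: each simulated person has several hundred statistically independent "abilities", and each "test" simply sums a random subset of them. Every name and parameter (`simulate_tests`, `n_abilities`, `abilities_per_test`, the sample sizes) is an illustrative assumption rather than anything taken from the essay.

```python
import numpy as np

def simulate_tests(n_people=2000, n_abilities=500, n_tests=8,
                   abilities_per_test=200, seed=0):
    """Score people on tests that each draw on a random subset of
    many independent abilities (all parameters are hypothetical)."""
    rng = np.random.default_rng(seed)
    # Abilities are independent by construction: no general factor exists here.
    abilities = rng.standard_normal((n_people, n_abilities))
    scores = np.empty((n_people, n_tests))
    for t in range(n_tests):
        used = rng.choice(n_abilities, size=abilities_per_test, replace=False)
        scores[:, t] = abilities[:, used].sum(axis=1)
    return scores

scores = simulate_tests()
n_tests = scores.shape[1]

# Correlations between tests, and the share of total variance
# captured by the leading eigenvector of the correlation matrix.
corr = np.corrcoef(scores, rowvar=False)
off_diag = corr[~np.eye(n_tests, dtype=bool)]
eigvals = np.linalg.eigvalsh(corr)      # returned in ascending order
share = eigvals[-1] / eigvals.sum()

print("smallest between-test correlation:", off_diag.min())
print("variance share of the first factor:", share)
```

In this sketch every pair of tests comes out positively correlated and one dominant factor appears, even though nothing like a single underlying cause was put into the data. The tests correlate only because they happen to share some of the same independent ingredients.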
Are you following so far? If you are, it’s likely because you were already skeptical of IQ and its racial correlates to begin with. Now I’m going to switch it up though…
It is fairly common for educated people in the United States (for example) to talk about the “privilege” of social groups. White privilege, male privilege–don’t tell me you haven’t at least heard of this stuff before; it is literally everywhere on the center-left news. Privilege here is considered to be a general factor that inheres in certain social groups. It is reinforced by all manner of social conditioning, especially through implicit bias in individual decision-making. This bias is so powerful that it extends not just to cases of direct discrimination but also to cases where discrimination happens in a mediated way, for example through technical design. The evidence for these kinds of social privileging effects is obvious: we see inequality everywhere, and we can see who is more powerful and benefited by the status quo and who isn’t.
You see where this is going now. I have the momentum. I can’t stop. Here it goes: Maybe this whole story about social privilege is as spuriously supported as the story about general intelligence? What if both narratives were over-interpretations of data that serve a political purpose, but which are not in fact based on sound causal inference techniques?
How could this be? Well, we might gather a lot of data about people: wealth, status, neighborhood, lifespan, etc. And then we could run a dimensionality reduction/factor analysis and get a significant factor that we could name “privilege” or “power”. Potentially that’s a single, real, latent variable. But also potentially it’s hundreds of independent factors spuriously combined into one. It would probably, if I had to bet on it, wind up looking a lot like the factor for “general intelligence”, which plays into the whole controversy about whether and how privilege and intelligence get confused. You must have heard the debates about, say, representation in the technical (or other high-status, high-paying) workforce? One side says the smart people get hired; the other side says it’s the privileged (white male) people that get hired. Some jerk suggests that maybe the white males are smarter, and he gets fired. It’s a mess.
I’m offering you a pill right now. It’s not the red pill. It’s not the blue pill. It’s some other colored pill. Green?
There is no such thing as either general intelligence or group-based social privilege. Each of these is the result of sloppy data compression over thousands of factors with a loose and subtle correlational structure. The reason why the patterns of social behavior that we see are so robust against interventions is that each intervention can work against only one or two of these thousands of factors at a time. Discovering the real causal structure here is hard partly because the effect sizes are very small. Anybody with a simple explanation, especially a politically convenient explanation, is lying to you but also probably lying to themselves. We live in a complex world that resists our understanding and our actions to change it, though it can be better understood and changed through sound statistics. Most people aren’t bothering to do this, and that’s why the world is so dumb right now.
The second half of your critique reminds me of intersectionality. Reducing people to a single dimension (race, gender, wealth, education, age) makes it difficult to understand what is going on. It’s important to consider how different identities and privileges interact with each other and can become more or less salient depending on context.
Yes, that’s a good point. Intersectionality is more or less a mathematical consequence of thinking about a higher dimensional space of people.
Your characterization of intersectionality makes that idea sound especially robust and nuanced: “how different identities and privileges interact with each other and can become more or less salient depending on context.” That’s a wonderful way to phrase the variability of “what’s going on” in general.
I’ve heard “intersectionality” used in a much more basic way in even supposedly elite academic contexts, which I think is quite common. In that understanding of intersectionality, more dimensions are considered (e.g. race and gender) but the impact is judged in a way that is (a) decontextualized and (b) basically additive.
I would argue that this basic version of intersectionality has a lot in common with the kind of “factor analysis” thinking Shalizi critiques. It is a similar *mathematical operation* on the underlying data set, which reduces the dimensionality of a complex observed world while losing a lot of causal structure.
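As a rough illustration of what an additive summary can lose, here is a deliberately extreme toy case, not a model of any real data. It assumes two hypothetical binary attributes `a` and `b` and an outcome `y` that depends only on how they combine (it is 1 when exactly one attribute is present). An additive fit then explains none of the variation, while a fit with an interaction term explains all of it.

```python
import numpy as np

# Four equally sized groups defined by two hypothetical binary attributes.
a = np.array([0, 0, 1, 1] * 25, dtype=float)
b = np.array([0, 1, 0, 1] * 25, dtype=float)

# Toy outcome driven purely by the combination of a and b (an XOR pattern).
y = np.abs(a - b)

def r_squared(X, y):
    """Share of variance in y explained by a least-squares fit on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()

ones = np.ones_like(a)
additive = np.column_stack([ones, a, b])            # effects simply add up
interacting = np.column_stack([ones, a, b, a * b])  # adds an a-by-b term

r2_additive = r_squared(additive, y)
r2_interacting = r_squared(interacting, y)

print("additive model R^2:   ", round(r2_additive, 6))
print("interaction model R^2:", round(r2_interacting, 6))
```

Real data would sit somewhere between these two extremes. The point of the sketch is only that an additive, decontextualized reading is itself a lossy compression of the underlying structure.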
The basic idea that Shalizi uses to explain away the g factor has been around for a hundred years. The question is then why the g factor model has persisted in psychology, even becoming more popular in recent decades. Is it just racism? Given that nothing prevents you from asserting that the black-white IQ difference is due to gaps in several different abilities, rather than in g, I don’t see why the g factor would be necessary for any racial theorizing; these two racial difference models are just a factor rotation away from each other. Here’s a critique of Shalizi’s article which sets forth some of the reasons why the g factor model has persisted in psychology. Racism is a no-show.
You claim that “when you do the right kinds of tests to see if there really is a latent factor of ‘general intelligence’, you find that there isn’t any.” That’s of course nonsense. No such magical tests exist, nor does Shalizi claim that they do (unless you are talking about Spearman’s original “two-factor” model). The only way to get clarity on this question is to triangulate evidence from various sources so as to arrive at the simplest and most likely explanation. A purely conceptual analysis like Shalizi’s won’t do.
Now, my personal hunch is that the g factor is not only real, but that it largely explains the existence of group-based “social privilege.” The virtue of this hypothesis is that it can be subjected to rather strong tests (particularly if you use admixture mapping). Of course, no one wants to test it, so it remains a hypothesis.
Thank you for this comment. I apologize for taking so long to read and approve it–I came upon a busy spell shortly after writing this and was wary of getting involved in an argument on the Internet, whichever way it turned.
The blog post you link to about the evidence for g-factor is long and I look forward to reading it. I may have something more insightful to say after I do. In any case, it looks like a very interesting exposition of applied statistics.
As to the motivational crux about the relationship between social privilege, intelligence g factor, and other confounding variables (common causes, say)… what is there to say? I agree that it should be tested!