A recent article from the US National Institutes of Health investigated whether there is such a thing as "gaydar" (for those living under a rock, the ability to correctly identify the sexual orientation of others at twenty paces).
What is particularly interesting about their study is that it demonstrates not only that gaydar is not a real skill, but also that the (not-working) skill people call gaydar is really just a polite term for stereotyping.
Stereotyping is the more grown-up term for "judging a book by its cover". It's not quite making the mistake of thinking "correlation is the same as causation", but it's an important and pernicious example of what happens when you don't understand correlation. It allows you to (mistakenly) infer vast amounts of information about individuals you don't know based on tiny facts about them: things like their gender or skin colour.
The issue is that correlation is useful, and there are those who want to argue that we should take advantage of that imperfect but statistically relevant knowledge to make more efficient use of our time (say, police and airport security devoting more of their attention to individuals with dark skin and/or beards, or pundits drawing wild inferences from Donald Trump's taste in steak).
In practical fact, we never get to establish causation; we only get to infer it from very particular instances of correlation, for example when the correlation occurs inside a carefully designed double-blind controlled experiment.
But correlation is also confusing and counter-intuitive, as is probability in general. This gets especially bad when the event in question is very improbable. (In this case, identifying homosexuals, who are a minority in the population.) In those cases, humans are horrible at estimating and making predictions. In the case of the experiment, subjects mis-identified straight men as gay as much as 40% of the time.
That's because these estimates of probability are based on incomplete information. You take a belief like "Lots of gay men have Lady Gaga and Elton John on their iPods" and want to use that fact to judge whether a random person is gay by examining his iPod.* And say you actually surveyed a sufficiently large number of gay men and sufficiently many of them actually did listen to that music. Go so far as to say, for the sake of argument, that proportion is compellingly high. You should be able to use that to figure out whether or not someone is gay.
Unless you've actually sat through formal training in statistics, it's really hard to see what's wrong with that argument. It's obvious that it's not going to be a perfect correlation (some straight men will listen to that music), but it seems like a safe bet. It's the same argument that justifies profiling in law enforcement, the same argument that kept OJ Simpson out of prison, and the same reason that motivational drivel like The Seven Habits of Highly Effective People gets on the bestseller list.
Throw in a little bit of confirmation bias, and if you correctly identify someone as gay one time in ten, you'll think you have a reliable system in place.
The problem is that language isn't precise enough to answer statistical questions. There are really two questions being confused:
- What proportion of gay men listen to Elton John?
- What proportion of men who listen to Elton John are gay?
Those questions seem interchangeable at the level of conversational English but they have completely different answers. The chance for errors explodes because straight men significantly outnumber gay men. (The species wouldn't have lasted very long if they didn't.) It's the second one that matters and it's the second one that requires more data to answer. You actually have to spend most of your energy investigating the music tastes of straight men, the people outside of your target group. Which is the part that everyone misses.
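To see how far apart the two answers can drift, here is a rough sketch in Python using Bayes' rule. Every number in it is made up for the sake of argument (the base rate, the share of gay men with the playlist, and the share of straight men with the same playlist); none of them come from the study, and the function name is just something I invented for the illustration.

```python
def share_of_listeners_who_are_gay(base_rate, p_listen_if_gay, p_listen_if_straight):
    """Bayes' rule: P(gay | listens), given P(listens | gay) and the base rate."""
    gay_listeners = base_rate * p_listen_if_gay
    straight_listeners = (1 - base_rate) * p_listen_if_straight
    return gay_listeners / (gay_listeners + straight_listeners)


# All three inputs are hypothetical, chosen only to illustrate the arithmetic.
BASE_RATE = 0.05             # assumed share of men who are gay
P_LISTEN_IF_GAY = 0.80       # assumed answer to the first question
P_LISTEN_IF_STRAIGHT = 0.20  # assumed share of straight men with the same playlist

answer = share_of_listeners_who_are_gay(
    BASE_RATE, P_LISTEN_IF_GAY, P_LISTEN_IF_STRAIGHT
)
print(f"First question,  P(listens | gay): {P_LISTEN_IF_GAY:.0%}")
print(f"Second question, P(gay | listens): {answer:.0%}")
```

With these invented inputs the first question comes out at 80% and the second at roughly 17%, so the iPod test would be wrong about five times out of six. Notice that the result depends heavily on `P_LISTEN_IF_STRAIGHT`, the one number nobody bothers to collect.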
The lesson is simple. Correlation is a powerful tool for inference in the hands of people trained to use it, but the rest of us vastly overestimate our ability to judge books by their covers. We are masters of making generalizations and jumping to conclusions.
In an unexpected way, I was teaching this principle to kids in a magic workshop over March Break. The objective was to do a rope trick where the magician had a secret knot tied on the rope but concealed in their hand. I had to explain (I should point out these magicians in training were under twelve), "You can't say, 'This rope has no knots in it, even inside my hand.'" They laughed and understood immediately. Simply saying, "I have a rope," implies there is no knot more effectively than any knot-related statement ever could.
It's difficult to overestimate the value in magic of the simple principle of keeping your mouth shut and allowing the audience to accept things at face value — to judge the book by its cover. When I think about it from the performer's side, it's a frightening reminder of the power inherent in allowing the audience to fool themselves.
*This person is clearly, like me, still living in 2009.