statistics

All about the Bayes

Bayes' Theorem was discovered by the Reverend Thomas Bayes in the eighteenth century. When you have something you're not sure of, it's the calculation you perform to update your belief when you encounter new evidence. It's essentially the mathematical underpinning of the scientific method and it's an incredibly valuable thing to understand.
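For the curious, here is a minimal sketch of that update in Python. The function name and the numbers are my own, made up purely for illustration: you start with a prior belief in a hypothesis, then weigh how likely the evidence is if the hypothesis is true against how likely it is if the hypothesis is false.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' Theorem.

    prior               -- P(hypothesis) before seeing the evidence
    p_evidence_if_true  -- P(evidence | hypothesis)
    p_evidence_if_false -- P(evidence | not hypothesis)
    """
    numerator = p_evidence_if_true * prior
    # Total probability of seeing the evidence at all, either way.
    total_evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / total_evidence


# Hypothetical numbers: a 50/50 prior, and evidence that is three times
# more likely when the hypothesis is true than when it is false.
posterior = bayes_update(prior=0.5, p_evidence_if_true=0.6, p_evidence_if_false=0.2)
print(f"Updated belief: {posterior:.2f}")
```

With those invented inputs the belief moves from 0.50 up to 0.75. Feed the posterior back in as the next prior and you can keep updating as more evidence arrives.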

All that coolness aside, this is still one of the most badass titles for a statistics lesson!

On Books, Covers and Judgement

A recent article from the US National Institutes of Health investigated whether there is such a thing as "gaydar" (for those living under a rock, the ability to correctly identify the sexual orientation of others at twenty paces).

TL;DR: No

What is particularly interesting about their study is that it demonstrates not only that gaydar is not a real skill, but that the (not-working) skill people call gaydar is really just a polite term for stereotyping.

Stereotyping is the more grownup term for "judging a book by its cover". It's not quite making the mistake of thinking "correlation is the same as causation" but it's an important and pernicious example of what happens when you don't understand correlation. It allows you to (mistakenly) infer vast amounts of information about individuals you don't know based on tiny facts about them, things like their gender or skin colour.

The issue is that correlation is useful. And there are those who want to argue that we should be taking advantage of that imperfect but statistically relevant knowledge to make more efficient use of our time (say, police and airport security devoting more of their attention to individuals with dark skin and/or beards, or pundits drawing wild inferences from Donald Trump's taste in steak).

In practical fact, we never get to establish causation, we only get to infer it from very particular instances of correlation — say, for example, when the correlation occurs inside of a carefully designed double-blind controlled experiment. 

But correlation is also confusing and counter-intuitive, as is probability in general. This gets especially bad when the event in question is very improbable. (In this case, identifying homosexuals, who are a minority in the population.) In those cases, humans are horrible at estimating and making predictions. In the case of the experiment, subjects mis-identified straight men as gay as much as 40% of the time.

That's because these estimates of probability are based on incomplete information. You take a belief like "Lots of gay men have Lady Gaga and Elton John on their iPods" and want to use that fact to judge whether a random person is gay by examining his iPod.* And say you actually surveyed a sufficiently large number of gay men and sufficiently many of them actually did listen to that music. Go so far as to say, for the sake of argument, that proportion is compellingly high. You should be able to use that to figure out whether or not someone is gay.

Unless you've actually sat through formal training in statistics, it's really hard to see what's wrong with that argument. It's obvious that it's not going to be a perfect correlation: some straight men will listen to that music, but it's a safe bet. It's the same argument that justifies profiling in law enforcement, the same argument that kept OJ Simpson out of prison, and the same reason that motivational drivel like The Seven Habits of Highly Effective People gets on the bestseller list.

Throw in a little bit of confirmation bias, and if you correctly identify someone as gay one in ten times, you'll think you have a reliable system in place.

The problem is that language isn't precise enough to answer statistical questions. There are really two questions being confused:

  1. What proportion of gay men listen to Elton John?
  2. What proportion of men who listen to Elton John are gay?

Those questions seem interchangeable at the level of conversational English but they have completely different answers. The chance for errors explodes because straight men significantly outnumber gay men. (The species wouldn't have lasted very long if they didn't.) It's the second one that matters and it's the second one which requires more data to answer. You actually have to spend most of your energy investigating the music tastes of straight men, the people outside of your target group. Which is the part that everyone misses.
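To see how far apart the two answers can be, here is a rough sketch in Python. Every figure in it is invented for the sake of argument (the 5% base rate, the 80% and the 30% are my assumptions, not numbers from the study), but the shape of the result holds whenever the target group is a small minority.

```python
# All figures below are invented for illustration; none come from the study.
population = 10_000                    # hypothetical number of men surveyed
gay_men = 500                          # assumed 5% base rate
straight_men = population - gay_men    # the other 9,500

# Assume a compellingly high 80% of gay men listen to Elton John,
# and a much smaller 30% of straight men do.
gay_listeners = gay_men * 80 // 100
straight_listeners = straight_men * 30 // 100
all_listeners = gay_listeners + straight_listeners

# Question 1: what proportion of gay men listen to Elton John?
p_listens_given_gay = gay_listeners / gay_men

# Question 2: what proportion of men who listen to Elton John are gay?
p_gay_given_listens = gay_listeners / all_listeners

print(f"Q1: {p_listens_given_gay:.0%} of gay men listen")
print(f"Q2: {p_gay_given_listens:.0%} of listeners are gay")
```

With these made-up numbers the first answer is 80% and the second is about 12%. The 2,850 straight listeners swamp the 400 gay ones, and you only find that out by counting the people outside your target group.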

The lesson is simple. Correlation is a powerful tool for inference in the hands of people trained to use it, but the rest of us vastly overestimate our ability to judge books by their covers. We are masters of making generalizations and jumping to conclusions. 

In an unexpected way, I was teaching this principle to kids in a magic workshop over March Break. The objective was to do a rope trick where the magician had a secret knot tied on the rope but concealed in their hand. I had to explain (I should point out these magicians in training were under twelve), "You can't say, 'This rope has no knots in it, even inside my hand.'" They laughed and understood immediately. Simply saying, "I have a rope," implies there is no knot more effectively than any knot-related statement ever could.

It's difficult to overestimate the value in magic of the simple principle of keeping your mouth shut and allowing the audience to accept things at face value — to judge the book by its cover. When I think about it from the performer's side, it's a frightening reminder of the power inherent in allowing the audience to fool themselves. 

 

*This person is clearly, like me, still living in 2009.

No (fair) Dice

Persi Diaconis is an ex-magician. He left the world of professional magic to become a professor of statistics at Stanford. But those influences are still reflected in his work, as many of the simple tools used in the exploration of statistics (coins, cards, dice) are also favourite tools of the magician. So this has nothing specifically to do with magic, but it's worth a look if you wanted to know how fair your super-complicated D&D dice were.

Watch to the end to get the link to the hidden part 2!

More Magical Mathematics

This will be the first of a series of three posts dedicated to mathematics, for no other reason than the coincidence that they all appeared in my life more or less at the same time. I'll begin with an interview with Persi Diaconis on The 7th Avenue Project. It's actually a little bit out of date (over a year old) and it relates, ostensibly, to his 2011 book Magical Mathematics (co-written with Ron Graham). Professor Persi Diaconis is a remarkable figure in magic who falls into that category of "greatest magicians no one has ever heard of." Provided, that is, you're willing to count being interviewed for podcasts, being a published author and appearing on the front page of the New York Times as never being heard of.

The interview is fascinating (and long). Perhaps it's the confirmation bias talking, but he seems to spend a great deal more time discussing magic than math — not that I would think of complaining. It also highlights the important but subtle difference between magical mathematics and mathematical magic. I noticed when the interviewer tripped up on the title and realized that there really is an important difference.

The stories involving Dai Vernon and Ricky Jay are also moving. Enjoy.

The twisted mind of a mathematician

The domain where I've had the most (what you might call) formal academic training is mathematics. Having spent years tutoring students in math (which means, by implication, you're spending time with students who are less adept than the average at math) I understand that there is a definite peculiarity in the way people approach problems in math. Ordinary thinking involves guessing an answer, taking a shot in the dark, then trying to justify the guess as quickly as possible so you can move on to new problems. This manifests with students prepping for multiple choice tests saying something like, "It's B, isn't it?" And if I nod yes, they're right and they get to go on to the next question. But, as happens more often, I don't nod and that guess hasn't brought them any closer to a solution to the problem.

Math involves stepping back and looking at the problem from many different angles. It seems extraordinarily counter-intuitive if your goal is simply to get the pencil mark in the bubble for B.

Professor Persi Diaconis, in addition to being a professor of statistics, is also a world-renowned magician, so when his work pops up in my news feed, I perk up. This is a wonderful example of the application of mathematical thinking to a very mundane problem. I guess the typical reaction would be a transition from "this guy's so weird" to "this guy's so freakin' smart."