What I’ve Learned:
“Statistical significance: ‘Do you feel 95% confident, punk?’”
People are scared of numbers. Sometimes, the fear is justified. A 330 on your credit score report, for instance, is genuinely horrifying. So is a 410 on your SAT. Or anything greater than “two”, when asked how many cats your mother owns.
But most numbers are harmless. People only fear them because they might wind up in a statistic, and everyone is afraid of statistics. The saying is not “lies, damned lies and sharks with frickin’ laser beams”. It’s statistics. Even scarier than laser-sharks.
The problem is understanding. I can help — though only to a degree, because mathematics is involved, and I swore after memorizing the Pythagorean theorem that I was “full”, and couldn’t learn any more math.
(Which is probably why I’m familiar with the horror of subpar credit scores. And low SATs.
Someday, this will probably drive my mother to adopt a dozen cats. But not yet. Whiskers crossed.)
Happily, you don’t need math to demystify statistics; you only need to know about statistical significance.
(Although you might need a calculator or a fancy-ciphering web page to do some maths for you. Stand on the shoulders of Poindexters, my friend.)
Statistics can be manipulated to say just about anything — like a willing stool pigeon, or a guy trying to get a date with a lingerie model. The question is how confidently those stats say something, and that’s where statistical significance comes in.
Most scientists will run with a conclusion if they believe it’s at least 95% likely to be true. Some tests require 99%, and a few really crucial questions — like, can we clone Neil DeGrasse Tyson’s mustache in time for Halloween — need a 99.99% (or greater) probability before they’re accepted.
So how do researchers achieve those levels of confidence? Flip a thousand coins and see what comes up? Ask a Magic 8-Ball which answer is better? Co-author their papers with a pigskin-prognosticating porcupine?
(Based on recent scientific scandals, yes. A few of them apparently do.
But we try to weed these idiots out, based on their SAT scores. Or how many cats their mothers own.)
Real scientists determine statistical significance by performing calculations that take important factors into account, like the number of observations and the likelihood of the results.
For example, the “p-value” calculation, which involves math with Greek letters and squiggly brackets and other head-exploding details. But just remember it like this: the “p” in p-value stands for “pssshaw”, as in: “Pssshaw, you’re wrong; I bet your mom owns so many cats.”
Once calculated, the p-value is the probability that, if there were no real effect at all, dumb luck alone would serve up results at least as striking as yours — in other words, the odds that your scientific conclusion is full of smoking cat turds. A value of 1.0 means you’re one hundred percent talking out your ass, and a value of 0.05 means chance alone would produce results like yours only one time in twenty.
The keys to getting low — meaning good — p-values are making a lot of observations, and having most of those come out one way, and not the other. A million dice rolls where every number comes up just as often doesn’t tell you anything about what’s coming up next. And — to the chagrin of sportscasters everywhere — a winning (or losing) streak of one, two or eight games isn’t sufficient to make their pre-game blather “significant”. Or coherent, if there’s a liquor cabinet in the press box.
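If you’d rather skip the Greek letters entirely, here’s a back-of-the-envelope sketch in Python (the function name and coin-flip setup are mine, not anything standard): it computes the chance that a perfectly fair coin, through dumb luck alone, hands you at least as many heads as you observed. Notice how the same 80% heads rate goes from “meh” to “significant” as observations pile up.

```python
from math import comb

def p_value_heads(heads, flips):
    """One-sided p-value: the chance a FAIR coin produces at least
    this many heads in this many flips, by dumb luck alone."""
    total_outcomes = 2 ** flips
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / total_outcomes

# 8 heads out of 10 flips: suspicious, but not quite significant
print(round(p_value_heads(8, 10), 4))    # 0.0547 — pssshaw territory

# 80 heads out of 100 flips: same 80% rate, ten times the observations
print(p_value_heads(80, 100) < 0.0001)   # True — now you can be confident
```

Same ratio of heads to flips, wildly different confidence — which is exactly why the number of observations matters.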
Another example: over the years, I’ve worked with a number of Belgians. From my observations, 100% of Belgians are named Paul, 100% wear fashionable sweaters, and 50% say really inappropriate things in the workplace.
Those are statistics, based on real observations — and some very uncomfortable staff meetings. But do the conclusions have any statistical significance? If the number of observations is ten million, sure. If the number is two (which it is), then no, more observations are needed. You should take these stats, and all others with low (or ambiguous) statistical significance, with a healthy grain of salt.
Also, a huge pile of kitty litter. But preferably not from your mom.
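For the terminally curious, the Belgian problem can get the same treatment. A hypothetical sketch (the “half of all Belgians are named Paul” null hypothesis is deliberately ridiculous, and generous): even a unanimous 100% result can’t reach significance with only two observations.

```python
from math import comb

def p_value(successes, n, null_rate=0.5):
    """Chance of seeing at least this many 'hits' in n observations,
    if the null hypothesis (true hit rate = null_rate) were correct."""
    return sum(comb(n, k) * null_rate**k * (1 - null_rate)**(n - k)
               for k in range(successes, n + 1))

# Two Belgian colleagues, both named Paul. Against the absurdly
# generous null "half of all Belgians are named Paul":
print(p_value(2, 2))     # 0.25 — could easily be dumb luck

# Ten colleagues, all Pauls, same null:
print(p_value(10, 10))   # ~0.001 — now the Pauls mean something
```

Two-for-two sounds impressive until you see that coin-flip luck produces it a quarter of the time. Hence: more observations needed, and more salt.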