Sans Fig Leaf
"The measure of wrongs"
17 April, 2008

Many years ago everyone at the place I was then working had to attend a seminar about theft prevention and office security. The consultant leading the event told us that "less than ten percent of office theft crimes are solved." Only a few moments later he said, "four times out of five the office thief is an employee."

Before I could say it, the accounting manager interrupted to ask, "How can you possibly know that 80 percent of the time the thief is an employee, if over 90 percent of the time you never find out who did it? What you mean is, in the virtually insignificant fraction of cases where the culprit is found, it's usually an employee."

He tried to bluster through it, claiming that the sample of ten percent was enough to draw inferences from, but she would have none of it. She proceeded to lecture him on the difference between a statistically representative sample and a skewed sample. The problem was that, when a workplace crime was committed by an employee, the culprit was far more likely to be caught than if a total stranger had committed it, if for no other reason than that the employees can all be scrutinized by the investigator. "You already have their names, addresses, and can find their bank accounts," she explained. The sample was, therefore, always going to contain a disproportionate number of employees.

I was reminded of this when I read a newspaper article which claimed that the most recent gubernatorial race in the state where I live had been "a statistical tie." An article in another paper referred to it as "within the margin of error." Except both were wrong.

In statistics, the margin of error is a way to quantify how much your sample may vary from the total population. So if you are surveying a bunch of potential voters, you want to know how likely it is that the people you interviewed accurately represent the opinion of all the voters. But an election isn't the same as a survey.

In an election, all that matters are the ballots that are actually cast. If a person means to vote but fails to, their vote is simply not part of the pool. You count all of the ballots, not a randomly selected sample of them. There can be mistakes in counting, but those mistakes aren't statistical errors. Ballots may be mangled, lost, difficult to read, or misread. If the vote is extremely close, being decided by a tiny fraction of a percent of the total votes cast, those counting inaccuracies can call the outcome into question. But it isn't the same thing as a statistical margin of error.

While that may seem merely a matter of semantics, it's not. First, as far as we can tell, errors in counting or measuring usually amount to a much tinier "margin of mistakes" than the margins of error in surveys. More importantly, because the margin of error is a measure of how likely it is that your sample represents the population, it is not necessarily a measure of how close your sample's results are to a measure of the entire population.

In other words, if 49% of those surveyed favor Candidate Smith with a "margin of plus-or-minus 5 percentage points," that does not mean that the actual percentage of the entire population who prefer Smith must lie somewhere between 44% and 54%. How sure you can be of that depends on another number, called the confidence level. And both the confidence level and the margin of error are statements about probabilities. No matter what the confidence level is, there will always be a chance that the numbers are completely out of the ballpark.

So, when we are told that all the surveys running up to an election show two candidates in a "statistical tie," that does not mean that neither candidate is ahead. All it means is that the statisticians cannot determine which one is ahead. They may be nearly tied, but they may not.

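For anyone who wants to see how the margin and the confidence level hang together, here is a rough Python sketch built around the Candidate Smith figures. It rests on assumptions that are mine, not any pollster's: a simple random sample, the textbook normal approximation, and a sample of 384 people (chosen only because it yields about plus-or-minus 5 points at 95% confidence). The function names `margin_of_error` and `coverage` are made up for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def margin_of_error(share, sample_size, confidence):
    """Half-width of the normal-approximation interval for a surveyed share."""
    # z is how many standard errors cover `confidence` of a normal curve.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(share * (1 - share) / sample_size)


def coverage(true_share, sample_size, confidence, trials=2000, seed=1):
    """Fraction of simulated surveys whose interval contains the true share."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: each respondent independently favors Smith
        # with probability true_share (a simple random sample).
        favor = sum(rng.random() < true_share for _ in range(sample_size))
        observed = favor / sample_size
        moe = margin_of_error(observed, sample_size, confidence)
        if abs(observed - true_share) <= moe:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Same survey, three confidence levels, three different margins.
    for level in (0.90, 0.95, 0.99):
        moe = margin_of_error(0.49, 384, level)
        print(f"{level:.0%} confidence: plus-or-minus {moe * 100:.1f} points")
    # How often does the plus-or-minus range actually catch the truth?
    print(f"surveys that captured the true share: {coverage(0.49, 384, 0.95):.3f}")
```

The same 49% from the same 384 people comes out at about plus-or-minus 4.2, 5.0, or 6.6 points depending on whether you ask for 90%, 95%, or 99% confidence, which is why a margin quoted without its confidence level tells you very little. The simulation makes the other half of the point: even when everything is done by the book, a fraction of the surveys land outside their own margin.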
Talk of margins and intervals and probabilities is enough to make your head spin--and I haven't even mentioned the difference between relative and absolute margins. What's worse, the people reporting these things seldom understand all those differences, let alone know how and at what confidence level the sampling error was calculated. Plus, several of the terms mean different things in different mathematical contexts. When used properly, by people who understand how they were calculated, and how all those margins, intervals, and probabilities work, statistics can be a powerful tool. Disasters have been averted and great works have been accomplished thanks to statistics. But no matter how meticulously the errors are calculated, there's always a chance we're wrong.
The world always makes the assumption that the exposure of an error is identical with the discovery of truth--that the error and truth are simply opposite. They are nothing of the sort. What the world turns to, when it is cured of one error, is usually simply another error, and maybe one worse than the first one. --H. L. Mencken
Copyright © 2008 Gene Breshears. All Rights Reserved.