Friday, October 19, 2007

An Average Lie

I thought this was an astoundingly logical article from my publisher, Bob Gelinas. I'll let it speak for itself...ponder it.

(posted with Bob's permission)

An Average Lie
By Robert E. Gelinas, © 2007 All Rights Reserved.
"Average temperatures have climbed 1.4 degrees Fahrenheit (0.8 degree Celsius)
around the world since 1880, much of this in recent decades."
NASA's Goddard Institute for Space Studies
The truth may be puzzling. It may take some work to grapple with. It
may be counterintuitive. It may contradict deeply held prejudices. It may
not be consonant with what we desperately want to be true. But our
preferences do not determine what's true.
Carl Sagan

Irrationally held truths may be more harmful than reasoned errors.
Thomas H. Huxley

Here's an old joke, albeit dark humor: A guy is being tortured. He's bound and forced to sit naked on a block of ice, freezing his behind, getting frostbite. They pour lighter fluid on his head and set fire to his hair. He's screaming in agony. One of his tormentors turns to another and remarks, "I don't know what he's complaining about. On average, he's doing just fine."

Indeed, statistically speaking, if your hair is on fire and your butt is freezing, then the mathematical average between the two extremes may theoretically be "just fine," even if the physical truth is torturous agony. This is an excellent example of the old line Mark Twain popularized in his autobiography, where he attributed it to Benjamin Disraeli: "There are three kinds of lies: lies, damned lies, and statistics."

In the context of "statistics," let's look at the simple and clear definition of the word "average," used in the sense of an "arithmetic mean." Per the American Heritage Dictionary: The value obtained by dividing the sum of a set of quantities by the number of quantities in the set. For example: Four students take a test. Their test scores are 75, 85, 90, and 95. The sum of these 4 numbers is 345. 345 divided by 4 equals 86.25. Therefore, the average test score was 86.25. Simple. Two students scored above average, one near average, one below average. This might be useful information to the teacher in helping her instruct her students. So far, so good.
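
The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and do not come from the article.

```python
from statistics import mean

# Test scores from the example above.
scores = [75, 85, 90, 95]

total = sum(scores)      # 75 + 85 + 90 + 95
average = mean(scores)   # total divided by the number of scores

# Split the scores by whether they fall above or below the mean.
above = [s for s in scores if s > average]
below = [s for s in scores if s < average]

print("sum:", total)
print("average:", average)
print("above average:", above)
print("below average:", below)
```

Strictly by the numbers, 85 falls 1.25 points below the mean, which is why the text can reasonably call it "near average."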

Let's try another example of averaging. At the zoo there are 4 lions, 2 elephants, 6 zebras, 12 monkeys and 100 fish. What's the "average animal"? Wait – not the most populous, not percentages of the total. What's the average animal? Scratching your head? Do the math. OK, let's see… 4+2+6+12+100 = 124, divided by 5 = 24.8. What does 24.8 mean? Is that the average number of animals of each kind? If so, that wasn't the question. Do you think perhaps the basic question makes no sense? Does the very concept of computing an "average animal" seem a bit absurd?
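
A short Python sketch of the zoo calculation makes the point concrete. The dictionary and variable names are illustrative only. The one number the arithmetic can produce is a mean count per kind of animal, which is a statement about head counts, not about animals.

```python
# Animal counts from the zoo example above.
counts = {"lions": 4, "elephants": 2, "zebras": 6, "monkeys": 12, "fish": 100}

total_animals = sum(counts.values())          # 4 + 2 + 6 + 12 + 100
mean_per_kind = total_animals / len(counts)   # total divided by 5 kinds

print("total animals:", total_animals)
print("mean count per kind:", mean_per_kind)

# No kind of animal in the zoo actually has this count.
print("kinds with exactly that count:",
      [kind for kind, n in counts.items() if n == mean_per_kind])
```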

Let's try again. There are approximately 6 billion humans on earth. 51% of them are female, 49% are male. So what then is the "average gender"? That question doesn't make much sense either, does it?

OK then, what's the "average race" of all humans on earth? Or, what's the "average religion" of everyone on earth? Not the most predominant, not the percentage of each within a population. No, the question was: what's the mathematical average? What number is computed when the values of all the samples are added together and then divided by the number of samples? What do these computed numbers represent? Still an absurd question?
Yes, it is, and that's because the basic assumption of computing an "average anything" is that you are comparing a measured common value of all members in a common or homogeneous set.

Let's try something everyone is a little more familiar with, something simpler. Let's try the weather.

What's the "average temperature" today in South Florida, given that today's low is 72 and its high is 92? Well, if you are looking for a daily average, the average would be 82 degrees. We get that by adding 72 and 92 and then dividing by two. That's easy. But if we were referring to a seasonal average, we would have to decide whether we wanted to look at the daily high or low, or use the daily average like we just computed, and then observe how that number compares to the same day of the year in years past, or within some window of time, or something along those lines. Right?

If you did that exercise, all that math for the weather in South Florida on a particular day could be statistically correct. But wait a second. Let's go back to the daily average of 82 degrees. How long during the twenty-four hour day was the average of 82 degrees a physical reality versus a statistical abstraction? In fact, the low for the day occurs close to sunrise, just before dawn; and then after the sun rises, the heat of the day increases rapidly, getting into the high eighties by mid-morning and staying in the nineties all afternoon. So the air temperature is actually at 82 degrees for only a very short time in the morning, and it doesn't cool off below 82 again until after midnight, technically the next day. In reality, if you took a temperature measurement each hour of the day, you'd discover that the true "average" temperature during that same day was much higher than 82. So the math used to compute the daily average, which was based solely on two data points, the high and the low, was statistically correct but physically wrong.
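
To see how the two-point average and an hourly average can differ, here is a rough Python sketch. The 24 hourly readings are invented for illustration (the article gives only the low of 72 and the high of 92), so the size of the gap it prints is an artifact of those made-up numbers, not a measurement.

```python
from statistics import mean

# HYPOTHETICAL hourly readings (degrees F), midnight through 11 p.m.
# Only the low (72) and the high (92) come from the article; every other
# value is an assumption chosen to follow the pattern described above.
hourly_readings = [
    81, 79, 77, 75, 74, 73, 72, 76,   # midnight to 7 a.m., low near sunrise
    82, 86, 88, 90, 91, 92, 92, 92,   # 8 a.m. to 3 p.m., rapid warming
    91, 90, 89, 87, 86, 85, 84, 83,   # 4 p.m. to 11 p.m., slow cooling
]

# The "daily average" computed from just two data points.
two_point_mean = mean([min(hourly_readings), max(hourly_readings)])

# The average over all 24 hourly samples.
hourly_mean = mean(hourly_readings)

print("two-point average:", two_point_mean)
print("hourly average:", round(hourly_mean, 1))
print("hours at or above the two-point average:",
      sum(1 for t in hourly_readings if t >= two_point_mean))
```

With a different set of hourly readings the gap could be larger or smaller; the sketch only shows that the two calculations need not agree.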

Let's add another wrinkle to this line of thought. What is the average temperature between South Florida and Anchorage, Alaska on this same day we're looking at? Let's only look at the highs this time and assume the following as our measurements: it's 92 degrees for the high in South Florida, and Anchorage has a high of 50 degrees. If we add those two numbers we get 142. Divide that by two and we get 71 degrees as the "average temperature" for these two locales on that day. Again, this number is mathematically correct. But does it matter that the temperature in South Florida never got down to 71 degrees in this twenty-four hour period, or that the temperature will never get up to 71 in Anchorage? Please note, for both locales, the computed average temperature of 71 degrees never physically exists.

Let's add a couple more data points to our "average temperature" calculations. Let's throw in Death Valley, California, which had a high of 120 on this same day we are considering, and South Pole Station in Antarctica, which had a high of minus 50 degrees on the same day. So, 92+50+120-50 = 212. Divide that number by four and you get 53 degrees. Funnily enough, once again, while 53 degrees is the correct statistical calculation, not one of the four locales experienced a moment of 53 degrees that day. More to the point: the number 53 degrees is completely meaningless and irrelevant to each of the locales.
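
The two-locale and four-locale calculations can be reproduced with a short Python sketch. The names are illustrative, and the code compares the computed averages only with the daily highs assumed in the text, since those are the only readings given.

```python
from statistics import mean

# Daily highs (degrees F) assumed in the examples above.
highs = {
    "South Florida": 92,
    "Anchorage": 50,
    "Death Valley": 120,
    "South Pole Station": -50,
}

# Average of the first two locales: (92 + 50) / 2
two_locale_mean = mean([highs["South Florida"], highs["Anchorage"]])

# Average of all four locales: 212 / 4
four_locale_mean = mean(list(highs.values()))

print("two-locale average:", two_locale_mean)
print("four-locale average:", four_locale_mean)

# How far each locale's high sits from the four-locale average.
for place, high in highs.items():
    print(f"{place}: high {high}, differs from average by {abs(high - four_locale_mean)}")
```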

What's the flaw in the logic here? The problem is a simple matter of comparing "apples and oranges."

There's no common source of data points, specifically data points originating from the exact same set of causes and influences, which is what we're attempting to average. Averages can no more be applied to disparate climate locales than we can try to average animals at the zoo, gender, race, or religion. They're different "sets," not a "common set," which defies the definition of the computation of an average. It can be done mathematically as an abstract calculation, but it makes no sense in reality.

Here's why: Remember the test taken by our four students? The common source of data points wasn't the students; it was the test. The four students were four iterations of the exact same test being taken, which is why an average score could be computed and the resulting number made sense and had some practical application. If, however, one student had taken a history test, the second a math test, the third an English test, and the fourth a biology test, would averaging their scores together have any meaning?

No. The common element of data points was that the same test was given to each student. When you change the source, or better yet "cause," of the data point values, you invalidate the common relevance of the data and make it nonsensical.
Now, you could average a history test, a math test, an English test, and a biology test, if you were talking about those four different tests all being taken by the same student. In that context, the common source of data points becomes the one student's performance. The numerical "average" computed becomes part of that student's overall grade average. However, that particular student's grade average only has comparable applicability to any other student if another student took the exact same courses and tests, i.e., the same sampling and measurement criteria.

This is analogous to the weather in the sense that you may average multiple sampling points (a high or low within the day, week, month, year, years, etc.) if you are talking about just one locale or homogeneous region, which is the common element influenced by a fixed set of factors germane to that specific place.
In the context of the weather or climate, the thermometer is simply the measuring device, like the teacher using her answer key to determine the number of right and wrong answers on a test and assigning a numerical score. The thermometer labels a value of a data point at a particular point in time, but it isn't the source or cause of the air temperature at any point in time. Air temperatures are clearly influenced by a myriad of factors: the sun, precipitation cycles, seasonal changes, day-night cycles, proximity to large bodies of water or deserts, altitude above sea level, ocean temperatures and currents, increased urbanization (concrete and asphalt tend to retain a lot more heat than grass and trees), volcanic/geothermal activity, forest fires, glacier retreat (which has been occurring continually for over 10,000 years since the last ice age), deforestation, and yes, even some man-made pollution, and many, many other factors unique to specific geographical locations.

That's why "averaging" temperatures from completely disparate climate zones, like Death Valley and Antarctica, may produce a mathematically correct statistic, but it is a number that is meaningless with respect to any individual climate used as a data point in the cumulative calculation.

The fatal flaw in the supposition is this: temperatures in any and all locales are readily seen to be the result of many, many varying factors, not one unified cause nor a fixed and uniform set of causes, which means that temperature readings (an effect, not a cause) are not "homogeneous" and therefore cannot be rationally "averaged" to produce any logical meaning. And, by the way, the total number of data points you add to this erroneously based calculation doesn't matter. If it's nonsensical and meaningless to do it for two or four of them, then that must also be true for most if not all of them pooled together.

Otherwise, at what point of adding samples does doing something nonsensical and illogical begin to make sense and stop being meaningless?

So, with all this in mind: when you hear it reported that the "average global temperature" has risen by a little over half a degree or a whole degree in the last 100 years, what might you conclude about such an assertion?

Even if many learned experts tell you that "the math doesn't lie," did it ever occur to anyone that the very concept of an "average global temperature," as though a single number could accurately represent the cumulative unified condition of all the disparate climate varieties on the entire planet, in fact doesn't exist anywhere in the real world beyond the abstract realm of statistics? And if that's true, then what value is there in any argument that is based upon a completely mythical concept?

"It is impossible to talk about a single temperature for something as
complicated as the climate of Earth. A temperature can be defined only
for a homogeneous system. Furthermore, the climate is not governed by
a single temperature. Rather, differences of temperatures drive the
processes and create the storms, sea currents, thunder, etc. which make
up the climate." (emphasis added)
Physicist Bjarne Andresen
Professor at The Niels Bohr Institute
University of Copenhagen
Ref: American Thinker, 3/18/07

"The great masses of the people…will more easily fall victims to a big lie
than a small one."
Adolf Hitler

