Thursday, 29 November 2012

Beware the average

Which one's the average house?
I was struck by an item on the local news this morning saying that the average house price in the UK was £163,910 according to the Nationwide Building Society. This seemed a dubious statistic. Why? Because the average (or mean) is not a good measure of a distribution that isn't symmetrical. It's highly misleading. That's because the vast majority of houses in the UK are worth less than the average house price - and that is downright confusing.

Let's look at a simpler example to see what's going on. Imagine we have a room full of people and take their average earnings. Then we throw Bill Gates into the room. Bill's vast income would really bump up the average - so probably everyone else in the room would earn less than the average. The new average would not be representative of the room as a whole.

A relatively small number of cases (in our room, Bill) can have a big impact because the distribution - the spread of the incomes - is not symmetrical. Let's say the average income before Bill entered the room was £26,000 a year. Then the furthest anyone can fall below that average is £26,000, by earning nothing at all. But there is no limit to how far above the average you can be. In Bill's case, he will be millions higher. So he has a much bigger impact on the average than a poor person does.
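
To put rough numbers on the room, here is a minimal Python sketch. The nine incomes and Bill's £1 billion are made up purely for illustration (chosen so that the starting average is £26,000); only the arithmetic matters.

```python
from statistics import mean

# Hypothetical annual incomes in pounds, invented so the mean is 26,000.
room = [12_000, 15_000, 18_000, 21_000, 24_000, 27_000, 32_000, 38_000, 47_000]

# Hypothetical figure for Bill's income; the real number is not in the article.
BILL = 1_000_000_000


def share_below_mean(incomes):
    """Return the fraction of incomes strictly below the arithmetic mean."""
    avg = mean(incomes)
    return sum(1 for income in incomes if income < avg) / len(incomes)


mean_before = mean(room)
room_with_bill = room + [BILL]
mean_after = mean(room_with_bill)

print(f"Average before Bill: £{mean_before:,.0f}")
print(f"Average after Bill:  £{mean_after:,.0f}")
print(f"Share of the room below the new average: {share_below_mean(room_with_bill):.0%}")
```

With these invented figures, one extra person drags the average up to around £100 million, and every one of the original nine now earns less than "the average".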

In such cases, the median is a very valuable number to know. This is just the middle value. We put all the people in a row in order of earnings and pick the middle number. With a distribution like our room - or house prices - the median gives us a much better feel for what a typical value is like than the average.
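
As a rough illustration of how little the median cares about Bill, here is a short Python sketch. The incomes are invented for the example (nine people averaging £26,000, plus a hypothetical £1 billion for Bill).

```python
from statistics import mean, median

# Hypothetical annual incomes in pounds; not real data.
room = [12_000, 15_000, 18_000, 21_000, 24_000, 27_000, 32_000, 38_000, 47_000]
room_with_bill = room + [1_000_000_000]  # Bill's income is an assumed figure

# The median is the middle value once the incomes are sorted
# (or the midpoint of the two middle values when the count is even).
median_before = median(room)
median_after = median(room_with_bill)

print(f"Mean before / after Bill:   £{mean(room):,.0f} / £{mean(room_with_bill):,.0f}")
print(f"Median before / after Bill: £{median_before:,.0f} / £{median_after:,.0f}")
```

In this made-up room the median moves from £24,000 to £25,500 when Bill walks in, while the mean jumps by a factor of several thousand.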

Which takes us back to the Nationwide. I took the liberty of dropping their Chief Economist, Robert Gardner, an email and he was kind enough to call me back within 10 minutes (and to email through some bumf). You really wouldn't expect a financial institution to make such a basic statistical mistake... and they haven't. What the Nationwide repeatedly calls an average in their press releases isn't a simple average at all. Instead they stratify the data according to region, type of house and so forth and produce a rather messy weighted figure that could arguably be said to be the typical value - but it certainly isn't an average.
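
To give a feel for what a stratified, weighted figure means in general, here is a toy Python sketch. It is emphatically not the Nationwide's method, which is not described in any detail here: the house types, prices and weights are all invented, and the only point is that a figure built from fixed weights per stratum can differ a lot from a simple average of whatever happened to sell.

```python
from statistics import mean

# Hypothetical sale prices this month, grouped by house type (the strata).
# Detached houses happen to dominate this month's sales.
sales = {
    "flat": [100_000, 120_000],
    "terraced": [150_000, 170_000],
    "detached": [300_000, 340_000, 380_000, 420_000],
}

# Hypothetical fixed weights: each type's assumed share of all homes, in percent.
stock_share_percent = {"flat": 30, "terraced": 50, "detached": 20}

# Simple average: every sale counts equally, so the mix of sales drives the result.
all_prices = [price for prices in sales.values() for price in prices]
simple_average = mean(all_prices)

# Weighted figure: average within each stratum, then combine with the fixed weights.
mix_adjusted = sum(
    stock_share_percent[kind] * mean(prices) for kind, prices in sales.items()
) / 100

print(f"Simple average of sales: £{simple_average:,.0f}")
print(f"Fixed-weight figure:     £{mix_adjusted:,.0f}")
```

Here the simple average comes out at £247,500 because expensive houses are over-represented in the month's sales, while the fixed-weight figure is £185,000.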

You can argue about whether they should be rather clearer about just what figure they are producing, rather than calling it the average house price as they do, but at least it is a meaningful figure.

In other statistics, I'm afraid the press simply gets the words wrong. Quite often a government bureau will publish a median value and an average - they do so on earnings, for instance. What the media often does is to take the median value, because it's more meaningful, but call it the average (presumably because they think the poor public can't cope with a hard word like 'median'). That's just bad journalism.

This distortion of the average is something that politicians who wish to attack another party, and are not too scrupulous about their statistics, can use to their advantage. If we want to tax those on high earnings and find the tax hits someone on the average wage, then there is an outcry, because that seems to imply that it hits the majority of ordinary people – but the majority actually earn less than the average wage. The naughty politician can play the numbers even more effectively by putting two people on an average wage into a household. Now we are not only using individuals that earn more than most, but a household where both partners do so. This pushes their collective income up so high that it puts the household in the top 25 per cent of all households, even though we are talking about two people who are on an average wage.
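
The household trick can be sketched in a few lines of Python. The eight households below are entirely made up (a mix of single earners and couples, with one high-earning pair), so the percentages they produce illustrate the mechanism and are not evidence for the real UK figures.

```python
from statistics import mean

# Hypothetical households: each tuple holds the annual wages (in pounds) of its earners.
households = [
    (14_000,),
    (18_000,),
    (22_000,),
    (16_000, 12_000),
    (20_000, 15_000),
    (24_000, 20_000),
    (30_000, 26_000),
    (66_000, 55_000),
]

# The average wage is taken over individuals, not households.
wages = [wage for household in households for wage in household]
mean_wage = mean(wages)

# A household made of two people who each earn exactly the average wage.
two_average_earners = 2 * mean_wage

household_incomes = [sum(household) for household in households]
share_individuals_below = sum(1 for w in wages if w < mean_wage) / len(wages)
share_households_below = (
    sum(1 for income in household_incomes if income < two_average_earners)
    / len(household_incomes)
)

print(f"Average wage: £{mean_wage:,.0f}")
print(f"Individuals earning less than the average wage: {share_individuals_below:.0%}")
print(f"Households below two average wages combined: {share_households_below:.0%}")
```

In this invented sample the average wage is £26,000, about 69 per cent of individuals earn less than that, and a couple on two average wages (£52,000) out-earns three quarters of the households.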

There's a simple message. Whenever you hear an 'average' quoted in statistics on the news, or see one presented, it's worth taking the number with a pinch of salt unless you can verify just what lies behind that value.