
No sex please, I'm a statistician

My 'garbage statistics' detector went onto high alert when I noticed the headline 'Don’t panic, but there’s a one in 30 chance you’ve had sex with your cousin' in that usually understated publication*, the Metro.

We'll come back to that specific statistic, but it was probably prompted by a press release from a company called AncestryDNA, which apparently has done a 'demographic analysis' to produce this shock result. I can't find any link to the actual research, which is a touch suspicious, but various other publications have produced information from the press release including:

  • 'For the average Brit' there's a one in 300 chance that a complete stranger is their cousin
  • The average British person has 193,000 living cousins within Britain
  • The 'typical Brit' has five first cousins, right up to 174,000 sixth cousins
  • Researchers used birth rates and census data to estimate how many close living relatives each of us has.

Okay, so now we're getting a little closer to the facts, although there's still a lot of room for baloney in what we've been given. That last piece of information does mean that in principle such calculations are possible, though I suspect it was sampled rather than using the full census data.

We can, first of all, totally dismiss the headline, even if the numbers are right, because no one considers sixth cousins to be 'cousins' - in fact 'cousin' on its own specifically means first cousin - so this is clearly a ridiculous exaggeration. But it would also be interesting to see if those two figures hold up: that there's a 1 in 300 chance that a complete stranger is a [sixth cousin or closer], and a 1 in 30 chance 'you've had sex' with one.

There are about 65 million people in the UK, so if you select one at random, having 193,000 relatives among them gives a chance of the right order of magnitude for that 1 in 300 figure. However, none of us has an equal chance of coming into contact with everyone in Britain. My suspicion is that because there are often clusters of relatives near where we live, there may be a better than 1 in 300 chance that a stranger you meet at random is a sixth cousin or closer.
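As a quick check on that order-of-magnitude claim, here is a minimal sketch using only the two figures quoted above. The variable names are mine, and it assumes the 'stranger' really is drawn uniformly at random from the whole UK population, which is exactly the assumption questioned in this paragraph.

```python
# Figures as quoted in the post; both are rounded estimates.
uk_population = 65_000_000
living_cousins = 193_000  # claimed cousins (sixth or closer) of the 'average Brit'

# Chance that one person picked uniformly at random is one of those cousins.
chance = living_cousins / uk_population
one_in = uk_population / living_cousins

print(f"Chance: {chance:.3%}, or about 1 in {one_in:.0f}")
```

The result comes out somewhat longer odds than 1 in 300, but close enough that the two press-release figures are at least consistent with each other.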

I struggle a lot more with the 'one in 30 chance you've had sex with your cousin.' Firstly, as a headline it's too specific. They didn't say 'the average Brit', they said 'you.' Hardly any individual is 'the average Brit', so immediately this falls down as a suggestion. But even if we rework the headline to 'there's a one in 30 chance the average Brit has had sex with their cousin' there are big problems.

The majority of individuals will have had significantly fewer than the mean number of sexual partners. Why? In a 2010 survey, the means were apparently 9.3 for men and 4.7 for women, and these averages come from a very skewed distribution. Women, for instance, can only have had about 5 partners fewer than the average (in round figures), but could have had many more than the average. So this makes the average unrepresentative.
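To illustrate how a skewed distribution pulls the mean away from the typical person, here is a small sketch. The ten partner counts are entirely invented for illustration (they are not survey data); I have simply chosen them so that their mean matches the 4.7 quoted for women.

```python
import statistics

# HYPOTHETICAL data: ten made-up partner counts, chosen so the mean is 4.7.
# These are not from the 2010 survey or any other real source.
partners = [1, 1, 2, 2, 3, 3, 4, 5, 8, 18]

mean = statistics.mean(partners)
median = statistics.median(partners)
below_mean = sum(1 for n in partners if n < mean)

print(f"mean = {mean}, median = {median}")
print(f"{below_mean} of {len(partners)} people are below the mean")
```

In this made-up group, a couple of large values drag the mean well above the median, so most of the group sits below 'average'.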

My guess (I could be wrong, because I don't have any information on the 'research') is that all AncestryDNA did was to take that 1 in 300 chance of a stranger being a cousin and multiply it by 10, as the average number of partners, to get 1 in 30. If so, that is dire in so many ways. They seem to have applied the male figure to the population as a whole. Then there are issues with the way the population is segmented. One is that we are even less likely to have sex with someone from anywhere in the country than we are to meet them. And the other problem is that we tend to have sex with people of relatively similar age. This cuts out a vast swathe of the population, and could have a significant impact on the chances of being related.
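If that guess is right, the sum can be sketched as below. To be clear, this is my reconstruction, not AncestryDNA's published method. The function name is mine, and it assumes each partner is an independent random draw from the whole population, which is precisely the assumption criticised above.

```python
# Press-release figure: chance that a random stranger is a cousin.
p_cousin = 1 / 300

def chance_at_least_one(partners, p=p_cousin):
    """Chance that at least one of `partners` independent random draws is a cousin."""
    return 1 - (1 - p) ** partners

# The naive version: just scale 1 in 300 by a round 10 partners.
naive = 10 * p_cousin

# Compare against the survey means quoted earlier (9.3 for men, 4.7 for women).
for label, n in [("10 (round figure)", 10), ("9.3 (men)", 9.3), ("4.7 (women)", 4.7)]:
    chance = chance_at_least_one(n)
    print(f"{label}: about 1 in {1 / chance:.0f}")

print(f"naive scaling: 1 in {1 / naive:.0f}")
```

Even on its own terms, plugging in the figure for women roughly halves the headline chance, before any of the problems with location and age are considered.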

You could say I'm breaking a butterfly on the wheel here. It was just a 'fun bit of research' for marketing purposes. But once you claim you have done serious research and get the media to spread it around, I think there is a responsibility to be clear how the numbers are produced, and to make that research as high quality as possible.


*Irony alert
