The unnerving nature of collider bias

Not for the first time, I was inspired by listening to the excellent The Studies Show podcast. In their 26 November edition, Tom Chivers and Stuart Ritchie introduced collider bias, which is a horrible statistical anomaly that can make it seem that a study shows something entirely different from reality. What's particularly worrying is that, as the podcast demonstrates, many scientists aren't aware of this potential issue.

I had assumed collider bias would be something to do with the kind of huge statistical analysis necessary to interpret what's going on in a piece of equipment like the Large Hadron Collider at CERN - but in reality the 'collision' in question is simply down to the way a pair of arrows point to the same location on a kind of flow diagram. What this statistical anomaly can produce, though, is the kind of result we love to hear (and scientists love to find) - results that make you go 'Huh? That's surprising.'

The examples given included these: among those hospitalised with COVID-19, smokers were more likely to survive than non-smokers; amongst cardiovascular disease sufferers, obese patients live longer; success at basketball is not linked to a player's height; and PhD candidates who score highly in the tests often used to decide whether someone should start a PhD are no more likely to succeed than those who score badly.

To understand this, Tom and Stuart ask us to imagine a study of Hollywood stars. They suggest you get to be a Hollywood star because you are either beautiful or a talented actor (or both). Assuming more actors major on one attribute than both, then, for the population sample that is 'Hollywood stars', you will find that beauty is negatively correlated with acting talent. Actors with talent, it would seem, tend not to be beautiful, and vice versa. This would be statistically true, but there is no causal link. The real danger is then to apply the same reasoning to the population at large and think that to be a great actor, a person should be ugly. But it's an artefact of the way Hollywood stars are chosen, not a true causal relationship.
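To see the Hollywood effect in numbers, here is a minimal sketch in Python. It is only an illustration under assumptions of my own, not anything from the podcast: beauty and talent are drawn as independent standard normal scores, and someone becomes a 'star' if either score clears an arbitrary cutoff of 1.5. The names simulate_hollywood and pearson are made up for the example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_hollywood(n=100_000, cutoff=1.5, seed=0):
    """Return (correlation in everyone, correlation in stars, number of stars)."""
    rng = random.Random(seed)
    # Assumption: beauty and talent are independent in the general population.
    people = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Assumption: you become a star if you are beautiful OR talented (or both).
    stars = [(b, t) for b, t in people if b > cutoff or t > cutoff]
    r_all = pearson(*zip(*people))
    r_stars = pearson(*zip(*stars))
    return r_all, r_stars, len(stars)


r_all, r_stars, n_stars = simulate_hollywood()
print(f"whole population: r = {r_all:+.2f}")
print(f"stars only ({n_stars} people): r = {r_stars:+.2f}")
```

In the whole population the correlation should come out close to zero, because the two traits were generated independently. Among the stars it should be clearly negative, even though nothing in the simulation makes beauty reduce talent. The negative link is created entirely by the selection step.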

In the surprising examples mentioned above, where this has occurred in real studies (often resulting in convoluted arguments as to why, say, being obese gives better survival from cardiovascular disease), it's because in each case we are looking at a sub-population, for example professional basketball players or PhD candidates, rather than at people at large. So, for example, successful PhD candidates tend to be either highly intelligent or very hard workers (or both). If we look only at successful PhD candidates, those two groups dominate: someone in the sample with a modest test score is likely to be there because they work very hard, so test scores appear to make little difference. Looking at the population at large, by contrast, highly intelligent (and hence high-scoring) people will be more likely to gain a PhD.
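The PhD case can be sketched the same way. Again, this is a toy model built on my own assumptions rather than data from any study: test score and effort are independent, someone becomes a candidate if either one clears a cutoff, and success depends equally on both plus some luck. The function names and the cutoff of 1.5 are hypothetical.

```python
import random


def success_gap(rows):
    """Success rate of high scorers minus success rate of low scorers."""
    high = [succeeded for is_high, succeeded in rows if is_high]
    low = [succeeded for is_high, succeeded in rows if not is_high]
    return sum(high) / len(high) - sum(low) / len(low)


def simulate_phd(n=200_000, cutoff=1.5, seed=1):
    """Return the high-vs-low success gap for (everyone, candidates only)."""
    rng = random.Random(seed)
    population, candidates = [], []
    for _ in range(n):
        score = rng.gauss(0, 1)   # test score, standing in for intelligence
        effort = rng.gauss(0, 1)  # work ethic, assumed independent of score
        # Hypothetical outcome model: both traits help equally, plus luck.
        succeeded = score + effort + rng.gauss(0, 1) > cutoff
        row = (score > cutoff, succeeded)
        population.append(row)
        # Assumption: you get onto a PhD by high score OR high effort.
        if score > cutoff or effort > cutoff:
            candidates.append(row)
    return success_gap(population), success_gap(candidates)


gap_everyone, gap_candidates = simulate_phd()
print(f"gap in success rate, everyone:        {gap_everyone:+.2f}")
print(f"gap in success rate, candidates only: {gap_candidates:+.2f}")
```

Under these assumptions, high scorers should do far better than low scorers when everyone is included, but the gap should nearly vanish among candidates, because the low scorers who made it in are precisely the very hard workers.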

In their podcast (to be honest, one of their more meandering episodes, as this is a really difficult effect to describe), Tom and Stuart point out that the error is relatively easy to spot when the result is so counter-intuitive, which is a reasonable flag to check whether something is wrong with the analysis. But it can be missed when the result is less striking.

Arguably a starting point should be that if you are studying a group that isn't typical of the population at large, then you need to be aware of this danger. This should be of particular interest, for example, to psychologists, who often do studies using university students as participants, because they are cheap and readily available. Unfortunately, though, they may well be a collider bias population just waiting to happen. Take a listen to the podcast to find out more.

Image by Brandon Style from Unsplash... but it's not that sort of collider.

See all Brian's online articles or subscribe to a weekly email free here

Comments

  1. Another salient example of collider bias, shamelessly exploited by Kovid Kranks, is that the majority of people who die of Covid have been vaccinated. This is true but massively disingenuous (at best). It's exactly the result we would expect if large numbers of people have been given a good, but imperfect, treatment, versus those who did not get the treatment and were exposed to the original (highly dangerous) problem. It's equally true that the majority of people who die in car crashes were wearing their seat belt, but nobody would stop wearing seat belts on that basis — although similar arguments were actually used to argue against seat belts in the 1980s by the "but muh freedom" crowd.

