Monday, 31 May 2010

Probability that mangles the mind

Generally speaking I'm a bit so-so about recreational mathematics. I can't get very excited about polyominoes or tiling, for instance. But when the field strays into probability I get fascinated - and the mind gets boggled. Take the little probability problem mentioned in the New Scientist article I've linked to there. It gets rather lost in the article, and they don't describe it particularly well. Let's take a look.

The problem statement is simple. I have two children. One is a boy born on a Tuesday. What is the probability I have two boys? But to get a grip on this problem we need first to take a step back and look at a more basic problem. I have two children. One is a boy. What is the probability I have two boys?

A knee-jerk reaction to this is to think 'One's a boy - the other can either be a boy or a girl. So there's a 50:50 chance that the other is a boy. The probability that there are two boys is 50%.' Unfortunately that's wrong.

You can see why with this handy diagram. The first blobs are the older child. It's a boy or a girl, 50:50. Then in each case we've a 50:50 chance of a boy or girl for the second child. So each of the combinations has a 1 in 4 (or 25%) chance of occurring.

All the combinations except Girl-Girl fit our statement 'I have two children. One is a boy.' So we've got three equally likely possibilities, of which only one has two boys. So there's a 1 in 3 chance that there are two boys.
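If you'd rather count than trust a diagram, here is a minimal Python sketch of the same argument. It assumes 'One is a boy' means 'at least one is a boy', and the variable names are just for illustration.

```python
from fractions import Fraction
from itertools import product

# All (older, younger) combinations, each equally likely: BB, BG, GB, GG.
families = list(product("BG", repeat=2))

# Keep the families that fit "One is a boy" (read as "at least one is a boy").
at_least_one_boy = [f for f in families if "B" in f]

# Of those, pick out the families with two boys.
two_boys = [f for f in at_least_one_boy if f == ("B", "B")]

p_two_boys = Fraction(len(two_boys), len(at_least_one_boy))
print(p_two_boys)  # 1/3
```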

If this sounds surprising, it's because the statement 'One is a boy' doesn't tell us which of the two children it's referring to. If we say 'The eldest one is a boy', then our 'common sense' assessment of probability applies. If the eldest is a boy, there are only two options with equal probability - second child is a boy or second child is a girl. So it's 50:50.
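The difference between the two statements can be checked by counting. This is only a sketch under the same equal-chances assumption; `prob_two_boys` is a helper name made up for the illustration.

```python
from fractions import Fraction
from itertools import product

# Each family is (older, younger); all four combinations are equally likely.
families = list(product("BG", repeat=2))

def prob_two_boys(condition):
    """Chance of two boys among the families that satisfy the condition."""
    matching = [f for f in families if condition(f)]
    both = [f for f in matching if f == ("B", "B")]
    return Fraction(len(both), len(matching))

p_one_is_a_boy = prob_two_boys(lambda f: "B" in f)      # doesn't say which child
p_eldest_is_boy = prob_two_boys(lambda f: f[0] == "B")  # pins down the older child

print(p_one_is_a_boy, p_eldest_is_boy)  # 1/3 1/2
```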

Now we're equipped to move on to the full version of the problem. I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?

Again, gut feel says 'The extra information provided can't make any difference. It must still be 1 in 3.' But startlingly, the probability is now 13 in 27 - pretty close to 50:50.

To explain this I should draw another diagram, but I can't be bothered, so you'll have to imagine it. In this diagram there are 14 children in the first column. First boy born on a Sunday, First boy born on a Monday, First boy born on a Tuesday... First girl born on a Sunday... through to First girl born on a Saturday.

Each of these fourteen first children has fourteen second children options. Second boy born on a Sunday... etc.

That's 196 combinations, but luckily we can eliminate most of them. Either the first or the second child must be a boy born on a Tuesday. So the combinations we're interested in are the fourteen that spread out from 'First boy born on a Tuesday' plus the thirteen that start from one of the other first children and are linked to 'Second boy born on a Tuesday.' So there are 27 combinations in all. How many of these involve two boys? Half of the first fourteen do - one for a second boy born on each day of the week. And for those thirteen with links on the right to 'Second boy born on a Tuesday', six of them will have a boy as the first child (because we don't include 'First boy born on a Tuesday'). So that's 7+6, i.e. 13 combinations that provide us with two boys. So the chance of having two boys is 13 in 27.
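For anyone who doesn't fancy imagining the diagram, the counting above can be sketched in a few lines of Python. It assumes every sex and day combination is equally likely, and the names (`DAYS`, `tuesday_boy` and so on) are mine, chosen for the illustration.

```python
from fractions import Fraction
from itertools import product

DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

# The 14 possible children: (sex, day of birth).
children = list(product("BG", DAYS))

# The 196 equally likely (first child, second child) combinations.
pairs = list(product(children, repeat=2))

tuesday_boy = ("B", "Tue")

# Combinations where at least one child is a boy born on a Tuesday.
matching = [p for p in pairs if tuesday_boy in p]

# Of those, the combinations where both children are boys.
both_boys = [p for p in matching if p[0][0] == "B" and p[1][0] == "B"]

print(len(pairs), len(matching), len(both_boys))  # 196 27 13
p_two_boys = Fraction(len(both_boys), len(matching))
print(p_two_boys)  # 13/27
```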

Common sense really revolts at this. By simply saying what day of the week a boy was born on, we increase the probability of the other child being a boy. But we could have said any day of the week, so how can this possibly work? The only way I can think to describe what's happening is to say that by limiting the boy we know about to a certain birth day, we cut out a lot of the options. We are, in effect, bringing it closer to the sort of effect we get by saying 'the oldest child is a boy'. We are adding information to the picture.
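One way to see the 'adding information' effect is to vary how specific the extra detail is. The sketch below is my own generalisation, not something from the article: it replaces the day of the week with any attribute that has n equally likely values, and counts the combinations exactly as before. With n = 1 (no extra detail) we're back at 1 in 3, with n = 7 we get 13 in 27, and as n grows the answer creeps towards 1 in 2.

```python
from fractions import Fraction
from itertools import product

def prob_two_boys(n):
    """Chance of two boys, given at least one boy with attribute value 0,
    where the attribute takes n equally likely values (n = 7 for weekdays)."""
    children = list(product("BG", range(n)))
    known = ("B", 0)  # the child we've been told about
    matching = [p for p in product(children, repeat=2) if known in p]
    both = [p for p in matching if p[0][0] == "B" and p[1][0] == "B"]
    return Fraction(len(both), len(matching))

for n in (1, 7, 365):
    print(n, prob_two_boys(n))
# 1 1/3
# 7 13/27
# 365 729/1459
```

Counting the same way for any n gives 2n - 1 two-boy combinations out of 4n - 1 matching ones, which is where the drift towards a half comes from.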

The probabilities work. You can model this in a computer if you like and it's correct. But what's going on mangles the mind. Don't you just love probability?
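Here is roughly what such a computer model might look like, as a hedged sketch rather than the one true simulation. It assumes we generate random two-child families and keep only those with at least one boy born on a Tuesday; the function name, the number of trials and the seed are arbitrary choices of mine.

```python
import random

def simulate(trials=200_000, seed=0):
    """Estimate P(two boys | at least one boy born on a Tuesday)."""
    rng = random.Random(seed)
    matching = 0
    both_boys = 0
    for _ in range(trials):
        # Each child is (sex, day), with days numbered 0-6; 2 stands for Tuesday.
        kids = [(rng.choice("BG"), rng.randrange(7)) for _ in range(2)]
        if ("B", 2) in kids:
            matching += 1
            if kids[0][0] == "B" and kids[1][0] == "B":
                both_boys += 1
    return both_boys / matching

estimate = simulate()
print(round(estimate, 3), round(13 / 27, 3))
```

The estimate should land close to 13/27 (about 0.481) rather than 1/3 or 1/2, though being random it will never match exactly.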

(I ought to say, by the way, that this isn't quite a match to reality. It assumes there is an equal chance that either child is a boy or a girl, and that there is an equal chance of being born on each day of the week. In reality neither of these is quite true, but that doesn't matter for the purposes of the exercise.)

13 comments:

  1. Received by email from Graham:

    Nice one Brian,

    I remember working out the same kind of thing in relation to the chances of throwing a 6 with two dice, and it is not 1/3 but of course 11/36.
    What is of interest to me is how strong the drive to believe in intuition is, and how weak is our ability to process numbers like this unless we follow given rules. We always rely on intuition at some level, it's strange but maybe comforting?

    With rolling a six with two dice, we want to say, hey, two goes must be twice the chance, but if we change the problem slightly, and say, heads or tails, then it becomes twice the chance which means certainty, which is obviously nonsense, or if we stay with 6's and have 100 dice then do we say we have 100 x 1/6 chance of rolling a 6, and again it is nonsense.

    A reframing of your set problem is easier if you imagine shooting the children. As they are dead and you cannot kill them twice the intuition becomes easier to accept.

    Good problem

    Graham

  2. I have flipped two 1 euro coins. One of them is swans up. What is the probability that other is map up?

    There are 5860 million euro coins in circulation. There are swans on the obverse of 99.8 million euro coins.

  3. Nice one, Pekka, but I think us non-Eurozone people need a bit more information. Do all euro coins have a map on one side?

    There's something comfortingly consistent about heads or tails...

  4. I've really confused myself, perhaps you can help. I understand how people are getting the answer 13/27 and I also understand that the answers to probability questions can be counter-intuitive but something doesn't seem right.

    Because the gender and day chosen in the question don't matter, what if we rephrase the question like this:

    I'm conducting a telephone survey of two-children families. The respondent has to answer my question in the form -

    "I have two children. One of them is a born on a " (where the child I'm given information about is chosen by the respondent at random.)

    I ring up one particular mother but there's a bad line. The answer I get is "I have two children. One of them is a -brzt- born on a -brzt-." What is the probability that both of her children are the same gender?

    I don't see why it's not 1/2 but I also don't see why it's not 13/27.

    Similarly, I now don't know what the answer would be to the simpler question if you rephrased it like this. 1/2 or 1/3?

    Basically I've got myself in a muddle.

  5. Oh I don't think I should have used the brackets I did - the internet seems to have eaten some of my comment.

    The fourth paragraph should have read "I have two children. One of them is a (gender) born on a (day of the week)" (where the child I'm given information about is chosen by the respondent at random.)

  6. Hi Rich,
    I'm afraid I don't quite follow your example. Taking the simpler case without the day of the week, the reason it's 1 in 3 rather than 1 in 2 is that we don't know which child is being referred to, but we do know that they aren't both girls, so it's a different situation to where we don't receive the information about gender. In the case of your brzt'd call, we don't have any information, so the probability that both children are the same gender is 50:50, because the options are BB BG GB and GG (see the diagram in the post) and two of these have children of the same gender.

  7. My problem was that no matter how you fill in the blanks (and the blanks were filled, we just didn't hear how), you end up with a question to which the answer ought to be 13/27. Because:

    1) All children have a gender and a day of birth.

    2) The gender and the day of birth given in the problem above are arbitrary and don't affect the answer.

    So even though we don't know which precise question we're being asked, we do know that it's one of the set of questions to which the answer is 13/27.

  8. So the problem seems to be that in the indecipherable phone call the woman selects the child she's going to tell me about at random, which isn't the same way that the child is picked in the example.

    So the answer to my phone call problem is 1/2. As you say, I don't have any relevant information.

    On the other hand, the answer to this related question WOULD be 13/27:

    'A foreign friend of mine told me a puzzle the other day. Unfortunately my understanding of their language isn't so great; I know the words for the two genders but not which is which and I know the words for the seven days of the week but not which is which. As best as I could make out the puzzle went like this "I have two children. One is a (gender word) born on a (day of the week word). What is the probability that I have two (same gender word as before)?" Despite my linguistic shortcomings, what was the answer to his puzzle?'

    I thought the two examples were equivalent, but they aren't.

  9. Extra information does change probabilities. The classic example is the goats and Ferrari problem.

    I'm sure you know the one. There's a game with three doors, two with goats behind, one with a Ferrari. You can pick any door, and you win what's behind it. You choose a door and announce it. The game show host opens another door and shows you a goat. Should you stick with the door you chose, change doors, or does it not matter?

    The instinctive response is it doesn't matter. You've seen a goat, so there's a 50:50 chance that the Ferrari is behind either door. But actually you should swap, because there's a 2/3 chance the Ferrari is behind the other door.

    The reason the probability is different from expectation is because of information. The game show host knew where the Ferrari was, and opened a door that didn't have a Ferrari behind it. The door you chose always had a 1 in 3 chance of being the Ferrari, so he showed you which door now has a 2 in 3 chance.

  10. Wrong, wrong, wrong.

    Probability is a fraction. The number on the top is the number of "true" INDEPENDENT events - in this case having two boys.

    The number on the bottom is the total number of INDEPENDENT possibilities.

    There is a reason why I capitalise independent. It is because having at least one boy reduces the number of possible events to an "either or" - either both are boys or one is not.

    Winning the lottery this week neither improves nor reduces the PROBABILITY of winning it a second week. The first week is independent of the second, the sex of the first child is independent of the second.

    Because probability is a fraction, and because you have changed the denominator, you change the probability from being 75% that one is a boy to 100% that one is a boy. This lets you calculate the AND expression "what is the probability that p1 one is a boy AND p2 the other is a boy". p1 = 1.0, p2 = 0.5; p1 AND p2 = 1 x 0.5 = 0.5.

    You may think that if you win the lottery this week, where say p1 = 0.0000001, that the "probability" you tell yourself of winning it again - AFTER you have collected your winnings - is 0.00000000000001. The truth is once you win it the first week, you have the same chances the next week. It is only if you haven't won anything yet that the "probability" of winning on both weeks is so small.

    Start reading up about the strange world of AND and OR, where if you say "heads AND heads" the p = 0.5 x 0.5 = 0.25, and if you say "heads OR tails" the p = 0.5 + 0.5 = 1.0

  11. I'm not quite sure what you are saying is wrong. If I show you which boy I mean, then you are right, they are independent. But if I don't, I haven't identified which of the two children I'm referring to, so they aren't independent.

    Similarly with the lottery, the probability of winning the lottery twice is different before the first win and after the first win.

  12. Hi Brian
    In your "simple" example you say the prob of two boys is 1/3 "because we don't know whether the child referred to was the older or younger". You also say that if it was e.g. the older then the prob is 1/2.
    What, pray, is the difference between a child being "the older" or being "the one referred to"??
    So we have "referred to" child is a boy, "other" can be boy or girl.
    So you are unfortunately wrong. The prob is 1/2.

  13. The difference is quite straightforward. If I say 'I have two children and the older one is a boy,' then we are only working out if the younger child is a boy or a girl. But if I just say 'one of my children is a boy' we are considering two possible children. Just take a look at the diagram - that should make it totally clear.
