Thursday, 9 June 2011

Putting the cat amongst the pigeonholes

The pigeonhole principle is such an obvious statement that few people realise that it's actually useful. However, a few people (who should know better, given who they are) actually get a bit too overly excited and try to use the pigeonhole principle to demonstrate the truth of things which simply aren't true.

Before getting into the pigeonhole principle itself, it may be interesting to muse first on the nature of truth. Mathematical truth is a strong concept. Things are not "true" in mathematics simply because we can't find couter-examples - things are only "true" in the mathematical sense when we can show via a structured and logical argument that there can be no possible exceptions to the rule.

Of particular interest here is the difference between an events which has a probability of 1 and an event which has a probability of "very nearly 1". In every day life, the two things are considered the same. "99%" sure is pretty much a synonym for "I'm sure"; mathematics however is not so forgiving.

Take a pack of cards. Shuffle the cards. Throw them up in the air. Pick them up of the floor and put the back back together. Common sense tells us that the cards won't be in order. Mathematics tells us simply that it's "very unlikely" that they'd be in order - but it's certainly not impossible.

The pigeonhole principle deals with certainty, not probability. The pigeonhole principle is mathematically strong - it tells us what is true - not just what is very likely to be true. The principle is very simple. Simply put it says that if you have more objects than you have boxes, and you put the objects into the boxes then at least one of the boxes will contain more than one object. It doesn't tell us how many boxes contain more than one object, nor does it tell us how many objects are in each box - it tells us only that at least one, unspecified, box will contain more than one object. Simple. But useful.

The birthday paradox will provide an interesting diversion here. The birthday paradox is a well known result - though not actually a paradox. If you take a random group of people, then you only need 23 people before there's a better than 50% chance that two people share a birthday. This number is lower than most people imagine, and I guess that's why it's sometimes called a paradox. Anyway, in order to see how the pigeonhole principle comes into play, we need to look at the birthday paradox the other way on.

Rather than looking at how many people we need in order to give us a certain chance of a birthday match, let's fix the number of people and look at how likely it is they share a birthday. For the purposes of this, I'm going to take it as read that there are 366 possible birthdays (we need to include 29th February, as it is a birthday for some people...)

If we have two people, then the probability that they share a birthday is 1/366. The more people we add to the group, the higher the chance that they share a birthday. As we've said before, once you get to 23 people you have a better than 50/50 chance of two people sharing a birthday. As you get towards 100 people you are really very very likely to have two people sharing a birthday. But you're still not certain that two people with. Right up until you've got 366 people you can't be certain that two people share a birthday. The chance that no two people in a group that size share a birthday is really very small indeed, but it could happen.

However, once you get to 367 people, the pigeonhole principle rears its head and removes the need for complicated calculations. If you must, imagine 366 boxes with a different date written on each. Take your 367 people and tell them to go and sit in the box with their birthday written on it. You could imagine a situation where 366 people are each sitting in their box but you still have one person left over. They have to go somewhere, and which ever box they sit in has two people - and there are our two people who share a birthday. At this point, we can be absolutely sure that two people in the group share a birthday. Of course, this is obvious - but I'm made the description so rigorous so that I can attempt to explain why two ways in which the pigeonhole principle are regularly applied are not so rigorous.

Let's start with one of Stephen Fry's favourite anecdotes. Everyone has two parents, four grandparents, eight great-grandparents and so on. It doesn't take many generations before we have an enormous number of ancestors. The common argument says that this number of ancestors is great than the population of England at the time at the time of Henry IV, and therefore Henry IV must be an ancestor of everyone who has English blood stretching back that far.

It's a great thought. But it's not true.

Once the number of ancestors is greater than the population, we do have a problem - and the pigeonhole principle can help us solve the problem - but the conclusion is the wrong one. The pigeonhole principle applied to this situation tells us that once we go back far enough, at least two the positions in our family tree must be occupied by the same person. This sounds alarming, but it isn't so. One person may have had children who both married, had offspring and so on for many generations until at some point someone from one family marries someone from the other. The bride and groom in this situation will share an ancestor, but it may be so many generations ago that it's well outside of living memory and doesn't present any problems.

This is interesting, but doesn't show that we are all descended from Henry IV. It just shows that at the time of Henry IV, at least two of the ancestral lines in our family tree must converge into a single ancestor. And it won't be the same common ancestor for everyone - and that common ancestor is very unlikely to be Henry IV.

Of course, given so many years, the genes of Henry IV will have spread themselves wide throughout the population and so many of us will be distantly related to him. But the only way to prove that would be to take DNA samples.

It's not just the lovely Mr Fry who gets carried away with the pigeonhole principle though. Marcus Chown once abused the principle to get to an interesting result.

The Marcus Chown case is a little more complicated, and requires a little extension of the pigeonhole principle to the infinite. We can state the infinite case as follows. If we have a finite number of boxes, and an infinite number of objects then when we put the objects into the boxes, at least one of the boxes will have an infinite number of objects in it.

At this point, I know I'm talking nonsense about infinite objects and putting all of them into some boxes, but bear with me. The objects and boxes are both abstract here, but we can look at a real world example.

There are a infinite number of integers. (I'm going to take that as read). Imagine writing them down in full - one, two, three, four, five, etc.

There are 100 tiles in a Scrabble set. Those 100 letter tiles can be used to spell some numbers. But they can only be used to spell a finite number of numbers. It's quite a lot, but it's definitely finite. If that's obvious, then let's take it as read - if it's not obvious then just consider that there are only a finite number of ways of arranging 100 tiles and only some of those will spell out numbers.

So what does that tell us. Well it tells us that if we have two imaginary boxes, one which contains all the numbers you can spell out with a Scrabble Set and the other box contains the ones you can't. The first box contains a finite number of numbers, and so the second must contain an infinite number of numbers.

So, we've shown that there are an infinite number of numbers which you can't spell using the letters from a Scrabble set. I think that's quite cool even if you're bored to death by now.

So where's this going, I hear you ask. Well Marcus Chown used a similar argument when talking on stage a few years ago.

The first part is going to have to be taken as read, even though it's actually quite counter-intuitive. So - the assertion is that there are only a finite number of possible histories for the Universe. The second assertion is that there are an infinite number of Universes. Go with me here...

Marcus Chown took those things, and said that because there are an infitinite number of Universes and a finite number of possible histories, each history must happen an infinite number of times.

Not true I'm afraid. The pigeonhole principle tells us only that at least one of those histories must exist an infinite number of times - not all of them. And it's very unlikely that the particular history corresponding to our Universe exists an infinite number of times.

The result Marcus Chown was presenting is quite possibly true, but the logic he used to get to it was, I'm afraid, fallacious.

We have strayed dangerously close to quantum physics here, and then as we know - nothing becomes certain. And according to quantum physics, if we put a cat in a pigeonhole it's only decoherence which allows us to be pretty sure it's still going to be there when we go to collect it later.

Hmm. Cats. Quantum Physics. There must be a thought experiment there somewhere, if only I could think of one... ;-)

No comments:

Post a Comment