Thursday, June 6, 2013

Bertrand's Box Paradox

Bertrand's Box Paradox refers to this puzzle:

You have 3 boxes before you.
  • Box A contains two gold coins.
  • Box B contains one gold coin and one silver coin.
  • Box C contains two silver coins.
Like so:
Box A
Box B
Box C

Question: Suppose you chose a box at random and withdrew one coin from it, and that coin is gold. What are the chances that the other coin in that same box is also gold?

Well, if I withdrew a gold coin from a randomly selected box, then I must have either Box A or Box B. Since I have two remaining choices, one favoring a gold coin and the other favoring a silver coin, the chances of me pulling out a second gold coin are 50% (aka 50/50).

Box A
Box B

Seems like it should be, right? Turns out it's wrong.

Here's how I wrangled this problem. I did it by using gold and silver coins that can be told apart.

Box A
Box B
Box C

So 3 boxes at 2 coins apiece means there are actually 6 possible outcomes in which I can randomly select a box and pull out coins. Here they are:

Outcome: 1st coin, then 2nd coin
1 Box A: Gold Eagle first, then Gold Buffalo
2 Box A: Gold Buffalo first, then Gold Eagle
3 Box B: Gold Eagle first, then Silver Eagle
4 Box B: Silver Eagle first, then Gold Eagle
5 Box C: Silver Eagle first, then the other silver coin
6 Box C: the other silver coin first, then the Silver Eagle

These are my only six options, and each is equally likely. Per the paradox, I withdrew a gold coin first and not a silver coin. This means that I didn't pick either of Box C's two possibilities. It also means that I didn't pick one of the two Box B possibilities.

Essentially, I have three possibilities left:

Outcome: 1st coin, then 2nd coin
1 Box A: Gold Eagle first, then Gold Buffalo
2 Box A: Gold Buffalo first, then Gold Eagle
3 Box B: Gold Eagle first, then Silver Eagle

From here, it's pretty easy to see that, because these three remaining outcomes are equally likely, my second coin has a two-out-of-three chance of being gold and a one-out-of-three chance of being silver.

And thus the correct answer to Bertrand's Box Paradox is 2/3, or 66.67%.
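
If you'd rather let a computer do the counting, here is a minimal Python sketch of the same enumeration. It assumes every box, and every coin within a box, is equally likely to come out first. The `BOXES` dictionary, the function names, and the label "other silver coin" for Box C's unnamed second coin are illustrative choices, not part of the original puzzle.

```python
from fractions import Fraction

# Box contents as described above. "other silver coin" is a placeholder
# label (an assumption) for Box C's second silver coin.
BOXES = {
    "A": ("Gold Eagle", "Gold Buffalo"),
    "B": ("Gold Eagle", "Silver Eagle"),
    "C": ("Silver Eagle", "other silver coin"),
}


def is_gold(coin):
    """A coin counts as gold if its label starts with 'Gold'."""
    return coin.startswith("Gold")


def enumerate_outcomes(boxes):
    """List every equally likely (box, 1st coin, 2nd coin) draw."""
    outcomes = []
    for box, (coin_1, coin_2) in boxes.items():
        outcomes.append((box, coin_1, coin_2))
        outcomes.append((box, coin_2, coin_1))
    return outcomes


def prob_second_gold_given_first_gold(boxes):
    """Keep only draws whose 1st coin is gold, then count gold 2nd coins."""
    outcomes = enumerate_outcomes(boxes)
    first_gold = [o for o in outcomes if is_gold(o[1])]
    both_gold = [o for o in first_gold if is_gold(o[2])]
    return Fraction(len(both_gold), len(first_gold))


print(prob_second_gold_given_first_gold(BOXES))  # 2/3
```

Counting outcomes this way only works because all six draws are equally likely; if the boxes or coins were picked with unequal odds, each outcome would need a weight.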

Why does this talk of probabilities matter to a Manufacturing Sciences team or cell culture engineer?

Well, understanding the math behind bioreactor contamination, or recovery step yields, is one of the foundations for explaining real phenomena. This matters because your biological system is multivariate.

Not only that, your process steps are sequential: production cultures come after inoculum cultures; harvests after production cultures; ProA after harvest; and so on and so forth. The success of each step often depends on the outcome of the previous step. And CofA attributes measured at a late purification step could be caused by some factor at the production culture stage.

Large-scale biologics manufacturing is complex, far more complex than picking a box with two coins and pulling them out one at a time. Yet the math behind Bertrand's Box Paradox shows us that we muggles are susceptible to missing the mark when conditional probabilities are involved.

Credits: Images are from the US Mint and therefore in the public domain.


JeffJo said...

I have two children, including at least one boy. What are the chances I have two boys? Think about it for a moment, while I give you some history.

Joseph Bertrand published his "Box Paradox" in 1889 as a cautionary tale, about the dangers of treating probability problems based on "the information" alone. What you need to know is how you got the information. But you didn’t discuss his actual paradox.

If you accept the intuitive answer of 50% for your problem (yes, I know it is wrong), then you would also have to say there was a 50% chance the remaining coin is silver if the coin you withdraw is silver.

But then, if you withdraw a coin and hold it in your hand without looking at it, you would have to say there is a 50% chance that the coin in your hand is the same kind of coin as the coin in the box.

But that means that the chances a random box has two of the same kind of coin are 50%. We know this probability must be 67%. This is the paradox. The solution you gave above is the resolution of the paradox, since it shows that 50% is not correct.

And while that solution is correct for the problem as you stated it, it isn't always enough. Suppose that after you pick the box, I look inside of it and tell you that there is a gold coin instead of you withdrawing a coin. 50% is still wrong, but you can't use the "There are six cases, not three" argument, because there are only four cases. The correct solution is that, if I tell you about only one kind of coin, there is a 100% chance I will tell you about gold if you chose the box with two gold coins, but only a 50% chance if the box had a gold and a silver. That makes the answer (100%)/(100%+50%)=67%.
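
The "(100%)/(100%+50%)" calculation can be checked with a short Python sketch of Bayes' rule. It assumes each box is equally likely to be picked and that the person looking inside names one kind of coin that is present, choosing at random when both kinds are there. The variable and function names are illustrative only.

```python
from fractions import Fraction

# Prior: each of the three boxes is equally likely to be chosen.
prior = {"GG": Fraction(1, 3), "GS": Fraction(1, 3), "SS": Fraction(1, 3)}

# Assumed reporting rule: the observer names one kind of coin in the box,
# picking at random between gold and silver when both are present.
p_says_gold = {"GG": Fraction(1), "GS": Fraction(1, 2), "SS": Fraction(0)}


def posterior_given_report(prior, likelihood):
    """Bayes' rule: weight each box by prior * likelihood, then normalize."""
    joint = {box: prior[box] * likelihood[box] for box in prior}
    total = sum(joint.values())
    return {box: p / total for box, p in joint.items()}


posterior = posterior_given_report(prior, p_says_gold)
print(posterior["GG"])  # 2/3
```

The equal priors cancel in the ratio, which is why the comment's shortcut of 100% / (100% + 50%) gives the same 2/3.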

Most puzzle books that "solve" the Two Child Problem I gave at the start of this comment will say the answer is 33%. There are three possible family types that include a boy, and only one has two. But that's wrong; in fact, the problem is identical to Bertrand's Box Problem if you add a fourth box, and put a gold coin and a silver coin in it. It seems that most of these authors failed to heed Bertrand's cautionary tale, which is quite sad, because many of them also present his problem in the same books.
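
A small Python sketch makes the dependence on "how you got the information" concrete for the Two Child Problem. It assumes the four family types (BB, BG, GB, GG) are equally likely and compares two hypothetical reporting rules: the parent mentions the gender of one child at random, or the parent always mentions a boy whenever there is one. Both rules and all names here are assumptions chosen for illustration.

```python
from fractions import Fraction

# Four equally likely family types, like four boxes: BB, BG, GB, GG.
FAMILIES = ["BB", "BG", "GB", "GG"]
prior = {family: Fraction(1, 4) for family in FAMILIES}


def p_two_boys(p_mentions_boy):
    """P(two boys | parent says 'at least one boy') under a reporting rule."""
    joint = {f: prior[f] * p_mentions_boy[f] for f in FAMILIES}
    return joint["BB"] / sum(joint.values())


# Rule 1 (assumed): the parent mentions one child's gender at random.
random_mention = {
    "BB": Fraction(1), "BG": Fraction(1, 2),
    "GB": Fraction(1, 2), "GG": Fraction(0),
}

# Rule 2 (assumed): the parent always mentions a boy if there is one.
always_boy = {
    "BB": Fraction(1), "BG": Fraction(1),
    "GB": Fraction(1), "GG": Fraction(0),
}

print(p_two_boys(random_mention))  # 1/2
print(p_two_boys(always_boy))      # 1/3
```

Under the first rule, which mirrors the four-box version of Bertrand's problem, the answer is 1/2; the puzzle-book answer of 1/3 only appears under the second rule.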

Anonymous said...

OMG I was going nuts trying to wrap my head around the whole 2/3 chance remaining after selecting the first coin.....but using separate coins made this soooooooooooo much easier....makes perfect sense now....cheers

Anonymous said...

Aah, I still don't get it! Which is good. It took me a while to understand the Monty Hall problem, but when I did, it was such a mind blow.

OLY said...

It's a variation of the Monty Hall problem.

JeffJo said...

For June 16th's anonymous: Look at it this way: Considering just the first coin, what is the chance you'll get a gold coin? How about a silver coin? Note that these chances can't be different, so both have to be 50%. But 2/3, or 67%, of the boxes have a gold coin. This is that paradox. It happens because it is possible to draw a silver coin first out of a box that also has a gold coin, so you can't count those. Now consider the second coin, given that the first is gold. And remember that you still can't count all of the boxes that have a gold coin, just those where the first coin drawn is gold.

The Monty Hall problem is a variation of the Box Paradox; in fact, the math is identical. But the actual similarity in the problems is quite subtle, and most people can't find it. The concept that ties them together is that the information you get (there is a gold coin, one child is a boy, Door #3 doesn’t have the desired prize) is always based on a choice between two options that are equivalent except in name - like a gold coin vs. a silver coin, or a boy vs. a girl. If you try to describe this choice in terms of car vs. goat, you will fail because the choices aren't equivalent. If you try to describe it in terms of the door numbers you will fail because there are three, and either one or two represent choices that can't be made.

But there is a way to make Monty Hall equivalent to Bertrand's Box Paradox. The box can contain gold coins but not silver, or silver coins but not gold, or both kinds. In the game with Monty Hall, you first need to randomly label the two doors you didn't pick Gold and Silver. Your door is either different than Gold but not Silver, different than Silver but not Gold, or different than both. When Monty Hall opens a door - say it is Silver - to reveal a goat, you find out that your door is different than Gold because one must have the car and one must have a goat. But you don’t find out whether your door is different than Silver. The problems are now identical.

Unknown said...

Oliver - one tiny suggestion to make it a little more clear. Could you state that you pick the second coin from the same box as you picked the first from? "pick a box at random to choose the first coin. It is gold. Then what is the probability that you pick a gold coin if you pick the remaining coin from the box you initially chose."

Maybe I am the only one that didn't get that right away, and if so, feel free to ignore me!