Log in

No account? Create an account
A-level probability question, to bring joy to children at Christmas:… - Sally's Journal
November 19th, 2013
12:01 pm


Previous Entry Share Next Entry
A-level probability question, to bring joy to children at Christmas:

Tesco have an offer on Octonauts figures at the moment (that's true, so if you were hoping to buy some, now you know). The parents of an adorable 2 year old want 5 out of the 8 characters (he owns 3 already). However, the figures are not listed separately, just as 'one supplied.'

How many should they order into store to have a good chance (say >90%) of getting the 5 they want out of the random selection that Tescos send? There's no issue with taking any surplus back, so they could order vast quantities... but that would seem a little absurd.

[I can see 3 similar problems. (a) the simple problem, where 'Tescos' is a box containing all 8 characters, once and once only (b) a finite problem, where Tescos is a finite but large box (say containing 64 octonauts, 8 of each) and (c) an infinite problem, where Tescos is an Octonaut producing machine that can produce as many Octonauts as are ordered and dispenses them at random. I think the actual problem is (c), but I'm quite interested in the solutions to (a) and (b) as well and how it makes the maths different...]

My vague thoughts...

The simple problem would be easy. If you only bought 7 octonauts, there'd be a 5/8 chance that the last octonaut was in the set 'octonauts I wanted' and a 3/8 chance it wasn't. So that's only a 37% chance of a happy toddler... so you buy all 8 and have 100% chance. (is that right? Have I done something wrong?)

I wondered if it's best tackled as 1 - the probability you don't get all the figures you want. I thought briefly this was 1 - (3/8)^n, where n is the number of figures you buy, but actually this is Wrong, because that is the probability you get at least one figure you want, not that you get all five. And you must have to do something a bit like permutations and combinations, because four Pesos is different to a Peso, a Kwazii, a Barnacles, and a Tweak, even if in both situations you have 'four things you wanted'

So the first octonaut comes out of the machine. You have a 5/8 chance it's one you want, and a 3/8 chance it isn't. Gah, I'm going to draw a huge probability tree, which is hard in text. If you got the one you wanted, you now have a 4/8 chance of getting another one you want, and a 4/8 chance of not doing so.

*draws tree diagram up to three Octonauts*

So the number of octonauts at each node of this tree is a binomial distribution ( minus 1). That is, after one octonaut, the nodes are 'have 1 wanted octonaut' or 'have 0 wanted octonauts' And after two octonauts, the nodes are 2, 1, 1, 0. And after 3 it's 3,2,2,1,2,1,1,0 = 1*3,3*2,3*1,1*0.

Oh, this is coin tosses! Every time you pull the octanaut lever it's like tossing a coin, and heads is 'octonaut I want' and tails is 'octonaut I didn't want'. EXCEPT it's a biased coin (the first time you have 5/8ths chance of wanted Octonaut, rather than 1/2) and EXCEPT the bias of the coin changes based on what you got... because if you already have one, you have 4/8th chance of a wanted Octonaut, and if you already have two you have a 3/8ths chance... So, err, not much like coin tosses.

So it doesn't matter how you got to the 'I have two octonauts I want' node, any future path from their is the same. But there are lots of different ways of getting to (eg) the 'I have two octonauts I want' node and their probabilities aren't the same - eg after 3 Octonauts, 1,1,0 is more likely (P = 0.20ish) than 0,1,1 (P = 0.12 ish). Which makes sense, because you are more likely to get an Octonaut you want when you don't have any octonauts you want at all.

Hmm. I could solve it by drawing a Huge Diagram, but I'd get something wrong. My brain is saying 'simulation!' but a) that's not Real Maths, and b) I don't know how to do it in a quick and dirty way. Not in Excel, I guess... ;-)

Any help?

ETA: I _think_ I've solved this now, although in a very cludgy way!

See comments, especially

(33 comments | Leave a comment)

[User Picture]
Date:November 19th, 2013 12:13 pm (UTC)
There was a similar question on facebook a while ago:

I wonder how many facebook friends you need so that you are more like than not to have a friend with a birthday on every single day of the year?

This was solved neatly by Ben:

Under (presumably) the same assumptions as above (uniform distribution of birthdays, ignoring leap years):

Define P(N,D) to be the probability that N people have exactly D different birthdays. Then we want to find the smallest N* such that P(N*,365)>0.5.

I'm thinking induction is the way to go. Suppose we have P(N-1,d) for all d, i.e. we know the chance of N-1 people having exactly d birthdays (for d = 1, 2, ... N-1).

Then to get P(N,D) there are two ways:
1) The N-1 people had D-1 different birthdays and the Nth person has a different birthday. The chance of this is (365-(D-1))/365 = (366-D)/365.
2) The N-1 people had D different birthdays and the Nth person has the same birthday as one of them. The chance of this is D/365.

So the recursion is
P(N,D) = [ P(N-1,D) * D + P(N-1,D-1)*(366-D) ]/365
with initial conditions P(1,1) = 1 and P(1,D) = 0 for D>1.

You can probably solve this with generating functions, but to be honest I don't care that much! A simple bit of code gives N* = 2287. This is pretty close to Vincent's answer. I wonder what happens if Vincent computes the median, and not the mean? Of course, since I said the code was simple, it's also quite possible that I made a mistake!


I think this is probably the solution for this... define P (N,X) as the probability after buying N Octonauts you have X Octonauts that you want? (I did this as P(N,O) with O for Octonaut initially, but it's _awful_ notation, because O looks like 0) Then we could have the recursion as P (N,X) = P(N-1,X) * P(didn't get an octonaut you want, given you already have X) + P (N-1,X-1)* P (did get an octonaut you want, given you already have X-1)

P(didn't get an octonaut you want, given you already have X) is easy, it's (X+3)/8 (so as soon as you have all 5, you never get an octonaut you want). Likewise P (did get an octonaut you want, given you already have X-1) is (6-X)/8

But I don't know how to actually do the recursion and calculate the answer...

Edited at 2013-11-19 12:16 pm (UTC)
[User Picture]
Date:November 19th, 2013 12:17 pm (UTC)
Presumably if I could calculation the answer, the answer is to find the smallest N such that P(N,5) > 0.9 ...
[User Picture]
Date:November 19th, 2013 12:26 pm (UTC)
From a helpful friend's facebook message:

It sounds like a variant on the coupon collector's problem: http://en.wikipedia.org/wiki/Coupon_collector%27s_problem (the variant being you already have 3 of the `coupons'). I haven't read the wiki article in any detail; I just knew the name of the problem.
[User Picture]
Date:November 19th, 2013 01:02 pm (UTC)
Right, that sounds good to me. I assumed this would be easy, but now I think it isn't :)

An alternative way of putting it would be to say "I'm trying to collect N coupons, and every box I order has a 5/8 chance of containing a coupon". I'm not sure if that would be any easier to solve.
[User Picture]
Date:November 19th, 2013 12:45 pm (UTC)
Are all the octonauts produced in equal quantities? Often with children's characters, some are more popular than others and those are produced in greater quantities.

Also, this reminds me of a time when I was small and wanted charmkins, and my mother and aunt, going out shopping, said they would buy me one charmkin, and which did I want (Blossom), and (in case they didn't have her) which would be my second choice (Brown-eyed Susan)?

But as a Lovely Suprise, they in fact bought me both. And although I was grateful, I was also really sad, because why on earth did they assume that the Charmkin I wanted should Blossom not be available was the same as the Charmkin I would want to go alongside Blossom? In fact, I definitely wanted a humanoid Charmkin followed by a non-humanoid Charmkin, and having two humanoid Charmkins was (for reasons I cannot now remember) no better than having one humanoid Charmkin.
[User Picture]
Date:November 19th, 2013 01:05 pm (UTC)
This story is sad.

Also, I had never heard of Charmkins, and have just looked them up and they're so cute!
[User Picture]
Date:November 19th, 2013 12:46 pm (UTC)
If I were Tesco, I'd probably order in 100,000 of Kwazii and 100,000 of Barnacles and so on. And then depending on the eptitude and cost control of my distribution factory I might open the boxes one at a time in order of arrival and fill orders that way. So the answer might be several hundred thousand.

What's the deadline? Christmas? If there's time to iterate I'd order in a preliminary eight and see what I got. And then investigate eBay for low-faff swapping.

Although, last time I counted we had 17 Tescoi in Cambridge. What I'd actually do is cycle to the largest half dozen in order of accessibility and see what they had on the shelves, first.
[User Picture]
Date:November 19th, 2013 12:56 pm (UTC)
^^^ This.
[User Picture]
Date:November 19th, 2013 01:06 pm (UTC)
Yes, that was my worry. But then I became interested in the hypothetical question, rather than the actual problem, which I fear may be as you state it...

_17_ Tescoi? Wow.
Date:November 21st, 2013 11:13 am (UTC)
Toys don't come in boxes of one type, because otherwise a smaller shop which ordered only one or two boxes wouldn't have a good assortment. So opening boxes one at a time actually gives you a fairly good random assortment, depending on whether are are shortpacked.

See: http://tfwiki.net/wiki/Shortpacking

From:Pete Stevens [ex-parrot.com]
Date:November 19th, 2013 10:36 pm (UTC)



my $tests = 16384;
my @trials;
for (my $i = 0; $i < $tests; $i++) {
my $trials = 0;
my @oct = ( 1,1,1,0,0,0,0,0);
while ((eval join '+', @oct) != 8) {
$oct[int(rand() * 8)] = 1; $trials++;
my $c = 0;
for (my $i = 0; $i < 50; $i++) {
if (defined($trials[$i])) { $c += $trials[$i]/$tests; }
print "$i : " . $trials[$i] . "(" . $trials[$i]/$tests . ") - " . $c . "\n";


Cumulative probability distribution

# ./oct.pl
0 : (0) - 0
1 : (0) - 0
2 : (0) - 0
3 : (0) - 0
4 : (0) - 0
5 : 47(0.00286865234375) - 0.00286865234375
6 : 180(0.010986328125) - 0.01385498046875
7 : 350(0.0213623046875) - 0.03521728515625
8 : 545(0.03326416015625) - 0.0684814453125
9 : 729(0.04449462890625) - 0.11297607421875
10 : 840(0.05126953125) - 0.16424560546875
11 : 891(0.05438232421875) - 0.2186279296875
12 : 914(0.0557861328125) - 0.2744140625
13 : 952(0.05810546875) - 0.33251953125
14 : 929(0.05670166015625) - 0.38922119140625
15 : 913(0.05572509765625) - 0.4449462890625
16 : 870(0.0531005859375) - 0.498046875
17 : 791(0.04827880859375) - 0.54632568359375
18 : 800(0.048828125) - 0.59515380859375
19 : 744(0.04541015625) - 0.64056396484375
20 : 642(0.0391845703125) - 0.67974853515625
21 : 633(0.03863525390625) - 0.7183837890625
22 : 521(0.03179931640625) - 0.75018310546875
23 : 459(0.02801513671875) - 0.7781982421875
24 : 387(0.02362060546875) - 0.80181884765625
25 : 370(0.0225830078125) - 0.82440185546875
26 : 361(0.02203369140625) - 0.846435546875
27 : 331(0.02020263671875) - 0.86663818359375
28 : 263(0.01605224609375) - 0.8826904296875
29 : 228(0.013916015625) - 0.8966064453125
30 : 197(0.01202392578125) - 0.90863037109375
31 : 187(0.01141357421875) - 0.9200439453125
32 : 147(0.00897216796875) - 0.92901611328125
33 : 165(0.01007080078125) - 0.9390869140625
34 : 102(0.0062255859375) - 0.9453125
35 : 98(0.0059814453125) - 0.9512939453125
36 : 102(0.0062255859375) - 0.95751953125
37 : 76(0.004638671875) - 0.962158203125
38 : 85(0.00518798828125) - 0.96734619140625
39 : 72(0.00439453125) - 0.97174072265625
40 : 60(0.003662109375) - 0.97540283203125
41 : 53(0.00323486328125) - 0.9786376953125
42 : 41(0.00250244140625) - 0.98114013671875
43 : 47(0.00286865234375) - 0.9840087890625
44 : 33(0.00201416015625) - 0.98602294921875
45 : 30(0.0018310546875) - 0.98785400390625
46 : 25(0.00152587890625) - 0.9893798828125
47 : 25(0.00152587890625) - 0.99090576171875
48 : 24(0.00146484375) - 0.99237060546875
49 : 23(0.00140380859375) - 0.9937744140625
From:Pete Stevens [ex-parrot.com]
Date:November 19th, 2013 10:37 pm (UTC)
I realise the problem said 'don't sledgehammer it, it'd be boring' but I thought about the question for a while and concluding that it would be boring, but not as boring as not doing that :-)
Date:November 22nd, 2013 03:37 am (UTC)
Well, it was good that I was prepared for disappointment - I got 5 different types of Octonauts, but only 2 out of the 5 I wanted. I've now ordered 20 more sets in the hope of a better haul.

(I wanted Dashi, Shellington, Inkling, Tweak and Tunip. I got 4 Kwazii's, 3 Barnacles, Peso, Shellington and Inkling)
[User Picture]
Date:November 22nd, 2013 08:01 am (UTC)
This is sad. Although at least it proves that they don't open one box of 5000 Tunips and fill the whole order from that.

[Also, I wonder if we should adjust our model given that both narratively and based on the data we have, we expect many more Kwazii's and Barnacles...]

Good luck! Do you want me to have a snoop in bar hill tesocs? I'm there most weekends, but maybe posting a Tweak is more faff than it's worth...
[User Picture]
Date:November 22nd, 2013 06:41 pm (UTC)
You really should come to MathsJam one of these days. (The next one is Tuesday December 17th...)
Powered by LiveJournal.com