
An Odd Question

So. This is a math problem that I would once have been able to set up and possibly even solve, but at this point I can’t even figure out how to google the answer. So.

In rolling two dice (2D6) the long-run distribution is easy: for every thirty-six rolls, 2 and 12 come up essentially once each, 3 and 11 twice, 4 and 10 three times, 5 and 9 four times, 6 and 8 five times, and the last six rolls come up showing 7. In the long run. More or less. You could graph it.
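That thirty-six-roll tally is easy to verify by brute force; here's a quick Python sketch (mine, not from the original post):

```python
from collections import Counter

# Tally all 36 equally likely (die1, die2) outcomes by their sum.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

for total in range(2, 13):
    print(total, counts[total])
```

The printout is the ziggurat in table form: 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1 chances out of 36.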

In the short term, of course, longer-odds outcomes will likely come up more often than shorter-odds outcomes. For instance, after one roll, the odds are very good indeed (five-to-one) that something that is not a seven will have come up more often than something that is a seven. After six rolls, the odds are (if I am getting this right, and perhaps I am not) even that you will have rolled a seven at all, which means that the odds are slightly better than even that you will have rolled some longer-odds number more often than a 7 (counting the outcomes where you roll, for instance, one 7 and two 6s in those six rolls).

Is that clear? For any given number, 7 will come up more frequently in the long term, but in the very short term (six rolls), the odds favor some number coming up more often than the 7. If you have ever rolled dice a bunch of times, this will be instinctive, I think; if not, grab a couple of dice or a dice-rolling app and try.
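If you'd rather not roll physical dice, here's a quick Monte Carlo sketch of that six-roll claim (the helper name and trial count are my own choices):

```python
import random

def some_number_beats_seven(rng, rolls=6):
    """Roll 2d6 `rolls` times; True if any other total comes up
    strictly more often than 7 does."""
    counts = [0] * 13
    for _ in range(rolls):
        counts[rng.randint(1, 6) + rng.randint(1, 6)] += 1
    return any(counts[t] > counts[7] for t in range(2, 13) if t != 7)

rng = random.Random(1)
trials = 100_000
p = sum(some_number_beats_seven(rng) for _ in range(trials)) / trials
print(p)  # better than even
```

On my back-of-envelope reasoning this lands somewhere around two in three, which squares with the "slightly better than even" instinct above.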

Now, let’s see if I can generalize while remaining clear: if you roll, say, a hundred times and graph the results, you’ll probably have something that more or less approximates the ziggurat shape the distribution would predict. Something like this:
[image: histogram of 100 rolls, roughly the expected ziggurat shape]

But still there might be more, for instance, 8s than 7s:
[image: histogram with more 8s than 7s]

Or even more 5s than 7s:
[image: histogram with more 5s than 7s]

Because 100 rolls really is still a short run of rolls, innit? Let’s try 200:
[image: histogram of 200 rolls, with 7 the most frequent]

Look! There are more sevens than anything else. Of course, there are just as many 5s as 6s. I’ll try again.
[image: another 200-roll histogram]

This time there were more 4s than 5s.
[image: another 200-roll histogram]

More 10s than 9s.

So, this is my question: it seems to me that for a reasonable number of dice rolls, the odds are in favor of some longer-odds result coming up more frequently than some shorter-odds result. I could figure out the odds of more 10s than 9s, or of more 4s than 5s, but I don't know how to figure out the general odds of there being at least one instance of more x than y over n rolls, where the odds of x are longer than the odds of y. And what I'm really wondering is this: it feels to me as if it would be possible to look at that as a function of n, such that as n increases the odds (that is, the odds of at least one such low-odds incidence occurring) decrease, and that therefore there exists an n such that the odds are even, with lower n having greater odds and higher n shorter. I'd like to know how big that n is. But I could be wrong about there being such an n; my instincts for probability functions aren't all that good.
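The piece that is figure-out-able — the odds of more 10s than 9s over n rolls — does have a clean exact computation: track the distribution of the difference (#10s minus #9s) one roll at a time. A sketch of that idea (the function name is mine; this does not answer the harder "at least one inversion anywhere" question):

```python
from fractions import Fraction

def p_more_x_than_y(px, py, n):
    """Exact P(count of X > count of Y) over n independent rolls,
    by tracking the distribution of the difference (#X - #Y)."""
    p_other = 1 - px - py
    dist = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for diff, p in dist.items():
            for step, q in ((1, px), (-1, py), (0, p_other)):
                nxt[diff + step] = nxt.get(diff + step, Fraction(0)) + p * q
        dist = nxt
    return sum(p for diff, p in dist.items() if diff > 0)

# More 10s (3 chances in 36) than 9s (4 in 36) over 36 rolls:
print(float(p_more_x_than_y(Fraction(3, 36), Fraction(4, 36), 36)))
```

By symmetry of the setup, swapping the two probabilities gives the (larger) chance of more 9s than 10s, and the two, plus the chance of a tie, sum to one.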

There’s a lesson about the world in all this, that low-probability things happen all the time, and so forth. Mostly, though, it’s about actual dice: when you shoot craps or play Settlers of Catan or Monopoly or whatnot, don’t expect that over the course of the game there will necessarily be more 8s than 9s. The actual odds are that somebody will get screwed somehow.

Tolerabimus quod tolerare debemus,


The probability of at least one seven coming up in 6 rolls is 1 - (30/36)^6, which is about .67.
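For what it's worth, that figure is easy to confirm:

```python
# Probability of at least one 7 in six 2d6 rolls:
# each roll misses 7 with probability 30/36, so
p = 1 - (30 / 36) ** 6
print(round(p, 3))  # 0.665
```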

As for your more complicated scenarios, I wouldn't know how to solve them without simulating it with code.

If you can figure out the odds of more 10s than 9s, it seems like you can use that to figure out more 9s than 8s, since the percentage change in probability is the same for every "step", right? There are five of those steps on each side, so for a 50% chance of each side being in sequence you need to find the number where 87% of the time, 7s are higher than 6s. If you want even odds for both sides to be in sequence, you need .933 at each step (.933^5^2 = .5). However, there's still the chance there that 2s would be lower than 3s but higher than both 12s and 11s, so that doesn't quite do what you want. But perhaps it's a start?

OK, you just need to square it again to account for the fact that each number needs to "beat" two numbers. Find the point where 7s are higher than 6s .966 of the time (.966^2^5^2 = .5)

Sorry to clog the comments... but after further thought I don't think it's that simple. The outcomes are not independent -- if there are fewer 7s than 6s there are more likely to be fewer 7s than 8s since you probably didn't roll a lot of 7s. Also, I think it's to the 18th power rather than the 20th, which actually doesn't change much (.962 vs .966). So my revised guess is somewhere around .95 chance of 7s > 8s to give you 50/50.
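The per-step thresholds in these comments are just k-th roots of one half; for anyone following along (the particular powers are the ones the comments use):

```python
# If the whole shape requires k independent "steps" to each hold,
# each step must hold with probability 0.5 ** (1/k) for even odds overall.
for k in (10, 18, 20):
    print(k, round(0.5 ** (1 / k), 3))
```

That prints .933 for 10 steps, .962 for 18, and .966 for 20, matching the figures above — subject, of course, to the independence caveat just raised.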

Fred—Thank you... I may have forgotten more about probability than I had thought. And it hadn't occurred to me that they weren't independent, but of course they aren't.

In one sense, this seems like a problem that would be frequently set for a college level probability course, but I'm pretty sure I've never seen it worked out. On the other hand, I never actually took probability, and have only glanced at texts.


I'm not going to attempt an analytical solution; it seems like it'd be a long and messy equation.

But what's short and hopefully useful is to get raw data by rolling a few hundred million dice. That's a nice thing about having the entire world's computational might circa 1992 inside my MacBook. I wrote a quick program in Chipmunk Basic, which I'll paste below for your inspection or giggles, and ran it. Here's the scoop:

Rolling the dice 100 times gives you a strict more-of-each-until-7-then-less-of-each only about 1% of the time.
Rolling them 200 times, it happens about 8% of the time.
Rolling them 300 times, it happens about 19% of the time.
Rolling them 400 times, it happens about 29% of the time.
Rolling them 500 times, it happens about 41% of the time.
600 times is where we get to just about even odds.
Rolling them 800 times, it happens about 65% of the time.
Rolling them 1000 times, it happens about 75% of the time.
Even rolling them 2000 times you still only get "ideal" distribution 96% of the time.

Thanks for giving me a good excuse to spend a few minutes programming! This is the code for the final run, with 2000 dice. I just changed line 90 for the other runs.

10 dim tot(12)
20 for i = 1 to 1000
30 let flag = 0
40 for x = 2 to 12
50 let tot(x) = 0
60 next x
70 let p = int(timer()*1000000)
80 randomize (p)
90 for x = 1 to 2000
100 let roll = 2+int(rnd(6))+int(rnd(6))
110 let tot(roll) = tot(roll)+1
120 next x
130 for x = 2 to 6
140 if tot(x) >= tot(x+1) then let flag = 1
150 next x
160 for x = 8 to 12
170 if tot(x) >= tot(x-1) then let flag = 1
180 next x
190 if flag = 0 then let gaussian = gaussian+1
200 if flag = 1 then let nongauss = nongauss+1
210 next i
220 print gaussian/10 "%",,nongauss/10 "%"
230 end
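For anyone without Chipmunk Basic handy, here is a rough Python equivalent of Jim's program (my port; names and the random-number plumbing are mine):

```python
import random

def strict_ziggurat(n_rolls, rng):
    """Roll 2d6 n_rolls times; True if the counts rise strictly
    from 2 up to 7 and fall strictly from 7 down to 12."""
    tot = [0] * 13
    for _ in range(n_rolls):
        tot[rng.randint(1, 6) + rng.randint(1, 6)] += 1
    return (all(tot[x] < tot[x + 1] for x in range(2, 7)) and
            all(tot[x] > tot[x + 1] for x in range(7, 12)))

rng = random.Random(0)
trials = 1000
hits = sum(strict_ziggurat(600, rng) for _ in range(trials))
print(hits / trials)  # near even odds at 600 rolls, matching Jim's run
```

Change the 600 to rerun any of the other row counts above.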

Thank you, Jim! So not only can it be approximated as a simple function, it's pretty much a linear function (OK, no, but a linear function stays pretty damn' close for the first thousand rolls) and the n I was looking for is around 600. That's a lot. If you were to stand at the craps table for two hours, you would still be very safely in the 'unusual things are the expected outcome' range. Or... three games of Monopoly? How many games of Settlers?

Unfortunately, though, I can't see a way to make it work as a bar bet.

