I think I need a little help with math and probability. This is one of those things about choosing colored balls from a bag, but I only sorta remember how to do the math on it—and it isn’t actually about choosing the colored balls from the bag. Alas.
The institution that employs me has a policy of spot-testing for COVID-19. They test 5% of the students every week—let’s say, for the sake of mathematical ease and a fairly good approximation, that there are 5,000 students and that they test 250, chosen at random, every Wednesday.
In the first week of this program, those tests reveal one positive case: call it 1/250 or 0.4%. My understanding is that this ratio implies an expectation of 20/5,000 (0.4%), or 19 untested students walking around carrying the disease. Now, I’m using “expectation” here as a mathematical term, and so probably incorrectly, but my understanding is that if there were fewer than, say, 5 students with this disease, it would be a stroke of luck to catch one of them in the 250 tests. My instinct is that it would be surprising if there were fewer than 10 or more than 30 COVID-19-positive students on campus untested. Does that sound right?
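That instinct can be sanity-checked with the hypergeometric distribution, since this really is the colored-balls-from-a-bag setup: 250 students drawn without replacement from 5,000. A minimal sketch, assuming exactly k infected students and a truly random sample:

```python
from math import comb

POP, SAMPLE = 5000, 250  # student body and weekly test count from above

def p_at_least_one(k):
    """Probability the 250-student sample catches at least one of k infected."""
    # P(zero infected in the sample), by the hypergeometric distribution
    p_none = comb(POP - k, SAMPLE) / comb(POP, SAMPLE)
    return 1 - p_none

for k in (5, 10, 20, 30):
    print(k, round(p_at_least_one(k), 3))
```

With only 5 infected students, catching at least one runs at about a 23% chance, so a hit would indeed be a stroke of luck; with 20 infected, the sample catches at least one roughly two times in three.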
However, in the second week, they tested another 250 people and had no positive results at all. 0%.
Now, they are currently displaying that result as 1/500, or 0.2%. That ratio implies 10/5,000 positive—10 students, of whom one was caught-and-treated. But that sounds wrong to me as a description of what is going on: for one thing, the status of individuals changes over the course of a week, so there’s a big difference between a single test of 10% of the populace and two tests of 5% a week apart. And both the prevalence and the rate of spread are useful things to know, the rate perhaps more so, since it tells us whether our containment practices are, you know, working.
It sounds to me like the first draw got ‘lucky’ in catching not 1/20 of the students who were sick but 1/10 or 1/5—surely the spread rate is at least one-for-one—but then it’s completely plausible mathematically that there are 20 students who would have tested positive at the time of the Week Two test, and that none of them happened to be among the 5% chosen.
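That “completely plausible” claim checks out numerically. A sketch, again assuming 20 infected among the 5,000 and a genuinely random sample of 250:

```python
from math import comb

POP, SAMPLE, INFECTED = 5000, 250, 20  # assumed numbers from above

# Probability that none of the 20 infected land in the 250-student sample
p_miss_all = comb(POP - INFECTED, SAMPLE) / comb(POP, SAMPLE)
print(round(p_miss_all, 3))
```

A zero-positive week is entirely consistent with 20 infected students on campus: the sample misses all of them roughly one time in three.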
I will add: in between the randomized tests for Week One and Week Two, six students tested positive. I don’t know why they were tested—they may have been ‘contacts’ of the one randomly-found positive, or they may have shown symptoms, or perhaps they were tested because of some other contact or requirement. I don’t know precisely when they were tested, either—if they had been tested in either randomized test, would they have tested positive or negative? I have no idea. But the implication is, it seems to me, that there are some students walking around with this disease, such that the Week Two result is not an exactly accurate ratio.
So, here’s my question: what should we be expecting from Week Three? What is the range of possible/probable results? What can we know about the ‘actual’ rate of positives among the students, and what would that rate imply for the range of expected results from the 5% test?
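One way to get at the ‘actual’ rate is to scan a likelihood over candidate values of k, the number of infected students, and see which k best explains one positive in Week One and zero in Week Two. A rough sketch, with some loud simplifications: each week is treated as an independent draw where every infected student has a 5% chance of being sampled, k is held constant across both weeks, and the six interim positives are ignored entirely.

```python
# Likelihood of seeing 1 positive in Week One and 0 in Week Two,
# as a function of k infected students (assumed constant across weeks).
P_CAUGHT = 250 / 5000  # each infected student has a 5% chance of being sampled

def likelihood(k):
    week1 = k * P_CAUGHT * (1 - P_CAUGHT) ** (k - 1)  # exactly one of k caught
    week2 = (1 - P_CAUGHT) ** k                        # none of k caught
    return week1 * week2

k_hat = max(range(1, 101), key=likelihood)
print(k_hat)  # the maximum-likelihood k comes out to 10
```

The peak lands at k = 10, matching the naive 1/500 reading—but the curve is very flat: everything from a couple of infected students up to about thirty is within a factor of three of the peak. Two weeks of 5% samples simply cannot pin the number down tightly.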
Or, more specifically, if the result of Week Three is, f’r’ex, five positive tests, can we say with some confidence that there is a growing number of COVID-contagious students, now upwards of a hundred, and if we don’t make some changes it’s going to be much, much more? Or would it be every bit as likely that there have been between thirty and fifty the whole time, and the week-to-week variance is not showing a compelling upward trend at all, and that our current containment practices are working pretty well? I feel like it’s important to know which of those things is the case before the numbers actually come out.
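Those two scenarios can be compared head-to-head: how likely is a five-positive week if there are about 100 contagious students, versus about 40? A sketch with the same hypergeometric setup as before (both k values hypothetical):

```python
from math import comb

POP, SAMPLE = 5000, 250

def p_exactly(hits, k):
    """Probability the sample of 250 contains exactly `hits` of k infected."""
    return comb(k, hits) * comb(POP - k, SAMPLE - hits) / comb(POP, SAMPLE)

for k in (40, 100):
    print(k, round(p_exactly(5, k), 3))
```

Five positives would be something like five or six times likelier with a hundred infected than with forty—real evidence for growth, but not overwhelming, since a steady thirty-to-fifty can still throw a five-positive week now and then.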
Tolerabimus quod tolerare debemus,