In the mid-1700s, the English clergyman Thomas Bayes figured out how to calculate the probability of A given B when you know just three facts ...
When A is an event (say, getting a particular infection) and B is a positive result on a test for event A (say, a test for that infection), then P(B|A), the conditional probability of B given A, is the true positive rate, or sensitivity, of the test for event A, and P(B|notA) is the test's false positive rate for detecting event A.
(In Bayes's time, gamblers were understandably keen on such insights. Wouldn't it have been lovely for Rev Bayes if he'd known his brilliant little theorem would, in the mid-20th century, become the basis of much probabilistic inference?)
His formula for the probability of A given B is ...
$$P(A|B) = \frac{P(A) \times P(B|A)}{\big(P(A) \times P(B|A)\big) + \big(P(\text{not}A) \times P(B|\text{not}A)\big)}$$

See here for a derivation. In words, the formula says the probability of A given B is ...
Let A be "you have the infection".
Let B be "you tested positive for the disease".
P(A) is the present prevalence of the disease in your population, so...
P(notA) = 1-P(A) = the present rate of non-infection
P(A|B) is what we wish to know, the probability we're infected given that we're asymptomatic and tested positive.
P(B|A) is the true positive rate of the test as above, so...
P(B|notA) is the probability of testing positive when you're actually free of infection, the false-positive rate.
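P(A|notB) is the probability of having the disease even though you're asymptomatic and tested negative. That is not the test's false negative rate, which is P(notB|A), the probability of testing negative when you are in fact infected.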
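As a brief sketch of where the formula comes from, using only the definition of conditional probability and the symbols defined above (the intersection notation is introduced here for illustration):

$$
\begin{aligned}
P(A|B) &= \frac{P(A \cap B)}{P(B)} \\
P(A \cap B) &= P(A) \times P(B|A) \\
P(B) &= P(A) \times P(B|A) + P(\text{not}A) \times P(B|\text{not}A)
\end{aligned}
$$

Substituting the second and third lines into the first gives the formula: the numerator counts the true positives, and the denominator counts all positives, true and false.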
The formula is a natural for a simple function ...
    -- lets the function be created without a DETERMINISTIC declaration when binary logging is on
    set global log_bin_trust_function_creators=1;
    delimiter ;
    drop function if exists bayes;
    delimiter go
    create function bayes(
      pA          decimal(4,3),   -- P(A): base rate (prevalence)
      pBgivenA    decimal(4,3),   -- P(B|A): true positive rate
      pBgivenNotA decimal(4,3)    -- P(B|notA): false positive rate
    ) returns decimal(4,3)
    begin
      declare ret decimal(4,3) unsigned default 0.0;
      -- guard against invalid input: return 0
      if pA <= 0 or pBgivenA <= 0 or pBgivenNotA < 0 then
        return ret;
      end if;
      return round( pA * pBgivenA / ((pA*pBgivenA) + ((1-pA)*pBgivenNotA)), 3 );
    end;
    go
    delimiter ;
    select bayes( .10, .95, .05 );   -- returns 0.679

That result says that when 10% of your population is infected, a positive result on a test with 95% sensitivity and a 5% false positive rate gives you a probability of about .68 that you have the infection.
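It may help to see the arithmetic behind that call written out. This is only the formula with the example's three inputs substituted:

$$P(A|B) = \frac{0.10 \times 0.95}{(0.10 \times 0.95) + (0.90 \times 0.05)} = \frac{0.095}{0.095 + 0.045} = \frac{0.095}{0.140} \approx 0.679$$

For comparison, and purely as an illustrative assumption, dropping the prevalence to 1% with the same test gives

$$P(A|B) = \frac{0.01 \times 0.95}{(0.01 \times 0.95) + (0.99 \times 0.05)} = \frac{0.0095}{0.0590} \approx 0.161$$

so the same positive result is far less convincing when the infection is rare.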
You also want to know the probability of a false negative, that is, the probability P(notB|A) that the test wrongly reports a negative result for someone who is infected. You don't need a fancy formula for that: the false negative rate is just 1 - the true positive rate, for the above example 1 - .95 = 0.05.
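The false negative rate describes the test. A related but different question is how likely you are to be infected after a negative result, P(A|notB). Here is a minimal sketch, assuming the bayes() function defined earlier has been created: Bayes's formula still holds if "tested positive" is swapped for "tested negative", so it should be enough to pass the complements of the two test rates.

```sql
-- P(A|notB): probability of infection despite a negative test.
-- Assumption: bayes() is the function created earlier. The second and third
-- arguments are now P(notB|A) = 1 - true positive rate
-- and P(notB|notA) = 1 - false positive rate.
select bayes( .10, 1 - .95, 1 - .05 );   -- .005 / (.005 + .855), about 0.006
```

With the example's numbers, a negative result leaves well under a 1% chance of infection, even though the test itself misses 5% of infected people.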
With that Bayes probability function in hand, we can easily see how the base rate, true positive rate and false positive rate affect the credibility of test results. We build a table of combinations of base incidence rate and test true positive and false positive rates over ranges of interest, for example
    set @@cte_max_recursion_depth = 500000;
    drop table if exists bayes;
    create table bayes(
      baserate  decimal(4,3),
      truepos   decimal(4,3),
      falsepos  decimal(4,3),
      bayesprob decimal(4,3)
    )
    with recursive
    cteA as (                -- base rates, .05 to .95
      select .05 as pA
      union all
      select pA+.05 as pA from cteA where pA < .95
    ),
    cteBgivenA as (          -- true positive rates, starting at 0.5
      select 0.5 as pBgivenA
      union all
      select pBgivenA + .05 as pBgivenA from cteBgivenA where pBgivenA < 0.95
    ),
    cteBgivenNotA as (       -- false positive rates, .01 to .50
      select .01 as pBgivenNotA
      union all
      select pBgivenNotA + .01 as pBgivenNotA from cteBgivenNotA where pBgivenNotA < .5
    )
    select
      cteA.pA as baserate,
      cteBgivenA.pBgivenA as truepos,
      cteBgivenNotA.pBgivenNotA as falsepos,
      bayes( cteA.pA, cteBgivenA.pBgivenA, cteBgivenNotA.pBgivenNotA ) as bayesprob
    from cteA
    cross join cteBgivenA
    cross join cteBgivenNotA ;

That builds a table of 5,700 rows. Queries against it can illustrate how baserate, true positive and false positive test performance rates affect the Bayesian probability that a positive test result means what it says.
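As one possible illustration, the query below holds the test's performance fixed and lists how the probability changes with the base rate. The chosen values, a true positive rate of 0.9 and a false positive rate of 0.05, are assumptions picked only for the example; the table and column names are those created above.

```sql
-- How does the meaning of a positive result change with prevalence
-- when the test itself stays the same?
-- The truepos and falsepos values here are illustrative choices.
select baserate, bayesprob
from bayes
where truepos = 0.9
  and falsepos = 0.05
order by baserate;
```

Each returned row pairs one base rate with the corresponding probability, which makes the result easy to chart.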
For example here's the curve we get for queries against the