Calculate posterior probability P(A|B) using prior probability and likelihood. Two modes: general Bayes (any conditional probability) and medical test (sensitivity, specificity, PPV, NPV, likelihood ratios, and a confusion matrix for 10,000 people). Full step-by-step working every time.
✓ Formula: P(A|B) = P(B|A) × P(A) / P(B) — Thomas Bayes (1763), verified against OpenStax & NIST
🧠 Select Calculation Mode
📌 General Mode: P(A|B) = P(B|A) × P(A) / [P(B|A) × P(A) + P(B|A′) × P(A′)] Enter probabilities as decimals (0 to 1) or percentages (0 to 100%).
0–1
Your initial belief that A is true. Enter as decimal 0–1.
Enter a value between 0 and 1.
0–1
Probability of observing B given A is true.
Enter a value between 0 and 1.
0–1
Probability of observing B when A is false.
Enter a value between 0 and 1.
📌 Medical Test Mode: Enter disease prevalence and test accuracy to get PPV, NPV, and likelihood ratios via Bayes theorem. Includes confusion matrix for 10,000 people screened.
0–1
Disease prevalence in the tested population (0.01 = 1%)
Enter prevalence between 0.0001 and 0.9999.
0–1
P(positive test | disease present)
Enter sensitivity between 0.0001 and 0.9999.
0–1
P(negative test | disease absent)
Enter specificity between 0.0001 and 0.9999.
Posterior Probability
—
📊 Confusion Matrix (per 10,000 people screened)
📐 Step-by-Step Working
⚠️ Disclaimer: Results assume all input probabilities are accurate. Medical interpretation requires clinical judgment. Disease prevalence in your specific tested population (not general population) should be used. Consult a qualified medical professional for diagnostic decisions.
Was this calculator helpful?
Sources & Methodology
✓ Bayes theorem formula verified against OpenStax Statistics and Stanford Encyclopedia of Philosophy. Medical test formulas (PPV, NPV, LR+, LR−) verified against NIST and standard biostatistics references. Confirmed manually: 1% prevalence, 95% sensitivity, 90% specificity → PPV = 8.76%.
Open-access Rice University textbook. Primary source for the Bayes theorem formula P(A|B) = P(B|A) × P(A) / P(B) and the total probability expansion P(B) = P(B|A)×P(A) + P(B|A′)×P(A′).
Standard medical reference for sensitivity, specificity, PPV, and NPV formulas. Source for positive likelihood ratio = sensitivity / (1−specificity) and negative likelihood ratio = (1−sensitivity) / specificity.
Bayes Theorem — Posterior Probability, Medical Tests & Complete Guide
Bayes theorem is the mathematical rule for updating probability beliefs when new evidence arrives. Named after Reverend Thomas Bayes (1701–1761) and formalized by Pierre-Simon Laplace in 1812, it answers: "Given that I observed B, how probable is A?" The answer is the posterior probability P(A|B), calculated from the prior P(A), likelihood P(B|A), and false positive rate P(B|A′).
The Bayes Theorem Formula — Full Derivation
Bayes theorem is derived directly from the definition of conditional probability. The key insight is that P(A and B) = P(B|A) × P(A) = P(A|B) × P(B). Rearranging gives:
P(A|B) = P(B|A) × P(A) / P(B), where the law of total probability expands the denominator: P(B) = P(B|A) × P(A) + P(B|A′) × P(A′).
Complete worked example:
A disease has prevalence P(A) = 1% = 0.01. A test has sensitivity P(B|A) = 95% = 0.95 and false positive rate P(B|A′) = 10% = 0.10.
P(B) = 0.95 × 0.01 + 0.10 × 0.99 = 0.0095 + 0.0990 = 0.1085
P(A|B) = (0.95 × 0.01) / 0.1085 = 0.0095 / 0.1085 = 0.0876 = 8.76%
Despite 95% sensitivity, a positive test result means only 8.76% chance of disease. This is the base rate fallacy — low prevalence means false positives dominate.
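The arithmetic above can be checked with a short Python sketch (the function name is illustrative, not part of the calculator):

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Posterior P(A|B) via Bayes theorem, expanding P(B) with the
    law of total probability: P(B) = P(B|A)P(A) + P(B|A')P(A')."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Worked example: 1% prevalence, 95% sensitivity, 10% false positive rate
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.10)
print(f"P(A|B) = {posterior:.4f}")  # P(A|B) = 0.0876
```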
PPV and NPV — Medical Test Interpretation via Bayes Theorem
In medical diagnostics, Bayes theorem directly determines the clinical usefulness of a test. The four outcomes of a diagnostic test relate directly to Bayesian quantities:
Measure
Formula (Bayes)
Interpretation
PPV (Positive Predictive Value)
P(Disease | Positive Test)
Probability you have disease given positive result
NPV (Negative Predictive Value)
P(No Disease | Negative Test)
Probability you are disease-free given negative result
LR+ (Positive Likelihood Ratio)
Sensitivity / (1−Specificity)
How much a positive result increases disease odds
LR− (Negative Likelihood Ratio)
(1−Sensitivity) / Specificity
How much a negative result decreases disease odds
Sensitivity
P(Positive | Disease)
Fixed test property — independent of prevalence
Specificity
P(Negative | No Disease)
Fixed test property — independent of prevalence
Why Disease Prevalence Changes Everything — The Base Rate Fallacy
The most counterintuitive result from Bayes theorem is how dramatically disease prevalence affects PPV. Sensitivity and specificity are fixed test properties, but PPV is not. As prevalence drops, false positives dominate and PPV falls sharply:
Same test (95% sensitivity, 99% specificity) at different prevalences:
Prevalence 50% → PPV = 98.96% (high prevalence: reliable positive result)
Prevalence 10% → PPV = 91.35% (moderate prevalence: still useful)
Prevalence 1% → PPV = 48.97% (low prevalence: half of positives are false!)
Prevalence 0.1% → PPV = 8.68% (very rare disease: 9 in 10 positives are false!)
Key lesson: Specificity must be extremely high to maintain useful PPV with rare diseases.
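The sweep above can be reproduced with a short Python sketch (the function name is illustrative):

```python
def ppv(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Fixed test (95% sensitivity, 99% specificity), varying prevalence
for prev in (0.50, 0.10, 0.01, 0.001):
    print(f"prevalence {prev:6.1%} -> PPV = {ppv(prev, 0.95, 0.99):.2%}")
```

The loop shows PPV falling from roughly 99% at 50% prevalence to under 9% at 0.1% prevalence, with the test itself unchanged.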
💡 Likelihood ratios are prevalence-independent. LR+ = sensitivity / (1−specificity). LR+ > 10 is strong evidence for disease. LR+ 2–5 is weak to moderate. LR− < 0.1 strongly rules out disease. LR− 0.1–0.2 is moderately helpful. Unlike PPV/NPV, likelihood ratios stay constant across different populations with different prevalences, making them ideal for comparing tests and for Bayesian updating.
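In odds form, updating with a likelihood ratio is a single multiplication: pre-test odds × LR = post-test odds. A minimal sketch (function names are illustrative):

```python
def posttest_probability(pretest_prob, likelihood_ratio):
    """Convert probability to odds, multiply by the LR, convert back."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

# 95% sensitivity, 90% specificity -> LR+ = 0.95 / 0.10 = 9.5
print(f"{posttest_probability(0.01, 9.5):.4f}")  # 0.0876, same answer as full Bayes
```

Because the odds form is algebraically identical to Bayes theorem, it reproduces the 8.76% posterior from the worked example with one multiplication.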
Frequently Asked Questions
Bayes theorem calculates P(A|B) — the probability of A given that B occurred — using P(A) (prior probability), P(B|A) (likelihood), and P(B|A′) (false positive rate). Formula: P(A|B) = P(B|A)×P(A) / [P(B|A)×P(A) + P(B|A′)×P(A′)]. It is the mathematical foundation of Bayesian inference, used in medicine, spam filtering, machine learning, legal reasoning, and scientific research.
The posterior probability P(A|B) is your updated belief about A after observing evidence B. It combines the prior P(A) (initial belief) with the likelihood P(B|A) (how probable B is if A is true) and the false positive rate P(B|A′) (how probable B is if A is false). The posterior is always between 0 and 1 and quantifies how much the evidence should shift your belief.
Sensitivity is P(positive test | disease present) — a fixed test property. PPV is P(disease | positive test) — what you actually want to know clinically. PPV depends on both the test’s sensitivity/specificity AND disease prevalence. A test with 95% sensitivity can have PPV of only 8.76% if disease prevalence is 1% and false positive rate is 10%. Sensitivity and PPV are often confused, leading to misinterpretation of test results.
When disease is rare, even a small false positive rate generates many false positives because the healthy population is large. Example: 1% prevalence, 99% sensitivity, 99% specificity, 10,000 people. Diseased = 100. True positives = 99. Healthy = 9,900. False positives = 99 (1% of 9,900). PPV = 99/(99+99) = 50%. Half of all positives are false. This is the base rate fallacy — the base rate (prevalence) matters enormously.
LR+ = sensitivity / (1−specificity). LR− = (1−sensitivity) / specificity. LR+ measures how much a positive result increases disease odds. LR+ > 10: strong evidence for disease. LR+ 2–5: weak to moderate. LR− < 0.1: strongly rules out disease. LR− 0.2–0.5: moderate. Unlike PPV/NPV, likelihood ratios are independent of prevalence, making them useful for comparing tests across different populations.
The base rate fallacy is the cognitive error of ignoring prevalence when interpreting test results. People intuitively think that a test with 95% accuracy means a positive result gives 95% probability of disease. Bayes theorem shows this is only true when prevalence is 50%. With 1% prevalence and 5% false positive rate, PPV is about 16%. Studies show most physicians, and even statisticians, make this error without Bayesian reasoning.
Bayesian updating revises probability as evidence accumulates. Start with prior P(A). After observing evidence B1, calculate posterior P(A|B1) using Bayes theorem. This posterior becomes the new prior. After observing B2, apply Bayes again to get P(A|B1 and B2). Example: initially P(disease) = 0.01. After positive test: PPV = 0.0876 (new prior). After second independent positive test: apply Bayes again with prior = 0.0876, same sensitivity and specificity.
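The sequential updating described above can be sketched in Python (a simple illustration assuming two independent positive tests with 95% sensitivity and 90% specificity):

```python
def update_after_positive(prior, sensitivity, specificity):
    """One Bayesian update on a positive result; the posterior is the next prior."""
    evidence = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / evidence

p = 0.01                                  # initial prior: 1% prevalence
p = update_after_positive(p, 0.95, 0.90)  # first positive test
print(f"after test 1: {p:.4f}")           # after test 1: 0.0876
p = update_after_positive(p, 0.95, 0.90)  # posterior becomes the new prior
print(f"after test 2: {p:.4f}")           # after test 2: 0.4769
```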
The prior P(A) is your initial probability estimate before observing evidence. In medical testing, use disease prevalence in the specific patient population being tested, not overall population prevalence. A 60-year-old smoker has higher lung cancer prevalence than the general population. Choosing the right prior (reference class problem) is one of the most important and difficult aspects of Bayesian inference. Using too generic a prior can make results misleading.
The confusion matrix shows all four outcomes for a given population: TP (true positives), FP (false positives), FN (false negatives), TN (true negatives). In 10,000 people: TP = prevalence × n × sensitivity. FN = prevalence × n × (1−sensitivity). FP = (1−prevalence) × n × (1−specificity). TN = (1−prevalence) × n × specificity. PPV = TP/(TP+FP). NPV = TN/(TN+FN). The matrix makes abstract probabilities concrete and visible.
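Those four formulas translate directly into code; the sketch below reproduces the 50%-PPV example from the base rate fallacy answer (function name is illustrative):

```python
def confusion_matrix(prevalence, sensitivity, specificity, n=10_000):
    """Expected TP, FP, FN, TN counts for n people screened."""
    diseased = prevalence * n
    healthy = (1 - prevalence) * n
    tp = diseased * sensitivity
    fn = diseased * (1 - sensitivity)
    fp = healthy * (1 - specificity)
    tn = healthy * specificity
    return tp, fp, fn, tn

# 1% prevalence, 99% sensitivity, 99% specificity, 10,000 screened
tp, fp, fn, tn = confusion_matrix(0.01, 0.99, 0.99)
print(f"TP={tp:.0f} FP={fp:.0f} FN={fn:.0f} TN={tn:.0f}")  # TP=99 FP=99 FN=1 TN=9801
print(f"PPV = {tp / (tp + fp):.0%}")                       # PPV = 50%
```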
A Naive Bayes spam filter works exactly like the medical test example. Prior P(spam) = base rate of spam. P(word|spam) = probability the word appears in spam emails = likelihood. P(word|not spam) = false positive rate. Posterior = P(spam|word) computed for each word. For multiple words, multiply likelihoods and apply Bayes iteratively (assumes word independence, hence “naive”). Despite the independence assumption being wrong, Naive Bayes works remarkably well and is the basis for many real-world spam filters.
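A toy version of that filter can be sketched as follows (the word probabilities are hypothetical, chosen only for illustration):

```python
def spam_posterior(prior, word_likelihoods):
    """Naive Bayes: multiply per-word likelihoods under the (naive)
    assumption that words occur independently given the class."""
    spam_score = prior        # P(spam) x product of P(word|spam)
    ham_score = 1 - prior     # P(not spam) x product of P(word|not spam)
    for p_word_spam, p_word_ham in word_likelihoods:
        spam_score *= p_word_spam
        ham_score *= p_word_ham
    return spam_score / (spam_score + ham_score)

# Two words seen in an email, with hypothetical per-class likelihoods
p = spam_posterior(prior=0.4, word_likelihoods=[(0.8, 0.05), (0.6, 0.2)])
print(f"P(spam | words) = {p:.3f}")  # P(spam | words) = 0.970
```

Production implementations sum log-probabilities instead of multiplying, to avoid floating-point underflow on long messages; the math is otherwise the same.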
Frequentist probability treats probability as the long-run frequency of outcomes in repeated experiments. Bayesian probability treats probability as a degree of belief that can be updated with evidence. Bayes theorem is the mathematical rule for Bayesian updating. Frequentist approaches avoid subjective priors and focus on p-values and confidence intervals. Bayesian approaches incorporate prior knowledge and produce posterior distributions. In practice, medical diagnosis, machine learning, and scientific inference increasingly use Bayesian methods because they naturally handle evidence updating.
NPV (Negative Predictive Value) = P(no disease | negative test). It answers: “If the test is negative, how confident are we there is no disease?” NPV is most useful in high-sensitivity tests used to rule out disease. A highly sensitive test with few false negatives gives very high NPV, meaning a negative result almost certainly rules out disease. The mnemonic: “Sensitive tests rule OUT disease when negative (SnNOut).” NPV also depends on prevalence: lower prevalence → higher NPV.