Every formula, worked example, and free calculator for descriptive statistics, probability, distributions, hypothesis testing, regression, confidence intervals, and 200+ more statistical calculations — all in one place. From mean and standard deviation to z-scores, p-values, ANOVA, and Bayesian inference.
Summarize and describe the key characteristics of a dataset — the foundation of all statistical analysis.
The three measures of central tendency each describe the "center" of a dataset differently. The mean is the arithmetic average — sum all values and divide by count. The median is the middle value when data is sorted, making it resistant to outliers. The mode is the most frequently occurring value, most useful for categorical or discrete data.
Mean (x̅) = (x1 + x2 + ... + xn) / n
Median = middle value if n is odd; average of two middle values if n is even
Mode = value(s) that appear most frequently in the dataset
Midrange = (Max + Min) / 2
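These four measures can be computed in a few lines with Python's standard library — a minimal sketch using a small hypothetical dataset:

```python
import statistics

# Hypothetical dataset used purely for illustration
data = [2, 3, 3, 5, 7, 8, 12]

mean = sum(data) / len(data)             # (2+3+3+5+7+8+12) / 7 ≈ 5.71
median = statistics.median(data)         # middle of the sorted values → 5
mode = statistics.mode(data)             # most frequent value → 3
midrange = (max(data) + min(data)) / 2   # (12 + 2) / 2 → 7.0
```

Note how the single large value 12 pulls the mean above the median — the outlier resistance described above in action.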
Variance and standard deviation quantify how spread out data is around the mean. Use sample formulas (divide by n-1) when your data is a subset of a larger population — this applies to most real-world scenarios. Use population formulas (divide by N) only when you have every member of the population.
Sample variance (s²) = ∑(xi - x̅)² / (n - 1)
Population variance (σ²) = ∑(xi - μ)² / N
Standard deviation = √variance
Relative Standard Deviation (RSD%) = (s / x̅) x 100
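The sample/population distinction maps directly onto Python's `statistics` module, sketched here with an assumed six-value sample:

```python
import math
import statistics

sample = [4, 8, 6, 5, 3, 7]  # hypothetical sample from a larger population

s2 = statistics.variance(sample)         # sample variance: divides by n-1
sigma2 = statistics.pvariance(sample)    # population variance: divides by N
s = statistics.stdev(sample)             # standard deviation = √variance
rsd = s / statistics.mean(sample) * 100  # relative standard deviation, in %
```

For the same data, `variance` is always larger than `pvariance` because it divides by the smaller denominator n-1.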
The five-number summary — minimum, Q1, median, Q3, maximum — gives a complete picture of a distribution's shape and spread. The interquartile range (IQR = Q3 - Q1) measures the spread of the middle 50% and is used to detect outliers via Tukey's fences.
IQR = Q3 - Q1
Lower fence = Q1 - 1.5 x IQR
Upper fence = Q3 + 1.5 x IQR
Outlier: any value below lower fence or above upper fence
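Tukey's fences translate directly to code. This sketch uses a made-up dataset with one planted outlier; note that quartile conventions differ between tools (`statistics.quantiles` defaults to the exclusive method), so Q1 and Q3 may differ slightly from hand-calculated values:

```python
import statistics

data = [1, 3, 4, 5, 5, 6, 7, 9, 25]  # 25 is a planted outlier

# statistics.quantiles with n=4 returns [Q1, median, Q3]
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                                 # 8.0 - 3.5 = 4.5
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]  # → [25]
```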
The coefficient of variation (CV) expresses standard deviation as a percentage of the mean, enabling comparisons of spread across datasets with different units or scales. Skewness measures asymmetry: a positive value indicates a longer right tail (the mean typically exceeds the median), while a negative value indicates a longer left tail.
CV = (standard deviation / mean) x 100%
Pearson's skewness = 3(mean - median) / standard deviation
Mean absolute deviation (MAD) = ∑|xi - x̅| / n
Sum of squares (SS) = ∑(xi - x̅)²
The range (max - min) is the simplest spread measure but highly sensitive to outliers. Mean squared error (MSE) measures prediction accuracy in models. Standard error of the mean (SEM) estimates how precisely your sample mean estimates the population mean.
| Measure | Formula | Use Case |
|---|---|---|
| Range | Max - Min | Quick spread estimate |
| IQR | Q3 - Q1 | Robust spread, outlier detection |
| Std Dev (s) | √(SS / (n - 1)) | Typical spread from mean |
| CV | (s / x̅) x 100% | Comparing different-scale datasets |
| MAD | ∑\|xi - x̅\| / n | Robust alternative to std dev |
| MSE | ∑(actual - predicted)² / n | Model accuracy |
| SEM | s / √n | Precision of sample mean |
Calculate probabilities for single events, combined events, conditional outcomes, and Bayesian updates.
Probability measures the likelihood of an event occurring, expressed as a value between 0 (impossible) and 1 (certain). The fundamental rules govern how probabilities combine across multiple events.
P(A) = favorable outcomes / total outcomes
P(A AND B) = P(A) x P(B) — if A and B are independent
P(A OR B) = P(A) + P(B) - P(A AND B) — addition rule
P(A | B) = P(A AND B) / P(B) — conditional probability
P(not A) = 1 - P(A) — complement rule
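The four rules above can be checked exactly with a classic two-dice example, using `fractions` to avoid floating-point noise:

```python
from fractions import Fraction

# Rolling two fair dice; A = "first die shows 6", B = "second die shows 6"
p_a = Fraction(1, 6)
p_b = Fraction(1, 6)

p_and = p_a * p_b            # multiplication rule (independent) → 1/36
p_or = p_a + p_b - p_and     # addition rule → 11/36
p_not_a = 1 - p_a            # complement rule → 5/6
p_a_given_b = p_and / p_b    # conditional → 1/6 (B tells us nothing about A)
```

That P(A | B) = P(A) here is exactly what independence means.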
Bayes' theorem updates a prior probability given new evidence. It is the mathematical foundation of Bayesian statistics and is used in medical testing, spam filtering, and machine learning. The key insight: the probability of a hypothesis after observing evidence depends on how likely the evidence was under each hypothesis.
P(A | B) = P(B | A) x P(A) / P(B)
Expanded: P(A | B) = P(B | A) x P(A) / [P(B|A)P(A) + P(B|not A)P(not A)]
Post-test probability = (Sensitivity x Prevalence) / P(Positive test)
In medical testing, the positive predictive value (PPV) — the probability a positive test is a true positive — depends heavily on disease prevalence. A test with 99% sensitivity and 99% specificity still has only a 50% PPV when disease prevalence is 1%. This is the false positive paradox.
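The paradox is easy to verify numerically. This sketch plugs the numbers from the paragraph above into the expanded form of Bayes' theorem:

```python
# Sensitivity 99%, specificity 99%, prevalence 1% — the paradox numbers
sens, spec, prev = 0.99, 0.99, 0.01

# Total probability of a positive test (true positives + false positives)
p_pos = sens * prev + (1 - spec) * (1 - prev)

# Bayes' theorem: PPV = P(disease | positive test)
ppv = sens * prev / p_pos   # → 0.5, despite the "99% accurate" test
```

In a population of 10,000 there are ~99 true positives and ~99 false positives, so a positive result is a coin flip.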
| Probability Concept | Formula | Application |
|---|---|---|
| Joint probability | P(A ∩ B) | Both events occur |
| Conditional | P(A \| B) = P(A∩B)/P(B) | A given B occurred |
| Implied probability | 1 / decimal odds | Betting markets |
| Expected value | ∑(P(xi) x xi) | Average outcome over many trials |
| Sensitivity | TP / (TP + FN) | True positive rate |
| Specificity | TN / (TN + FP) | True negative rate |
| PPV | TP / (TP + FP) | Precision of positive result |
Normal, binomial, Poisson, t, chi-square, exponential, and all major distributions — PDF, CDF, and inverse calculations.
The normal distribution is the most important distribution in statistics due to the Central Limit Theorem: sample means are normally distributed for large enough samples, regardless of the underlying population's shape. Defined by mean μ and standard deviation σ, the normal curve is symmetric and bell-shaped.
Z-score = (X - μ) / σ
Empirical rule: μ ± 1σ = 68.3% | μ ± 2σ = 95.4% | μ ± 3σ = 99.7%
Standard normal: N(0, 1) — mean=0, std dev=1
P(a < X < b) = Φ(b) - Φ(a) using z-table or CDF
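Normal probabilities need no z-table in code: the standard normal CDF Φ can be built from `math.erf`. A sketch on an assumed scale with μ=100, σ=15:

```python
import math

def phi(z):
    """Standard normal CDF, Φ(z), via the error function (no SciPy needed)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 100, 15            # hypothetical test-score scale
z = (130 - mu) / sigma         # z-score of 130 → 2.0
within_1sd = phi(1) - phi(-1)  # empirical rule: ≈ 0.683
p_above_130 = 1 - phi(z)       # upper-tail probability ≈ 0.0228
```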
Use binomial when counting successes in a fixed number of independent trials, each with the same probability of success. Examples: number of heads in 10 coin flips, number of defective items in a batch.
P(X = k) = C(n,k) x p^k x (1-p)^(n-k)
Mean = np | Variance = np(1-p)
where n = trials, k = successes, p = probability per trial
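The binomial PMF is a one-liner with `math.comb`, sketched here for the coin-flip example above:

```python
from math import comb

n, p, k = 10, 0.5, 4   # 10 fair coin flips; probability of exactly 4 heads

p_k = comb(n, k) * p**k * (1 - p)**(n - k)   # 210/1024 ≈ 0.2051
mean = n * p                                  # 5.0 expected heads
var = n * p * (1 - p)                         # 2.5
```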
Use Poisson when counting events occurring in a fixed interval of time or space, where events occur independently at a constant average rate. Examples: calls per hour, defects per square meter.
P(X = k) = (λ^k x e^(-λ)) / k!
Mean = λ | Variance = λ
where λ = average rate (events per interval)
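The Poisson PMF is equally direct. A sketch using the calls-per-hour example, with an assumed rate of λ = 3:

```python
import math

lam = 3.0  # assumed average of 3 calls per hour

def poisson_pmf(k, lam):
    return lam**k * math.exp(-lam) / math.factorial(k)

p_zero = poisson_pmf(0, lam)                              # a silent hour: e^-3 ≈ 0.0498
p_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))  # P(X ≤ 2) ≈ 0.423
```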
| Distribution | Use When | Key Parameter |
|---|---|---|
| Normal | Continuous, symmetric, bell-shaped data | μ, σ |
| Binomial | Fixed trials, success/failure outcomes | n, p |
| Poisson | Count of rare events in interval | λ |
| t-distribution | Small samples (<30), unknown σ | df |
| Chi-square | Goodness of fit, independence tests | df |
| Exponential | Time between events (Poisson process) | λ |
| Uniform | Equal probability over a range | a, b |
| Geometric | Trials until first success | p |
| Beta | Modeling probabilities (Bayesian) | α, β |
| Weibull | Reliability, failure time analysis | k, λ |
Z-tests, t-tests, chi-square tests, ANOVA, and non-parametric tests — with p-values, critical values, and power analysis.
Every hypothesis test follows the same five steps: (1) State null (H0) and alternative (H1) hypotheses. (2) Choose significance level α (typically 0.05). (3) Collect data and calculate the test statistic. (4) Find the p-value or compare to critical value. (5) Reject H0 if p < α or test statistic exceeds critical value.
Use a z-test when the population standard deviation is known or when sample size n > 30. Use a t-test when the population standard deviation is unknown and the sample is small. The t-distribution has heavier tails than the normal, accounting for additional uncertainty from estimating σ.
Z-statistic = (x̅ - μ0) / (σ / √n)
T-statistic = (x̅ - μ0) / (s / √n) with df = n - 1
Two-sample t: t = (x̅1 - x̅2) / √(s1²/n1 + s2²/n2)
Chi-square: χ² = ∑(Observed - Expected)² / Expected
F-statistic (ANOVA) = variance between groups / variance within groups
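A one-sample t-statistic can be computed directly from the formula above. The data here are hypothetical; compare the resulting t to a t-table at the chosen α:

```python
import math
import statistics

sample = [5.1, 4.9, 5.3, 5.6, 4.8, 5.2, 5.4]  # hypothetical measurements
mu0 = 5.0                                      # H0: population mean is 5.0

n = len(sample)
xbar = statistics.mean(sample)
s = statistics.stdev(sample)
t = (xbar - mu0) / (s / math.sqrt(n))  # one-sample t-statistic ≈ 1.76
df = n - 1                             # compare against t-table with 6 df
```

At α = 0.05 two-sided, the critical value for 6 df is about 2.447, so this t of roughly 1.76 would fail to reject H0.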
A Type I error (false positive, rate = α) occurs when you reject a true null hypothesis. A Type II error (false negative, rate = β) occurs when you fail to reject a false null. Statistical power = 1 - β is the probability of correctly detecting a real effect. Power of 0.80 (80%) is the conventional minimum standard for research.
| Test | Use When | Assumptions |
|---|---|---|
| One-sample z-test | Compare sample mean to known μ, large n | Known σ, normal data |
| One-sample t-test | Compare sample mean to known μ, small n | Unknown σ, approx. normal |
| Two-sample t-test | Compare two group means | Independent samples |
| Paired t-test | Before/after or matched pairs | Paired observations |
| Chi-square χ² | Test independence or goodness of fit | Categorical data, expected >5 |
| ANOVA (F-test) | Compare 3+ group means | Normal, equal variance |
| Mann-Whitney U | Compare two groups, non-normal | Non-parametric |
| Wilcoxon signed-rank | Paired non-normal data | Non-parametric |
Construct 90%, 95%, and 99% confidence intervals for means, proportions, and differences.
A 95% confidence interval does not mean "there is a 95% chance the true parameter is in this interval." The correct interpretation: if you repeated the study many times and constructed a CI each time, 95% of those intervals would contain the true population parameter. For any single interval, the true value is either in it or it is not.
CI for mean (large n): x̅ ± z* x (σ / √n)
CI for mean (small n): x̅ ± t* x (s / √n), df = n-1
CI for proportion: p̂ ± z* x √(p̂(1-p̂)/n)
Margin of error E = z* x (s / √n)
Critical values: 90% CI z*=1.645 | 95% CI z*=1.960 | 99% CI z*=2.576
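A sketch of the large-n interval on an assumed dataset — strictly, with n = 8 the t* critical value (2.365 at df = 7) is more appropriate than z*, but z* = 1.960 is used here to mirror the large-sample formula above:

```python
import math
import statistics

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 12.1]  # hypothetical data
z_star = 1.960  # 95% confidence

xbar = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error
margin = z_star * sem                                     # margin of error E
ci_95 = (xbar - margin, xbar + margin)
```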
Linear, polynomial, exponential, and logistic regression — plus Pearson, Spearman, and Kendall correlation coefficients.
Linear regression finds the best-fit line y = a + bx by minimizing the sum of squared residuals (ordinary least squares). The slope b tells you how much y changes per unit increase in x. R² (coefficient of determination) measures the proportion of variance in y explained by x.
Slope: b = [n∑(xy) - ∑x∑y] / [n∑x² - (∑x)²]
Intercept: a = y̅ - b x̅
Pearson r = ∑[(xi - x̅)(yi - y̅)] / [(n-1) sx sy]
R² = r² = 1 - (SS_residual / SS_total)
Residual = Observed y - Predicted y
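The summation formulas above implement ordinary least squares directly. A sketch on an assumed toy dataset that is nearly linear:

```python
# OLS by the summation formulas, on an assumed toy dataset
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)

b = (n * sxy - sx * sy) / (n * sxx - sx**2)  # slope ≈ 1.99
a = sy / n - b * sx / n                      # intercept: y̅ - b·x̅ ≈ 0.09

ss_tot = sum((y - sy / n) ** 2 for y in ys)
ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - ss_res / ss_tot              # close to 1: x explains y well
```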
| \|r\| Value | Strength | Notes |
|---|---|---|
| 0.0 – 0.1 | Negligible | No meaningful linear relationship |
| 0.1 – 0.3 | Weak | Small effect; may be meaningful in large samples |
| 0.3 – 0.5 | Moderate | Noticeable relationship |
| 0.5 – 0.7 | Strong | Clear predictive relationship |
| 0.7 – 0.9 | Very strong | High predictability |
| 0.9 – 1.0 | Near perfect | Almost deterministic relationship |
Permutations, combinations, factorials, and counting principles — with and without repetition.
The critical distinction: permutations count arrangements where order matters (e.g., ranking 3 people from 10). Combinations count selections where order does not matter (e.g., choosing 3 people from 10 for a committee). The number of combinations is always less than or equal to the number of permutations for the same r and n.
Permutation P(n,r) = n! / (n - r)!
Combination C(n,r) = n! / [r! x (n - r)!]
Permutation with repetition = n^r
Combination with repetition = C(n+r-1, r)
Password combinations (no repeat): P(n, r) where n = character set, r = length
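All four counting formulas are built into `math` (Python 3.8+), shown here for the n = 10, r = 3 example from the text:

```python
from math import comb, perm

n, r = 10, 3   # ranking or choosing 3 people out of 10

ordered = perm(n, r)           # permutations: 10·9·8 = 720
unordered = comb(n, r)         # combinations: 720 / 3! = 120
with_repetition = n ** r       # ordered, repeats allowed: 1000
multiset = comb(n + r - 1, r)  # unordered, repeats allowed: C(12,3) = 220
```

As the text notes, `comb(n, r)` can never exceed `perm(n, r)` — every combination corresponds to r! permutations.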
Determine the right sample size, calculate standard error, and analyze sampling distributions.
Sample size depends on three factors: the desired margin of error E, the confidence level (which determines z*), and the population standard deviation σ. A smaller margin of error requires a much larger sample — halving the margin of error quadruples the required n.
For a mean: n = (z* x σ / E)²
For a proportion: n = z*² x p(1-p) / E²
Worst-case proportion (p=0.5): n = z*² x 0.25 / E²
At 95% confidence, 5% margin of error: n = 1.96² x 0.25 / 0.05² = 385
Finite population correction: n_adj = n / (1 + (n-1)/N)
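The worked example above can be reproduced in a few lines; the population size N = 2000 in the correction step is an assumed value for illustration:

```python
import math

z_star, e = 1.96, 0.05   # 95% confidence, ±5% margin of error
p = 0.5                  # worst-case proportion

n_raw = z_star**2 * p * (1 - p) / e**2  # 384.16, matching the example
n = math.ceil(n_raw)                    # always round up → 385

# Finite population correction for an assumed population of N = 2000
N = 2000
n_adj = math.ceil(n / (1 + (n - 1) / N))  # → 323
```

Rounding is always upward: truncating would leave the margin of error slightly wider than requested.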
Power analysis determines the sample size needed to detect an effect of a given size. Cohen's d is the standardized effect size for t-tests: d = (μ1 - μ2) / σ. Small effect d=0.2, medium d=0.5, large d=0.8. With power = 0.80, α = 0.05, and medium effect (d=0.5), a two-sample t-test requires approximately 64 participants per group.
Frequency distributions, histograms, box plots, stem-and-leaf plots, and percentile tools for exploring data.
Convert between decimal, fractional, and moneyline odds — plus dice, coin flip, and roulette probability tools.
Betting odds are expressed in three formats. Decimal odds (2.50) are the payout multiplier including your stake. Fractional odds (3/2) show profit relative to stake. Moneyline (American) odds (+150 or -200) show how much you win from $100 or must bet to win $100. All three encode the same information — implied probability.
Decimal odds: Implied probability = 1 / decimal odds
Fractional odds (a/b): Implied probability = b / (a + b)
Moneyline (+): Implied probability = 100 / (odds + 100)
Moneyline (-): Implied probability = |odds| / (|odds| + 100)
Vig (overround) = sum of all implied probabilities - 1
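The three conversion formulas can be checked against each other in code — decimal 2.50, fractional 3/2, and moneyline +150 should all yield the same 40% implied probability:

```python
def implied_from_decimal(odds):
    return 1 / odds

def implied_from_fractional(a, b):
    return b / (a + b)

def implied_from_moneyline(ml):
    return 100 / (ml + 100) if ml > 0 else abs(ml) / (abs(ml) + 100)

p1 = implied_from_decimal(2.50)     # 0.40
p2 = implied_from_fractional(3, 2)  # 0.40
p3 = implied_from_moneyline(150)    # 0.40

# A two-sided market priced -110 / -110 carries a vig of about 4.8%
vig = 2 * implied_from_moneyline(-110) - 1
```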
Non-parametric tests, Bayesian methods, A/B testing, epidemiology statistics, and process capability tools.
Sort and rank numbers, decimals, and datasets — percentile ranks, ascending/descending order tools.
All formulas, rules, and reference values on this guide are sourced from authoritative statistical references: