Run a chi-square goodness of fit test or a test of independence on a contingency table. Enter your observed frequencies and get the chi-squared (χ²) statistic, degrees of freedom, p-value, Cramér’s V effect size, per-cell contributions, and a plain-English verdict — with full step-by-step working.
📌 Goodness of Fit: Test whether your observed counts match expected proportions for one categorical variable. Formula: χ² = ∑ (O − E)² / E df = k − 1
Category Name
Observed
Expected %
Expected % must sum to 100. Leave blank for equal distribution.
📌 Test of Independence: Test whether two categorical variables are associated. Enter observed counts in the contingency table. Formula: χ² = ∑ (O − E)² / E df = (rows − 1) × (cols − 1)
Chi-Square Statistic
—
📊 Per-Category / Per-Cell Contributions
📐 Step-by-Step Working
⚠️ Disclaimer: Chi-square tests require independent observations, adequate sample size (expected freq ≥ 5 per cell), and categorical data. Results are for educational and research support. Verify critical decisions with a qualified statistician.
Sources & Methodology
✓ Chi-square formulas verified against NIST/SEMATECH and OpenStax Statistics. P-values computed via chi-square distribution CDF using the regularized incomplete gamma function. Cramér’s V uses the standard formula with minimum(r−1, c−1) denominator.
Authoritative source for chi-square statistic formula, degrees of freedom, and p-value calculation method. NIST is a US federal government measurement standards body.
Open-access Rice University textbook. Source for test of independence, expected frequency formula E = (row total × column total) / grand total, and df = (r−1)(c−1).
Chi-Square Test — Goodness of Fit, Independence, Cramér’s V & Full Guide
The chi-square test (χ²) is a non-parametric statistical test for categorical data that compares observed frequencies to expected frequencies. It is one of the most widely used tests in statistics, covering two fundamentally different questions: whether data fits a theoretical distribution (goodness of fit) and whether two categorical variables are related (test of independence).
What is the Chi-Square Formula and How Does It Work?
The chi-square statistic measures the overall discrepancy between observed and expected counts. Large χ² means the observed data differs substantially from what we expected under the null hypothesis.
χ² = ∑ (Oᵢ − Eᵢ)² / Eᵢ
Goodness of Fit worked example (fair die test):
Roll a die 100 times. Expected: 100/6 = 16.67 per face. Observed: [18, 12, 22, 15, 20, 13]
χ² = (18−16.67)²/16.67 + (12−16.67)²/16.67 + (22−16.67)²/16.67 + ...
= 0.107 + 1.307 + 1.707 + 0.167 + 0.667 + 0.807 = 4.76
df = 6−1 = 5 | p ≈ 0.446 | p > 0.05 → Fail to reject H0: die appears fair
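The working above fits in a few lines of code. Below is a minimal stdlib-only Python sketch (not this calculator’s actual implementation): the p-value is obtained from the power series for the regularized lower incomplete gamma function, the same method the methodology notes describe.

```python
import math

def chi2_sf(x, df):
    """Survival function (p-value) of the chi-square distribution:
    1 - P(df/2, x/2), where P is the regularized lower incomplete
    gamma function, evaluated here by its power series."""
    a, t = df / 2.0, x / 2.0
    if t == 0:
        return 1.0
    term = t ** a * math.exp(-t) / math.gamma(a + 1)
    total, n = term, 0
    while term > 1e-15 * total:
        n += 1
        term *= t / (a + n)   # next series term: multiply by t/(a+n)
        total += term
    return 1.0 - total

# Fair-die example: 100 rolls, equal expected counts per face
observed = [18, 12, 22, 15, 20, 13]
expected = [100 / 6] * 6
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p = chi2_sf(chi2, df)
print(round(chi2, 2), df, round(p, 3))  # χ² ≈ 4.76, df = 5, p ≈ 0.446
```

The series converges quickly for the degrees of freedom typical of contingency tables; production code would fall back to a continued-fraction expansion for large χ²/df ratios.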
Chi-Square Test of Independence — Contingency Tables Explained
The independence test uses a contingency table (cross-tabulation) to test whether two categorical variables are related. The expected frequency for each cell is: E = (row total × column total) / grand total.
Independence test example (gender vs. product preference):
n = 150 people. Observed: Male(A=30, B=20, C=10) | Female(A=20, B=40, C=30)
Row totals: Males=60, Females=90. Col totals: A=50, B=60, C=40. Grand N=150
E(Male,A) = (60×50)/150 = 20. E(Male,B) = (60×60)/150 = 24. E(Male,C) = (60×40)/150 = 16
χ² = (30−20)²/20 + (20−24)²/24 + (10−16)²/16 + (20−30)²/30 + (40−36)²/36 + (30−24)²/24 = 5.00 + 0.67 + 2.25 + 3.33 + 0.44 + 1.50 = 13.19
df = (2−1)×(3−1) = 2 | p ≈ 0.0014 | p < 0.05 → Gender and product preference are associated
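The same computation can be sketched in a few lines of Python, assuming the observed counts above; the expected table comes straight from E = (row total × column total) / grand total.

```python
# Observed counts for the worked example (rows: Male, Female; cols: A, B, C)
observed = [[30, 20, 10],
            [20, 40, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency per cell: E = (row total × column total) / grand total
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi2, 2), df)  # χ² ≈ 13.19, df = 2
```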
Cramér’s V — Effect Size for Chi-Square Independence Tests
Statistical significance tells you IF an association exists. Cramér’s V tells you HOW STRONG it is. Unlike χ², which grows with sample size for a given strength of association, Cramér’s V is normalized by N and always ranges from 0 to 1.
Cramér’s V = √( χ² / (N × min(r−1, c−1)) )
Convention for interpretation (Cohen): V = 0.10 small | V = 0.30 medium | V = 0.50 large
Cramér’s V = 0 means no association. V = 1 means perfect association.
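The formula translates directly to code. A one-function sketch, plugged with the χ² and table dimensions from the worked example above:

```python
import math

def cramers_v(chi2, n, rows, cols):
    """Cramér's V = sqrt(chi2 / (n * min(r-1, c-1)))."""
    return math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))

# Gender × preference example: χ² ≈ 13.19, N = 150, 2×3 table
v = cramers_v(13.1944, 150, 2, 3)
print(round(v, 3))  # ≈ 0.297 → just under Cohen's "medium" benchmark
```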
Chi-Square Reference Table: Critical Values
df | α=0.10 | α=0.05 | α=0.01 | Common Use Case
1 | 2.706 | 3.841 | 6.635 | 2-category GOF, 2×2 table
2 | 4.605 | 5.991 | 9.210 | 3-category GOF, 2×3 table
3 | 6.251 | 7.815 | 11.345 | 4-category GOF, 2×4 table
4 | 7.779 | 9.488 | 13.277 | 5-category GOF, 2×5 / 3×3 table
5 | 9.236 | 11.070 | 15.086 | 6-category GOF, fair die test
6 | 10.645 | 12.592 | 16.812 | 3×4 contingency table
9 | 14.684 | 16.919 | 21.666 | 4×4 contingency table
12 | 18.549 | 21.026 | 26.217 | 4×5 contingency table
💡 Expected frequency rule: All expected cell frequencies must be at least 5 for the chi-square approximation to be valid. If any cell has expected frequency < 5: combine categories to increase counts, use Fisher’s exact test for 2×2 tables with small n, or collect more data. A warning appears in this calculator when this assumption is violated.
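The rule is easy to check programmatically. A small sketch (the function name and threshold parameter are illustrative, not this calculator’s internals):

```python
def check_expected_frequencies(expected, threshold=5.0):
    """Return (row, col, value) for every cell whose expected
    frequency falls below the threshold; empty list means the
    chi-square approximation's sample-size assumption holds."""
    return [(i, j, e)
            for i, row in enumerate(expected)
            for j, e in enumerate(row)
            if e < threshold]

# Example: one low-count cell triggers a warning
expected = [[20.0, 24.0, 16.0],
            [30.0, 36.0, 4.2]]
print(check_expected_frequencies(expected))  # [(1, 2, 4.2)]
```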
Frequently Asked Questions
The chi-square test is a hypothesis test for categorical data comparing observed to expected frequencies. Two types: (1) Goodness of fit — one categorical variable tested against a theoretical distribution (is this die fair?). (2) Test of independence — two categorical variables in a cross-table to test if they’re related (is gender related to job preference?). Requirements: count data, independent observations, expected freq ≥ 5 per cell.
χ² = ∑ (O − E)² / E. Sum across all categories (GOF) or all cells (independence). O = observed count. E = expected count. For GOF: E = total_n × expected_proportion. For independence: E = (row total × col total) / grand total. The statistic measures total discrepancy between observed and expected. Larger χ² = bigger discrepancy = smaller p-value.
Goodness of fit: df = k − 1 (k = number of categories). Example: 4 categories → df = 3. Test of independence: df = (rows − 1) × (columns − 1). Example: 3×4 table → df = 2×3 = 6. Degrees of freedom determine which chi-square distribution to use for the p-value. More df = heavier-tailed distribution = need larger χ² for significance.
Cramér’s V = √(χ² / (N × min(r−1, c−1))). Ranges 0 to 1. V near 0 = no association. V = 1 = perfect association. Cohen conventions: <0.10 negligible, 0.10–0.30 small, 0.30–0.50 medium, ≥0.50 large. Unlike χ², V doesn’t increase with sample size, making it ideal for comparing association strength across different studies or sample sizes.
All expected cell frequencies must be ≥ 5 for the chi-square approximation to be valid. If expected freq < 5: (1) combine adjacent categories to increase counts; (2) use Fisher’s exact test for 2×2 tables; (3) collect more data. The chi-square distribution approximation breaks down with very small expected frequencies, leading to inflated false positive rates. This calculator warns you when this rule is violated.
Goodness of fit: one categorical variable vs. theoretical distribution. You specify expected proportions. Examples: fair coin, Mendelian ratios, uniform distribution. Test of independence: two categorical variables in a cross-table. Expected frequencies calculated from the data itself (row × col / n). Examples: gender vs. preference, treatment vs. outcome. Different df formulas: GOF uses k−1; independence uses (r−1)(c−1).
GOF null: the observed distribution matches the expected distribution (the data fits the theoretical model). Alternative: it doesn’t fit. Independence null: the two categorical variables are independent — knowing one tells you nothing about the other. Alternative: the variables are associated. If p < α you reject the null hypothesis. Chi-square tests are always two-sided — you test for any deviation from the null, not a specific direction.
There are four main assumptions: (1) Independence — each observation belongs to exactly one cell; no repeated measures. If violated, use McNemar’s test. (2) Expected frequency ≥ 5 in all cells — combine categories if needed. (3) Categorical data — counts, not means or percentages. (4) Random sampling. Also: chi-square should not be applied to proportions directly — convert to counts first.
Per-cell contribution = (O − E)² / E. It shows which specific categories or cells are driving a significant result. Cells with large contributions (> 2) are the main sources of deviation from the null hypothesis. Standardized residual = (O − E) / √E. Values beyond ±2 indicate cells that are significantly higher or lower than expected. This calculator shows both, helping you interpret not just WHETHER the test is significant but WHERE the differences are.
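Both diagnostics are one-liners per cell. A short sketch, using the observed and expected counts from the gender × preference example earlier on this page:

```python
import math

# Per-cell diagnostics: contribution = (O − E)² / E,
# standardized residual = (O − E) / √E
observed = [30, 20, 10, 20, 40, 30]   # Male A,B,C then Female A,B,C
expected = [20, 24, 16, 30, 36, 24]   # from E = row total × col total / N

for o, e in zip(observed, expected):
    contribution = (o - e) ** 2 / e
    residual = (o - e) / math.sqrt(e)
    flag = " <- beyond ±2" if abs(residual) > 2 else ""
    print(f"O={o:2d}  E={e:2d}  contribution={contribution:5.2f}  "
          f"residual={residual:+5.2f}{flag}")
```

In this example only the Male/A cell has a residual beyond ±2 (+2.24), identifying it as the main driver of the significant result.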
Classic examples: Fair die test — roll a die 120 times, expect 20 per face, compare to observed. Genetics — Mendelian inheritance predicts 3:1 phenotype ratio; observed progeny tested against this. Customer distribution — test if customers are equally distributed across weekdays. Marketing — test if survey responses match hypothesized demographic breakdown. Any situation comparing one set of observed counts to hypothesized proportions.
No. Chi-square tests only detect association or difference, not causation. A significant test of independence means the two variables are related — not that one causes the other. Confounding variables may explain the association. To establish causation you need experimental design with random assignment. Chi-square is also sensitive to sample size: with very large n, even trivial associations become statistically significant. Always report effect size (Cramér’s V) alongside p-values.
Medicine: testing if treatment group (drug/placebo) is independent of outcome (improved/not improved). Testing if smoking status is associated with lung disease diagnosis. Clinical trials: comparing observed adverse event rates to expected rates. Biology: Mendelian genetics testing observed vs expected offspring ratios. Hardy-Weinberg equilibrium testing. Ecology: testing if species distribution across habitats differs from expected. Any study with two categorical variables measured on independent subjects.
Use Fisher’s exact test when you have a 2×2 contingency table AND any expected cell frequency is < 5, or when your total sample size n < 20. Fisher’s exact test computes the exact probability directly without relying on the chi-square approximation, so it is valid for small samples. For larger tables with small expected frequencies, consider combining categories or collecting more data before running chi-square.
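For reference, the 2×2 exact test is small enough to sketch with the standard library alone. This hypothetical fisher_exact_2x2 helper enumerates every table with the observed margins and sums the probabilities of those no more likely than the observed one — the usual two-sided convention, though statistical packages may handle ties slightly differently.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2×2 table [[a, b], [c, d]].
    Each table with the same margins has hypergeometric probability
    C(r1, k) * C(r2, c1 - k) / C(n, c1); sum those <= the observed one."""
    r1, r2 = a + b, c + d
    c1, n = a + c, a + b + c + d
    denom = comb(n, c1)
    p_obs = comb(r1, a) * comb(r2, c1 - a) / denom
    p = 0.0
    for k in range(max(0, c1 - r2), min(r1, c1) + 1):
        p_k = comb(r1, k) * comb(r2, c1 - k) / denom
        if p_k <= p_obs * (1 + 1e-9):  # small tolerance for float ties
            p += p_k
    return p

# Classic "lady tasting tea" table: [[3, 1], [1, 3]]
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))  # 0.4857
```

Enumeration is exact but exponential in table size, which is why Fisher’s test is usually reserved for 2×2 tables; larger sparse tables call for category merging or more data, as noted above.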