Paste your dataset to test if leading digit frequencies conform to Benford’s Law — the logarithmic first-digit distribution used in fraud detection and data auditing. Get observed vs expected frequencies for all 9 digits, a chi-square goodness-of-fit test, p-value, and a plain-English verdict on whether your data looks natural or manipulated.
Minimum 100 numbers recommended. Decimals OK. Negative signs ignored (leading digit of absolute value used). At least 20 numbers are required to run the test.
📊 Digit-by-Digit Frequency Breakdown
Digit | Observed Count | Observed % | Expected % (Benford) | χ² Contribution | Deviation
⚠️ Disclaimer: Non-conformance with Benford’s Law does not prove fraud — it flags data for further investigation. Conformance does not rule out fraud. Results depend on data type and size. Always consult a forensic accountant for legal purposes. Minimum 100 numbers recommended; fewer may produce unreliable results.
Authoritative mathematical reference for the Benford’s Law formula P(d) = log₁₀(1+1/d), its logarithmic derivation, scale invariance properties, and applications in fraud detection.
Peer-reviewed research confirming Benford’s Law’s broad applicability across multiple languages and data types. Supports the chi-square goodness-of-fit test methodology used in this calculator.
Formula used (Benford 1938, verified NIST): P(d) = log₁₀(1 + 1/d) for d = 1 to 9. Expected frequencies: P(1)=30.103%, P(2)=17.609%, P(3)=12.494%, P(4)=9.691%, P(5)=7.918%, P(6)=6.695%, P(7)=5.799%, P(8)=5.115%, P(9)=4.576%. Statistical test: chi-square goodness-of-fit with df=8. Critical values: α=0.10: 13.362, α=0.05: 15.507, α=0.01: 20.090. Leading digit extracted from absolute value of each number; zeros excluded.
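The digit-extraction rule described above (absolute value, zeros excluded) can be sketched as follows. This is a minimal illustration, not the calculator's actual code; the function name `leading_digit` and the choice to parse through `Decimal` are assumptions made for the example.

```python
from decimal import Decimal, InvalidOperation


def leading_digit(value):
    """Return the first non-zero digit of |value|, or None if there is none.

    Hypothetical helper: zeros and unparseable entries are skipped (None),
    mirroring the "zeros excluded" rule stated in the text.
    """
    try:
        number = Decimal(str(value).strip())
    except InvalidOperation:
        return None
    if not number.is_finite() or number == 0:
        return None
    # The coefficient digits of a non-zero Decimal carry no leading zeros,
    # and the sign is stored separately, so the first entry is the leading digit.
    return number.as_tuple().digits[0]


sample = ["1250", "-0.00345", "0", "87.5", "not a number"]
digits = [d for d in map(leading_digit, sample) if d is not None]
print(digits)
```

Parsing through `Decimal` rather than `float` keeps entries such as 0.00345 exact, so the first significant digit is read directly instead of being recovered by repeated multiplication.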
Benford’s Law — First Digit Distribution, Fraud Detection & Complete Guide
Benford’s Law, also known as the First Digit Law or the Law of Anomalous Numbers, is one of the most counterintuitive discoveries in statistics. In many naturally occurring numerical datasets, the leading digit is not uniformly distributed. Digit 1 appears as the leading digit about 30% of the time, roughly six and a half times as often as digit 9, which appears less than 5% of the time. This predictable logarithmic distribution underpins the digit-analysis tests used in forensic accounting and fraud detection.
The Benford’s Law Formula and Expected Digit Frequencies
The formula was first published by Simon Newcomb in 1881 and independently rediscovered in 1938 by physicist Frank Benford, who tested it on about 20,000 data points from 20 different sources. The probability that a number begins with digit d is:
P(d) = log₁₀(1 + 1/d), for d = 1 to 9
Why does this work? On a logarithmic scale, the interval [1, 2) covers log₁₀(2)−log₁₀(1) = 0.301 = 30.1% of any decade. The interval [9, 10) covers only log₁₀(10)−log₁₀(9) = 0.046 = 4.6%. Data spanning many orders of magnitude samples this logarithmic scale, producing Benford’s distribution. Source: Benford (1938), Wolfram MathWorld
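As a quick check of the interval-width argument, the expected frequencies can be computed directly from P(d) = log₁₀(1 + 1/d). The sketch below is illustrative only; the names `benford_probability` and `expected` are chosen for this example.

```python
import math


def benford_probability(d):
    """Width of the interval [d, d+1) on a log10 scale, i.e. log10(1 + 1/d)."""
    return math.log10(1 + 1 / d)


expected = {d: benford_probability(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"digit {d}: {p:.3%}")

# The nine interval widths tile one full decade, so they sum to 1.
print(f"total: {sum(expected.values()):.6f}")
```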
Chi-Square Test for Benford Conformance: How Fraud Detection Works
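The statistical test for Benford conformance is the chi-square goodness-of-fit test with 8 degrees of freedom (9 digits minus 1 for the constraint that all frequencies sum to 100%). For a dataset of n numbers:
χ² = Σ (O_d − E_d)² / E_d, summed over d = 1 to 9
where O_d is the observed count of numbers with leading digit d and E_d = n × P(d) is the expected count under Benford’s Law.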
Digit | Benford Expected % | Reason (width of log-scale interval)
1 | 30.103% | Widest logarithmic interval
2 | 17.609% | Second widest log interval
3 | 12.494% | Decreasing log intervals
4 | 9.691% | Decreasing log intervals
5 | 7.918% | Decreasing log intervals
6 | 6.695% | Decreasing log intervals
7 | 5.799% | Decreasing log intervals
8 | 5.115% | Decreasing log intervals
9 | 4.576% | Narrowest log interval
χ² critical value (α=0.05, df=8): 15.507. A statistic above this value indicates non-conformance.
Where Benford’s Law Applies — and Where It Doesn’t
Benford’s Law applies most reliably to data spanning several orders of magnitude from natural, multiplicative processes. It has been validated on: financial transactions, stock prices, river lengths, population figures, physical constants, death rates, electricity bills, census data, and scientific measurements. It does NOT reliably apply to restricted-range data (ages, heights, temperatures), assigned sequential numbers (ZIP codes, phone numbers), small samples under 100 numbers, or datasets deliberately constructed to follow a specific distribution.
Benford’s Law in Forensic Accounting & Court Evidence
Since Mark Nigrini demonstrated its application to tax fraud detection in 1996, Benford’s Law has become a standard tool in forensic accounting. It is used by the IRS to screen tax returns, by the SEC to detect financial statement fraud, by government auditors to find procurement irregularities, and by insurance companies to identify fraudulent claims. Evidence based on Benford’s Law analysis has been admitted in US federal, state, and local courts. The underlying logic is powerful: people who fabricate numbers tend to distribute digits too uniformly, not realizing that real-world numbers follow a logarithmic pattern.
💡 Fraud Detection in Practice: Auditors compare leading digit distributions of financial records against Benford’s expected distribution. Fraudsters who invent numbers tend to make them “look random”, distributing digits 1–9 more uniformly than Benford predicts. This creates too many numbers starting with 5, 6, 7, 8, or 9. A significant chi-square test result flags the data for deeper investigation. Evidence based on Benford’s Law has been admitted in US federal, state, and local courts.
Frequently Asked Questions
Benford’s Law states that in many naturally occurring datasets, the leading digit is not uniformly distributed. Digit 1 appears about 30% of the time, digit 2 about 17.6%, down to digit 9 at just 4.6%. The formula is P(d) = log₁₀(1+1/d). It arises because naturally occurring numbers span many orders of magnitude, and on a logarithmic scale, smaller digits occupy more space. It is used in forensic accounting, tax auditing, and election integrity analysis.
P(d) = log₁₀(1 + 1/d) for d = 1 to 9. Results: P(1)=30.1%, P(2)=17.6%, P(3)=12.5%, P(4)=9.7%, P(5)=7.9%, P(6)=6.7%, P(7)=5.8%, P(8)=5.1%, P(9)=4.6%. These sum to 100%. The formula follows from the logarithmic scale: digit 1 covers log₁₀(2)−log₁₀(1) = 30.1% of any decade, while digit 9 covers only 4.6%.
Forensic accountants compare the observed leading digit distribution of financial data against Benford’s expected distribution using a chi-square test. People who fabricate numbers tend to distribute digits more uniformly, creating too many 5s, 6s, 7s, 8s, and 9s. A significant deviation (p-value < 0.05) is a red flag for manipulation. Benford analysis is accepted as evidence in US federal courts and is widely used by the IRS, SEC, and international audit firms.
Best applications: financial transactions, invoice amounts, stock prices, population figures, river lengths, physical constants, census data, and tax return amounts. Does NOT apply to: restricted ranges (ages 0–120), assigned sequential numbers (ZIP codes, phone numbers, SSNs), very small datasets (under 100), or data from a single order of magnitude. The key requirement is data spanning multiple orders of magnitude.
Use the chi-square goodness-of-fit test with 8 degrees of freedom. For each digit d: expected count = n × P(d). Chi-square contribution = (observed − expected)² / expected. Sum all 9 contributions. At α=0.05, the critical value with df=8 is 15.507. If the computed chi-square exceeds this, the data deviates significantly from Benford. The p-value is the probability of observing a deviation at least this large by chance if the data were truly Benford-distributed.
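The procedure above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the calculator's implementation: the function names are invented for the example, the input is assumed to be a list of leading digits already extracted from the data, and the p-value uses the closed-form upper tail that exists for a chi-square distribution with an even number of degrees of freedom.

```python
import math
from collections import Counter

# Benford expected proportions for leading digits 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def chi_square_sf_df8(x):
    """Upper-tail probability P(X >= x) for a chi-square variable with 8 df.

    For even degrees of freedom 2k the tail equals
    exp(-x/2) * sum_{i<k} (x/2)^i / i!; here k = 4.
    """
    h = x / 2
    return math.exp(-h) * (1 + h + h ** 2 / 2 + h ** 3 / 6)


def benford_chi_square(leading_digits):
    """Return (chi-square statistic, p-value) for a list of leading digits 1..9."""
    n = len(leading_digits)
    if n == 0:
        raise ValueError("need at least one leading digit")
    counts = Counter(leading_digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * BENFORD[d - 1]
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat, chi_square_sf_df8(stat)


# Illustrative input: 450 leading digits spread evenly across 1..9.
uniform_digits = list(range(1, 10)) * 50
stat, p_value = benford_chi_square(uniform_digits)
print(f"chi-square = {stat:.3f}, p = {p_value:.4g}, exceeds 15.507: {stat > 15.507}")
```

Evenly spread digits are used as the example because they are the pattern the text associates with invented figures; a Benford-like sample would give a statistic well below the critical value.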
Non-conformance does not prove fraud. It flags data for investigation and nothing more. Legitimate reasons for non-conformance include a restricted data range, a too-small dataset, data from a single category, and naturally non-Benford processes. Conformance also cannot rule out fraud, because sophisticated manipulators can preserve Benford distributions. Benford analysis is a screening tool that identifies anomalies requiring deeper forensic examination, not a standalone proof.
Minimum 100 numbers; 1,000+ for reliable results. Below about 110 numbers, the expected count for digit 9 (4.6% of the sample) falls below 5, which breaks the usual chi-square rule of thumb that every expected count be at least 5. Small samples have too much random variation to reliably distinguish natural from artificial distributions. Professional forensic accounting typically uses thousands of transactions.
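The sample size at which digit 9 first reaches an expected count of 5 follows directly from P(9). A minimal worked check, with variable names chosen for this example:

```python
import math

p9 = math.log10(1 + 1 / 9)      # Benford probability of leading digit 9
min_n = math.ceil(5 / p9)       # smallest n with n * p9 >= 5

print(f"P(9) = {p9:.5f}")
print(f"smallest n with expected count >= 5: {min_n}")
print(f"expected count of 9s at n = 100: {100 * p9:.3f}")
```

The threshold lands at 110, slightly above the recommended minimum of 100, which is one reason results near that minimum should be read with caution.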
With df=8: at α=0.10 critical value is 13.362; at α=0.05 it is 15.507; at α=0.01 it is 20.090. Choose your significance level based on the stakes: use α=0.05 for standard analysis, α=0.01 for high-stakes audit work where you want strong evidence before flagging data. A computed chi-square above the critical value indicates statistically significant non-conformance.
Simon Newcomb noticed in 1881 that the earlier pages of logarithm books, which cover numbers beginning with low digits, were more worn, suggesting that such numbers were looked up more often. Frank Benford independently rediscovered and tested the phenomenon in 1938 across 20,000+ data points from 20 different sources including river areas, atomic weights, newspaper figures, and street addresses. The law is named after Benford. In 1996, Mark Nigrini demonstrated its application to tax evasion detection.
Benford’s Law extends to the second digit. The second-digit distribution also follows a (less extreme) logarithmic pattern, and digits 0–9 each have a specific expected frequency. The second-digit Benford test (2BL-test) is used in election forensics. For large datasets (10,000+ numbers), second-digit analysis adds statistical power. However, the first-digit test remains the most commonly used because it produces the strongest signal and requires the least data.
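The expected second-digit frequencies are not listed in the text. They follow from the standard generalisation of the first-digit formula: sum log₁₀(1 + 1/(10k + d)) over every possible first digit k. The sketch below assumes that generalisation, and the function name is chosen for this example.

```python
import math


def second_digit_probability(d):
    """P(second digit = d): sum over first digits k of log10(1 + 1/(10k + d))."""
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


second = {d: second_digit_probability(d) for d in range(10)}

for d, p in second.items():
    print(f"second digit {d}: {p:.2%}")
```

The spread between digit 0 and digit 9 is only a few percentage points, far flatter than the first-digit distribution, which is consistent with the remark that second-digit tests need larger datasets.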
Forensic accounting: detecting embezzlement in corporate accounts. IRS: identifying suspicious tax returns. SEC: spotting manipulated financial reports. Election auditing: testing vote count integrity. Insurance: detecting fraudulent claims. Government procurement: finding kickback schemes. Anti-money laundering: flagging unusual transaction patterns. Scientific integrity: checking for data fabrication in research. Benford analysis has caught fraudsters in healthcare billing, vendor payments, and expense reports worldwide.
The deep mathematical reason: Benford’s distribution is the only leading-digit distribution that is scale-invariant (does not change when all values are multiplied by a constant) and base-invariant (holds in any number base). This makes it a natural attractor for data from multiplicative random processes. T.P. Hill (1998) proved rigorously that when distributions are chosen at random and random samples are drawn from each, the first digits of the combined sample converge to Benford’s distribution, which explains why it appears across so many diverse datasets.
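Scale invariance can be illustrated with a small deterministic experiment. The sketch below uses the first 1,000 powers of 2, a classic multiplicative sequence chosen for this example rather than taken from the text, and compares first-digit shares before and after multiplying every value by an arbitrary constant (7 here).

```python
import math
from collections import Counter


def first_digit_shares(values):
    """Share of values starting with each digit 1..9 (positive integers assumed)."""
    counts = Counter(int(str(v)[0]) for v in values)
    return [counts.get(d, 0) / len(values) for d in range(1, 10)]


benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
powers = [2 ** k for k in range(1, 1001)]   # exact big integers
rescaled = [7 * v for v in powers]          # same data in different "units"

original_shares = first_digit_shares(powers)
rescaled_shares = first_digit_shares(rescaled)

for d in range(1, 10):
    print(f"{d}: Benford {benford[d - 1]:.3f}  "
          f"powers of 2 {original_shares[d - 1]:.3f}  "
          f"times 7 {rescaled_shares[d - 1]:.3f}")
```

Both columns stay close to the Benford proportions, which is the behaviour scale invariance predicts: changing the unit of measurement does not change the leading-digit pattern.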