An Introduction to Benford's Law

The law is named after Benford, but it was actually Newcomb who noticed in 1881 that the pages of logarithm tables covering numbers that start with small digits were more worn than the others. Benford rediscovered this fact only in 1938. The law says that in many collections of numerical data, the probability that the first significant digit is $d$ is $\log_{10}(1+1/d)$. More generally, the probability that the first significant digit is $d_1$, the second $d_2$, and so on up to the $m$th being $d_m$, is $\log_{10}\left(1+\left(\sum_{j=1}^m 10^{m-j}d_j\right)^{-1}\right)$. So the probability that the first digit is 1 is about 30.1%; it drops to 17.6% for 2 and keeps decreasing down to only 4.6% for 9. Software code to test the law on a set of 10000 Fibonacci numbers can be found at the rosettacode.org site.
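
The two formulas above are easy to check numerically. What follows is only a minimal Python sketch, not the Rosetta Code program: the function names `benford_prob` and `fibonacci_first_digit_freqs` are hypothetical, and the sample of 1000 Fibonacci numbers is an assumption made here to keep the run short.

```python
import math
from collections import Counter


def benford_prob(digits):
    """Benford probability that the leading significant digits are `digits`.

    `digits` is a sequence such as [1] or [3, 1, 4]; the first entry must be 1..9.
    """
    if not digits or not 1 <= digits[0] <= 9:
        raise ValueError("the first significant digit must be between 1 and 9")
    if any(not 0 <= d <= 9 for d in digits[1:]):
        raise ValueError("every later digit must be between 0 and 9")
    # sum_{j=1}^m 10^(m-j) d_j is simply the integer written with these digits.
    leading = int("".join(str(d) for d in digits))
    return math.log10(1 + 1 / leading)


def fibonacci_first_digit_freqs(count):
    """Observed first-digit frequencies of the first `count` Fibonacci numbers."""
    counts = Counter()
    a, b = 1, 1
    for _ in range(count):
        counts[int(str(a)[0])] += 1
        a, b = b, a + b
    return {d: counts[d] / count for d in range(1, 10)}


if __name__ == "__main__":
    observed = fibonacci_first_digit_freqs(1000)  # sample size is an assumption
    for d in range(1, 10):
        print(f"digit {d}: observed {observed[d]:.3f}, Benford {benford_prob([d]):.3f}")
```

Passing a longer list, for instance `benford_prob([3, 1, 4])`, gives the probability that a number starts with the digits 314.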

However, not all data sets satisfy the law. For one thing, the law is asymptotic: it holds only in the limit, for an infinite set of data. But even then, not every data set needs to satisfy it. Telephone book numbers are an obvious example, and square roots do not follow the law either. So it is important to know to what extent the law is satisfied. This can be measured by a number Δ, for example the largest deviation, over the digits considered, between the observed frequency and the theoretical probability.
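
To make the idea of such a Δ concrete, here is a small Python sketch under stated assumptions: Δ is taken as the largest absolute difference between observed first-digit frequency and Benford probability, with no further scaling (the book may normalize it differently), and the name `max_deviation` and the two test sequences are chosen here for illustration only.

```python
import math
from collections import Counter


def first_digit(n):
    """First significant digit of a positive integer."""
    return int(str(n)[0])


def max_deviation(values):
    """Largest gap between observed first-digit frequencies and Benford's law.

    Assumption: plain absolute difference of probabilities, unscaled.
    """
    counts = Counter(first_digit(v) for v in values)
    total = len(values)
    return max(
        abs(counts[d] / total - math.log10(1 + 1 / d))
        for d in range(1, 10)
    )


if __name__ == "__main__":
    # Powers of 2: a sequence whose first digits follow the law closely.
    powers_of_two = [2 ** k for k in range(1, 1001)]
    # Square roots of 1..9999: the first significant digit of sqrt(n)
    # is the first digit of its integer part, because sqrt(n) >= 1.
    square_roots = [math.isqrt(n) for n in range(1, 10000)]

    print("powers of 2 :", round(max_deviation(powers_of_two), 4))
    print("square roots:", round(max_deviation(square_roots), 4))
```

In this sketch the powers of 2 give a small value, while the square roots stay far from the law, in line with the remark above.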

Because the law is simple and counterintuitive, it has been popular among mathematical hobbyists. But a proper study of the phenomenon requires rigorous mathematics: random variables, probability spaces, stochastic processes, etc. All this is introduced in this book, and the reader should be willing to assimilate all of it to get a better grip on the phenomenon.

The book has numerous tables and examples illustrating the law, each with its Δ-value. The approach taken here illustrates that the law is catching on and is receiving broader and more serious attention. The authors have compiled a database of literature on the subject.

The law has been applied in fraud detection, justice, research data, game theory, etc. However, before any conclusion can be trusted, the rules should be clearly understood. There is a chapter on applications in this book, but applications are not its main objective. More can be found in another book, a collection of research papers, that is appearing simultaneously. See the accompanying review of Benford's Law: Theory and Applications (Princeton University Press, 2015) edited by S.J. Miller. Another recent book is Benford's Law by A.E. Kossovsky (World Scientific, 2014), but it is much less mathematical and more speculative.

If you want to understand all the mathematics behind the law and are prepared to work through all the necessary theory, this is the excellent introduction you might be looking for. If you are less patient, or are already well trained in the mathematics, you might prefer the chapter in the book edited by Miller mentioned above, in which the authors give a 40-page summary of the theory in this introduction.

One idiosyncrasy took me some time to get used to. Floating point numbers are given with only 4 significant digits and written with equality signs, even though they are not strict equalities. For example, π = 3.141 means that π has a value between 3.141 and 3.142. For Δ, only 2 digits are used, so Δ = 0.00 means that Δ is between 0.00 and 0.01.
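
The convention described above amounts to truncating rather than rounding. A short Python sketch makes the difference visible; the helper names `truncate_significant` and `truncate_decimals` are hypothetical, and the sample value 0.0047 for Δ is made up for illustration.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_significant(text, digits=4):
    """Cut a decimal string down to `digits` significant digits, without rounding."""
    value = Decimal(text)
    # adjusted() is the exponent of the most significant digit.
    exponent = value.adjusted() - (digits - 1)
    return value.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_DOWN)


def truncate_decimals(text, places=2):
    """Cut a decimal string down to `places` decimal places, without rounding."""
    return Decimal(text).quantize(Decimal(1).scaleb(-places), rounding=ROUND_DOWN)


if __name__ == "__main__":
    print(truncate_significant("3.14159265"))  # truncation keeps 3.141, rounding would give 3.142
    print(truncate_decimals("0.0047"))         # a hypothetical Δ value, shown as 0.00
```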

Reviewer: Adhemar Bultheel
Book details

This is a sound mathematical introduction to the statistics of Benford's law.

Author: 

Publisher: 

Published: 2015
ISBN: 9780691163062 (hbk)
Price: USD 75.00
Pages: 256