# Benford's Law

In 1881 a result was published, based on the observation that the first pages of logarithm books, used at that time to perform calculations, were much more worn than the other pages. The conclusion was that computations of numbers that started with 1 were performed more often than others: if d denotes the first digit of a number the probability of its appearance is equal to log(d + 1).

The phenomenon was rediscovered in 1938 by the physicist F. Benford, who confirmed the “law” for a large number of random variables drawn from geographical, biological, physical, demographical, economical and sociological data sets. It even holds for randomly compiled numbers from newspaper articles. Specifically, Benford’s law, or the first-digit law, states, that the occurrence of a number with first digit 1 is with 30.1%, 2 with 17.6%, 3 with 12.5%, 4 with 9.7%, 5 with 7.9%, 6 with 6.7%, 7 with 5.8%, 8 with 5.1% and 9 with 4.6% probability. In general, the leading digit d ∈ [1, ..., b−1] in base b ≥ 2 occurs with probability proportional to log_b(d + 1) − log_b(d) = log_b(1 + 1/d).

First explanations of this phenomena, which appears to suspend the notions of probability, focused on its logarithmic nature which implies a scale-invariant or power-law distribution. If the first digits have a particular distribution, it must be independent of the measuring system, i.e., conversions from one system to another don’t affect the distribution. (This requirement that physical quantities are independent of a chosen representation is one of the cornerstones of general relativity and called covariance.) So the common sense requirement that the dimensions of arbitrary measurement systems shouldn’t affect the measured physical quantities, is summarized in Benford’s law. In addition, the fact that many processes in nature show exponential growth is also captured by the law, which assumes that the logarithms of numbers are uniformly distributed.

So how come one observes random variables following normal and scaling-law distributions? In 1996 the phenomena was mathematically rigorously proven: if one repeatedly chooses different probability distribution and then randomly chooses a number according to that distribution, the resulting list of numbers will obey Benford’s law. Hence the law reflects the behavior of distributions of distributions.

Benford’s law has been used to detect fraud in insurance, accounting or expenses data, where people forging numbers tend to distribute their digits uniformly.