Figure_5.tif (378.23 kB)

The distribution of number words follows an inverse power function in both English and Spanish.

posted on 2013-02-20, 16:23 authored by Michael Ramscar, Melody Dye, Hanna Muenke Popick, Fiona O'Donnell-McCarthy

Panel A shows the relative frequency with which the numbers 1–7 are used to describe sets of nouns in spoken English and Spanish (r = .999) [53], [54]. To ensure that the striking similarity in set mentions between the two languages was not an artifact of our weighted estimate of “uno,” we also examined the relationship between the frequency of number-word+noun sequences in English and the raw frequency counts for Spanish number words. Panel B plots the relative frequency with which the numbers 1–7 are used to describe sets in spoken English [53] against the relative frequency of the numbers 1–7 in the 100-million-word Corpus Del Español [54]. Again, the same pattern and correlation (r = .999) were observed. These findings suggest that the distributions of number words in English and Spanish conform well to Benford's law, which holds that lists of numbers from real-life sources of data tend to show an inverse power distribution [55]. We should note, however, that the probability distribution of numbers is somewhat more complex than this captures: because the decimal system (base ten) is employed for most everyday purposes, multiples of 10, 100, 1000, etc., tend to be used much more frequently than Benford's law would predict, and similar, albeit smaller, peaks in usage frequency can be observed for multiples of five.
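
The two quantities the caption relies on, a Benford-style expected distribution and a Pearson correlation between two frequency profiles, can be sketched as follows. This is a minimal illustration, not the authors' analysis: it uses the standard leading-digit form of Benford's law, P(d) = log10(1 + 1/d), and compares it with a hypothetical normalized inverse power curve 1/n^a. The function names, the exponent a = 1, and the choice to compare these two theoretical curves are all assumptions made for illustration; no corpus counts from [53] or [54] are reproduced here.

```python
import math


def benford_probability(d: int) -> float:
    """Expected share of leading digit d (1-9) under Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be in 1..9")
    return math.log10(1 + 1 / d)


def inverse_power_shares(ns, exponent=1.0):
    """Normalized 1/n**exponent shares (hypothetical comparison curve)."""
    weights = [n ** -exponent for n in ns]
    total = sum(weights)
    return [w / total for w in weights]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# The figure covers the numbers 1-7, so the sketch uses the same range.
numbers = list(range(1, 8))
benford = [benford_probability(d) for d in numbers]
power = inverse_power_shares(numbers, exponent=1.0)  # exponent is an assumption
r = pearson_r(benford, power)

for n, b, p in zip(numbers, benford, power):
    print(f"{n}: Benford {b:.3f}  inverse power {p:.3f}")
print(f"Pearson r between the two curves: {r:.3f}")
```

To compare real data in the same way, one would replace `benford` and `power` with the relative frequencies of the number words 1–7 measured in each corpus and pass them to `pearson_r`.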