Generation of Fat Tail Distributions

Luccio R.
Dipartimento di Psicologia "G. Kanizsa", University of Trieste, Trieste, Italy

Many quite different phenomena are distributed according to a small number of functions whose broadly similar shape has led them to be called “fat (heavy, long) tail distributions”. A well-known example is Benford’s law (originally stated by Newcomb, 1881), according to which the probability that the first digit in a series of statistical data is d is a logarithmic function of the digit, P(d) = log10(1 + 1/d). Other well-known examples are Bradford’s law (on the distribution of scientific journals), Heaps’ law (vocabulary growth and text size), Lotka’s law (number of authors and number of contributions), and so on. In economics, Lorenz’s law and Pareto’s law (on income inequality) are well known. In psycholinguistics the most celebrated is undoubtedly Zipf’s law, on the relation between the number of occurrences of words and their frequency rank; originally stated in 1925, it is a power (quasi-hyperbolic) law. According to Zipf, it can be explained by a psychological principle of economy: the more frequent a word is, the more easily it comes to consciousness. In this study I investigated alliteration, that is, the relationship between the number of words interposed between two words sharing the same first letter in the first syllable (x) and the number of occurrences of each given x, that is, n(x). Analysing excerpts from texts by different authors (Italian, French and American novelists such as D’Annunzio, Invernizio, France and James, and essayists such as Leopardi), I invariably found an excellent fit to Lorenz’s law, with R-squared always above .93, and parameters that were remarkably stable within each author rather than across authors. This suggests using this regularity in studies of authorship attribution. Some hypotheses about the mechanisms generating such distributions are advanced.
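As an illustration only, the tabulation of n(x) described above might be sketched as follows. This is not the procedure used in the study: the function name `alliteration_gaps`, the tokenisation rule (runs of letters, lower-cased, so that elided forms such as "l'amore" split into two words), and the choice to pair each word only with the nearest preceding word that has the same initial letter are all assumptions made for the example.

```python
import re
from collections import Counter


def alliteration_gaps(text):
    """Return a Counter mapping x -> n(x).

    x is the number of words interposed between a word and the nearest
    preceding word with the same initial letter (assumed pairing rule).
    """
    # Assumed tokenisation: maximal runs of Unicode letters, lower-cased.
    words = re.findall(r"[^\W\d_]+", text.lower())
    last_seen = {}      # initial letter -> index of its latest occurrence
    counts = Counter()  # x -> n(x)
    for i, word in enumerate(words):
        initial = word[0]
        if initial in last_seen:
            # Words strictly between the two alliterating words.
            counts[i - last_seen[initial] - 1] += 1
        last_seen[initial] = i
    return counts


if __name__ == "__main__":
    sample = "the big bad bear took the boat to town"
    for x, n in sorted(alliteration_gaps(sample).items()):
        print(x, n)
```

The resulting pairs (x, n(x)) are the kind of data to which a candidate distribution could then be fitted, separately for each excerpt or author.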