Zipf's law (Zipf, 1949) may be one of the most enigmatic and controversial regularities known in linguistics. It has been alternatively billed as the hallmark of complex systems and dismissed as a mere artifact of data presentation.^{1} The simplicity of its formulation, its experimental universality, and its robustness starkly contrast with the obscurity of its meaning. In its most straightforward form, it states that if the words of a language are ranked in order of decreasing frequency in texts, the frequency is inversely proportional to the rank,

*f*_{k} = *C*/*k*,   (1)
where *f*_{k} is the frequency of the word with rank *k*. As a typical example, consider a log-log plot of frequency vs. rank in Fig. 1. It is calculated from a frequency dictionary of the Russian language compiled by S. Sharoff (2002). The dictionary is based on a corpus of 40 million words, with special care taken (Sharoff, 2002) to prevent data skewing by words with high concentrations in particular texts (like the word *hobbit* in a Tolkien novel).
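To make the rank-frequency convention concrete, the following Python sketch (not part of the original study; the toy corpus and function name are illustrative) extracts frequencies sorted by rank and checks the inverse-proportionality signature, namely that *k* · *f*_{k} stays roughly constant:

```python
from collections import Counter

def rank_frequencies(tokens):
    """Return word frequencies in decreasing order,
    so that freqs[k-1] is the frequency of the rank-k word."""
    counts = Counter(tokens)
    return sorted(counts.values(), reverse=True)

# A toy corpus whose counts follow f_k = C/k exactly (C = 12):
tokens = ["the"] * 12 + ["of"] * 6 + ["and"] * 4 + ["to"] * 3
freqs = rank_frequencies(tokens)

# Under Zipf's law, the product k * f_k is roughly constant across ranks:
products = [k * f for k, f in enumerate(freqs, start=1)]
# → [12, 12, 12, 12]
```

For a real corpus the products would of course only be approximately constant, with the systematic low-rank and high-rank deviations mentioned below.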

Zipf's law is usually presented in a generalized form where the power law exponent may be different from −1,

*f*_{k} = *C*/*k*^{B},   (2)
According to Ferrer i Cancho (2005), who presents an extensive bibliography, various subsets of the language obey the generalized Zipf's law (Equation 2). Thus, while the value of *B* ≈ 1 is typical for single-author samples and balanced frequency dictionaries, different values, both greater and less than 1, characterize the speech of schizophrenics and very young children, military communications, or subsamples consisting of nouns only.
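The exponent *B* in Equation 2 is commonly estimated as the (negated) slope of a least-squares fit of log *f*_{k} against log *k*. A minimal stdlib-only Python sketch (the function name and synthetic data are illustrative, not from the sources cited here):

```python
import math

def zipf_exponent(freqs):
    """Least-squares slope of log f_k vs. log k, negated,
    giving B in f_k ~ C / k^B. freqs must be in decreasing order."""
    xs = [math.log(k) for k in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return -cov / var  # the fitted slope is -B

# Synthetic frequencies with B = 1 (pure inverse proportionality):
freqs = [1000 / k for k in range(1, 101)]
B = zipf_exponent(freqs)  # → 1.0 (up to floating-point error)
```

Applied to a balanced corpus, such a fit would be expected to return *B* close to 1, while the special subsamples listed above would yield larger or smaller values.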

Here we concentrate on the “normal” language approximated by balanced corpora and do not consider the above-mentioned special cases and subsets. Neither do we attempt to generalize our treatment to other power laws found in various domains such as computer science or economics (a comprehensive bibliography compiled by Wentian Li is available at http://www.nslij-genetics.org/wli/zipf/). The purpose of this work is to demonstrate that inverse proportionality (Equation 1) can be explained on purely linguistic grounds. Likewise, we do not pay special attention to the systematic deviations from the inverse proportionality at the low-rank and high-rank ends, considering them second-order effects.

It is not possible to review the vast literature related to Zipf's law. However, it appears that the bulk of it is devoted to experimental results and phenomenological models. Few models aim to explain the underlying cause of the power law and to predict the exponent. We briefly review models of this type in Section 1. In Section 2, we discuss the role in the language of words/meanings having different degrees of generality. In Section 3, we show that Zipf's law can be generated by a particular kind of arrangement of word meanings over the semantic space. In Section 4, we discuss the evolution of word meanings and demonstrate that it can lead to such arrangements. Section 5 is devoted to numerical modeling of this process. Discussion and prospects for further studies constitute Section 6. In the Appendix we present some evidence to support the assumption that a word's frequency is proportional to the extent of its meaning.