## 1. Introduction

The question of how semantic memory is structured has been tackled in recent years through graph-theoretic analyses of semantic networks constructed from word association norms (De Deyne & Storms, 2008b; Steyvers & Tenenbaum, 2005). However, semantic networks based on word association norms do not unambiguously reflect the structural form of the networks of individual participants. The problem is that these networks are based on word association data sets that aggregate across the responses of many people. The structural properties of these aggregate networks need not resemble the properties of an individual’s network because various forms of biases can occur when combining data over individuals. In this article, we investigate how the individual’s associative semantic network is structured. To this end, we present a new experimental procedure for sampling the associative semantic networks of individual participants and analyze the statistical structure of the resultant individual networks. Each of these individual maps of semantic structure contains thousands of unique words that were sampled during approximately 6 weeks of 1-hr daily sessions. Moreover, we reanalyze existing semantic networks based on group data to examine how their structure compares with the structure of the networks of individuals.

The question of how an individual’s enormous reservoir of semantic knowledge is structured is not only interesting in its own right but also has significant implications for the processes that operate on the structure of semantic memory. In the words of Herbert Simon (1986, p. 299):

> A central concern in describing any symbol-processing system is to characterize the structure (…) of its memories. When we know these facts about a brain or a digital computer, we know a great deal about its capabilities and methods of operation.

For instance, the structure of the individual’s semantic network can elucidate the processes through which memory grows or develops over the life span. A number of researchers have argued that the structural properties of networks are important consequences of the characteristic ways in which network systems grow over time (e.g., Amaral, Scala, Barthélémy, & Stanley, 2000; Barabási & Albert, 1999; Watts & Strogatz, 1998). Because different mechanisms of semantic growth need not give rise to the same type of memory structure, the statistical structure of the individual’s semantic network can help restrict the search space of possible growth mechanisms.

### 1.1. Structural properties of aggregate semantic networks

We start by defining some basic terminology from network analysis and by introducing the structural properties that have been shown to characterize semantic networks based on group data. In an aggregate network of word associations, a set of words is represented as nodes joined by links or connections that represent a nonzero probability of a word being named as an associate by many people in response to a cue word. Two nodes are said to be neighbors if they are connected. A network can be treated as having directed or undirected links, where a directed link represents the direction of the association between two words and an undirected link leaves the direction unspecified. A network with only directed links is said to be directed and a network with only undirected links is said to be undirected.
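The terminology above can be made concrete with a minimal sketch, assuming that the association data arrive as (cue, response) pairs. The function names and the toy word pairs are hypothetical illustrations and are not taken from any association norm.

```python
def build_directed(pairs):
    """Map every word to the set of words named as its associates (cue -> response)."""
    out_links = {}
    for cue, response in pairs:
        out_links.setdefault(cue, set())
        out_links.setdefault(response, set())   # responses are nodes too
        if cue != response:                     # assumption: self-associations are ignored
            out_links[cue].add(response)
    return out_links


def to_undirected(out_links):
    """Drop the direction of every link: two words are neighbors if either cues the other."""
    neighbors = {word: set() for word in out_links}
    for cue, responses in out_links.items():
        for response in responses:
            neighbors[cue].add(response)
            neighbors[response].add(cue)
    return neighbors


# Hypothetical toy data for illustration only.
pairs = [("dog", "cat"), ("cat", "mouse"), ("dog", "bone"), ("mouse", "cat")]
directed = build_directed(pairs)
undirected = to_undirected(directed)

# In-degree of a word: number of cues that elicit it as a response.
in_degree = {w: sum(w in responses for responses in directed.values()) for w in directed}
```

Storing each node's neighbors as a set keeps the neighbor test (`b in neighbors[a]`) cheap, which is convenient for the clustering and path-length statistics introduced next.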

Networks constructed from word association norms have been shown to have a small-world structure, characterized by high local clustering and short global distances between words (De Deyne & Storms, 2008b; Steyvers & Tenenbaum, 2005). On the one hand, these networks are highly structured locally, having clusters of words that are densely connected to each other by associative relations. On the other hand, there are words that connect semantically distant clusters to one another, making it possible to connect any pair of words by traversing only a few connections. A semantic network with these two properties is called a “small world” because almost every word in the network is somehow “close” to almost every other word, even those that could be thought of as being very distant in semantic relatedness.

More formally, small-world networks are considered in terms of how they compare with random networks with the same type of links (i.e., directed or undirected), number of nodes, and average number of links across nodes. In a random network, a node is arbitrarily connected to nodes that can lie anywhere. The comparison is made regarding two statistical properties: the average shortest path length *L* and the clustering coefficient *C*. The average shortest path length *L* refers to the average of the shortest path lengths (i.e., the minimum number of links) that separate all pairs of words in the network.^{1} The clustering coefficient *C* represents the probability that two neighbors of a randomly chosen word will themselves be neighbors (Watts & Strogatz, 1998). If the *k*_{i} neighbors of a given word *i* were part of a fully connected neighborhood, there would be *k*_{i}(*k*_{i} − 1)/2 possible connections between them. The clustering coefficient of word *i* is the ratio between the number *T*_{i} of connections that actually exist between the *k*_{i} neighbors of word *i* and the number of possible connections between them:

$$C_i = \frac{T_i}{k_i (k_i - 1)/2} = \frac{2 T_i}{k_i (k_i - 1)}$$

The clustering coefficient *C* of the whole network is calculated by taking the average of the *C*_{i}’s across all words *i*. Because the definitions of *T*_{i} and *k*_{i} are independent of whether the links are directed, the clustering coefficients of a directed network and of the corresponding undirected network are equal. Let *L*_{g} be the average shortest path length of the real network *G* and *C*_{g} its clustering coefficient, and let *L*_{random} and *C*_{random} be the equivalent quantities for the corresponding random networks. *G* is said to be a small-world network if *L*_{g} ≳ *L*_{random} (i.e., *L*_{g} is comparable to, or only slightly larger than, *L*_{random}) and *C*_{g} ≫ *C*_{random} (Watts & Strogatz, 1998). Whereas the short distances allow for connecting two words chosen at random via a chain of only a few intermediaries, the high clustering implies that, on average, a word’s neighbors are more likely to be connected than two words chosen at random.
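The two statistics and the comparison with a random network can be sketched as follows for an undirected network stored as a dictionary of neighbor sets. This is a minimal illustration under stated assumptions, not the procedure used in the studies cited: words with fewer than two neighbors are assigned *C*_{i} = 0, path lengths are averaged over reachable pairs only, and the ring lattice and the function names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def clustering_coefficient(neighbors):
    """Average over all nodes of C_i = 2*T_i / (k_i*(k_i - 1)); C_i = 0 if k_i < 2 (assumption)."""
    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue
        t = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        total += 2.0 * t / (k * (k - 1))
    return total / len(neighbors)


def average_shortest_path(neighbors):
    """Mean breadth-first-search distance over all pairs of distinct, reachable nodes."""
    total, count = 0, 0
    for source in neighbors:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count


def ring_lattice(n_nodes, reach):
    """Toy network: each node is linked to its `reach` nearest nodes on either side of a ring."""
    neighbors = {i: set() for i in range(n_nodes)}
    for i in range(n_nodes):
        for step in range(1, reach + 1):
            j = (i + step) % n_nodes
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


def random_network(n_nodes, n_links, seed=0):
    """Random undirected network with the same number of nodes and links."""
    rng = random.Random(seed)
    neighbors = {i: set() for i in range(n_nodes)}
    for a, b in rng.sample(list(combinations(range(n_nodes), 2)), n_links):
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


lattice = ring_lattice(60, 2)
n_links = sum(len(nbrs) for nbrs in lattice.values()) // 2
rand = random_network(60, n_links)

c_lattice, l_lattice = clustering_coefficient(lattice), average_shortest_path(lattice)
c_random, l_random = clustering_coefficient(rand), average_shortest_path(rand)
print(f"C ratio: {c_lattice / max(c_random, 1e-12):.1f}, L ratio: {l_lattice / l_random:.1f}")
```

In this toy case the lattice is far more clustered than the random network, but its paths are also much longer, so it would not count as a small world. A small-world network combines clustering of the lattice kind with path lengths close to those of the random network.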

In addition to being characterized as small-world networks, semantic networks based on group data have been claimed to possess degree distributions that follow a power law (De Deyne & Storms, 2008b; Steyvers & Tenenbaum, 2005). A network’s degree distribution represents the probability that a randomly chosen word will have *k* neighbors (i.e., will have degree *k*). The distribution can be estimated based on the frequencies of word degrees found throughout the network. When the associative network is directed, researchers have focused on the number of incoming links to a word (i.e., the word’s in-degree). When the network is undirected, a word has a certain degree *k*, which is simply the number of links that it has. Network degree data can best be represented by plotting a cumulative degree distribution showing the probability that a randomly chosen word has an (in-)degree equal to or greater than *k* (Newman, 2005). Researchers have claimed that aggregate semantic networks, either directed or undirected, are characterized by link distributions across nodes that follow a power law, with most words having relatively few connections joined together through a smaller number of words with many connections (De Deyne & Storms, 2008b; Steyvers & Tenenbaum, 2005). Networks with power-law degree distributions are sometimes referred to as scale-free networks because power laws have the property of having the same functional form at all scales (Barabási & Albert, 1999). Nevertheless, a methodological remark is warranted here. Studies that claimed that aggregate semantic networks are scale-free used the common method of fitting power laws to binned histograms by least-squares fitting (De Deyne & Storms, 2008b; Steyvers & Tenenbaum, 2005). This method has been shown to generate poor estimates of parameters for power-law distributions and, in addition, gives no indication of whether another distribution might give a fit as good as or better than the power law (Clauset, Shalizi, & Newman, 2009). Therefore, the result that aggregate semantic networks are scale-free needs to be validated by principled statistical methods for detecting power-law behavior in empirical data.
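As an illustration of the two ideas in this paragraph, the sketch below computes a cumulative degree distribution and estimates a power-law exponent by maximum likelihood rather than by least squares on a binned histogram. It uses the approximate discrete estimator given by Clauset et al. (2009), which those authors describe as accurate only when the lower cutoff is not too small. The cutoff `k_min` is supplied by hand here, whereas the full framework also estimates it from the data. The function names and the toy degree list are hypothetical.

```python
import math
from collections import Counter


def cumulative_degree_distribution(degrees):
    """Return {k: P(K >= k)} for every degree k that occurs in the data."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf, remaining = {}, n
    for k in sorted(counts):
        ccdf[k] = remaining / n      # share of words with degree >= k
        remaining -= counts[k]
    return ccdf


def powerlaw_alpha_mle(degrees, k_min=1):
    """Approximate discrete maximum-likelihood exponent:
    alpha ~= 1 + n / sum(ln(k_i / (k_min - 1/2))), using only degrees k_i >= k_min."""
    tail = [k for k in degrees if k >= k_min]
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


# Hypothetical toy degrees, far too few for a real analysis.
toy_degrees = [1, 1, 1, 2, 2, 3]
ccdf = cumulative_degree_distribution(toy_degrees)
alpha = powerlaw_alpha_mle(toy_degrees, k_min=1)
```

An exponent estimated this way says nothing by itself about whether a power law is a plausible model. That question requires a goodness-of-fit test and a comparison with alternative distributions.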

### 1.2. Aggregate and individual semantic networks

Current associative semantic networks are based on word association data sets that aggregate across the associates of many people. These aggregate networks are a way of representing semantic knowledge that is shared among different speakers of a language. However, aggregate networks need not preserve the statistical properties of the networks of individuals. The degree distribution, in particular, can take a different shape when data are combined over individuals. Simulation work has shown that the power-law model provides a good fit to degree distributions that result when averaging across multiple individual degree distributions, none of which follows a power law (Grünenfelder & Müller, unpublished data). Moreover, combining data over individuals can influence other connectivity properties (e.g., the average degree *k*, the average shortest path length *L*) that depend strongly on how the word association data were collected, in particular on how many individuals generated associates for each cue word. In short, the structure of aggregate semantic networks does not unambiguously reflect the structural form of the individual’s semantic network.
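The dependence of aggregate connectivity on the number of contributing individuals can be illustrated with a deliberately simple simulation. It assumes, unrealistically, that every hypothetical individual links each cue to a fixed number of responses drawn uniformly at random from a small vocabulary. It is not the simulation referred to above, and all names and parameter values are illustrative.

```python
import random


def individual_links(vocabulary, n_responses, rng):
    """Directed cue -> response links of one hypothetical individual."""
    links = set()
    for cue in vocabulary:
        others = [w for w in vocabulary if w != cue]
        for response in rng.sample(others, n_responses):
            links.add((cue, response))
    return links


def average_out_degree(links, vocabulary):
    """Average number of distinct responses per cue word."""
    return len(links) / len(vocabulary)


rng = random.Random(1)
vocabulary = list(range(50))          # 50 placeholder "words"
aggregate = set()
avg_degree_by_group_size = []
for n_individuals in range(1, 11):
    # Pooling another individual can only add links to the aggregate network.
    aggregate |= individual_links(vocabulary, 3, rng)
    avg_degree_by_group_size.append(average_out_degree(aggregate, vocabulary))
```

Each simulated individual has an average out-degree of exactly 3, yet the average degree of the pooled network keeps rising as more individuals are added. The aggregate value therefore reflects the data collection design as much as any individual's network.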

In this article, we map and study the structure of individuals’ semantic memories. Specifically, we present a new experimental procedure for sampling words from individuals’ semantic memories and analyze the statistical structure of the resultant networks. Our structural analyses focus on whether the networks of individuals display small-world properties and whether their degree distributions are better described by a power law than by alternative functions. We characterize the degree distributions by applying a statistical framework developed by Clauset et al. (2009) that involves comparing the power law with alternative statistical models. The model comparison can shed light on the formation of human semantic networks, as different degree distributions typically arise from different processes of network development (e.g., Amaral et al., 2000). Moreover, we reanalyze the degree distributions of existing aggregate semantic networks using the same method. This will allow for examining whether aggregate networks are indeed scale-free, as has been previously claimed, and how the connectivity structure of these networks compares with the structure of the individual networks. Finally, we present a computer implementation of the new method for sampling the semantic networks of individuals. Our aim in modeling the new sampling method is to examine whether it yields semantic networks that are representative of the individuals’ true semantic networks, thereby revealing their structural characteristics.
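The kind of model comparison this framework involves can be sketched as a log-likelihood ratio test between a power law and an exponential, with the significance of the ratio's sign assessed as described by Clauset et al. (2009). To keep the example short, it treats the data as continuous and takes the lower cutoff `x_min` as known. Both are simplifying assumptions, because degrees are integers and the full framework also estimates the cutoff. The synthetic samples and function names are hypothetical.

```python
import math
import random


def loglik_power_law(xs, x_min):
    """Pointwise log-likelihoods under a continuous power law fitted by maximum likelihood."""
    alpha = 1.0 + len(xs) / sum(math.log(x / x_min) for x in xs)
    return [math.log((alpha - 1.0) / x_min) - alpha * math.log(x / x_min) for x in xs]


def loglik_exponential(xs, x_min):
    """Pointwise log-likelihoods under a shifted exponential fitted by maximum likelihood."""
    lam = 1.0 / (sum(xs) / len(xs) - x_min)
    return [math.log(lam) - lam * (x - x_min) for x in xs]


def likelihood_ratio(xs, x_min):
    """Return (R, p). R > 0 favors the power law, R < 0 favors the exponential;
    a small p suggests that the sign of R is unlikely to be a chance fluctuation."""
    diffs = [a - b for a, b in zip(loglik_power_law(xs, x_min), loglik_exponential(xs, x_min))]
    n = len(diffs)
    r = sum(diffs)
    mean = r / n
    sigma = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    p = math.erfc(abs(r) / (math.sqrt(2.0 * n) * sigma))
    return r, p


# Synthetic samples with known generating distributions (illustrative parameters).
rng = random.Random(42)
x_min = 1.0
power_sample = [x_min * (1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(2000)]  # alpha = 2.5
exp_sample = [x_min + rng.expovariate(0.5) for _ in range(2000)]

r_power, p_power = likelihood_ratio(power_sample, x_min)
r_exp, p_exp = likelihood_ratio(exp_sample, x_min)
```

With samples of this size, the sign of the ratio should recover the generating distribution in both cases. Real degree data are noisier, and a ratio whose sign is not significant leaves the choice between the two models open.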