#### 2.1. Raven's Progressive Matrices

There are several variations of the RPM; the Standard and Colored versions are generally used to test children or lower-performing adults, whereas the Advanced version is used to differentiate among average and above-average subjects. In our work, we focus on the Advanced version.

Fig. 1 depicts an example of a simple Raven's-style matrix.^{1} The matrix is shown at the top with one blank cell, and the eight possible answers for that blank cell are given below. In order to solve this matrix, the subject needs to generate three rules: (a) the number of triangles increases by one across the row, (b) the orientation of the triangles is constant across the row, (c) each cell in a row contains one background shape from the set {circle, square, diamond}. Subjects can then determine which element belongs in the blank cell by applying the rules to the third row (i.e., there should be 2 + 1 = 3 triangles, they should be pointing towards the left, and the background shape should be a circle, since square and diamond are already taken). Once they have generated their hypothesis as to what the blank cell should look like, they can check for a match among the eight possible answers. Not all subjects will explicitly generate these exact rules, and their route to the answer may be more roundabout, but they do need to extract equivalent information if they are to correctly solve the problem.
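The solution procedure just described (induce the rules, apply them to the third row, match the hypothesis against the answer bank) can be made concrete with a toy sketch. The attribute names, cell encodings, and answer bank below are purely illustrative and are not part of the model:

```python
# Toy sketch of the rule-application step described above.
# All names and values here are illustrative, not the model's representation.
ALL_BGS = {"circle", "square", "diamond"}

# The first two cells of the third row: 1 then 2 left-pointing triangles,
# with the square and diamond backgrounds already used.
row3 = [{"n": 1, "orient": "left", "bg": "square"},
        {"n": 2, "orient": "left", "bg": "diamond"}]

def predict(row):
    """Apply the three induced rules to hypothesize the blank cell:
    (a) the count increases by one, (b) the orientation is constant,
    (c) the background is the one shape not yet used in the row."""
    return {"n": row[-1]["n"] + 1,
            "orient": row[-1]["orient"],
            "bg": (ALL_BGS - {cell["bg"] for cell in row}).pop()}

# Check the hypothesis against a (made-up) answer bank.
answers = [{"n": 3, "orient": "left", "bg": "circle"},
           {"n": 2, "orient": "left", "bg": "circle"},
           {"n": 3, "orient": "right", "bg": "diamond"}]
match = answers.index(predict(row3))
```

As in the text, the route matters less than the information extracted: any procedure that recovers the equivalent of these three rules will select the same answer.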

Despite the test's broad use, there have been few computational models of this task. The model of Carpenter, Just, and Shell (1990) accurately recreates high-level human data (e.g., error rates), but it does not reflect the flexibility and variability of individual human performance nor take into account neurologic data. In addition, Carpenter et al.’s model has no ability to generate new rules; the rules are all specified beforehand by the modelers. This limitation of their model reflects a general lack of explanation in the literature as to how this inductive process is performed. More recently, models have been developed by Lovett, Forbus, and Usher (2010) and McGreggor, Kunda, and Goel (2010). The latter employs interesting new techniques based on image processing, but it is not intended to closely reflect human reasoning and is limited to RPM problems that can be solved using visual transformations. The Lovett et al. (2010) model takes an approach more similar to our own and has the advantage of more automated visual processing, but like the Carpenter et al. model it is targeted only at high-level human data and relies on applying rules defined by the modelers.

The prevailing assumption regarding the origin of subjects’ rules in the RPM has been that people either (a) are born with, or (b) learn earlier in life, a library of rules, which they then apply to the current inductive problem during the RPM. Hunt described this theory as early as 1973 and also pointed out its necessary conclusion: If RPM performance depends on a library of known rules, then the RPM is testing our crystallized intelligence (our ability to acquire and use knowledge or experience) rather than our fluid intelligence (our novel problem-solving ability). In other words, the RPM would be a task similar to acquiring a large vocabulary and using it to communicate well. However, this is in direct contradiction to the experimental evidence, which shows the RPM strongly and consistently correlating with other measures of fluid intelligence (Marshalek, Lohman, & Snow, 1983), and to psychometric/neuroimaging practice, which uses the RPM as an index of subjects’ fluid reasoning ability (Gray, Chabris, & Braver, 2003; Perfetti et al., 2009; Prabhakaran, Smith, Desmond, Glover, & Gabrieli, 1997). A large amount of work has been informed by the assumption that the RPM measures fluid intelligence, yet the problem raised by Hunt has been largely ignored. Consequently, there is a need for a better explanation of rule induction; by providing a technique to dynamically generate rules, we remove the dependence on a past library and thereby resolve the problem.

In contrast to the paucity of theoretical results, there has been an abundance of experimental work on the RPM. This has brought to light a number of important aspects of human performance on the test that need to be accounted for by any potential model. First, there are a number of learning effects: Subjects improve with practice if given the RPM multiple times (Bors, 2003) and also show learning within the span of a single test (Verguts & De Boeck, 2002). Second, there are both qualitative and quantitative differences in individuals’ ability; they exhibit the expected variability in ‘‘processing power’’ (variously attributed to working memory, attention, learning ability, or executive functions) and also consistent differences in high-level problem-solving strategy between low-scoring and high-scoring individuals (Vigneau, Caissie, & Bors, 2006). Third, a given subject's performance is far from deterministic; given the same test multiple times, subjects will get previously correct answers wrong and vice versa (Bors, 2003). This is not an exhaustive list, but it represents some of the features that best define human performance. In the Results section, we demonstrate how each of these observations is accounted for by our model.

#### 2.2. Vector encoding

In order to represent a Raven's matrix in neurons and work on it computationally, we need to translate the visual information into a symbolic form. Vector Symbolic Architectures (VSAs; Gayler, 2003) are one set of proposals for how to construct such representations. VSAs represent information as vectors and implement mathematical operations to combine those vectors in meaningful ways.

To implement a VSA, it is necessary to define a binding operation (which ties two vectors together) and a superposition operation (which combines vectors into a set). We use circular convolution for binding and vector addition for superposition (Plate, 2003). Circular convolution is defined as

$$C = A \otimes B,$$

where

$$c_j = \sum_{k=0}^{n-1} a_k\, b_{(j-k) \bmod n} \tag{1}$$
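For concreteness, circular convolution can be sketched numerically. This minimal example uses the FFT convolution theorem, which computes the same operation in O(n log n); the dimensionality and random seed are arbitrary choices:

```python
import numpy as np

def cconv(a, b):
    """Circular convolution: c_j = sum_k a_k * b_{(j - k) mod n}.
    Computed via the convolution theorem using FFTs."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

rng = np.random.default_rng(seed=0)
n = 512
# Random vectors scaled so their expected length is ~1, as in a VSA vocabulary.
a = rng.normal(0.0, 1.0 / np.sqrt(n), n)
b = rng.normal(0.0, 1.0 / np.sqrt(n), n)

c = cconv(a, b)
# The bound vector c has roughly unit length but resembles neither input,
# which is what makes binding useful for building structured representations.
```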

Along with this, we employ the idea of a transformation vector *T* between two vectors *A* and *B*, defined as

$$T = B \otimes A' \tag{2}$$

where *A*′ denotes the approximate inverse of *A*.
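A short sketch shows the transformation vector in action. For real-valued vectors the approximate inverse can be taken as the involution that reverses all elements but the first (Plate, 2003); applying *T* to *A* then recovers a noisy copy of *B*:

```python
import numpy as np

def cconv(a, b):
    """Circular convolution (the binding operation)."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def approx_inverse(a):
    """Approximate inverse A' under circular convolution: the involution
    a'_j = a_{(-j) mod n} (Plate, 2003)."""
    return np.concatenate(([a[0]], a[:0:-1]))

rng = np.random.default_rng(seed=1)
n = 512
A = rng.normal(0.0, 1.0 / np.sqrt(n), n)
B = rng.normal(0.0, 1.0 / np.sqrt(n), n)

T = cconv(B, approx_inverse(A))  # T = B (x) A'
B_hat = cconv(T, A)              # applying T to A yields a noisy copy of B
```

The recovered vector is only approximately *B*; in a full VSA system it would be "cleaned up" by comparison against the vocabulary.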

With these elements, we can create a vector representation of the information in any Raven's matrix. The first step is to define a vocabulary, the elemental vectors that will be used as building blocks. For example, we might use the vector [0.1,−0.35,0.17,…] as the representation for *circle*. These vectors are randomly generated, and the number of vectors that can be held in a vocabulary and still be distinguishable as unique ‘‘words’’ is determined by the dimensionality of those vectors (the more words in the vocabulary, the higher the dimension of the vectors needed to represent them).
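The relationship between vocabulary size and dimensionality can be checked directly: independently sampled high-dimensional vectors are nearly orthogonal, which is what keeps the "words" distinguishable. A quick sketch (the sizes are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n, words = 512, 100
# A vocabulary of 100 random 512-dimensional vectors with expected unit length.
V = rng.normal(0.0, 1.0 / np.sqrt(n), size=(words, n))

sims = V @ V.T                # pairwise dot products between all words
np.fill_diagonal(sims, 0.0)   # ignore each word's similarity to itself
# Even the most similar pair of distinct words has low similarity, so all
# 100 words remain distinguishable at this dimensionality; packing many
# more words in would eventually require raising n.
max_sim = np.abs(sims).max()
```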

Once the vocabulary has been generated it is possible to encode the structural information in a cell. A simple method to do this is by using a set of *attribute* ⊗ *value* pairs: *shape* ⊗ *circle* + *number* ⊗ *three* + *color* ⊗ *black* + *orientation* ⊗ *horizontal* + *shading* ⊗ *solid*, and so on, allowing us to encode arbitrary amounts of information. As descriptions become more detailed it is necessary to use more complex encodings; however, ultimately it does not matter to the inductive system how the VSA descriptions are implemented, as long as they encode the necessary information. Thus, these descriptions can be made as simple or as complex as desired without impacting the overall model.
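Under these definitions, a cell description can be encoded as a superposition of bound pairs and later queried by unbinding. A minimal sketch (the vocabulary words and 512-dimensional size are arbitrary choices):

```python
import numpy as np

def cconv(a, b):
    """Circular convolution (the binding operation)."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def approx_inverse(a):
    """Approximate inverse under circular convolution (Plate, 2003)."""
    return np.concatenate(([a[0]], a[:0:-1]))

rng = np.random.default_rng(seed=3)
n = 512
vocab = {w: rng.normal(0.0, 1.0 / np.sqrt(n), n)
         for w in ("shape", "circle", "number", "three", "color", "black")}

# Encode a cell as a superposition of attribute (x) value pairs.
cell = (cconv(vocab["shape"], vocab["circle"])
        + cconv(vocab["number"], vocab["three"])
        + cconv(vocab["color"], vocab["black"]))

# Query the cell: unbind "shape", then clean up the noisy result by
# finding the most similar word in the vocabulary.
noisy = cconv(cell, approx_inverse(vocab["shape"]))
best = max(vocab, key=lambda w: np.dot(noisy, vocab[w]))
```

The superposed pairs interfere with one another only slightly at this dimensionality, so the query still recovers the correct value.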

VSAs have a number of other advantages: They require fewer neural resources to represent than explicit image data, they are easier to manipulate mathematically, and perhaps most importantly the logical operation of the inductive system is not dependent on the details of the visual system. All that our neural model requires is that the Raven's matrices are represented in some structured vector form; the visual processing that accomplishes this, although a very difficult and interesting problem in itself (see Meo, Roberts, & Marucci, 2007 for an example of the complexities involved), is beyond the scope of the current model. This helps preserve the generality of the inductive system: The techniques presented here will apply to any problem that can be represented in VSAs, not only problems sharing the visual structure of the RPM.

#### 2.3. Neural encoding

Having described a method to represent the high-level problem in structured vectors, we now define how to represent those vectors and carry out the VSA operations in networks of simulated spiking neurons. There are several important reasons to consider a neural model. First, by tying the model to the biology, we are better able to relate the results of the model to the experimental human data, both at the low level (e.g., fMRI or PET) and at the high level (e.g., nondeterministic performance and individual differences). Second, our goal is to model human inductive processes, so it is essential to determine whether a proposed solution can be realized in a neural implementation. Neuroscience has provided us with an abundance of data from the neural level that we can use to provide constraints on the system. This ensures that the end result is indeed a model of the human inductive system, not a theoretical construct with infinite capacity or power.

We use the techniques of the Neural Engineering Framework (Eliasmith & Anderson, 2003) to represent vectors and carry out the necessary mathematical operations in spiking neurons. Refer to Fig. 2 throughout this discussion for a visual depiction of the various operations. To encode a vector *x* into the spike train of neuron *a*_{i} we define

$$a_i(x) = G_i\!\left[\alpha_i \langle e_i, x \rangle + J_i^{\text{bias}}\right] \tag{3}$$

with *G*_{i} a function representing the nonlinear neuron characteristics. It takes a current as input (the value within the brackets) and uses a model of neuron behavior to output spikes. In our model we use leaky integrate-and-fire neurons, but the advantage of this formulation is that any neuron model can be substituted for *G*_{i} without changing the overall framework. *α*_{i}, *J*_{i}^{bias}, and *e*_{i} are the parameters of neuron *a*_{i}. *α*_{i} is a gain on the input; it does not directly play a role in the encoding of information, but rather is used to provide variety in the firing characteristics of the neurons within a population. *J*_{i}^{bias} is a constant current arising from intrinsic processes of the cell or background activity in the rest of the nervous system; it plays a similar role to *α*_{i}, providing variability in firing characteristics. *e*_{i} represents the neuron's preferred stimulus, that is, which inputs will make it fire more strongly. This is the most important factor in the neuron's firing, as it is what truly differentiates how a neuron will respond to a given input. In summary, the activity of neuron *a*_{i} is a result of its unique response (determined by its preferred stimulus) to the input *x*, passed through a nonlinear neuron model in order to generate spikes.
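The encoding of Eq. 3 can be sketched in a rate-based simulation. This is a simplification: we compute each neuron's smooth firing rate rather than individual spikes, and all parameter ranges below are illustrative rather than fit to data:

```python
import numpy as np

def lif_rate(J, tau_ref=0.002, tau_rc=0.02):
    """Rate approximation of G_i for leaky integrate-and-fire neurons:
    the steady firing rate produced by a constant input current J."""
    J = np.asarray(J, dtype=float)
    rate = np.zeros_like(J)
    above = J > 1.0  # neurons below the threshold current stay silent
    rate[above] = 1.0 / (tau_ref + tau_rc * np.log1p(1.0 / (J[above] - 1.0)))
    return rate

rng = np.random.default_rng(seed=4)
d, N = 4, 50
e = rng.normal(size=(N, d))
e /= np.linalg.norm(e, axis=1, keepdims=True)  # preferred stimulus vectors e_i
alpha = rng.uniform(0.5, 2.0, N)               # gains alpha_i
J_bias = rng.uniform(0.0, 1.5, N)              # background currents J_i^bias

def encode(x):
    """a_i(x) = G_i[alpha_i <e_i, x> + J_i^bias] for every neuron at once."""
    return lif_rate(alpha * (e @ x) + J_bias)

x = np.array([0.5, -0.5, 0.5, -0.5])
rates = encode(x)  # one firing rate per neuron; neurons whose preferred
                   # stimulus aligns with x respond most strongly
```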

We can then define the decoding from spike train to vector as

$$\hat{x} = \sum_i \left[a_i(x) * h(t)\right] \varphi_i \tag{4}$$

where * denotes standard (not circular) convolution. This is modeling the current that will be induced in the postsynaptic cell by the spikes coming out of *a*_{i}. *a*_{i}(*x*) are the spikes generated in Eq. 3. *h* is a model of the postsynaptic current generated by each spike; by convolving that with *a*_{i}(*x*), we get the total current generated by the spikes from *a*_{i}. *φ*_{i} are the optimal linear decoders, which are calculated analytically so as to provide the best linear representation of the original input *x*; they are essentially a weight on the postsynaptic current generated by each neuron.
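The optimal linear decoders *φ*_{i} of Eq. 4 can be found by least squares over sample points. The sketch below stays in rate mode, so the filtered spike train *a*_{i}(*x*) ∗ *h* is replaced by a smooth rate; the population sizes, sampling, and regularization constant are illustrative choices:

```python
import numpy as np

def lif_rate(J, tau_ref=0.002, tau_rc=0.02):
    """Rate approximation of the LIF response G[J]."""
    J = np.asarray(J, dtype=float)
    rate = np.zeros_like(J)
    above = J > 1.0
    rate[above] = 1.0 / (tau_ref + tau_rc * np.log1p(1.0 / (J[above] - 1.0)))
    return rate

rng = np.random.default_rng(seed=5)
d, N = 2, 200
e = rng.normal(size=(N, d))
e /= np.linalg.norm(e, axis=1, keepdims=True)  # preferred stimulus vectors
alpha = rng.uniform(1.0, 3.0, N)               # gains
J_bias = rng.uniform(-1.0, 2.0, N)             # bias currents

def rates(X):
    """Population activity for each row of X (rate-mode stand-in for the
    postsynaptically filtered spike trains)."""
    return lif_rate(alpha * (X @ e.T) + J_bias)

# Solve for the decoders phi analytically: minimize ||A phi - X||^2 over
# sample points, with regularization for robustness to noise.
X = rng.uniform(-1.0, 1.0, size=(1000, d))
A = rates(X)
reg = (0.1 * A.max()) ** 2 * A.shape[0]
phi = np.linalg.solve(A.T @ A + reg * np.eye(N), A.T @ X)

x = np.array([0.3, -0.5])
x_hat = rates(x[None, :])[0] @ phi  # decoded estimate, close to x
```

Because the decoders are computed analytically, the representation error depends only on the population's tuning curves, not on any training procedure.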

We have defined how to transform a vector into neural activity and how to turn that neural activity back into a vector, but we also need to be able to carry out the VSA operations (binding and superposition) on those representations. One of the primary advantages of the NEF is that we can calculate the synaptic weights for arbitrary transformations analytically, rather than learning them. If we want to calculate a transformation of the form *z* = *C*_{1}*x* + *C*_{2}*y* (where *C*_{1} and *C*_{2} are arbitrary matrices), and *x* and *y* are represented in the *a* and *b* neural populations, respectively (we can add or remove these terms as necessary to perform operations on different numbers of variables), then we describe the activity in the output population as

$$c_k = G_k\!\left[\sum_i \omega_{ki}\, a_i + \sum_j \omega_{kj}\, b_j + J_k^{\text{bias}}\right] \tag{5}$$

where *c*_{k}, *a*_{i}, and *b*_{j} describe the activity of the *k*th, *i*th, and *j*th neuron in their respective populations. The *ω* are our synaptic weights: *ω*_{ki} = *α*_{k}⟨*e*_{k}, *C*_{1}*φ*_{i}⟩ and *ω*_{kj} = *α*_{k}⟨*e*_{k}, *C*_{2}*φ*_{j}⟩. Referring back to our descriptions of the variables in Eqs. 3 and 4, this means that the connection weight between neurons *a*_{i} and *c*_{k} is determined by the preferred stimulus of *c*_{k}, multiplied by the desired transformation and the decoders for *a*_{i}. To calculate different transformations, all we need to do is modify the *C* matrices in the weight calculations, allowing us to carry out all the linear computations necessary in this model. For a more detailed description of this process, and a demonstration of implementing the nonlinear circular convolution (Eq. 1), see Eliasmith (2005).
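In the NEF, each weight factors into the output neuron's gain and encoder, the desired transformation, and the input neuron's decoder. The sketch below builds such a weight matrix for a single input population and checks that driving the output population through the weights delivers exactly the same current as decoding *x* and re-encoding *C*_{1}*x*. The decoders here are stand-ins (in practice they would be computed as in Eq. 4), and the sizes and example transformation are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(seed=6)
d, N_a, N_c = 2, 150, 120

# Input population a: decoders phi (stand-ins for Eq. 4's optimal decoders).
phi = rng.normal(0.0, 0.01, size=(N_a, d))

# Output population c: encoding parameters, as in Eq. 3.
e_c = rng.normal(size=(N_c, d))
e_c /= np.linalg.norm(e_c, axis=1, keepdims=True)
alpha_c = rng.uniform(0.5, 2.0, N_c)

C1 = np.array([[0.0, -1.0],
               [1.0,  0.0]])  # example linear transformation (90-degree rotation)

# omega_ki = alpha_k <e_k, C1 phi_i>: gain x encoder x transformation x decoder.
W = (alpha_c[:, None] * e_c) @ C1 @ phi.T  # shape (N_c, N_a)

a = rng.uniform(0.0, 200.0, N_a)  # some activity vector for population a
J_weights = W @ a                 # current delivered through the weights
x_hat = phi.T @ a                 # equivalently: decode x from a ...
J_direct = alpha_c * (e_c @ (C1 @ x_hat))  # ... and re-encode C1 x in c
```

Changing the computation thus never requires new machinery, only a different *C* matrix folded into the same weight calculation.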