• Open Access

Consciousness and the structuring property of typical data



The theoretical base for consciousness, in particular, an explanation of how consciousness is defined by the brain, has long been sought by science. We propose a partial theory of consciousness as relations defined by typical data. The theory is based on the idea that a brain state on its own is almost meaningless but in the context of the typical brain states, defined by the brain's structure, a particular brain state is highly structured by relations. The proposed theory can be applied and tested both theoretically and experimentally. Precisely how typical data determines relations is fully established using discrete mathematics. © 2012 Wiley Periodicals, Inc. Complexity, 2012


In neuroscience, the neural correlates of consciousness provide an important empirical base for consciousness but not a theoretical one. To clarify, a theoretical base is a predictive theory that is free from empirical methodology while usually appealing to, and revealing aspects of, the innate mathematical properties of what is being studied. In contrast, the neural correlates of consciousness at some stage rely on obtaining information about a person's experience by asking them or by considering their sensory input. Subsequently, a given experience can be associated with the aspects of a person's neurological state that are always observed for that experience. To exemplify the difference, compare Newtonian mechanics with astronomical predictions based on astronomical tables. Importantly, it is expected that the neural correlates of consciousness alone cannot provide a satisfactory explanation of consciousness, because this would invoke some unknown agency that can discover the external cause of a particular neurological state within the brain so as to associate that state with an appropriate experience. Hence, an important requirement of a theoretical base for consciousness is that it should avoid the use of any prior knowledge of what stimulates the senses. We should expect the brain itself to fully define conscious experience, albeit having been stimulated by the senses. To assess whether a particular theory meets this requirement, we also need a clear notion of what consciousness is. Although consensus in this regard will be hard to come by, it can be argued that one fundamental aspect of consciousness is the role played by relations, such as those that define geometric content, or the individuality of objects, their relationships, and their type, such as visual or auditory. We, therefore, postulate that our conscious experience could largely be a mathematical structure defined by relations. 
In this case, the principle underlying how the brain simultaneously defines all the required relations is needed. For example, as the part of conscious experience that correlates with the state of the primary visual cortex is of a metric space viewed from a particular position, we expect that the primary visual cortex ought to define relations between neurons, or other identifiable nodes, that result in a metric space. This article proposes a theory that may satisfy these requirements while being theoretically and experimentally amenable to the scientific method. Of course, the scientific literature does already include important contributions toward establishing a theoretical base for consciousness. Perhaps the most prominent of these is the theory of consciousness as integrated information proposed by Tononi [1]. Tononi had previously worked with Gerald Edelman, the Nobel Prize-winning immunologist and later neuroscientist. Together, they wrote a book entitled A Universe of Consciousness, [2], which provides significant scientific insight toward an account of consciousness. However, whilst the importance of relations is evident in their work, their emphasis does not suggest how the content of consciousness might be defined by the brain. A review of the book by Ascoli, [3], points out that the authors focus on the properties of the neural process, such as integrated activity in the highly reentrant dynamic core, where the dynamic core is a large part of the thalamocortical system, and also on the properties of consciousness, such as unity, privateness, coherence, and informativeness. In Ascoli's view, the book does not address the question of why a sensation corresponds to a specific state of the dynamic core as opposed to another one. In this respect, I support the view that the relations defined by the brain are important. 
It can be seen from [4] that the brain defines relationships between certain patterns of activity occurring in various sensory regions of the brain. For example, for a given pattern of activity in the visual cortex, we can ask whether it is typical for another particular pattern of activity to be present at the same time in the auditory cortex. If so, then the given pattern is related to the latter pattern. Consider how such a relationship might be contributing to the experience of seeing a picture of Albert Einstein while hearing the name Albert as opposed to hearing the noun apple. For now, the experience associated with a particular pattern of activity may be known from the neural correlates of consciousness. However, the relationships that the brain defines between patterns allow more to be derived about a person's experience than that associated with the patterns in the sensory regions of the brain alone. Hence, we should try to move down from this higher semantic level, replacing neural correlates of consciousness with derivations involving relations as we go, if possible. I do not, however, doubt the enduring relevance and importance of Edelman and Tononi's work, such is the knowledge and insight it provides.

The mathematics in this article is straightforward, involving binary relations, matrix tables, and a small amount of graph theory. The relevance of such mathematics for the brain has been noticed before, particularly in the study of anatomical and functional connectivity, [5], which serves a different, and yet associated, purpose to that of this article concerning consciousness.

We will start by considering the following properties of the brain that are available for consciousness, noting that the list is not intended to be exhaustive:

  • 1. the brain has a large number of identifiable nodes, by which we mean neurons in this article, but more generally possibly cortical columns;
  • 2. the brain is capable of a large number of states, where a brain state is a possible and probable aggregate state of all the brain's nodes;
  • 3. to some extent, there is some type of ordering on the collection of brain states, because the brain has some of the properties of an endofunction, albeit under perturbation by the senses.

In this article we will mainly be considering properties 1 and 2 of the above. In this respect, Definition 1 will be useful where, when applied to the brain, the elements of S are the neurons. Merely to keep things simple, we will mainly restrict ourselves to nodes that have a two-state repertoire.

Definition 1.

Let S be a nonempty finite set, n := #S. Then a set, for an arbitrary index label i,

Si := {(a, fi(a)) : a ∈ S, fi : S → {0,1}}   (1)

will be called a data element for S. The set of all data elements for S is denoted ΩS, so that #ΩS = 2^n. If a particular subset T ⊆ ΩS has been associated with S then we will call T the typical data for S. Further, in such cases, we will refer to S as the carrier set. An element Si ∈ T will be called a typical data element.

Before we consider the brain, the following motivating example will be useful.

Example 1 We will consider what could appropriately be called: The definitive player problem. The purpose of this simple example is to introduce the idea that typical data can define a structure on a carrier set which in turn gives an interpretation of each typical data element. Consider a library of compact disks and suppose that these disks have all been made to a generic template in the sense that the locations of the bits, either 0 or 1, are the same for all disks. Further suppose that the disks all produce highly structured output on some standard player which always reads off the bits in the same order relative to the generic template. In the language of Definition 1, the generic template is the carrier set S and the library is the typical data T. Now suppose we have two of these disks S1, S2 ∈ T where, on the standard player, S1 is Beethoven and S2 is Elgar. On some nonstandard player, where the order in which the bits are read is different to the standard player, it could be that S1 is Mozart and S2 is something else, possibly white noise, depending on the reading order. Therefore, a single disk on its own is almost meaningless. However, by requiring highly structured output, each disk Si in the library defines a subset of the set of all players. By taking the intersection of all these subsets, we will be left with relatively few players including the standard player. If the library is large enough and we could measure how structured an output is then the typical data might determine a definitive player and hence, in the context of the library, S1 is Beethoven and S2 is Elgar.

The definitive player in this example is essentially a relation between the bits on the generic disk template, that is, the carrier set, such that almost every bit is related to two other bits so as to form a sequence up to a choice of direction. When a disk from the library is played on the definitive player the output has relatively few abrupt transitions in output frequency and so there is some similarity between the relation on the carrier set and what is written on the disks.

We finish this example by mentioning that there are plenty of different choices of typical data, that is, libraries, available and in particular many more than there are players. If there are n bit locations on the generic disk template, so that #S = n, then there are n! different players, by which we mean n! different sequences of these bit locations. Further, the number of different disks that can be written is 2^n, that is, #ΩS = 2^n. Therefore, the number of different subsets of ΩS is 2^(2^n), and it is straightforward to show by induction that 2^(2^n) > n! for all n ∈ ℕ.
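The counting claim at the end of this example is easy to check numerically. The following sketch (an illustration, not part of the original article) compares the three counts for small n.

```python
from math import factorial

def counts(n):
    # For a generic disk template with n bit locations there are:
    #   n!        players (sequences of the bit locations),
    #   2^n       writable disks, i.e. #Omega_S, and
    #   2^(2^n)   libraries (subsets of Omega_S).
    return factorial(n), 2 ** n, 2 ** (2 ** n)

# Libraries vastly outnumber players even for small n.
for n in range(1, 8):
    players, disks, libraries = counts(n)
    assert libraries > players
```

For n = 4 this gives 24 players, 16 disks, and 65536 possible libraries.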

In the next section, we will see that the appropriate relation to put on the carrier set, if unique, is explicitly determined by the typical data itself. Suppose in Example 1 that instead of the data points on the disks having a two-state repertoire, bits, there were as many states as output frequencies or that the nodes on the generic disk template are the bytes instead of the bits. Then, the theory in the next section would apply to Example 1, and there would not be a problem concerning how to measure the quantity of structure of an output. Moreover, toward the end of this article, we will argue that the theory presented solves what is known as the binding problem.


We will refer to Table 1 several times in this section. In Table 1, the carrier set has four elements, S = {a,b,c,d}. There are 24 different sequences, that is, one-dimensional arrangements, of the elements of S, and these appear in the column headings of the table. There are 16 different binary data elements for S, and each row of Table 1 gives a particular data element under the 24 different one-dimensional arrangements. Now let T := {S5,S10,S13} be the set of typical data for S. Let us try to arrange the elements of S in a way that achieves something similar to that exemplified by the definitive player problem. We can consider which sequence, or other arrangement, of the elements of the carrier set gives the most structured, transition-free, interpretation of the typical data elements. The sequence acdb and its reverse bdca satisfy this requirement, because under these arrangements, for each typical data element, the zeros and ones are unmixed. In the sequel, we introduce relations to show how the typical data determines the structure on the carrier set. As this structure is given by a symmetric relation, as opposed to an antisymmetric relation in the case of total orders, the problem of whether T gives acdb or bdca as the definitive arrangement of the carrier set will be solved. We begin with the following standard definitions that will be particularly useful here.

Table 1. One-Dimensional Arrangements of Four-Bit Data Elements
inline image

Definition 2

Let S be a nonempty set. A binary relation on S is a subset R ⊆ S2 where S2 := {(a,b) : a ∈ S, b ∈ S}. For a,b ∈ S we say that a is R-related to b, and write aRb, precisely when (a,b) ∈ R. We say that R is:

  • 1. reflexive if (a,a) ∈ R for all a ∈ S;
  • 2. symmetric if for every (a,b) ∈ R we also have (b,a) ∈ R;
  • 3. antisymmetric if for every pair of distinct elements a,b ∈ S at most one of (a,b) and (b,a) is an element of R;
  • 4. transitive if for every triple of elements a,b,c ∈ S with (a,b) ∈ R and (b,c) ∈ R we also have (a,c) ∈ R;
  • 5. an equivalence relation if R is reflexive, symmetric, and transitive.
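These properties can be checked mechanically for finite relations. The sketch below (an illustration, not from the original text) encodes a relation as a set of ordered pairs.

```python
def is_reflexive(R, S):
    return all((a, a) in R for a in S)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    # For distinct a, b at most one of (a,b) and (b,a) may lie in R.
    return all(a == b or (b, a) not in R for (a, b) in R)

def is_transitive(R):
    return all((a, c) in R for (a, b) in R for (b2, c) in R if b == b2)

def is_equivalence(R, S):
    return is_reflexive(R, S) and is_symmetric(R) and is_transitive(R)

S = {"a", "b", "c", "d"}
# S^2 is the equivalence relation on S with a single equivalence class.
S2 = {(a, b) for a in S for b in S}
```

For example, `is_equivalence(S2, S)` holds, whereas a symmetric relation forming a path on S is not transitive.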

There is a strong connection between the theory of relations on a set and graph theory. In the following definition, we use some graph theory terminology.

Definition 3

Let S be a nonempty finite set and R ⊆ S2 a symmetric relation on S. For a,b ∈ S a walk from a to b, if one exists, is a finite sequence (ki)i∈{1,…,n}, where n ∈ ℕ is odd, such that:

  • k1 = a and kn = b;

  • we have ki ∈ S if i is odd and ki ∈ R if i is even;

  • for i even we have ki = (ki-1,ki+1).

For a,b ∈ S let Ka,b denote the set of all walks from a to b. The R-distance between two elements a,b ∈ S is

dR(a,b) := min{(n − 1)/2 : (ki)i∈{1,…,n} ∈ Ka,b} if Ka,b ≠ ∅, and dR(a,b) := ∞ otherwise.   (2)

Lemma 1

As R is symmetric, the R-distance dR defined in Definition 3 is either a metric or an extended metric on S. By extended metric we mean a metric that is also allowed to take the value ∞, that is, one taking values in [0, ∞].

Proof: One checks the four standard metric axioms.

Remark 1

Let S be a nonempty finite set and n := #S. Then, S2 is the equivalence relation on S with just one equivalence class. Although the graph diagram of a graph need not be unique, by applying uniformity principles for the lengths of edges and angles between adjacent edges, many graph diagrams are unique. For example, the graph diagram of S with the relation S2 is given by the edges and vertices, nodes, of the (n − 1)-dimensional regular simplex; for example, for n = 4 the simplex is a tetrahedron.

In the sequel, the following metric will also be useful.

Lemma 2

Let S be a nonempty finite set and let 2^(S2) be the set of all binary relations on S, noting that this is the power set of S2. Then,

dΔ(R, R′) := #(RΔR′)   (3)

is a metric on 2^(S2) where RΔR′ := (R ∪ R′)\(R ∩ R′) is the symmetric difference of R and R′. We call dΔ the symmetric difference metric on 2^(S2).

Proof: Standard for S2 finite.
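Both metrics are straightforward to compute for small sets. Here is an illustrative sketch (not from the article), with the R-distance obtained by breadth-first search over the graph of the relation; the path relation used for testing corresponds to the sequence acdb discussed at the start of this section.

```python
def d_delta(R1, R2):
    # Symmetric difference metric of Lemma 2: the number of ordered
    # pairs belonging to exactly one of the two relations.
    return len(set(R1) ^ set(R2))

def d_R(R, a, b):
    # R-distance of Definition 3: the least number of edges in a walk
    # from a to b in the graph of the symmetric relation R, found by
    # breadth-first search; infinity (extended metric) if no walk exists.
    if a == b:
        return 0
    dist = {a: 0}
    frontier = [a]
    while frontier:
        nxt = []
        for u in frontier:
            for (x, y) in R:
                if x == u and y not in dist:
                    dist[y] = dist[u] + 1
                    nxt.append(y)
        frontier = nxt
    return dist.get(b, float("inf"))

# A path relation a-c-d-b, written as a symmetric set of ordered pairs.
path = {("a","c"),("c","a"),("c","d"),("d","c"),("d","b"),("b","d")}
```

For the path relation, d_R("a","b") is 3, while elements outside the graph are at extended distance ∞.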

The following example shows how typical data determines a structure on the carrier set.

Example 2

With reference to Table 1, again let T = {S5,S10,S13} be the set of typical data for S = {a,b,c,d}. With reference to Definition 1, we note that each typical data element Si = {(a, fi(a)) : a ∈ S, fi : S → {0,1}} defines an equivalence relation on S of the form

R(Si) := {(a,b) ∈ S2 : fi(a) = fi(b)}   (4)

Hence, from T we obtain the relation tables in Figure 1. Note that for numerical cell values one uses 1 − |fi(a) − fi(b)| for a,b ∈ S.

Figure 1.

The relation tables defined by the elements of T.

Now, we aggregate the relation tables in Figure 1 into a single weighted relation table RT by calculating the mean number of dots per table cell. Hence, for a,b ∈ S, RT shows the proportion of equivalence relations defined by the elements of T that have a related to b. The table RT is shown in Figure 2. Now, for a threshold value of 0.5, we round the cell values of RT such that values greater than 0.5 are rounded up to 1 and values less than or equal to 0.5 are rounded down to 0. This results in the relation RS. We note that a relation obtained in this way will always be symmetric, but in general it need not be transitive. In particular, RS is not transitive, but as it is symmetric, it defines a metric or an extended metric on S by Lemma 1. Hence, we will refer to S with RS, defined by T, as the carrier space.

Figure 2.

The structure on the carrier set S determined by T.

The graph diagram of S with the relation RS is given by GS in Figure 2. Arguably, GS is one dimensional, and we note that it agrees with our discussion at the beginning of Section 2 since, being an undirected graph, it does not distinguish between the sequence acdb and its reverse bdca.
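The construction of RS can be carried out in a few lines. Since Table 1 is not reproduced here, the bit patterns below for S5, S10, and S13 are an assumption, chosen to be consistent with the text (each is unmixed under the sequence acdb); the construction itself follows Example 2 exactly.

```python
S = ["a", "b", "c", "d"]

# Assumed bit patterns for the typical data elements (Table 1 is not
# reproduced here; each pattern is unmixed when read in the order a,c,d,b).
T = [
    {"a": 0, "b": 1, "c": 0, "d": 0},   # S5:  acdb -> 0001
    {"a": 1, "b": 0, "c": 1, "d": 0},   # S10: acdb -> 1100
    {"a": 0, "b": 1, "c": 1, "d": 1},   # S13: acdb -> 0111
]

def relation(f):
    # Eq. (4): the equivalence relation R(S_i) defined by a data element.
    return {(a, b) for a in S for b in S if f[a] == f[b]}

# Weighted relation R_T: the proportion of typical data elements whose
# equivalence relation has a related to b.
R_T = {(a, b): sum((a, b) in relation(f) for f in T) / len(T)
       for a in S for b in S}

# Rounding with threshold 0.5 gives the symmetric relation R_S.
R_S = {pair for pair, weight in R_T.items() if weight > 0.5}
```

With these assumed patterns, R_S comes out as the path a–c–d–b together with the diagonal, in agreement with the definitive arrangement acdb; it is symmetric but not transitive.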

We note that as the theory develops it might be useful to retain the weighted relation RT instead of only working with RS. In particular, one can obtain a hierarchy of relations from RT by varying the rounding threshold. However, there are good reasons for choosing a rounding threshold of 0.5. In particular, RS is such that the mean of the distances between RS and the elements R(Si) obtained from T is minimized, that is,

(1/#T) ∑Si∈T dΔ(RS, R(Si)) = minR∈2^(S2) (1/#T) ∑Si∈T dΔ(R, R(Si))   (5)

In general, RS need not be unique in this respect if the value 0.5 appears in the relation table for RT. We will shortly relate RS to something we will call float entropy that also supports a rounding threshold of 0.5.

The following example uses typical data, which defines a structure on the carrier set that is not one dimensional.

Example 3

With reference to Table 1, let T′ = {S6,S9,S16} be the set of typical data for S′: = S where S is the carrier set of Example 2. Following the theory introduced in Example 2 gives the results presented in Figure 3.

Figure 3.

The structure on the carrier set S′ determined by T′.

2.1. Float Entropy

In this short subsection, we will discuss the notion of float entropy. Let S be a carrier set, T ⊆ ΩS the typical data of S, and R a relation on S. Suppose we consider T to be the set of possible messages that can be sent to a receiver. In standard information theory, the receiver would also have a copy of T, so that sending a message only involves sending enough information to identify the intended element. Instead of this, suppose that the receiver only has a copy of S and R. For Si ∈ T, if the relation R(Si) is relatively close to R with respect to dΔ, then the number of bits that need to be sent to the receiver to specify Si will be relatively small. In this case, Si is highly compressible, carries little information, and is highly structured relative to R. We summarize this situation by saying that Si has low float entropy relative to R. The extreme case of minimum float entropy occurs when R(Si) = R, which is only possible if R is itself an equivalence relation. With reference to Definition 1, we can quantify float entropy relative to a given relation R as follows,

fe(R, Si) := log2(#{Sj ∈ ΩS : dΔ(R, R(Sj)) ≤ dΔ(R, R(Si))})   (6)

This is a measure in bits of the amount of information required to specify Si under the assumption that what is being specified ought to be highly structured relative to R. We can consider some values for Examples 2 and 3. Recall that in Example 2, we have T = {S5,S10,S13} and in Example 3 T′ = {S6,S9,S16}. For RS from Example 2, we have fe(RS,S10) = 1 and fe(RS,S5) = fe(RS,S13) = 2.58 to two decimal places, whereas in contrast fe(RS,S9) = 4. We will denote the mean of the float entropies for the elements of T with respect to RS by fe(RS,T) and extend this notation to T′ and RS′ from Example 3 accordingly. Working to 2dp throughout gives fe(RS,T) = 2.06 and fe(RS′,T′) = 2.58, whereas fe(RS,T′) = 3.55 and fe(RS′,T) = 3.87. Hence, we see that the relations obtained by the method shown in the examples are, relative to their respective typical data, a good choice to minimize the mean float entropy.
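The float entropy values quoted above can be reproduced numerically. As before, Table 1 is not reproduced here, so the bit patterns below are assumptions chosen to be consistent with the reported values; Eq. (6) is implemented directly.

```python
from itertools import product
from math import log2

S = ["a", "b", "c", "d"]

def relation(f):
    # Eq. (4): the equivalence relation defined by a data element.
    return frozenset((a, b) for a in S for b in S if f[a] == f[b])

def d_delta(R1, R2):
    return len(R1 ^ R2)  # symmetric difference metric, Eq. (3)

# All 16 binary data elements of Omega_S.
omega = [dict(zip(S, bits)) for bits in product((0, 1), repeat=len(S))]

def fe(R, f):
    # Eq. (6): log2 of the number of data elements whose relation is at
    # least as close to R as that of f.
    d = d_delta(R, relation(f))
    return log2(sum(d_delta(R, relation(g)) <= d for g in omega))

# R_S from Example 2: the path a-c-d-b plus the diagonal.
R_S = frozenset({(x, x) for x in S}
                | {("a","c"),("c","a"),("c","d"),("d","c"),("d","b"),("b","d")})

# Assumed bit patterns (Table 1 not reproduced; S9 is also an assumption).
S5, S10, S13 = ({"a": 0, "b": 1, "c": 0, "d": 0},
                {"a": 1, "b": 0, "c": 1, "d": 0},
                {"a": 0, "b": 1, "c": 1, "d": 1})
S9 = {"a": 1, "b": 0, "c": 0, "d": 1}
```

Under these assumptions, the code reproduces fe(RS,S10) = 1, fe(RS,S5) = fe(RS,S13) ≈ 2.58, fe(RS,S9) = 4, and the mean fe(RS,T) ≈ 2.06.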

Now let S be the set of neurons of a brain and T the set of brain states, where a brain state is a possible and probable aggregate state of all the brain's neurons. If we are trying to approximate T, then ideally T will be selected such that, as a random variable restricted to T, the brain has a uniform distribution over T. Further, ideally, T should be large enough so that the probability of the brain being in a state that is close to at least one of the elements of T is high. Under these conditions, we note, by Eq. (5), that setting R := RS is a good choice to minimize the expected float entropy.

In the next section, we will to some extent consider the possible relevance of the theory in Section 2 to the brain. We will also extend the theory to what we will call objects.


Although our theory is to be considered for typical data elements of the state of the whole brain, we begin this section by considering the relevance of the theory to the primary visual cortex, V1. Associating the retina with the unit disk of the complex plane and similarly embedding the flattened cortical sheet of V1 into the complex plane, we note that the retino-cortical mapping to V1 on a given side of the brain is approximately logarithmic and is therefore far from being an isometry [6, 7]. Hence, the geometry of V1 cannot account for the perceived geometry of monocular vision. Furthermore, the right side of each retina is mapped to the right side of the brain, whereas the left side of each retina is mapped to the left side of the brain. Hence, the signals from a given retina go to two different brain areas. Despite this, the perceived geometry produces a seamless isometric version of the image on the retina. Such facts underline the need for a theory such as that initiated in this article, as we need to explain how perceived geometry is defined by the brain.

Let S be the set of neurons in V1. Further, let a′ and b′ be two distinct points that are fixed relative to the eye in a person's field of view as depicted in Figure 4.

Figure 4.

Two fixed points in a person's field of view.

Let a be a neuron in V1 that is stimulated by the retina when there is stimulation of the retina from a′. Similarly, let b be a neuron in V1 that has the same relationship with b′. Consider the typical data T for V1. We note that abrupt transition lines between light and dark or regions of different color are relatively sparse in the field of view. In a somewhat simplified analysis, suppose that there are usually no more than n abrupt transition lines in the field of view. As depicted in Figure 4, let l be the length of the line through a′ and b′ crossing the field of view and d the viewable distance between a′ and b′. Suppose that all n transition lines intersect the line through a′ and b′, at positions that are independently and uniformly distributed along its length. Then, the probability Pn that there is one or more transition lines between a′ and b′ is

Pn = 1 − (1 − d/l)^n   (7)

We note that limd→0Pn = 0. Hence, if d is small, then a will be in the same state as b in the majority of the typical data elements of T. On the other hand, if d is large, then arguably a and b will rarely be in the same state. Therefore, the relation RS on S defined by T ought to correspond well with the structure of the field of view. This claim is supported below by the results of a study using digital photographs to test how well the theory establishes relative pixel positions.
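Equation (7) follows from the assumption that the transition lines cross the line through a′ and b′ at independent uniform positions; a quick Monte Carlo sketch (illustrative, not from the article) confirms it.

```python
import random

def p_n(n, d, l):
    # Eq. (7): probability that at least one of n transition lines falls
    # between a' and b', under the uniform-position assumption.
    return 1 - (1 - d / l) ** n

def p_n_sim(n, d, l, trials=100_000, seed=0):
    # Direct simulation of the same model: each line crosses at an
    # independent uniform position along the length-l line.
    rng = random.Random(seed)
    hits = sum(any(rng.uniform(0, l) < d for _ in range(n))
               for _ in range(trials))
    return hits / trials
```

For example, p_n(3, 2, 10) = 0.488, and the simulated estimate agrees to within sampling error; as d → 0 the probability vanishes, as noted above.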

First though, we note that evidence has been found for V1 that supports the Bienenstock, Cooper, and Munro (BCM) version of Hebbian theory [8, 9]. Hebbian theory implies that if a′ and b′ are close together then stimulation of a and stimulation of b from within V1 ought to usually happen together. Therefore, the typical data is typical of the states that V1 can internally generate by itself. Hence, V1 defines RS and by doing so it defines the interpretation of the current state of V1. Although this is the case in theory, further investigation is required when the full complexity of the visual system is considered.

A study was conducted using 105 digital photographs taken of everyday scenes using the same seven megapixel digital camera. A computer program centered a 5 × 5 grid of sampling points over each photograph and recorded to which brightness class each point belonged. Here, the grid points are the elements of S, whereas an element of T is given by the values obtained for one of the photographs, so that #T = 105. Two parameters are involved: the first is the spacing, in pixels, between adjacent grid points, and the second is the number of brightness classes used. The second parameter is, therefore, the node repertoire and, apart from the fact that the repertoire was not restricted to two, everything proceeded as per Examples 2 and 3. Results showed that RS was close, with respect to dΔ, to the relation for the grid provided that the parameters used corresponded to a point on the curve in Figure 5. Now, suppose we enumerate the elements of T from 1 to 105 and calculate RS after the first n elements for n ∈ {1,5,10,15,…,105}. Figure 6 shows how the acquired relation converged toward the relation for the grid as n increased. The parameters used for Figure 6 are indicated by the point p in Figure 5. Further, Figure 7, left, shows the graph diagram of the relation for the grid and, right, the edges given by the relation RS for n = 105. Clearly, convergence would be obtained for large enough #T. This works because, while the content of the world around us is very varied, it is nevertheless highly structured relative to the underlying geometry of the space. Brightness classes were used in the study so that the nodes, the grid points, would represent neurons in V1 that respond to rod cells in the retina. We should note that the rod cells are arranged more in the form of a hexagonal lattice than a grid. Further, it would be interesting to repeat this study with each grid point split into three separate nodes, giving one for each cone cell type, so that #S = 75. 
The cone cells respond either to red, green, or blue. The resulting relation RS may suggest a solution to the binding problem for color perception. Finally, we should consider what might determine the repertoire of a neuron. The brain itself should define this. For example, if a small change in the output frequency of a neuron has no effect on the system, then with respect to the system the neuron's state is the same. Similarly, if switching over the outputs of two different neurons would have no effect on the system, then with respect to the system the neurons are in the same state. This last point is just a suggestion. Note that such a definition of relative node state may result in the relation R(Si) for Si ∈ T no longer being transitive. We will now move on to our discussion concerning objects.

Figure 5.

Established parameter options.

Figure 6.

Convergence to the relation for the grid.

Figure 7.

The grid edges compared with the edges given by RS.
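The photograph study can be imitated with synthetic data. The sketch below is an illustration only; the original used 105 photographs and a 5 × 5 grid, whereas here photographs are replaced by random smooth brightness gradients on a 3 × 3 grid, which suffices to show that nearby grid points end up more strongly related in the weighted relation than distant ones.

```python
import random

random.seed(1)

GRID = 3        # 3 x 3 grid of sampling points (the carrier set S)
CLASSES = 4     # number of brightness classes (the node repertoire)
N_IMAGES = 300  # synthetic stand-ins for the photographs

S = [(i, j) for i in range(GRID) for j in range(GRID)]

def synthetic_image():
    # A random linear brightness gradient: smooth, so nearby points tend
    # to fall into the same brightness class, as in natural scenes.
    gx, gy = random.uniform(-0.3, 0.3), random.uniform(-0.3, 0.3)
    c = random.uniform(0, 1)
    def cls(p):
        v = min(max(gx * p[0] + gy * p[1] + c, 0.0), 0.999)
        return int(v * CLASSES)
    return {p: cls(p) for p in S}

T = [synthetic_image() for _ in range(N_IMAGES)]

# Weighted relation R_T: fraction of images in which two points share a class.
R_T = {(p, q): sum(img[p] == img[q] for img in T) / N_IMAGES
       for p in S for q in S}

def mean_weight(dist):
    # Mean R_T weight over ordered pairs at a given Manhattan distance.
    pairs = [(p, q) for p in S for q in S
             if abs(p[0] - q[0]) + abs(p[1] - q[1]) == dist]
    return sum(R_T[pq] for pq in pairs) / len(pairs)
```

Thresholding R_T at 0.5 would then recover an approximation of the grid adjacency relation, in the spirit of Figures 6 and 7.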

3.1. Relations between Objects Defined by Typical Data

We start this subsection with a definition.

Definition 4

Let S be a nonempty finite set with typical data T and the relation RS defined on S by T. Let X be some other finite set with #X ≤ #S. We say that

Xj := {(a, xj(a)) : a ∈ X, xj : X → {0,1}}   (8)

is an object of S if there is some Si ∈ T, Si = {(a, fi(a)) : a ∈ S, fi : S → {0,1}}, with relation

R(Si) := {((a, fi(a)), (b, fi(b))) : a,b ∈ S, fi(a) = fi(b)}   (9)

and an injective map Λji : Xj → Si, given by Λji((a,xj(a))) := (λji(a), fi(λji(a))) where λji(a) ∈ S, such that for all (a,xj(a)), (b,xj(b)) ∈ Xj we have:

  • 1. xj(a) = fi(λji(a));
  • 2. ((a,xj(a)),(b,xj(b))) ∈ R(Xj) if and only if (Λji((a,xj(a))), Λji((b,xj(b)))) ∈ R(Si), where R(Xj) is the relation on Xj defined analogously to (9).

We say that the object Xj embeds into Si and denote the set of all objects of S by Ob(S).

We will now show that typical data T defines a relation ROb on the set of objects of S as follows. For Xj ∈ Ob(S) let TXj := {Si ∈ T : Xj embeds into Si}. Note, by Definition 4, that TXj is not empty. Now, the relation ROb is given by

(Xj, Xk) ∈ ROb if and only if #(TXj ∩ TXk)/#TXj > 0.5   (10)

We note that in general ROb need not be symmetric or transitive and that it is the relation obtained by applying a rounding threshold of 0.5 to the weighted relation on Ob(S), with cell values #(TXj ∩ TXk)/#TXj, given in Figure 8.

Figure 8.

The weighted relation table for ROb on Ob(S) determined by T.

Similar to the situation in Example 2, one can obtain a totally ordered hierarchy of relations on Ob(S) by varying the rounding threshold applied to the weighted relation of Figure 8. Turning our attention to the topic of float entropy that we began in Subsection 2.1, we note that if the receiver not only has a copy of S and RS but also has a copy of Ob(S) and ROb then the elements of T should be even more compressible and are even more structured relative to the relations available to the receiver. Finally, we note that the theory in this article easily generalizes to cases where the neurons, or other nodes, have more than a two-state repertoire, that is, we can allow fi to take more than two values in the definition of a data element Si given in Definition 1. In this case, one also makes a similar adjustment to the definition of an object Xj of S.
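For small carrier sets, embeddings and the object relation can be checked by brute force. In the sketch below (illustrative; the object names and bit patterns are assumptions, reusing the patterns assumed for Example 2), the relation-preservation condition of Definition 4 follows automatically from value preservation, since both relations are of the "same value" form.

```python
from itertools import permutations

S = ["a", "b", "c", "d"]

# Assumed typical data (Table 1 not reproduced; patterns as in Example 2).
T = {
    "S5":  {"a": 0, "b": 1, "c": 0, "d": 0},
    "S10": {"a": 1, "b": 0, "c": 1, "d": 0},
    "S13": {"a": 0, "b": 1, "c": 1, "d": 1},
}

def embeds(xj, si):
    # Definition 4 by brute force: search for an injective map lambda from
    # the object's nodes into S that preserves node values.  For "same
    # value" relations, value preservation also preserves the relations.
    X = list(xj)
    for image in permutations(S, len(X)):
        lam = dict(zip(X, image))
        if all(xj[a] == si[lam[a]] for a in X):
            return True
    return False

def T_of(xj):
    # T_Xj: the typical data elements into which the object embeds.
    return {name for name, si in T.items() if embeds(xj, si)}

def related(xj, xk):
    # Eq. (10): Xj is related to Xk when the majority of the typical
    # elements containing Xj also contain Xk.
    txj, txk = T_of(xj), T_of(xk)
    return len(txj & txk) / len(txj) > 0.5

# Two hypothetical objects: a single node with value 0, and a pair of
# distinct nodes both with value 0.
X1 = {"p": 0}
X2 = {"p": 0, "q": 0}
```

Here X2 embeds into S5 and S10 but not S13 (which has only one node with value 0), and the two objects come out mutually related under Eq. (10).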


There are different ways in which this theory can be developed. From a purely theoretical perspective, it is interesting to establish the range of structures that can be defined by typical data comprised of comparable nodes, noting for example that functions can be defined by relations. This general theory can then be applied to any dynamical system comprised of comparable nodes, for example, networks. More practically, the wealth of established knowledge concerning brain function offers an interdisciplinary approach to theoretical development. Furthermore, the theory needs to be tested. In this respect, functional magnetic resonance imaging (FMRI) with high spatial resolution and other brain imaging technologies could be used. For example, FMRI has already been used as a way of obtaining information about the state of V1 that is sufficient for image reconstruction [10]. However, due to spatial distortion of the retino-cortical mapping and restricted FMRI voxel resolution, and perhaps other factors, it is not possible to recognize the viewed stimulus from FMRI images directly. Reconstruction often uses methods from linear mathematics and probability where knowledge of the visual stimulus used is necessary during the setup stage. Taking the elements of S to be the voxels covering V1, it is interesting to know whether typical data would give rise to a geometric relationship between the voxels, differing from their FMRI image positions, such that the viewed stimulus would be recognizable from the repositioned voxels. Two methods could be tried when establishing the geometry on S. The first would follow the theory as presented in Section 3. For the second, the distances between the voxels could be obtained from the map d : S2 → [0,1], d(a,b) := 1 − RT(a,b). In both cases, each relation R(Si) should have numerical cell values, because the similarity of voxel states can be quantified in the range 0–1. 
One type of visual stimulus to try would have a single transition line placed at random in the field of view.
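The second proposed method can be sketched directly: given a weighted relation table for the voxels (the values below are hypothetical; in practice they would be estimated from FMRI typical data as in Section 3), the distances follow from d(a, b) = 1 − RT(a, b).

```python
# Hypothetical weighted relation values for three voxels.
voxels = ["v1", "v2", "v3"]
R_T = {
    ("v1", "v1"): 1.0, ("v2", "v2"): 1.0, ("v3", "v3"): 1.0,
    ("v1", "v2"): 0.8, ("v2", "v1"): 0.8,
    ("v2", "v3"): 0.7, ("v3", "v2"): 0.7,
    ("v1", "v3"): 0.4, ("v3", "v1"): 0.4,
}

# d(a, b) := 1 - R_T(a, b); strongly related voxels come out close together.
d = {(a, b): 1.0 - R_T[(a, b)] for a in voxels for b in voxels}
```

The map d could then be fed to a standard embedding method to reposition the voxels geometrically.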

4.1. Conclusion

We have already mentioned in Section 3 that the BCM version of Hebbian theory provides evidence of how the brain itself defines typical data. We mentioned the evidence in the case of the primary visual cortex V1, but there is also evidence for the relevance of BCM theory regarding the hippocampus [11]. In particular, the typical data that V1 defines should be typical of the states induced by signals from the retina. In Section 3, it is shown, at least in theory and up to a good analogy using a digital camera, that for appropriate parameters such typical data defines a relation on the set of neurons of V1 that gives the perceived geometry for monocular vision. The relation is defined by the typical data by being special in the sense that it minimizes the expectation of the float entropy of the system. However, our theory is intended to be applied to typical data for the whole brain, so that such a relation also determines how the states of other sensory regions are perceived. For example, the relation on the auditory cortex might define how we perceive the relationship between the pitches of the chromatic scale. Of course, more work is required to determine the extent to which this theory can account for how the brain defines the various aspects of consciousness.

However, at the higher semantic level, it is fairly clear that the typical data for the brain defines relationships between objects in the way described in Subsection 3.1. For example, a good impressionist painting provides V1 with just enough of a particular stimulus that V1 produces the same state as that induced by a photograph of the same subject. This ability of the brain is widely known as filling-in and shows that typical data defined by the brain will determine a strong relationship between certain objects. Furthermore, it is well known that certain parts of the thalamus act as a relay between different parts of the cortex, including different sensory regions. This and other connections can arguably result in the brain defining typical data that determines relationships between objects arising from different sensory regions of the cortex; [4] is of relevance here. Further, the states of the brain during dreaming, visualization with the eyes closed, and inner sound are all instances of typical data produced by the brain itself, independent of the senses at the time.

We will now turn our attention to what is known as the binding problem. In short, the binding problem can be summarized by the following observation and question. The visual content of our conscious experience correlates with the state of the visual cortex, whereas the sound content of our conscious experience correlates with the state of the auditory cortex. How, therefore, can the states of two quite distinct and spatially separated brain regions give rise to a single unified conscious experience? If the theory presented in this article is correct, then the answer is quite straightforward. The content of consciousness is defined by the state of the brain interpreted in the context of the relations, such as those discussed earlier, defined by the brain's typical data. The typical data is determined by the brain's structure. Hence, consciousness is a property of the brain as opposed to being an output of some algorithmic procedure or relying on some homunculus concept. A compact disc on its own is almost meaningless, but in the context of a sufficiently large CD library it is a specific piece of music: Beethoven, for example, or perhaps Mozart. Similarly, a brain state on its own is almost meaningless, but in the context of the brain's typical data it is a moment of consciousness, by which we mean the brain state together with the relations defined on it by the typical data. This is, for example, the view of the coffee cup with the sound of the radio and the taste of the coffee all together.

Finally, this article, if correct, still leaves many questions unanswered, and the lack of an attempt to answer them in the context of this initial proposition of the theory is rightful cause for some criticism. A few of these questions are:

  • 1. Can the theory explain the conscious experience of the color red, or does the theory need to be extended?
  • 2. What are the other relations that typical data define?
  • 3. What connections are there, if any, between our theory and the theory of consciousness as integrated information proposed by Tononi [1]?
  • 4. Although the neurons are an obvious candidate for the elements of the carrier set S, are they the right candidate?
  • 5. Let Si be the data element for a given brain state. Is all of the relation R(Si) contributing to consciousness regarding Si, or is only a subset of R(Si) contributing? (11)
  • 6. Is it useful to also consider a carrier set where the elements are time-dependent neurons over a short time interval, or some discrete version of the same involving short finite sequences?


The author thanks the School of Mathematical Sciences, University of Nottingham, Nottingham NG7 2RD, UK, for providing continued access to facilities during the period after his PhD, while he was still registered as a student and writing this article.