Do we see facts?

Philosophers of perception frequently assume that we see actual states of affairs, or facts. Call this claim factualism. In his book, William Fish suggests that factualism is supported by phenomenological observation as well as by experimental studies on multiple object tracking and dynamic feature-object integration. In this paper, I examine the alleged evidence for factualism, focusing mainly on object detection and tracking. I argue that there is no scientific evidence for factualism. This conclusion has implications for studies on the phenomenology and epistemology of visual perception.


| INTRODUCTION
Our visual scene is parsed into coherent units called visual objects (Feldman, 2003; Green, 2018; Scholl, Pylyshyn & Feldman, 2001). What kind of entities are visual objects? Many philosophers contend that we see actual states of affairs or facts (e.g., Armstrong, 1997, p. 95, 2010; Johnston, 2006; McDowell, 1996). A fact is a complex and ontologically heterogeneous entity constituted by a particular plus properties forming a non-mereological unity. I call the thesis that we see facts factualism. Suppose you see a red rose. If factualism is true, then what you see is a complex entity made by a particular (the rose) having a property (being red). Factualism has several implications. For example, since visual objects are the targets of perception-based demonstrative thought and of perceptual attention, and are the "basic units" (Fish, 2009, p. 52) of visual perception, it follows that perceptual states are individuated by means of entities belonging to two different categorical kinds: properties and particulars. Furthermore, facts are worldly items that "mimic" the structure of judgments (Johnston, 2006, p. 290); hence, factualism seems to entail that visual object perception has a sentence-like structure (Armstrong, 1997, p. 96; Textor, 2009). Factualism, however, is not unchallenged. Alternatively, one may construe visual objects as property-complexes or bundles.
Many philosophers take factualism as a truism. Fish's (2009) work represents an interesting exception. In his book, he makes a case for factualism based on two claims, which I call the phenomenological and the scientific claim. The former is that there is phenomenological evidence that we see facts. The scientific claim is more interesting. Fish thinks that scientific studies on object detection and tracking support factualism. I set out to show that the scientific evidence does not support factualism and instead provides indirect support for an alternative, bundle view of visual objects.
I set the stage in Section 2, clarifying terms and concepts, and introducing Fish's argument for factualism. In Section 3, I elaborate on the alleged evidence on which the scientific claim rests. In Section 4, I argue that scientific evidence does not support factualism.

| Seeing and facts
I take a state of seeing to be a conscious mental state that visually presents or manifests some mind-independent entities to the perceiver. Three caveats are in order. First, I will only discuss cases of genuine perception. Second, I remain largely neutral about the nature of perception. Third, I remain neutral about what makes such states or contents conscious.
What kinds of entities are made manifest by states of seeing? Most researchers agree that states of seeing make manifest a cluster of visual properties or features (Wolfe, 1998). Features are constitutive of the contents of states of seeing, that is, we always see things as being a certain way (Siegel, 2010, p. 45; Block, 2014). Although it is uncontroversial that we see objects (O'Callaghan, 2016; Rosch, Mervis, Gray, Johnson & Boyes-Braem, 1976), there is little agreement about how the cognitive system achieves object perception. According to a widely shared view, our visual system singles out something as an object if it satisfies principles like cohesion, boundedness, rigidity, and no action at a distance (Spelke, 1990, pp. 48-51; Burge, 2010, p. 464). But what kind of things are visual objects? Assuming that there are properties (construed in an ontologically liberal way), there are two possible answers to this question. One is that visual objects are complex entities made by a property-bearer or "particular" plus properties. Another possible answer is that visual objects are exhaustively ontologically analyzed by their properties. Call the first option factualism, and the latter the bundle-view. In the remainder of this paper, I will use the term "object" in a metaphysically neutral way (for either bundles or facts).
In the philosophical literature, facts are understood either as true propositions or as actual states of affairs (Armstrong, 1997; Betti, 2015; Reicher, 2009; Vallicella, 2000). It is only the latter concept that will be discussed here. Facts, in this sense, are taken to be semantically idle, complex entities that constitute the building blocks of the world. Facts can be of two types (Armstrong, 1997, pp. 28-29): either particulars exemplifying properties, such as "a being F" (the rose's being red), or two particulars exemplifying a relation, as in "a having R to b" (the rose is to the left of the perceiver) (Mulligan, Simons & Smith, 1984). 1 Following Fish (2009, p. 22), I will only discuss facts of the former type. Facts are categorically heterogeneous entities (Betti, 2015, pp. 20-22), because they comprise entities that belong to distinct ontological categories with different ontological statuses: a particular (the rose) and an abstract entity (a property, e.g., "being red") (Smith, 1989, p. 422). Together these form a non-mereological and non-spatial unity that is more than ("over and above") the simple mereological sum of its constituents (Armstrong, 1989, p. 88). However the properties are construed (whether as universals or tropes), properties and particulars are glued together by means of a non-relational tie (Armstrong, 1997, p. 118, 2010; Devitt, 1997, p. 98). A particular considered without its properties is called a "thin particular", whereas a particular clothed with its properties is called a "thick particular" (or simply a "fact"). According to Armstrong, the world is constituted by facts: All particulars instantiate (or exemplify) some properties, and all properties are instantiated by some particulars. Hence thin particulars cannot be found in the world; we only obtain them by means of a process of intellectual abstraction (1989, p. 88, 1997, pp. 123-126). 2 Notice that the claim that the world is constituted by facts (Armstrong, 1997; Russell, 1986, p. 163) is logically independent from the claim that we see facts. In this paper, I will only explore the latter claim.

| Factualism, the bundle-view, and Fish's argument
Factualism (FT) is the claim that visual objects are facts:
FT: Visual objects are complex entities, that is, facts, whose categorically heterogeneous constituents are particulars instantiating properties that form a non-mereological unity over and above their constituents.
An alternative to FT is the bundle view (BV):
BV: Visual objects are property bundles.
Pautz (2007) describes an instance of BV, the property-complex theory: "P is a complex property iff, necessarily, x instantiates P iff x has parts x₁, x₂, x₃, which have properties P₁, P₂, P₃ … and stand in relations R₁, R₂, R₃ …" (p. 498). 3 On BV, in contrast with FT, perception does not predicate properties of an ontologically distinct particular. There is no co-presence of a feature and its "fundament" (the particular) in seeing (Mulligan et al., 1984, p. 308).
FT and BV differ in structure and ontological scope. 4 The contrast between FT and BV touches also on the issue of how to construe conditions of accuracy, if perception is representational. If FT is true, perceptual states are satisfied iff they represent particulars having properties. If BV is true, content is satisfied iff there is a corresponding property instantiation (Pautz, 2007, p. 499).
2 It has been pointed out to me by a reviewer that this seems to suggest that facts may be ontologically basic, while properties and particulars may be ontologically derivative. Alternatively, one might construe facts as derivative entities that exist in addition to properties and particulars. This is a genuine metaphysical issue, but resolving it either way does not bear on my argument.
3 Textor (2009) has suggested an alternative to FT that goes in the direction of BV: "Seeing x is constituted by seeing features, states or changes of x and additional factors … I see x in virtue of seeing its features, states or changes" (p. 141). There is of course a difference between the claim that visual objects are constituted by properties (and perhaps states and changes), and the claim that seeing a visual object is constituted by properties as well as other factors (arguably, attention, etc.; Driver, Davis, Russell, Turatto & Freeman, 2001; Scholl, 2001). I focus on the former view.
4 Nominalists might claim that we see objects, properties being derivative entities. One could undermine FT by simply denying ontological credit to facts. Here, I set out to argue that, even if the world is constituted by facts, there is no reason to believe that we see facts.
Before I proceed, let me fend off two possible confusions. The first is that BV entails a possible, but ultimately wrong conception of visual perception. According to an old empiricist view, perception only delivers an array of features, and it is thought (concepts) that carves up the visual field into objects (Dickie, 2010, p. 214; Lewis, 1966). 5 There are good empirical reasons for rejecting this view: The visual scene is parsed into objects already at a pre-attentive and preconceptual level (Dickie, 2010; Raftopoulos, 2009). Notice, however, that BV does not per se entail the old empiricist view (not without further assumptions).
The second potential mistake is to confuse FT with what Dretske called visual "fact-awareness" (1979, 2010). For Dretske, one is fact-aware that a is F iff one has the concept of F and applies it to a. Still, one may be able to see (simple seeing) something even without the relevant concept. So, to use his example, at a time t I may see that x is an armadillo (seeing-that) thanks to my possession of the concept "ARMADILLO"; while previously, lacking the relevant concept, I might have simply seen x (simple seeing). FT is distinct from "fact-awareness" as the relevant notion of "fact" in the two cases is different: actual states of affairs (FT), and true propositions (fact-awareness).
There are some reasons that make factualism unpalatable. Firstly, visual objects are generally defined in mereological terms: Treisman (1986) calls them "complex wholes" and Di Lollo (2012, p. 317) "coherent, unified wholes" (see also Feldman, 2003; O'Callaghan, 2016). But facts, as we have seen, are non-mereological units. Therefore, if a single visual object is a fact, it seems difficult to account for its spatial-mereological structure (Mulligan, 1999). Secondly, on factualism it seems that object-directed states of seeing should be individuated by, and make consciously manifest, two kinds of entities which form a non-mereological unity.
Fish's argument for factualism consists of two claims. First, Fish believes that reflection upon our everyday perceptual phenomenology supports factualism. In this vein, he invites us to consider Firth's (1965) observation: "[T]he qualities of which we are conscious in perception are … presented to us … as the qualities of physical objects" (p. 222). Call this the phenomenological claim. Second, Fish maintains that experimental studies on object perception support factualism. In particular, he refers to studies on multiple object tracking (MOT) (Blaser, Pylyshyn & Holcombe, 2000), and to Matthen's (2005) discussion of dynamic feature-object integration:
Matthen has also argued that certain empirical results are adequately explained only on the assumption that we do not see properties or qualities simpliciter, but rather see objects bearing properties (Fish, 2009, p. 51).
Call this the scientific claim. If the passage is meant to support FT, then clearly Fish construes "objects" as belonging to a distinct ontological category, that is, what I called "particulars," and he must be assuming that objects and properties form a single non-mereological categorically heterogeneous entity (Section 3.4). What we see therefore are complex entities, particulars having properties.
5 A more recent statement of this view can be found in Spelke: "Perceptual systems do not package the world into units. The organization of the perceived world into units may be a central task of human systems of thought" (quoted in Dickie, 2010).
Let me briefly turn to the phenomenological claim. That we see things being some way or another is nothing more than a truism that does not reveal anything interesting about the things we see. Ultimately, one could frame Firth's quotation within a nominalist outlook, or opt for a bundle account of objects. 6 The assumption that observation will reveal the metaphysical nature of visual objects is, I believe, unwarranted. States of seeing do not manifestly reveal the ontological constituents of what we see (Ayers, 2004, p. 255). Were it not so, it would be difficult to explain why, on the basis of phenomenological evidence alone, philosophers have harbored very different intuitions about the ultimate nature of objects. States of seeing are metaphysically opaque about the nature of content. 7 The scientific claim provides a non-phenomenological way to assess the question of the nature of visual objects. Is there really scientific evidence that we see facts?

| Binding and places
The alleged experimental evidence for factualism is drawn from the literature on the problem of feature-object binding, object detection, and tracking. The binding problem consists in explaining how different visual properties are attached to the same individual (Clark, 2000; Treisman, 1996; Jackson, 1977, p. 64). The problem can be studied from different perspectives. Philosophers interested in object perception should provide a solution to the representational binding problem, and explain the nature of visual objects. This is different from articulating an account of how the brain binds different features into one coherent whole, a single visual object. In order to bring the problem into sharper focus, consider the following:
1. S sees something red.
2. S sees something triangular.
3. S sees something both red and triangular (a red triangle).
Clark (2000) pointed out that (3) does not follow from (1) and (2). Seeing something red and seeing something triangular is different from seeing a red triangle. The problem is compounded if we introduce a further visual object, say, a blue circle. How does the cognitive system sort the properties in the right way, that is, blue with circular and red with triangular? In order to solve this problem, Clark proposed a theory of sensory individuals (Cohen, 2004). According to this theory, properties must be attached to or predicated of specific individuals.
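The gap between (1), (2), and (3) can be made explicit with a rough first-order sketch. The predicate letters Red and Tri and the two-item domain {a, b} are illustrative assumptions introduced here, not notation used by Clark, and the sketch stays silent on what kind of entity the variable x ranges over.

```latex
\begin{align*}
&(1)\quad \exists x\,\mathrm{Red}(x) \\
&(2)\quad \exists x\,\mathrm{Tri}(x) \\
&(3)\quad \exists x\,\bigl(\mathrm{Red}(x) \land \mathrm{Tri}(x)\bigr) \\[4pt]
&\text{Countermodel: domain } \{a, b\},\ \mathrm{Red}(a),\ \neg\mathrm{Tri}(a),\ \mathrm{Tri}(b),\ \neg\mathrm{Red}(b). \\
&\text{In this model (1) and (2) are true, while (3) is false.}
\end{align*}
```

On this rendering, binding requires that both features be predicated of one and the same value of x, which is the role that the theory of sensory individuals is meant to fill.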
What are sensory individuals? For Clark (2000, p. 164), sensory features or properties are predicated of places. S sees redness and triangularity here, and blueness and circularity there. To borrow Evans' (1982) terminology, location in space provides the "fundamental ground of difference" (p. 107) that allows featural attribution to the right individual (Clark, 2004, pp. 136-144). Binding sensory features to places is an elegant solution. However, it is often considered untenable. It has been argued that the feature-placing hypothesis is at variance with experimental evidence (Cohen, 2004; Matthen, 2004, 2005; Pylyshyn, 2007; Siegel, 2002a). On the basis of this evidence, some researchers conclude that sensory individuals must be objects (Cohen, 2004, p. 480). Fish's scientific claim is based on two challenges against the feature-placing hypothesis: the problem of co-located objects (Blaser et al., 2000; Section 3.2.1) and the problem of dynamic feature-object binding (Matthen, 2005; Section 3.2.2).
6 Many philosophers frame perception in terms of property predication. When I see a red rose, my visual system predicates the property (redness) of the particular (the rose). So, one could think that this alone supports FT. (Thanks to Alexander Staudacher for pointing this out to me.) Yet, nothing in this frame forces us to accept factualism. For instance, the visual system might predicate properties of bundles.
7 To put things slightly differently, philosophical disagreement about the nature of visual objects seems only plausible in virtue of some metaphysical opacity of seeing.

| Superimposed objects
A problem for Clark's feature-placing hypothesis is that it is apparently unable to account for binding of co-located objects. This is shown by a series of experiments conducted by Blaser et al. (2000) on multiple object tracking (Pylyshyn, 2003, 2004; Scimeca & Franconeri, 2015). Blaser and collaborators investigated the visual system's ability to track distinct items that share the same spatio-temporal trajectory. In the experiments, subjects observed two circular striped Gabor patches transparently layered upon one another, without noticeable separation in depth. The Gabors underwent different changes, for example spinning clockwise and then counterclockwise, or changing saturation, from gray and black stripes to red and black stripes. The featural changes occurred without any change in location, thus testing whether object perception essentially involves the location of features.
Blaser and colleagues found that the observers reported that the Gabors were perceptually segregated, in a way similar to figure-ground segmentation. The attended Gabor stood out in the foreground, whereas the distractor Gabor receded in the background. Moreover, the experimenters found that featural attention enhanced processing of the Gabor's features as a whole. From these results, Matthen (2005) concludes that the observers "were attending to features by attending to the objects to which these features were attributed, and not by attending to the features directly" (p. 281).
Since subjects were able to discriminate the two superimposed but distinct Gabor patches, it would follow that sensory individuals cannot be places: Featural binding seems to be object-centered (Matthen, 2005).

| Dynamic feature-object binding
A further problem for the feature-placing hypothesis is that it seems at variance with dynamic feature-object binding. Matthen (2005, p. 282) contends that the perception of change or of motion demands an identity that underlies change, and locations cannot provide such an identity (Siegel, 2002a). He illustrates this point with reference to the φ-phenomenon.
The φ-phenomenon is a paradigmatic example of illusory movement (Dennett, 1991, p. 114). In this experiment, a subject observes a screen upon which an image, say, a white dot, is shown on the left-hand side. A second image, say, an identical white dot, is then shown on the opposite side of the screen. In two different experiments, the researchers change the interstimulus interval between the offset of the first dot and the onset of the second. The suitable interstimulus interval depends on the spatial separation of the two items, but the phenomenon is often tested between 50 and 200 ms (Arstila, 2016). In a first experiment, with an interval of c. 50 ms, subjects will likely see two flashing dots. However, if the interstimulus interval is increased up to c. 150 ms, subjects will likely experience illusory motion, where a single white dot appears to be moving from the left to the right side of the screen. Kolers and Von Grünau (1976) devised an interesting variation of this experiment by changing the color of the second dot. If the interstimulus interval is c. 150 ms, subjects will see one single dot moving from left to right and changing color halfway, say, from white to red (Kolers & Pomerantz, 1971; Scholl, 2007, pp. 573-574).
Matthen contends that this phenomenon brings further evidence against Clark's feature-placing hypothesis. Our visual systems are sensitive to motion, but motion cannot be attributed to places. Suppose that features are place-indexed, like "red and circular here", where "here" is the sensory individual: How can we make sense of the sensory individual moving? Regions of space do not move, and therefore cannot be sensory individuals. And if places cannot be sensory individuals, the best alternative option is to conclude that vision is committed "to an ontology of material objects" (Matthen, 2005, p. 281). Matthen defines a material object as a "spatiotemporally confined and continuous entity that can move while taking its features with it" (Matthen, 2005; also O'Callaghan, 2008). He thereby concludes that vision "attribut[es] features to material objects" (Matthen, 2005, p. 280). 8 Furthermore, he suggests, since we can track an object in spite of featural change, as shown by the foregoing experiments, material objects must be substances or substance-like, as Aristotle suggested:
It seems most distinctive of substance that what is numerically one and the same is able to receive contraries. In no other case could one bring forward anything, numerically one, which is able to receive contraries. For example, a color which is numerically one and the same will not be black and white … A substance, however, numerically one and the same, is able to receive contraries. (Categoriae 5, 4a10; in Aristotle, 1963, p. 11; Matthen, 2005, p. 280).
Matthen's point is that features are attributed to objects rather than places. According to Fish, evidence gathered from Sections 3.2.1 and 3.2.2 makes a scientific case for FT.

| What sensory individuals do
Before turning to a critical examination of the scientific claim, we should further specify what roles sensory individuals are supposed to play. This is important because my argument in Section 4 will critically examine FT from the point of view of what sensory individuals do.
We have already seen that there are good reasons to believe that sensory individuals are material objects. I accept this reading. Sensory individuals perform three mutually interdependent functions: to serve as the unifiers to which multiple properties are attached; to remain constant through featural change; and to provide the fundamental ground of reference that secures detection and tracking.
8 It is unfortunate to call singular perceptual referents "material" objects. Some ephemera like shadows or rainbows may also be tracked and perceived as visual objects even though their "material" status is questionable. I stick to Matthen's terminology for two reasons. First, because it is widely adopted. Second, because, in order to avoid unnecessary complications, I consider only the fairly unproblematic cases of material objects perception.
Constancy through featural change is supposed to explain how objects can be re-identified across a given spatio-temporal continuity (Scholl, 2007) and in spite of featural change (Pylyshyn, 2007, pp. 34-37). Experimental studies have shown that featural change does not significantly alter object tracking (Bahrami, 2003; Scholl, Pylyshyn & Franconeri, 1999). Given that objects may undergo featural changes, it has been proposed that the tracked item may be a pure "this", an a-qualitative individual (Skrzypulec, 2018). The "this" would then serve as unifying basis for multiple properties. After an initial pre-attentive and pre-conceptual parsing of the visual scene into proto-objects, visual tracking provides a reference for mental object-related processing to things in the world. On an influential account, once an object is detected, indexing connects such an object with an object-file (Green & Quilty-Dunn, 2017; Kahneman, Treisman & Gibbs, 1992; Recanati, 2012).
Several theories have been proposed to account for object individuation, indexing, and tracking. On Pylyshyn's account, the linkage between items in the world and object-files is maintained through the assignment of a visual index or a bare visual demonstrative (Matthen, 2012; Pylyshyn, 2003, 2007), which he calls "FINST" (from "FINgers of INSTantiation"). Put succinctly, the theory claims that we possess a limited number of FINST mechanisms (four or five; but cf. Franconeri, Alvarez & Cavanagh, 2013) that detect and track material objects in the environment. Pylyshyn sometimes calls the sensory individuals "FINGs", from "FINSTed THINGs" (2007, p. 56), another name for Matthen's material objects (Section 3.2). FINGs are said to grab a FINST in virtue of some causally relevant properties (p. 68; Section 4.2). Such mechanisms are pre-representational and pre-conceptual: They merely register or detect the presence of an object in virtue of a causal, non-representational relation (pp. 74-75, 94). Causal reference is possible thanks to a set of non-encoded (i.e., non-represented) properties (2003, p. 219). For Pylyshyn, tracking mechanisms play the role of fixing reference in a way that resembles that of demonstratives like "this" or "that." Importantly, they do so in a non-descriptive manner (2007, p. 95). In a way analogous to the sentence "This is red", the visual system can be said to first assign the index "this", with the predicate "red" assigned in subsequent processing. Consequently, as Pylyshyn (2003) puts it, in vision "[p]roperties are predicated of things" (p. 201). 9 Pylyshyn believes that features cannot fix reference because, among other reasons, he thinks that they are only processed at later stages of visual information-processing. Not everyone agrees.
There is some evidence that processing of features begins very early, on the retina itself, where color-related and motion-related information is extracted and further processed in a series of topographic maps that preserve, to an extent, the spatial arrangement of the proximal stimulus (Op de Beeck, Haushofer & Kanwisher, 2008; Silver & Kastner, 2009; Somers & Sheremata, 2013). But Pylyshyn's FINST theory is by no means the only psychological account of object individuation and tracking. Other options include Leslie et al.'s (1998) object-indexing theory and Ballard et al.'s (1997) deictic-codes theory (Raftopoulos, 2009, p. 98). Pylyshyn's account falls squarely under what Recanati (2012, pp. 3-14) calls the "singularist" camp. Singularists argue for an acquaintance-based theory on which reference is fixed thanks to a direct causal link between material objects and perceivers via a set of non-represented properties. Descriptivists, on the contrary, take as a requirement for reference-fixing that a given set of the target's properties must be represented.
9 Notice that sensory reference is a distinct problem from that of the role of (mostly conscious) perception in fixing demonstrative reference (Siegel, 2002b).
In the next pages, although I mainly refer to Pylyshyn's account for expository reasons, I advance some arguments against FT that are compatible with all the foregoing options. This, I believe, makes my case against FT even more efficacious.

| Making sense of the scientific claim
It is helpful to provide some terminological clarity. A sensory individual is any entity x that fulfills the three roles identified in the previous section. The feature-placing hypothesis has it that sensory individuals are places, but most researchers today maintain, for reasons examined in Section 3.2, that sensory individuals must be material objects, that is, spatio-temporally confined and continuous entities that can move and retain their properties. Material objects thus construed are worldly things. Visual objects, meanwhile, are understood as the coherent units that appear in our visual field (Feldman, 2003). 10 In order to claim that scientific evidence supports FT (a thesis about visual objects), Fish must be making several assumptions that we will now examine.
Fish's first assumption seems to be that material objects must be facts' particulars. Otherwise, it would not be clear why Fish thinks that we see "object-properties couples" that should be understood as Armstrongian facts (Fish, 2009, p. 22). A second assumption is that material objects, understood as facts' particulars, and their features (properties) are glued together via a non-relational tie (Section 2.1), and that they form a non-mereological unity over and above their constituents. The third assumption is that, given that states of seeing are conscious mental states (Section 2.1), both the particulars and properties which form such a unity are somehow consciously visually accessible. Fish interprets the scientific evidence in light of these assumptions. I will argue that this reading is untenable or simply unwarranted.
One way to refute FT would be to show, against Fish's third assumption, that even if material objects are indeed facts' particulars, they are not consciously accessible. Consider an example. In contrast to perceptual singularism (the claim that perception is a relation of sensorily entertaining a singular proposition specified by object and properties), defenders of general Russellianism argue that perceptual content is constituted by a general proposition of the form ∃x(Fx) (Tye, 2000). Such propositions are possible states of affairs, but the perceiver, on this view, is only aware of properties, not of objects (facts' particulars). General Russellianism outlines an alternative to FT that we might call weak BV:
W-BV: Visual objects are constituted by properties predicated of an unseen categorically heterogeneous particular. 11
W-BV and FT are incompatible, for the latter claims that, in addition to properties, we also see particulars (Mulligan et al., 1984). Nevertheless, W-BV is compatible with Fish's first assumption. Alternatively, one could argue for strong BV:
S-BV: Visual objects are constituted by properties all the way down.
"All the way down" captures the idea that, at the conscious, unconscious, and pre-attentive stages of perception, perceivers are sensorily related only to the properties of material objects. Both W-BV and S-BV have it that visual objects are constituted by properties alone. But S-BV is more radical than W-BV as it denies, against Fish's first assumption, that material objects should be interpreted as facts' particulars.
10 The exact ontology of visual objects will also depend on our theory of perception. For a naïve realist visual objects might be constituted by worldly entities; for a Fregean representationalist visual objects may be constituted by modes of presentation. My claims are orthogonal to this debate.
11 For brevity's sake, I drop the further qualification "forming a non-mereological unity over and above the fact's constituents."
FT and BV render ontologically distinct interpretations of visual objects and of sensory individuals. In the remainder of this paper, I simply accept that sensory individuals are material objects in the sense specified by Matthen. However, as I pointed out, Matthen's definition of material objects leaves open their exact ontological interpretation. In order to support FT, Fish, as we have seen, must make the assumption that material objects are facts' particulars, and that particulars plus properties form a further non-mereological unity. I shall challenge this assumption and thus provide indirect support for S-BV. By granting that material objects qua sensory individuals must play the roles mentioned above (Section 3.3), we arrive at a standpoint from which we can examine FT.
FT obviously assumes that there are worldly facts. For argument's sake, I shall concede this. But recall what I said earlier (Section 2.1): Even if we live in a world of facts, that does not entail that we see facts. In order to assay FT, I will examine whether there are good reasons to believe that the visual system detects and tracks facts. 12 If there are not, we undercut the scientific evidence for FT. Before we proceed, we need to specify some ontological criteria.

| Two ontological criteria
We can think of an ontological inventory as an exhaustive catalogue of ontological kinds. The scope of an inventory depends on our region of interest; in our case, the ontology of visual objects and sensory individuals. S-BV and FT require different ontological inventories. S-BV's inventory includes only properties, whereas FT's inventory includes properties, particulars, and their unity in facts. Determining whether our inventory is complete or not for a given domain depends on the specific set of problems that we need to account for (Betti, 2015, p. 62). Some ontological criteria are needed; otherwise we might end up adding kinds arbitrarily. Betti (2015, pp. 62-63) puts forward two elegant criteria: C1: Is the problem we are called to solve genuine or not? C2: If it is genuine, can it be solved without enlarging our ontological inventory? C1 states that some problems may be the result of wrong theoretical assumptions; if that is the case, there is no need to enlarge our inventory. If the problem is genuine, and it cannot be solved within our current inventory, we can vouchsafe the new kind a place in our inventory.
Let us start with C1. In order to deny that our problem is genuine, we would have to rethink the problems of feature-object binding, sensory individuals, object detection, and tracking. There are two options. We can deny that sensory individuals are material objects, or we can deny that we need sensory individuals. Going for the first option, one may try to salvage Clark's feature-placing hypothesis. The second option is more radical: Perhaps we can just dispense with sensory individuals. One way of doing so could be to reject the binding problem as ill-posed (Di Lollo, 2012; Garson, 2001). If either of the two strategies succeeds, Fish's scientific claim collapses. 13 For the sake of argument, I accept the current state of the art: The problem we are dealing with is a genuine one.

12 I pass over in silence the two further roles of sensory individuals, as property unifiers, and as providing a constant base through featural change, since every metaphysical theory of objecthood accounts, in some way, for these roles.

| Detecting and tracking facts?
Fish assumes that material objects are facts' particulars (Section 3.4). Facts' particulars can be construed as either "thin" or "thick" (Section 2.1); this gives us two options:

| Tracking thin particulars?
Thin particulars are very similar to bare particulars (Loux, 1978; Sider, 2006), that is, particulars shorn of every property. The main difference between thin and bare particulars is that, within the ontology of facts, thin particulars are a mere abstraction; fact ontology rests on the assumption that there are no bare particulars (Armstrong, 1989, p. 88). We can only get at thin particulars via an intellectual process (Section 2.1). This may give us two possible ways of understanding FT1: Perhaps sensory individuals are thin particulars singled out directly by detection mechanisms; or they are detected indirectly thanks to a specific set of conceptual skills.
The first option strikes one as obviously implausible. Thin particulars can only be given in thought, hence they cannot fix sensory reference. But let us suppose that, for argument's sake, something like thin or bare particulars can be found in the environment, and that our perceptual systems are somehow sensorily related to them. Assuming fact ontology, the perceiver's surroundings are populated by many different facts. But how can detection mechanisms fix reference to something shorn of every property? (Campbell, 1990, p. 7). By way of analogy, consider the case of vision in a Ganzfeld (Avant, 1965). As we know, exposure to an unstructured, qualitatively homogeneous visual field results in a "mist of light" or "empty field" visual experience. Nothing can be seen within a Ganzfeld because the deployment of discriminatory capacities demands qualitative discontinuities, and hence, properties (Pylyshyn, 2003, p. 210). Just as one cannot detect objects in a Ganzfeld, so too it is difficult to explain how bare particulars could be detected: Precisely because they lack properties, they cannot provide the required "fundamental ground" of reference (to use Evans' terminology).
We now turn to the second way of understanding FT1: Thin particulars may be detected via the deployment of a specific set of conceptual capacities. The only way of making sense of this claim is to place the action of conceptual capacities at a very early stage of visual processing. It is implausible that all concepts, whatever they are, will be deployed at such an early stage; furthermore, it is likely that such concepts will differ in kind from those involved in higher-level cognitive capacities. 14 A version of conceptualism of this sort may be found in Xu's (1997) sortal theory of physical objects.
On Xu's influential account, the concept "PHYSICAL OBJECT" is a sortal because it provides identification, counting, and persistence conditions even under dramatic changes (e.g., Spelke-objects, Spelke, 1990; Xu & Carey, 1996). Xu is committed to a specific form of conceptualism (Heck, 2000), according to which object identification is possible only in virtue of possession and exercise of the relevant concept. 15 However, conceptualism is not universally accepted. Pylyshyn, for example (2007, pp. 93-94; Fodor, 2008, pp. 218-219), argues that FINST mechanisms (Section 3.3) do not single out items by means of concepts (Raftopoulos, 2009, pp. 89-118).
It is not necessary to further articulate these options here. It suffices to show that concept deployment at early stages of vision is not meant to first detect worldly facts and then decompose them. Instead, on these views concepts are deployed for the identification of objects, and the parsing of the visual scene into objects. Notice that this is only possible after the visual system has already detected a worldly item, whether represented (as descriptivists claim) or not (as acquaintance theories would have it) (Section 3.3). 16 This worldly item may be something like a Spelke-bundle (properties like cohesiveness, boundedness, etc.) or other kinds of properties. At this point, one may claim that, perhaps, there should also be another set of primitive concepts, concepts that allow the visual system to detect the thin particular behind the object's properties, once the material object has already been detected. However, this seems unnecessarily baroque: Why would the visual system need to track such an item if sensory reference can already be fixed thanks to a specific set of properties?
We have seen that an object is only detected in virtue of (some of) its properties. At this juncture, friends of FT may raise an obvious concern: Nothing I have said so far bears on their thesis. Indeed, as Armstrong forcefully states, fact ontology explicitly excludes bare particulars. Nor is there any reason, as we have seen, to suppose that thin particulars are tracked or identified as such in visual perception. FT1 is thus not the right way to articulate factualism, and so the best way to do so is by opting for FT2.

14 For example, Quine (1960) argues that parsing the world into objects requires a fairly advanced conceptual armamentarium that is only acquired via language. On this view, combined with BV, we may get something like the "old empiricist view" (Section 2.2).

15 Conceptualism about perceptual content may be construed in different ways. On McDowell's (1996) version, concepts act in receptivity. However, on this account perceptual content is constituted by demonstrative concepts that allow for property (and object) re-individuation, so they are not meant to perform the complex function of identifying thin particulars.

16 One possible objection comes from studies like Peterson's (1994, 2019), who suggests that top-down factors (like familiarity with an object) may influence object individuation. What this would show, if anything, is that perceptual processes at early stages might be cognitively penetrable, with top-down factors helping to fix visual object formation from among multiple interpretations (Green, 2018, p. 25).

| Tracking thick particulars?
Perhaps sensory individuals are thick particulars, and thick particulars just are facts (Section 2.1). Before I proceed, we must first stop and say something about the properties in question. When we sensorily relate to an item in the world, that relation is constitutively a relation to numerically distinct and unrepeatable entities: I am now perceptually related to this laptop, seeing this copy of Collingwood's The principles of art, and so forth (Schellenberg, 2016). However properties are construed, it seems clear that at least in genuine cases of perception we are related to property instances or instantiated properties. By "property instance" I mean simply a non-repeatable instance of a property in a given target object, whether it be a trope or a universal instantiated by a fact (Armstrong, 1997, p. 126).
What kind of property instances are needed to secure sensory reference? There is little agreement in the scientific literature. Pylyshyn (2001, p. 145) suggests that after an initial preattentive and pre-conceptual parsing of the visual scene into proto-objects, tracking mechanisms are grabbed by some properties of the proto-objects. In several works (Pylyshyn, 2003, 2007; also Bahrami, 2003; Blaser et al., 2000), he urges that, since we can track objects in spite of featural change, and since features are only processed at later stages of visual processing, sensory reference cannot be secured by features. 17 On this ground, Pylyshyn thinks we ought to draw a distinction between represented (as Pylyshyn says, "encoded") features, that is, the properties that are manifest in our states of seeing (like colors, shape, etc.), and a distinct class of physical properties that mainly serve the role of connecting a sensory individual to a tracking mechanism. Physical properties, on Pylyshyn's account, are not represented but interact causally with the FINSTs, 18 and it is largely an empirical matter to determine which physical properties may grab a FINST mechanism (Pylyshyn, 2007, p. 211). Pylyshyn's functional distinction between two kinds of properties does not fit squarely with the empirical evidence, however. In most cases, it seems that spatio-temporal properties are required to detect an object at a pre-attentive stage (Driver et al., 2001; Pylyshyn, 2001; Scholl & Leslie, 1999). There is some evidence that temporal synchrony, continuity, and proximity usually override featural criteria of object detection (Carey & Xu, 2001; Pylyshyn, 2003; Scholl & Leslie, 1999). This leads to the plausible conjecture that there must be some mechanisms sensitive only to spatio-temporal information (Raftopoulos, 2009, p. 94; also Carey & Xu, 2001; Xu & Carey, 1996). In a famous study, Spelke et al. (1995) showed that infants normally use spatio-temporal information to individuate objects. Yet, it has also been shown that infants may use features like shape and color to individuate objects when spatio-temporal information does not provide sufficient criteria of individuation (Tremoulet, Leslie & Hall, 2000). This seems to provide some prima facie reason to reject Pylyshyn's distinction between physical properties and features.
Once again, I take an ecumenical approach: I will not opt for a specific theory of sensory reference. Given that on FT2 sensory individuals are facts, the next question to consider is what kind of properties clothes the particular. The first option is to entertain a thesis which preserves Pylyshyn's distinction between physical properties and features. This can be translated as follows:

FT2*: Sensory individuals are particulars instantiating both physical properties and features.

FT2* accommodates a broad spectrum of options for fixing sensory reference, on the assumption that we can make sense of the distinction between features and physical properties. One can follow Pylyshyn and claim that only the latter fix sensory reference, or, while keeping this distinction, accommodate the intuition that features might occasionally contribute to fixing sensory reference. But suppose that someone casts doubt on this distinction. We might thus formulate a second option, FT2**, in which sensory individuals are particulars instantiating multiple properties, only a subset of which will be perceptually relevant. The difference between the two options is meant to reflect the disagreement in perception studies about what sort of properties may fix sensory reference; plus, it reflects my catholic approach to this matter.

17 In a passage, Pylyshyn actually states that the speed of objects' motion or the rate at which they change direction seem to play a role in fixing sensory reference (2007, p. 68, fn. 2). However, he adds that these properties also "do not appear to be encoded" (i.e., represented).

18 A reviewer has pointed out to me that the distinction between features and physical properties is quite obscure. Following Pylyshyn, I assumed that the distinction is grounded on the different functional roles such properties play relative to the perceptual system. Perhaps, at least some physical properties may be interpreted as impure nonqualitative properties, like "being next to item y" or "being brighter than y" (Cowling, 2015). My argument against FT, however, does not depend in any way on the plausibility of this distinction.
Consider FT2*. One may accept this if one believes that features cannot, or sometimes cannot, fix sensory reference, for all that FT2* claims is not only that sensory individuals are facts, but also that their relevant properties are the kind of entities that may fix sensory reference under a broad spectrum of possibilities. Hence, one may contend that sensory reference is fixed by means of physical properties alone, although such properties may neither be represented nor accessible in states of seeing. Pylyshyn is not always clear about what exactly is being tracked: "if the FINST was captured by a property P, what the FINST refers to need not be P, but the bearer of P (the [sensory individual] that has property P)" (2007, p. 96); "I take the view that objects are indexed directly, rather than via their properties or their locations" (2003, p. 202). 20 Following Fish's first assumption, one might think that the bearer of P is a fact's particular. Might this be a fact's thin particular? As I have argued (Section 4.2.1), the claim that the visual system may track thin particulars is untenable. Since we need properties, such "bearers of P" might be thick particulars (facts) or property bundles. Either option would work, provided that they instantiate the right sort of properties, that is, properties that the visual system might exploit for purposes of detection and tracking. One may amend Pylyshyn's contention and claim that although a tracking mechanism is grabbed by property P, reference is fixed to the fact as a whole in virtue of some other properties, or by means of some other properties of the bundle.

19 The construct "perceptually relevant properties" is meant to capture the following ideas: first, that not all of an object's properties are relevant for visual perception (we do not see an object's weight); second, that room should be made for a conception of sensory reference and perception that does not rely on Pylyshyn's distinction between different kinds of properties. The Tremoulet et al. (2000) study goes in this direction, as the authors maintain that features may also play a role in sensory-reference fixing. I thus prefer to speak of "perceptually relevant properties," allowing such properties to be manifest in seeing as well as helping to fix sensory reference. I thank a reviewer for suggesting this option.

20 These passages are ambiguous, but Pylyshyn (2003) also stresses that, "there must be some properties that cause index assignment and that make it possible to keep track of certain objects visually-they may just constitute a very heterogeneous set and may differ from case to case" (p. 213).
In general, the claims I make are fully compatible both with Pylyshyn's commitment that detection and tracking mechanisms are non-descriptive (acquaintance theory), and with the rival view that such mechanisms are representational (descriptivism).
But notice that, if what I say is correct, it strongly suggests, in compliance with Betti's second criterion, that all we need to single out and track items in the world are property instances! Notice also that the true metaphysical nature of such bearers is beyond the purview of perception. The only requirement mandated by the scientific evidence is that the material object must possess the right sort of properties (physical properties or features, or an indistinct set of perceptually relevant properties). Thus, determining whether the material object, qua worldly thing, is a fact or property-bundle depends on metaphysical grounds that have nothing to do with evidence gathered from the science of perception. 21 Now, suppose that Pylyshyn is wrong, and that sometimes both features and physical properties fix sensory reference. If this is correct, then once again it seems that property instances alone are required to fix sensory reference, and in virtue of Betti's second criterion, there is no need to invoke any facts' particulars, let alone complex entities forming non-mereological unities (Fish's second assumption).
I turn now to FT2**. FT2** hinges on the suspicion that there is no legitimate distinction to be drawn between features and physical properties. But as the reader, by now, will easily understand, the same considerations developed about FT2* will apply to FT2** as well. If properties are all we need to fix sensory reference and tracking, then the scientific evidence does not license any factualist reading.
Under all interpretations, Betti's second ontological criterion suggests that we find all the indispensable ontological tools in property instances, however such properties are instantiated. The latter point is important, for it seems clear that perceptual psychology reveals neither the metaphysical nature of property instances (tropes or universals), nor the metaphysical composition of material objects. For all we know, we are related to property instances of (material) objects in the world, and it is only thanks to some further metaphysical argument that we may call them instantiated universals, or tropes. If this is true, it follows that we do not have scientific reasons to introduce "particulars" to fix sensory reference, let alone construe visual objects as non-mereological units constituted by particulars plus properties. Material objects, that is, things in the world, are detected and then seen in virtue of their properties, and properties alone. It is important to reiterate that my claim is not meant to reveal anything about the "true nature" of material things in the world, but only of visual objects. Things in the world may be, among other things, facts or trope-bundles.

| Tracking without facts
I have shown that, contrary to Fish's assumptions, the scientific claim is inconclusive. But at this point, one might cast doubt on the plausibility of my considerations on the basis of the experiments reviewed earlier (Sections 3.2.1 and 3.2.2). Blaser et al. (2000) have nicely illustrated that features alone cannot always account for object detection and tracking precisely because features might change (the Gabors changed saturation, motion direction, etc.). In other words, a change in features does not disrupt tracking (Scholl, 2001). A feature P (say, "the rose's redness") may first grab a FINST (or any other tracking mechanism) and allow object detection or individuation, but as Pylyshyn pointed out (2007, p. 96; Section 4.2.2), the tracking mechanism need not refer to P (as P might change or disappear, as in the Blaser et al. (2000) experiment) but to the bearer of P. Fish seems to interpret this bearer as a fact's particular of which P is predicated. Fixing sensory reference to such a categorically distinct entity seemingly has the virtue of explaining the fact that sensory reference is fixed on a property bearer. However, for the reasons examined earlier (Section 4.2), factualism is either unwarranted by perceptual psychology or plainly at odds with it. How can we make sense of keeping track of material objects (sensory individuals) while being only sensorily related to their property instances?
To see how this might work, let us first recall that object tracking seems to occur not only via an object's qualitative properties or features, but also thanks to spatio-temporal information (Raftopoulos, 2009, p. 94). Xu and Carey (1996), for instance, have shown that 10-month-old infants can employ spatio-temporal information to infer the existence of occluded objects, whereas 12-month-old infants are able to exploit both spatio-temporal and featural information, especially when spatio-temporal information is ambiguous (Tremoulet et al., 2000). This is most evident in the Blaser et al. (2000) experiment, where the superimposed Gabor patches changed features, but were nonetheless perceived as spatially segregated and standing in a figure-ground relation. Once the sensory individual has been detected, thanks to some property instance P, tracking might occur. Moreover, notice that tracking is a relatively high-level capacity of the visual system, and this means that, at this point, a fairly good amount of information will have been sampled into the corresponding object file. This means that the perceptual system might rely on multiple properties to keep track of the sensory individual, with some of them (perhaps) being featural and spatio-temporal in nature. (Whether such properties are all in principle consciously accessible, or even represented, is an interesting and important question, but it goes beyond my argumentative aims.)

Let us consider a sensory individual, call it object, a red rose. Object first grabs a FINST in virtue of its property (a feature, or physical property, or simply perceptually relevant property) $P_1$ (say, "being red"). At $t_1$, object possesses the following property instances:

Object at $t_1$: $\{P_1, P_2, P_3, P_4, P_5\}$

(Notice that object will likely possess some non-perceptually relevant properties as well, for instance the object's "having x atoms", etc.; these are not included in the set.) Since object has now been detected, several of its properties, like shape, orientation, and spatio-temporal information, are being sampled in the object file. The tracking system (a FINST, deictic codes, or other putative tracking mechanisms) now has a pool of properties to rely on. It can track $P_1$ and $P_2$, or $P_2$ and $P_3$, or any other conjunction of multiple property instances. Now, let us assume object loses $P_1$ at $t_2$, and gains $P_6$ (say, "blueness"), and so we get:

Object at $t_2$: $\{P_6, P_2, P_3, P_4, P_5\}$

Clearly, if tracking were to rely exclusively on $P_1$, the object would be "lost" to the perceptual system. However, as I pointed out, the visual system exploits multiple properties to keep track of items. In our case, it can keep track of $P_2$ and $P_3$, or of $P_3$ and $P_4$, and so on. Thus, tracking can continue over time in spite of featural or property change. Notice also that in my example, the properties in curly brackets need not be features like shape and colors, but may be any perceptually relevant properties, such as spatio-temporal ones (Section 4.2.2).
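The structure of this example can be restated as a toy computation. What follows is only an illustrative sketch under strong simplifying assumptions: property instances are modelled as bare labels, and "keeping track" is modelled as nothing more than a non-empty overlap between the property instances already sampled and those currently present. The names `ObjectFile` and `update` are hypothetical conveniences introduced here; they are not taken from Pylyshyn's FINST theory or from any experimental model.

```python
# Illustrative sketch only. ObjectFile and update are hypothetical names,
# not part of FINST theory or of any existing library.
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Property instances sampled from a sensory individual after detection."""
    sampled: set = field(default_factory=set)

    def update(self, current: set) -> bool:
        """Return True if tracking continues, i.e. if at least one previously
        sampled property instance is still present; then resample."""
        overlap = self.sampled & current
        if not overlap:
            return False  # nothing left to hold on to: the item is "lost"
        self.sampled = set(current)
        return True


# Property instances of the object at t1 and t2 (P1: "being red", P6: "blueness").
object_t1 = {"P1", "P2", "P3", "P4", "P5"}
object_t2 = {"P6", "P2", "P3", "P4", "P5"}

# Tracking that relies exclusively on P1 fails once P1 is lost at t2.
single = ObjectFile(sampled={"P1"})
tracked_by_p1_alone = single.update(object_t2)

# Tracking that relies on the pooled property instances survives the change.
pooled = ObjectFile(sampled=set(object_t1))
tracked_by_pool = pooled.update(object_t2)
```

On these assumptions, the first tracker loses the item while the second keeps it, and nothing in the sketch requires a further entity over and above the property instances themselves. The sketch is of course silent on which properties the visual system actually exploits, which remains an empirical matter.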
In my example the sensory individual is an object, a "spatio-temporally confined and continuous entity that can move while taking its features with it" (Matthen, 2005, p. 281). It is capable of receiving "contraries" (say, red, and later blue) (Section 3.2.2). Regardless of its ontological status, it can be a fact or a bundle (if we accept the existence of properties). For detecting and tracking, the perceptual system only needs (some of) object's property instances. All of these details thereby lend support to the truth of S-BV (Section 3.4). Notice that in the account I have just sketched out, the sensory individual still plays all its functions: It remains constant through featural change (it is always the same item, even though it changes properties), 22 it provides the fundamental ground of reference in virtue of its property instances, and it provides an external "unifier" to which multiple properties are attached and/or of which they are predicated.

| CONCLUSION
Closer scrutiny of the empirical literature reveals that, contrary to what Fish suggests, we have neither scientific nor phenomenological evidence for FT. Perhaps one might establish the truth of FT on some other grounds: This is why the paper's title has an interrogative form, suggesting that the claim that we see facts may not be as obvious as some philosophers might assume, but stands in need of argumentative support. I leave it to my readers to elaborate upon the further implications, epistemological and phenomenological, of my claims.