Abstract

It is sometimes said that there has been a ‘paradigm shift’ in the field of assessment over the last two or three decades: a new preoccupation with what learners can do, what they know or what they have achieved. It is suggested in this article that this change has precipitated a need to distinguish two conceptually and logically distinct methodological approaches to assessment that have hitherto gone unacknowledged. The upshot, it is argued, is that there appears to be a fundamental confusion at the heart of current policy, a confusion occasioned by the demand to know what learners know and compounded by a failure to recognise what this properly entails for assessment methodology.


Introduction

It is widely accepted that the last two or three decades have seen a ‘paradigm shift’ (Gipps, 1995, p. 1) in the field of assessment, a shift away from a culture of testing for purposes of prediction, comparison and selection, towards a conception of assessment centred on what learners have achieved, what they can do, what they know. This shift is reflected in the very language of assessment, with terms such as ‘attainment’, ‘skills’, ‘outcomes’, ‘competences’ and ‘criteria’ signalling a markedly different approach to that which went before, an approach that is evident in almost every sector of education today, in the primary school and the vocational training college alike.

As regards schools in the UK, this shift has its origins in the 1988 Education Reform Act, which introduced the requirement to determine what pupils ‘have achieved in relation to the attainment targets [ATs]’ (§2, (2)c, p. 2) of the National Curriculum, this generally being understood to mean that ‘assessment should be criterion-referenced, since its function is to ascertain achievement in relation to the ATs’ (Brown, 1991, p. 215). Characterised as the ‘antidote to “norm referenced” assessment which ranks students’ (Goldstein and Noss, 1990, p. 4), criterion-referenced assessment is now ubiquitous in schools, not only with SATs (Standard Assessment Tests) but with GCSEs (General Certificate of Secondary Education), where the express intention was to give ‘an accurate indication of what a candidate “knows, understands and can do” ’ (Radnor, 1988, p. 45). Similarly, in the context of vocational education, the late 1980s/early 1990s saw the setting up of the UK's National Council for Vocational Qualifications and the emergence of a new framework of national vocational qualifications structured around ‘performance criteria’, specifically intended to identify for assessment purposes ‘the essential aspects of performance necessary for competence’ (NCVQ, 1991, p. 3). Today, both in the UK and increasingly elsewhere, competence-based methods predominate in almost every occupational sector and are gaining an increasing foothold in the professions.

Now, however laudable this aspiration to determine what learners know—and few could have any objection to the ambition stated thus—the methods used to achieve this end have been greeted with less than universal approval. The use of criterion-referenced assessment for high stakes purposes—as a supposed indicator of the relative effectiveness of schools—has been criticised for resulting in the assessment of ‘thin skills’ rather than ‘rich knowledge’ (Davis, 1995, passim). Similarly, competence-based assessment has been attacked on the grounds that it is ‘intrinsically behaviouristic’ (Hyland, 1997, p. 492, original emphasis) and thus neglectful of knowledge and understanding. Of course, to those who automatically associate these methods with their purported aims, such critics might appear to stand in some perverse opposition to the ambition of determining what learners know or what they can do, seeming perhaps to favour a return to the previous culture, or the abandonment of educational assessment altogether. A more plausible reading, however, is that the complaint is not with the ambition but with the presumed means of achieving it. And by implication this would seem to suggest that there are other means, other kinds of assessment better suited to that end. The question then is what kind of assessment would best be suited to that task, or put another way, if we have a choice about the kind of assessment we use to determine what people know, what they can do or what they have achieved, then what exactly is the nature of that choice?

The standard response to this kind of question is to resort to any of a number of customary oppositions thereby invoking a schism that is an all too familiar feature of the assessment landscape. On one side of this divide is the inclination towards the criterial, the measurement-oriented and the objective; on the other, the normative, the interpretative and the subjective. Much is made of the difference between, say, criterion-referenced and norm-referenced assessment, between competence-based assessment and unseen examinations, between standardised tests and teacher assessment, between objective marking and subjective marking, and so on. The whole subject of assessment is replete with any number of such distinctions, categories and procedural types that can be seen to be implicated in marking out this basic methodological divide. At its deepest this rift takes on a philosophical character centred on the ontological differentiation of body and mind, behaviour and understanding—a differentiation made much of by critics of the new regime with references to such things as ‘skills’ and ‘behaviouristic’ tendencies. And of course each side of this divide has its political adherents, the one being favoured by those who would seek to have extraneous control over the processes and achievements of education, the other by those who would give precedence to the autonomous agency of practitioners. On the face of it, then, this is well-mapped territory.

Yet I want to suggest that this picture is illusory. More specifically, I want to suggest that it serves to conceal and deflect attention from a distinction that I will argue is more fundamentally at issue and which goes unrecognised in this scheme of things. The central claim of this paper is that one hitherto unacknowledged consequence of the so-called ‘paradigm shift’ in the field of assessment, this new-found preoccupation with determining exactly what the learner knows, is that it requires us to choose between two conceptually and logically distinct kinds of assessment. By this I mean not a choice between the previous culture and the new. Rather, the suggestion is that it is precisely the requirement to determine what a person knows or what they can do that precipitates the need to choose between two fundamentally different methodological approaches. The choice does not arise when test performance is used for purposes of prediction, comparison or selection, which perhaps goes some way towards explaining why the distinction at issue has previously escaped notice. Importantly, not one of the great profusion of distinctions traditionally employed in connection with assessment corresponds with this particular distinction. Indeed, as we shall see, it is a distinction that has been not merely overlooked but persistently misconstrued in official thinking about assessment, the resulting misunderstanding evidently having been passed down through test instrument design into assessment practice. The upshot, I will suggest, is a fundamental confusion at the heart of current assessment policy, a confusion occasioned by the demand to know what learners know. It turns out that critics have been largely correct in their intuitions: there is something fundamentally wrongheaded about current policy and the essential difficulty does indeed revolve around the kind of assessment we choose to employ. However, the important thing is how we characterise the kinds of assessment at issue.

Now the distinction I have in mind is certainly not one of my own invention; its informal application would be familiar to any teacher who has had occasion to distinguish a pupil's de facto performance from what she discerns to be the substantive extent of that pupil's understanding. Such a teacher might judge a pupil not to understand despite, say, answering a question correctly, or conversely, judge that a pupil does understand despite failing to answer the question correctly. If asked, she would no doubt account for this in explicitly ontological terms, as being a case in which outward appearances are at odds with the pupil's inner capabilities. And she would be all too aware of the potential ambiguity implicit in the question of whether the pupil could be said to ‘know’, aware that how one answers such a question would depend on whether priority is attached to the pupil's performance or to their understanding. In practice, then, this much is familiar; indeed one difficulty here is precisely that this familiarity and our all too ready resort to terms such as ‘performance’ and ‘understanding’, with their implicit suggestion of an ontological differentiation of some kind, already places us at risk of prejudging the issue.

Putting aside for the moment the question of the precise nature of this distinction and the process by which the teacher is able to make such judgements, it is nevertheless possible to acknowledge the first of several senses in which this distinction—whatever it might be—can be said to be fundamental. It is clearly fundamental to the business of teaching because in framing her intentions the teacher must intend that the learner come to know in either one or other or both of these senses of ‘know’—whatever these two senses turn out to be. If any teacher is to gauge the success of her own endeavours she must be alert to the fact that knowing in one sense is neither a necessary nor a sufficient condition for knowing in the other sense. I am certainly not suggesting, of course, that it is only educators who are sensitive to this distinction: for whatever it is, it is clearly a facility possessed, albeit in varying degrees, by most if not all human beings, and one that would seem fundamental to all social interactions insofar as those interactions require some estimation of what another person knows, believes, thinks, feels, etc. We are all of us familiar with the sense of knowing what is in a person's mind despite how they behave.

Familiarity is only one of the obstacles that stand in the way of our clarifying the distinction at issue. Another is the all-pervasive influence of the traditional assessment categories—the kind of distinctions referred to above—which impinge so greatly on thinking about assessment that educators and assessment professionals seem unable to shake off their attachment to them. Many will be quick to resort to the traditional methodological oppositions, perceiving the matter as something approximating to the opposition between objective, concrete, standards-based or criterial methodological perspectives on the one hand, and subjective, interpretative, impressionistic or holistic perspectives on the other. Accordingly, one task here will be to untangle these traditional ways of thinking about assessment from the distinction I will suggest is more substantively at issue.

A third obstacle arises from the widespread commitment to the aforementioned ontological differentiation of mind and body, with all its various connotations: thinking as against doing, understanding as against behaviour, performance as against underpinning knowledge, knowing how as against knowing that, and so on. Some such commitment can often be seen to be central to the critical position in the form of a complaint that current arrangements are focused on ‘skills’ or ‘behaviours’, along with the not unreasonable suggestion that education is necessarily impoverished to the extent that it neglects the life of the mind. Yet however feasible the complaint of behaviourism might be in other contexts—in curriculum design, for example1—there is an important sense in which, in the context of assessment, the accusation of behaviourism appears incoherent, for the stark fact is that any assessment must ultimately be based on behaviour. Gilbert Ryle (1949) was surely right when he famously insisted that we do not have access to the inner workings of other people's minds. But if we are to reject this dualistic scheme of things and not fall prey to the accusations of behaviourism which beset Ryle's account of mind, then we need at least to be able to account for the conviction—evident in our example of the teacher—that it is possible to make judgements about a person's mind as distinct from their behaviour. As we shall see, becoming clearer about what lies behind this conviction will enable us to become clearer about the distinction that is fundamentally at issue.

Knowing Other Minds

As Donald Davidson has said, whilst there may be a puzzle about the basis upon which we come to know our own minds there seems not to be the same mystery when it comes to the question of how we know the minds of other people:

There is no secret about the nature of the evidence we use to decide what other people think: we observe their acts, read their letters, study their expressions, listen to their words, learn their histories, and note their relations to society. How we are to assemble such material into a convincing picture of a mind is another matter; we know how to do it without necessarily knowing how we do it (Davidson, 2001, p. 15).

All that matters for our purposes is that we can do it; and it would seem that the full implications of this simple point for assessment methodology have never been fully appreciated. Such ‘pictures’ are clearly of inestimable worth in our dealings with other people, not least in enabling us to explain their past behaviour and predict how they are likely to behave in future. Importantly, they are much more than the sum of available evidence: they are the result of active work on our part. Of significance here is the considerable amount of empirical evidence which suggests that all of our perceptions, including our perceptions of other people, depend upon our being able to interpret and ‘go beyond’ the raw data received via our senses. In perceiving, we are involved unavoidably in complex and largely unconscious processes of selection, interpretation and judgement, processes which are influenced by the subtle nuances of the situation and by our own past experience. Such processes have been shown to be fundamental to our ability to cope and find our way around in the world.2 And nowhere is this capacity to extrapolate on the basis of limited information more apparent than in our perceptions of other people. Indeed, so deep-seated is this facility that we have to guard against its negative employment, against prejudice—in the literal sense of that word. Nevertheless, on the whole it would seem that we are reasonably successful in our efforts to understand, explain and predict the behaviour of others.

Such considerations not only controvert the common assumption that our conception of what another person knows and thinks is exhausted by an account of what we have expressly witnessed that person say and do, but they also indicate how it is possible to speak meaningfully about a person's knowledge or understanding as something distinct from their behaviour whilst acknowledging that behavioural manifestations are all we have access to and in this sense all we can know. There is no inconsistency, I want to suggest, because what we refer to when we speak of ‘knowledge’ or ‘understanding’ in such circumstances is not some inner realm within the subject but, rather, something that exists by virtue of something that occurs in our mind. The distinction that is fundamentally at issue here is not, as traditionally conceived, between the subject's outward behaviours on the one hand and their inner mental states on the other but, rather, between those behaviours and a ‘picture of a mind’ in the mind of the observer.

Now in some contexts it may be of little consequence to differentiate between mental states and the picture someone might have of some such states. In the context of curriculum design, for example, there would be little sense in asking whether by ‘knowledge and understanding’ is meant the intended states of learners' minds or the ‘picture’ the curriculum designer has of those states. To all intents and purposes it amounts to the same thing. In the context of assessment, however, the difference becomes vital, because it allows us to concede our manifest lack of access to other minds without sliding into behaviourism. It points up the fact that even allowing for this lack of access, something else over and above behaviour impinges on our judgements about the intelligent capabilities of others.

By the same token our acknowledging this simple point also allows us to avoid the difficulties that arise if we presume to be able to make inferences about mental states understood as something ontologically distinct from behaviour. The view of assessment which emerges from this ontological schism is one in which behaviour inevitably predominates by virtue of being first and foremost in any assessment process. For it will be conceded that all assessment must begin with behaviour, beyond which the process becomes a conspicuously more tentative affair that will be described variously as ‘inferential’, ‘subjective’ or ‘impressionistic’. Perceived thus, the choice is between staying with what is evident and observable, or venturing into a realm of hesitant speculation about the concealed and metaphysically contentious ontology that is mind: the ‘ghost in the machine’, to recall Ryle's memorable phrase. This does much to explain why critics of current arrangements have found it so difficult to oppose demands for assessment to be focused on what learners can do, particularly when those demands are prompted by a heightened political interest in such things as attainment, entitlement or accountability.

And this is not the only difficulty facing those bent on perceiving the issue in ontological terms. The strategy of militating against ‘doing’ inevitably runs into difficulty when it is precisely ‘doing’ that is at issue, as will so often be the case in vocational and professional education. The likely retort, put simply, will be ‘If we need a person to be able to do x then what can be wrong with assessment designed to determine if they can do x?’ It is then incumbent upon the critic to show why it is necessary to assess something other than the doing of x. The critic's standard response will be that we need to assess a person's understanding rather than just their behaviour, because only then can we be sure that the correct performance will issue in different contexts or situations. But if the reply comes ‘We will assess the technician's ability to test electrical generators of different types in different situations, at least all the situations that matter’, then the critic is left floundering to provide any basis for his complaint. A further task here, then, is to respond to what I would suggest is the anxiety that lies behind this complaint, what we might call the ‘basic worry’ of the critical position—the intuition that it is possible to have knowledge of a person's capabilities that belies what might otherwise be indicated by their behaviour—without having to resort to the mistaken idea that it is possible to assess minds understood as ontologically distinct from behaviours.

Whilst Ryle was surely correct in noting that our ascription of mental epithets is done entirely on the basis of behavioural evidence, what he failed to see was that much hangs on how we choose to treat such evidence. And it is clear that there are two possibilities. On the one hand we might choose to regard behaviours as ends in themselves: that is, we might set out to verify the presence or otherwise of x, where x is some specific behaviour, achievement or performance. On the other hand, we might approach the assessment situation more proactively, either consciously or unconsciously selecting, interpreting and ascribing significance to evidence as we create a ‘picture of a mind’ upon which basis we are able to make claims about a person's knowledge, mental states or capabilities.

What is fundamentally at issue here, then, is a distinction that relates not to what it is the subject knows—in the sense of outward behaviour as against inner knowledge or understanding—but, rather, to the two very different senses in which we might be said to know about another person's capabilities. As Wittgenstein (1968) said, ‘The grammar of the word “knows” is evidently closely related to that of “can”, “is able to”. But also closely related to that of “understands” ’ (§150). What Wittgenstein tries to draw our attention to here is not an ontological differentiation of the inner and the outer, for as he says, we lack the criteria to distinguish mental dispositions from their effects (see ibid. §149). What could I possibly say about the facility I have that enables me to do x that would not simply involve me in describing the doing of x? We might say that the distinction that is fundamentally at issue here is epistemological rather than ontological because it concerns what it is we can know about another person rather than the kind of thing they know.

Certainly we might sometimes use the word ‘know’ to denote nothing other than the behaviour itself, such as when we report a successful performance by saying ‘he knows how to x’. And it is also true that we sometimes speak of knowledge or understanding in an abstract or theoretical way in order to hypothesise about minds and mental states as philosophers and curriculum planners do. But if someone uses words such as ‘knowledge’ or ‘understanding’ to express a judgement about a particular individual's mental states as distinct from their behaviour then there is an important sense in which those words should properly be thought of as identifying something located in the speaker's head rather than the subject's, for they refer to a ‘picture of a mind’ that is substantially of the speaker's own making. This turns out to be of no small significance when it comes to the design of formal assessment procedures, as will become clearer with the following thought experiment—what I will refer to as the ‘Right/Wrong Scenario’.

The Right/Wrong Scenario

Imagine that we wished to assess a person's knowledge of, say, current affairs by means of oral questioning. And suppose that this person was able to answer our questions correctly but with each and every answer betrayed some either quite subtle or perhaps quite radical misunderstanding. Perhaps on being asked who the current British Prime Minister is the response comes ‘David Cameron—leader of the Liberal Democrats’, or ‘David Cameron—the Welshman who lives at No 9 Downing Street’, or ‘David Cameron—a lizard-like alien from Mars who lives in the sewers of New York’. Let us say, then, that with each and every ‘correct’ answer comes countervailing evidence which suggests that the respondent does not fully understand the matter in hand—what I will call the ‘Right/Wrong Scenario’. The question here is whether and in what sense there could be said to be a correct or appropriate interpretation of such a response.

Before attempting to answer this question let us first deal with some possible objections to the way it is framed. Those inclined to dismiss this scenario on grounds of improbability should note that the only thing that is unlikely about it is the idea that someone would betray their ignorance so readily by volunteering extra information. The idea that someone could provide the requisite answers whilst lacking understanding is by no means improbable; the only novelty being introduced here is the idea that we have clear evidence to that effect.

The classic response of test theorists to this scenario would be to say that it merely demonstrates how the design of any test depends on our purposes, on what we intend to achieve by the test, and that we need simply to ensure that we include test items sufficient to cover the range of knowledge that has been deemed to constitute ‘a knowledge of current affairs’. If it matters to us that the person should know that David Cameron is leader of the Conservative Party then we should include a test item to that effect. The difficulty, however, is that whilst this proposal would be entirely apposite in the previous testing culture, a culture preoccupied with ‘constructs’ and ‘universes’ of test items, it is of little relevance within the new scheme of things where we are concerned only with whether the subject does or does not know the thing in question. Certainly, there is a sense in which with just one more question we could confirm whether or not the candidate ‘knows’ that David Cameron is leader of the Conservative Party. Yet this would be to miss the point, which is that, however many questions are set, it is still logically possible that the Right/Wrong Scenario could obtain. That we might, with the benefit of hindsight, envisage a set of questions which would have identified a specific lack of knowledge or misunderstanding in a particular case is neither here nor there. It would seem, then, that the question of what would count as a correct or appropriate interpretation of the Right/Wrong Scenario still stands. And what I want to suggest is that whether we judge such responses to be correct or incorrect ultimately depends upon the kind of assessment we choose to employ.

By way of illustration let us imagine the Right/Wrong Scenario arising in two very different situations. First, let us suppose that the ‘assessor’ is someone carrying out a door-to-door survey of prospective voters. She has no interest in what people know beyond whether they can or cannot answer the survey questions. Simply hearing the words ‘David Cameron’ or just ‘Cameron’ in response to the question has been deemed sufficient to merit a tick in the appropriate box. Either the respondent is able to give the requisite answer or they are not. Accordingly, countervailing evidence of the type described is of no consequence, and if the Right/Wrong Scenario were to obtain in these circumstances it would be entirely appropriate to interpret the responses positively. By the same token, someone who was unable to get the name ‘Cameron’ off the tip of their tongue would be deemed not to know even if by any other measure they evidently did know (for example, they knew his first name, could accurately describe him, and so on).

Now contrast this with a situation in which the ‘assessor’ is, say, a Member of Parliament using the very same questions to assess the suitability of applicants for an internship at Westminster. How would the MP react if presented with the Right/Wrong Scenario? Perhaps on receiving the first response she might be disposed to give the applicant the benefit of the doubt. But on receiving further responses in the same vein any sense that the applicant had the requisite understanding would surely begin to evaporate. It seems inconceivable in such circumstances that anything other than a negative judgement would result.

So whilst in one case it would seem entirely appropriate to judge the responses positively, in the other it would appear equally appropriate to regard them negatively—even though the same questions are used, the same answers are sought and the very same answers are received. There is nothing mysterious about this, but it is clearly not sufficient, even though obviously true, to say that the matter is contingent upon each assessor's particular aims or purposes. Indeed I want to suggest that the only way we can explain such divergent outcomes arising from these prima facie identical procedures, and justify them, is to recognise that what we have here are two fundamentally different and apparently hitherto unacknowledged methods of assessment. Each assessor has access to exactly the same evidence; the difference between them consists entirely in what they do with that evidence. Whilst the focus of one is restricted to evidence that has been prescribed, the other, in contrast, could be said to take an expansive view of the evidence. Given that we stand in need of some appropriate terms, let us refer to these two approaches as the prescriptive and expansive modes respectively.

Prescriptive and Expansive Modes of Assessment

In the prescriptive mode the assessor's role is essentially that of passive, non-judgemental facilitator tasked with gathering rigidly prescribed, predetermined data. The assessor's function is essentially binary: to indicate the presence or otherwise of a given behavioural manifestation. Judgement in this mode is limited to judgements of identity, that is, the assessor will be required to determine whether a given manifestation corresponds to that specified, but there judgement stops. The assessor is not required to take into account any manifestation other than those specified; indeed the test instrument will often be designed to exclude this possibility. And it would certainly not be the assessor's role to speculate on the internal states which give rise to those manifestations or to give any consideration to the wider context or circumstances in which those manifestations appear.3

In contrast, in the expansive mode the assessor's role is essentially active and judgemental, with judgements being potentially a matter of degree. Judgements in this mode are essentially judgements of significance: evidence is not pre-specified but discovered, revealed and afforded significance in the process of creating, either consciously or unconsciously, a ‘picture of a mind’ which in turn influences reflexively the assessor's judgement as to the significance of any given evidence. Whilst in the prescriptive mode each behavioural manifestation will be regarded as logically discrete and, all things being equal, of equivalent standing, in the expansive mode the weight afforded any given manifestation will depend entirely on its significance in the eyes of the assessor, with any one manifestation potentially able to defeat any number of contrary indications.

Considerations of relevance and fairness aside, there may be few bounds placed on the type and extent of evidence which might reasonably be drawn upon by an assessor operating in the expansive mode, the focus of the assessor's attention being expanded either to the limits of practicability or to the point where the assessor is satisfied that the evidence obtained is sufficient.4 The assessor must then judge the relative merits of the diverse and potentially conflicting indications which present themselves, and it is here that the main complexity of judgement in the expansive mode arises. For there is no explicit or clear-cut rule which can be said to guide this process: judgement arises from the assessor's own innermost grasp of what it is to know and understand the thing in question, together with an awareness of the disparate ways in which this might be made manifest.
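
To make the contrast concrete, the difference between the two modes can be pictured, very roughly, as two different ways of treating one and the same body of evidence. The following fragment is a minimal, purely illustrative sketch in Python, not part of the argument itself: the evidence items, weights and thresholds are invented for the example, and the assessor's ‘picture of a mind’ is reduced to a crude running tally of significance, which of course it is not.

```python
# Illustrative sketch only: two stances towards the same evidence.
# All names, weights and thresholds here are invented assumptions.

from dataclasses import dataclass

@dataclass
class Evidence:
    description: str
    matches_prescription: bool   # does it match a prescribed manifestation?
    significance: float          # assessor-ascribed weight, from -1.0 to +1.0

def prescriptive_assessment(evidence, prescribed_count):
    """Binary judgement of identity: count only the prescribed manifestations;
    countervailing indications are simply out of scope."""
    observed = sum(1 for e in evidence if e.matches_prescription)
    return observed >= prescribed_count

def expansive_assessment(evidence):
    """Judgement of significance: a single sufficiently weighty indication
    can defeat any number of contrary ones."""
    picture = 0.0
    for e in evidence:
        if abs(e.significance) >= 0.9:   # a decisive clue settles the matter
            return e.significance > 0
        picture += e.significance        # otherwise revise the running 'picture'
    return picture > 0

# The Right/Wrong Scenario: the prescribed answer is given, but each answer
# comes with countervailing evidence of misunderstanding.
responses = [
    Evidence("names 'David Cameron'", True, +0.3),
    Evidence("calls him leader of the Liberal Democrats", False, -0.95),
]

print(prescriptive_assessment(responses, prescribed_count=1))  # True: prescribed answer present
print(expansive_assessment(responses))                         # False: the misunderstanding defeats it
```

Nothing in this toy fragment is meant to suggest that expansive judgement is reducible to weights and thresholds; it merely dramatises the claim that the divergence between the two modes lies in what is done with the evidence, not in the evidence itself.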

In informal situations we constantly vacillate between modes, continually comparing a person's behaviour with the ‘picture’ we have of that person and their capabilities, modifying the picture as appropriate. It is by this means that the teacher in our previous example is able to distinguish ‘behaviour’ from ‘understanding’—or, more properly understood, one particular instance of behaviour from the picture she has of the pupil's capabilities. But when it comes to the design or implementation of any formal assessment process then, for reasons of reliability and fairness, an explicit and conscious decision must be made as to that assessment's intended mode. There can be no equivocation, for as the Right/Wrong Scenario demonstrates, what counts as correct hangs on this. And it is in this sense that the distinction can again be regarded as fundamental, for the integrity of any formal procedure intended to determine what a person can do or what a person knows will be radically undermined if there is any uncertainty as to the mode being applied.

Now each mode of assessment has its own advantages and disadvantages. The prescriptive mode comes into its own where our interest lies in knowing whether a person can or cannot respond, perform or operate in a specific way regardless of any other considerations. In contrast, the advantage of assessment in the expansive mode is that it allows the assessor to draw on the fullest range of evidence to make the best possible judgement about what the candidate knows. Whenever it is imperative to make the best estimation of what a person knows, understands or can do, this is the mode we instinctively adopt.

The crucial thing about this distinction, then, is that it is concerned not with any ontological differentiation of inner states as against outward behaviour but simply with how we choose to treat the evidence. This would seem to have important implications for the issue of reliability. When the choice is wrongly perceived as one between manifest behaviours on the one hand and tentative inferences about inner mental states on the other, the former appears to have the distinct advantage as regards inter-assessor reliability. But when it is more properly perceived as a choice between restricting the assessment process to prescribed evidence as against drawing on the most expansive range of evidence, it becomes clear that any possible inter-assessor variance in the expansive mode is vastly outweighed by its being able to provide the best possible indication of what a person knows. Use of the expansive mode may certainly detract from the commensurability of judgements—i.e. different assessors might identify different reasons for their judgements or be inclined to articulate them in different ways. For this reason, and also because judgements in this mode may often be matters of degree, assessment in the expansive mode may require recourse to the grades and percentages characteristic of the old culture, not for reasons of comparison or selection but because it allows judgements to be made commensurable. The important thing here, however, is that by far the most significant threat to reliability for any assessment designed to determine what a person knows or what they can do is obfuscation of mode use, that is, a failure to make it clear which mode should be used in a given instance.

We are now in a position to give a more accurate characterisation of the ‘basic worry’ that underlies the critical position: the intuition that it is possible to have knowledge of a person's capabilities that belies what might otherwise be indicated by their behaviour.

I want to suggest that, properly understood, the ‘basic worry’ stems from an implicit recognition that an assessor operating in the prescriptive mode and being presented with the requisite prescribed behaviours will be duty bound to attribute knowledge even in the face of countervailing evidence. Such an assessor would have to record the successful completion of the task of, say, ‘testing an electrical generator’ even if the person being assessed happened to do or say something that gave the assessor cause for serious doubt as to whether they really did understand the task in hand. And as the Right/Wrong Scenario illustrates, this difficulty cannot be resolved by extending the number of test items. Seen thus, the ‘basic worry’ is a concern not about ‘criterion-referenced’ or ‘competence-based’ assessment but about prescriptive as opposed to expansive mode procedures.

It is now possible for us to disentangle the prescriptive/expansive distinction from some of the distinctions customarily used in connection with assessment. Certainly, it is possible to see here or there some partial correspondence; for example, the prescriptive mode might in some respects seem to correspond with criterion-referencing, were it not for the fact that what we call criterion-referenced assessment could be in either the prescriptive or the expansive mode, and it would be difficult to see how the expansive mode could be said to correspond with norm-referencing. Competence-based assessment might similarly take the form of either mode; so too with teacher assessment, which differs from standardised tests more by virtue of being a longitudinal exercise than by being necessarily associated with one or other mode. Some might see a partial correspondence with the reductionist/holistic distinction, by which account, given a particular cluster of evidence e1, …, en, the choice is between treating each item of evidence separately or making a judgement about the whole. The essential difference here is that whilst the characterising feature of holism is synthesis, the expansive mode, in contrast, is distinguished by judgements of significance, since it is entirely feasible for the smallest evidential clue to trump the greater mass of evidence. From a design perspective, terms such as ‘reductionist’ and ‘holistic’ are, as Hager and Beckett (1995) rightly note, ‘relative terms’ (p. 3), whereas the distinction between the two modes is certainly not one of degree: as we have seen, we must employ one mode or the other—there is no middle ground. Similarly with the objective/subjective marking distinction (see Fairbrother and Harrison, 2001, p. 188): whilst the expansive mode could perhaps be seen to involve a certain element of ‘subjectivity’, it would not be correct to say that one mode is more or less objective/subjective than the other. In the Right/Wrong Scenario the assessor operating in the expansive mode who judges the candidate negatively could hardly be accused of being merely ‘subjective’.

It is not without significance that these more abstract distinctions or categories correspond to some degree with the prescriptive/expansive mode distinction, given that the latter replicates a basic, natural bifurcation in our facility to make sense of other human beings. Indeed, yet again we could say that this distinction is fundamental in the sense that many of the formal distinctions and categories traditionally used in connection with assessment can be seen to derive in part from this more elemental distinction.

Under current arrangements it is a moot point which mode might be applied in any given instance. But what seems clear is that the language in which the criteria, descriptors or competence statements are couched is likely to play a part in prompting the assessor to adopt one mode or the other. Whilst criteria specifying behaviours or other manifest outcomes will tend to prompt judgements of identity in the prescriptive mode, criteria centred on attributes of the person are likely to prompt judgements of significance in the expansive mode. Somewhat paradoxically, then, it is ontological differentiation of criteria that is likely to determine the kind of assessment adopted, although the modes of assessment thus prompted are not themselves ontologically differentiated, being differentiated only in terms of the stance taken with respect to the evidence.

However, when we look more closely at current practice it can be seen that the matter is not quite so straightforward. For instance, the requirement of the QCA's Teacher's Handbook for Writing (Levels 1–3) and Reading (Levels 1–2) that the teacher determine whether a child can ‘write imaginative, interesting and thoughtful texts’ (QCA, 2007, p. 23) would seem, on the face of it, to indicate that the expansive mode is called for—the teacher's role being to judge on the basis of any number of possible manifestations the extent to which the child has this capability. However on closer examination we see that this requirement is cashed out in more specific terms which require the teacher to determine whether the child's writing contains such things as ‘recognisable letters’, ‘time-related words’, or a ‘sequence of events’ (ibid.). In other words, the teacher is expressly required to assume the binary function characteristic of assessment in the prescriptive mode, the task being one of indicating whether or not the manifestations specified are present. Similarly with other so-called ‘assessment focuses’: in the guise of providing clearer, more specific guidance, expansive-mode styled descriptors are converted into prescriptive mode criteria. The language of the expansive mode is in effect appropriated, employed in the form of slogans which play little if any part in the assessment process but when positioned above groups of prescriptive mode criteria give the semblance of assessment in the expansive mode. This is by no means something unique to assessment in schools, for exactly the same strategy is evident in other sectors, such as in vocational education where ‘competence statements’ are similarly cashed out in terms of ‘performance criteria’. This unwitting deception has come to be one of the hallmarks of the encroaching political and managerial control of education in which any and every inclination to afford some place to the autonomous judgement of the assessor is likely to be offset by demands for commensurable and auditable evidence.

There are two broad conclusions to be drawn from all this. First, it seems clear that arrangements for assessment in a good many areas of education in the UK and elsewhere at present are methodologically insensitive to the kind of distinction that has been made here. In blunt terms, we simply do not know which mode of assessment is being used in any particular instance. This should be of no small concern if, as has been suggested, an explicit choice between the two modes is a necessary condition of fair and reliable assessment. An urgent reassessment of these arrangements would seem to be required insofar as they remain ambiguous as to their intended mode. Second, there is every indication that the use of assessment as an instrument of state control with all its concomitant demands for specificity and commensurability is likely to effect a shift in procedures towards assessment in the prescriptive mode. The misleading characterisation of these procedures variously as ‘criterion-referenced’, ‘competence-based’ and so on, has served to conceal their true nature and their possible limitations. More correctly characterised, it may well be that a good many of the arrangements currently used in the UK and elsewhere are of precisely the kind which will not provide the best indication of what learners know, what they can do, or what they have achieved.

Notes
  1. It is a remarkable yet little noticed fact that much of the polemic directed against the use of behavioural outcomes in education can be seen to conflate assessment and curriculum in this respect. Lawrence Stenhouse's (1978) thoroughgoing critique of the use of ‘behavioural objectives’ is just one example of the tendency to carry over the ontological distinction between behaviour and mind from the curriculum, where it is generally coherent, into assessment, where it becomes decidedly less so.

  2. See, for example, M. L. J. Abercrombie's (1989) classic survey of empirical work in this area, demonstrating the important role of judgement in perception.

  3. It is not insignificant that there are exceptions, for example cheating, where the facts relating to wider circumstances might not only be deemed relevant but might actually overturn the results of the test. This indicates how the requirement to disregard wider circumstances imposed by the prescriptive mode is an artificiality that cannot be sustained against all eventualities and how in the final analysis the expansive modus operandi will always take precedence.

  4. Such evidence could even include matters at some remove from the test instrument. For example, it would not be unreasonable if an examiner, perhaps initially impressed by the apparent originality of an undergraduate's essay, were to modify their judgement on discovering that the student had replicated the unpublished ideas of their tutor. To stress the point still further, it is even conceivable that an examiner's estimation of one candidate's work could be influenced by manifestations evident in the work of other candidates: as, for example, if the examiner were to discover the same ‘originality’ in the work of similarly influenced students.

References

  • Abercrombie, M. L. J. (1989) The Anatomy of Judgement (London, Free Association Books).
  • Brown, M. (1991) Problematic Issues in National Assessment, Cambridge Journal of Education, 21.2, pp. 215–229.
  • Davidson, D. (2001) Knowing One's Own Mind, in his Subjective, Intersubjective, Objective: Philosophical Essays (Oxford, Oxford University Press), pp. 15–38.
  • Davis, A. (1995) Criterion-referenced Assessment and the Development of Knowledge and Understanding, Journal of Philosophy of Education, 29.1, pp. 3–21.
  • Fairbrother, B. and Harrison, C. (2001) Assessing Pupils, in: J. Dillon and M. Maguire (eds) Becoming a Teacher (Buckingham, Open University Press).
  • Gipps, C. (1995) Beyond Testing: Towards a Theory of Educational Assessment (London, Falmer Press).
  • Goldstein, H. and Noss, R. (1990) Against the Stream, Forum, 33.1, pp. 4–6.
  • Hager, P. and Beckett, D. (1995) Philosophical Underpinnings of the Integrated Conception of Competence, Educational Philosophy and Theory, 27.1, pp. 1–24.
  • Hyland, T. (1997) Reconsidering Competence, Journal of Philosophy of Education, 31.3, pp. 491–503.
  • National Council for Vocational Qualifications (NCVQ) (1991) Criteria for National Vocational Qualifications (London, National Council for Vocational Qualifications).
  • Qualifications and Curriculum Authority (QCA) (2007) English Tasks: Teachers' Handbook. Writing (Levels 1–3) Reading (Levels 1–2) (London, Department for Education and Skills).
  • Radnor, H. (1988) GCSE—Does it Support Equality? British Journal of Educational Studies, 36.1, pp. 37–48.
  • Ryle, G. (1949) The Concept of Mind (London, Hutchinson).
  • Stenhouse, L. (1978) An Introduction to Curriculum Research and Development (London, Heinemann).
  • Wittgenstein, L. (1968) Philosophical Investigations (Oxford, Blackwell).