Abstract

  1. Top of page
  2. Abstract
  3. Scaling Event Data
  4. Psychophysical Magnitude Scaling
  5. Research Design
  6. Hypotheses Tests
  7. Creating a Single Continuum Scale
  8. Conclusion
  9. References

Event data is the preferred method of characterizing directed-dyadic behavior through time and is a very versatile approach able to handle both state and nonstate actors. By scaling and aggregating values on a conflict and cooperation continuum, event data can provide a net measure of conflict between two parties for a set time interval. The CAMEO coding scheme was created to address structural flaws in the WEIS coding scheme and to better handle the post-Cold War environment. However, no systematic study has been completed for assigning fixed-weight conflict–cooperation scale values to the newer coding scheme, leaving an ad hoc transliteration of Goldstein scale values for WEIS as the best option. This paper reports the results of a psychophysical magnitude scaling survey of 158 students from two universities where the students scaled CAMEO categories for conflict and cooperation. In addition to providing empirically based scale values for CAMEO, the paper also tests whether or not conflict and cooperation exist on a single continuum and whether or not a gender difference exists in perceptions of conflict and cooperation.

Event data is a preferred method of characterizing directed dyadic behavior through time and is a very versatile approach able to handle both state and nonstate actors. Researchers can capture behavior of interest by creating coding schemes that identify relevant actions. Then, by scaling these actions on a conflict and cooperation continuum and aggregating the resulting scale values, event data can also provide a net measure of conflict between two parties for a set time interval.

The World Events Interaction Survey (McClelland 1976), or WEIS, has been the most prominent coding scheme that remains in use. The CAMEO coding scheme (Gerner, Schrodt, and Yilmaz 2009) was created to address structural flaws in WEIS coding and to better handle the post-Cold War environment. However, no systematic study has been completed for assigning fixed-weight conflict–cooperation scale values to the newer coding scheme, leaving an ad hoc transliteration of Goldstein's scale values for WEIS (Goldstein 1992) as the best option. While these values exhibit face validity, little in-depth study has been undertaken to validate them scientifically.

This paper reports the results of a psychophysical magnitude scaling survey of 158 students from two universities, the University of Memphis and the University of West Florida, where the students scaled CAMEO categories for conflict and cooperation. In addition to providing scientific, empirically based scale values for CAMEO, the paper also tests whether or not conflict and cooperation exist on a single continuum and whether or not a gender difference exists in perceptions of conflict and cooperation.

Scaling Event Data

Event data reduces natural-language sentences built around transitive verbs to the essential features of who did what to whom, when, and increasingly where (Schrodt 2012). This involves a many-to-one mapping of actions, represented by verb phrases contained in a verb dictionary, to a fixed scheme of verb categories. These verb categories are the heart of any event data coding system, and the verb coding scheme becomes the primary means of differentiating event data sets. While many different data sets exist, only a handful of coding schemes have made their way into the popular lexicon of international relations. These are Charles McClelland's World Events Interaction Survey (WEIS), Edward Azar's Conflict and Peace Data Bank (COPDAB), the Kansas Event Data Systems' Conflict and Mediation Event Observations (CAMEO), and VRA Inc.'s Integrated Data for Events Analysis (IDEA) (McClelland 1976; Azar 1980; Bond, Bond, Churl, Jenkins, and Taylor 2003; Gerner et al. 2009). When verb dictionaries are combined with actor dictionaries, which are also many-to-one mapping exercises, natural language texts, such as news articles, can be reduced to the form of who did what to whom and when.

Importantly, natural language sentences are mapped into categorical variables. While one can look at event counts across specified time periods, the analysis techniques for this type of data are somewhat limited compared to the more familiar arsenal of today's social scientists. Consequently, even early in the history of event data analysis, scholars turned to unidimensional conflict–cooperation scales for representing the nature of relations in a directed dyad. While McClelland rejected such approaches (McClelland 1983), Azar and his collaborators readily embraced scaling of the COPDAB coding scheme (Azar and Sloan 1975; Azar 1980, 1982).

Regardless of McClelland's own position, others worked to create scales that would allow the use of a larger range of analysis techniques on WEIS-coded data. Vincent (1979) developed the first systematic scale, but this scale only applied to the macro, two-digit, codes for WEIS. Goldstein's scale quickly became the most widely used scale with its publication in the Journal of Conflict Resolution in 1992 (Goldstein 1992). Both the high visibility of JCR and Goldstein's scaling of the more specific three-digit codes prompted users to make the switch. The appearance of the Kansas Event Data System (KEDS) automated/assisted coding software dramatically increased the availability of event data and also solidified the dominance of the WEIS coding scheme. The initial verb dictionaries distributed by the KEDS project used the WEIS coding scheme, which, when coupled with the convenience of Goldstein's scale, led to a dramatic resurgence in the analysis of WEIS-coded data.

Problematically though, the WEIS coding scheme was designed as a “first shot,” a system to be refined (Schrodt 2012). This refinement of the categories underlying the WEIS scheme never really occurred, and its absence virtually assured that structural issues would exist. The KEDS research team, led by Philip Schrodt and Deborah Gerner, initially attempted to address perceived shortfalls in the WEIS scheme by adding additional three-digit categories of behavior.1 This required the subsequent ad hoc assignment of conflict–cooperation scale values based on extrapolation from the original Goldstein scale.

Finally, in frustration with the structural issues associated with WEIS and unable to fully explore the effects of mediation through the lens of the WEIS categories, Gerner and Schrodt (Gerner et al. 2009) introduced CAMEO. In some sense, CAMEO represented the long overdue refactoring of the WEIS scheme; however, in another sense, CAMEO represented the paradigmatic shift in expected behavior following the end of the Cold War. 2

The scales developed by Vincent (1979), Azar (1980), Goldstein (1992), and Shellman (2004)3 all assign weights to individual events in order to create scales of conflict and cooperation.4 Some of these scales have been quite sophisticated in design, while others have been little more than asking a handful of colleagues and averaging their responses. On the whole, scholars have relied on respondents who have been asked to assign fixed weights to each event category to establish the level of conflict or cooperation that matches the category. Then, looking at measures of central tendency, the analyst can establish a general weight for each category. Additionally, the internal validity of the resulting scale values can be assessed by looking at the correlations of rankings by all of the participants.

Goldstein's scale has seen tremendous use, which has surely exceeded the author's expectations. The scale itself was the result of an afternoon of feedback from eight fellow faculty members at the University of Southern California. His colleagues were asked to rank the categories on a numerical scale ranging from the most conflictual act, −10.0, to the most cooperative act, +10.0. Even with such a limited sample size and limited range of response, Goldstein was able to tap into key underlying factors, and the scale showed clear face validity when applied to events. The resulting scores could be aggregated across fixed time periods to produce net conflict–cooperation scores, and a veritable cottage industry of analysis sprang into being.

Undoubtedly, one reason for Goldstein's success is that fundamentally, his colleagues were asked to estimate the magnitude of conflict and cooperation vested in various categories of behavior. Numerical estimation of magnitudes has a very strong record in psychological studies and has been shown to be valid across a wide range of fields. Within political science, Milton Lodge and his collaborators explored a number of applications for magnitude estimation in the late 1970s and early 1980s. Nonetheless, Goldstein's arbitrary choice to balance the scale and limit the values on both the cooperative and conflictual sides of the unidimensional scale undermined the full value of the exercise. Indeed, the validity of any underlying scale would be shored up by removing the artificial limits and allowing respondents to assign whatever value seemed most appropriate.5

While the psychophysical magnitude scaling promoted by Lodge and his colleagues never garnered a large following in political science, the techniques themselves continue to experience widespread use in other fields. The limited uptake in political science is most likely due to the difficulty of managing psychophysical magnitude scaling projects and the overall acceptance within the field of the standard Likert scales used in public opinion polling (DeVellis 2012). Sulfaro and Crislip (1997) provide a notable exception in international relations by scaling perceptions of countries' hostility toward the United States.6

The introduction of CAMEO, a refactored and refigured coding scheme from WEIS, has to this point not produced an article or scale comparable to Goldstein's for WEIS. Scaling of CAMEO has been largely an ad hoc transliteration of Goldstein's scale by researchers at the University of Kansas and, more recently, Penn State. Based on the broad success of psychophysical magnitude scaling for estimating the perceived magnitude of stimuli, this paper develops an empirically based scale of conflict and cooperation for CAMEO. In addition, a long-standing question has existed whether or not conflict and cooperation can even be considered to exist on a single continuum. Therefore, we also attempt to answer this question through an extensive psychophysical magnitude scaling survey of students across two universities.

Psychophysical Magnitude Scaling

Sulfaro and Crislip (1997) argue that perceived hostility is a magnitude measure, and one could certainly believe that perceived conflict is also a magnitude measure, as are perceptions of cooperation. Magnitude estimation (ME) provides an established means of measuring the intensity of people's feelings. Indeed, ME has been shown to follow a simple power function, $R = kS^{\beta}$, where $R$ is the magnitude of the response, $k$ is a proportionality constant, $S$ is the stimulus intensity, and $\beta$ is the psychophysical exponent (Lodge, Cross, Tursky and Tanenhaus 1975; Lodge and Tursky 1979).

In most psychophysical experiments, subjects estimate the intensity of a series of known test stimuli. When the geometric mean for each stimulus is taken across respondents and graphed against the known stimulus intensity on log/log coordinates, the points typically fall along a line. Indeed, the “principle governing this relationship is simple and lawful: Equal stimulus ratios produce equal subjective ratios” (Lodge and Tursky 1979:378-379).
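
The log/log relationship can be illustrated with a minimal sketch. The stimuli, the three respondents, and the generating values (k = 2, exponent 0.8) are hypothetical and chosen only so that the result is easy to check; the function names are illustrative and not drawn from any cited study. The sketch pools judgments with the geometric mean and then fits a straight line on log10 coordinates, whose slope estimates the psychophysical exponent.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values, computed through logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def fit_power_law(stimuli, responses):
    """Least-squares fit of log10(R) = log10(k) + beta * log10(S); returns (k, beta)."""
    xs = [math.log10(s) for s in stimuli]
    ys = [math.log10(r) for r in responses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = 10 ** (mean_y - beta * mean_x)
    return k, beta

# Hypothetical data: three respondents who each scale R = 2 * S**0.8 by a
# personal multiplicative bias. The biases (0.5, 1, 2) cancel in the
# geometric mean, which is why that mean is used for pooling.
stimuli = [10.0, 20.0, 40.0, 80.0]
respondent_bias = [0.5, 1.0, 2.0]
judgments = {s: [b * 2.0 * s ** 0.8 for b in respondent_bias] for s in stimuli}

pooled = [geometric_mean(judgments[s]) for s in stimuli]
k, beta = fit_power_law(stimuli, pooled)
print(f"k = {k:.3f}, beta = {beta:.3f}")  # k = 2.000, beta = 0.800
```

Doubling the stimulus multiplies the fitted response by the same factor (2 to the power beta) anywhere along the line, which is the "equal stimulus ratios produce equal subjective ratios" property quoted above.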

Social scientists, however, are seldom interested in people's perceptions of known intensities. Rather, the psychophysical law is used to measure the intensity of people's response to subjective conditions. Psychophysics experiments elicit comparisons between objects based on a metric provided by the reference category. Arguably, the measurements of these comparisons produce ratio-level data allowing for a wide range of analysis techniques not possible with categorical data.

Nonetheless, how can one know if the responses are indeed valid? The results cannot be compared to known stimuli intensities (Lodge and Tursky 1981). Certainly, one can look at general construct validity (Cardello, Schutz, Lesher, and Merrill 2005): Does the rank order of magnitude estimates correspond to the accepted semantic meanings of phrases? Researchers have determined that even subjective conditions can be demonstrated as valid measures of people's perceptions of intensity by matching respondents' estimates across two different psychophysical modalities, for example numerical estimation and line production. This process is known as cross-modality matching and is used extensively as a means of establishing internal validity (for example, Lodge et al. 1975; Lodge and Tursky 1979, 1981; Sulfaro and Crislip 1997; Hofmans, Theuns, Baekelandt, Mairesse, Schillewaert, and Cools 2007).

The issue of whether or not conflict and cooperation exist on a single dimension has been largely swept aside over the last thirty years. Do comment and consult have different meanings depending on their conflictual or cooperative contexts as suggested by McClelland (1983)? Or is Dixon's summary measure of friendliness and hostility (Dixon 1983) a better approach? Both of these studies were published almost thirty years ago, and neither provided a definitive answer.

Should cooperative acts cancel out the value of conflictual acts, or should they be measured independently of one another? When data is scaled and aggregated on a single continuum, the practical effect is that cooperation and conflict do cancel out one another. For example, on the KEDS ad hoc scale, an action in the cue category “improving relations” has a scale value of +3.5, while an act of “structural violence” has a scale value of −7. Do the granting of diplomatic relations (+3.5) and an apology (+3.5) necessarily cancel out the destruction of property (−7) if they occur in the same week?

Psychophysical scaling may help answer these questions. Given the accepted practice in the field of treating the two as if they are on the same continuum, this becomes the null hypothesis to be disproved.

Hypothesis 1: The level of cooperation and conflict in categories of events should be proportional in both a cooperative and conflictual frame.

Constructing a conflict–cooperation scale through psychophysical scaling also allows the introduction of a second embedded experiment. A sizeable body of literature argues that a gender divide exists in relation to conflict (for example, Tessler and Warriner 1997; Caprioli and Boyer 2001; Nincic and Nincic 2002; Caprioli 2005). Therefore, one can reasonably ask whether or not statistically significant differences exist between the genders on perceptions of conflict and cooperation intensities. The null hypothesis below is tested for each of the CAMEO cue categories:

Hypothesis 2: No difference exists between male and female perceptions of the intensity of conflict and cooperation in the CAMEO cue categories.

In summary, scaling the CAMEO cue categories not only provides the scale values themselves but also gives an opportunity to explore two very interesting hypotheses found in the international relations literature. First, the existence of a single conflict–cooperation continuum can be tested, and second, one can test for gender-based differences on perceptions of conflict and cooperation intensity.

Research Design

A psychophysical magnitude scaling survey, 65 pages in length, was administered to students at the University of West Florida in the spring of 2003. Students were offered extra credit for completing the survey. Twenty-nine students completed the survey, six of whom were female. Given the possible differences in how the two genders might perceive different acts of conflict and cooperation,7 the survey was given to an additional 129 students at the University of Memphis in the spring of 2011, who also completed the survey for extra credit. Of these, 63 were female. Therefore, the total sample size is 158, with 88 males, 69 females, and one unidentified respondent. The age of the respondents ranged from 17 to 53, with an average age of 21.98.8 However, as survey respondents, undergraduate students are not the most dependable. An initial eighteen students (10 male and 8 female – 4 from the University of West Florida, 14 from the University of Memphis) left three or more fields blank and are excluded from the subsequent analysis.

The standard approach in conducting psychophysical magnitude experiments in the social sciences is first to ascertain whether or not each respondent is capable of making proportional magnitude estimates. This is done through a calibration phase where both numerical estimation and line production are used to evaluate a series of known stimuli. Frankly, if respondents are unable, or unwilling, to make reasonably accurate estimates of known stimuli, little reason exists to believe that their estimates of the intensities of subjective measures will be meaningful. Lodge and Tursky (1981) assert that about five percent of the population will be unable to grasp the concept of proportionality.

In the calibration phase, respondents were first asked to assign values based on the lengths of a series of lines (with true lengths of 2 mm, 4 mm, 15 mm, 26 mm, 50 mm, 100 mm, 225 mm, and 300 mm) given a specific reference line with a value of 50 (a 50-mm line). Respondents were then asked to draw a series of line lengths based on numerical values given a reference line (50 mm in length) with a value of 50.9 All of the values and lines were given on separate pages to prevent direct comparisons, and the respondents were asked not to look back at previous pages. Moreover, the pages were randomized within each section, so that the ordering of the pages, and thus stimuli, would not systematically bias the results. Given the resulting values, one can determine whether or not the individual respondents appear capable of making proportional magnitude estimates. Proportional judgments are indicated by a strong positive linear relationship between the log (base 10) of the stimulus and the log (base 10) of the response. Following Hofmans et al. (2007), a cutoff value of $r^2 > 0.70$ between the judgments and the stimuli was established for both numerical estimation and line production. On this basis, six respondents were dropped from the remainder of the analysis, leaving an n of 134. 10
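
The calibration screen can be sketched as follows. The line lengths and the 0.70 cutoff come from the text; the two respondents and their answers are hypothetical, and the function names are illustrative. The sketch is one plausible way to apply the rule, not the code used in the study: it computes Pearson's r between the log10 stimulus and log10 response and retains a respondent only if the relationship is positive and r squared exceeds the cutoff.

```python
import math

# True line lengths (mm) used in the calibration phase, as listed in the text.
LINE_LENGTHS_MM = [2, 4, 15, 26, 50, 100, 225, 300]
R2_CUTOFF = 0.70  # cutoff adopted in the text, following Hofmans et al. (2007)

def log_log_r(stimuli, responses):
    """Pearson correlation between log10(stimulus) and log10(response)."""
    xs = [math.log10(s) for s in stimuli]
    ys = [math.log10(r) for r in responses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def is_proportional(stimuli, responses, cutoff=R2_CUTOFF):
    """True when the log/log relationship is positive and r squared exceeds the cutoff."""
    r = log_log_r(stimuli, responses)
    return r > 0 and r * r > cutoff

# Hypothetical numerical estimates from two respondents (reference line = 50).
careful = [3, 5, 15, 25, 50, 90, 200, 280]
careless = [50, 10, 60, 5, 50, 20, 40, 30]

for name, answers in [("careful", careful), ("careless", careless)]:
    r = log_log_r(LINE_LENGTHS_MM, answers)
    kept = is_proportional(LINE_LENGTHS_MM, answers)
    print(f"{name}: r^2 = {r * r:.3f}, retained = {kept}")
```

In the survey the same check would be applied twice per respondent, once for numerical estimation and once for line production, with both required to pass.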

After the calibration section, the remaining four sections of the survey were randomly ordered. These sections were numerical estimation for cooperative CAMEO cue categories, numerical estimation for conflictual CAMEO cue categories, line production for cooperative CAMEO cue categories, and line production for conflictual CAMEO cue categories.11 Consult and comment were included as both cooperative and conflictual CAMEO cue categories in order to test whether or not conflict and cooperation exist on the same dimension. If a single conflict–cooperation continuum exists, then the ratios between CONSULT and COMMENT on a conflict continuum and a cooperation continuum should be statistically indistinguishable. In order to avoid biasing the embedded experiment, neither CONSULT nor COMMENT is used for reference points for the experiments. The reference for cooperation is the cue category APPROVE, and the reference for conflict is the cue category THREATEN.

If the CAMEO cue categories exist on a latent dimension (a single conflict–cooperation continuum) or even on two separate dimensions (separate conflict and cooperation continuums), cross-modality matching should establish the internal validity of the experimental results. As with any survey whose speed is controlled by the respondent, our psychophysics experiment ran the risk of students rushing to complete the survey without making any attempt to engage in proportional judgments. Following Hofmans et al. (2007), a second check was conducted to test whether or not the respondents were trying to sincerely and seriously answer the questions. The internal correspondence between the magnitude estimation and magnitude production for each of the CAMEO cue categories was measured using the correlation coefficient, and only those respondents for whom r was greater than 0.70 were retained. Surprisingly, only 50 of the initial 158 respondents passed the above tests.12

Interestingly, 16 of the 29 students at the University of West Florida made this cut, versus only 34 of 129 at the University of Memphis. Several possible reasons exist for this difference. The University of West Florida is in Pensacola, Florida, an area surrounded by military bases, and the community has a very large active and retired military presence for whom war has a deep and personal impact. Second, the first survey was undertaken during a period of intense national and international debate on whether or not to undertake military action against Iraq, which may have also increased the relevance of the survey. By the time the second survey was conducted, the wars in Iraq and Afghanistan were a given.

Additionally, the undergraduate students surveyed at Memphis were allowed to leave after completing the survey, which appears to have resulted in a large number of respondents caring more about writing an answer than writing their best answer. Finally, the survey was administered by the instructor to the students at the University of West Florida versus a graduate assistant at Memphis. The difference in location, timing, and survey administration between the two universities appears to have dramatically affected the results. Given these differences, I believe that the dramatic reduction in remaining sample size is justified and increases the validity of the study results. The log/log plot for the 50 respondents of numerical estimation and line production for each of the CAMEO cue categories is shown in Figure 1 and shows the expected linear relationship between the two modalities.13

Figure 1. Cross-modality matching for CAMEO cue categories on log-log coordinates

Importantly, psychophysical magnitude scaling suffers from known biases (DeCarlo 2005). However, these biases can be reduced, or even canceled, by taking the geometric mean of magnitude estimation and magnitude production for the same stimuli.14 Therefore, the geometric mean of each remaining respondent's numerical estimation and line production has been calculated for each of the CAMEO cue categories. This should provide the best estimate of each respondent's individual judgments regarding the intensity of cooperative and conflictual acts in CAMEO.
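
As a small illustration of this combination step, the sketch below takes the geometric mean of one respondent's two judgments for each category. The respondent, the values, and the assumption that drawn lines are read in millimetres against a 50-mm reference worth 50 are hypothetical, as is the function name.

```python
import math

def combine_modalities(numeric_estimate, line_length_mm):
    """Geometric mean of one respondent's two judgments of the same stimulus."""
    return math.sqrt(numeric_estimate * line_length_mm)

# Hypothetical respondent who tends to give large numbers but draw short
# lines; the geometric mean lets opposite biases offset each other.
numeric_estimates = {"DEMAND": 40.0, "REJECT": 80.0}
line_lengths_mm = {"DEMAND": 22.5, "REJECT": 45.0}

combined = {
    category: combine_modalities(numeric_estimates[category], line_lengths_mm[category])
    for category in numeric_estimates
}
print(combined)  # {'DEMAND': 30.0, 'REJECT': 60.0}
```

Note that the ratio between the two categories (1:2) is the same in each modality and in the combined value, so the combination preserves the proportional judgments while damping modality-specific bias.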

The resulting values are shown in Table 1 along with the “corrected” standard deviation for the geometric mean (Lodge and Tursky 1981; Sulfaro and Crislip 1997), and the upper and lower bounds for the 95% confidence intervals. Overall, the values show general construct validity (Cardello et al. 2005) in that they correspond to the commonly understood meanings of the terms.

Table 1. Psychophysical Scaling Intensities for CAMEO Cue Categories

CAMEO Category                 Reference  Psychophysical Intensity (a)  95% CI Lower Bound  95% CI Upper Bound

Cooperation
Comment                        Approve    20.371 (16.603)               15.604              25.137
Consult                        Approve    31.133 (27.241)               23.227              39.038
Approve                        Approve    50
Improve Relations (Cooperate)  Approve    72.342 (69.159)               52.271              92.413
Request/Propose                Approve    39.541 (26.866)               31.829              47.254
Agree                          Approve    72.486 (58.077)               55.632              89.340
Provide Aid                    Approve    83.535 (68.267)               63.937              103.134
Yield                          Approve    44.788 (43.531)               32.292              57.285

Conflict
Comment                        Threaten   13.843 (10.243)               10.902              16.783
Consult                        Threaten   10.694 (8.069)                8.299               13.089
Investigate                    Threaten   23.938 (20.861)               17.949              29.926
Demand                         Threaten   38.579 (24.116)               31.656              45.502
Disapprove                     Threaten   22.330 (13.075)               18.577              26.084
Reject                         Threaten   50.267 (39.204)               38.763              61.771
Threaten                       Threaten   50
Civilian Direct Action         Threaten   43.263 (33.694)               33.590              52.936
Military Posture               Threaten   58.056 (37.015)               47.429              68.682
Reduce Relations               Threaten   44.624 (33.124)               35.115              54.134
Structural Violence            Threaten   95.086 (57.635)               78.540              111.632
Conventional Violence (b)      Threaten
Conventional Force             Threaten   118.329 (62.402)              100.414             136.243
Massive Unconventional Force   Threaten   246.271 (207.195)             185.471             307.071

  a. “Corrected” standard deviations for the geometric mean are enclosed in parentheses.
  b. Conventional violence was not included in the survey.

Hypotheses Tests

One of the fundamental questions that this study seeks to address is whether or not conflict and cooperation appear to exist on a single continuum. In order to test H1, the psychophysics experiment was designed to include both CONSULT and COMMENT in both the cooperative and conflictual sets of events. When the two categories are located on a cooperative continuum and measured against the reference category APPROVE, the mean cooperative intensity for COMMENT is 20.4 (s = 16.6) and that of CONSULT is 31.1 (s = 27.2). When located on a conflictual continuum and measured against THREATEN, the mean conflictual intensity for CONSULT is 10.7 (s = 8.1) and that of COMMENT is 13.8 (s = 10.2). A difference in paired ratios test shows that this difference could easily exist by chance (p = .3795). Therefore, H1 cannot be rejected.

Both DeCarlo (2005) and Hofmans et al. (2007) argue that changing the orientation of the scale will change the perceived intensity. Almost astonishingly, changing the orientation of the scale from conflict to cooperation and changing the reference point from THREATEN to APPROVE has essentially no real effect on the perceived intensities. The ratio between COMMENT and CONSULT on the cooperation scale is statistically indistinguishable from the inverse of the ratio on the conflict scale. This only strengthens conclusions in favor of a single conflict–cooperation continuum.

Turning to the second embedded experiment, a large body of literature leads one to expect that differences will exist between male and female respondents regarding perceptions of conflictualness and cooperativeness. The second hypothesis must be tested for each CAMEO cue category. On the basis of the literature, we would expect to reject the null hypothesis that no difference exists between male and female perceptions of the intensity of conflict and cooperation in the CAMEO cue categories. The responses by gender are shown in Table 2.

Table 2. Conflict–Cooperation Intensities by Gender

CAMEO Cue Category             Female (a)       Male (a)         df  Prob.

Cooperation Frame
Comment                        25.61 (21.46)    18.59 (14.10)    24  0.187
Consult                        36.66 (33.56)    29.26 (23.15)    25  0.282
Improve Relations (Cooperate)  64.78 (59.66)    79.94 (76.50)    38  0.297
Request/Propose                48.54 (28.91)    37.83 (26.56)    30  0.176
Agree                          71.76 (51.69)    73.84 (62.50)    39  0.393
Provide Aid                    72.83 (63.94)    85.55 (60.46)    31  0.313
Yield                          49.55 (56.20)    41.18 (35.37)    23  0.336

Conflict Frame
Comment                        13.10 (10.33)    14.25 (10.35)    33  0.369
Consult                        9.88 (8.47)      11.20 (7.84)     31  0.343
Investigate                    24.27 (26.28)    23.76 (13.85)    21  0.393
Demand                         33.54 (27.06)    42.19 (22.47)    28  0.208
Disapprove                     17.14 (15.20)    25.70 (10.52)    24  0.050
Reject                         49.26 (50.50)    50.76 (27.53)    20  0.391
Civilian Direct Action         41.51 (40.86)    46.46 (30.03)    25  0.357
Military Posture               48.95 (41.05)    65.22 (34.47)    28  0.149
Reduce Relations               52.41 (40.69)    42.71 (27.46)    24  0.265
Structural Violence            75.80 (54.89)    102.79 (45.39)   28  0.090
Conventional Force             90.37 (42.28)    139.54 (65.86)   45  0.004
Massive Unconventional Force   178.15 (136.39)  300.27 (245.43)  45  0.039

  a. “Corrected” standard deviations for the geometric mean are enclosed in parentheses.

Existing research has shown that women are far less belligerent than men (Page and Shapiro 1992). This could be the case either because women find the costs of conflict to be excessive or because they view force differently than their male counterparts. The psychophysics experiment suggests the latter, although further research directly addressing this question would be needed to confirm this.

While no difference appears between the genders for the cooperative categories, female perceptions of conflictualness do indeed appear lower than male perceptions. The female respondents in the survey show a remarkably lower perception of conflict intensity. This is the case for the cue categories of DISAPPROVE, USE OF STRUCTURAL VIOLENCE, USE OF CONVENTIONAL FORCE, and USE OF MASSIVE UNCONVENTIONAL FORCE. The differences for these categories are all statistically significant to varying degrees. In the case of massive unconventional force, the female respondents reported an intensity of 178.15 versus 300.27 for males. The score for females is approximately 41 percent lower than for males. Also surprising is the difference between males and females on USE OF CONVENTIONAL FORCE, where women perceived 35 percent lower conflict intensity compared to males.
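
The two percentages quoted above follow directly from the Table 2 means, as this short check shows. The function name is illustrative; the input values are the female and male intensities reported in Table 2.

```python
def percent_lower(female_mean, male_mean):
    """How much lower the female mean is than the male mean, in percent."""
    return (1.0 - female_mean / male_mean) * 100.0

# (female, male) conflict-frame intensities taken from Table 2.
table2_means = {
    "Conventional Force": (90.37, 139.54),
    "Massive Unconventional Force": (178.15, 300.27),
}

gaps = {name: percent_lower(f, m) for name, (f, m) in table2_means.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:.1f} percent lower")
```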

In summary, psychophysical scaling can provide an interesting window onto gender and conflict. Statistically significant differences do exist between the genders on perceptions of conflict. Yet, while these results hopefully contribute to the ongoing discussion regarding gender and violence, I do not believe that they affect how one should use event data.

Creating a Single Continuum Scale

Returning to the combined results shown in Table 1, based on the comment/consult experiment, no statistical reason exists to reject a single conflict–cooperation continuum. While McClelland's (1983) assertion that context determines the nature of comment and consult has face validity, when the two are situated in both conflictual and cooperative frames, the ratio between the two behaviors remains statistically indistinguishable. Therefore, one can align the two scales on a single continuum.15 The resulting values can be placed in a conflict frame, cooperative frame, or a conflict–cooperation frame. Putting each of the cue categories within a conflict frame gives the ratio of conflict intensities across all cue categories.16 These values can be seen in the second column of Table 3.
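
The alignment procedure itself is described in the notes, which are not reproduced in this section. The sketch below is therefore only an inferred reading, offered because it reproduces the printed Continuum Intensities in Table 3 to within rounding: CONSULT, which was measured in both frames, serves as the anchor, and each cooperation-only category is inverted around it, so that more cooperative acts receive lower conflict intensities. The function name and the formula are assumptions, not the author's stated method.

```python
# Table 1 means: CONSULT in each frame, and the cooperation-only categories.
CONSULT_COOPERATION = 31.133
CONSULT_CONFLICT = 10.694

cooperation_frame = {
    "Provide Aid": 83.535,
    "Agree": 72.486,
    "Improve Relations (Cooperate)": 72.342,
    "Approve": 50.0,
    "Yield": 44.788,
    "Request/Propose": 39.541,
}

def to_conflict_frame(cooperation_intensity):
    """Assumed alignment: invert the cooperative ratio to CONSULT, then
    rescale by CONSULT's conflict-frame intensity."""
    return CONSULT_CONFLICT * CONSULT_COOPERATION / cooperation_intensity

aligned = {name: to_conflict_frame(v) for name, v in cooperation_frame.items()}

# Continuum Intensities as printed in Table 3, for comparison.
printed = {
    "Provide Aid": 3.985,
    "Agree": 4.593,
    "Improve Relations (Cooperate)": 4.602,
    "Approve": 6.659,
    "Yield": 7.433,
    "Request/Propose": 8.420,
}
for name in cooperation_frame:
    print(f"{name}: computed {aligned[name]:.3f}, printed {printed[name]:.3f}")
```

Under this reading, COMMENT and CONSULT keep their directly measured conflict-frame intensities (13.843 and 10.694), and the conflict-only categories are carried over unchanged from Table 1.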

Table 3. Scaling CAMEO on a Conflict–Cooperation Continuum

CAMEO Cue Category             KEDS ad hoc Scale Values  Continuum Intensities  Psychophysical Scale Values

Provide Aid                    7                         3.985                  2.68
Agree                          6                         4.593                  2.33
Improve Relations (Cooperate)  3.5                       4.602                  2.32
Approve                        4                         6.659                  1.61
Yield                          5                         7.433                  1.44
Request/Propose                3                         8.420                  1.27
Consult                        3                         10.694                 1
Comment                        0                         13.843                 0
Disapprove                     −2                        22.330                 −1.25
Investigate                    −2                        23.938                 −1.34
Demand                         −5                        38.579                 −2.15
Civilian Direct Action         −6.5                      43.263                 −2.41
Reduce Relations               −4                        44.624                 −2.49
Threaten                       −6                        50                     −2.79
Reject                         −4                        50.267                 −2.81
Military Posture               −7.2                      58.056                 −3.24
Structural Violence            −7                        95.086                 −5.31
Conventional Violence (a)      −9                                               −6.17
Conventional Force             −10                       118.329                −6.60
Massive Unconventional Force   −10                       246.271                −13.74

  a. Conventional violence was not included in the survey; the scale value shown is a linear interpolation based on the ad hoc Kansas Event Data System (KEDS) scale.

By putting the cue categories in a conflict–cooperation frame, valences can be associated with the resulting intensity scores (for example, Cardello et al. 2005). However, unlike Cardello et al. (2005) and previous scale design for WEIS (Goldstein 1992), I make no attempt to create symmetry in scale values. The category COMMENT appears to be the most neutral category based on the survey results and has been chosen as the midpoint in previous conflict–cooperation scales. Therefore, I have set the scale value for COMMENT at 0. In order to then proportionally set the remaining values in terms of conflict and cooperation, a single additional point must be fixed. Accordingly, CONSULT has been set at positive 1. The remainder of the scale values has then been calculated to maintain the ratios between categories, with cooperative actions assigned positive values and conflictual actions assigned negative values.17 These results are shown in Table 3.
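The construction of the signed scale can be illustrated with a short Python sketch. It is a reconstruction under stated assumptions rather than the original computation: the function names are hypothetical, the constants are the CONSULT and COMMENT intensities quoted in Table 3 and note 17, and the conflict-side rule (intensity relative to COMMENT, rescaled by CONSULT/COMMENT) is the form that reproduces the published psychophysical values to two decimals.

```python
# Intensities quoted in Table 3 and note 17 (assumed inputs for this sketch).
CONSULT_CONFLICT = 10.694      # CONSULT intensity, conflict frame
CONSULT_COOPERATION = 26.369   # CONSULT intensity, cooperation frame
COMMENT = 13.843               # COMMENT intensity, conflict frame


def cooperative_value(coop_intensity: float) -> float:
    """Positive scale value: ratio to CONSULT, which is fixed at +1."""
    return coop_intensity / CONSULT_COOPERATION


def conflictual_value(conflict_intensity: float) -> float:
    """Negative scale value: ratio to COMMENT, rescaled by CONSULT/COMMENT.

    COMMENT itself is fixed at 0 by definition, not by this formula.
    """
    return -(conflict_intensity / COMMENT) * (CONSULT_CONFLICT / COMMENT)


scale = {
    "YIELD": cooperative_value(37.935),
    "INVESTIGATE": conflictual_value(23.938),
    "DISAPPROVE": conflictual_value(22.330),
    "MASSIVE UNCONVENTIONAL FORCE": conflictual_value(246.271),
}

for category, value in scale.items():
    print(f"{category}: {value:.2f}")
```

Because every value on one side of zero is a fixed multiple of its intensity, the ratios between categories within the cooperative side and within the conflictual side are preserved, which is the property the scale is designed to keep.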

After CAMEO appeared, people repeatedly requested scale values from the KEDS project that would be comparable to Goldstein's scale values. Philip Schrodt transliterated the Goldstein scale for CAMEO, and Table 3 shows the resulting values alongside the psychophysics scale values.18 Overall, the two scales show a reasonable, and unsurprising, correspondence, with a Pearson's r of 0.847. Both scales show face validity. However, some key differences deserve attention. The continuum for the ad hoc scale ranges from +7 to −10, while the psychophysics scale ranges from +2.68 to −13.74.

When the same series of randomly distributed events is scaled and aggregated, the KEDS-developed scale will produce values showing a greater level of cooperation than the psychophysics scale (see Table 4). Cooperative events have much greater power in the KEDS scheme to cancel out the impact of conflict, even the use of massive unconventional force. In the KEDS scale, three events that map to providing aid (3 × 7 = 21) would outweigh two uses of massive unconventional force (2 × 10 = 20). I doubt that the leadership of a country that had been the target of two massive unconventional attacks followed by three offers of assistance would consider such a relationship cooperative. In the psychophysics scale, more than five events mapping to providing aid (2.68 each) would be required to equal one use of massive unconventional force (13.74). While still bordering on the absurd, this is more reflective of reality. In almost every case, cooperative events carry roughly two and a half to three and a half times the weight in the KEDS scale that they carry in the psychophysics scale.
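The arithmetic behind this comparison is easy to check. The following Python sketch aggregates the hypothetical event sequence discussed above (two uses of massive unconventional force followed by three offers of aid) under both scales; the dictionary keys and the function name are illustrative, and the scale values are taken from Table 3.

```python
# Scale values for two CAMEO cue categories, taken from Table 3.
KEDS = {"provide_aid": 7.0, "massive_unconventional_force": -10.0}
PSYCH = {"provide_aid": 2.68, "massive_unconventional_force": -13.74}


def net_score(events, scale):
    """Sum the scale values of a sequence of events (net conflict-cooperation)."""
    return sum(scale[event] for event in events)


# Hypothetical sequence: two massive attacks, then three offers of aid.
events = ["massive_unconventional_force"] * 2 + ["provide_aid"] * 3

keds_net = net_score(events, KEDS)     # 3 * 7 - 2 * 10
psych_net = net_score(events, PSYCH)   # 3 * 2.68 - 2 * 13.74

# Number of aid events needed to offset one massive attack (psychophysics scale).
aid_per_attack = abs(PSYCH["massive_unconventional_force"]) / PSYCH["provide_aid"]

print(f"KEDS net: {keds_net:+.2f}")
print(f"Psychophysics net: {psych_net:+.2f}")
print(f"Aid events per attack: {aid_per_attack:.2f}")
```

The same sequence nets out slightly cooperative under the KEDS values and clearly conflictual under the psychophysics values.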

Table 4. Comparing Net Conflict–Cooperation Monthly Scores for Key Directed Dyads from the KEDS Project CAMEO Data set, April 1979–November 2011

Directed Dyad   Mean        Std. Dev.   Min         Max        Max Conflict as % of Range   Max Cooperation as % of Range

Kansas Event Data System (KEDS) Ad Hoc Scale
ISR→PSE         −101.701    180.0184    −1341.9     91         93.65%                       6.35%
PSE→ISR         −46.71148   84.25725    −806        102        88.77%                       11.23%
PSE→USA         6.665816    15.28579    −25         92.5       21.28%                       78.72%
USA→PSE         8.105102    17.89989    −59         124.8      32.1%                        67.9%
ISR→USA         16.53265    23.31069    −53.7       153.5      25.92%                       74.08%
USA→ISR         20.11964    22.96291    −30         127        19.11%                       80.89%

Psychophysical Magnitude Scale
ISR→PSE         −76.88814   125.004     −917.5185   30.25      96.81%                       3.19%
PSE→ISR         −36.66758   58.52931    −548.7402   43         92.73%                       7.27%
PSE→USA         1.635204    6.033149    −20.91      27.73      42.99%                       57.01%
USA→PSE         2.296658    7.327008    −40.66      37.08      52.3%                        47.7%
ISR→USA         4.105026    9.921166    −39.47      54.72001   41.90%                       58.1%
USA→ISR         6.006071    9.127523    −28.86      46.3       38.4%                        61.6%

n = 392; ISR = Israel, PSE = Palestinians, USA = United States.

On the other hand, in the KEDS scale, most of the conflictual events up to the level of use of conventional force have scale values that are approximately double those found in the psychophysics scale. This will somewhat balance out the difference between the scales, but given these differences, the psychophysics scale will show a net shift toward conflict compared with the KEDS scale for most data series. These expectations are borne out by the actual numbers generated by scaling and aggregating events from the Middle East.

Table 4 shows descriptive statistics for the scaled and aggregated values from six different directed dyads from the Levant data set generated by the Penn State Event Data Project covering April 1979 through November 2011. Clearly, the actual values will differ for the two scales, but a reasonable question is whether they differ only by a scaling factor or whether the difference represents a qualitative shift in reported behavior. For all six directed dyads, the psychophysics scale shows the full range of behavior shifting toward conflict compared with the KEDS ad hoc scale. If one divides the absolute value of the minimum reported monthly score (which is the maximum level of conflict) by the range of monthly totals for the dyad, one can compare the qualitative behaviors captured by both scales. The objective is to determine how much of the range lies on the conflictual end of the spectrum versus the cooperative end. While the Israeli-Palestinian dyad shifts the range of behavior only 3 to 4 percentage points toward conflict, the Palestinian-USA dyad shifts dramatically, from 21 and 32 percent of the range (PSE→USA and USA→PSE) lying on the conflictual side of the continuum to 43 and 52 percent, respectively. This represents a dramatic shift in perceived levels of conflict and cooperation. Using the new psychophysical scale, the Palestinian-USA dyad shows almost double the conflict in aggregated monthly totals compared with scaling by the KEDS ad hoc values.
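The share-of-range measure described above can be written out explicitly. The Python sketch below is illustrative only: the function name is hypothetical, it assumes the minimum monthly score is negative and the maximum is positive (as is the case for every dyad in Table 4), and the inputs are the PSE→USA minimum and maximum monthly scores reported in Table 4.

```python
def conflict_share(min_score: float, max_score: float) -> float:
    """Fraction of the observed range that lies on the conflictual side.

    Assumes min_score < 0 < max_score, so |min| is the maximum monthly
    level of conflict and (max - min) is the full range of monthly totals.
    """
    return abs(min_score) / (max_score - min_score)


# PSE->USA minimum and maximum monthly scores from Table 4.
keds_share = conflict_share(-25.0, 92.5)       # KEDS ad hoc scale
psych_share = conflict_share(-20.91, 27.73)    # psychophysical scale

print(f"KEDS: {100 * keds_share:.2f}% conflict, {100 * (1 - keds_share):.2f}% cooperation")
print(f"Psychophysics: {100 * psych_share:.2f}% conflict, {100 * (1 - psych_share):.2f}% cooperation")
```

The cooperative share is simply the complement, which is why the last two columns of Table 4 sum to 100 percent for each dyad.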

These shifts happen even though the two series have very high Pearson's correlation coefficients, which are shown in Table 5. While the Israeli-Palestinian dyads have a very high correspondence across the two scaling schemes (r = 0.9957 and r = 0.9917), the remaining values show a small drop-off. The four dyads involving the USA all show a Pearson's r between 0.94 and 0.95 between the ad hoc-scaled monthly totals and the psychophysics-scaled monthly totals.

Table 5. Comparing Kansas Event Data System (KEDS) Ad Hoc Scale Values for CAMEO with the Psychophysical Scale Values for Key Directed Dyads from the KEDS Project CAMEO Data set, April 1979–November 2011

Directed Dyad   Ratio (a)   Std. Err.    [95% Conf. Interval]    Correlation (b)
ISR→PSE         1.322714    0.0142294    1.294738    1.35069     0.9957
PSE→ISR         1.273918    0.0194275    1.235722    1.312113    0.9917
PSE→USA         4.076443    0.3518112    3.384765    4.768121    0.9427
USA→PSE         3.529085    0.2355777    0.2355777   3.992242    0.9444
ISR→USA         4.027418    0.2376468    3.560192    4.494643    0.9485
USA→ISR         3.349884    0.0974083    3.158375    3.541394    0.9457

n = 392; ISR = Israel, PSE = Palestinians, USA = United States.
(a) KEDS/PSYCH.
(b) All values significant at the p < .001 level.

The ratios between the two scales for each of the six directed dyads also show remarkably different behaviors. One can conceptualize the conflict–cooperation scores as measuring the perceived intensity of behaviors and then examine how these perceived intensities change in relation to the scale and dyad. While the ratios of the values given by the two scales for the Israeli-Palestinian dyads (KEDS monthly totals/psychophysical monthly totals) are approximately 1.3, the ratios for the other dyads range from roughly 3.3 to 4.1. The scaled and aggregated values for the Palestinian→USA directed dyad are roughly four times larger in magnitude under the ad hoc scale than under the psychophysics scale. These results clearly suggest that the differences between the two scales will produce radically different summary measures of behavior depending on the nature of the dyad in question.
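As a rough check on these ratios, one can divide the mean monthly scores reported in Table 4. The sketch below assumes, and the text does not state this, that each ratio in Table 5 is the quotient of the KEDS mean and the psychophysical mean for the dyad; the dictionary layout is illustrative, and the "->" in the keys stands in for the arrow used in the tables. Under that assumption the quotients agree with the Table 5 ratios to about four decimal places.

```python
# Mean monthly net scores from Table 4: (KEDS ad hoc, psychophysical).
MEANS = {
    "ISR->PSE": (-101.701, -76.88814),
    "PSE->ISR": (-46.71148, -36.66758),
    "PSE->USA": (6.665816, 1.635204),
    "USA->PSE": (8.105102, 2.296658),
    "ISR->USA": (16.53265, 4.105026),
    "USA->ISR": (20.11964, 6.006071),
}

# Assumed definition: ratio = KEDS mean / psychophysical mean.
ratios = {dyad: keds / psych for dyad, (keds, psych) in MEANS.items()}

for dyad, ratio in ratios.items():
    print(f"{dyad}: {ratio:.4f}")
```

If the two scales differed only by a constant scaling factor, these ratios would be the same for every dyad; the spread between the Israeli-Palestinian dyads and the dyads involving the USA is what signals a qualitative difference.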

On a final note, the scale value for massive unconventional force deserves further discussion. The value itself may still be too small (surely two conventional military attacks do not equal the use of massive unconventional force), but the psychophysics scale produces a value with greater face validity than the ad hoc KEDS scale, which equates the two types of events. Problematically from the perspective of someone scaling the behavior types, uses of massive unconventional force have been very uncommon, leaving respondents unsure of their intensity perceptions. This is apparent in the standard deviation for massive unconventional force, which is three times larger than that for any other conflictual event.

Conclusion

In summary, this study is the first to apply psychophysical magnitude scaling to developing a conflict–cooperation score for event data. As such, I have successfully demonstrated that respondents can make proportional judgments on the differing levels of conflict and cooperation in a series of verb categories. Moreover, I have successfully developed and demonstrated a series of scale values capturing the perceived intensities of conflict and cooperation embedded in the CAMEO cue categories.

The resulting scale is a substantial improvement over the previously developed ad hoc CAMEO scale based on similarities with the Goldstein scale. First, each category was assessed in terms of its relative level of conflict or cooperation based on separate reference categories. Second, the scale was developed specifically for CAMEO and tested respondents' perceptions of those categories, rather than matching similar categories from WEIS. Third, the scale has been developed using a transparent, systematic, and scientific approach.

In addition, this analysis demonstrates that conflict and cooperation have virtually identical differences in proportional intensity regardless of whether the two are being assessed in a conflict frame or a cooperation frame. Given that the existing literature argues that a change in frame and reference should alter perceptions of intensity, these results can be considered quite strong. People do perceive both cooperation and conflict on the same continuum.

Finally, the results of the second embedded experiment support previous findings that a gender divide exists on the use of force. Male and female respondents showed significant divergence on perceptions of the conflictualness of each of the major categories for using force. These results may help provide one additional piece to the puzzle for how males and females approach the use of force.

In conclusion, while scaling events and aggregating those across time periods to produce net conflict–cooperation scores is one approach to event data, other approaches exist as well. For example, one can also use unweighted event counts or even treat events as symbol sequences. Both of these are perfectly valid choices and are well represented in the field. Whether one is using event data or not, the research question should dictate how the data is analyzed rather than the reverse.

Notes
  1. 1
  2. 2

    “CAMEO combined ambiguous WEIS categories and greatly expanded the level of detail on acts of violence, but contains excessive detail on events related to mediation, the original focus of the taxonomy” (Schrodt 2012, 24).

  3. 3

    The Intranational Political Interactions project codes the efforts of domestic actors and is therefore not discussed further.

  4. 4

    Schrodt (2007) is a notable exception to the “standard” approach of relying on external scaling of events. Schrodt uses item response theory to scale the events based on their relative frequencies. While an interesting exercise, this approach was driven as much by a desire to “normalize” event streams from different sources as by an attempt to “scale” the events themselves.

  5. 5

    This is still assuming conflictual acts would have a negative valence, while cooperative acts would have a positive valence.

  6. 6

    I am not asserting that such magnitude estimation studies have not occurred at all, but rather, they have been very limited in number.

  7. 7

    Schaeffer and Bradburn (1989) find minimal differences between the genders on their ability to make proportional magnitude estimates. Therefore, any statistically significant differences between the genders in estimating conflict–cooperation intensities are likely to reflect real differences in perceptions.

  8. 8

    The level of education beyond a high school diploma is not a statistically significant factor in whether or not one can make proportional magnitude judgments (Schaeffer and Bradburn 1989). Given that all respondents are college students, education is not considered as a factor in the analysis of the results.

  9. 9

    The specific instructions were: All you need do is draw a horizontal line for each value. The bigger a number appears to be compared to your reference number, the longer the line you will draw compared to the reference line. For example, if one of the numbers is 100, then you would draw a line that seems about two times longer than the reference line. If the number is 500, then you would draw a line ten times longer. On the other hand, some of the numbers are smaller than 50. If the number is 25, then you would draw a line about one half as long. A line one-tenth as long would be drawn for the number 5. Draw a line for each number, whatever length seems appropriate, to express how the number compares to your reference number: the bigger the number, the longer your line compared to the reference line. The smaller the number, the shorter the line compared to the reference line. Once you begin, please do not check your previous lines. We are only interested in your general impressions.

  10. 10

    Four respondents struggled with line production, and two could not satisfactorily complete the numerical estimation part of the calibration. Interestingly, no respondent failed both parts of the calibration.

  11. 11

    The CAMEO category conventional violence was unintentionally left out of the original survey.

  12. 12

    Forty-seven of the remaining 50 respondents were from the United States. The non-US respondents were Canadian, German, and Serbian. The survey did not ask whether the respondent was a native English speaker.

  13. 13

    When regressing the log of line production against the log of the numerical estimates, the resulting coefficient for line production is 0.802 (std. error = 0.023). Theoretically, this value should be statistically indistinguishable from a value of 1. Nonetheless, this is similar to the findings of Sulfaro and Crislip (1997) on perceived hostility; both studies find some slippage between line production and numerical estimation. The goal of cross-modality matching is to establish the internal validity of the experiment, and I believe that these results still show a strong enough linear relationship to demonstrate internal validity.

  14. 14

    If the bias in both magnitude estimation (b) and magnitude production (b') is symmetric, then the bias cancels. Otherwise, the estimated psychophysical exponent is affected by a factor of (b/b')^(1/2) (DeCarlo 2005, 889).

  15. 15

    In order to maintain the same ratios when mapping these intensity values to a single continuum, the intensity of the cooperation cue categories must be multiplied by a factor of 0.847 in order to align correctly with the conflict scale. For example, provide aid (83.535) becomes 83.535*0.847 = 70.7535. This is repeated for agree, improve relations, approve, request/propose, and yield. The unmodified values for consult and comment are used from the conflict frame.

  16. 16

    Maintaining the ratios between each cue category is the key in manipulating the values. Even though the conflict frame values have been used for CONSULT and COMMENT, the ratio between YIELD and CONSULT should remain the same. Therefore, to calculate the conflict intensity of YIELD using CONSULT (from the conflict frame) as a base value, one takes the value of CONSULT-conflict (10.694) * CONSULT-cooperation (26.369)/YIELD (37.935), which equals 7.433. This maintains the correct ratio of the cooperative frame while translating the values to the conflict frame.

  17. 17

    Placing the cue categories within a conflict–cooperation frame changes the transformation of values slightly from the procedure of placing the categories in a conflict frame. Each side of zero now represents either conflict or cooperation. By choice based on previous research, the value of COMMENT has been set to zero, and the value of CONSULT has been set to positive one, thereby placing CONSULT in cooperative behavior. Therefore, maintaining the proper ratio of YIELD to CONSULT requires CONSULT (1) * YIELD (37.935)/CONSULT-cooperation (26.369), which equals 1.4386. The translation of ratios created by setting the distance between CONSULT and COMMENT to one on the new scale requires that each value for conflictual behavior be rescaled by a value of CONSULT (10.694)/COMMENT (13.843). This is necessary to maintain the proper ratio of intensities. For example, INVESTIGATE becomes CONSULT (10.694)/COMMENT (13.843) * INVESTIGATE (23.938)/COMMENT (13.843) * −1, which equals −1.3358. Similarly, DISAPPROVE becomes CONSULT (10.694)/COMMENT (13.843) * DISAPPROVE (22.33)/COMMENT (13.843) * −1, which equals −1.2462.

  18. 18

References

  • Azar, Edward E. (1980) The Conflict and Peace Data Bank (COPDAB) Project. Journal of Conflict Resolution 24: 143–152.
  • Azar, Edward E. (1982) The Codebook of the Conflict and Peace Data Bank (COPDAB). College Park: Center for International Development, University of Maryland.
  • Azar, Edward E., and Thomas Sloan (1975) Dimensions of Interaction. Pittsburgh: University Center for International Studies, University of Pittsburgh.
  • Bond, Doug, Joe Bond, Oh Churl, J. Craig Jenkins, and Charles L. Taylor (2003) Integrated Data for Events Analysis (IDEA): An Event Typology for Automated Events Data Development. Journal of Peace Research 40(6): 733–745.
  • Caprioli, M. (2005) Primed for Violence: The Role of Gender Inequality in Predicting Internal Conflict. International Studies Quarterly 49.2 (June): 161–178.
  • Caprioli, M., and Mark A. Boyer (2001) Gender, Violence, and International Crisis. Journal of Conflict Resolution 45(4): 503–518.
  • Cardello, Armand V., Howard G. Schutz, Larry L. Lesher, and Ellen Merrill (2005) Research Review: Development and Testing of a Labeled Magnitude Scale of Perceived Satiety. Appetite 44: 1–13.
  • DeCarlo, Lawrence T. (2005) On Bias in Magnitude Scaling and Some Conjectures of Stevens. Perception & Psychophysics 67(5): 886–896.
  • DeVellis, Robert F. (2012) Scale Development: Theory and Applications, 3rd edn. Thousand Oaks, CA: Sage.
  • Dixon, William J. (1983) Measuring Interstate Affect. American Journal of Political Science 27(4): 828–851.
  • Gerner, Deborah J., Philip A. Schrodt, and Ömür Yilmaz (2009) Conflict and Mediation Event Observations (CAMEO) Codebook. Available at http://eventdata.psu.edu/data.dir/cameo.html. (Accessed April 30, 2012.)
  • Goldstein, Joshua S. (1992) A Conflict-Cooperation Scale for WEIS Events Data. Journal of Conflict Resolution 36: 369–385.
  • Hofmans, Joeri, Peter Theuns, Sven Baekelandt, Olivier Mairesse, Niels Schillewaert, and Walentina Cools (2007) Bias and Changes in Perceived Intensity of Verbal Qualifiers Effected by Scale Orientation. Survey Research Methods 1(2): 97–108.
  • Lodge, Milton, David V. Cross, Bernard Tursky, and Joseph Tanenhaus (1975) American Journal of Political Science 19.4 (November): 611–649.
  • Lodge, Milton, and Bernard Tursky (1979) Comparisons Between Category and Magnitude Scaling of Political Opinion Employing SRC/CPS Items. The American Political Science Review 73.1 (March): 50–66.
  • Lodge, Milton, and Bernard Tursky (1981) On the Magnitude Scaling of Public Opinion in Survey Research. American Journal of Political Science 25.2 (May): 376–419.
  • McClelland, Charles (1976) World Event/Interaction Survey Codebook (ICPSR 5211). Ann Arbor, MI: Inter-University Consortium for Political and Social Research.
  • McClelland, Charles (1983) Let the User Beware. International Studies Quarterly 27: 169–177.
  • Nincic, Miroslav, and Donna J. Nincic (2002) Race, Gender, and War. Journal of Peace Research 39.5 (September): 547–568.
  • Page, Benjamin, and Robert Y. Shapiro (1992) The Rational Public: Fifty Years of Trends in Americans' Policy Preferences. Chicago: University of Chicago Press.
  • Schaeffer, Nora Cate, and Norman M. Bradburn (1989) Respondent Behavior in Magnitude Estimation. Journal of the American Statistical Association 84.406 (June): 402–413.
  • Schrodt, Philip A. (2007) Inductive Event Data Scaling Using Item Response Theory. Paper prepared for Summer Meeting of the Society for Political Methodology, Pennsylvania State University, 18–20 July.
  • Schrodt, Philip A. (2012) Precedents, Progress and Prospects in Political Event Data. International Interactions 38: 546–569.
  • Shellman, Stephen (2004) Measuring the Intensity of International Political Events Data: Two Interval-Like Scales. International Interactions 30: 109–141.
  • Sulfaro, Valerie A., and Mark N. Crislip (1997) How Americans Perceive Foreign Policy Threats: A Magnitude Scaling Analysis. Political Psychology 18.1 (March): 103–126.
  • Tessler, Mark, and Ina Warriner (1997) Gender, Feminism, and Attitudes toward International Conflict: Exploring Relationships with Survey Data from the Middle East. World Politics 49.2 (January): 250–281.
  • Vincent, Jack E. (1979) Project Theory: Interpretations and Policy Relevance. Lanham, MD: University Press of America.