Keywords:

  • Rating scales;
  • item response theory models;
  • clinimetrics;
  • pharmacopsychometric triangle

Abstract

  1. Top of page
  2. Abstract
  3. Clinical recommendations
  4. Additional comments
  5. Introduction
  6. Material and methods
  7. Results
  8. Discussion
  9. Acknowledgements
  10. References

Objective:  To consider applied psychometrics in psychiatry as a discipline focusing on pharmacopsychology rather than psychopharmacology as illustrated by the pharmacopsychometric triangle.

Method:  The pharmacopsychological dimensions of clinically valid effects of drugs (antianxiety, antidepressive, antimanic, and antipsychotic), of clinically unwanted effects of these drugs, and the patients’ own subjective perception of the balance between wanted and unwanted effects are analysed using rating scales assessed by modern psychometric tests (item response theory models).

Results:  Symptom rating scales fulfilling the item response theory models have been shown to be psychometrically valid outcome scales as their total scores are sufficient statistics for demonstrating dose–response relationship within the various classes of antianxiety, antidepressive, antimanic or antipsychotic drugs. The total scores of side-effect rating scales are, however, not sufficient statistics, implying that each symptom has to be analysed individually. Self-rating scales with very few items appear to be sufficient statistics when measuring the patients’ own perception of quality of life.

Conclusion:  Applied psychometrics in psychiatry have been found to cover a pharmacopsychometric triangle illustrating the measurements of wanted and unwanted effects of pharmacotherapeutic drugs as well as health-related quality of life.


Clinical recommendations

  •  In dose–response trials of psychotherapeutic drugs symptom rating scales fulfilling the modern psychometric models are needed (total score a sufficient statistic).
  •  To evaluate side-effects of pharmacotherapeutic drugs the individual items have to be analysed separately.
  •  The pharmacopsychometric triangle includes the wanted and unwanted effects of pharmacotherapeutic drugs as well as patient-perceived quality of life.

Additional comments

  •  Modern psychometrics is the testing of perceptual knowledge (about the severity of clinical states) by item response theory models.
  •  Classical psychometrics such as factor analysis should be restricted to psychiatric issues still difficult to describe in clearly formulated symptoms.

Introduction


In trials with psychotherapeutic drugs we have to evaluate equally the clinical effects on the target symptoms for which the drug has been approved by the regulatory authorities, the side-effects or unwanted clinical effects, and finally the perceived well-being as assessed by the patients themselves during the treatment period.

The aim of this study was to evaluate the psychometric properties of the rating scales developed to evaluate the pharmacopsychometric triangle covering: the target symptoms specific for the drug under examination, the possible side-effects, and the subjective well-being as perceived by the patients themselves.

Material and methods


Rating scales

Figure 1 shows the pharmacopsychometric triangle in which A denotes the rating scales (1) covering the clinically wanted effects of psychotherapeutic drugs: the Hamilton Anxiety Scale (HAM-A) for antianxiety drugs; the Hamilton Depression Scale (HAM-D) for antidepressive drugs; the Bech-Rafaelsen Mania Scale (MAS) for antimanic drugs; and the Brief Psychiatric Rating Scale (BPRS) for antipsychotic drugs.


Figure 1.  The psychometric triangle: A = scales measuring wanted effects, B = scales measuring unwanted effects, and C = scales measuring health-related quality of life.


In Fig. 1, B denotes the rating scales covering the clinically unwanted effects of psychotherapeutic drugs, namely the UKU Side Effect Rating Scale which covers the potential side-effects of the antianxiety, antidepressant, antimanic, and antipsychotic drugs (2, 3).

In Fig. 1, C denotes the health-related quality of life assessment or self-perceived psychological well-being as evaluated by the patients when making a balanced assessment of the wanted vs. unwanted effects of the psychotherapeutic drugs (the WHO-5) (4–6).

Psychometric analysis

In the psychometric literature (Fig. 2), we refer to the classical tests (factor analysis and Cronbach’s α coefficient) and to the modern tests (item response theory models).
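As an illustration of the classical approach, Cronbach’s α can be computed from the item variances and the variance of the total score. The following is a minimal sketch, assuming complete data and sample variances; the function name and the ratings are hypothetical and are not taken from any of the studies cited here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-patient lists of item scores."""
    k = len(scores[0])  # number of items in the scale
    # Sample variance of each item across patients
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total score across patients
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: four patients, three items (illustrative only)
ratings = [[2, 3, 3], [1, 1, 2], [4, 3, 4], [0, 1, 1]]
alpha = cronbach_alpha(ratings)
```

A high α shows only that the items covary; it does not by itself show that the total score is a sufficient statistic.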


Figure 2.  A comparison between classical and modern psychometrics with reference to high vs. low degree of redundancy.


The classical tests were accepted by Hamilton when he introduced the HAM-A and HAM-D as outcome measurements in randomized clinical trials of antianxiety or antidepressant drugs respectively (7). If the items of a scale include overlapping symptoms because they are incompletely described, the classical psychometric tests need relatively many items to describe such clinical effects as the antianxiety or antidepressive effect.

In the classical psychometric tests the term redundancy is used as in the field of information theory to express the degree to which the message of an item is only a hint because it is difficult to describe the symptoms captured by this item (8). In the modern psychometric test the term local dependency between items is used to express the degree to which they overlap. For example, in the Montgomery Åsberg Depression Rating Scale (MADRS) (9), the items of apparent and reported mood overlap to a great degree. As the total score is consequently an insufficient statistic (Fig. 2), the outcome of the factor analysis is to use the factor scores (10, 11).

Therefore, the modern psychometric tests (item response theory models) are attempts to exclude redundancy or local dependence between items. Accordingly each item needs to give innovative information about the dimension being measured. To fulfil the item response theory models implies that total score of the items in the rating scale is a sufficient statistic (12–14).
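The link between the model and the sufficiency of the total score can be made explicit for the simplest case. The derivation below is a sketch that assumes the dichotomous Rasch model (each item scored 0 or 1) and local independence between items; the rating scales discussed here use graded item scores, for which polytomous extensions of the model apply. The symbols (severity of person v as θ_v, location of item i as β_i, total score r_v) are notation introduced for this illustration.

```latex
% Dichotomous Rasch model: person v with severity \theta_v, item i with location \beta_i
P(X_{vi}=1 \mid \theta_v, \beta_i) = \frac{\exp(\theta_v-\beta_i)}{1+\exp(\theta_v-\beta_i)}

% Probability of a full response pattern x_v = (x_{v1},\dots,x_{vk}) under local independence
P(x_v \mid \theta_v)
  = \prod_{i=1}^{k} \frac{\exp\bigl(x_{vi}(\theta_v-\beta_i)\bigr)}{1+\exp(\theta_v-\beta_i)}
  = \frac{\exp\bigl(\theta_v r_v - \sum_{i=1}^{k} x_{vi}\beta_i\bigr)}
         {\prod_{i=1}^{k}\bigl(1+\exp(\theta_v-\beta_i)\bigr)},
\qquad r_v = \sum_{i=1}^{k} x_{vi}
```

The responses enter the part of the expression that depends on θ_v only through the total score r_v. Given r_v, the particular pattern of item responses therefore carries no further information about the severity of the patient, which is what is meant by the total score being a sufficient statistic.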

Within the pharmacopsychometric triangle it was hypothesized that the rating scales measuring the clinically wanted effects (antianxiety, antidepressive, antimanic, or antipsychotic) should fulfil the item response theory model. For the rating scale measuring clinically unwanted effects, the hypothesis was that the item response theory model should not be fulfilled because, according to the WHO classification of side-effects (15), each item belongs to a quite different organ class.

Concerning the quality of life scales it was hypothesized that the total score of the positively formulated items should be a sufficient statistic.

Results


Rating scales covering the clinical effects of psychotherapeutic drugs

Antipsychotic scales. Figure 3 shows the subscales of the most widely used rating scales developed to measure the wanted effects of the psychotherapeutic drugs.


Figure 3.  Rating scales sensitive to measure outcome of pharmacotherapeutic drugs. In brackets [ ] the original item numbering is indicated.


The BPRS (16, 17) was originally developed to include symptoms responsive to treatment with antipsychotic drugs (e.g. chlorpromazine) or antidepressants (e.g. imipramine). In the original study (17), which used factor analysis to measure the severity of schizophrenic states, such BPRS items as ‘blunted affect’ or ‘unusual thought content’ (Fig. 2) were not identified statistically but were added to fulfil the clinical validity of the scale (17). In the much expanded version of the BPRS, the Positive and Negative Syndrome Scale (PANSS), the item of ‘blunted affect’ is considered a negative symptom of schizophrenia (18), while the item of ‘unusual thought content’ is included in the positive symptom of general delusions. By use of item response theory models (Rasch analysis) we have shown that the antipsychotic subscale of the BPRS (Fig. 3) is a unidimensional measure of antipsychotic activity, i.e. the total score is a sufficient statistic (19, 20). This has been confirmed by Hafkenscheid (21, 22).

Antimanic scales.  The MAS has been found to fulfil the Rasch model, implying that the total score is a sufficient statistic for measuring severity of manic states (23). Using the nonparametric item response theory model developed by Mokken it was found that the MAS, in contrast to the Young Mania Scale (24), is a unidimensional scale (25). In this study, we were able to demonstrate a plasma concentration vs. clinical response relationship in manic patients for the MAS, but not for the Young scale when using a fixed dose of 20 mg olanzapine over 2 weeks of therapy (25). This was also found for a fixed dose of 10 mg haloperidol (26). It has actually been demonstrated that the BPRS is more sensitive than the Young Mania Scale in trials of antimanic treatment (27), but Licht et al. (28) found the MAS superior to the BPRS in discriminating between risperidone and zuclopenthixol in manic patients.

Antidepressive scales.  In Overall’s original work (29) leading to the publication of the BPRS he actually identified the following items to cover the dimension of manifest depression on which antidepressants seemed to work: depressed mood, guilt, psychomotor retardation, anxiety, subjective experience of impaired functioning or loss of interests, and general preoccupation with somatic health. This antidepressive subscale of the BPRS is shown in Fig. 3.

As one of the first to do so, Pichot (30) recommended the use of this BPRS depression subscale in trials of antidepressants as he considered the subscale more clinically valid than the HAM-D17. The BPRS has been compared to the HAM-D17 in a few randomized clinical trials of antidepressants, and in these it has been found more sensitive than the HAM-D17 (31–33).

A closer look at the antidepressive BPRS subscale (Fig. 3) reveals its high correspondence with HAM-D6. In our attempt to evaluate the clinical validity of HAM-D17, we used experienced psychiatrists’ global perception of the severity of depressive states as the index of validity (34). The results showed that only the six items (HAM-D6) followed the global ratings (Fig. 3). We were able to show that not only the Hamilton raters, but also the experienced psychiatrists performing global assessments had a high inter-observer reliability (34). Among the experienced psychiatrists selected to make a global assessment of the current depressive state of the included patients was Erling Dein (1922–1975). Dein was considered to be among the top psychiatrists between 1960 and 1975 at the University of Copenhagen Psychiatric Hospital (Rigshospitalet) with regard to perceived clinical experience and high empathy (35), ‘knowing when to smile and when to not smile; what kind of tone to use and what not to use in the doctor–patient relationship’ (36). We then used item response theory models and not factor analysis to investigate to what extent the six symptoms in Fig. 3 measured one single dimension of depressive severity (12). We found that the 6-item subscale (HAM-D6) but not the total 17-item scale (HAM-D17) fulfilled the item response theory model, implying that the total score is a sufficient statistic only for the HAM-D6 (i.e. a profile of the individual items is not necessary). Finally, we showed (13) that during trials of antidepressants the change in the HAM-D6 scores but not in the HAM-D17 scores fulfilled the item response theory model.

The discrimination validity of the HAM-D6 has been demonstrated by Santen et al. (37) who analysed all the 17 HAM-D items individually to determine their ability to discriminate between paroxetine and placebo in randomized clinical trials in patients with major depression. The most discriminating items were those included in the HAM-D6 (37).

Effect size has been used as the most adequate descriptive response statistic when demonstrating dose–response relationship of antidepressants in the acute therapy of depression, especially when different rating scales are evaluated. Effect size is defined as the difference in mean change from baseline to the respective time points in rating scores between the patient group treated with the experimental drug and the patient group treated with placebo, divided by the pooled standard deviation for the two groups of patients (38).
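The definition above can be sketched in a few lines of code. This is a minimal illustration, not the computation used in the cited reanalyses: the pooled standard deviation is assumed to be the usual degrees-of-freedom weighted form, and the change scores are invented for the example.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(change_drug, change_placebo):
    """Difference in mean change from baseline (drug minus placebo),
    divided by the pooled standard deviation of the two groups."""
    n1, n2 = len(change_drug), len(change_placebo)
    s1, s2 = stdev(change_drug), stdev(change_placebo)
    # Assumption: pooled SD weighted by degrees of freedom
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(change_drug) - mean(change_placebo)) / pooled_sd


# Hypothetical reductions in scale score from baseline to endpoint
drug_changes = [6, 8, 5, 7, 9, 4]
placebo_changes = [4, 5, 3, 6, 5, 2]
es = effect_size(drug_changes, placebo_changes)
```

Because the result is expressed in standard deviation units, it can be compared across scales with different numbers of items, such as the HAM-D6 and the HAM-D17.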

In an evaluation of all placebo-controlled fluoxetine trials in patients with DSM-III major depression we obtained an effect size of 0.38 on HAM-D6, but only 0.30 on HAM-D17 (39).

When fluoxetine was used as an active comparator in placebo-controlled venlafaxine trials, an effect size of 0.40 was obtained on HAM-D6 but was merely 0.24 on HAM-D17 (40). In both trials (39, 40), the fluoxetine dose was between 20 and 60 mg daily and no dose–response relationship was found.

When reanalysing the only citalopram study in which fixed doses were compared to placebo in patients with major depression (41), it was found that 40 mg citalopram daily over 6 weeks was the optimum dose with an effect size on HAM-D6 of 0.51 and on the full HAM-D17 of 0.39. The fixed dose of 20 mg citalopram daily was found insufficient with an effect size 0.21 on HAM-D6 and only 0.09 on HAM-D17 (41).

When reanalysing the dose–response trials with fixed doses of escitalopram vs. placebo in patients with major depression we found (42) that on the HAM-D6 10 mg escitalopram daily was insufficient with an effect size of 0.38, while 20 mg daily was sufficient with an effect size of 0.61.

In an evaluation of all placebo-controlled trials with mirtazapine in patients with major depression we found an effect size of 0.42 on HAM-D6, but of 0.49 on the full HAM-D17 (43). No dose–response relationship was found, which is in accordance with Pinder and Zivkov (44). Mirtazapine, a drug with multidimensional action, has both an antihistamine and an adrenergic action as well as a serotonin 2A receptor blocking activity. Therefore, this drug works on many of the HAM-D17 items, which might be an advantage in the acute therapy of depression but a disadvantage in the continuation phase.

Antianxiety scale.  The HAM-A14 has been the major outcome scale in trials of antianxiety drugs over the last 50 years. Although Hamilton himself at the introduction of the HAM-A14 demonstrated various problems inherent in the use of the total score of this scale to discriminate between an experimental anxiolytic drug and placebo (7), the total score has traditionally been used as outcome statistic in trials of antianxiety drugs.

In 1997, Schweizer and Rickels (45) reviewed the past 15 years’ attempts to develop new anxiolytics. They concluded that with the single exception of buspirone the results were negative. One of their explanations of these limited results was that the HAM-A14 is somatically biased by its total score, which will favour the benzodiazepines conventionally used as an active comparator in the placebo-controlled randomized trials. When Rickels et al. (46) showed that imipramine but not diazepam was superior on the psychic symptoms in HAM-A14 while both imipramine and diazepam were superior to placebo on the somatic symptoms, the total score on the whole HAM-A14 was seen to be an insufficient statistic in trials of antianxiety drugs.

Figure 3 shows the most valid HAM-A items. Out of these items, five belong to the psychic factor of HAM-A while the item of muscular tension is the only somatic item. Sleep and depressed mood are among the traditional items in the HAM-A psychic factors; these are obviously not specific anxiety items.

The HAM-A6 antianxiety subscale in Fig. 3 contains all the items that were most discriminating between venlafaxine and placebo in patients with a DSM-IV diagnosis of generalized anxiety disorder (47).

Using effect size statistics Stahl et al. (48) have shown that in these placebo-controlled studies (47), the most inclusive HAM-A6 anxiety symptoms in GAD (anxious mood, psychic tension, behaviour at interview and muscular tension) discriminate between venlafaxine and placebo as early as after 1 week of therapy, concentration difficulties after 2 weeks of therapy, whereas phobic symptoms only responded after 6 weeks of therapy.

Pooling four placebo-controlled trials of pregabalin in patients with a DSM-IV diagnosis of generalized anxiety disorder we have shown that the HAM-A6 but not the HAM-A14 fulfils the item response theory model, implying that the total score is a sufficient statistic (49). By use of effect size statistics, we showed that a dose of 150 mg pregabalin over 4 weeks of therapy is insufficient. In the dose range of 200–450 mg a clinically significant effect size was obtained, although with a plateau-like curve without any increase at the maximum dose of 600 mg daily (49).

Rating scales covering side-effects of psychotherapeutic drugs.  The most comprehensive rating scale for measuring unwanted effects of psychotherapeutic drugs is the Udvalg for Kliniske Undersøgelser (UKU) Side Effect Rating Scale (2, 3). This scale was developed by the Committee on Clinical Investigations (UKU), which was a standing committee under the Scandinavian Society of Psychopharmacology (2, 3). Actually, it was the Department of Drugs at the Swedish National Board of Health and Welfare that asked the UKU to develop the scale, as it was increasingly felt that the side-effects of psychotherapeutic drugs should be as reliably evaluated as the wanted clinical effects. The UKU scale consists of 48 items. The assessment of these symptoms is made on a 4-point scale: 0 – the symptom is not or only doubtfully present; 1 – present in a mild degree; 2 – present in a moderate degree; 3 – present in a severe degree. The statistical analyses of the UKU side-effect scale showed that each item had to be treated individually as the total score is not a sufficient statistic (2).
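Treating each item individually amounts to reporting a severity profile per item instead of a summed score. The sketch below shows one simple way to tabulate such profiles on the 0–3 scale described above; the function name, the item labels, and the ratings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def item_profiles(ratings, items):
    """For each item, count how many patients were rated 0, 1, 2 and 3.

    ratings: list of dicts mapping item label -> severity score (0..3)
    items:   item labels to tabulate
    """
    profiles = {}
    for item in items:
        counts = Counter(patient[item] for patient in ratings)
        # One count per severity level, in the order 0, 1, 2, 3
        profiles[item] = [counts.get(level, 0) for level in range(4)]
    return profiles


# Hypothetical ratings for three patients on two illustrative items
ratings = [
    {"nausea": 0, "sweating": 2},
    {"nausea": 1, "sweating": 0},
    {"nausea": 3, "sweating": 0},
]
profiles = item_profiles(ratings, ["nausea", "sweating"])
```

Each item keeps its own distribution over the four severity levels, so a severe rating on one symptom is not averaged away by absent symptoms in other organ classes.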

Figure 4 shows the UKU subscale developed to cover the various side-effects of the Specific Serotonin Reuptake Inhibitors (SSRIs) (50). The individual items have to be analysed separately.


Figure 4.  The UKU Side Effect Rating Scale for Specific Serotonin Reuptake Inhibitors (SSRIs) (4, 48).


While the UKU Side Effect Rating Scale is a clinician administered scale, the somatic items in the Hopkins Symptom Checklist (SCL) can be considered to be a patient-administered self-report side-effect scale. When using the SCL version in which both the items of ‘sweating’ and ‘nausea’ are included (41), we demonstrated that the fixed dose of 10 mg citalopram clearly induced these two side-effects no more frequently than placebo, while 20 mg was often similar to 40 or 60 mg citalopram in inducing these side-effects during the first weeks of therapy (41). As 40 mg citalopram was clearly superior to 20 mg citalopram as regards antidepressive effect, it is important to titrate the dose of this drug over the first weeks of therapy.

Measurement of patient-perceived well-being or health-related quality of life.  Health-related quality of life is considered to be a patient-administered evaluation of the weighted balance between wanted and unwanted effects of the pharmacological drug under examination. In health-related quality of life ratings the patients themselves are the best source for perceiving their own inner feelings (50).

The World Health Organization initiated quality of life studies as early as in the 1980s and the approach was summarized by Sartorius (51), who emphasized that the measurement of quality of life is subject to no other difficulties than those inherent in all measurements of emotions, and that the measure of quality of life in treatment trials should constitute the goal of therapy. The most useful questionnaire for measuring psychological well-being in the 1980s was the Psychological General Well-being Scale (PGWB), developed by the research group (52) behind the health status questionnaire Medical Outcome Study (MOS) Short-Form 36 (SF-36; 53). However, both the PGWB and the SF-36 include a mixture of positively and negatively worded items, raising several psychometric problems (4). Based on studies with items covering the five most important of the positively worded PGWB items, the WHO-5 Well-being Scale was developed (6).

The WHO-5 includes the following items: i) feeling cheerful and in good spirits; ii) feeling calm and relaxed; iii) feeling active and vigorous; iv) feeling fresh and rested when waking up; and v) feeling interested in day-to-day activities. Quality of life is often considered to be a rather individualistic, personal or idiographic issue, implying that the language of this dimension is not so much a matter of communication with other persons as a self-reflective language for the private activities of the individual person from the moment when he or she is waking up, perceiving and planning the day, having both emotional and physical appetite for the adaptation of the various details in his or her personal life. Nevertheless, the WHO-5 items seem to cover these basic life perceptions of well-being, for which the private language can be translated into a simple language of communication (4, 5).

When indicating the goal of treatment with antidepressants it has become conventional to use the norms of the WHO-5 as found in general population studies (54, 55). The WHO-5 scores are to be converted to the conventional quality of life dimension which goes from 0 (worst imaginable state of quality of life) to 100 (best imaginable state of quality of life). Before treatment, depressed patients score approximately 25 on the WHO-5 and after 6 weeks of therapy the mean scores on the WHO-5 have increased to approximately 50. Although this increase is of statistical significance, we need to follow the patients in another 6-week period because typically after 12 weeks of therapy the WHO-5 will reach a score of approximately 70, which is the norm in the general population (4, 54). This is in accordance with the measurement of restored social functioning in trials of antidepressants (55).
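The conversion to the 0–100 dimension is a simple rescaling. The sketch below assumes the conventional WHO-5 scoring, in which each of the five items is rated from 0 to 5 and the raw total (0–25) is multiplied by 4; this scoring rule is not spelled out in the text above, and the function name is hypothetical.

```python
def who5_percentage_score(item_scores):
    """Convert five WHO-5 item ratings (each 0..5) to the 0..100 scale."""
    if len(item_scores) != 5:
        raise ValueError("the WHO-5 has exactly five items")
    if any(not 0 <= score <= 5 for score in item_scores):
        raise ValueError("each item is assumed to be rated from 0 to 5")
    # Raw total ranges from 0 to 25; multiplying by 4 gives 0 to 100
    return sum(item_scores) * 4


# Hypothetical patient before treatment: raw total of 6
baseline = who5_percentage_score([1, 1, 1, 1, 2])
```

On this scale the hypothetical baseline of 24 lies close to the pretreatment level of approximately 25 described above, and well below the general population norm of approximately 70.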

Discussion


It is a paradox that the psychometric methods in psychiatry were developed by clinical psychiatrists (e.g. Hamilton) and not by psychologists (the original developers of psychometrics); this can be considered a matter of perceptual knowledge vs. conceptual knowledge. Thus, perceptual knowledge in clinical psychiatry can only be gained by direct experience via clinical contact with patients. Conceptual knowledge is a second hand description of the patient, leading to what Feinstein (56) has called the psychosocial interpretation problem in clinimetrics. In their book on blindness Magee and Milligan (57) discuss these two kinds of knowledge, perceptual vs. conceptual, which are fundamental for the problem of clinical vs. statistical validity.

As discussed elsewhere (15) with reference to Brunswik’s lens model of perception (58), the item response theory models are developed within the very theory of measurement of perceptual knowledge, while factor analysis is a statistical analysis to be based on correlation coefficients when items are only vaguely formulated (Fig. 2) to supplement conceptual knowledge. We use visual rating scales as graduated, optic lines (15). Symptom rating scales can be considered as a series of items laid down along a line, and the total score of the individual items indicates the severity of the dimension being measured. The item response theory model tests to what extent the sequence of items is relevant for the measurement issue. Thus, each item has to have decreasing prevalence with increasing severity of the state under examination. In depressive states, depressed mood, lack of interests, and tiredness have higher prevalence than such items as guilt feelings or psychomotor retardation. Demonstrating this prevalence ordering is an indispensable component of showing the total scale score to be a sufficient and meaningful measure of depressive states.
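The prevalence ordering can be inspected descriptively before any formal model is fitted. The following sketch computes the proportion of patients in whom each item is present and sorts the items from most to least prevalent. It is a descriptive check under simplifying assumptions (items reduced to present or absent), not an item response theory test, and the data and function names are hypothetical.

```python
def item_prevalence(responses, item_names):
    """Proportion of patients with each item present (1) rather than absent (0)."""
    n = len(responses)
    return {
        name: sum(row[i] for row in responses) / n
        for i, name in enumerate(item_names)
    }


def ordered_by_prevalence(prevalence):
    """Item names sorted from most to least prevalent."""
    return sorted(prevalence, key=prevalence.get, reverse=True)


# Hypothetical present/absent ratings for five patients (illustrative only)
items = ["depressed_mood", "lack_of_interests", "guilt", "psychomotor_retardation"]
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
prevalence = item_prevalence(responses, items)
ordering = ordered_by_prevalence(prevalence)
```

In this constructed example the less prevalent items are present only in patients who also show the more prevalent ones, which is the pattern expected when the items are laid down along a single line of increasing severity.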

Over the 50 years with factor analysis of the HAM-D17 it has not been possible to generate a factor array in which the first factor included the core items of depression as contained in the HAM-D6 (59). However, Fleck et al. (60) performed the study in which the first factor included the HAM-D6 most completely. On the other hand, in the most recent attempt to compare factor analysis with the item response theory analysis (61), the authors confirmed that the HAM-D6 items were not identified by a one-factor analysis. However, when the factor analysis was made separately on the HAM-D6, all six items loaded on one factor, explaining approximately 50% of the variance (61). Moreover, the discrimination parameter in the item response theory analysis on the HAM-D6 was higher than 1.35, which signifies that the total score is a sufficient statistic (61).

In his 1976 editorial on the role of rating scales in psychiatry Hamilton (62) concludes that rating scales permit comparison between different patients and between different occasions for the same patient: “…They do this with adequate reliability and validity, which is more than can be said for ‘free’ case histories and diagnostic labels…”. The reference book of rating scales used by Hamilton in this editorial is Pichot’s book from 1974 on psychological measurements in psychopharmacology (63), in which it is clearly shown that symptom rating scales cover clinical or perceptual knowledge when measuring outcome of treatment, while the diagnostic system covers conceptual knowledge.

As discussed elsewhere (7), the Clinical Interview for Depression (CID), developed by Paykel (64), is the most important among the attempts to improve the item definitions of the Hamilton scale. The CID introduces a 7-category definition of each Hamilton item (0 = not present to 6 = present to an extreme degree). This approach is similar to the item definition in the BPRS (16). The various subscales in Fig. 3 have now been collected in the Clinical Interview for Depression and Related Syndromes (CIDRS).

It was actually Pichot (65) who with reference to Kraepelin tried to replace the term psychopharmacology with the term pharmacopsychology to focus on the psychological effects of the drugs on the mind rather than on the intermediate biological effects of this class of drugs on the brain or body. Kraepelin himself had to make clinical examinations of his psychiatric patients as systematically and without bias as the approach used in pharmacopsychology. Psychotherapeutic drugs were, however, not available to Kraepelin so he had to develop a classification system based on the ‘natural history’ of the psychopathological symptoms, their shared phenomenology over time (30).

The shared, perceptual phenomenology of clinical symptoms as analysed by Kraepelin, Hamilton or Pichot has been considered to be an atheoretical approach. However, as so clearly stated by Ban (66), these clinicians (e.g. Kraepelin, Hamilton or Pichot) implicitly refer to the medical model of disease which in itself is a theory inherent in the daily work of clinical psychiatrists.

Sydenham (1624–1689) introduced the medical disease model as a syndrome in which the constellation of related, perceived symptoms has a characteristic course (shared, perceptual phenomenology) from onset to remission in the single episode or with relapse and remission in a recurrent form or a more persistent form (67). This medical model is also implicitly in operation in the diagnostic systems released after 1980 (the DSM-III, DSM-III-R, DSM-IV or the ICD-10). Both Klerman (68) and Spitzer (69) who were behind the DSM-III system have confessed that it was their experience with symptom rating scales that had inspired them to employ the perceptual or non-aetiological approach to the various mental disorders.

The mathematical structure behind the current DSM-IV or ICD-10 systems is, however, that of conceptual algorithms, typically based on approximately 10 symptoms for the individual disorder. This mathematical structure provides a high inter-rater reliability, but is without direct reference to the clinical, perceptual dimensions of severity.

The mathematical structure behind the rating scales discussed in this study is the item response theory model, which tests to what extent these scales correspond to the clinical, perceptual dimension (70). The total scores of these scales are sufficient statistics (i.e. it is not necessary to make item profile scorings of the individual items). The total scores indicate the severity of the patient’s disorder on the various dimensions (anxiety, depression, mania or schizophrenia), permitting comparison between patients and between different occasions for the same patient during a treatment period. It is, for example, the mathematical principle of transitivity within the item response theory model that allows the conclusion that if patient X has a higher score on the HAM-D6 than patient Y, while Y has a higher score than Z, then patient X is more depressed than patient Z. Whereas the rating scales measuring the wanted effects of pharmacotherapeutic drugs have been shown to fulfil the item response theory model, implying that the total score is a sufficient statistic in dose–response trials, the scales measuring unwanted effects have to be analysed by the profiles of the individual items, as these side-effect scales do not fulfil the item response theory model (2).

Health-related quality of life as shown in the pharmacopsychometric triangle (Fig. 1) has been developed to make the patients themselves perform an evaluation of the ratio between wanted and unwanted effects of psychotherapeutic drugs in terms of perceived, psychological well-being. Many of the quality of life scales developed over the last decade have had more psychologists than psychiatrists as architects. However, by using the term health-related quality of life the clinical or perceptual approach rather than the conceptual approach has been identified.

In a recent review on the development of common instruments for health surveys within the World Health Organization, Power (71) focuses on the specific quality of life measurement instruments. The World Health Organization Quality of Life Assessment (WHOQOL) is especially preferred by Power (71), who by means of an item response theory model has reduced the 100 items in the WHOQOL to an 8-item subscale. However, this psychometric analysis has been performed by an item response theory model as if to capture conceptual knowledge rather than perceptual knowledge, thereby making the psychosocial investigator error described by Feinstein (56). From a perceptual, clinical point of view the WHO-5 has been developed according to the theory of positive, perceived well-being with the different dimensions of quality of life in the SF-36 (53) and has been found to fulfil the item response theory model (4, 54).

In conclusion, the discovery of psychotherapeutic drugs such as antianxiety, antidepressive, antimanic or antipsychotic medication has implied that Kraepelin’s concept of pharmacopsychology is the most useful term in dose–response trials, whereas psychopharmacology is much closer to neuropsychopharmacology. In clinical trials, the side-effects of the psychotherapeutic drugs and the ultimate effect on the patient-rated well-being scale are supplementary outcome measurements, to be obtained within the pharmacopsychometric triangle.

Acknowledgements

Professor Thomas A. Ban kindly reviewed this manuscript. He is convinced that in our daily clinical work we are undertaking pharmacopsychology and that, when describing this scientifically, we are embarking on pharmacopsychometrics.

References

  • 1
    Bech P, Kastrup M, Rafaelsen OJ. Mini-compendium of rating scales for anxiety, depression, mania, and schizophrenia. Acta Psychiatr Scand 1986;73(Suppl. 326):1–36.
  • 2
    Lingjærde O, Ahlfors UG, Bech P, Dencker SJ, Elgen K. The UKU Side Effect Rating Scale. Acta Psychiatr Scand 1987;76(Suppl. 334):7–100.
  • 3
    Bech P, Ahlfors UG, Dencker SJ, Elgen K, Lingjærde O. UKU’s bivirkningsskala. Skala til registrering af uønskede virkninger af psykofarmaka. (UKU’s side effect scale for measuring unwanted effects of psychopharmacological drugs). Nord J Psychiatr 1986;40:147–157.
  • 4
    Bech P, Olsen RL, Kjoller M, Rasmussen NK. Measuring well-being rather than absence of distress symptoms: a comparison of the SF-36 mental health subscale with the WHO-FIVE Well-Being Scale. Int J Meth Psychiatr Res 2003;12:85–91.
  • 5
    Sisask M, Värnik A, Kolves K, Konstabel K, Wasserman D. Subjective psychological well-being (WHO-5) in assessment of the severity of suicide attempt. Nord J Psychiatr 2008;62:431–435.
  • 6
    Bech P. Quality of life instruments in depression. Eur Psychiatry 1997;12:194–198.
  • 7
    Bech P. 50 years with the Hamilton scales for anxiety and depression. Psychother Psychosom 2009;78:135–142.
  • 8
    Martin-Löf P. The notion of redundancy and its use as a qualitative measure of the discrepancy between a statistical hypothesis and a set of observational data. Scand J Statist 1974;1:3–18.
  • 9
    Montgomery SA, Åsberg M. A new depression scale designed to be sensitive to change. Brit J Psychiat 1979;134:382–389.
  • 10
    Spearman C. The abilities of man. New York: Macmillan, 1927.
  • 11
    McKeown B, Thomas D. Q methodology. London: Sage Publications, 1988.
  • 12
    Bech P, Allerup P, Gram LF et al. The Hamilton Depression Scale. Evaluation of objectivity using logistic models. Acta Psychiatr Scand 1981;63:290–299.
  • 13
    Bech P, Allerup P, Reisby N. Assessment of symptom change from improvement curves on the Hamilton Depression Scales in trials with antidepressants. Psychopharmacology 1984;84:276–281.
  • 14
    Rasch G. Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research, 1960 (reprinted by University of Chicago Press, Chicago, 1980).
  • 15
    Bech P. Rating scales for psychopathology, health status and quality of life. Berlin: Springer, 1993.
  • 16
    Overall JE, Gorham DR. The Brief Psychiatric Rating Scale. Psychol Rep 1962;10:799–812.
  • 17
    Overall JE. The Brief Psychiatric Rating Scale in psychopharmacological research. In: Pichot P, ed. Psychological measurements in psychopharmacology. Basel: Karger, 1974:67–68.
  • 18
    Kay SR, Opler LA, Lindenmayer JP. Reliability and validity of the Positive and Negative Syndrome Scale for Schizophrenia. Psychiatr Res 1988;23:99–110.
  • 19
    Bech P, Larsen JK, Andersen J. The BPRS psychometric developments. Psychopharmacol Bull 1988;24:118–121.
  • 20
    Andersen J, Larsen JK, Schultz V. The Brief Psychiatric Rating Scale. Dimensions of schizophrenia, reliability and construct validity. Psychopathology 1989;22:168–176.
  • 21
    Hafkenscheid A. Psychometric evaluation of a standardized and expanded Brief Psychiatric Rating Scale. Acta Psychiatr Scand 1991;84:294–300.
  • 22
    Hafkenscheid A. Reliability of a standardized and expanded Brief Psychiatric Rating Scale. Acta Psychiatr Scand 1993;86:1–6.
  • 23
    Bech P. The Bech-Rafaelsen Mania Scale in clinical trials of therapies for bipolar disorder: 20-year review of its use as an outcome measure. CNS Drugs 2002;16:47–63.
  • 24
    Young RC, Biggs JT, Ziegler VE et al. A rating scale for mania: reliability, validity and sensitivity. Br J Psychiatry 1978;133:429–435.
  • 25
    Bech P, Gex-Fabry M, Aubry J-M, Favre S, Bertschy G. Olanzapine plasma level in relationship to antimanic effect in the acute therapy of manic states. Nord J Psychiatry 2006;60:181–182.
  • 26
    Gjerris A, Bech P, Broen-Christensen C, Geisler A, Klysner R, Rafaelsen OJ. Haloperidol plasma levels in relation to antimanic effect. In: Usdin E, Dahl S, Gram LF, Lingjærde O, eds. Clinical pharmacology and psychiatry. London: MacMillan, 1980:227–232.
  • 27
    Mishory A, Yaroslavsky Y, Bersudsky Y, Belmaker RH. Phenytoin as an antimanic anticonvulsant: a controlled study. Am J Psychiatry 2000;157:463–465.
  • 28
    Licht RW, Bysted M, Christensen H. Fixed-dosed risperidone in mania: an open experimental trial. Int Clin Psychopharmacol 2001;16:103–110.
  • 29
    Overall JE. The dimensions of manifest depression. Psychiatr Res 1962;1:239–245.
  • 30
    Bech P. P Pichot – a tribute to the European pharmacopsychologist on his 90th birthday. Eur Psychiatr Rev 2008;1:76–80.
  • 31
    Feighner JP, Boyer WF, Meredith CH, Hendrickson GC. A placebo-controlled inpatient comparison of fluvoxamine maleate and imipramine in major depression. Int Clin Psychopharmacol 1989;4:239–244.
  • 32
    Politis AM, Papadimitriou GN, Theleritis CG, Psarros C, Soldatos CR. Combination therapy with amisulpride and antidepressants. Prog Neuropsychopharmacol Biol Psychiatry 2008;32:1227–1230.
  • 33
    Norton KRW, Sireling LI, Bhat AV, Rao B, Paykel ES. A double blind comparison of fluvoxamine, imipramine and placebo in depressed patients. J Affect Dis 1984;7:297–308.
  • 34
    Bech P, Gram LF, Dein E. Qualitative rating of depressive states. Acta Psychiatr Scand 1975;51:161–170.
  • 35
    Lunn V. Afsind (Insanity). Copenhagen: Gyldendal, 1987.
  • 36
    Lehman H. Psychopharmacotherapy. In: Healy D, ed. The psychopharmacologists I. London: Chapman & Hall, 1996:159–186.
  • 37
    Santen G, Gomeni R, Danhof M, Pasqua OD. Sensitivity of the individual items of the Hamilton Depression Scale to response and its consequence for the assessment of efficacy. Psychiatr Res 2008;42:1001009.
  • 38
    Hedges LV, Olkin I. Statistical methods for meta-analysis. Orlando, FL: Academic Press, 1985.
  • 39
    Bech P, Cialdella P, Haugh M et al. A meta-analysis of randomised controlled trials of fluoxetine versus placebo and tricyclic antidepressants in the short-term treatment of major depression. Br J Psychiatry 2000;176:421–428.
  • 40
    Entsuah R, Shaffer M, Zhang J. A critical examination of the sensitivity of unidimensional subscales derived from the Hamilton Depression Rating Scale to antidepressant drug effects. J Psychiatr Res 2002;36:437–448.
  • 41
    Bech P, Tanghøj P, Andersen HF, Overø K. Citalopram dose–response revisited using an alternative psychometric approach to evaluate clinical effects of four fixed citalopram doses compared to placebo. Psychopharmacology 2002;163:20–25.
  • 42
    Bech P, Tanghøj P, Cialdella P, Friis Andersen H, Pedersen AG. Escitalopram dose–response revisited: an alternative psychometric approach to evaluate clinical effects of escitalopram compared to citalopram and placebo in patients with major depression. Int J Neuropsychopharmacol 2004;7:283–290.
  • 43
    Bech P. Meta-analysis of placebo-controlled trials with mirtazapine using the core items of the Hamilton Depression Scale as evidence of a pure antidepressive effect in the short-term treatment of major depression. Int J Neuropsychopharmacol 2001;4:337–345.
  • 44
    Pinder RM, Zivkov M. On demonstrating the correct dose for new antidepressants. In: Balant LP, Benitez J, Dahl SG, Gram LF, Pinder RM, Potter WZ, eds. Clinical pharmacology in psychiatry: finding the right dose of psychotropic drugs. Brussels: European Commission, 1998:31–42.
  • 45
    Schweizer E, Rickels K. Placebo response in generalized anxiety: its effect on the outcome of clinical trials. J Clin Psychiatry 1997;58(Suppl. 11):30–38.
  • 46
    Rickels K, Downing R, Schweizer E, Hassman H. Antidepressants for the treatment of generalized anxiety disorder. A placebo-controlled comparison of imipramine, trazodone and diazepam. Arch Gen Psychiatr 1993;50:884–895.
  • 47
    Meoni P, Salinas E, Brault Y, Hachet D. Pattern of symptom improvement following treatment with venlafaxine XR in patients with generalized anxiety disorder. J Clin Psychiatry 2001;62:888–893.
  • 48
    Stahl SM, Ahmed S, Haudiquet V. Analysis of the rate of improvement of specific psychic and somatic symptoms of generalized anxiety disorder during long-term treatment with Venlafaxine ER. CNS Spectr 2007;12:703–711.
  • 49
    Bech P. Dose–response relationship of pregabalin in patients with generalized anxiety disorder. A pooled analysis of four placebo-controlled trials. Pharmacopsychiatry 2007;40:163–168.
  • 50
    Bech P. Quality of life in the psychiatric patient. London: Mosby-Wolfe, 1998.
  • 51
    Sartorius N. Cross-cultural comparisons of data about quality of life. In: Aaronson NK, Beckmann J, eds. The quality of life in cancer patients. New York: Raven Press, 1987:19–24.
  • 52
    Dupuy HJ. The Psychological General Well-Being (PGWB) Index. In: Wenger NK, Mattson ME, Furberg CD, Elinson J, eds. Assessment of quality of life in clinical trials of cardiovascular therapies. New York: Le Jacq Publishing, 1984:184–188.
  • 53
    Ware JE. The SF-36 health survey. In: Spilker B, ed. Quality of life and pharmacoeconomics in clinical trials, 2nd edn. Philadelphia, PA: Lippincott-Raven, 1996:337–345.
  • 54
    Bech P, Lunde M, Bech-Andersen G, Lindberg L, Martiny K. Psychiatric outcome studies (POS): does treatment help the patients? A Popperian approach to research in clinical psychiatry. Nord J Psychiatry 2007;61(Suppl. 46):4–80.
  • 55
    Bech P. Social functioning. Should it become an endpoint in trials of antidepressants? CNS Drugs 2005;19:313–324.
  • 56
    Feinstein AR. Clinimetrics. New Haven, CT: Yale University Press, 1987.
  • 57
    Magee B, Milligan M. On blindness. Oxford: Oxford University Press, 1995.
  • 58
    Brunswik E. Perception and the representative design of psychological experiments. Los Angeles, CA: University of California Press, 1956.
  • 59
    Bech P. Fifty years with the Hamilton scales for anxiety and depression. Psychother Psychosom 2009;78:202–211.
  • 60
    Fleck MPA, Poirter-Littre MF, Guilfi G-D. Factorial structure of the 17-item Hamilton Depression Rating Scale. Acta Psychiatr Scand 1995;92:168–176.
  • 61
    Uher R, Farmer A, Maier W. Measuring depression: comparison and integration of three scales in the GENDEP study. Psychol Med 2008;38:289–300.
  • 62
    Hamilton M. The role of rating scales in psychiatry. Psychol Med 1976;6:347–349.
  • 63
    Pichot P. Psychological measurements in psychopharmacology. Basel: Karger, 1974.
  • 64
    Paykel ES. The clinical interview for depression. In: Sartorius N, Ban TA, eds. Assessment of depression. Berlin: Springer, 1986:304–315.
  • 65
    Pichot P. A century of psychiatry. Paris: Editions Roger Dacosta, 1983.
  • 66
    Ban T. Prolegomenon to clinical prerequisite: psychopharmacology and the classification of mental disorders. Prog Neuropsychopharmacol Biol Psychiatry 1987;11:527–580.
  • 67
    Engle RL. Medical diagnosis: past, present and future II. Philosophical foundations and historical development of our concepts of health, disease and diagnosis. Arch Intern Med 1963;112:520–529.
  • 68
    Klerman GL. The contemporary American scene: diagnosis and classification of mental disorders, alcoholism and drug abuse. In: Sartorius N, Jablensky A, Regier DA, Burke JD, Hirschfeld RMA, eds. Sources and tradition of classification in psychiatry. Bern: Hogrefe & Huber, 1990:93–137.
  • 69
    Spitzer RL. A manual for diagnosis and statistics. In: Healy D, ed. The psychopharmacologists III. London: Arnold, 2000:415–430.
  • 70
    Leon AC, Shear MK, Portea L, Klerman GL. Effect size as a measure of symptom-specific drug change in clinical trials. Psychopharmacol Bull 1993;29:163–167.
  • 71
    Power M. Development of a common instrument for quality of life. In: Nosikov A, Gudex C, eds. EUROHIS Developing common instruments for health surveys. Amsterdam: IOS Press, 2003:145–163.