Japanese midwives as psychiatric diagnosticians: Application of criteria of DSM-IV mood and anxiety disorders to case vignettes


Toshinori Kitamura, FRCPsych, Department of Clinical Behavioral Sciences (Psychological Medicine), Kumamoto University Graduate School of Medical Sciences, 1-1-1, Honjo, Kumamoto, Kumamoto 860-8556, Japan. Email: kitamura@kaiju.medic.kummoto-u.ac.jp


Abstract  It is believed in Japan that only psychiatrists are capable of providing reliable psychiatric diagnosis. However, growing awareness of mental health issues related to perinatal care means that midwives are now required to have psychiatric diagnostic skills. The purpose of the present paper was to examine how well Japanese midwives agreed with a psychiatrist on diagnoses of different psychiatric disorders. Vignettes of 29 cases including DSM-IV mood disorders (major depressive disorder and bipolar disorder) and anxiety disorders (generalized anxiety disorder, panic disorder, phobic disorders, and obsessive-compulsive disorder) were distributed to 12 Japanese midwives. The midwives made DSM-IV diagnoses independently, and these were compared with those made by an expert. The kappa coefficients of the diagnoses with a base rate of 0.1 or more were substantial to almost perfect (0.64–0.83). The accuracy of symptom assessment was also satisfactory. Appropriately trained Japanese midwives can use the diagnostic criteria for psychiatric disorders reliably. It is therefore feasible to dispatch midwives who are trained in psychiatric diagnosis to antenatal clinics.


The perinatal period is increasingly recognized as a life stage at which the mental health of women is at risk. Many studies have shown that approximately 10% of women develop clinical depression after childbirth.1,2 Postnatal depression may also be linked to both development of children3–5 and depression in the spouse.6,7 Although pregnancy has long been thought of as a period during which women are unlikely to develop psychiatric disorders, recent studies have shown that approximately 10% of pregnant women experience onset of depression during pregnancy.8,9 Despite the clinical importance of detecting psychiatric disorders in pregnant and child-bearing women, the rate of help-seeking behavior in these women is low. For example, Okano et al. studied all psychiatric referrals from the obstetrics clinic of a university clinic and reported a referral rate of only 2–3%.10

The traditional belief in Japan is that only psychiatrists are capable of gaining a comprehensive understanding of a patient's condition. However, as in Western countries, mental health services in Japan are increasingly being provided by multidisciplinary teams. In this situation, it is essential that all members of the mental health team are able to communicate accurately with one another about the types of cases they are treating. To achieve this end, training of non-medical mental health professionals in the commonly used operational diagnostic criteria is very important. The accuracy of the diagnostic skills of Japanese psychologists has already been reported.11,12 However, the diagnostic accuracy of other professionals has scarcely been studied in Japan.

The present study investigated how accurately Japanese midwives were able to use the diagnostic criteria for mood and anxiety disorders according to the DSM-IV.13 We are unaware of any previous studies that have examined this issue in Japan.



Twelve midwives, who had worked in the maternity wards of three university hospitals, participated. One of the three hospitals is located in Fukuoka, Kyushu, the westernmost island of Japan; the second is in Okayama, near Kansai district, in central Japan, and the third is in Kawagoe, located in the Kanto district, near Tokyo. Thus, we obtained good geographic diversity. The participating midwives' clinical experience ranged from 1 to 16 years; only one of them had had past experience (1 year and 8 months) in psychiatry. These midwives had never received postgraduate training in psychiatric diagnosis; they were newly recruited interviewers for an ongoing longitudinal study on perinatal mental health issues. At the time of the present study, they were undergoing training in order to administer a structured interview. For the training program, all the midwives were assigned homework in psychiatric diagnosis, the results of which are reported here. During this period, they attended two half-day lectures on psychiatric interview techniques. They were requested to read through the DSM-IV before one of us (TK) gave a brief introduction to psychiatric disorders usually seen during the perinatal period and lectured about the DSM-IV system. Because an ad hoc structured diagnostic interview was developed and used in the main study, each midwife performed this structured interview with another midwife as a simulated patient (usually a pregnant woman with a specific psychiatric disorder).


A total of 30 case vignettes were selected. Using an answer sheet specially prepared for the present study, the psychiatrist (TK) rated whether psychiatric symptoms listed on the answer sheet were present in each case, and based on these ratings he determined whether any of the mood and anxiety disorders described in the DSM-IV were present in each case. The case vignettes were selected from the Japanese translation of the DSM-III-R Case Book.14 Because we were interested in the midwives' diagnostic skills in DSM-IV, and because DSM-IV categories have additional symptomatic items necessary for definite diagnoses, slight alterations were made to the case vignettes to allow DSM-IV diagnoses. Except for the first case used as a guide sample, the cases included generalized anxiety disorder (one case), panic disorder (three cases), major depressive episode (seven cases), dysthymic disorder (two cases), manic episode (four cases), agoraphobia (two cases), social phobia (two cases), specific phobia (three cases), and obsessive-compulsive disorder (two cases). We included seven cases in which psychiatric symptoms were presented but that did not meet any of the DSM-IV criteria for mood or anxiety disorders. These miscellaneous cases were excluded from the following analyses because they consisted of single cases of different diagnostic categories that were added as an aid for educational purposes. The total number of diagnoses exceeded 29 because of the multiple diagnosis policy used by the DSM-IV. It should be noted that we were interested in mood and anxiety disorders because they have been reported as prevalent among pregnant and child-bearing women.15 Among them depression has been studied extensively,1 while anxiety has been studied less frequently.16,17 We also took note of the epidemiological findings in a Japanese population.18,19

All the cases selected were diagnosed independently by one of us (TK) as an expert in psychiatric diagnosis. The expert's diagnoses were in perfect concordance with those given in the Case Book. He had at least 25 years of experience in clinical and research work in psychiatric diagnostics (at the time the present study was conducted) using operational criteria including the DSM, as well as working as an editor of several international journals. These diagnoses were thus used as a specialist's judgement, against which the midwives' diagnoses were validated.


At the beginning, the midwives practised diagnosing the first vignette, and this promoted a better understanding of the procedure for diagnosis using the DSM-IV. The ratings of the remaining 29 case vignettes were collected on three occasions, when the midwives submitted their ratings on the second through to the sixth vignettes, seventh through to the 18th, and 19th through to the 30th, respectively. The midwives received copies of the psychiatrist's ratings as the correct solutions after they had submitted each of their own.

Statistical analysis

The ratings for all 29 case vignettes (the first case was discarded because the correct answer had been provided as a sample of how to fill in the sheets) were included when calculating the agreement of diagnoses made by the midwives with those made by the expert.

Kappa statistics have been the most widely used form of chance-corrected interrater agreement, and were used here.20 The value of kappa is influenced by the base rate or the prevalence of a particular observation.21 In the present study the base rate refers to the proportion of the cases given a specific diagnosis by the criterion rater. Because the values of the kappa coefficient become increasingly unstable as the base rate reaches either end of the continuum, only the kappa coefficients for diagnostic categories with a base rate ≥10% were considered in the main analysis (none reached a base rate of ≥90%). Those diagnoses with a base rate of <10% will be presented only for reference purposes.
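As a minimal sketch of the chance-corrected agreement described above, the following Python functions compute the base rate and Cohen's kappa for one diagnostic category from two parallel lists of present/absent ratings (one from a rater, one from the criterion rater). The function names and the toy ratings in the usage note are illustrative assumptions, not the study's data or software.

```python
from typing import Optional, Sequence


def base_rate(expert: Sequence[bool]) -> float:
    """Proportion of cases given the diagnosis by the criterion rater."""
    return sum(expert) / len(expert)


def cohen_kappa(rater: Sequence[bool], expert: Sequence[bool]) -> Optional[float]:
    """Cohen's kappa for two binary ratings of the same cases.

    Returns None when chance agreement equals 1 (both raters gave the
    same single rating to every case), because kappa is then undefined.
    """
    if len(rater) != len(expert) or len(rater) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater)
    # Observed proportion of cases on which the two raters agree.
    observed = sum(r == e for r, e in zip(rater, expert)) / n
    # Agreement expected by chance from each rater's marginal rates.
    p_rater = sum(rater) / n
    p_expert = sum(expert) / n
    expected = p_rater * p_expert + (1 - p_rater) * (1 - p_expert)
    if expected == 1.0:
        return None
    return (observed - expected) / (1 - expected)
```

For example, with ten hypothetical cases of which the expert rates three as present and the rater identifies two of those three, `cohen_kappa` returns about 0.74 at a base rate of 0.3. If a rater never assigns a diagnosis that the expert does assign, observed and chance agreement coincide and the formula gives 0.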

Midwives may reach a correct diagnosis without correctly identifying key symptoms for specific diagnosis. Thus, we examined the agreement (in kappa coefficient) of the assessment of nine key diagnostic symptoms made by the midwives with that made by the expert. The nine key symptoms were general anxiety (for generalized anxiety disorder), panic (anxiety) attack (for panic disorder), dysphoric mood and loss of interest (for major depressive episode and dysthymic disorder), elated mood and irritability (for manic episode and hypomanic episode), fear of objects or situations (for agoraphobia, social phobia, and specific phobia), and obsessive thought and compulsive act (for obsessive-compulsive disorder).

The SPSS-X program (SPSS, Chicago, IL, USA) was used to obtain kappa coefficients. Similar but varying standards have been used to interpret the values of kappa in reliability studies.


The base rate of four diagnoses in the DSM-IV (panic disorder, major depressive episode, manic episode, and specific phobia) exceeded 10% (Table 1). The kappa coefficient obtained was almost perfect for panic disorder (κ = 0.83); it was substantial for major depressive episode (κ = 0.71), manic episode (κ = 0.78), and specific phobia (κ = 0.64). For the other categories, with a base rate <10%, the midwives agreed with the specialist's diagnosis less well, probably because a few midwives did not diagnose such categories in any of the vignettes, so that no κ was produced (these instances were treated as κ = 0).
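The base rates follow from the case counts given in the description of the vignettes, each divided by the 29 rated vignettes. The short Python check below is an illustrative sketch (the variable names and dictionary layout are assumptions) that reproduces which diagnoses reach the 10% threshold.

```python
# Number of vignettes per diagnosis, as listed in the description of the cases.
CASE_COUNTS = {
    "generalized anxiety disorder": 1,
    "panic disorder": 3,
    "major depressive episode": 7,
    "dysthymic disorder": 2,
    "manic episode": 4,
    "agoraphobia": 2,
    "social phobia": 2,
    "specific phobia": 3,
    "obsessive-compulsive disorder": 2,
}
N_VIGNETTES = 29   # the guide sample (first vignette) is excluded
THRESHOLD = 10.0   # per cent; categories below it are shown for reference only

# Base rate (%) = cases given the diagnosis by the expert / rated vignettes.
base_rates = {dx: 100 * n / N_VIGNETTES for dx, n in CASE_COUNTS.items()}
main_analysis = [dx for dx, rate in base_rates.items() if rate >= THRESHOLD]

for dx, rate in base_rates.items():
    print(f"{dx}: {rate:.1f}%")
print("Main analysis:", main_analysis)
```

Run as written, this selects panic disorder, major depressive episode, manic episode, and specific phobia, in agreement with the four categories named above.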

Table 1.  Agreement with diagnoses made by the expert (mean ± SD)

| Midwife | GAD | PAN | MDE | DYS | MAN | AGO | SOC | SPE | OCD | Mean |
| Base rate (%) | 3.4 | 10.3 | 24.1 | 6.9 | 13.8 | 6.9 | 6.9 | 10.3 | 6.9 | |
| 1 | 0.24 ± 0.29 | 0.84 ± 0.16 | 1.00 ± 0.00 | 0.46 ± 0.32 | 1.00 ± 0.00 | −0.05 ± 0.14 | 0.63 ± 0.24 | 0.71 ± 0.19 | 0.78 ± 0.21 | 0.62 ± 0.33 |
| 2 | n.c. | 0.84 ± 0.16 | 0.67 ± 0.18 | 1.00 ± 0.00 | 0.84 ± 0.16 | n.c. | 0.46 ± 0.32 | 0.84 ± 0.16 | 0.52 ± 0.24 | 0.57 ± 0.35 |
| 3 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 | n.c. | 0.78 ± 0.21 | 0.71 ± 0.19 | 0.52 ± 0.24 | 0.78 ± 0.32 |
| 4 | 0.47 ± 0.32 | 1.00 ± 0.00 | 0.53 ± 0.21 | 0.35 ± 0.30 | 0.42 ± 0.24 | 0.46 ± 0.32 | 0.78 ± 0.21 | 0.78 ± 0.21 | 0.33 ± 0.25 | 0.57 ± 0.22 |
| 5 | 0.29 ± 0.30 | 1.00 ± 0.00 | 0.66 ± 0.16 | 0.33 ± 0.25 | 0.87 ± 0.13 | 0.65 ± 0.32 | 0.78 ± 0.21 | 0.61 ± 0.20 | 0.38 ± 0.24 | 0.62 ± 0.23 |
| 6 | 1.00 ± 0.00 | 1.00 ± 0.00 | 0.67 ± 0.18 | 0.46 ± 0.32 | 0.36 ± 0.31 | 0.46 ± 0.32 | 0.46 ± 0.32 | 0.63 ± 0.24 | 0.21 ± 0.27 | 0.58 ± 0.26 |
| 7 | −0.06 ± 0.32 | 1.00 ± 0.00 | 0.75 ± 0.13 | 0.19 ± 0.27 | 0.76 ± 0.16 | −0.07 ± 0.04 | 1.00 ± 0.00 | 0.84 ± 0.16 | 0.63 ± 0.24 | 0.56 ± 0.40 |
| 8 | 0.36 ± 0.31 | 0.26 ± 0.27 | 0.37 ± 0.21 | −0.09 ± 0.10 | 0.84 ± 0.16 | n.c. | n.c. | 0.26 ± 0.27 | −0.09 ± 0.10 | 0.21 ± 0.28 |
| 9 | 0.29 ± 0.30 | 1.00 ± 0.00 | 0.90 ± 0.10 | n.c. | 0.84 ± 0.16 | 0.65 ± 0.32 | 0.52 ± 0.24 | 0.53 ± 0.21 | 0.63 ± 0.24 | 0.60 ± 0.29 |
| 10 | 1.00 ± 0.00 | 0.84 ± 0.16 | 0.81 ± 0.13 | 0.65 ± 0.32 | 0.71 ± 0.19 | 0.65 ± 0.32 | 0.63 ± 0.24 | 1.00 ± 0.00 | 0.63 ± 0.24 | 0.77 ± 0.14 |
| 11 | 0.36 ± 0.31 | 0.84 ± 0.16 | 0.62 ± 0.17 | −0.05 ± 0.14 | 0.84 ± 0.16 | n.c. | 0.35 ± 0.30 | 0.46 ± 0.21 | 0.28 ± 0.25 | 0.41 ± 0.30 |
| 12 | 0.47 ± 0.32 | 0.35 ± 0.30 | 0.58 ± 0.19 | 0.21 ± 0.27 | 0.84 ± 0.16 | n.c. | 0.27 ± 0.28 | 0.35 ± 0.30 | −0.09 ± 0.10 | 0.33 ± 0.27 |
| Mean | 0.45 ± 0.35 | 0.83 ± 0.25 | 0.71 ± 0.18 | 0.38 ± 0.35 | 0.78 ± 0.19 | 0.23 ± 0.30 | 0.56 ± 0.26 | 0.64 ± 0.21 | 0.39 ± 0.27 | |

AGO, agoraphobia; DYS, dysthymic disorder; GAD, generalized anxiety disorder; MAN, manic episode; MDE, major depressive episode; OCD, obsessive-compulsive disorder; PAN, panic disorder; SOC, social phobia; SPE, specific phobia; n.c., no κ produced (treated as 0 in the means).

Across all nine DSM-IV categories, the mean κ coefficient for each midwife varied from 0.21 to 0.78. The following standards were used in the present study: a kappa coefficient between 0.80 and 1.00, almost perfect; between 0.60 and 0.79, substantial; between 0.40 and 0.59, moderate; between 0.20 and 0.39, fair; between 0.00 and 0.19, slight; <0.00, poor. By these standards, the accuracy of psychiatric diagnosis was moderate or better in 10 midwives, and substantial in five. One midwife (no. 3 in Table 1) had perfect agreement with the expert in five diagnostic categories.
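The interpretation bands and the handling of an unobtainable κ can be sketched in Python as follows. The function names are illustrative assumptions; the example row uses the seven coefficients listed for midwife 2 in Table 1, with the two categories lacking a κ entered as `None`.

```python
from typing import Optional, Sequence


def interpret_kappa(kappa: float) -> str:
    """Label a kappa value using the bands stated in the text."""
    if kappa < 0.0:
        return "poor"
    if kappa < 0.20:
        return "slight"
    if kappa < 0.40:
        return "fair"
    if kappa < 0.60:
        return "moderate"
    if kappa < 0.80:
        return "substantial"
    return "almost perfect"


def mean_kappa(values: Sequence[Optional[float]]) -> float:
    """Mean over categories, counting an unobtainable kappa (None) as 0."""
    return sum(0.0 if v is None else v for v in values) / len(values)


# Midwife 2: seven coefficients are listed and two categories had no kappa.
# The position of the None entries does not affect the mean.
midwife_2 = [0.84, 0.67, 1.00, 0.84, 0.46, 0.84, 0.52, None, None]
overall = mean_kappa(midwife_2)
print(round(overall, 2), interpret_kappa(overall))
```

Dividing the sum of the seven listed coefficients by nine categories gives 0.57, the mean reported for this midwife, which falls in the moderate band.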

The diagnostic accuracy (concordance with a specialist's diagnosis) may be determined by the accuracy of assessment of key diagnostic symptoms. This is because the present study used a skip policy: the raters were instructed to jump to the next diagnostic category if they rated the key diagnostic symptoms of a category as ‘absent’. Therefore, we examined the accuracy of the midwives' assessment of these key symptoms. As expected, better agreement was obtained for key symptoms of the diagnostic categories showing good accuracy (Table 2).
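The skip policy can be illustrated with a small Python sketch. The mapping of key symptoms to categories follows the list given under Statistical analysis; the function name is hypothetical, and the rule that one present key symptom is enough to continue with a category is an assumption made for illustration, not a detail stated in the study.

```python
from typing import Dict, Iterable, List

# Key symptoms per diagnostic category, following the list in the text.
KEY_SYMPTOMS: Dict[str, List[str]] = {
    "generalized anxiety disorder": ["general anxiety"],
    "panic disorder": ["panic attack"],
    "major depressive episode": ["dysphoric mood", "loss of interest"],
    "dysthymic disorder": ["dysphoric mood", "loss of interest"],
    "manic episode": ["elated mood", "irritability"],
    "hypomanic episode": ["elated mood", "irritability"],
    "agoraphobia": ["fear of objects or situations"],
    "social phobia": ["fear of objects or situations"],
    "specific phobia": ["fear of objects or situations"],
    "obsessive-compulsive disorder": ["obsessive thought", "compulsive act"],
}


def categories_to_assess(present_symptoms: Iterable[str]) -> List[str]:
    """Return the categories that are not skipped.

    Assumption: a category is assessed further when at least one of its
    key symptoms was rated present; otherwise the rater jumps to the next.
    """
    present = set(present_symptoms)
    return [
        category
        for category, keys in KEY_SYMPTOMS.items()
        if any(key in present for key in keys)
    ]
```

Under this rule, a rater who misses a key symptom never reaches the remaining criteria of that category, which is why errors in symptom assessment carry over directly into diagnostic disagreement.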

Table 2.  Agreement with the assessment of key symptoms made by the expert (mean ± SD)

| Midwife | GA | PA | DM | LOI | EM | IRR | FR | OT | CA | Mean |
| Base rate (%) | 24.1 | 10.3 | 37.9 | 24.1 | 10.3 | 13.8 | 31.0 | 13.8 | 13.8 | |
| 1 | 0.62 ± 0.17 | 0.84 ± 0.16 | 0.85 ± 0.10 | 0.70 ± 0.16 | 1.00 ± 0.00 | 0.61 ± 0.63 | 0.63 ± 0.15 | 0.84 ± 0.16 | 0.84 ± 0.16 | 0.77 ± 0.13 |
| 2 | 0.53 ± 0.21 | 0.84 ± 0.16 | 0.93 ± 0.07 | 0.90 ± 0.10 | 0.78 ± 0.21 | 0.71 ± 0.19 | 0.92 ± 0.08 | 0.28 ± 0.23 | 0.63 ± 0.24 | 0.72 ± 0.20 |
| 3 | 0.67 ± 0.18 | 1.00 ± 0.00 | 0.93 ± 0.07 | 0.81 ± 0.13 | 0.78 ± 0.21 | 0.84 ± 0.16 | 0.92 ± 0.08 | 0.61 ± 0.21 | 1.00 ± 0.00 | 0.84 ± 0.13 |
| 4 | 0.58 ± 0.19 | 1.00 ± 0.00 | 0.93 ± 0.07 | 0.81 ± 0.13 | 1.00 ± 0.00 | 0.27 ± 0.28 | 0.77 ± 0.13 | 0.59 ± 0.18 | 1.00 ± 0.00 | 0.77 ± 0.24 |
| 5 | 0.81 ± 0.13 | 1.00 ± 0.00 | 0.93 ± 0.07 | 0.83 ± 0.07 | 1.00 ± 0.00 | 0.61 ± 0.21 | 0.85 ± 0.10 | 0.76 ± 0.16 | 0.76 ± 0.16 | 0.84 ± 0.12 |
| 6 | 0.45 ± 0.21 | 1.00 ± 0.00 | 0.77 ± 0.12 | 0.90 ± 0.10 | 0.84 ± 0.16 | 0.63 ± 0.24 | 0.70 ± 0.14 | 0.59 ± 0.18 | 0.76 ± 0.16 | 0.74 ± 0.16 |
| 7 | 0.44 ± 0.19 | 1.00 ± 0.00 | 0.93 ± 0.07 | 0.75 ± 0.13 | 0.61 ± 0.20 | 0.61 ± 0.21 | 0.56 ± 0.18 | 0.84 ± 0.16 | 0.84 ± 0.16 | 0.73 ± 0.18 |
| 8 | 0.44 ± 0.19 | 0.71 ± 0.19 | 0.93 ± 0.07 | 0.81 ± 0.13 | 0.78 ± 0.21 | 0.27 ± 0.28 | 0.45 ± 0.20 | 0.23 ± 0.23 | 0.67 ± 0.18 | 0.59 ± 0.23 |
| 9 | 0.58 ± 0.19 | 1.00 ± 0.00 | 0.77 ± 0.12 | 0.91 ± 0.09 | 1.00 ± 0.00 | 0.51 ± 0.25 | 0.92 ± 0.08 | 1.00 ± 0.00 | 1.00 ± 0.00 | 0.85 ± 0.18 |
| 10 | 0.51 ± 0.19 | 0.84 ± 0.16 | 0.85 ± 0.10 | 0.91 ± 0.09 | 0.61 ± 0.20 | 0.42 ± 0.24 | 0.66 ± 0.16 | 0.51 ± 0.25 | 1.00 ± 0.00 | 0.70 ± 0.19 |
| 11 | 0.58 ± 0.19 | 0.84 ± 0.16 | 0.77 ± 0.12 | 0.53 ± 0.21 | 0.63 ± 0.24 | 0.36 ± 0.31 | 0.77 ± 0.13 | 0.52 ± 0.19 | 0.67 ± 0.18 | 0.63 ± 0.14 |
| 12 | 0.25 ± 0.20 | 0.85 ± 0.10 | 0.51 ± 0.18 | 0.51 ± 0.18 | 1.00 ± 0.00 | 0.63 ± 0.24 | 0.68 ± 0.15 | 0.28 ± 0.23 | 0.13 ± 0.22 | 0.54 ± 0.27 |
| Mean | 0.54 ± 0.13 | 0.91 ± 0.10 | 0.84 ± 0.12 | 0.78 ± 0.13 | 0.84 ± 0.16 | 0.54 ± 0.17 | 0.74 ± 0.14 | 0.59 ± 0.23 | 0.78 ± 0.24 | |

CA, compulsive acts; DM, dysphoric mood; EM, elated mood; FR, fear of any situations or objects; GA, general anxiety; IRR, irritability; LOI, loss of interest; OT, obsessional thought; PA, panic (anxiety) attack.


The present study has shown that, in general, Japanese midwives can apply the diagnostic criteria for mood and anxiety disorders to case vignettes well. The midwives tested in the present study had been trained in psychiatric diagnosis using the DSM-IV for only approximately 1 month. Had they taken more time to acquire a better understanding of DSM-IV diagnosis, the results might have been better. Nonetheless, these findings are encouraging, and suggest that Japanese midwives can understand the diagnostic rules set by the DSM-IV within a very short period. Although psychiatry is taught in Japanese nursing education, we feel that more emphasis should be placed on psychiatric diagnostic skills in nursing and midwifery.

A few studies have attempted to assess the diagnostic accuracy (concordance with an expert's judgement) of mental health professionals in Japan (Table 3). However, these professionals were psychiatrists or psychologists, and no studies have focused on the psychiatric diagnostic skill of midwives. Relative to the diagnostic accuracy of mental health professionals, the present midwives demonstrated comparable skills. It is of note that some midwives (nos. 3 and 10) demonstrated very high reliabilities in several diagnostic categories. We are not aware of definite reasons for this finding. These midwives had never worked in psychiatric wards or hospitals, nor had they been involved in liaison services with a psychiatric department. This issue, however, deserves further attention in future research.

Table 3.  Comparison of previous case vignette studies of diagnostic accuracy among Japanese mental health professionals

Studies compared, in column order (raters):
  • Kitamura et al. (1986)22: psychiatrists
  • Suga et al. (1987)23: psychiatrists
  • Fujihara et al. (1988)24: psychologists
  • Sugiura et al. (1998)12: psychologists and psychology students
  • Hasui et al. (1999)11: psychologists and psychology students
  • Present study§: midwives

Kappa coefficients by diagnosis, listed in column order with blank cells omitted:
  • Schizophrenia (RDC): 0.89, 0.80, 0.60
  • Schizoaffective disorder, manic type (RDC): 1.00, 0.82, 0.72
  • Schizoaffective disorder, depressed type (RDC): 0.78, 0.51, 0.38
  • Major depression (RDC/DSM): 1.00, 0.89, 0.78, 0.72, 0.91, 0.71
  • Mania (RDC): 0.95, 0.95, 0.73, 0.78
  • Generalized anxiety disorder (RDC/DSM): 0.51, 0.64, 0.45
  • Panic disorder (RDC): 0.92, 0.83
  • Obsessive-compulsive disorder (RDC): 0.66, 0.39
  • Phobic disorder (RDC): 0.73, 0.64

  • RDC, Research Diagnostic Criteria.
  • RDC diagnosis;
  • DSM-III-R diagnosis;
  • § DSM-IV diagnosis.
  • Child cases: no cases of the given diagnosis were presented.

We found some variation in the accuracy of symptom assessment by midwives. Panic attack had the highest accuracy (κ = 0.91), while general anxiety (κ = 0.54), irritability (κ = 0.54), and obsessional thought (κ = 0.59) had the lowest. The high accuracy for panic attack may be due to its somatic components, which are easy for midwives to understand. The latter three symptoms are more internalized. An interesting contrast was seen between obsessional thought (κ = 0.59) and compulsive act (κ = 0.78). Achenbach and Cantwell divided psychiatric symptoms into internalized and externalized.25,26 Ribera et al. reported that the reliability of symptomatic assessment was higher for externalized than for internalized symptoms.27 As in the assessment of symptoms in children, the Japanese midwives may have found it more difficult to assess internalized symptoms. Further research and revision of training are therefore needed.

Participation in the present study seems to have made the midwives more aware of the importance of psychiatric examination for pregnant and child-bearing women. Prior to the present study, the midwives had been aware of the occurrence of mental illness during the perinatal period. However, for them, it was an issue that needed consultation and liaison services from outside specialists. Having obtained skills in psychiatric diagnosis, most of the midwives felt that it was an issue falling within the area of their clinical responsibility.

The present study used a total of 29 case vignettes, with varying background factors including sex, age and occupation. The recipients of midwives' care, however, are perinatal women. Therefore, it may be necessary to accumulate further case vignettes of perinatal women who have mental health problems.

The present study is the first to have examined the ability of Japanese midwives to apply diagnostic criteria. The results are encouraging, and suggest that if midwives are trained and practised in psychiatric interview and diagnostic techniques, mental health services for pregnant and postpartum women in Japan will advance.


This study was supported by a Grant-in-Aid for Scientific Research from the Japanese Ministry of Health and Welfare, Assessment of the Mother and Child Health Care System.