The use of population based registers in psychiatric research


  • Invited paper

Peter Allebeck, Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden.


Objective:  Much of the knowledge we now take for granted regarding major mental disorders and related outcomes, such as schizophrenia and suicide, would not exist without the use of population based registers. The use of population based registers in psychiatric epidemiology has enabled analyses of associations that otherwise would not have been possible to address.

Method:  The use of registers in psychiatric research is described, exemplified, and discussed.

Results:  Methodological and validity aspects depend to a large extent on the type of register being considered. A classification is proposed of different types of registers, each one implying specific methodological issues. These can be addressed according to the dimensions of coverage, attrition, representativity and validity. Specific methodological consideration still has to be given to each specific research question. Thus, special validity studies usually need to be performed when embarking on studies using population based registers.

Conclusion:  With increasing burden of disease due to mental disorders worldwide, knowledge of the epidemiology of these disorders is of increasing interest. The Nordic countries have a strong history in this field of research, of great interest to the rest of the world. Universities and research funding agencies should recognize this valuable source of research capacity, and support fruitful continuation of a strong tradition.

Clinical recommendations

  •  For follow-up studies on patients in care and analytic studies the Nordic population based registers provide an important tool.
  •  Choice of registers to be used and methods of analyses have to be thought out carefully, and limitations in the different data sets used must be considered.

Additional comments

  •  Since validity of diagnoses differs between countries, settings and types of disorders, care should be taken in comparing data from different sources.
  •  Validation studies need to be performed depending on the aim of the study.
  •  For many disorders, a large and increasing part of the population is treated only in outpatient care, and therefore not included in inpatient care registers.


Dohrenwend has in several classical texts described what he calls three generations of psychiatric epidemiology (1). The first generation of studies was performed before the Second World War and was based mainly on second-hand information: medical records, key informants, etc. The second generation comprised the large population based surveys such as the Midtown Manhattan and Stirling County studies. The Swedish Lundby study and Strömgren’s Samsoe study belong to this category. In the third generation of epidemiological studies, efforts were made to develop diagnostic instruments and measurement scales in order to improve validity and reliability of assessments. The United States based ECA (Epidemiological Catchment Area) and National Comorbidity Survey belong to this category (2, 3). With this wave of studies, psychiatry has in fact been in the forefront in developing diagnostic criteria and assessment scales, in order to avoid the critique of being a soft science without objective criteria to rely on. This is in spite of the fact that the diagnosis of a musculoskeletal disorder is seldom more straightforward than the diagnosis of a mental disorder.

It would be appropriate to place studies using population based registers in a fourth generation category of epidemiological studies. These have opened up possibilities that none of the previous types of studies offered. Many of the associations we now take for granted would not have been known without the use of population based registers: the high mortality in schizophrenia, and not only from suicide; the role of urbanicity as a risk factor for schizophrenia; and the role of obstetric complications as risk factors for schizophrenia, to give some examples.

While epidemiological studies and health services research are the most common type of studies in which population based registers are used, clinical research, research in treatment course, and episodes of care are other important applications (4, 5).

Aims of the study

In this study, we will focus on the use of registers for epidemiological research and discuss, firstly, the advantages of and reasons for using population based registers in psychiatric research, and secondly, the different types of registers typically used and the methodological aspects intrinsic to each type. Examples will be given with a focus on experiences from the Nordic countries.

Material and methods

The use of registers in psychiatric research is described and discussed based on illustrative examples from the literature.


There are certain advantages in using population based registers, e.g. a large number of subjects, a large number of variables, and possibilities for record linkage.

Number of subjects

One problem in psychiatric research is that many diseases of interest have low incidence but high prevalence due to their chronic course. Thus, while schizophrenia is considered a common disease, with a lifetime prevalence of around 7/1000, the incidence is around 15/100 000 (6). The sample size required to assess even an important, say two-fold, increase in risk therefore runs to populations of tens to hundreds of thousands of persons, depending on the prevalence of the potential risk factor. Interestingly, the span of incidence of schizophrenia shown by McGrath et al. (6), around 7/100 000 to 40/100 000, is very similar to the span of suicide rates in different countries (7), another common outcome in psychiatric research.

Thus in order to assess the effect of potential risk factors such as cannabis use or obstetric complications on the incidence of schizophrenia, it is clear that population based registers are necessary to obtain sufficient power for adequate analyses. The same applies to suicide according to the figures above.
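The order of magnitude of the sample sizes involved can be illustrated with a minimal sketch. It assumes a simple cohort design, a two-sided test at the 5% level with 80% power, ten years of follow-up, and the standard normal-approximation formula for comparing two proportions with unequal group sizes. The function name `cohort_size`, the follow-up time and the exposure prevalences are illustrative assumptions, not figures from the studies cited; only the incidence of 15/100 000 and the two-fold risk come from the text above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohort_size(annual_incidence, relative_risk, exposure_prevalence,
                follow_up_years=10, alpha=0.05, power=0.80):
    """Approximate total cohort size needed to detect a given relative risk.

    Normal-approximation formula for two proportions with unequal group
    sizes, without continuity correction (an illustrative simplification).
    """
    # Cumulative risk over follow-up; for a rare outcome this is roughly
    # the annual incidence multiplied by the years of follow-up.
    p0 = annual_incidence * follow_up_years   # risk among the unexposed
    p1 = p0 * relative_risk                   # risk among the exposed
    # Number of unexposed persons per exposed person in the population.
    k = (1 - exposure_prevalence) / exposure_prevalence
    p_bar = (p1 + k * p0) / (1 + k)           # pooled risk

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    numerator = (z_alpha * sqrt((1 + 1 / k) * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p0 * (1 - p0) / k)) ** 2
    n_exposed = numerator / (p1 - p0) ** 2
    # Total cohort = exposed plus k unexposed per exposed person.
    return ceil(n_exposed * (1 + k))


# Hypothetical exposure prevalences: 50%, 10% and 2% of the population.
for prevalence in (0.5, 0.1, 0.02):
    total = cohort_size(15 / 100_000, 2.0, prevalence)
    print(f"exposure prevalence {prevalence:.0%}: about {total:,} persons")
```

Under these assumptions the required cohort grows from tens of thousands of persons when half the population is exposed to hundreds of thousands when only 2% is exposed, which is why such questions are difficult to address without population based registers.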

Number of variables

It is not uncommon to use many variables in psychiatric research; in particular, many clinical studies use a large number of psychological scales measuring various types of outcomes. But the characteristics of many population based registers, and the possibility of combining registers, enable analyses of associations that would otherwise not be possible to address. For example, in the research on obstetric complications and schizophrenia, it has been important to look at a number of clinical variables that may be of importance for the association, such as birth weight, asphyxia, fetal distress and head circumference, but also to control for socio-demographic conditions and the mother’s psychiatric history (8). This is possible using care registers comprising comprehensive clinical data and linking to census data. Recently, interest has grown in maternal exposure to stress and adverse life events as risk factors for schizophrenia in the offspring. Variables covering demographic data and health outcomes of close relatives have therefore been included, obtained through record linkages to the relatives (9).

Another example is the research on social adversity as a risk factor for schizophrenia. Studies have mostly been based on a single measure of social background, such as father’s income. Furthermore, confounding by psychotic history in parents, migrant status and urbanicity has been poorly controlled for. Wicks et al. (10) used a number of indicators of social adversity and were able to control for confounders such as migrant status, urbanicity and parental psychosis, and could confirm an independent association between social adversity in childhood and development of schizophrenia later in life.

Record linkages

Several of the examples mentioned above require record linkages. The availability of the person number in the Nordic countries provides an excellent person identifier enabling record linkages, but many other countries have social insurance numbers or other person identifiers. Record linkages enable capture of exposure data from one source and outcome data from another source, as opposed to e.g. health care data, where both exposure (including basic demographic data) and outcome are found in the same type of register. Such identifiers have enabled linkages of different types of data sets, such as the conscription surveys in Sweden and Israel with health care registers, and social insurance registers with health care registers.
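The principle of a record linkage can be sketched as follows: exposure data from one register are joined to outcome data from another via a shared person identifier. The sketch is purely illustrative; the register layouts, field names (`person_id`, `cannabis_use`, `diagnosis`, `year`) and all records are invented and do not correspond to any actual register.

```python
# Hypothetical exposure register (e.g. a survey) and outcome register
# (e.g. a health care register). All records are invented.
exposure_register = [
    {"person_id": "A001", "cannabis_use": True},
    {"person_id": "A002", "cannabis_use": False},
    {"person_id": "A003", "cannabis_use": True},
    {"person_id": "A004", "cannabis_use": False},
]
outcome_register = [
    {"person_id": "A001", "diagnosis": "schizophrenia", "year": 1985},
    {"person_id": "A004", "diagnosis": "schizophrenia", "year": 1990},
    {"person_id": "A001", "diagnosis": "schizophrenia", "year": 1992},
    {"person_id": "A999", "diagnosis": "schizophrenia", "year": 1991},
]


def link_registers(exposures, outcomes):
    """Attach the first recorded outcome to each person in the exposure cohort."""
    first_outcome = {}
    # Keep only the earliest outcome record per person identifier.
    for record in sorted(outcomes, key=lambda r: r["year"]):
        first_outcome.setdefault(record["person_id"], record)

    linked = []
    for person in exposures:
        outcome = first_outcome.get(person["person_id"])
        linked.append({
            **person,
            "case": outcome is not None,
            "year_of_diagnosis": outcome["year"] if outcome else None,
        })
    return linked


linked = link_registers(exposure_register, outcome_register)
for row in linked:
    print(row)
```

The cohort is defined by the exposure register, so the outcome record for `A999`, who is not in the cohort, is ignored, and persons without any outcome record are retained as non-cases.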

The possibility of linking different types of health registers with each other, as well as with social and economic data, will probably increase in many countries, since insurance systems increasingly need to monitor and follow up health care procedures performed by different providers as well as socio-economic information on their clients. Major insurance companies in the United States also have huge databases covering information on many millions of insured persons. However, the Northern European system with person numbers will still be unique in that i) it covers information on the total population within a geographic or administrative area, and not only insured patients, and ii) the person identifier is used not only for insurance and health care purposes but also for other purposes, enabling linkages of the type mentioned above.

Types of registers used.  The characteristics and quality of different databases are to a large extent dependent on the purposes for which the database was created. We will give examples of the main types of registers used in psychiatric research and discuss methodological issues in relation to the different types of registers.

Research purposes

There are plenty of examples from many countries of surveys or cohorts used in psychiatric research, although they may or may not originally have been set up specifically for mental health research. As they are based on originally collected data using interviews, questionnaires, and sometimes laboratory tests, resource constraints form a natural limit to the possibilities of collecting large samples. Two-phase sampling is often used when clinical interviews are to be performed, but with a certain dropout at each stage, population representativity may be hampered. While the already mentioned studies from the United States, the ECA and the National Comorbidity Surveys, are most impressive in their design, size and scope of data included, they can only to a very limited extent be used for record linkages, due to the lack of general person identifiers in the United States. These surveys, as well as the World Mental Health Survey (11), are also unique examples of databases that have provided enormous amounts of information on the epidemiology of mental disorders in many countries of the world, as well as on diagnostic aspects and comorbidity.

Compared with the above-mentioned US originated research surveys, the classical Nordic psychiatric epidemiological studies such as the Samsoe and Lundby studies established in 1957 and 1947, respectively, have had small samples, thus not enabling studies on the epidemiology of schizophrenia or suicide per se, although analytic studies on suicide and studies of depression and other common mental disorders over time have been performed (12–14). Furthermore, they have played a major role in setting the agenda for psychiatric epidemiology and social psychiatry.

Many longitudinal studies in psychiatric epidemiology are based on research databases not originally set up for psychiatric research. From the Nordic countries one could mention the Norwegian HUNT and HUSK cohorts, set up for multidisciplinary health research, that have also been extensively used for mental health research (15, 16). In Gothenburg a number of research databases have existed for almost 50 years, and they have been supplemented continuously with new data. Thus the cohort on women’s health and the cohort of elderly (H70) have been fruitfully used for a variety of psychiatric outcomes (17, 18).

As illustrated in Table 1, the studies are by definition based on selected study populations, although often well designed and carefully selected. Thus, they are often designed to represent a defined population well. The participation rate is sometimes limited, such as in the Stockholm PART study, and attrition over time may be considerable (19). Training of interviewers and application of standardized instruments generally ensure high quality of data.

Table 1.   Characteristics of research data bases
Coverage: Selected study population
Attrition: Sometimes considerable
Representativity: Usually good
Validity: Usually excellent

Health care registers

The Danish psychiatric case register is one of the best examples of how a national database on persons in care can be used fruitfully for epidemiological research, and is in this sense a good heritage of Strömgren’s introduction of psychiatric epidemiology on an island not far from where the register is now based. Among the first set of studies were a number of important analyses of incidence and other aspects of schizophrenia (20, 21), and subsequently, a series of important studies have been made on urbanicity and schizophrenia, mortality in mental disorders and suicide risk, to mention a few examples (22–24). In addition to using the case registers as such, perhaps with the addition of census data or mortality records, many examples can be found of how the case registers have been used to identify outcomes when exposure data have been retrieved from other sources, for example the Swedish conscripts survey or the Danish twin register (25, 26).

Other health related databases

We will briefly mention the central registers used for monitoring and follow-up. They have basically similar characteristics regarding quality aspects, listed in Table 2.

Table 2.   Characteristics of cause-of-death registers and other monitoring registers
Coverage: All births and deaths are recorded to virtually 100%. Cancer also has good coverage, but this may depend on screening intensity
Attrition: Virtually none
Representativity: Represent all persons in the country or region covered
Validity: In general good, but causes of death can be questioned. Suicides are particularly difficult

Cause of death registers.  These have been central in all types of suicide research as well as studies on mortality in patients with mental illness. While all developed countries have careful recording of all deaths, not all countries have comprehensive cause-of-death registers available to researchers. Again, many northern European countries have the advantage of being able to link health care or research databases to cause of death registers. In addition to the above mentioned purposes, knowledge about vital status is necessary for follow-up of patients.

Cancer registers.  These have sometimes been used to address the association between cancer and psychiatric disorders, as well as suicide risk in cancer patients.

Birth records.  As mentioned above, these have been central in the research on obstetric risk factors and psychosis.

Other registers

We will mention only two other types of registers that have been in focus in recent years: school registers and social insurance registers.

School registers.  With the increased interest in life course aspects of major psychiatric disorders, information on states and events early in life has become important. Early cognitive function and school performance are factors shown to be of importance for later mental illness. Studies using the school registers available in Finland and Sweden have identified poor school performance as a risk factor for later development of schizophrenia (27, 28). Regarding suicide, the association seems to be more complex, in that good school performance has been shown to be related to lower suicide risk, although among patients with psychosis the inverse association was found (29). Similar findings have been obtained from a Swedish cohort of school children on which data from cognitive tests were available (30), strengthening the hypothesis that measures of school performance and cognitive function are strongly linked. While coverage overall is good regarding school registers, validity of test results and school marks may vary between regions in the country and over time. As the main issues are comparability between exposure categories (e.g. subjects performing well and subjects performing poorly) and a longitudinal design to exclude reverse causation, the studies mentioned are sufficiently unbiased to assure validity of findings.

Social insurance registers.  Sickness absence and disability due to mental disorders have increased considerably in several northern European countries in recent years, and research in the area has been promoted. However, social insurance regulations vary between countries and over time, and data are therefore difficult to interpret. Only a few countries have comprehensive registers on sickness absence and disability pension, and even if they are available, diagnoses are questionable, since mental disorders and especially substance abuse may be underreported in physicians’ certificates. Hensing and Wahlström (31) have summarized findings and also discussed methodological aspects. Thus, analyses of occurrence and trends in sickness absence or disability pension due to mental disorders are difficult when using only social insurance registers. However, if a defined cohort with background data is available, a social insurance register can be validly used for comparison of incidence of sickness absence or disability pension between various groups in the cohort. This has been done in several studies from the Whitehall cohort, as well as in several cohorts from the Nordic countries (32–34).


Issues of validity are central in the use of case registers (Table 3). While data on admission and discharge usually are exact, diagnoses are not always correct. Mental health diagnoses need to be validated, and this was done in the Danish case register regarding schizophrenia (35). The Swedish case register has also been subject to some validation studies (36), but issues of validity and generalizability need to be addressed for each specific study purpose. A more serious problem may be the fact that many case registers are based on inpatients only, while psychiatric patients are increasingly being treated in outpatient care. Patients identified in a case register are hardly ever representative of all cases with a disorder, and this also holds in somatic care. But since the issue of concern in epidemiology and health services research generally is comparability rather than total coverage of the population, issues of validity and generalizability again need to be addressed for each specific type of study.
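The two concerns raised here, correctness of register diagnoses and incomplete capture of cases, correspond to standard measures from a validation study in which register diagnoses are compared with a reference assessment. The following minimal sketch uses invented counts and is not based on the validation studies cited (35, 36); the function name `validation_measures` is likewise an assumption for illustration.

```python
def validation_measures(true_pos, false_pos, false_neg, true_neg):
    """Summarize agreement between register diagnoses and a reference standard."""
    return {
        # Share of register diagnoses confirmed by the reference assessment.
        "ppv": true_pos / (true_pos + false_pos),
        # Share of true cases captured by the register; low if many
        # cases are treated only in outpatient care.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Share of non-cases correctly left without a register diagnosis.
        "specificity": true_neg / (true_neg + false_pos),
    }


# Hypothetical counts from a validation sample of 1000 persons.
measures = validation_measures(true_pos=85, false_pos=15,
                               false_neg=30, true_neg=870)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

A register can thus have a high positive predictive value for a diagnosis while still missing a substantial share of cases, and which of the two matters most depends on the aim of the study.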

Table 3.   Characteristics of psychiatric case registers
Coverage: Usually only inpatient care is covered, a strong limitation for many disorders
Attrition: If persons move to outpatient care or migrate
Representativity: If only hospitalized cases are included, more severe cases are overrepresented
Validity: Diagnostic validity needs to be assessed

Much of the knowledge we now take for granted regarding major mental disorders and related outcomes, such as schizophrenia and suicide, would not exist without the use of population based registers. Thus knowledge of background factors, occurrence and outcome of major mental disorders has been made possible through longitudinal studies and record linkages of large population samples.

Methodological and validity aspects are to a large extent dependent on the type of registers used. We have found a classification into different types of registers useful in a discussion of methodological aspects. While other perspectives could be added, the following four have been addressed in this paper: Coverage, attrition, representativity and validity. These perspectives need to be addressed in the planning and design of epidemiological studies, and validity studies should be performed according to the aim of the specific study being planned. Changes in the health care system need to be monitored, and the effect of increasing use of outpatient care should be understood.

With increasing burden of disease due to mental disorders worldwide, knowledge of the epidemiology of these disorders is of increasing interest. The Nordic countries have a strong history in this field of research, of great interest to the rest of the world. Universities and research funding agencies should recognize this valuable source of research capacity, and support fruitful continuation of a strong tradition.

Declaration of interest