Estonian National Mental Health Study: Design and methods for a registry‐linked longitudinal survey

Abstract
Objectives: The Estonian National Mental Health Study (EMHS) was conducted in 2021–2022 to provide population‐wide data on mental health in the context of the COVID‐19 pandemic. The main objective of this paper is to describe the rationale, design, and methods of the EMHS and to evaluate the survey response.
Methods: A regionally representative stratified random sample of 20,000 persons aged 15 years and older was drawn from the Estonian Population Register. Persons aged 18 years and older at the time of sampling were enrolled into three survey waves in which they were invited to complete an online or postal questionnaire about mental well‐being and disorders, and behavioral, cognitive, and other risk factors. Persons younger than 18 years of age were invited to fill in an anonymous online questionnaire starting from wave 2. To complement and validate the survey data, data on socio‐demographic, health‐related, and environmental variables were collected from six national administrative databases and registries. Additionally, a subsample was enrolled into a validation study using ecological momentary assessment.
Results: In total, 5636 adults participated in survey wave 1, 3751 in wave 2, and 4744 in wave 3. Adjusted response rates were 30.6%, 21.1%, and 27.6%, respectively. Women and older age groups were more likely to respond. Throughout the three survey waves, a considerable share of adult respondents screened positive for depression (27.6%, 25.1%, and 25.6% in waves 1, 2, and 3, respectively). Women and young adults aged 18 to 29 years had the highest prevalence of depression symptoms.
Conclusions: The registry‐linked longitudinal EMHS dataset is a rich and trustworthy data source that allows in‐depth analysis of mental health outcomes and their correlates in the Estonian population. The study serves as an evidence base for planning mental health policies and prevention measures for possible future crises.

lion in 2019, and its proportion of the total disease burden increased from 3.1% to 4.9% (GBD 2019 Mental Disorders Collaborators, 2022). The public health impact of mental health problems is also reflected in their considerable social and economic costs (Trautmann et al., 2016; Whiteford et al., 2013; Wittchen et al., 2011). Depending on the approach, the estimated global economic cost of mental health problems (based on data from 2010) varied between 2.5 and 8.5 trillion US dollars (Trautmann et al., 2016) and is expected to increase to as much as 16 trillion US dollars by 2030 (Bloom et al., 2011). This vast and growing societal impact underscores the urgency of tackling mental health problems at the population level.
The role of mental health and well-being in the public health domain has become even more important during the ongoing COVID-19 pandemic, which, in addition to its direct health impacts, has created an environment in which the existing risks for poor mental health are exacerbated. In most societies, the pandemic has affected employment opportunities, educational engagement, financial security, and access to health and social services, and has thus reshaped daily experiences.
The mounting evidence on the mental health impact of the COVID-19 pandemic suggests an increasing prevalence of stress, suicidal thoughts, anxiety, and depressive feelings (COVID-19 Mental Disorders Collaborators, 2021; Fountoulakis et al., 2021; OECD, 2021; Reile et al., 2021; Wang et al., 2020). A recent meta-analysis (COVID-19 Mental Disorders Collaborators, 2021) estimated that the COVID-19 pandemic has resulted in an additional 53.2 million cases of major depressive disorder and 76.2 million cases of anxiety disorders globally, increasing their prevalence by more than 25%. An increased prevalence of depression, post-traumatic stress disorder (PTSD), anxiety symptoms, substance abuse, emotional disturbance, insomnia and avoidance behavior, and suicides associated with epidemic-related anxiety and fear has also been reported during earlier epidemics (Brooks et al., 2020; Cheung et al., 2008; Luo et al., 2020). Moreover, the duration of mental health impacts is not necessarily limited to the epidemic/pandemic period.
Several studies have found a long-term increased risk of psychiatric disorders and suicides related to exposure to the SARS outbreak in 2003 (Liu et al., 2012; Mak et al., 2009; Tzeng et al., 2020; Wu et al., 2009, 2008). Given that similar long-term consequences for mental health have also been found for economic crises (Frasquilho et al., 2016) and other natural or man-made disasters (Norris et al., 2002), detailed and timely data collection is paramount to adequately assess the public health impact of the pandemic.
Although mental health had become prominent in public discourse in Estonia before the COVID-19 pandemic, only a few policy documents have been published so far. The 2020 Green Paper on Mental Health (Sotsiaalministeerium, 2020) emphasized the need to enhance the prevention and early detection of mental health problems and the availability of high-quality care. However, there is so far no monitoring system to assess the prevalence of mental health problems and treatment needs within the general population. Although mental health indicators have been included in several regular surveys conducted in Estonia (e.g., Health Behaviour in School-Aged Children, Estonian Health Interview Survey, Survey of Health, Ageing and Retirement in Europe, and Health Behaviour Study Among Estonian Adult Population), none of them allows a detailed assessment of population mental health. This is problematic, as surveillance of population mental health and identification of vulnerable groups are essential for mental health policy planning and prevention. The COVID-19 pandemic not only increased the need for a thorough overview of population mental health in the current crisis but also highlighted the necessity of increasing preparedness for future crises.
This paper presents an outline of the rationale, design, and methods of the Estonian National Mental Health Study (EMHS) and evaluates the survey response. The EMHS is the first large-scale population-based study focusing on mental health in Estonia. It aimed to: (a) provide a comprehensive overview of the mental health and related lifestyle factors of the Estonian population during the ongoing COVID-19 pandemic, (b) explore the mechanisms behind the development of mental health issues in stressful situations, (c) evaluate the need for national support services in a crisis, and (d) propose a set of survey and registry-based indicators that could be used for nationwide mental health monitoring. The study was commissioned by the Estonian government to gain an overview of population mental health and to prepare the ground for building a mental health monitoring system for Estonia. The EMHS was conducted by researchers from the Estonian National Institute for Health Development (NIHD) and the University of Tartu.

Study design and timeline
The EMHS is a methodologically complex study that combined: (a)

Study sample
The sample frame for the EMHS consisted of 1,110,274 permanent residents of Estonia aged 15 years and older according to the Population Register data as of January 1, 2020 (Statistics Estonia). Stratified systematic random sampling was used to select 20,000 individuals from the sample frame. The sample size was predetermined to ensure enough respondents for (a) region-specific comparisons, and (b) three-wave longitudinal analyses at the population level. The sample frame was first divided into 17 non-overlapping regional strata (15 counties and the two largest cities, Tallinn and Tartu). Based on sample size calculations (alpha = 0.05, power = 80%) and modelling of different response rate scenarios (response rates per survey wave 25%–40%), the initial sample size for each regional stratum was defined as 1000 individuals (n = 17,000). Within each regional stratum, 30 strata were formed by sex and 5-year age groups (15–19, ..., 85+) proportional to the distribution of the target population in the region, resulting in 510 strata.
As previous population-based health surveys in Estonia have demonstrated systematically lower participation rates for men and those in younger age groups (Reile & Veideman, 2021), different age- and sex-specific inclusion probabilities were used to guarantee enough participants in each stratum. By applying oversampling coefficients (ranging from 1.1 to 1.5), an additional 3000 cases were distributed to age- and sex-specific strata, resulting in a total sample size of 20,000 individuals and a mean regional stratum size of 1176.
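The sample-size arithmetic described above can be checked with a short sketch. Python is used here purely for illustration (the study's own processing was done in R). The function name, the toy population counts, and the largest-remainder rounding are assumptions made for this example and are not taken from the study protocol.

```python
# Figures stated in the text.
N_REGIONS = 17            # 15 counties + Tallinn and Tartu
BASE_PER_REGION = 1000    # initial sample size per regional stratum
OVERSAMPLE_TOTAL = 3000   # additional cases distributed via oversampling
AGE_GROUPS = 15           # 5-year groups: 15-19, ..., 80-84, 85+
SEXES = 2

cells_per_region = AGE_GROUPS * SEXES                          # 30
total_strata = N_REGIONS * cells_per_region                    # 510
total_sample = N_REGIONS * BASE_PER_REGION + OVERSAMPLE_TOTAL  # 20,000
mean_region_size = total_sample / N_REGIONS                    # about 1176


def proportional_allocation(pop_counts, n):
    """Split n sample slots across cells in proportion to population counts.

    Largest-remainder rounding (an assumption) makes the integer cell
    sizes add up to exactly n.
    """
    total = sum(pop_counts.values())
    exact = {cell: n * count / total for cell, count in pop_counts.items()}
    alloc = {cell: int(share) for cell, share in exact.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - alloc[c], reverse=True)
    for cell in by_remainder[:leftover]:
        alloc[cell] += 1
    return alloc


# Hypothetical population counts for four sex-by-age cells of one region.
toy_region = {
    ("M", "15-19"): 1200,
    ("F", "15-19"): 1150,
    ("M", "20-24"): 1400,
    ("F", "20-24"): 1350,
}
toy_alloc = proportional_allocation(toy_region, BASE_PER_REGION)
```

The oversampling step is not reproduced here because the text gives only the range of coefficients (1.1 to 1.5), not the coefficient applied to each stratum.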

Timeline
The surveys were carried out in mixed mode (web and postal survey) in three 2-month data collection waves between January 2021 and February 2022. The first wave was started on January 4, 2021, the second on May 3, 2021, and the third wave on January 3, 2022.

Eligibility criteria
The eligibility of sampled individuals into each survey wave was based on: (a) age criteria, (b) status of residence, (c) availability of contact deleted from the database (except from the file with initial sample data). Individuals who were not able to respond due to a persistent medical or other impeding condition were removed from the contact list in the subsequent wave(s), but their data were kept in the database.

Recruitment
Based on the contact data of the sample, individuals were assigned to either web (at least one valid email address available) or postal (no email address given) subsamples and given personal study numbers. For the postal subsample, printed survey materials (a cover/informed consent letter with information about the study, a questionnaire, a pre-paid envelope to return the questionnaire, and an information leaflet with mental health support contacts) were sent to the person's home address. The postal subsample also had the possibility to complete the questionnaire online by following the link on the study website and entering the personal study number marked on the printed questionnaire. If the completed questionnaire (either on paper or on the web) was not returned within 2 weeks of the initial posting, a reminder was sent by post. Four weeks after the initial posting, another full set of survey materials was sent to nonresponders.
FIGURE 2 Formation of eligible survey sample from wave 1 to wave 3.
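The mode assignment and postal follow-up schedule described above can be sketched as follows. This is a minimal Python illustration: the function names are hypothetical, the email check is a crude stand-in for whatever validation the study used, and treating the wave 1 start date as the posting date is an assumption.

```python
from datetime import date, timedelta


def assign_mode(emails):
    """Return 'web' if at least one usable email address exists, else 'postal'.

    The '@' test is only a placeholder for real address validation.
    """
    return "web" if any(e and "@" in e for e in emails) else "postal"


def postal_followups(initial_posting):
    """Dates of the postal reminder (2 weeks) and the second full mailing (4 weeks)."""
    return {
        "reminder": initial_posting + timedelta(weeks=2),
        "second_mailing": initial_posting + timedelta(weeks=4),
    }


# Assumed posting date: the first day of wave 1.
schedule = postal_followups(date(2021, 1, 4))
```

Both follow-ups apply only to sampled persons who have not yet returned a questionnaire by the given date, so in practice the schedule would be combined with the list of received responses.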
The survey was conducted in three languages (Estonian, Russian, and English) in the first wave and, to reduce the costs of the study, in two languages (Estonian and Russian) in the later waves. The survey language for each person was determined based on the reported ethnicity and/or native language and/or official citizenship in the Population Register. In the online survey, participants could choose the language of their preference. In the postal survey, participants could change the survey language by contacting the organizers. In waves 2 and 3, the web subsample initially assigned to the English version was contacted in Estonian (the official language of Estonia) and could use automatic translation into English. The postal subsample assigned to the English version in wave 1 was not recontacted in subsequent waves, because the expected response rate for mailing the Estonian version was considered too low relative to the costs. However, if a valid email address became available via a sample update (applicable also to the rest of the sample), these persons were included in the web subsample in subsequent waves.

Incentives to motivate participation
To motivate participation, 50 supermarket gift cards worth 30 euros each were raffled among the respondents after every survey wave.
After the third wave, an additional 20 gift cards worth 100 euros each were raffled among the participants who had responded in all three survey waves.

Measures
Based on the study objectives, the following topics were covered in

Data processing
Completed postal questionnaires were entered manually using a data entry form on the LimeSurvey platform. A random selection of 100 questionnaires from each wave was double-entered for quality control.
An error rate of < 0.5% was considered acceptable.
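A double-entry check of this kind can be sketched in Python as below. The text does not say how the error rate was defined, so computing it per item (mismatching items divided by all items compared) is an assumption, as are the function name and the data layout.

```python
def entry_error_rate(first_pass, second_pass):
    """Share of item-level disagreements between two entries of the same questionnaires.

    Both arguments map a questionnaire ID to a dict of {item: entered value}.
    """
    compared = 0
    mismatches = 0
    for qid, items in first_pass.items():
        other = second_pass[qid]
        for item, value in items.items():
            compared += 1
            if other.get(item) != value:
                mismatches += 1
    return mismatches / compared if compared else 0.0


# Toy example: one questionnaire with 400 items and a single keying error.
first = {"Q1": {f"item{i}": i % 5 for i in range(400)}}
second = {"Q1": dict(first["Q1"], item7=99)}
rate = entry_error_rate(first, second)
acceptable = rate < 0.005  # threshold stated in the text
```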
Given the mixed-mode design and multiple contacts, the dataset was screened for duplicate responses. If the same survey ID number appeared more than once in the dataset, the entry with more items answered was kept. Cases with invalid survey numbers were excluded.
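The duplicate-screening rule above can be illustrated with a short Python sketch. The record layout and function name are hypothetical, and keeping the first entry when two duplicates have the same number of answered items is an assumption, since the text does not describe tie-breaking.

```python
def deduplicate(responses, valid_ids):
    """Keep one response per study number: the one with more answered items.

    Responses with a study number outside valid_ids are dropped.
    Ties keep the earlier entry (assumption).
    """
    best = {}
    for response in responses:
        sid = response["study_id"]
        if sid not in valid_ids:
            continue
        answered = sum(v is not None for v in response["answers"].values())
        if sid not in best or answered > best[sid][0]:
            best[sid] = (answered, response)
    return [response for _, response in best.values()]


responses = [
    {"study_id": "A1", "mode": "postal", "answers": {"q1": 2, "q2": None}},
    {"study_id": "A1", "mode": "web", "answers": {"q1": 2, "q2": 3}},
    {"study_id": "ZZ", "mode": "web", "answers": {"q1": 1, "q2": 1}},
]
kept = deduplicate(responses, valid_ids={"A1", "B2"})
```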
In both postal and web survey data (except for the anonymous sample of minors), answers on gender and date of birth were verified against the Population Register data. Inconsistencies were checked for possible data entry errors (postal surveys), and cases with remaining inconsistencies were excluded from the data. A questionnaire was considered completed if more than 50% of the items on mental health (section B in the questionnaire; excluding free-text questions and conditional items) were answered.
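The completion criterion can be expressed as a small Python check. The item names are hypothetical. The list passed in is assumed to contain only the eligible section B items, with free-text and conditional items already removed.

```python
def is_completed(answers, section_b_items):
    """True if strictly more than 50% of the eligible section B items were answered."""
    if not section_b_items:
        return False
    answered = sum(answers.get(item) is not None for item in section_b_items)
    return answered / len(section_b_items) > 0.5


items = ["b1", "b2", "b3", "b4"]
half = is_completed({"b1": 1, "b2": 0}, items)            # exactly 50% answered
most = is_completed({"b1": 1, "b2": 0, "b3": 2}, items)   # 75% answered
```

Because the rule says "more than 50%", a questionnaire with exactly half of the items answered does not count as completed.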
The EMHS uses different sets of post-stratification weights to adjust for potential response bias in the data. Given that the initial study sample consisted of 17 regional subsamples (15 counties and 2 cities), each representative of the sex and age distribution of the respective population, the weights are meant to adjust the responses to enable analysis at both the regional and total population levels. Simple proportional weighting was used, with coefficients obtained by dividing the proportion of each 5-year age group for men and women in the population by the same proportion in the study data. Weights were calculated separately for each survey wave. Data were pre-processed using the statistical software R (R Core Team, 2021).
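The proportional weighting described above amounts to a ratio of two shares per sex-by-age cell. The Python sketch below illustrates it with toy numbers. The function name and the data are hypothetical, and the authors' actual implementation was in R.

```python
from collections import Counter


def poststratification_weights(population_counts, respondent_cells):
    """weight(cell) = population share of the cell / respondent share of the cell.

    population_counts maps each (sex, age group) cell to its population size.
    respondent_cells lists the cell of every respondent.
    """
    sample_counts = Counter(respondent_cells)
    pop_total = sum(population_counts.values())
    n = len(respondent_cells)
    return {
        cell: (population_counts[cell] / pop_total) / (count / n)
        for cell, count in sample_counts.items()
    }


# Toy data: two equally sized population cells, but men are under-represented.
population = {("M", "20-24"): 500, ("F", "20-24"): 500}
respondents = [("M", "20-24")] + [("F", "20-24")] * 3
weights = poststratification_weights(population, respondents)
weighted_n = sum(weights[cell] for cell in respondents)
```

In this example the single male respondent receives a weight of 2.0 and each female respondent a weight of about 0.67, so the weighted cell shares match the population and the weights sum to the number of respondents. For regional analyses, the same calculation would presumably be run within each regional stratum, once per survey wave.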

Registry study
The registry study, based on the survey sample (n = 20,000), aimed to complement and validate the survey data and to estimate non-response bias. Individual-level data on demographic, socioeconomic, health, and environmental characteristics were linked from six administrative databases/registries:
1. Population Register data cover the date of birth, sex, legal marital status, citizenship, place of residence and self-reported ethnicity, native language, and highest completed educational level. In the wave 2 web survey, these items were displayed to wave 1 non-respondents only.

Validation study
To validate self-assessed indicators, a subsample of wave 1 survey respondents was invited to participate in the validation study carried out simultaneously with the wave 2 survey. Only web respondents with a valid email address were considered for the validation study. Because the study was carried out in Estonian, an additional inclusion criterion was a good command of Estonian based on either registry data or the wave 1 survey. By these criteria, 3698 participants were invited to the validation study, which included momentary assessments, emotions, and emotion regulation (five daily assessments during seven consecutive days), and daily reports of bedtime, alcohol consumption, physical activity, and self-reported health. Of them, a subsample of 1000 participants living in six selected areas close to the study center were also invited to wear an activity monitor for the assessment of physical activity and sleep, and to donate a saliva sample for cortisol assessment. Further details on the validation study are given in Additional File 6.

Data management
Data are stored and managed at the NIHD. Personal data were used solely for contacting participants and making enquiries in national databases. Thereafter, the data were pseudonymized and the personal study number was used to link survey data with data received from registries.
The password-protected key file linking participants' personal data to their personal study number is stored separately from other study data on a limited-access NIHD server for further research in accordance with the ethics committee approval. The key file will be preserved until December 31, 2030 and will be deleted thereafter. The anonymous data will be preserved indefinitely and will be available upon reasoned request.

Ethics
The study protocol and its amendments were approved by the Committee. Participants were informed that responding to the questionnaire was considered consenting to participate in the survey and granting permission for enquiries to be made in national databases.
Consent was requested separately for each study wave and for the validation study.
both national and regional levels. The weights are added to the dataset for the analytical sample.

RESULTS
A comprehensive overview of the prevalence of mental health problems in the Estonian population is outside the scope of this paper.
Nevertheless, such data, even if limited to selected indicators, can shed light on the general reliability and usefulness of the data collected in the EMHS, whose design and methods have been thoroughly described in this paper. Therefore, we present here the weighted prevalence estimates for depression, one of the most common mental disorders, as measured with the EST-Q2 (Aluoja et al., 1999) ( diagnoses of mental disorders) and thus, to assess the consistency of the two approaches. Finally, the use of regionally stratified sampling makes it possible to study (previously unavailable) regional variance in mental health indicators. As such, the EMHS has a strong potential to make a substantial contribution to the field of public mental health at both academic and applied levels.
The EMHS also has some limitations. One of the most serious challenges for the study was to recruit a sufficient number of participants to allow reliable conclusions and guarantee the generalizability of the results. Low response rates in population-based surveys have become increasingly common (Galea & Tracy, 2007), with online surveys on average yielding substantially lower response rates than other survey modes (Wu et al., 2022). The EMHS is no exception: the adjusted response rates were 30.6% in wave 1, 21.1% in wave 2, and 27.6% in wave 3. A low response rate does not in itself mean biased results if responders and non-responders do not differ systematically with respect to the outcome variable of interest. The good agreement with other recent studies regarding the prevalence of symptoms of depression allows us to believe that the results reflect the true pattern of mental health in the population. Another important limitation is the shortage of validated screening tools for detecting mental disorders that are available in Estonian. The depression subscale of the EST-Q2 is so far the only screening scale in Estonian that has been calibrated against clinical diagnosis (Ööpik et al., 2006). In the framework of the EMHS, several internationally well-known measures (e.g., subscales from the DSM-5 Screener) were adapted to Estonian; however, more research is needed to confirm their validity and calibrate them against clinical diagnoses. Finally, because of legal restrictions and the study time frame, we were only able to include minors aged 15–17 years in two of the three waves of the study and had to collect their data anonymously.

CONCLUSIONS
The data collected in the EMHS serve as a solid foundation for planning mental health policies both generally and during crises. Given the wide spectrum of mental health disorders covered, the EMHS offers a rich baseline for monitoring changes and building a full-scale mental health surveillance system in Estonia. No less importantly, the EMHS provides valuable new insights into the determinants of mental health and well-being and helps to elucidate some of the mechanisms by which mental disorders develop, particularly during crises. Such information is particularly valuable given the scarcity of mental health studies in the Eastern European region.

ACKNOWLEDGMENTS
The study was commissioned and funded by the Estonian Research Council using European Regional Development Fund program "Strengthening of sectoral R&D (RITA)" activity 1 "Support for

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
Data collected in the study are available from the authors upon request.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1002/brb3.3106