Tsogolo la Thanzi: A Longitudinal Study of Young Adults Living in Malawi's HIV Epidemic

Abstract Tsogolo la Thanzi (TLT) was designed to study how young adults navigate sexual relationships and childbearing during a generalized HIV epidemic. TLT began in 2009 with a population‐representative sample of 1,505 women and 574 men between the ages of 15 and 25 living in Balaka, southern Malawi, where regional adult HIV prevalence then stood at 15 percent. The first phase (2009–11) included a series of eight interviews, spaced four months apart. During this time, women's romantic and sexual partners enrolled in the study on an ongoing basis. A refresher sample of 315 women was added in 2012. Seventy‐eight percent of respondents were re‐interviewed in the second phase of TLT (2015), which consisted of follow‐up interviews approximately 3.5 years after the previous interview (ages 21–31). At each wave, detailed information about fertility intentions and behaviors, relationships, sexual behavior, health, and a range of sociodemographic and economic traits was gathered by means of face‐to‐face surveys. Biomarkers for HIV and pregnancy were also collected. Distinguishing features include: a population‐representative sample, closely spaced data collection, dyadic data on couples over time, and an experimental approach to HIV testing and counseling. Data are available through restricted data‐user agreements managed by Data Sharing for Demographic Research (DSDR) at the University of Michigan.

19 years and the total fertility rate (TFR) surpassed five children per woman as late as 2010 (NSO [Malawi] and ICF Macro 2011 and 2016; Yeatman and Trinitapoli 2013). Malawi also has a generalized HIV epidemic where, as in the rest of sub-Saharan Africa, HIV is primarily transmitted through heterosexual sex. This poses a set of critical dilemmas for young adults who want children but are also concerned about HIV infection. The risks of contracting HIV and becoming pregnant are synchronized; many behaviors (e.g., abstinence, coital frequency, and condom use) that affect one also affect the other. Not coincidentally, peak HIV incidence for women overlaps almost exactly with peak fertility, underscoring the causal tangle that characterizes this period of the life course. TLT is based in and around the town of Balaka in southern Malawi, the region of the country with the highest HIV prevalence, about 15 percent at the time TLT began (NSO [Malawi] and ICF Macro 2011). A growing town and district capital, Balaka is located approximately 75 miles (120 kilometers) from Blantyre, along Malawi's main road and trading route, which bisects the country and connects Balaka to trading centers in Mozambique, Tanzania, and Zambia. Balaka is characterized by a religiously and ethnically diverse population surrounded by communities that rely primarily upon subsistence agriculture.

SAMPLE
TLT's core sample consists of population-representative samples of young women and young men between the ages of 15 and 25 in 2009; a separate sample of male partners was enrolled and followed over time (Table 1).

Core Sample
In April 2009, TLT conducted a complete household listing of all individuals whose usual residence was within a 4-mile (7-kilometer) radius of Balaka's main market. The catchment area includes a train station, a bus depot, a hospital and two smaller district health centers, a cotton "ginnery," dozens of religious congregations, and about 100 villages of various sizes. The listing exercise enumerated urban town residents and residents of rural villages well outside of town, and went beyond the listing of regular households and their members to include all individuals living in boarding schools and "rest houses" (motels used as residences, often for sex workers) within the catchment area. The core sample was drawn as a simple random sample of 1,562 women between the ages of 15 and 25 in the catchment area of whom 1,505 completed a baseline interview (96.4 percent response rate). Reasons for nonparticipation were refusal (N=30), not found (N=17), and language or mental health problems (N=10). Out of 614 men sampled in this same age-range, 574 (93.5 percent) completed a baseline survey. Reasons for nonparticipation among men included: not found (N=21), refusal (N=11), and language or mental health problems (N=8).
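The response rates above follow directly from the reported counts. The short Python check below is only an illustrative sketch: the function and variable names are assumptions introduced here, while the counts are those given in the text. It also confirms that the listed reasons for nonparticipation account for every sampled person who was not interviewed.

```python
def response_rate(completed: int, sampled: int) -> float:
    """Return completed interviews as a percentage of sampled individuals."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return 100.0 * completed / sampled


# Counts reported in the text (the names are illustrative, not TLT variable names).
women_sampled, women_completed = 1562, 1505
men_sampled, men_completed = 614, 574

women_nonresponse = {"refusal": 30, "not_found": 17, "language_or_mental_health": 10}
men_nonresponse = {"not_found": 21, "refusal": 11, "language_or_mental_health": 8}

# The reasons for nonparticipation should sum to sampled minus completed.
women_gap = women_sampled - women_completed
men_gap = men_sampled - men_completed
women_reasons_total = sum(women_nonresponse.values())
men_reasons_total = sum(men_nonresponse.values())

print(f"Women: {response_rate(women_completed, women_sampled):.1f} percent")
print(f"Men: {response_rate(men_completed, men_sampled):.1f} percent")
print(women_reasons_total == women_gap, men_reasons_total == men_gap)
```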
TLT was powered to study the relationship between childbearing and HIV among women. The smaller core sample of men facilitates comparisons by gender at this same stage of the life course and provides a benchmark for understanding the selective aspects of the male-partner sample (described below). The random sample of men also provided some "cover" for the romantic partners enrolled in the study; without them, it would have been too obvious that the men in the study had all gotten there because of their relationships with female respondents. Table 2 provides a descriptive overview of the core sample, by gender, at baseline. Core-sample respondents come from a number of ethnic groups, the majority of which are traditionally matrilineal and matrilocal (e.g., Lomwe, Ngoni, and Yao), making the TLT sample more matrilocal than the country as a whole, which is approximately split between traditionally matrilineal and traditionally patrilineal ethnic groups (NSO [Malawi] and ICF Macro 2011). There is also considerable religious diversity in the sample: Catholic, Muslim, Mission Protestant, Pentecostal, and New Mission Protestant groups are all well represented. As expected for this age range, women are more likely than men to be married and have children and less likely to be in school.

Male Partners
Because both HIV infection and pregnancy are dyadic phenomena, TLT was designed with a focus on couple-level dynamics. Many of the behaviors of interest occur before marriage; TLT therefore widened its lens to include not just spouses of women in the sample (a common approach in couples' studies) but also women's partners at much earlier stages of a relationship. During their first interview, and at all subsequent interviews, women answered detailed questions about their sexual and romantic relationships and then recruited the male partners they named into the TLT study by providing them with tokens that had unique identifiers. Female respondents shared these tokens with their male partners, and the men subsequently enrolled themselves in the study by coming to the TLT research center (N=1,226). Research staff confirmed the relationship through an initial conversation with the male partner at the reception area in the research center and then again based on his reports about his relationships during the survey itself. On a few occasions, the data from male partners were excluded from the study when they were identified as "imposters," such as one case in which the respondent had passed the token to her brother, rather than her romantic partner. Male partners had a median age of 26 years at enrollment and ranged in age from 13 to 62.

Refresher Sample
Immediately following Wave 8, the study team added a refresher sample of 315 female respondents (out of 527, response rate: 60 percent) to TLT-1. Drawn from the original 2009 sampling frame to offset attrition, the refresher sample also provides a comparison sample against which analysts can identify potential panel-conditioning effects within the study.

STUDY DESIGN
TLT-1 (2009-11) featured an intensive design wherein female and male respondents in the core sample were interviewed every four months for eight waves (Figure 1). The design was motivated by the observation that many critical life events occur during early adulthood (e.g., sexual onset, marriage, school dropout, pregnancy, HIV infection), which makes it difficult (if not impossible) to untangle the temporal ordering of these events using cross-sectional data or prospective studies with multiyear intersurvey intervals. TLT's closely spaced intervals facilitate the identification of causal relationships during this eventful period of the life course and allow for new insights into the dynamics by which preferences, perceptions, and behaviors change.
The TLT-2 follow-up consisted of a single interview in 2015. TLT-2 included core-sample women, core-sample men, refresher-sample women, and current male partners. Across both phases of the study, men and women in the core sample were interviewed a maximum of nine times. Male partners contributed differing numbers of data points; once enrolled, they participated in all subsequent waves of TLT-1, even if their index relationship ended (e.g., up to eight waves for male partners who enrolled at Wave 1; up to three for those who enrolled at Wave 6). During TLT-2, women again recruited current male partners using tokens. Male partners from TLT-1 were only included in TLT-2 if they were still in a relationship with a female respondent. Male partners who appear in both the TLT-1 and TLT-2 datasets can be followed longitudinally across study phases using a single, unique identifier.
In 2015, TLT-2 successfully re-interviewed 80 percent of women and 71 percent of men in the core sample. A bivariate analysis of sample retention can be found in Table 2. Significant predictors of retention include: older age, not being enrolled in school at baseline, having less education, having been born in Balaka, ethnicity, and (for women) having ever been married. Photographs were used to confirm the identity of respondents during the study period.

CONSENT
Respondents gave consent at the time they were recruited for the study, before each interview, and before biomarker collection. Unmarried men and women aged 15-17 in the core sample enrolled in the study only after the study team obtained informed consent from a parent or guardian and assent from the minors themselves.

STUDY INSTRUMENTS
Malawian interviewers, trained extensively by the TLT research team and certified by Malawi's Ministry of Health to conduct HIV testing and counseling (HTC), conducted the interviews in Chichewa. Rather than following the standard household-survey approach, which is often affected by bystanders and interruptions (Trinitapoli and Weinreb 2015), interviews took place in private rooms at the TLT research center, located near Balaka's main market. Interviewers and respondents were matched on gender. During each wave of TLT-1 and TLT-2, respondents were asked questions from a core questionnaire about fertility behaviors and intentions, contraceptive use, relationships, perceived risk of HIV, HIV risk behaviors, HIV testing and treatment experiences, socioeconomic situation, childbearing, and health. In subsequent interviews, the core questionnaire was supplemented with in-depth modules focusing on a range of topics (Table 3).

ROSTERS
Respondents completed child rosters at baseline, in which they listed all the children they had given birth to or fathered, and full household rosters beginning at Wave 2. Interviewers updated these rosters at each subsequent wave, recording changes in the household's composition, including births and deaths.

BIOMARKER DATA
At each wave, female respondents were asked to take an hCG pregnancy test. The acceptance rate ranged between 84 and 94 percent across waves; the main reason for refusal was menstruation. Table 4 presents the results across all waves of pregnancy testing by self-reported pregnancy and identifies discrepancies between self-report and biomarker result. For example, 3.6 percent of women who reported not being pregnant had a positive pregnancy test, and 4.1 percent of those who reported being pregnant had a negative pregnancy test. Overall, however, self-reports of pregnancy and hCG test results are highly correlated (r = 0.81 when limited to pregnant/nonpregnant responses). During TLT-1, HIV testing and counseling was offered using a randomized experimental design in which the frequency of HIV testing was manipulated to allow researchers to examine the impact of HTC on a variety of behavioral and socioeconomic outcomes. At their baseline interview in 2009, core-sample respondents were randomized into three groups, with male partners subsequently assigned to the same group as their nominating partner: one-third would receive HTC at every interview; one-third would receive HTC at Waves 4 and 8; and the final one-third would not get tested at TLT until Wave 8. All respondents were offered HTC at TLT-2. At Wave 1, HTC uptake among respondents offered testing was 85 percent for women and 78 percent for men (core sample and partners). In 2015, 93 percent of women and 91 percent of men consented to HIV testing. HTC used rapid HIV tests and followed the protocol set by the Malawi Ministry of Health and used by local clinics.
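The three-arm testing design described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study team's actual randomization procedure, which the text does not describe: the arm labels, respondent identifiers, seed, and function names are all hypothetical. What the sketch takes from the text is the TLT-1 testing schedule for each arm and the rule that a male partner inherits the arm of the woman who nominated him.

```python
import random

# TLT-1 waves at which each arm is offered HTC (arm labels are hypothetical).
HTC_SCHEDULE = {
    "every_wave": set(range(1, 9)),  # Waves 1 through 8
    "waves_4_and_8": {4, 8},
    "wave_8_only": {8},
}


def assign_arms(core_ids, seed=0):
    """Shuffle core respondents and deal them into three near-equal arms."""
    rng = random.Random(seed)
    shuffled = list(core_ids)
    rng.shuffle(shuffled)
    arms = list(HTC_SCHEDULE)
    return {rid: arms[i % len(arms)] for i, rid in enumerate(shuffled)}


def partner_arms(partner_to_nominator, core_arms):
    """Give each male partner the arm of the respondent who nominated him."""
    return {p: core_arms[n] for p, n in partner_to_nominator.items()}


def offered_htc(arm, wave):
    """Return True if respondents in this arm are offered HTC at this TLT-1 wave."""
    return wave in HTC_SCHEDULE[arm]


# Hypothetical identifiers, for illustration only.
core_ids = [f"R{i:03d}" for i in range(1, 10)]
core_arms = assign_arms(core_ids, seed=42)
partners = partner_arms({"P001": "R001", "P002": "R005"}, core_arms)
```

Dealing a shuffled list into arms keeps the groups within one respondent of equal size. Simple independent draws per respondent would also be a valid randomization but would not guarantee the one-third split.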
Among core-sample respondents, HIV prevalence in 2009 (Wave 1) was 1.7 percent among men and 6.0 percent among women. By 2015 (TLT-2), prevalence had increased to 3.8 percent for men and 15.2 percent for women (Figure 2). Although sizable, this increase in HIV prevalence is consistent with the known age and gender patterns of HIV prevalence that have characterized the country's epidemic for decades (NSO [Malawi] and ICF International 2016). Women's HIV incidence between the 2011 (Wave 8)/2012 (refresher) interviews and TLT-2 (2015), when all respondents were offered testing, was 1.58 per 100 person-years (95% CI=1.21, 2.02).
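An incidence rate per 100 person-years is the number of new infections divided by the total time at risk, multiplied by 100. The sketch below shows that arithmetic in Python. The event count and person-years used in the example are hypothetical placeholders, not TLT figures, and the interval uses a large-sample approximation on the log scale; the text does not state which interval method the authors used.

```python
import math


def incidence_per_100py(events: int, person_years: float) -> float:
    """Incidence rate expressed per 100 person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100.0 * events / person_years


def approx_ci_per_100py(events: int, person_years: float, z: float = 1.96):
    """Approximate confidence interval using a normal approximation to log(rate).

    The standard error of log(rate) is taken as 1 / sqrt(events), a common
    approximation for Poisson counts; exact intervals will differ somewhat.
    """
    if events <= 0:
        raise ValueError("the log-scale interval needs at least one event")
    rate = incidence_per_100py(events, person_years)
    factor = math.exp(z / math.sqrt(events))
    return rate / factor, rate * factor


# Hypothetical inputs for illustration only (not taken from TLT data).
example_rate = incidence_per_100py(events=30, person_years=2000.0)
example_low, example_high = approx_ci_per_100py(events=30, person_years=2000.0)
```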

ANCILLARY QUESTIONNAIRES
Following the core questionnaire and biomarker portions of the study, women with new and ongoing pregnancies completed a pregnancy questionnaire that asked about when and how they found out they were pregnant, how they reacted emotionally, prenatal health behaviors, and antenatal testing for HIV. In waves that followed a birth, women also completed a postpartum questionnaire, which asked about their delivery, postnatal behaviors, and infant's health. Over the course of the study, the TLT research team tracked respondents who had migrated, re-interviewing those who had moved to other districts when possible. During TLT-1, most tracking was done within the district; however, at Wave 8 and TLT-2, migrants were followed throughout Malawi. When respondents truly could not be interviewed because of migration, interviewers conducted a migration autopsy with a family member or neighbor who knew the reasons for migration and the respondent's destination. Twenty-one respondents in the core and refresher samples died during the study period; to gather information about these deaths, interviewers conducted verbal autopsies with a family member or neighbor.

COUPLES AND PARTNERSHIP DATA
TLT data are designed to facilitate two types of dyadic data analysis. TLT collected data from women and men in relationships over time, capturing relationships as they formed, strengthened, and dissolved. Time-varying, couple-level linking files can be used to match data from TLT women to the corresponding data from the men they referred to the study. Additionally, all respondents were asked to report on up to three ongoing partnerships at each wave. These self-reported partnerships can be followed longitudinally over TLT-1 using a partnership-link file, even if the partners themselves never enrolled in the study.
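The couple-level matching described above amounts to joining women's and men's records through a link file keyed on wave. The Python sketch below shows the general idea only. Every identifier, field name, and value in it is hypothetical; the actual file layouts and variable names are documented in the TLT data key, not here.

```python
def build_couple_waves(link_rows, women, men):
    """Pair each woman's record with her partner's record from the same wave.

    link_rows: dicts with hypothetical keys "wave", "woman_id", "partner_id".
    women, men: survey records keyed by (respondent_id, wave).
    Couple-waves are kept only when both partners were interviewed.
    """
    couples = []
    for row in link_rows:
        w_key = (row["woman_id"], row["wave"])
        m_key = (row["partner_id"], row["wave"])
        if w_key in women and m_key in men:
            couples.append({
                "wave": row["wave"],
                "woman_id": row["woman_id"],
                "partner_id": row["partner_id"],
                "woman": women[w_key],
                "man": men[m_key],
            })
    return couples


# Hypothetical records for illustration only.
link_rows = [
    {"wave": 1, "woman_id": "W01", "partner_id": "M10"},
    {"wave": 2, "woman_id": "W01", "partner_id": "M10"},
    {"wave": 2, "woman_id": "W02", "partner_id": "M11"},
]
women = {
    ("W01", 1): {"wants_child_soon": True},
    ("W01", 2): {"wants_child_soon": False},
    ("W02", 2): {"wants_child_soon": True},
}
men = {
    ("M10", 1): {"wants_child_soon": True},
    ("M10", 2): {"wants_child_soon": True},
    # M11 was not interviewed at Wave 2, so that couple-wave is dropped.
}

couple_waves = build_couple_waves(link_rows, women, men)
```

Keeping the link file time-varying, with one row per couple per wave, lets a relationship that forms or dissolves partway through the study appear only in the waves where it existed.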

VALUE OF THE DATA
TLT's distinguishing features differentiate it from other studies in the region and make it useful to public health researchers and social scientists working in a variety of disciplines. Specifically, the study includes:
• Detailed data from a population-based sample in an HIV-endemic context over time
• An intensive design with respondents interviewed every four months during TLT-1
• A randomized, experimental approach to HIV testing, together with biomarkers for HIV status, so that researchers can study the impact of knowing one's status and also have data on respondents' HIV status and incident HIV infections
• A biomarker for pregnancy
• Dynamic, dyadic data on couples over time
Key findings have centered on the changing HIV treatment context (Yeatman and Trinitapoli 2017), fertility preferences (Trinitapoli and Yeatman 2011; Yeatman, Sennott, and Culpepper 2013; Yeatman and Sennott 2014; Sennott and Yeatman 2018; Trinitapoli and Yeatman 2018), education and literacy (Smith-Greenaway 2015; Frye 2017), young-adult relationships (Frye and Trinitapoli 2015), and family dynamics (Bachan 2014; Trinitapoli, Yeatman, and Fledderjohann 2014). A publication list for TLT is available at https://scholar.google.com/citations?user=OFx9oPIAAAAJ&hl=en.

LIMITATIONS
By design, TLT is a focused study of a particular period of the life course in a context of high HIV prevalence and early childbearing. As such, our study offers rich detail about men and women between the ages of 15 and 31 but cannot speak to behaviors and preferences outside this age range. Additionally, while the TLT sample is population-based and representative of the catchment area in 2009, it is not nationally representative and the geographic connection weakens over time as some respondents migrate outside the catchment area. Data users without much knowledge of African societies may not fully perceive the very particular and simultaneously very typical features of Balaka. We urge interested data users to read classic and contemporary ethnographic accounts of daily life in this part of the world (Mitchell 1956; Peters 1997; Verheijen 2011). Additionally, the study is relatively small, and while there are up to nine waves of data from each respondent, the sample size may constrain certain analyses of subpopulations including men in the core sample.
Although TLT data are distinct in key ways, some users may want to compare TLT data to other high-quality studies of adolescents and young adults in Malawi, such as the Marriage Transitions in Malawi project (MTM) (Beegle and Poulin 2017) and the Malawi Schooling and Adolescent Study (MSAS), or to the Malawi Longitudinal Study of Families and Health (MLSFH) and the Malawi Demographic and Health Survey (MDHS), which cover a much wider age range.
Because of deductive disclosure concerns, all files are released under restricted-use access agreements. More detailed information about the TLT study, including survey instruments, documentation guides, a data key, a user forum, and links to open-access versions of many publications, is available on the TLT website: http://tsogololathanzi.uchicago.edu.