The BIOMarkers in Atopic Dermatitis and Psoriasis (BIOMAP) glossary: developing a lingua franca to facilitate data harmonization and cross‐cohort analyses

DEAR EDITOR, BIOMarkers in Atopic dermatitis and Psoriasis (BIOMAP) is a large European consortium aiming to advance personalized medicine for atopic dermatitis and psoriasis by identifying biomarkers that predict therapeutic response and disease progression. BIOMAP brings together clinicians, researchers, patient organizations and pharmaceutical industry partners, and encompasses data from over 60 individual studies, including randomized clinical trials, population-based cohorts and deeply phenotyped disease registries. The curation and harmonization of data and biosamples from these established studies will facilitate cross-cohort clinical and molecular analyses, increasing the potential to identify small-effect estimates and to better stratify disease subtypes. This research letter serves to disseminate BIOMAP's pathway to data harmonization and will inform future collaborative research endeavours.
Pooling data from diverse studies presents inherent challenges. Each study has different methodologies, research objectives and outcomes. Data harmonization improves the comparability of existing studies by converting similar variables to a common format and creating 'harmonized datasets', which can be used for cross-cohort analyses. Figure 1 outlines how BIOMAP follows existing data harmonization guidelines, 1 ensuring that clinically appropriate and meaningful conclusions can be drawn.
BIOMAP's objectives were outlined in the project proposal (step 0). During protocol development, a list of variables pertinent to BIOMAP's key research questions was devised. These predefined 'BIOMAP categories' included clinical phenotypes, disease associations, environmental/lifestyle factors, treatments and outcome measures. Next, a detailed mapping exercise was performed to explore what data were available in a subset of the studies underpinning BIOMAP. This involved the custodians of individual study datasets assigning a BIOMAP category to each variable in their study's data dictionary. Annotated data dictionaries were assimilated into a clinical 'metadata catalogue' indexed according to the BIOMAP categories, generating a high-level overview of the clinical variables recorded in this sample of BIOMAP studies (step 1). The metadata catalogue identified similarities and discrepancies between studies, and formed the foundation of the BIOMAP glossary.
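As a rough illustration of step 1, the sketch below indexes annotated data dictionaries by category and reports which studies lack any variable in a given category. It is a minimal sketch, not BIOMAP's actual tooling: the study names, variable names and the functions `build_catalogue` and `studies_missing` are hypothetical, and only the category labels are taken from the text above.

```python
from collections import defaultdict

# Hypothetical annotated data dictionaries: study -> {variable name: BIOMAP category}.
# Study and variable names are invented for illustration only.
annotated_dictionaries = {
    "study_a": {
        "easi_total": "outcome measures",
        "smoker": "environmental/lifestyle factors",
    },
    "study_b": {
        "pasi": "outcome measures",
        "asthma_dx": "disease associations",
    },
}


def build_catalogue(dictionaries):
    """Index every study variable by its assigned category, then by study."""
    catalogue = defaultdict(lambda: defaultdict(list))
    for study, variables in dictionaries.items():
        for variable, category in variables.items():
            catalogue[category][study].append(variable)
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {category: dict(studies) for category, studies in catalogue.items()}


def studies_missing(catalogue, category, all_studies):
    """Return the studies that recorded no variable in the given category."""
    covered = set(catalogue.get(category, {}))
    return sorted(set(all_studies) - covered)


catalogue = build_catalogue(annotated_dictionaries)
missing = studies_missing(catalogue, "disease associations", annotated_dictionaries)
```

Indexing by category first makes similarities (several studies under one category) and discrepancies (a category covered by only some studies) visible with a single lookup.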
The BIOMAP glossary defines a list of core variables, using harmonized terminology and data format (step 2), and will be used to create harmonized datasets. The Glossary Development Team (11 members, representing five BIOMAP organizations) brought together clinical, bioinformatics, biostatistics and laboratory expertise to discuss the potential contents of the glossary. Discussions were informed by the metadata catalogue, literature reviews and existing harmonization initiatives, including the TREatment of ATopic eczema (TREAT) Registry Taskforce, 2 Harmonising Outcome Measures for Eczema 3 and the International Psoriasis Council. 4 A BIOMAP webinar introduced data harmonization to the wider BIOMAP consortium, illustrating the fundamental role the glossary would play in downstream BIOMAP analyses. Following the webinar, glossary stakeholders were identified (n = 67, including work-package leaders, dataset custodians, clinicians and analysts from 28 BIOMAP organizations).
A draft glossary was circulated to the glossary stakeholders, who refined and approved the finalized glossary through a series of three interactive Zoom meetings. Following group discussion, any amendments to the proposed glossary were approved or rejected through anonymous polling, using in-built Zoom functionality (30 polls). The outcome of voting was accepted with a simple majority (median agreement 100%; range 57-100%) and the BIOMAP glossary version 1.0 was finalized.

Figure 1 The pathway to data harmonization of BIOMarkers in Atopic dermatitis and Psoriasis (BIOMAP) studies. (Left) Proposed steps for retrospective data harmonization (adapted from the Maelstrom guidelines). 1 (Right) Implementation of these steps for data harmonization in BIOMAP. Overlapping boxes represent steps running concurrently. Following finalization of the BIOMAP glossary (step 2), harmonization of individual study datasets started in a pragmatic and prioritized manner, based on the availability of data and proposed cross-cohort analyses. Quality assurance (step 4) is integrated with step 3 in our harmonization pipeline, expediting the availability of harmonized datasets for cross-cohort analyses.
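The polling summary reported for the glossary meetings (simple-majority decisions, median agreement and range) could be computed along the following lines. This is a sketch under stated assumptions: the vote counts are invented, `summarize_polls` is a hypothetical helper, and 'agreement' is assumed here to mean the percentage of votes cast for the winning option, which the letter does not define.

```python
from statistics import median


def summarize_polls(polls):
    """Summarize (yes, no) vote counts: per-poll decision plus agreement statistics.

    Assumption: agreement = percentage of votes cast for the winning option.
    """
    decisions = []
    agreement = []
    for yes, no in polls:
        total = yes + no
        decisions.append(yes > no)  # simple majority approves the amendment
        agreement.append(100 * max(yes, no) / total)
    return decisions, median(agreement), (min(agreement), max(agreement))


# Invented vote counts for three polls; not BIOMAP's actual voting data.
example_polls = [(20, 0), (12, 9), (18, 0)]
decisions, median_agreement, agreement_range = summarize_polls(example_polls)
```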
Primary datasets are being transformed to conform to the content and structure of the BIOMAP glossary, creating harmonized datasets (step 3). Iterative discussions between each dataset custodian and the harmonization bioinformaticians culminate in a dataset-specific mapping document specifying how individual variables will be transformed to the glossary-defined dataset, thus ensuring accurately harmonized data (step 4). Harmonized datasets are made available on a secure, centralized and access-controlled data platform (step 5). Harmonized clinical datasets complement a carefully curated bioresource of archived and newly obtained biospecimens, which will be used for multiomic profiling of skin and blood.
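To make steps 3 and 4 concrete, the sketch below applies a dataset-specific mapping to one study record and rejects any coded value the mapping does not cover, so that a basic quality check runs as part of the transformation. It is an illustrative sketch only: the variable names, codes, target names and the `harmonize_record` function are hypothetical and are not taken from the BIOMAP glossary or pipeline.

```python
# Hypothetical mapping document for one study: source variable -> transformation rule.
# All names and codes below are invented for illustration.
MAPPING = {
    "sex": {"target": "sex", "recode": {"1": "male", "2": "female"}},
    "height_cm": {"target": "height_m", "convert": lambda v: round(float(v) / 100, 2)},
}


def harmonize_record(record, mapping):
    """Transform one source record into the target format defined by the mapping."""
    harmonized = {}
    for source_name, rule in mapping.items():
        if source_name not in record:
            continue  # variable not collected for this participant
        value = record[source_name]
        if "recode" in rule:
            # Quality check: refuse codes that the mapping document does not define.
            if value not in rule["recode"]:
                raise ValueError(f"Unmapped value {value!r} for {source_name!r}")
            value = rule["recode"][value]
        elif "convert" in rule:
            value = rule["convert"](value)
        harmonized[rule["target"]] = value
    return harmonized


raw_record = {"sex": "2", "height_cm": "172"}
harmonized_record = harmonize_record(raw_record, MAPPING)
```

Failing loudly on unmapped codes, rather than passing them through, is one simple way to surface discrepancies for discussion with the dataset custodian.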
The structure of the BIOMAP glossary was inspired by the internationally recognized Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). 5 The OMOP CDM adopts existing vocabularies, such as SNOMED Clinical Terms, 6 and was developed to implement standardized analytical approaches on large observational datasets. During glossary development, deviations from the OMOP CDM were made where existing variables were not represented in the OMOP-defined terminology or where dermatological research required additional granularity (e.g. detailed information regarding phototherapy). The OMOP CDM tabular structure was adjusted to match BIOMAP analysts' requirements. Full compatibility with the OMOP CDM is a priority for further development of the glossary.
The publicly available BIOMAP glossary may benefit investigators beyond the BIOMAP consortium, who could prospectively align future studies with the glossary's clinical variables, thus facilitating comparative analyses. 7 Published dermatological research using OMOP approaches is currently limited. 8 Cooperation between BIOMAP and OMOP, leading to the incorporation of BIOMAP customizations into the OMOP CDM, is an appealing prospect. Such collaboration could further enhance the potential for dermatological research using large observational datasets.

Supporting Information
Additional Supporting Information may be found in the online version of this article at the publisher's website: Appendix S1. Acknowledgements. Appendix S2. Full list of author affiliations. Appendix S3. Author conflicts of interest.

The aim of this study was to establish whether fibrosing alopecia in a pattern distribution (FAPD) can be differentiated from frontal fibrosing alopecia (FFA) by histopathological analysis. We conducted a cross-sectional analysis of biopsies from 43 women, all with a previous classical diagnosis of FAPD or FFA. [1][2][3][4] All samples were sectioned horizontally and stained with haematoxylin and eosin. Anisotrichia (hair fibre diversity) was present only in patients with FAPD. Twenty-one histopathological markers were compared between the two groups. Nonparametrically distributed data were analysed using the Mann-Whitney U-test, and Pearson's χ²-test was used to measure association. The research complied with Good Clinical Practice guidelines and was approved by an institutional review board.
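For readers unfamiliar with the association test named above, the following sketch computes Pearson's chi-square statistic and its P-value for a 2 x 2 table (for example, a histopathological feature present or absent in two diagnostic groups). It is a minimal sketch, not the authors' analysis: the counts are invented and are not the study's data, `chi_square_2x2` is a hypothetical helper, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson's chi-square test for a 2 x 2 table, without continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts (rows: group 1, group 2; columns: feature present, absent).
# These are NOT the counts from the study described in the text.
illustrative_counts = [[12, 8], [5, 15]]
chi2, p_value = chi_square_2x2(illustrative_counts)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

With small expected cell counts, an exact test is usually preferred over the chi-square approximation.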
Twenty-six cases of FAPD and 17 of FFA were selected for the study. Nine of 21 (43%) parameters were statistically different between the groups. In FAPD we found an increased average quantity of vellus hair in each sample, increased terminal follicles in the catagen or telogen phase, and a lower terminal-to-vellus (T:V) ratio. In FFA, there were higher follicular scar counts, more vacuolar degeneration of the follicle epithelium, more frequent perifollicular clefts at the infundibulum and isthmus, and fewer arrector pili muscles (Table 1).
In our study, FAPD showed a statistically higher percentage of vellus hairs and terminal follicles in the catagen or telogen phase, as well as lower T:V hair ratio. These features are all known to be present in androgenic alopecia (AGA) and were not found to be relevant in the analysed FFA specimens. The clinical presence of diffuse hair thinning and dermoscopic features of anisotrichia are the most distinguishing signs of FAPD. FAPD shares its clinical presentation and pattern loss with AGA, but also has many clinical, dermoscopic and histopathological findings that overlap with FFA. 1, 3 The first description of FAPD, by Zinkernagel and Trüeb, noted an increase in telogen count as observed in patients with AGA. Zinkernagel and Trüeb found hair follicle miniaturization in 10 of 14 FAPD scalp biopsies. 4 In a study of patients with FAPD, Teixeira et al. demonstrated histopathological findings of AGA in 16 of 16 biopsies and inflammation of vellus hair follicles in 10 of 16 scalp samples. 5 Starace et al., 6 Chiu and Lin, 7 and Griggs et al. 1 used the higher presence of vellus hairs in FAPD as a histological criterion to separate FAPD from differential diagnoses of FFA and other lichenoid alopecias.
Although neither the intensity nor the location of the inflammatory infiltrate was significantly associated with either diagnosis, FFA showed a significantly greater presence of inflammatory infiltrate associated with vacuolar degeneration of the follicular epithelium (FFA 63% vs. FAPD 30%, P = 0.034). We observed a higher frequency of perifollicular clefts in FFA (FFA 63% vs. FAPD 26%, P = 0.018), a greater number of follicular scars in FFA than in FAPD (mean score 9.37 vs. 4.77, P = 0.004) and a reduction in arrector pili muscles in FFA (FFA 63% vs. FAPD 26%, P = 0.018), demonstrating that FFA causes greater structural disruption than FAPD. 8

In conclusion, follicle classification and counts are a relatively simple way to differentiate FAPD from FFA. FAPD presents with findings reminiscent of AGA at sites of disease activity, with an increase in the vellus follicle number, a reduction in the T:V ratio, and an increase in the terminal follicles that are in the catagen or telogen phases. FFA shows features of more structural disruption than FAPD. This observation is consistent with the fact that FFA rapidly progresses to a cicatricial condition while FAPD follows a more indolent evolution over a comparable duration.