Validation of diagnostic and predictive biomarkers for hereditary angioedema via plasma N‐glycomics

Abstract

Background: Hereditary angioedema (HAE) is a rare disease with heterogeneous clinical symptoms. In clinical practice it is vitally important to predict whether an HAE patient will develop severe symptoms, but there are currently no predictive biomarkers for HAE stratification. Plasma N-glycomes are disease-specific and have great potential for the discovery of non-invasive biomarkers. In this study, we profiled the plasma N-glycome of HAE patients from two independent cohorts to identify candidate biomarkers.

Methods: Linkage-specific sialylation derivatization combined with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry detection and automated data processing was employed to analyze the plasma N-glycome of two independent type 1 HAE cohorts.

Results: HAE patients had abnormal glycan complexity, galactosylation, and α2,3- and α2,6-linked sialylation compared to healthy controls (HC). Classification models based on dysregulated glycan traits discriminated between HAE and HC with areas under the curve (AUCs) greater than 0.9. Some of the aberrant glycans showed a response to therapy. Moreover, we identified a series of glycan traits strongly associated with the occurrence of laryngeal or gastrointestinal angioedema or with the disease severity score. Predictive models based on these traits could be used to predict disease severity (AUC > 0.9). These results were replicated in an independent cohort.

Conclusions: We report the full plasma N-glycomic signature of HAE for the first time and identify potential biomarkers. These findings may play a critical role in predicting disease severity and guiding the treatment of HAE in clinical practice. Further protein-specific and prospective studies are needed to validate our findings.


KEYWORDS
biomarker, disease severity, galactosylation, plasma N-glycome, sialylation


| INTRODUCTION
Hereditary angioedema (HAE) is a rare autosomal dominant disease that is primarily characterized by unpredictable and potentially life-threatening subcutaneous or submucosal edema, which can affect the extremities, face, larynx, abdomen, or genitalia, with an incidence of 1/50,000 to 1/100,000. [1][2][3] Typically, HAE is caused by C1 inhibitor (C1-INH) deficiency (reduced level or abnormal function) due to mutations in the SERPING1 gene. 1,3 HAE attacks can be spontaneous or provoked by emotional fluctuation, menstruation, fever, and trauma. 1,2 HAE is currently diagnosed clinically by combining family history, clinical symptoms, and laboratory tests (primarily complement tests). 1 Because HAE is rare and sometimes presents with symptoms similar to those of other diseases, HAE patients are often misdiagnosed and treated incorrectly. For example, gastrointestinal angioedema episodes present clinical symptoms similar to those of acute intestinal obstruction, which often leads to unnecessary surgical intervention. 3 Delayed diagnosis and treatment can have fatal consequences for HAE patients. Laryngeal angioedema attacks obstruct the upper airways; ineffective treatments (antihistamines, adrenaline, and corticosteroids) are often misused in this setting, and suffocation may ensue. However, specific targeted therapies, such as bradykinin B2 receptor antagonists, C1 inhibitor concentrate, and kallikrein inhibitors, and sometimes tracheotomy, can prevent death in HAE patients. 4 The situation is complicated by the heterogeneity of HAE clinical manifestations, even among individuals from the same family. Management strategies differ greatly between HAE patients and may be adjusted based on the frequency and types of attacks experienced by each patient. However, there are currently no available methods to stratify HAE patients. At present, danazol is the primary treatment option for short- and long-term prophylaxis in China. However, danazol is associated with serious side effects, such as virilization, sequelae of obesity, and increased hepatic enzyme levels. Therefore, it is vital to develop tools for patient stratification to guide individualized disease management.
N-glycosylation is a common and functionally relevant posttranslational modification involving the synergistic action of multiple enzymes and transporters. [5][6][7] Glycosylation is not template-driven, but rather introduces variability in proteins that is independent of their corresponding DNA sequences. 7 N-glycans are highly diverse, and regulate multiple biological processes, from protein folding and stability to receptor-ligand interactions and immune responses. 8 Specific glycoforms depend on genetic, pathophysiological, and environmental factors. 9 Plasma or serum protein N-glycosylation in a given physiological state is stable, but can dramatically change in response to pathological conditions. Specific glycosylation features of the plasma N-glycome are related to various pathological conditions, such as Down syndrome, 10 rheumatoid arthritis, 11 inflammatory bowel disease, 12 and cancers. 13,14 Importantly, the plasma N-glycome is often disease-specific. The plasma N-glycome may represent a noninvasive biomarker, and provide a more thorough understanding of specific disease mechanisms. Currently, little is known regarding N-glycosylation in HAE.
In this study, we investigated the plasma protein N-glycomic features of untreated HAE patients, HAE patients after treatment, and healthy controls (HCs). We then validated our findings in an independent cohort. As the function of sialylation depends on the linkage type, we employed the linkage-specific sialic acid derivatization method developed by Reiding et al. 15 to distinguish between the α2,3- and α2,6-linked sialic acids on the non-reducing end of plasma glycans. We used matrix-assisted laser desorption/ionization time-of-flight (TOF) mass spectrometry (MALDI-TOF MS) to profile the N-glycome of each cohort. 15 Our primary objective was to reveal the plasma N-glycomic features that are HAE-specific, and to discover and validate putative glycan biomarkers for HAE diagnosis, stratification, and monitoring.

| Study design and participants
In the present study, the discovery cohort enrolled 21 drug-naïve

| Plasma N-glycome detection and MS data (pre-)processing
Plasma samples were enzymatically treated to release N-glycans according to a previously reported method. 15 Briefly, 10 μL of 2% SDS was added to 5 μL of plasma, and the mixture was incubated for 15 min at 65°C. The N-glycans were then released by the addition of 10 μL of release mixture (1 U PNGase F, 2% NP-40, and 2.5× PBS), followed by overnight incubation at 37°C. To derivatize sialic acids, α2,3-linked sialic acids were lactonized, and α2,6-linked sialic acids were ethyl-esterified, allowing mass-based distinction between the sialic-acid linkage variants. 15 Released N-glycans were then enriched by hydrophilic interaction liquid chromatography solid-phase extraction (HILIC-SPE) micro-tips using cotton thread as the stationary phase packed in the pipette tips and enriched glycans were eluted with Milli-Q water according to the method described previously. 13,14 HILIC-SPE cotton-tips allowed the removal of salts, most deglycosylated peptides, and detergents from glycoconjugate samples. In addition, subsequent MALDI-TOF MS glycan profiles were very repeatable with different tips. 15 The sialylated N-glycans were detected simultaneously with non-sialylated N-glycans using MALDI-TOF MS in positive-ion mode as previously described, with minor modification. 14 Briefly, 1 μL of purified glycans was mixed with 1 μL of matrix (5 mg/mL super-2,5-dihydroxybenzoic acid with 1 mM NaOH in 50% acetonitrile) on the target plate. Mass spectra were obtained using a rapifleXtreme MALDI-TOF MS (Bruker Daltonics). The instrument was equipped with a Smartbeam-3D laser and was controlled using flexControl 4.0 (Bruker Daltonics). The laser was fired 5000 times per spot in a random walking pattern at a frequency of 5000 Hz. The mass range was set to m/z 1000-5000. The instrument was calibrated using external calibrants (Bruker Peptide Calibration Standard II).
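The mass-based distinction between linkage variants can be illustrated with a short calculation. The sketch below is not taken from the study: it assumes N-acetylneuraminic acid (Neu5Ac) as the sialic acid and uses standard monoisotopic element masses, with lactonization modeled as loss of H2O and ethyl esterification as a net gain of C2H4. All names are illustrative.

```python
# Monoisotopic element masses in Da (standard values, not from the study).
H, C, N, O = 1.00782503207, 12.0, 14.0030740048, 15.9949146196

# Neu5Ac residue within a glycan: C11H17NO8 (assumed sialic acid species).
NEU5AC_RESIDUE = 11 * C + 17 * H + N + 8 * O

WATER = 2 * H + O       # lost on lactonization (alpha2,3-linked)
ETHYLENE = 2 * C + 4 * H  # net gain on ethyl esterification (alpha2,6-linked)


def derivatized_sialic_acid_mass(linkage: str) -> float:
    """Return the residue mass of Neu5Ac after linkage-specific derivatization."""
    if linkage == "a2,3":
        return NEU5AC_RESIDUE - WATER
    if linkage == "a2,6":
        return NEU5AC_RESIDUE + ETHYLENE
    raise ValueError(f"unknown linkage: {linkage!r}")


lactone = derivatized_sialic_acid_mass("a2,3")
ethyl_ester = derivatized_sialic_acid_mass("a2,6")
mass_gap = ethyl_ester - lactone  # mass difference per sialic acid
```

Under these assumptions each sialic acid contributes a linkage-dependent mass difference of roughly 46 Da, which is why isomeric glycans that differ only in sialic acid linkage appear as separate peaks in the spectrum.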
Raw MS data were baseline-subtracted and smoothed. The MS data were converted to .xy files and re-calibrated using selected glycan signals as calibrants (Table S1). 16 Thereafter, summed spectra were generated for each biological group (untreated HAE, treated HAE, and HC) and the quality control group (randomly distributed plasma standard). For each summed spectrum, mono-isotopic peaks with good signal-to-noise ratio (S/N; >3), good relative intensity (>0.1%), and good isotopic patterns were selected for further analysis, and 90 mono-isotopic peaks were assigned to N-glycan structures using Glycoworkbench 17 as well as previously confirmed glycan compositions. 15 Finally, an N-glycan composition list was generated for use in subsequent targeted data extraction. The peak intensities of the putative N-glycans were extracted as background-corrected peak areas for all samples using the N-glycan composition list and MassyTools. Further processing of the extracted data was done in Microsoft Excel. Glycan structures were curated by applying cut-offs for S/N (>9), mass accuracy (ppm error < |20|), isotopic pattern quality (QC score < 25%), and the minimum percentage (>50%) of presence in all spectra of HAE, HC, or quality control plasma samples. 14 After curation, 59/90 N-glycans (Table S2) (Table S3). The formulas used for the calculation of the derived glycan traits are given in Table S3, and calculations were performed in RStudio. In each trait name, the last letter denotes the feature being calculated, for example, sialylation (S), and the preceding letters denote the group of glycans within which it is calculated, for example, triantennary fucosylated species (A3F). Thus, A3FS denotes sialylation within triantennary fucosylated species. 14,15,18 A difference in a derived N-glycan trait indicates a glycosylation change that is shared by a series of structurally related N-glycans. 14
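The curation cut-offs can be sketched as a simple filter. This is a minimal illustration rather than the MassyTools or Excel procedure used in the study: it assumes the listed thresholds are retention criteria, that the QC score is expressed as a fraction, and that the presence requirement is evaluated within one sample group at a time. The names `PeakQC`, `passes_qc`, and `keep_glycan` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PeakQC:
    """Quality metrics for one glycan in one spectrum (illustrative fields)."""
    sn: float         # signal-to-noise ratio
    ppm_error: float  # signed mass error in ppm
    qc_score: float   # isotopic pattern quality as a fraction (lower is better)


def passes_qc(peak: PeakQC) -> bool:
    """Apply the per-spectrum cut-offs: S/N > 9, |ppm error| < 20, QC score < 25%."""
    return peak.sn > 9 and abs(peak.ppm_error) < 20 and peak.qc_score < 0.25


def keep_glycan(spectra: Sequence[PeakQC], min_presence: float = 0.5) -> bool:
    """Keep a glycan if it passes QC in more than min_presence of a group's spectra."""
    if not spectra:
        return False
    n_pass = sum(passes_qc(p) for p in spectra)
    return n_pass / len(spectra) > min_presence
```

In this reading, a glycan would be retained if `keep_glycan` is true for at least one of the HAE, HC, or quality control groups; the text leaves open exactly how the groups are combined.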

| Statistical analysis
Derived N-glycan traits were compared between two subgroups in each cohort (untreated HAE vs. HC, treated HAE vs. HC) using the Mann-Whitney-Wilcoxon test, because the data were not normally distributed. Multiple testing correction was performed to adjust the significance threshold (p = 0.05/82). Regression analysis was performed in RStudio. The diagnostic/predictive potential of the individual N-glycan traits was further evaluated using receiver operating characteristic (ROC) curves. Classification/prediction biomarker models were constructed using support vector machines (SVMs) based on the differentially expressed derived N-glycan traits or the glycan traits strongly associated with clinical symptoms. ROC curves were obtained by Monte Carlo cross validation (MCCV). In each MCCV, two-thirds of the samples were used to assess the importance of each glycan trait, and the remaining one-third was used to validate the biomarker models generated in the first step. The top-ranking glycan traits were subsequently used to construct predictive biomarker models. These steps were repeated, and the performance of each model was calculated and compared. The area under the ROC curve (AUC) and the predictive accuracy were used to assess the performance of the output models. We considered AUCs ≥ 0.9 to represent highly accurate tests, whereas 0.8 ≤ AUCs < 0.9 represented accurate tests, and 0.7 ≤ AUCs < 0.8 represented moderately accurate tests. ZHANG ET AL.
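The numerical pieces of this workflow can be sketched in a few lines. The code below is an illustration under stated assumptions, not the software used in the study: it reads the threshold 0.05/82 as a Bonferroni-type correction over the 82 derived traits, computes the AUC of a single trait directly from pairwise comparisons, and generates two-thirds/one-third MCCV splits. SVM training and feature ranking are left out, and all function names are hypothetical.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def corrected_threshold(alpha: float = 0.05, n_tests: int = 82) -> float:
    """Significance threshold after dividing alpha by the number of tests."""
    return alpha / n_tests


def auc_from_scores(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def grade_auc(auc: float) -> str:
    """Map an AUC to the accuracy tiers defined in the text."""
    if auc >= 0.9:
        return "highly accurate"
    if auc >= 0.8:
        return "accurate"
    if auc >= 0.7:
        return "moderately accurate"
    return "below the reported tiers"


def mccv_splits(n_samples: int, n_repeats: int, train_fraction: float = 2 / 3,
                seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield random (train, test) index splits; the seed is an arbitrary choice."""
    rng = random.Random(seed)
    n_train = round(n_samples * train_fraction)
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield sorted(idx[:n_train]), sorted(idx[n_train:])
```

In a full analysis, each training split would be used to rank traits and fit a model, and each held-out split to compute an AUC; the per-split AUCs would then be averaged and graded.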

| RESULTS
The clinical characteristics of the two cohorts are presented in Table 1.  (Table S4). The average RSD of the directly detected N-glycan traits (top 20) and all 82 derived traits was 3.93% and 2.32%, respectively (Table S4). Derived traits therefore appear to have better technical robustness than directly detected glycans. We found that 13 directly detected glycan traits differed significantly between HCs and untreated HAE patients (Figure 1B). As derived N-glycan traits combine the effects of individual glycans sharing similar structures, facilitate interpretation of biological effects, and have higher repeatability than the individual glycan traits from which they are calculated, 19 we subsequently focused on the derived N-glycan traits for comparison and analysis in the present study.
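For reference, the relative standard deviation (RSD) used here as a repeatability measure is the standard deviation of repeated measurements divided by their mean. A minimal sketch follows, assuming the sample standard deviation is used and that the values are repeated measurements of one trait in the quality control plasma standard; the function name is illustrative.

```python
from statistics import mean, stdev
from typing import Sequence


def rsd_percent(values: Sequence[float]) -> float:
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined for a mean of zero")
    return 100.0 * stdev(values) / m


def average_rsd(traits: Sequence[Sequence[float]]) -> float:
    """Average the per-trait RSDs across a set of traits."""
    return mean(rsd_percent(v) for v in traits)
```

The reported 3.93% and 2.32% would correspond to `average_rsd` evaluated over the top 20 directly detected glycans and over the 82 derived traits, respectively.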

| Identification of plasma N-glycomic features in HAE patients
Eight replicated derived N-glycan traits differed significantly between patients with drug-naïve type 1 HAE and HCs in the two independent cohorts (Figure 2; Table 2; Figure S1). The N-glycomes of HAE patients differed from those of HCs in the antennarity of complex-type glycans (CA). The level of tetra-antennary glycans within complex-type glycans (CA4) was lower in HAE patients than in HCs (Figure 2; Table 2; Figure S1). The levels of galactosylation of tetra-antennary glycans (A4G) and galactosylation of sialylated diantennary glycans (A2SG) were higher in HAE patients than in HCs (Figure 2; Table 2; Figure S1). Patterns of sialylation also differed between HAE patients and HCs. The level of sialylation of tetra-antennary glycans (A4S) was higher in subjects with HAE than in HCs, which was primarily caused by an increase in α2,3-linked sialylation of tetra-antennary glycans (A4L; Figure 2; Table 2; Figure S1).

| Performance of plasma glycan traits for diagnosing HAE
The top seven potential classification biomarker models were constructed using multivariate algorithms for SVMs, based on the eight replicated differentially expressed derived N-glycan traits in the two independent cohorts (Figure 3; Figure S2). For each model, two to eight derived N-glycan traits were selected via automated important feature identification (Figure 3D; Figure S2D). ROC curves were generated for each model. The performance of the seven models was highly accurate, with AUCs ranging from 0.927 to 0.931, and predictive accuracies from 87.5% to 90.0%, for discriminating between HAE patients and HCs in the discovery cohort (Figure S2A-S2C). The performance of the

| Plasma N-glycomic changes in HAE patients after treatment
Among the eight replicated abnormal plasma N-glycomic features identified in HAE patients, A4G, A4S, A4L, A4F0E, A4FE, and A4GE showed responses to treatment (Figure 2; Figure S1). These derived N-glycan traits differed significantly between the treated and untreated HAE patient groups and returned to near-normal levels after treatment (Figure 2; Figure S1). In contrast, the derived N-glycan traits CA4 and A2SG did not change after treatment (Figure 2; Figure S1). In addition, medication reduced the frequency of edema attacks in these patients (data not shown). To investigate whether these glycan traits could be used as biomarkers for disease monitoring/prognosis, we constructed predictive models using multivariate algorithms based on the derived glycan traits that showed a response to treatment. The top five models included two to six of the glycan traits and showed "accurate" performance, with AUCs ranging from 0.809 to 0.868 for predicting the response to treatment in the discovery cohort (Figure S5A). The results were validated in the validation cohort, with AUCs ranging from 0.831 to 0.852 (Figure S5B). These models may have applications as biomarkers in HAE disease monitoring and prognosis.

| Association between plasma N-glycomes and clinical features and parameters
The association between plasma N-glycomes and the occurrence of laryngeal angioedema, gastrointestinal angioedema, and disease severity score in HAE patients was explored by logistic regression. A severity scoring system based mainly on edema frequency and location was used to assess the disease severity of HAE patients, as previously described. 20 Specifically, the clinical severity score  (Table S5). α2,3-linked sialylation (A2F0L, A2F0GL, and A3L) was negatively associated with gastrointestinal angioedema occurrence (Table S5). Galactosylation of diantennary glycans (A2G) was positively associated with disease severity score (Table S5). Several glycan traits (A3F, A3LF, A3EF, A3F0S, and A4GE) were associated with plasma levels of C1-INH (Table S5). These associations were validated in the validation cohort (Table S5).

| Performance of plasma glycan traits in predicting the occurrence of laryngeal and gastrointestinal angioedema
The predictive efficacy of glycan traits with strong associations with laryngeal or gastrointestinal angioedema was evaluated by plotting ROC curves. In the discovery cohort, the AUC of A2G was 0.938 for distinguishing between HAE groups in which laryngeal angioedema had ever occurred versus not occurred (Figure 4A). The performance of A2G in predicting the occurrence of laryngeal angioedema was validated in the validation cohort with an AUC of 0.927 (Figure 4B), suggesting that A2G has good predictive ability for laryngeal angioedema occurrence. Based on the three derived glycan traits that showed strong associations with gastrointestinal angioedema, two potential predictive biomarker models were constructed (Figure S6). For each model, two to three derived N-glycan traits were selected through automated important feature identification (Figure S6). ROC curves were generated for each model. The

| DISCUSSION
Accurate, non-invasive biomarkers for the diagnosis, prediction of disease severity, and disease monitoring of HAE are currently unmet clinical needs. Recently, plasma/serum protein N-glycomes have been identified as biologically significant, and have emerged as a source of non-invasive biomarkers for various diseases. [10][11][12][13][14]18 The objective of this study was to reveal the disease-specific N-glycome phenotype of HAE to discover potential biomarkers for the diagnosis, prediction of disease severity, and monitoring of HAE. Here, we found that the levels of galactosylation of tetra-antennary glycans (A4G) and galactosylation of sialylated diantennary glycans (A2SG) were higher in HAE patients than in HCs. Higher galactosylation levels were previously linked with an increased risk of developing type 2 diabetes. 21 Moreover, the levels of tri- and tetra-galactosylated N-glycans were previously reported to increase with age in patients with Down syndrome, but not in HCs. 10 In humans, the glycan traits A2SG/A4G are primarily derived from alpha1-antitrypsin, alpha1B-glycoprotein, fibrinogen, haptoglobin, hemopexin, serotransferrin, and other acute-phase proteins produced in the liver. 18 Evidence for altered hepatic synthesis has been found in HAE patients. 22 Furthermore, some of these liver-derived proteins are related to the pathophysiology of HAE. Acute-phase proteins, such as fibrinogen, are risk factors for angioedema induced by bradykinin. 23 In addition, the production of the inflammatory peptide bradykinin by the contact system can be inhibited by alpha1-antitrypsin variants. 24 Galactosylation occurs via a set of beta-1,4-galactosyltransferases, the activity of which in plasma is associated with inflammatory diseases and aging. 10,25 Decreased IgG-derived galactosylation has been linked to inflammation, immune disorders, cancers, and aging. [26][27][28] We also observed decreased IgG galactosylation in HAE patient plasma (TA2FS0), although this decrease was not significant after multiple testing correction.
Though the role of increased non-IgG-derived galactosylation, which was linked to acute-phase liver proteins in the present study, has not yet been established, we assume that abnormal non-IgG-derived galactosylation is involved in the causal mechanisms of HAE, and thus warrants further investigation.
Sialylation directly participates in the activation of the immune system, which depends on sialic acid linkage types. 29,30 For example, siglecs on immune cells specifically recognize non-fucosylated α2,6-sialic acid epitopes. 29 Our novel MS-based approach enables the

FIGURE 4 Performance of glycan trait A2G in predicting the occurrence of laryngeal angioedema (A) in the discovery cohort and (B) in the validation cohort. The red dot represents the optimal cut-off value.
The decrease in α2,6-sialylation with a fucose (A4FE) may derive from immunoglobulins, whereas the increased glycan traits are primarily derived from a mixture of glycoproteins containing highly sialylated glycans produced by the liver (e.g., alpha1-acid glycoprotein and alpha1-antitrypsin), 18 as immunoglobulins primarily carry the α2,6-sialylation with core-fucosylation. 31,32 The observed increase in sialylation within non-fucosylated liver-derived tetraantennary glycans might reflect glycosylation and abundance changes in acute-phase proteins. Abnormal sialylation has been found previously in autoimmune diseases, multiple cancer types, and inflammation. 33,34 Furthermore, α2,6-sialylation of non-fucosylated multiple antennary glycans (A(3-4)F0GE) is associated with inflammatory markers in inflammatory bowel disease. 12 Although most glycan traits containing sialylation were altered during HAE, our data suggest that these changes in sialylation were partially driven by α2,6-sialyltransferases. ST6Gal1 is the main sialyltransferase, which attaches α2,6-linked sialic acids to N-glycans. 25  Medication control can reduce the frequency of edema attacks of these patients (data not shown). Considering both the glycan changes and reduction of attacks after treatment, we propose that the N-glycan traits which showed response to treatment may have potential as predictive or monitoring biomarkers. However, this needs further long-term follow-up and validation studies in large cohorts.
The present study has some limitations. First, isomers may exist for our assigned glycan structures. Second, MS-based N-glycome profiling provides relative quantification, and is influenced to some extent by plasma/serum levels of the related glycoproteins. Quantifying protein-specific glycosylation in combination with measuring the levels of glycoproteins will give in-depth insights into the mechanisms underpinning the N-glycome, but remains a challenge in this field. Third, because of the rarity of this disease, the untreated and treated HAE groups could not be paired, which could affect the resulting data. Despite these limitations, our research serves as a starting point for future validation and mechanistic studies.
In conclusion, using comprehensive N-glycomic profiling methods, we analyzed the plasma N-glycome for two independent well-characterized type 1 HAE cohorts, and for the first time reported the full plasma N-glycomic signature of HAE. We identified plasma N-glycosylation changes specific to HAE, and observed major disease-specific dysregulated glycosylation, namely branching of complex (CA), galactosylation, and sialylation. Novel associations between clinical symptoms/parameters of HAE and glycosylation were found. The combination of altered glycan traits was used to generate classification/prediction models that could be used for the diagnosis, prediction of disease severity, and monitoring of HAE. All the results were validated in an independent validation cohort. These results may play a critical role in predicting/assessing the disease severity of HAE in the future. Further studies are needed to improve our understanding of the role of glycosylation in the pathology of HAE.

ACKNOWLEDGMENTS
We thank all members of our group involved in this study. We thank the patients and volunteers who provided blood samples for the present study. This research was supported by grants from the Na- 2016YFC0901501, to Yuxiang Zhi).

CONFLICT OF INTEREST
The authors declare no conflict of interest.