Disease‐specific plasma protein profiles in patients with fever after traveling to tropical areas

Fever is common among individuals seeking healthcare after traveling to tropical regions. Despite the association with potentially severe disease, the etiology is often not determined. Plasma protein patterns can be informative for understanding the host response to infection and can potentially indicate the pathogen causing the disease. In this study, we measured 49 proteins in the plasma of 124 patients with fever after travel to tropical or subtropical regions. The patients had confirmed diagnoses of either malaria, dengue fever, influenza, bacterial respiratory tract infection, or bacterial gastroenteritis, representing the most common etiologies. We used multivariate and machine learning methods to identify combinations of proteins that contributed to distinguishing infected patients from healthy controls, and the infections from each other. Malaria displayed the most distinct protein signature, indicating a strong immunoregulatory response with high levels of IL10, sTNFRI and II, and sCD25 but low levels of sCD40L. In contrast, bacterial gastroenteritis had high levels of sCD40L, APRIL, and IFN-γ, while dengue was the only infection with elevated IFN-α2. These results suggest that characterization of the inflammatory profile of individuals with fever can help to identify disease-specific host responses, which in turn can be used to guide future research on diagnostic strategies and therapeutic interventions.


Introduction
Fever is a common symptom of individuals seeking healthcare after returning from travel to tropical areas [1]. In a previous study in Sweden, most of these cases were shown to be due to gastroenteritis, malaria, influenza, and dengue virus, but >40% remained with unknown etiology even after healthcare contact [2]. Upon further serological analysis of those with unknown etiology, approximately 9% were found to have a likely influenza virus infection, 4% dengue virus infection, and another 4% had Rickettsial infection [2]. This highlights the difficulties in travel medicine, as many pathogens need to be considered and patients with various diagnoses often present with overlapping clinical symptoms [3][4][5][6]. Moreover, clinical markers such as C-reactive protein (CRP) and white blood cell total and differential counts often provide limited guidance. Pathogens that can give rise to severe disease are important to detect at an early stage, while at the same time, unnecessary antibiotic usage should be avoided [7]. Learning more about how immune responses differ between infections could potentially be informative in developing biomarker signatures for disease identification and for a better understanding of host-pathogen interaction and protective immunity.
During infection, a wide range of inflammatory proteins are up- or downregulated in the host's response to the pathogen. Depending on the infecting pathogen, different types of immune responses are important for efficient control [8]. These inflammatory responses tend to be short-lived and self-regulatory as the pathogen is controlled. However, in some instances, the inflammatory response becomes dysregulated, leading to severe disease manifestations [9][10][11]. Measuring inflammatory proteins in the blood can therefore help to understand the host inflammatory response to infection and predict disease severity [12][13][14][15][16][17] or disease etiology [18]. Different infections may induce overlapping inflammatory protein profiles, and studying a single infection in isolation makes it difficult to understand the pathogen specificity of the responses and limits the possibility to discern how patterns differ between infections. Recently developed experimental and bioinformatics methods allow for comprehensive analysis of cytokine profiles, useful for mapping immunological events as well as for identification of clinically relevant markers of disease and its severity [18][19][20][21][22][23][24].
This exploratory study provides a comprehensive mapping of inflammation-associated proteins in the plasma of individuals with a known etiology of infection who presented with fever after travel to tropical or subtropical areas. By comparing the inflammatory response between the different disease groups, we could identify patterns associated with the respective etiologies. These patterns provide insights regarding the pathogen specificity of the host inflammatory response to infection and could potentially, together with current clinical diagnostic variables, be further explored as biomarkers for disease stratification.

Study inclusion
Patients with a history of travel were invited to participate when seeking care at the Emergency Department at Karolinska University Hospital, Stockholm, Sweden. Inclusion criteria were (1) travel within the past 2 months to a tropical (defined as between latitude 0° and ±23.5°) or subtropical area (defined as between latitude ±23.5° and ±40°), (2) age ≥18 years, and (3) documented body temperature >38°C at the hospital or self-reported fever within the previous 2 days. Blood samples for serum and EDTA plasma isolation were collected from all individuals and aliquots were frozen at -80°C for later analysis. Demographic and clinical data, including microbiological diagnosis after routine clinical investigation, were extracted from medical records and a questionnaire filled in by patients. Five groups of diagnoses were selected for further study: (1) malaria caused by Plasmodium falciparum as defined by microscopy and qPCR, (2) dengue diagnosed by a positive result in qPCR for dengue virus, (3) gastroenteritis with fecal culture positive for enteropathogenic bacteria, (4) influenza diagnosed by positive qPCR for influenza virus in the nasopharyngeal swab, and (5) bacterial respiratory tract infections defined as nasopharyngeal culture positive together with chest X-ray with infiltrate, or sputum culture positive for a respiratory pathogen together with signs or symptoms indicative of bronchitis or pneumonia. In addition, 13 healthy adult individuals without a current history of travel were sampled as controls. The study was approved by the Swedish Ethical Review Authority (2016/2512-31/2 with amendment 2021-04087). Study participants were provided with written and oral information and written consent was obtained.
In addition to the study inclusion for all tropical fevers, patients with symptomatic malaria had since 2011 been included in a prospective malaria immunology cohort previously investigated for longitudinal parasite persistence [25], cellular aging dynamics [26], parasite-specific antibodies [27,28], B-cell responses [29], and immunoregulatory cellular and antibody interplay [19]. These individuals were enrolled at the Karolinska University Hospital after confirmed microscopy for P. falciparum and providing informed consent. The study was approved by the Stockholm regional ethical committee (2006/893-31/4 with amendments 2018/2354-32 and 2019-03436). The collection and freezing of samples from both cohorts used overlapping standard operating procedures, minimizing cohort-dependent differences in the data.

Multiplex plasma protein measurements
EDTA plasma aliquots were thawed at room temperature (RT) and cytokines and chemokines were measured using the LEGENDplex predefined 13-plex panels according to the manufacturer's instructions with some modifications. The panels used were cytokine panel 2, inflammation panel 2, proinflammatory chemokine panel, and the B-cell panel (all from Biolegend), measuring a total of 52 analytes, of which 49 unique proteins were retained for analysis (Supporting information Table S1).
Briefly, 12.5 μL plasma was mixed with 12.5 μL multiplex beads, diluted to 75 μL, and incubated for 2 h at RT in a plate shaker (700 rpm). The beads were then washed and incubated with 12.5 μL biotin-conjugated marker-specific antibodies diluted to 25 μL with PBS for 45 min at RT in a plate shaker (700 rpm). Then, 12.5 μL PE-conjugated streptavidin was diluted with PBS to 25 μL, added to each well without washing, and incubated for 30 min at RT in a plate shaker (700 rpm). The beads were then washed twice, and the fluorescent signal was measured on a BD 4-laser Fortessa using the 488 nm laser (forward scatter vs side scatter) to separate beads based on size and granularity and the 640 nm laser (780/60 filter) to separate beads based on cytokine specificity. The 561 nm laser (582/16 filter) was then used to detect the amount of marker signal, translating to the amount of cytokine. The median fluorescent signals were then exported, and the concentration of each marker was determined for each sample by interpolation from sigmoidal dose-response curves established for each cytokine from the standards included with the kits. All concentrations below the lower limit of detection (LOD) or above the upper LOD of the assays (as determined by the included standard curve) were assigned a concentration corresponding to the lower and upper LODs, respectively.
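The interpolation and LOD handling described above can be sketched as follows. This is a minimal Python illustration, not the authors' code: it assumes the sigmoidal standard curve is a four-parameter logistic (4PL), which the text does not state, and the function names and all parameter values are hypothetical.

```python
def four_pl(conc, bottom, top, ec50, hill):
    """Assumed 4PL standard curve: fluorescence signal for a concentration > 0."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)


def interpolate_concentration(mfi, bottom, top, ec50, hill, lower_lod, upper_lod):
    """Invert the 4PL curve, assigning out-of-range values to the LODs."""
    if mfi <= bottom:
        return lower_lod  # signal at or below the curve floor
    if mfi >= top:
        return upper_lod  # signal at or above the curve plateau
    conc = ec50 * ((mfi - bottom) / (top - mfi)) ** (1.0 / hill)
    # Values outside the quantifiable range are set to the nearest LOD.
    return min(max(conc, lower_lod), upper_lod)


# Hypothetical curve parameters and LODs, for illustration only.
curve = dict(bottom=50.0, top=20000.0, ec50=500.0, hill=1.2)
sample_conc = interpolate_concentration(
    10025.0, lower_lod=2.0, upper_lod=10000.0, **curve
)
```

Clamping to the LODs keeps every sample in the dataset, at the cost of compressing variation at the extremes of the assay range.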

Differential abundance analysis
Data management and statistical analysis were performed using R version 4.3.1. Data visualization was performed using ggplot2 [30]. All data on cytokine levels were log-transformed prior to analysis. Spearman's rank correlation was used to construct correlation networks of cytokine levels within each disease group. For the graphical display of correlation networks, only correlations with absolute values of Spearman's rho > 0.7 were included. Mann-Whitney U tests were applied to evaluate differences in the median cytokine levels between the different types of infections or healthy controls. All p-values were adjusted for false discovery rate (FDR) [31]. FDR-adjusted p-values < 0.05 were considered significant.
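As an illustration of the FDR adjustment step, the sketch below implements the Benjamini-Hochberg procedure in plain Python. The text cites reference [31] without naming the procedure, so Benjamini-Hochberg is an assumption here, and the function name and example p-values are hypothetical. The authors' analysis was run in R.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four protein comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
significant = [q < 0.05 for q in fdr]
```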

Principal component analysis
Principal component analysis (PCA) was used to reduce the dimensionality of the multidimensional cytokine data (49 proteins) in order to examine the presence of disease-specific patterns in the overall plasma protein data. Prior to PCA, all log-transformed plasma protein levels were centered by subtracting the median and scaled by dividing by the median absolute deviation.
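The median/MAD standardization applied before PCA can be sketched for a single protein as below. This is an illustrative Python version under stated assumptions: the base of the logarithm (log10 here) and the use of an unscaled MAD are not specified in the text, and the function name and input values are hypothetical.

```python
import math
from statistics import median


def robust_scale(concentrations):
    """Log-transform, center on the median, and scale by the MAD."""
    logged = [math.log10(c) for c in concentrations]  # assumed base 10
    center = median(logged)
    mad = median(abs(x - center) for x in logged)  # unscaled MAD (assumption)
    if mad == 0:
        raise ValueError("MAD is zero; the protein shows no variation")
    return [(x - center) / mad for x in logged]


# Hypothetical concentrations (pg/mL) of one protein in five samples.
scaled = robust_scale([1.0, 10.0, 100.0, 1000.0, 10000.0])
```

Median and MAD are less sensitive than mean and standard deviation to the extreme values that arise when many samples sit at an assay's limits of detection.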

Feature selection, random forest classification, and ROC analysis
The Boruta feature selection algorithm [32], based on binary classification, was applied to identify only those plasma proteins that contribute significant information for distinguishing each type of infection from the others and from healthy controls. For each infection type, random forest classifiers were then fitted to all the down-selected cytokines identified by the Boruta algorithm, and the predictive performance was evaluated using receiver operating characteristic (ROC) analysis with fivefold cross-validation. The data were divided into five parts and, in each run, four parts were used for model training and the remaining one was used for testing. To further reduce the risk of overfitting the classification algorithms, a repeated k-fold cross-validation with 10 repeats and 10 folds was applied for parameter tuning within the model training process. Finally, the ROC plots obtained from the five runs of the model were aggregated into one ROC plot to show the average performance of the model. An imbalanced dataset can result in biased learning algorithms due to differences in the number of individuals in each class. This bias can lead to overly optimistic ROC results that may not be realistic. To address this issue, we also used precision-recall curves to provide a more accurate representation of each model's performance.
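The fivefold splitting and AUC evaluation described above can be sketched in plain Python. This is an illustration only, not the authors' R pipeline: the stratified fold assignment is an assumption (the text does not say whether folds were stratified), the function names are hypothetical, and the classifier itself is omitted.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds with similar class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal the shuffled members of this class out across the folds.
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def roc_auc(labels, scores):
    """AUC as the chance that a random case outscores a random control."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a case and a control count as half a win.
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


# Example: 17 cases (label 1) among 124 samples, as for dengue in the text.
labels = [1] * 17 + [0] * 107
folds = stratified_folds(labels)
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fitted on train_idx and scored on test_idx here.
```

Pooling the held-out scores from all five folds and passing them to `roc_auc` gives one way to obtain a single aggregated estimate; averaging per-fold curves is another.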

Clinical characteristics
In total, 124 patients with self-reported fever seeking healthcare at the Karolinska University Hospital after travel to tropical or subtropical areas were included in the study. A total of 17 patients had dengue fever, 26 had gastroenteritis, 26 had influenza, 49 had malaria, and 6 had bacterial respiratory tract infections. In addition, 14 volunteers were included as healthy controls (Table 1). Patients with P. falciparum malaria were admitted to the hospital per routine (93.6% of patients) for treatment and observation. Hospital admission was lower in the other groups, with dengue (35.3%), gastroenteritis (38.5%), influenza (19.2%), and respiratory tract infection (66.7%). Among patients with nasal swabs positive for influenza, 12 patients had influenza A and 14 had influenza B by qPCR. Fecal cultures in the group with clinical gastroenteritis revealed growth of either Salmonella spp. (n = 12) or Campylobacter spp. (n = 14).
There was no significant difference in axillary temperature between the infections. They did, however, differ significantly in leukocyte and platelet counts and levels of CRP (Table 1; Supporting information Fig. S1). Patients with dengue and malaria had significantly lower leukocyte counts (WBC) compared with bacterial gastroenteritis, while dengue also had significantly lower counts compared with influenza and bacterial respiratory tract infection. Platelet counts (PLT) were significantly lower in the malaria group compared to all other infections. CRP was significantly elevated in all infections except dengue fever, where it remained close to the reference values. There was no significant difference in CRP values between bacterial gastroenteritis, influenza, and bacterial respiratory tract infection, while malaria was significantly higher than influenza (Supporting information Fig. S1).

Different types of infection display unique profiles of inflammatory markers
We measured the levels of 52 proteins associated with inflammation in the plasma from the study participants and healthy controls using four separate 13-plex suspension bead assays. Two proteins, IL12p70 and sCD40L, overlapped between assays and showed between-assay correlations of Pearson r = 0.66 and r = 0.82, respectively (both p < 0.0001). One protein (PAI-1) was not quantifiable in all donors and, since we could not determine the cause and did not want to introduce potential bias, the PAI-1 data were excluded, leading to a dataset including 49 unique proteins (Fig. 1). There was substantial variation in the levels of the proteins among individual donors (Fig. 1A); however, several disease-specific patterns could be observed (Fig. 1B). For many of the measured proteins, levels were highly positively correlated across infections (Supporting information Fig. S2) as well as within specific infections (Supporting information Fig. S3).
We also compared correlations between the measured proteins and several clinical variables, including age, temperature, CRP, WBC, and PLT, in the disease groups (Supporting information Fig. S4). Age and WBC were not strongly correlated with any of the proteins, whereas body temperature was mainly correlated with IL6, consistent with its pyrogenic properties. Platelet counts were strongly correlated with several proteins, including RANTES, ENA78, TARC, CD40L, and APRIL (positive correlation) and IL18, IL10, sCD25, and sTNFRI (negative correlation; Supporting information Fig. S4).
We next compared the levels of the 49 inflammation-associated proteins between each group and healthy controls using Mann-Whitney U tests, corrected for multiple testing, to assess if different disease etiologies were associated with up- or downregulation of specific proteins (Fig. 2A). Out of the 49 proteins, 29 were significantly up- or downregulated between the different infections or healthy controls (Fig. 2A). Compared with healthy controls, dengue patients had a significant increase in 14 out of the 49 measured proteins (logFC > 1 and FDR p-value < 0.05), whereas bacterial gastroenteritis led to a significant increase in 18 proteins, and bacterial respiratory tract infection in three proteins. Influenza had a significant increase in 12 proteins and a reduction in one protein, whereas malaria led to significantly increased levels of 17 proteins and reduced levels of three proteins. A set of proteins, including BAFF, IL6, IP10 (CXCL10), ITAC (CXCL11), MCP1 (CCL2), MIG (CXCL9), MIP1β, PTX3, sCD25, and sST2, were increased in most infections, potentially indicating general markers of febrile illness. There were, however, considerable differences in levels between the groups for some of the markers, such as IP10, where all individuals with dengue had levels above the limit of detection for the assay, and sCD25, which was especially high in malaria (Fig. 2B).
Some proteins were primarily associated with specific pathogens, such as IFNγ and APRIL, which were significantly increased in bacterial gastroenteritis, and IFNα2, which was upregulated in dengue compared with bacterial gastroenteritis, malaria, and healthy controls. IL10 levels were significantly elevated in influenza, dengue, and malaria, at progressively higher levels in that order, while changes to IL18, CD40L, sTNFRI, and TARC were largely specific for malaria. GMCSF levels were slightly lower in influenza, and only in comparison with healthy controls (Fig. 2A). In total, 20 out of the 49 proteins did not differ significantly in any comparison between the different diseases and healthy controls (Supporting information Fig. S5).

Immune response and disease-associated variation in protein profiles
The inflammatory response to infection is complex and is likely affected by the type of pathogen, its virulence, and the anatomical location of the infection. To assess the overall complexity of the response and further examine if different diseases were associated with different patterns, we visualized the protein data using PCA including either all 49 proteins or only the 29 that were significantly different in the previous analysis (Fig. 3). Both analyses indicated large variability in donor responses to infection, while all but one healthy control were relatively well clustered away from infected individuals. Those with infection mostly overlapped based on PC1 and PC2, except for individuals with malaria, where approximately half the donors were separated from the other infections. However, due to the large variability among the measured proteins and between donors, further selection of proteins contributing to disease stratification is required.

Selection of proteins indicating different etiologies
To assess if we could identify a protein signature associated with each type of infection, we used the Boruta feature selection algorithm to identify proteins that contributed significant information for the accurate classification of each disease (Fig. 4). Since there were only six bacterial respiratory tract infections in the dataset, we did not try to identify a specific signature for this group, although the samples were retained in the dataset when classifying the other groups. For the four remaining diseases, the algorithm identified different numbers of proteins that provided significant information for classification, indicated by the green color (Fig. 4). We identified 19, 17, 15, and 21 proteins for dengue, bacterial gastroenteritis, influenza, and malaria, respectively (Fig. 4). Several proteins were selected as important for the classification of more than one pathogen, such as IL10, MIG, sCD25, and sTNFRI, which were selected for all four diseases (Supporting information Fig. S6).

Performance analysis of the classification of different disease etiologies
Following the Boruta feature selection algorithm, binary random forest classifiers were fitted separately to the data for the proteins selected for each infection (from all individuals) in order to evaluate whether they were informative in identifying individuals with a specific infection type. The best cross-validated classifier performance, as determined by the aggregated classifier area under the curve (AUC), was observed for malaria, followed by bacterial gastroenteritis, influenza, and then dengue (Fig. 5A). In malaria, we observed an aggregated cross-validated AUC of 0.97 for a combination of 21 proteins, whereas for bacterial gastroenteritis, an aggregated AUC of 0.94 was seen for a set of 15 proteins. For influenza and dengue, aggregated AUCs of 0.91 and 0.90 were observed from combinations of 17 and 19 proteins, respectively. Overall, all classifiers showed good performance (sensitivity > 0.87 and specificity > 0.72) for detecting each particular infection type using the selected signature.
In our binary classification problem, the goal was to distinguish a specific infection (i.e. the case class) from the other infections and healthy controls (i.e. the control class). In each case, the number of individuals in the control class was greater than in the case class, resulting in an imbalanced dataset. To address this issue and report a more accurate model performance, we calculated precision-recall curves (Fig. 5B). In malaria, the dataset imbalance was very low, resulting in a low bias and a good model performance in terms of both AUC and area under the precision-recall curve (AUPR) (with a sensitivity and precision of 0.9). However, for bacterial gastroenteritis and influenza, with a moderate dataset imbalance, the highest sensitivity and precision were 0.8 and 0.7, respectively. For dengue, we observed the lowest sensitivity and precision, of approximately 0.6, mainly due to a larger class imbalance in the dataset (Fig. 5B).
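The effect of class imbalance on precision can be made concrete with a small worked example. The Python sketch below is illustrative only: the sensitivity and specificity of 0.9 are hypothetical values chosen for the comparison, not results from the study, and the function name is an assumption. The group sizes (49 malaria and 17 dengue cases among 124 patients) are taken from the text and exclude healthy controls.

```python
def expected_precision(n_cases, n_total, sensitivity, specificity):
    """Precision implied by a fixed sensitivity and specificity."""
    n_controls = n_total - n_cases
    true_pos = sensitivity * n_cases
    false_pos = (1.0 - specificity) * n_controls
    return true_pos / (true_pos + false_pos)


# Same hypothetical classifier quality, different class balance.
malaria = expected_precision(49, 124, sensitivity=0.9, specificity=0.9)
dengue = expected_precision(17, 124, sensitivity=0.9, specificity=0.9)
# malaria is roughly 0.85 and dengue roughly 0.59
```

With identical sensitivity and specificity, the rarer class yields lower precision because false positives are drawn from a larger control pool. ROC curves do not reflect this, which is why precision-recall curves are the more conservative summary for the smaller disease groups.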
To further examine to what extent the immune signatures identified above, and the performance of the random forest classifiers, were influenced by differences in protein levels between a given infection and healthy controls, we repeated the feature selection and classification analysis after first excluding the data for healthy controls. The disease-specific immune signatures, as well as the order of importance of different proteins identified in the absence of data for healthy controls, differed slightly compared with the original analysis (Supporting information Fig. S7A). However, the classification performance was largely comparable except for dengue, where it was reduced, going from an AUROC of 0.90 to 0.87 and a precision-recall from 0.72 to 0.52 (Supporting information Fig. S7B).
We next added several clinical parameters to the feature selection, including age, temperature, CRP, WBC, and PLT, to assess if a combination of these clinical parameters could synergize with the proteins measured here to improve disease stratification. CRP, WBC, and PLT were all selected as important features, although to varying extents for the different diseases. WBC and CRP were selected for dengue, WBC and PLT for bacterial gastroenteritis, PLT for influenza, and PLT and CRP for malaria. Age and temperature did not contribute significantly to any disease (Supporting information Fig. S8A). For dengue, the inclusion of WBC and CRP led to clearly improved classification, with the AUROC increasing from 0.87 to 0.96 and the precision-recall from 0.52 to 0.88. There was only a minor improvement in the classification of bacterial gastroenteritis with the inclusion of WBC and PLT, retaining a similar AUROC but improving the precision-recall from 0.87 to 0.91 (Supporting information Fig. S8B). There was no improvement in the classification of malaria or influenza.

Discussion
Fever is one of the most common symptoms in patients presenting in an emergency setting, especially following travel to tropical and subtropical regions [33]. However, fever is a nonspecific symptom associated with many different conditions, some potentially dangerous [6]. Despite this, a large proportion of patients seeking care after travel are discharged without identification of the etiological agent causing the disease [2]. This could be due to a lack of awareness of, and/or specific tests for, rare pathogens, or difficulty in selecting the appropriate test to perform.
Cytokines are immune mediators temporarily produced at high levels during infection, where they provide important functions such as directly inhibiting pathogen dissemination, stimulating or dampening immune activation, and controlling cellular migration [34]. The response is generated as a reaction to pathogen-specific patterns and via antigen-specific recognition, potentially making it specific for a given pathogen [35,36]. This makes it possible to better understand how the immune system responds to a given infection and potentially predict the type of pathogen based on the inflammation-associated markers that are increased during the infection. With a broad approach that includes viral, parasitic, and bacterial infections in febrile patients, we show that many cytokines and other inflammation-associated proteins are up- or downregulated compared with healthy controls and further differ between the infections, enabling us to identify disease-specific plasma protein profiles.
We used three complementary approaches to assess the disease-specific protein profiles: (1) a univariate differential abundance analysis, where levels of each individual protein are compared across disease groups; (2) PCA, an unsupervised dimensionality reduction method illustrating variability in protein data between donors; and (3) random forest classification, a supervised machine learning method that combines multiple decision trees to predict disease group membership for each given sample, which was used to distinguish each disease from the others. Although the three methods answer fundamentally different questions about the data, they all highlight a similar set of key proteins that can characterize the main differences in the host inflammatory response toward the different diseases.
Using Boruta feature selection, four proteins (IL10, MIG, sCD25, and sTNFRI) were identified as contributing significantly to stratifying between the four infections included in the analysis, indicating a varied expression in different diseases. IL10 was among the top three features selected for patients with P. falciparum malaria, where IL10 was markedly upregulated (also to some extent in dengue virus infection), and for patients with enteric bacterial infection, where it remained at baseline levels similar to healthy controls. The increased level of IL10 in patients with malaria is in line with several previous reports [37][38][39]. However, by comparing IL10 levels in malaria with other infections in this study, it becomes clear that the levels reached during acute P. falciparum malaria are very high and appear to be a relatively specific hallmark of the disease. In addition to IL10, patients with P. falciparum malaria also had especially high levels of sCD25 and sTNFRI. Both these proteins are soluble receptors, with sCD25 corresponding to the soluble form of the IL2 receptor α-chain and sTNFRI corresponding to the soluble TNF receptor 1 or CD120a, which binds TNF-α. sCD25 has been suggested to be a marker of T-cell activation [40] and has been shown to be increased during different infections, inflammatory diseases, and cancer [41][42][43]. Its purpose remains relatively unclear, but it has been suggested to sequester IL2 and thereby inhibit excessive T-cell activation while simultaneously skewing toward the survival of CD25-high regulatory T cells [44]. sTNFRI is expressed by most cells, while sTNFRII is mainly induced in a subset of cells during inflammatory responses. Both receptors are elevated in blood during malaria and correlate with parasitemia in both symptomatic and nonsymptomatic infections [45,46] and with the clinical stage and progression of HIV and sepsis [47]. sTNFRI is suggested to bind and deactivate excessive TNF to reduce overall inflammation [46]. CD40L is another membrane-derived protein belonging to the TNF superfamily. It can have both immunostimulatory and immunoinhibitory effects, and soluble CD40L has been associated with the induction of regulatory T cells and immunosuppression in HIV and cancer [48,49]. One of the main sources of both membrane-bound and soluble CD40L is platelets [50]. In this study, soluble CD40L was somewhat elevated in enteric bacterial infection compared with healthy controls, but the main difference was a relatively specific and significant reduction in plasma during P. falciparum malaria. A potential reason for this strong reduction in soluble CD40L could be thrombocytopenia, as observed in the current study and previous studies of malaria [51,52]. Consistent with this, other platelet-derived factors, including ENA78, RANTES, and TARC, were also highly correlated with platelet numbers. CD40L and platelet numbers also provided similar importance scores in the Boruta feature selection, suggesting that either can be used in a malaria signature. Taken together, the high levels of IL10, sCD25, and sTNFRI and II suggest that there is a greatly expanded regulatory or immunosuppressive response generated during acute malaria, perhaps as a counter-effect to the strong immunostimulation coming from high levels of parasites in the blood [53]. However, it has also been proposed that the strong inflammatory and anti-inflammatory response could affect the long-lived adaptive B- and T-cell compartment and reduce the generation of protective immunity [54][55][56]. Although it is difficult to determine the exact effect of this combined response, it is clear that repeated malaria episodes lead to reduced activation of innate immune responses [57][58][59]. This could be an effect of innate training [60,61] or cellular dysregulation [58,62], but could also be influenced by adaptive responses affecting innate activation [19,63,64].
MIG, also called CXCL9, was elevated in all groups compared with healthy controls, potentially working as a general marker of infection. However, the levels also differed between the infections, with enteric bacteria and P. falciparum malaria having significantly higher levels than both viral and bacterial respiratory tract infections. MIG is induced by IFN-γ and mainly mediates lymphocyte recruitment via binding to its receptor CXCR3 [65]. MIG is often also co-expressed with IP10 (also called CXCL10), which was among the top three features selected for dengue virus infection. Like MIG, IP10 was also elevated in all infections, but more so in the dengue group. Since the IFN-γ levels were not higher in dengue compared with enteric bacteria or P. falciparum malaria, the increased IP10 levels could come from induction via direct sensing by pattern-recognition receptors [66]. In support of this, the level of type I IFN (IFNα2) was elevated in dengue virus infection, but not in the other groups. IFNα2 was also important in the classification of dengue versus the other groups. However, since the levels were overall relatively low and variable between donors, dengue was poorly separated from the other disease groups. The inclusion of clinical laboratory variables greatly improved dengue classification, mainly due to low CRP and WBC levels during infection compared with other febrile illnesses.
A universal marker or combination of markers that could identify specific pathogens would be highly valuable. Clinically available markers, especially the most widely used CRP, WBC, and differential counts, can provide some indication of whether acute fever is due to a bacterial or viral infection, but they remain relatively unspecific [67][68][69][70][71]. Recent studies, however, indicate that other host-derived plasma proteins could provide improved bacterial versus viral disease stratification [23,72]. In this study, we did not observe signatures that were unique to viral or bacterial infections as a group. However, since profiles enriched in several highly important pathogens were identified, these combinations of markers could be further analyzed to improve our understanding of disease-specific immune responses and potentially the identification of disease etiologies. Additionally, when paired with commonly used clinical laboratory variables, the combinations could further improve classification for some diseases, suggesting that rather than completely changing current markers, the inclusion of additional host proteins could potentially provide clinical value.
A strength of this study is that we have explored immune responses in the plasma of individuals with similar symptoms caused by a variety of microbiological etiologies, in contrast to many studies where only one pathogen and fewer proteins are studied [73][74][75]. Furthermore, travelers provide a unique opportunity to study host responses following limited exposure and in the absence of re-exposure, in contrast to studies in areas where the disease is endemic. The study population was also generally healthy, with a median age of 37 years, and therefore with little impact from other chronic diseases or medication, which are often prevalent in patients at the hospital level of care. Conversely, the study also has several limitations. The groups are relatively small for each disease and unbalanced in the number of study participants. They are also not perfectly matched in age or gender. It is therefore important to note that the study primarily has an observational, exploratory aim, rather than aiming to identify clinical signatures that translate to patient stratification.
In conclusion, our results show that the mapping of plasma protein profiles in febrile patients can identify biomarker combinations that indicate different etiologies. Additionally, we identified proteins that were uniquely high or low between the infections, indicating different biological functions in the host's response to infection. Future studies, with larger and more balanced groups and with independent training and testing sets, will be important to narrow down the disease signatures to key proteins that could be further developed for clinical tests.

Figure 1. Levels of inflammation-associated proteins in plasma in different types of infections. (A) Heatmap of normalized individual cytokine levels. Each column represents an individual and each row represents a cytokine. Columns are ordered based on disease etiology and rows are ordered using hierarchical clustering based on Euclidean distance. (B) Heatmap of normalized group-wise median cytokine levels. Each column represents a disease etiology group while each row represents a cytokine.

Figure 2. Comparison of inflammatory proteins between disease groups. (A) Log-transformed levels (pg/mL) of inflammation-associated proteins were compared between all groups. Significant differences, as determined by FDR-adjusted Mann-Whitney tests, are shown as dots, where the size of the dot indicates the p-value and the color indicates a positive (red) or negative (blue) fold-change between the compared groups. BRTI refers to bacterial respiratory tract infection. (B) Log-transformed protein levels are shown for individual donors in each group. Box plots indicate the median and interquartile range.
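As an illustration of the FDR adjustment mentioned in the Figure 2 caption, the following minimal sketch adjusts a list of p-values for multiple testing. It assumes the Benjamini-Hochberg procedure, which the caption does not specify; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the caption only says "FDR-adjusted"; Benjamini-Hochberg
    is used here as one common choice, not as the authors' stated method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from five pairwise protein comparisons.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
```

With these made-up inputs, the third and fourth comparisons fall below 0.05 before adjustment but not after it, which is the kind of filtering that decides which dots appear in a plot such as panel A.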

Figure 3. Analysis of protein profiles by principal component analysis (PCA). (A) PCA scatter plot based on principal components (PC) 1 and 2 from all measured plasma proteins (n = 49), indicating the overall data variation between donors (dots) and disease groups (color). (B) PCA scatter plot of PC1 and PC2 based on the 29 proteins that were significantly up- or downregulated between groups, as indicated in Figure 2A.

Figure 4. Selecting signatures for detecting individuals infected by each pathogen using the Boruta feature selection algorithm. (A-D) Variable importance plots from the Boruta feature selection algorithm fitted jointly to data for all proteins in detecting dengue, bacterial gastroenteritis, influenza, and P. falciparum malaria, respectively. Proteins are ordered from left to right by their importance for classification. The importance measure is defined as the Z-score of the mean decrease in accuracy (normalized permutation importance). Blue boxes correspond to the minimum, mean, and maximum Z-scores of shadow features. Red boxes indicate variables not contributing significantly to accurate classification. Green boxes indicate the proteins contributing significantly to the accurate identification of each infection type.

Figure 5. Evaluating performance in identifying individuals infected by each pathogen based on a combination of protein responses. Individual panels display (A) cross-validated receiver operating characteristic (ROC) curves and (B) aggregated precision-recall (PR) curves for the identification of dengue (red), bacterial gastroenteritis (blue), influenza (purple), and malaria (orange). Random forest classifiers were fitted to data on the selected proteins identified using feature selection for each pathogen. Gray curves in (A) correspond to the ROC curves obtained from fivefold cross-validation, and the aggregation of all five ROC curves for the classification of each pathogen is shown as a thick colored ROC curve. The area under the ROC/PR curve (AUC) shows the performance of the classifier. An AUROC of 0.5 indicates a classifier that performs no better than random (for AUPR, the random baseline equals the proportion of positive cases), and an AUROC/AUPR of 1 indicates a perfect classifier. AUPR, area under the precision-recall curve.
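To make the interpretation of the AUROC values in Figure 5 concrete, the sketch below computes the AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auroc`, the labels, and the scores are hypothetical illustrations; this is not the authors' evaluation code, which used random forest classifiers with fivefold cross-validation.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 3 infected (1) and 4 non-infected (0) individuals.
labels = [1, 1, 1, 0, 0, 0, 0]
perfect = auroc(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.05])
uninformative = auroc(labels, [0.5] * 7)
reversed_ranking = auroc(labels, [0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 0.95])
```

A classifier that scores every positive above every negative reaches 1, one that assigns the same score to everyone sits at 0.5, and one that ranks all cases in the wrong order drops to 0.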

Table 1. Clinical characteristics of patients included in the study (n = 124).