Heterogeneous validity of daily data on symptoms of seasonal allergic rhinitis recorded by patients using the e‐diary AllergyMonitor®

Abstract Background Patient‐generated symptom and medication scores are essential for diagnostic and therapeutic decisions in seasonal allergic rhinitis (SAR). Previous studies have shown solid consistency between different scores at population level in real‐life data and trials. For clinicians, evaluating individual data quality over time is essential to decide whether to rely on these data in clinical decision‐making. Objective To analyze the consistency of different symptom scores (SSs) and symptom medication scores (SMSs) at individual level in two study cohorts with different characteristics and to explore individual patient trajectories over time. Methods Within the pilot phase of the @IT.2020 project on diagnostic synergy of mobile health and molecular IgE assessment in patients with SAR, we analyzed data of 101 children and 93 adults with SAR who had been instructed to record their symptoms and medication intake daily via the mobile app AllergyMonitor®. We then assessed the correlation between different SMSs and a visual analogue scale (VAS) on the impact of allergy symptoms on daily life at population and individual level. Results At population level, the Rhinoconjunctivitis total symptom score (RTSS) correlated better with VAS than the combined symptom and medication score (CSMS). At individual level, consistency between RTSS and VAS was highly heterogeneous and unrelated to disease severity or adherence to recording. Similar heterogeneity was observed for CSMS and VAS. Conclusions The correlation of clinical information provided by different disease severity scores based on data collected via electronic diaries (e‐diaries) is sufficient at population level, but broadly heterogeneous for individual patients. Consistency of the recorded data must be examined for each patient before remotely collected information is used for clinical decision‐making.


Funding information
Euroimmun, Grant/Award Number: 121939. Open access funding enabled and organized by Projekt DEAL.

KEYWORDS
allergic rhinitis, mHealth, patient-generated data, patient-reported outcomes, symptom scores

| INTRODUCTION
To date, allergen-specific immunotherapy (AIT) is the only disease-modifying treatment for seasonal allergic rhinitis (SAR), 1 a non-communicable disease affecting millions of citizens around the globe. 2,3 Although clinical benefit and cost-effectiveness have been proven in several settings, 4-8 the treatment is long, expensive and demanding for patients, who need to adhere to regular intakes or injections over several years. Therefore, international guidelines suggest that AIT should only be prescribed for patients whose allergy symptoms are not sufficiently controllable with symptomatic pharmacological treatment and preventative measures, such as allergen avoidance. 9 To enable informed and shared clinical decision-making, a standardized assessment of disease severity and control is usually performed retrospectively using validated questionnaires and criteria, such as the widely applied allergic rhinitis and its impact on asthma (ARIA) guidelines. 10 Recently, several digital and mobile health technologies have been proposed to facilitate the prospective real-time collection of patient-generated data on symptom severity, medication intake or allergen exposure. 11-15 Patients are asked to enter their clinical symptoms mainly via user-friendly electronic diaries (e-diaries), while exposure data are collected through national or local pollen monitoring stations or networks. A merged report of the data then gives patients and attending healthcare professionals a comprehensive overview of individual disease severity, patient compliance and symptom control.
The diagnostic usefulness of clinical e-diaries, however, strongly depends on both the patient's adherence to compilation and the quality of the entered data. While adherence to symptom recording is easily assessed by measuring the ratio of days with and without recordings during a fixed time period, 16 data quality assessment is more complex and less intuitive. 17 Recently, a large investigation evaluated at population level the validity, reliability and responsiveness of several visual analogue scales (VASs) for allergic rhinitis based on data collected via the mobile app MASK-air®. 18 The study demonstrated in a large data set that the examined approach can be reliably used to monitor population subsets of allergic patients, for example, in trials investigating the efficacy of AIT. 19 In addition to the use of patient-generated data at population level, individual recordings may provide valuable insights, making data quality assessment for single patients essential. This assessment is particularly relevant for the appraisal of symptom severity within the daily practice of attending physicians deciding for or against the prescription or cessation of AIT. 20 Before any data-based decision-making, the doctor should ascertain whether the amount (adherence) 16 and quality (validity) of the collected data are high enough.
A crucial question is, therefore, whether the level of consistency between the information on symptom severity (assessed by means of RTSS) and its impact on daily life (assessed via VAS) is uniform or heterogeneous for an individual patient. To answer this research question, we have analyzed the intra-patient, day-by-day internal consistency of RTSS and VAS in two distinct cohorts of patients with SAR prospectively examined during the pollen season in the context of the @IT.2020 pilot project.

| Study population
The @IT.2020 project aims at developing and testing diagnostic algorithms integrating molecular allergology and digital health for the prescription of AIT in patients with SAR. In its pilot phase, 200 patients with a diagnosis of SAR were recruited before the pollen After the individual monitoring period, all patients were invited to a final visit (T1) during which clinical questionnaires were repeated.

| AllergyMonitor® app for data collection on symptom severity and medication intake
AllergyMonitor® [Technology Project and Software (TPS) Production, Rome, Italy] is a mobile application designed for daily reporting of symptoms and medication intake related to SAR and/or asthma, with a front-end (i.e., patient app) and a back-office (i.e., doctor's website). In this study, all patients were asked to monitor their symptoms and medication intake via the AllergyMonitor® mobile app during an individually prescribed monitoring period according to the flowering periods of potentially eliciting allergen sources. The questionnaires included four questions on nasal symptoms (sneezing, rhinorrhea, nasal pruritus and nasal congestion), three on ocular symptoms (red eyes, itchy eyes and watery eyes), and three questions on the personal, scheduled medication intake (antihistaminic drugs, local steroids and systemic steroids). At the end of every data entry, users were asked to indicate the overall impact of their allergy symptoms via a continuous VAS (see below for detailed information).
To ensure the completeness of daily data sets, patients could only submit complete questionnaires. If no data were entered for two consecutive days, users received an automatic alert message on their mobile phone or by email; after 3 days of missed reporting, the alert was followed by a phone call from the study physician or nurse. Further details on the @IT.2020 pilot study are available elsewhere. 21
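As an illustration of how adherence to recording and the missed-day alerts described above could be derived from a list of recording dates, a minimal Python sketch follows. It assumes that adherence is the percentage of days with a recording within the prescribed monitoring period; the function names `adherence` and `longest_gap` and the example dates are hypothetical and are not taken from the AllergyMonitor® software.

```python
from datetime import date, timedelta


def adherence(recorded_days, start, end):
    """Percentage of days in [start, end] with a submitted questionnaire."""
    total_days = (end - start).days + 1
    if total_days <= 0:
        raise ValueError("end must not precede start")
    # Count each calendar day once, ignoring entries outside the period.
    in_period = {d for d in recorded_days if start <= d <= end}
    return 100.0 * len(in_period) / total_days


def longest_gap(recorded_days, start, end):
    """Longest run of consecutive days without any entry in the period."""
    recorded = set(recorded_days)
    longest = current = 0
    day = start
    while day <= end:
        current = 0 if day in recorded else current + 1
        longest = max(longest, current)
        day += timedelta(days=1)
    return longest


# Hypothetical 10-day monitoring period with entries missing on days 4 and 5.
start, end = date(2021, 5, 1), date(2021, 5, 10)
entries = [start + timedelta(days=i) for i in (0, 1, 2, 5, 6, 7, 8, 9)]

print(adherence(entries, start, end))    # 80.0
print(longest_gap(entries, start, end))  # 2
```

In this toy example a gap of two days would correspond to the automatic alert message, whereas a gap of three days would correspond to the follow-up phone call.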

| Used symptom (and medication) scores
For the present analysis, we calculated the following widely used symptom (and medication) scores: (i) the Rhinoconjunctivitis total symptom score (RTSS; range: 0-18) 26; (ii) the combined symptom and medication score (CSMS; range: 0-6) 27; and (iii) the VAS on general perception of allergy symptoms (VAS; range: 0-10). 10 While symptom severity was assessed using a four-item self-rating scale representing the severity levels (no symptoms, mild, moderate and severe) via four different emoticon faces, the general perception of allergy symptoms was assessed via the question 'How do you feel in relation to your allergy symptoms today?' and a continuous VAS for the answer, ranging from 0 (very good) to 10 (very bad). The interpretation of VAS does not require the application of a formula. RTSS and CSMS were automatically calculated within the AllergyMonitor® app using the answers on nasal (sneezing, rhinorrhea, nasal pruritus and nasal congestion) and ocular (itching, watery eyes) symptoms as well as medication intake (antihistaminic drugs, local steroids and systemic steroids). For details on the score calculation, please see Table e1.
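The score ranges given above can be reproduced with a short Python sketch. Table e1 is not shown here, so the following is only an assumption about the calculation: each of the six symptoms is graded 0-3, RTSS is their sum (0-18), and CSMS adds the mean symptom grade (0-3) to a stepwise medication score (0-3) in which the highest medication step taken that day counts. The dictionary keys and the step weights are hypothetical names chosen for illustration.

```python
NASAL = ("sneezing", "rhinorrhea", "nasal_pruritus", "nasal_congestion")
OCULAR = ("itchy_eyes", "watery_eyes")
SYMPTOMS = NASAL + OCULAR

# Assumed stepwise medication weights; not taken from Table e1.
MEDICATION_STEPS = {"antihistamine": 1, "local_steroid": 2, "systemic_steroid": 3}


def rtss(symptoms):
    """Sum of six symptom grades (0 = none ... 3 = severe), range 0-18."""
    total = 0
    for name in SYMPTOMS:
        grade = symptoms[name]
        if grade not in (0, 1, 2, 3):
            raise ValueError(f"{name}: grade must be 0, 1, 2 or 3")
        total += grade
    return total


def csms(symptoms, medications):
    """Mean symptom grade (0-3) plus highest medication step (0-3), range 0-6."""
    daily_symptom_score = rtss(symptoms) / len(SYMPTOMS)
    daily_medication_score = max(
        (MEDICATION_STEPS[m] for m in medications), default=0
    )
    return daily_symptom_score + daily_medication_score


# Hypothetical daily entry.
day = {
    "sneezing": 2, "rhinorrhea": 1, "nasal_pruritus": 1,
    "nasal_congestion": 3, "itchy_eyes": 2, "watery_eyes": 0,
}
print(rtss(day))                                      # 9
print(csms(day, ["antihistamine", "local_steroid"]))  # 3.5
```

Under these assumptions, a day with severe grades for all six symptoms and systemic steroid intake yields the maximum values of 18 (RTSS) and 6 (CSMS).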

| Study population and pollen season
The present analysis includes 101 children ( (Table S1). In Rome, grass pollen concentrations in the air ranged from 0 to 199 grains/m³

| Correlation of RTSS, VAS and CSMS at population level
During the reporting period, with a mean of 70.6 days (95% CI 64.9-74.4), the average adherence to symptom recording was 82.3% (SD 13.7) in Rome. 16 In this centre, the trajectories of the three scores correlated well at population level over time during the grass pollen season (Figure 1A). Daily population averages of VAS explained the variation in RTSS with an R² of 0.841 (Figure 1B).  Figure 1F).
In Pordenone, the mean adherence to symptom recording was 85.7% (SD 13.9) during the reporting period, which lasted on average 48.2 (95% CI 44.6-51.7) days. 16 Here too, the trajectories of average RTSS, CSMS and VAS showed a consistent trend over time (Figure 1G), and population averages of VAS explained the RTSS variability well, with an R² of 0.834 (Figure 1H). In Pordenone, as in Rome, the correlation was lower for CSMS and VAS (R² = 0.64) (Figure 1I) and the majority of data points of RTSS and VAS ranged in the lower third of both scores (Figure 1J). Interestingly, the variability of individual average values of VAS and RTSS (Figure 1K), as well as of VAS and CSMS (Figure 1L), was also high in Pordenone, reflecting the trends observed in Rome, with correlations of R² = 0.36 and R² = 0.29, respectively.
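The population-level comparison reported above (daily population averages of VAS explaining the variation in RTSS) can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the statistical code used in the study: the record layout `(patient_id, day, rtss, vas)`, the function names and the toy values are hypothetical, and R² is taken from an ordinary least-squares line.

```python
from collections import defaultdict
from statistics import mean


def daily_means(records):
    """Average RTSS and VAS over all patients for each day, ordered by day."""
    by_day = defaultdict(list)
    for _patient_id, day, rtss, vas in records:
        by_day[day].append((rtss, vas))
    days = sorted(by_day)
    mean_rtss = [mean(r for r, _ in by_day[d]) for d in days]
    mean_vas = [mean(v for _, v in by_day[d]) for d in days]
    return mean_rtss, mean_vas


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R² is undefined when one series is constant")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_res / syy


# Toy data: two patients, three days, (patient_id, day, rtss, vas).
records = [
    ("p1", 1, 2, 1), ("p2", 1, 4, 2),
    ("p1", 2, 6, 3), ("p2", 2, 8, 4),
    ("p1", 3, 0, 0), ("p2", 3, 2, 1),
]
mean_rtss, mean_vas = daily_means(records)
print(mean_rtss, mean_vas)
print(r_squared(mean_vas, mean_rtss))
```

Averaging across patients before the regression smooths out individual noise, which is one reason why a high R² at population level does not guarantee a high R² for any single patient.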

| Broad range of correlation between RTSS and VAS at individual level
The analysis of correlation between RTSS and VAS at individual level, performed through linear regression and expressed by the coefficient of determination (R²) (Figure 2A

| Individual patterns of RTSS and VAS trajectories
The above-reported heterogeneity of patterns is even clearer when the slopes of RTSS and VAS recorded by index patients are matched on the same timeline (Figure 3). Patients with a low correlation between the two scores (Figure 3A

| Lack of association between symptom (and medication) scores and disease severity or adherence to recording
To answer the question whether a compliant and severely affected patient is also a patient with high quality compilation, we compared the level of consistency between RTSS and VAS scores to the level of adherence and the level of disease severity. We could not identify any association of the level of consistency between RTSS and VAS scores and the level of disease severity or adherence to compilation.
This lack of association was evident in both clinical centres, in Rome (Figure E1a) and in Pordenone (Figure E1b).  Rome) contribute to converging information obtained through different questions on disease severity in populations with relatively homogeneous sensitization patterns (grass pollen sensitization in our case). Interestingly, the correlation of RTSS with VAS was higher than that of CSMS with VAS. This is in line with the idea that daily medication has no effect on VAS, as it is easy to use and has no side-effects in patients with allergic rhinitis.

| Heterogeneous consistency at individual level
Behind the good average score consistency measured at population level, we found a broad heterogeneity of performance at individual patient level in both study cohorts. This demonstrates that patients differ profoundly in their capacity to provide similar answers to similar questions, repeated every day over a long monitoring period, even when they are participating in an observational study. However, most patients have good or excellent consistency, while patients with completely independent measures (RTSS and VAS) are a minority. Interestingly, the level of consistency is independent of adherence to recording.
We observed patients with a relatively low adherence and high consistency of daily RTSS and VAS, and patients with a very good adherence but low consistency between the two measures. This indicates that the quality of the data entered by the patient cannot be predicted on the basis of the quantity of the entered data, that is, their adherence to compilation. Similarly, we also observed that the consistency between RTSS and VAS was independent of disease severity. Again, this counterintuitive outcome suggests that intention and attention in e-diary compilation are two independent behavioural parameters.
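The distinction between quantity (adherence) and quality (consistency) of recorded data can be made concrete with a per-patient profile. The Python sketch below is a hypothetical illustration, not the analysis code of the study: it computes, for each patient, the adherence in percent and the R² between daily VAS and RTSS (as the squared Pearson correlation, which equals the R² of a simple linear regression). The `min_days` threshold, the function names and the toy values are assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def r_squared(x, y):
    """Squared Pearson correlation; None if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy / sqrt(sxx * syy)) ** 2


def patient_profiles(records, monitored_days, min_days=5):
    """Adherence (%) and RTSS-VAS consistency (R²) for each patient.

    records: iterable of (patient_id, rtss, vas), one row per recorded day.
    monitored_days: dict mapping patient_id to length of monitoring period.
    min_days: assumed minimum number of entries needed to report an R².
    """
    by_patient = defaultdict(list)
    for patient_id, rtss, vas in records:
        by_patient[patient_id].append((rtss, vas))
    profiles = {}
    for patient_id, rows in by_patient.items():
        rtss = [r for r, _ in rows]
        vas = [v for _, v in rows]
        profiles[patient_id] = {
            "adherence": 100.0 * len(rows) / monitored_days[patient_id],
            "r2": r_squared(vas, rtss) if len(rows) >= min_days else None,
        }
    return profiles


# Toy data: patient A records every day but inconsistently,
# patient B records on 6 of 10 days but consistently.
a_rtss = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2]
a_vas = [3, 1, 4, 1, 5, 2, 6, 2, 5, 3]
b_rtss = [2, 4, 6, 8, 10, 12]
b_vas = [1, 2, 3, 4, 5, 6]
records = [("A", r, v) for r, v in zip(a_rtss, a_vas)]
records += [("B", r, v) for r, v in zip(b_rtss, b_vas)]

profiles = patient_profiles(records, {"A": 10, "B": 10})
for patient_id, profile in sorted(profiles.items()):
    print(patient_id, profile)
```

In this toy example, patient A has complete adherence but a very low R², whereas patient B has lower adherence and a perfectly consistent record, mirroring the two patterns described above.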

| Implications for the use of e-diaries in trials or epidemiological studies
The observed parallel trajectories at population level highlight that disease severity scores may be used interchangeably when studying populations or specific subsets, such as in trials or epidemiological studies. This is in line with previous observations, for example an analysis of electronically generated symptom data collected over 10 years with the Austrian and German Hayfever Diary. 29 The authors analysed large data sets of pollen-allergic patients over several pollen seasons and concluded that the exact method of symptom score calculation is not critical, as all computation methods used showed similar trends over time. Therefore, the authors proposed a general symptom load index. 29  through an e-diary, the allergist must evaluate not only the completeness of the data (adherence), 16  settings; (iii) as children and adults were recruited in geographically different centres, it is impossible to associate differences in the results with age or external factors, such as exposure to allergens or pollutants; and (iv) the presence of sensitizations to perennial allergens (e.g. mites, cat and dog) has not been taken into account in our analysis.
Their role should be studied in depth within future studies.

| CONCLUSIONS
The validity of clinical information provided by different disease severity scores based on data collected via e-diaries may be sufficient at population level, but is broadly heterogeneous at individual level. Data quality of the individual patient report must be examined before remotely collected information is used in clinical practice.
Future studies in real-life settings are required to further investigate risk factors for poor patient performance and to elaborate strategies to improve not only patient adherence, but also the quality of daily self-reported clinical data.

ACKNOWLEDGEMENTS
We thank all participants and the entire study team.