Comparing task‐induced psychophysiological responses between persons with stress‐related complaints and healthy controls: A methodological pilot study

Abstract

Aims: Chronic stress is an important factor for a variety of health problems, highlighting the importance of early detection of stress‐related problems. This methodological pilot study investigated whether the physiological response to and recovery from a stress task can differentiate healthy participants and persons with stress‐related complaints.

Methods and Results: Healthy participants (n = 20) and participants with stress‐related complaints (n = 12) participated in a laboratory stress test, which included 3 stress tasks. Three physiological signals were recorded: galvanic skin response (GSR), heart rate (HR), and skin temperature (ST). From these signals, 126 features were extracted, including static (eg, mean) and dynamic (eg, recovery time) features. Unsupervised feature selection reduced the set to 26 features. A logistic regression model was developed for 6 feature sets, analysing single‐parameter and multiparameter models as well as models using recovery vs response‐related features. The highest classification performance (accuracy = 78%) was obtained using the response‐related feature set, including all physiological signals and using GSR‐related features. A worse performance was obtained using single‐signal feature sets based on HR (accuracy = 66%) and ST (accuracy = 59%). Response‐related features outperformed recovery‐related features (accuracy = 63%).

Conclusion: Participants with stress‐related complaints may be differentiated from healthy controls by physiological responses to stress tasks. We aimed to bring attention to new exploratory methodologies; further research is needed to validate and replicate the results on larger populations and patients on different areas along the stress continuum.

associations have been established between psychological stress and depression, cardiovascular disease, and the course of HIV/AIDS. 3 Another review concluded that both acute and chronic stress research reveals extensive data concerning the stressors' contributions to deteriorated health, including sudden death and myocardial infarction. 4 Together, these findings highlight the need for affordable and effective early detection of stress problems and preventive interventions of stress-related mental health disorders.
Stress-related health problems can be conceptualized into 3 areas along the stress continuum 5 : stress-related complaints, overstrain, and burnout. A main differentiator between these 3 areas is the chronicity of the complaints. For stress-related complaints, the time since the onset of the complaints is less than 3 months; for overstrain, more than 3 months; and for burnout, more than 6 months. 5 Furthermore, persons categorized in the stress-related complaints group do not yet feel any substantial limitation in their social or professional functioning, whereas this is increasingly the case for both overstrained and burnout patients. 5 Physiological signals such as heart rate (HR), blood pressure, and galvanic skin response (GSR) have been investigated to detect stress-related health problems. Studies on autonomic nervous system (re)activity in the context of stress-related health problems have focused especially on the last stage in the stress continuum, ie, burnout. May et al 6 found that school burnout was associated with decreased baseline HR variability (HRV). In contrast, Morgan et al 7 showed that persons who score higher on the Maslach Burnout Inventory have significantly higher HRV.
De Vente et al 8 found that burnout patients show higher resting HRs than do healthy controls. Other studies investigating the hypothalamic-pituitary-adrenocortical (HPA) activity concluded that burnout patients and controls do not show differences in HPA outcomes. 9 A review analysis, including 22 studies investigating the physiological mechanisms among burnout patients, concluded that, so far, results are contradictory and inconclusive. 10 The authors suggest this could be due to differences between studies in the variety and severity of participants' symptoms, co-morbidity, use of medications, phase in the burnout process, and degree of sick leave. 10 Although preliminary, such research is promising for the detection of burnout. However, in terms of prevention, it could be valuable to detect stress-related health problems at an earlier stage of the stress continuum. To date, no validated questionnaires exist to identify individuals with stress complaints who are vulnerable to developing overstrain and burnout.
In the current study, we therefore sought to identify the specific characteristics of persons with stress-related complaints who are not yet limited in their social or professional functioning, ie, the first stage of the stress continuum. Analogous to previous studies focusing on burnout, 8 we aimed to investigate the patient's autonomic nervous system responses to and recovery from an acute stressor, as these measures in particular may have great potential for ambulatory stress monitoring, dynamically tailored direct feedback, and just-in-time behavioural interventions. However, in contrast with most studies in this field, we opted for a less conventional, fundamentally different approach to the data. Traditionally, psychophysiological studies are hypothesis driven, which means that a study is specifically designed to answer a question. 11 The analysis, therefore, is confirmatory rather than exploratory. However, as technology is continuously improving and wearables become widespread, the amount and nature of available psychophysiological data have grown exponentially and call for complementary approaches that allow this wealth of data to be explored as fully as possible. Data scientists have already moved towards more exploratory data-mining techniques to develop classification algorithms that can unravel new knowledge hidden in the data. 11 In this methodological study, we apply this more exploratory approach to evaluate whether persons with stress-related complaints can be differentiated from healthy participants.
Previous studies have mainly investigated single physiological parameters independently (eg, Morgan et al 7 and De Vente et al 8 ), while combinations of multiple physiological parameters and comparisons between single markers could unravel additional insights. 12 Furthermore, previous studies have focused mainly on static features, ie, the comparison of mean HR in rest and stress tasks. However, both physical fitness and stress research strongly suggest that dynamic features such as response and recovery time can provide additional information regarding physical condition determination. 13 According to McEwen, 14 failure to shut off allostatic activity after a stress response is one type of allostatic load.
This could be reflected in a longer recovery time of the physiological signals after a stressor for patients. It is therefore necessary to investigate whether such dynamic features can also improve the detection of persons with stress-related complaints.
In this methodological pilot study, we aimed to explore whether a multiparameter classification model, on the basis of the physiological response to and recovery from 3 standardized laboratory stress tasks, can differentiate between healthy participants and persons with stress-related complaints. We also assessed which physiological signal(s) is most suitable for the characterization of persons with stress-related complaints. We included 3 commonly used physiological signals for stress detection: HR, GSR, and skin temperature (ST).
We hypothesized that a classification model combining all 3 physiological signals would outperform models based on the individual signals separately. Furthermore, we compared classification performances on the basis of response and recovery-related features. We hypothesized, on the basis of the suggestion of Linden et al, 15 that recovery-related features could provide additional insight into the difference between healthy participants and persons with stress-related complaints and, therefore, increase classification performance. Finally, we used both static and dynamic features for classification. We hypothesized, on the basis of earlier findings in physical fitness research, 13 that dynamic features can improve classification performance. These findings could enhance our understanding of the physiological differences between healthy participants and persons with stress-related complaints and may advise further strategies to use physiological signals for the early detection of stress-related health problems.

| Participants
A controlled laboratory study was conducted with the approval of the Medical Ethical Committee of the UZ Leuven. All participants signed an informed consent form before participating in the study. In total, 32 participants took part: 20 healthy participants (10 women, 10 men, mean age = 39.8 y, age range 26-57 y) and 12 persons with stress-related complaints (7 women, 5 men, mean age = 38 y, age range 23-56 y). The focus of this research is on early detection of stress-related health problems; therefore, only persons with stress-related complaints but without formal diagnosis of any clinical mental health disorder were included.
Healthy participants were recruited in 2 companies in Belgium.
They were all employees with a mainly sedentary job who volunteered to participate in the study. They did not receive any compensation for their participation in the study. The healthy participants did not report any physical or psychological disease or complaint, as assessed through an intake questionnaire, including, for example, questions related to whether participants suffer or have suffered from psychosis, hyperventilation, depression, epilepsy, panic attacks, and burnout. Persons with stress-related complaints were recruited at Tumi Therapeutics, a multidisciplinary ambulatory diagnostic and treatment centre that specializes in stress-related symptoms and syndromes. In return for participation, patients received the psychophysiological diagnostics, which involved the stress tests, free of charge. In addition to the stress test and as part of the standard intake procedure at Tumi Therapeutics, patients also completed a set of questionnaires. Only patients with stress-related complaints (first phase of the stress continuum) were included. Specifically, the following inclusion criteria were applied: (1) the patient experienced somatic complaints, and (2) the complaints started less than 3 months before consultation and The test measures 8 primary symptom levels, ie, sleep difficulties, agoraphobia, hostility, somatization, interpersonal sensitivity, anxiety, cognitive-performance deficits, and depression. The results can be compared with those of a healthy and clinical norm group for female and male participants separately. 17 The mean results for the selected patients and normal and clinical norm groups are reported in Table 1.
The included patients scored higher on the subscales than did the healthy norm group but scored lower than did the clinical norm group, for all scales, except for somatization and sleep difficulties, for which they scored higher than did the average clinical norm group. The Nijmegen questionnaire for hyperventilation 18 was used to assess several singular stress complaints such as chest pain, being short of breath, and blurred vision. Included patients scored positive on the Nijmegen questionnaire for hyperventilation, having 18 points or more. All patients confirmed their complaints started less than 3 months before consultation, and all patients were still capable of fully functioning in their social and professional lives. Further, a clinical interview based on the Mini International Neuropsychiatric Interview, which is based on DSM-IV criteria, 19,20 was conducted to exclude the existence of any psychiatric disorders. Organic diseases were excluded on the basis of doctor's reports, physical examination, medical tests, and self-reporting.

| Procedures
The protocol consisted of 3 stress tests of 2 minutes each: a Stroop Color-Word test, 21 a math test, and a stress talk, which were presented using the NeXus 10 MK II software (Mind Media, Herten, The Netherlands). The tasks were given in the same order to all participants. During the Stroop Color-Word test, colour words were written in an incongruous ink colour; eg, the word red was written in the colour blue. Participants had to respond with the real ink colour, eg, blue in the previous example. During the math test, participants had to successively subtract 7 from the number 1081. To induce additional stress, the experimenter intervened by saying "wrong" or "faster" during the first 2 tasks. During the stress talk, participants had to talk for 2 minutes about a stressful life event. All 3 tests are commonly used to induce stress in laboratory settings. 22 The 3 tests were separated by rest phases of 2 minutes. Before the first and after the last stress test, a resting phase of 2 minutes was included. The timeline of the experiment is shown in Figure 1.
For the healthy participants, the protocol additionally included a counting task before the first rest phase and after the last rest phase, as presented in detail by Smets et al. 23 The goal of this counting task was to control for the physiological response to speaking. We have shown that a stressful task with speech can be distinguished from a nonstressful speaking task, ie, counting. 23 Since the counting task did not significantly differ from a rest phase, it was removed to reduce thermistor. This is a small point probe, secured by placing tape over the measuring tip to avoid signal contamination by air flow.
Heart rate was measured at 128 Hz using a blood volume pulse sensor at the ring finger of the nondominant hand. The sensor used photoplethysmography, which is a light-based technology to sense the rate of blood flow as controlled by the heart beats. With this signal, instant HR was detected in real-time by the NeXus software.
Participants were asked to keep their hands still, as all signals are susceptible to motion artefacts. Physiological channels were simultaneously streamed to disk and displayed on a PC monitor. Offline, all channels were visually inspected to ensure good quality. There were no missing data. All sensors were attached at least 15 minutes before starting the protocol, allowing the participants to adapt to their position and to wearing the equipment.

| Feature computation
We applied an exploratory approach towards the signal analysis and feature computation, meaning the outcome for each feature is not hypothesized beforehand but rather explored. Before feature extraction, the physiological signals were standardized with zero mean and unit variance per participant to obtain time series on the same scale.
Then, the time series were divided into rest and stress blocks of 2 minutes each, according to the task performed in each segment. This resulted in a total of 7 blocks: 4 rest blocks (R1, R2, R3, and R4) and 3 stress blocks (S1, S2, and S3). The first rest block (R1) was excluded, since for the healthy participants this block was preceded by a counting task, whereas for the patients, this was the start of the experiment.
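The preprocessing described above can be illustrated with a minimal sketch. It assumes that a recording is a plain list of samples that starts exactly at the first rest block and alternates rest and stress blocks of equal length; the function names, the block order constant, and the use of the population standard deviation are illustrative assumptions, not details taken from the study's code.

```python
from statistics import mean, pstdev

# Assumed block order: rest and stress phases alternate, starting with rest.
BLOCK_NAMES = ["R1", "S1", "R2", "S2", "R3", "S3", "R4"]


def standardize(signal):
    """Rescale one participant's signal to zero mean and unit variance."""
    mu = mean(signal)
    sigma = pstdev(signal)  # population SD; an assumption of this sketch
    if sigma == 0:
        raise ValueError("a constant signal cannot be standardized")
    return [(x - mu) / sigma for x in signal]


def split_blocks(signal, sampling_rate_hz, block_seconds=120):
    """Cut a recording into the 7 consecutive blocks and drop R1."""
    n = int(sampling_rate_hz * block_seconds)  # samples per block
    if len(signal) < n * len(BLOCK_NAMES):
        raise ValueError("recording is shorter than 7 full blocks")
    blocks = {
        name: signal[i * n:(i + 1) * n]
        for i, name in enumerate(BLOCK_NAMES)
    }
    del blocks["R1"]  # the first rest block is excluded from the analysis
    return blocks
```

For a hypothetical signal sampled at 128 Hz, `split_blocks(standardize(signal), 128)` would return six lists of 15 360 samples each, keyed by block name.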
Next, 2 types of features were calculated: static and dynamic features.
The static features describe the distribution of the physiological signals, eg, the mean and standard deviation, in each block. For each signal, 18 static features were calculated, including the mean and standard deviation, as well as differences of means between pairs of rest or stress blocks (see Table 2). These trends were calculated to explore whether healthy participants and patients differ in the cumulative effect of consecutive stress tasks.
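As an illustration of such static features, the sketch below computes the mean and standard deviation per block and the difference of means for every pair of rest blocks and every pair of stress blocks. The feature names and the choice of pairs are assumptions made for this example; the definitive list is the one given in Table 2 and may differ.

```python
from itertools import combinations
from statistics import mean, pstdev


def static_features(blocks):
    """Return per-block means and SDs plus mean differences ("trends").

    `blocks` maps a block name such as "R2" or "S1" to its samples.
    """
    features = {}
    for name, samples in blocks.items():
        features["mean_" + name] = mean(samples)
        features["std_" + name] = pstdev(samples)
    # Trends: later block mean minus earlier block mean, within rest
    # blocks ("R") and within stress blocks ("S") separately.
    for kind in ("R", "S"):
        names = sorted(n for n in blocks if n.startswith(kind))
        for earlier, later in combinations(names, 2):
            key = "trend_%s_%s" % (earlier, later)
            features[key] = mean(blocks[later]) - mean(blocks[earlier])
    return features
```

With the six retained blocks (R2 to R4 and S1 to S3), this particular construction yields 12 distribution features and 6 trend features per signal.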
The dynamic features represent the transition between different blocks, eg, the transition from rest to stress as response features, and the transition from stress to rest as recovery features. As these features have been shown valuable in physical fitness research, 13 Table 2.
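Two dynamic features of this kind, a slope and a recovery time, can be sketched as follows. The definitions used here (a least-squares slope over a window of samples, and recovery time as the first moment the signal comes within a tolerance of a baseline level) are assumptions chosen for illustration and are not necessarily the definitions used in the study.

```python
def slope(samples, sampling_rate_hz):
    """Least-squares slope of the samples, in signal units per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least 2 samples are needed to fit a slope")
    times = [i / sampling_rate_hz for i in range(n)]
    t_mean = sum(times) / n
    y_mean = sum(samples) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, samples))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def recovery_time(rest_samples, baseline, sampling_rate_hz, tolerance=0.1):
    """Seconds until the signal first comes within `tolerance` of `baseline`.

    Returns None if the signal never recovers within the rest block.
    The tolerance of 0.1 (in standardized units) is an arbitrary example.
    """
    for i, y in enumerate(rest_samples):
        if abs(y - baseline) <= tolerance:
            return i / sampling_rate_hz
    return None
```

A response feature would apply `slope` to the first samples of a stress block, and a recovery feature would apply either function to the rest block that follows it.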

| Statistical analysis
The goal of this study was to investigate whether healthy participants could be differentiated from persons with stress-related complaints on the basis of physiological data. Logistic regression (LR) using the scikit-learn library of Python 2.7 was used for the analysis. 27 In LR, the probability of the outcome of the healthy participants versus patients is modelled as a function of the features weighted by coefficients obtained with a training set. 28
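The way an LR model turns features into a class probability can be sketched in a few lines. This example shows only the prediction step with given coefficients; in practice the coefficients would be fitted on training data, for instance with scikit-learn's `LogisticRegression`. The coding of patients as class 1, the threshold of 0.5, and the function names are assumptions of this sketch.

```python
import math


def predict_probability(features, coefficients, intercept):
    """Probability of class 1 (assumed here to mean "patient")."""
    z = intercept + sum(w * x for w, x in zip(coefficients, features))
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(features, coefficients, intercept, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    p = predict_probability(features, coefficients, intercept)
    return 1 if p >= threshold else 0
```

Because the features were standardized, the magnitude of each fitted coefficient gives a rough indication of how strongly that feature contributes to the decision.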

| RESULTS
To evaluate whether physiological data could differentiate healthy controls from persons with stress-related complaints, classifiers using LR based on 6 feature sets were developed. After unsupervised feature reduction, 26 features were retained: 10 static and 16 dynamic.
The accuracy, sensitivity, and specificity for each set are presented in  Significant differences for the t test and medium to large effect sizes based on Cohen d were found for the 5 most important features (others did not show significant differences). These include 4 ST- and 1 GSR-related features. The t test was considered significant at P < .05, and an effect size d > 0.5 was considered medium and d > 0.8 large. 30 In Figure 4, the boxplots of these features are shown, comparing the standardized feature values of healthy participants and patients. This indicates a stronger increase in GSR slopes (ie, a stronger GSR) for patients. The performance is evaluated using accuracy, sensitivity, and specificity. Classifications based on the GSR and response-related features give the best performance.
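The performance measures and the effect size mentioned above can be computed as in the following sketch. It assumes that patients are coded as 1 and healthy participants as 0, and it uses the pooled-standard-deviation form of Cohen d; both choices, as well as the function names, are assumptions of this example. The thresholds in `effect_size_label` follow the cut-offs stated in the text (d > 0.5 medium, d > 0.8 large).

```python
import math
from statistics import mean, variance


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for labels coded 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of patients detected
        "specificity": tn / (tn + fp),  # share of healthy correctly cleared
    }


def cohen_d(group_a, group_b):
    """Cohen d using the pooled sample standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def effect_size_label(d):
    """Label an effect size with the cut-offs used in the text."""
    size = abs(d)
    if size > 0.8:
        return "large"
    if size > 0.5:
        return "medium"
    return "below medium"
```

As a purely hypothetical illustration (not the study's confusion matrix), 8 of 12 patients and 17 of 20 healthy participants classified correctly would give an accuracy of 25/32, or about 78%.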

FIGURE 3
Feature importance of the response feature set based on the relative contribution to the logistic regression model. Feature names contain 3 parts, separated by an underscore: (1) the physiological signal for which the feature was computed, ie, HR, GSR, or ST; (2) the feature (see Table 2); and (3) the stress task(s) for which the feature was computed: S1 = stress task 1 (ie, Stroop Color-Word test), S2 = stress task 2 (ie, math test), S3 = stress task 3 (ie, stress talk). GSR indicates galvanic skin response; HR, heart rate; ST, skin temperature.

an exponential approach was also proposed. Detailed investigation of the most important features for the model based on response-related features revealed that feature slopes and trends are the most relevant ones (Figure 3). The 5 most important features showed significant differences and medium-to-large effect sizes for the healthy participants compared with the patients.
A general observation of the results is that patients often show a more rigid response to stress than do healthy participants (ie, less variation between rest and stress). This could reflect one type of allostatic load, being the inadequate response of the allostatic systems as described by McEwen. 14 These results highlight the opportunities of using physiological stress responses as a means to discover new insights regarding the process of stress-related health disorders.
The current study was a methodological pilot study, which was executed in a laboratory setting and with a limited number of patients (n = 12). In the future, a possible application of this methodology could be large-scale population screenings for early detection of stress-related health problems. Therefore, to use this methodology in practice, it should be investigated whether similar results can be obtained in real-life conditions, outside the laboratory. To this end, wearables such as Empatica E4 (Empatica, Milan, Italy) could be used for ambulatory physiological measurement of HR, GSR, and ST. Additional challenges will be related to signal quality. 34 In the current study, only persons with stress-related complaints were included. All patients confirmed their complaints started less than 3 months before consultation, and all patients were still capable of fully functioning in their social and professional lives. However, since this information is based on self-report, it could be incorrect as patients might be unaware of problems in their functioning. Further, we suggest additional research to investigate whether the results generalize to larger populations and patients on different areas along the stress continuum (ie, overstrain and burnout). With this methodological pilot study, we aimed to bring attention to new exploratory methodologies; further research is needed to validate and replicate the results.
We conclude that our pilot study demonstrated the potential of physiological signals during the response to a stress task to discriminate healthy participants from persons having stress-related complaints. Our analysis also showed that a multiparameter classification model based on response-related features can outperform models based on single parameters (HR and ST) and models based on recovery-related features only. Investigation of the separate features can provide more insights and enhance our understanding of the physiological differences between healthy participants and persons at risk of stress-related health problems. Although further research is needed to investigate if these conclusions generalize to a larger population and to multiple clinical diagnoses, these results highlight the potential of using physiological signals and an exploratory approach to gain more insight into the difference between healthy participants and patients. Further longitudinal research using wearable technology to investigate the development of the 3 stages on the stress continuum could provide a powerful technique for better understanding the development of stress-related disorders. Such research could unravel early detection points for early diagnosis and prevention.