Driving performance and neurocognitive skills of long‐term users of sedating antidepressants

Abstract Objective To assess driving performance and neurocognitive skills of long‐term users of sedating antidepressants, in comparison to healthy controls. Methods Thirty‐eight long‐term (>6 months) users of amitriptyline (n = 13) and mirtazapine (n = 25) were compared to 65 healthy controls. Driving performance was assessed using a 1‐h standardised highway driving test in actual traffic, with road‐tracking error (standard deviation of lateral position [SDLP]) being the primary measure. Secondary measures included neurocognitive tasks related to driving. Performance differences between groups were compared to those of blood alcohol concentrations of 0.5 mg/ml to determine clinical relevance. Results Compared to controls, mean increase in SDLP of all antidepressant users was not significant, nor clinically relevant (+0.75 cm, 95% CI: −0.83 cm; +2.33 cm). However, users treated less than 3 years (n = 20) did show a significant and clinically relevant increase in SDLP (+2.05 cm). No significant effects were observed on neurocognitive tasks for any user group, although large individual differences were present. Most results from neurocognitive tests were inconclusive, while a few parameters confirmed non‐inferiority for users treated longer than 3 years. Conclusion The impairing effects of antidepressant treatment on driving performance and neurocognition mitigate over time following long‐term use of 3 years.

Epidemiological studies have shown that tri- and tetracyclic antidepressant use is associated with an increased relative risk of 1.4-2.3 of becoming involved in a traffic accident (Bramness, Skurtveit, Neutel, Mørland, & Engeland, 2008; Chang et al., 2013; Leveille et al., 1994; Ray, Fought, & Decker, 1992). Experimental studies in patients confirm that these drugs can produce significant driving impairment after treatment initiation, but also indicate that such impairment may decrease over 2 weeks of treatment, possibly due to tolerance (Brunnauer et al., 2008; Veldhuijzen et al., 2006). In line with this, one study found that patients treated chronically with sedating antidepressants (clomipramine or imipramine) showed only minor impairment of psychomotor performance and memory compared to healthy controls (Gorenstein, De Carvalho, Artes, Moreno, & Marcourakis, 2006). This raises the question of whether long-term use of sedating antidepressants is still associated with clinically relevant impairment of driving.
Experimental studies have systematically assessed the clinical relevance of the effects of antidepressants on driving behaviour by means of comparison to alcohol, given its well-documented, dose-dependent association with crash risk (Blomberg, Peck, Moskowitz, Burns, & Fiorentino, 2009; Borkenstein, Crowther, & Shumate, 1974).
These studies focussed on driving performance after single and repeated doses of a range of antidepressants (Ramaekers, 2003). Results showed that tricyclic antidepressants, such as amitriptyline, produce moderate to severe impairment of driving performance equivalent to driving under the influence of a blood alcohol concentration (BAC) of 0.5 mg/ml or more during the first days of treatment as compared to placebo. However, driving impairment was virtually absent after one week of repeated dosing, likely due to tolerance development. For tetracyclic antidepressants, such as mirtazapine, clinically relevant driving impairment was observed at the onset of a nocturnal treatment regimen. This mitigated over 3 weeks of repeated dosing, but never fully disappeared, suggesting that tolerance was incomplete.
Results from experimental driving studies have also been used for classifying fitness to drive of individuals receiving antidepressant treatment. These classification systems (de Gier, Alvarez, Mercier-Guyon, & Verstraete, 2009; Ravera et al., 2012) use a graded warning system that expresses drug-induced impairment in BAC equivalents. The common classifications are: no/minor influence (category 0/I, BAC < 0.5 mg/ml), moderate influence (category II, 0.5 mg/ml ≤ BAC ≤ 0.8 mg/ml), and severe influence (category III, BAC > 0.8 mg/ml). Users of antidepressants that are classified as category III are advised not to operate a vehicle, given that driving may be impaired for approximately 24 h after intake (Gómez-Talegón, Fierro, Del Río, & Álvarez, 2011). A limitation of existing drug categorisation systems is the lack of information regarding the effects of long-term drug usage on driving performance. For example, mirtazapine and amitriptyline are classified as category III drugs because of their acute effects on driving performance. Impairments may, however, dissipate after prolonged use, in which case classification of these antidepressants as category III may be too conservative for drivers receiving long-term treatment, limiting their mobility unnecessarily.
The objective of the present study was to evaluate driving performance of long-term users of category III antidepressants, as compared to that of a group of normative healthy controls. Long-term usage was defined as longer than 6 months. The secondary objective was to evaluate driving performance separately for those participants who had been using antidepressants for less than 3 years, and those whose use exceeded 3 years. The criterion of 3 years was based on Dutch law, which states that antidepressant users are unfit to drive when treated for less than 3 years but can request an individual driver fitness evaluation after more than 3 years of stable usage (Ministry of Infrastructure and Water Management, 2000). Driving performance was assessed by a standardised highway driving test in actual traffic and various neurocognitive tests related to driving. The present data were collected as part of a larger study on the long-term effects of benzodiazepines and antidepressants on driving performance. Data on long-term benzodiazepine use and driving are published separately (van der Sluiszen et al., 2019).

| Design
The study was designed as a multi-centre trial (Universities of Maastricht, Utrecht and Groningen) in the Netherlands to compare on-the-road driving and driving-related skills between long-term (>6 months) users of antidepressants and healthy controls. To explore the potential difference in impairment before and after 3 years of use, antidepressant users were divided into two groups based on duration of treatment, that is, long-term use less than 3 years (LT3−) and long-term use more than 3 years (LT3+).

| Participants
Category III antidepressant users were recruited via patient organisations, hospitals, practitioners affiliated with UPPER (Koster, Blom, Philbert, Rump, & Bouvy, 2014) and regional advertisements. Healthy controls were recruited via flyers and advertisements in local newspapers. Participants were informed about the study's goal, procedures and potential hazards. The study was approved by the Medical Ethics Committees of Maastricht University and the Maastricht Academic Hospital and was conducted in agreement with the code of ethics on human experimentation established by the Declaration of Helsinki (1964), amended in Edinburgh (2000), Seoul (2008) and Fortaleza (2013). Written informed consent was obtained from each participant before enrolment. Participants received financial compensation for their participation in the study.

| Antidepressant users
Thirty-eight long-term category III antidepressant users were recruited (17 in Maastricht, 14 in Groningen, and 7 in Utrecht). Thirteen used amitriptyline, and 25 used mirtazapine. Initial screening was based on a medical history questionnaire evaluated by research physicians (MDs) responsible for the medical well-being of participants at each site. The inclusion criteria were: use of a category III antidepressant over a period of at least six months with a frequency of at least two times a week (≈90 days a year), possession of a valid driver's licence for at least 3 years, driving an average of 3000 km per year, normal or corrected-to-normal vision, and body mass index (BMI) between 17 and 35 kg/m². Although Dutch law deems category III antidepressant users who have been treated for less than 3 years unfit to drive, many of them drive a motor vehicle simply because they are unaware of this legal provision and because this provision is not actively enforced by the Dutch government. Participants were excluded if they used concomitant medication classified as International Council on Alcohol, Drugs and Traffic Safety (ICADTS) category III. Concomitant medication classified as ICADTS category 0/I was allowed, whereas ICADTS category II medication was evaluated by a research physician on an individual basis.
Additional exclusion criteria were: alcohol use >21 standardised units per week, smoking >20 cigarettes a day, use of illegal drugs in the past 3 months.
Before test days, antidepressant users were requested to take their medication at their usual time of day, that is, in the evening or morning. Their usual dosing regime was established at the screening visit and monitored by self-report on the practice and test day.

| Controls
Sixty-five controls formed a normative group comparable to the antidepressant user group in age, gender distribution and years of driving experience. Inclusion criteria were: a valid driver's licence for at least 3 years, driving an average of 3000 km per year, normal or corrected-to-normal vision and BMI between 19 and 29 kg/m². Exclusion criteria were: a diagnosed neurological, psychiatric or sleep disorder, alcohol use >21 standardised units per week, smoking >10 cigarettes a day, and use of illegal drugs or psychoactive medication (e.g., antidepressants, benzodiazepines, anticonvulsants, antihistamines, opioids) in the past 3 months.

| On-the-road driving test
In the standardised on-the-road highway driving test (Figure 1) (O'Hanlon, 1984; Ramaekers, 2017; Verster & Roth, 2011), participants drive a specially instrumented car over a 100 km primary highway circuit (i.e., A2 near Maastricht, A12 near Utrecht, and A28 near Groningen). They are accompanied by a licensed driving instructor who has access to dual controls. The participant's task is to maintain a constant speed of 95 km/h and a steady lateral position between the delineated boundaries of the slower right-hand traffic lane. The vehicle's speed and lateral position relative to the left lane delineation are continuously recorded. These signals are digitally sampled at 4 Hz and pre-processed off-line to mark data recorded during overtaking manoeuvres or disturbances caused by roadway or traffic situations. The pre-processed dataset is used to calculate the mean and variance of lateral position of clean (unmarked) data, for each successive 5-km segment and, as the square root of pooled variance over all segments, for the test as a whole. The primary outcome variable is the standard deviation of lateral position (SDLP, in cm), which is a measure of road tracking error, or 'weaving'. SDLP scores of prematurely terminated tests are calculated from the data collected until termination of each ride.
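As an illustration of the SDLP computation described above, the following minimal sketch pools the within-segment variances of clean samples and takes the square root. The function name, the input layout (one lateral position, one segment index and one exclusion flag per sample) and the weighting of each segment by its degrees of freedom are assumptions made for this example; the source does not specify how the pooling is weighted.

```python
import math
from statistics import variance


def sdlp(lateral_cm, segment_ids, marked):
    """Overall SDLP (cm) as the square root of the pooled within-segment variance.

    lateral_cm  : lateral position samples in cm
    segment_ids : segment index (e.g., successive 5-km segments) per sample
    marked      : True for samples excluded during pre-processing
    """
    # Collect clean (unmarked) samples per segment.
    segments = {}
    for position, segment, excluded in zip(lateral_cm, segment_ids, marked):
        if not excluded:
            segments.setdefault(segment, []).append(position)

    # Pool the sample variances, weighting by degrees of freedom (assumption).
    weighted_sum = 0.0
    dof = 0
    for samples in segments.values():
        if len(samples) < 2:
            continue  # a variance needs at least two samples
        weighted_sum += (len(samples) - 1) * variance(samples)
        dof += len(samples) - 1

    if dof == 0:
        raise ValueError("not enough clean samples to compute SDLP")
    return math.sqrt(weighted_sum / dof)
```

Because the variance is computed within each segment before pooling, a slow drift of the mean lateral position between segments does not inflate the overall score.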
Drug-induced impairments in the standardised highway driving test have been compared to that of a well-known benchmark drug (i.e., alcohol) that is known to jeopardise traffic safety and shows a clear exponential dose-dependent relationship with crash risk (Blomberg et al., 2009; Borkenstein et al., 1974). The clinical relevance of performance changes in the highway driving test has previously been determined by establishing the relationship between BAC and SDLP (Louwerens, Gloerich, DeVries, Brookhuis, & O'Hanlon, 1987). A recent meta-analysis of nine alcohol-calibration studies revealed that a mean increment in SDLP of 2.5 cm (95% CI: 2.0-2.9 cm) was observed during the standardised highway driving test at a BAC of 0.5 mg/ml; this increment has been defined as the minimal cut-off value for clinical relevance. The highway driving test has been used in more than 100 studies and has proven sensitive to alcohol, antidepressants and many other sedating drugs (Ramaekers, 2017; Roth, Eklov, Drake, & Verster, 2014; Vermeeren, 2004).

| Trailmaking Test
The Trailmaking Test (TMT) is a paper-and-pencil test measuring selective and divided attention, as well as executive functions (Reitan, 1958). The test comprises two parts. In part A, the task of the participant is to connect, as fast as possible and in ascending order, 25 circles that contain the numbers 1 to 25. In part B, the 25 circles contain letters (A to L) and numbers (1 to 13). Participants are required to connect the 25 circles as fast as possible in an alternately ascending fashion (i.e., 1-A-2-B-3-C and so on). The maximum time allowed for part A is 5 min, and for part B it is 6 min. The primary outcome measure for parts A and B is the time (in seconds) needed to complete each task, as measured by a handheld stopwatch.

| Digit Symbol Substitution Test
The Digit Symbol Substitution Test (DSST) is a paper-and-pencil test measuring executive attention and processing speed (Wechsler, 1958). Participants are presented with rows of digits and have to respond by writing the corresponding symbol in a blank space, according to a key presented at the top of the paper. The primary outcome measure is the number of correctly filled-in symbols in 90 s.

| Adaptive Tachistoscopic Traffic Perception Test
The Adaptive Tachistoscopic Traffic Perception Test (ATTPT) of the Vienna Test System assesses visual orientation ability, visual observational ability, speed of perception and skills in obtaining an overview (Schuhfried, 2009). Participants are briefly presented with pictures of traffic situations on a computer screen. After each picture, participants are required to indicate what was in the picture by choosing from five answer options (i.e., cars, cyclists, pedestrians, traffic signs and/or traffic lights). Pictures are presented adaptively, meaning that the difficulty of the pictures is adapted to the abilities of the participant (i.e., participants who perform poorly receive pictures containing less complex traffic situations, and vice versa for participants who perform well). The primary outcome is the number of correctly identified elements. Time to complete the task is 10 min.

| Reaction Test
The Reaction Test (RT) of the Vienna Test System assesses reaction time and motor time in response to simple and complex visual or acoustic signals (Prieler, 2008). Before the start of the test, participants are instructed to lay their index finger on a pressure-sensitive key (i.e., rest key). During the test, participants are required to press a target key with their index finger whenever a target stimulus is presented. After pressing the target key, they must return their index finger immediately to the rest key. By using a rest key and a target key, it is possible to distinguish between reaction time (the time between the presentation of the target stimulus and the moment the index finger is removed from the rest key) and motor time (the time between releasing the rest key and pressing the target key). The current experiment uses three versions of the reaction test, namely: S1, in which participants have to respond whenever a yellow circle is shown on screen; S2, in which participants have to respond whenever they hear a tone; and S3, in which participants have to respond whenever they see a yellow circle on screen and hear a tone in combination, while all other stimulus combinations are to be ignored. Time to complete all three versions of this task is 10 min. Outcome measures for each test are reaction time and motor time.
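The distinction between reaction time and motor time can be made concrete with a short sketch. The function name and the three timestamp arguments (all in milliseconds on a common clock) are hypothetical and serve only to illustrate the definitions given above; they do not describe the interface of the Vienna Test System.

```python
def split_response(stimulus_ms, rest_release_ms, target_press_ms):
    """Return (reaction_time, motor_time) in ms for a single trial.

    reaction time: stimulus onset until the finger leaves the rest key
    motor time   : rest key release until the target key is pressed
    """
    # Events must occur in the order stimulus -> release -> press.
    if not stimulus_ms <= rest_release_ms <= target_press_ms:
        raise ValueError("events must be ordered: stimulus, release, press")
    reaction_time = rest_release_ms - stimulus_ms
    motor_time = target_press_ms - rest_release_ms
    return reaction_time, motor_time
```

For example, a stimulus at 1000 ms, a rest key release at 1320 ms and a target key press at 1450 ms give a reaction time of 320 ms and a motor time of 130 ms.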

| Determination Test
The Determination Test (DT) of the Vienna Test System measures reactive stress tolerance, divided attention and mental flexibility (Neuwirth & Benesch, 2007). The test measures the ability to sustain attention over a period of approximately 10 min. Participants are presented with visual stimuli of varying colour and sounds with a different pitch, in a serial order. For each stimulus, a pre-defined button has to be pressed. The presentation of stimuli is adaptive to the reaction speed of the participant, meaning that the inter-stimulus-interval is shortened when participants make correct and fast responses, and is slowed down when participants make mistakes or respond slowly. During the task, participants are presented with the following stimuli and have to press the following corresponding

| Risk-Taking Test Traffic
The Risk-Taking Test Traffic (RTTT) measures risk-taking behaviour in potentially dangerous driving situations (Hergovich, Bognar, Arendasy, & Sommer, 2005). Participants are presented with 24 items (i.e., video clips) that show diverse driving situations, which are described in words before they are shown on-screen. Each driving situation is shown twice. During the first time, participants observe the entire driving situation. During the second time, participants are required to press a key on the keyboard, indicating the distance from the potential hazard at which the driving manoeuvre that has just been described becomes critical or dangerous (i.e., the point at which the participant would no longer perform the manoeuvre). The first of the 24 items serves as a practice item. Time to complete the task is approximately 15 min. The variable 'willingness to take risk in driving situations' is measured by obtaining the distance between the moment of a potential hazard, measured in hundredths of a second, and the moment the participant presses the key indicating the potential hazard becomes critical or potentially dangerous. This distance is a measure of subjectively accepted level of risk. Higher scores indicate higher levels of subjectively accepted risk.

FIGURE 1 Schematic drawing of the highway driving test. The standard deviation of lateral position (SDLP) is an index of road tracking error or 'weaving'. Drugs that induce sleepiness or sedation cause loss of vehicle control, leading to increased road tracking error. Figure and description taken with permission from van der Sluiszen et al. (2019).

| Psychomotor Vigilance Test
The Psychomotor Vigilance Test (PVT) is based on a simple visual reaction time test (Dinges & Powell, 1985). It measures the ability to sustain attention over a period of approximately 10 min. Participants are required to respond to a visual stimulus presented at a variable interval (2-10 s) by pressing a button with the dominant hand. The visual stimulus is the presentation of a counter that starts running from 0 to 60 s at 1-ms intervals. Participants are required to respond to this visual counter as soon as they perceive it on screen by pressing the corresponding button. If a response is made the counter stops, stays on screen for 500 ms as visual feedback for the participant, and disappears. During this period, a variable interval is presented and afterwards the next counter appears on screen. This cycle repeats until 100 stimuli have been presented on screen. If a response has not been made within 60 s, the clock resets and the counter restarts. Primary outcome measures are mean response speed and number of lapses (defined as responses with RT ≥ 500 ms) (Basner & Dinges, 2011). Performance on the PVT has been calibrated for dose-effects of alcohol (Jongen, Vuurman, Ramaekers, & Vermeeren, 2014).
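The two primary PVT outcomes can be derived from a list of response times as in the sketch below. The lapse threshold of 500 ms follows the definition in the text. Expressing response speed as the reciprocal of the response time (in responses per second) is an assumption based on common practice, and the function name is hypothetical.

```python
def pvt_outcomes(response_times_ms, lapse_threshold_ms=500.0):
    """Return (mean response speed in 1/s, number of lapses) for one PVT session."""
    if not response_times_ms:
        raise ValueError("at least one response time is required")

    # Response speed per trial as reciprocal RT, scaled to 1/s (assumption).
    speeds = [1000.0 / rt for rt in response_times_ms]
    mean_speed = sum(speeds) / len(speeds)

    # A lapse is a response with RT >= 500 ms, as defined in the text.
    lapses = sum(1 for rt in response_times_ms if rt >= lapse_threshold_ms)
    return mean_speed, lapses
```

With response times of 250, 500, 200 and 1000 ms, the per-trial speeds are 4, 2, 5 and 1 per second, so the mean response speed is 3.0 per second and two responses count as lapses.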

| Beck's Depression Inventory
The Beck's Depression Inventory (BDI; Beck, Steer, & Carbin, 1988) is a 21-item self-report questionnaire measuring depression related symptomology. Answer options for each question range from 0 to 3.
The obtained total score for the BDI serves as an indicator for the presence of depression related symptoms, ranging from 0 to 63.
Higher scores indicate the presence of more symptoms of depression.

| State-Trait Anxiety Index-Trait
The State-Trait Anxiety Index-Trait (STAI-T; Spielberger, Gorsuch, & Lushene, 1970) is the Trait dimension of the 40-item self-report STAI questionnaire. The STAI-T contains 20 questions that measure trait anxiety (i.e., how individuals feel in general). Answer options for each question range from 1 to 4, with total scores ranging from 20 to 80. Higher total scores indicate more anxiety-related symptoms.

| Pittsburgh Sleep Quality Index
The Pittsburgh Sleep Quality Index (PSQI; Buysse, Reynolds, Monk, Berman, & Kupfer, 1989) is a self-report questionnaire that assesses the quality and patterns of sleep over the last month, by rating the following seven domains: subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbance, use of medication and daytime disturbance. A summary score ranging from 0 to 21 can be derived, with higher scores indicating poorer sleep quality.
A summary score ≥5 indicates a poor sleeper.

Next, non-inferiority analyses were used to determine whether the 95% confidence interval (CI) of performance differences between antidepressant users and controls exceeded the criterion level of clinical relevance, that is, an equivalent performance change as seen at a BAC of 0.5 mg/ml. When evaluating the 95% CI of differences between groups, three interpretations are possible (Figure 2). Antidepressant users' performance was considered not impaired (i.e., non-inferior) when the upper limit of the 95% CI of the difference from controls was below the alcohol criterion for impairment. Their performance was considered impaired (i.e., inferior) when the lower limit of the 95% CI of the difference from controls was above zero and the upper limit exceeded the alcohol criterion for impairment.

| Groningen Sleep Quality Scale
When the 95% CI of the difference from controls included both zero and the alcohol criterion for impairment, the results were considered inconclusive. The non-inferiority limit for the on-the-road driving test was the +2.5 cm increase in SDLP associated with a BAC of 0.5 mg/ml. Clinical relevance of impairment of neurocognitive performance was also based on direct comparison with the impairing effects of alcohol obtained at a BAC of 0.5 mg/ml. In a separate study (Verster et al., 2016), an alcohol calibration was performed to determine which neurocognitive parameters were able to detect impairment at a BAC of 0.5 mg/ml. Results of the calibration study showed that the only parameters sensitive to the impairing effects of alcohol were: TMT-A, DSST, RT-S1, RT-S2, RT-S3, DT and PVT. Consequently, these are the only parameters that provided non-inferiority limits for the present study. The clinical relevance of results will only be discussed for these parameters.
All statistical analyses were conducted using the IBM Statistical Package for the Social Sciences for Windows (version 24; IBM Corp.). Power calculations were performed using G*Power version 3.1 (Faul, Erdfelder, Lang, & Buchner, 2007).

FIGURE 2 Hypothetical example of the qualification of clinical relevance of performance differences between antidepressant users and controls. The dotted line indicates the change in performance after alcohol intake (relative to placebo). A (drug-induced) change in performance will be classified as inferior when the 95% CI includes the alcohol criterion but not zero (A: inferiority). Non-inferiority is concluded when the 95% CI does not include the alcohol criterion (B: non-inferiority). If the 95% CI includes the alcohol criterion as well as zero, the qualification of clinical relevance is undecided (C: inconclusive). Figure and description taken with permission from van der Sluiszen et al.
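The three-way interpretation of the 95% CI can be expressed as a simple decision rule. The sketch below is illustrative only: the function name is hypothetical, it assumes that positive differences indicate worse performance (as with SDLP), and treating an upper limit exactly equal to the criterion as not non-inferior is an assumption, since the text does not address that boundary case.

```python
def classify_ci(ci_lower, ci_upper, alcohol_criterion):
    """Classify a 95% CI of the difference from controls.

    Returns 'non-inferior', 'inferior' or 'inconclusive'.
    Assumes larger differences mean worse performance.
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")

    # Upper limit below the alcohol criterion: not impaired.
    if ci_upper < alcohol_criterion:
        return "non-inferior"
    # CI reaches the criterion and excludes zero: impaired.
    if ci_lower > 0:
        return "inferior"
    # CI includes both zero and the criterion.
    return "inconclusive"
```

Applied to the SDLP intervals reported in this paper with the 2.5 cm criterion, the interval for all users (−0.83 cm; +2.33 cm) is classified as non-inferior, the interval for users treated less than 3 years (+0.11 cm; +4.00 cm) as inferior, and the interval for users treated longer than 3 years (−2.74 cm; +1.35 cm) as non-inferior.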

| Matching of controls
Analyses showed no significant effect of age, gender or driving experience in the ANCOVA for SDLP, ATTPT, RTTT or PVT mean reaction time. For these parameters, the entire control group was used as a reference for comparison with the long-term user groups. For the remaining parameters, matched healthy controls were used for each long-term user (sub)group.

| Highway driving test
Data of the highway driving test were missing for one person in the control group due to problems with the recording system. None of the individual driving tests were prematurely terminated.
Mean (±SE) scores for SDLP of both groups are shown in Figure 3. Mean SDLP of all antidepressant users did not differ significantly from controls (F(1,100) = 0.89, p = 0.35). The upper limit of the 95% CI of the difference between both groups (+0.75 cm, 95% CI: −0.83 cm; +2.33 cm) remained below the +2.5 cm alcohol criterion. The mean difference from controls was +2.05 cm (+0.11 cm; +4.00 cm) for LT3− users and −0.69 cm (−2.74 cm; +1.35 cm) for LT3+ users. Non-inferiority testing revealed that only for LT3− users, the lower and upper limits of the mean difference in overall SDLP exceeded zero and the +2.5 cm criterion, respectively, indicating clinically relevant impairment.

FIGURE 3 Left: Mean (±SE) SDLP for controls and antidepressant user groups. Right: mean (95% CI) differences in SDLP between antidepressant user groups and controls. The dotted line indicates the change in performance after alcohol intake (relative to placebo). Symbols above bars indicate significant difference from controls, p < 0.05. BAC, blood alcohol concentration; LT3−, users treated less than 3 years; LT3+, users treated longer than 3 years; SDLP, standard deviation of lateral position.

Table 4 shows the mean (±SE) of all performance parameters for each antidepressant user (sub)group and healthy controls. Comparisons between antidepressant users and controls showed no significant performance difference between both groups. Table 5 shows an overview of the 95% CI of mean changes between antidepressant users and (matched) controls on alcohol sensitive parameters only, including inferiority limits and analyses. The 95% CI of mean changes of all alcohol sensitive parameters included zero and exceeded the BAC 0.5 mg/ml criterion, indicating inconclusive results.

| Neurocognitive performance
Subsequent analyses based on treatment duration showed no significant performance difference between the normative control group and the LT3− or LT3+ user groups, respectively. Similar to the results for the group as a whole, non-inferiority analysis of alcohol sensitive parameters showed that the 95% CIs of the difference between LT3− users and controls on all alcohol sensitive parameters included zero and the alcohol criterion, indicating inconclusive results.
For the LT3+ users subgroup, non-inferiority was observed for the parameters of the RT-S2, RT-S3 and the PVT (mean RT + lapses).

| DISCUSSION
The current study compared the driving performance of long-term users of sedating antidepressants to that of a normative control group consisting of healthy participants. The goal was to evaluate whether the classification of the investigated antidepressants in category III may be too conservative for drivers who use their medication for a prolonged time. Overall, mean SDLP of long-term antidepressant users did not differ significantly from the control group. Significant increments in SDLP were found, however, for antidepressant users who had been treated for less than 3 years, but not for antidepressant users treated for longer than 3 years.
Furthermore, antidepressant users showed no significant differences in neurocognitive performance in comparison to controls, although individual variations were large as evidenced by wide 95% CIs around mean differences.
The clinical relevance of performance differences between antidepressant users and controls was determined by comparison to a threshold based on the influence of a BAC of 0.5 mg/ml. In the present study, when looking at the whole group of antidepressant users, the mean increase in SDLP was 0.75 cm in comparison to healthy controls.
The 95% CI of this mean difference included zero and did not include the alcohol criterion. This indicates that performance of antidepressant users during the on-the-road driving test is considered non-inferior for the group as a whole, given that the level of impairment associated with the legal limit of alcohol in traffic was not reached.
However, clinically relevant driving impairment was found in individuals who had been using antidepressants for less than 3 years.
The mean (95% CI) difference in SDLP between controls and LT3− users was +2.05 cm (+0.11 cm; +4.00 cm), which includes the BAC 0.5 mg/ml criterion of clinical relevance. For LT3+ users the mean and 95% CI remained below the alcohol criterion. This suggests mitigation of driving-related impairment over time, which corresponds with a decreasing accident risk found in epidemiological studies following long-term antidepressant treatment (Barbone, McMahon, & Davey, 1998; Rapoport et al., 2011).

In summary, antidepressant users who were treated for less than 3 years showed clinically relevant impairment during the on-the-road driving test, but this was absent in the subgroup of users treated for longer than 3 years. The lack of clinically relevant impairment in antidepressant users treated longer than 3 years was further supported by results from neurocognitive tests, although most outcomes of the neurocognitive tests remained inconclusive. These findings support the idea that duration of treatment can be taken into account when evaluating the impact of long-term medication usage on individual drivers. The implication would be that classification systems grading the effects of drugs on driving should allow for differential classification of drug effects on driving based on treatment duration.