Measuring quality of vision including negative dysphotopsia

To adapt the Quality of Vision Questionnaire (QoV) for measuring negative dysphotopsia and to validate the original and modified versions in the Dutch population.


| INTRODUCTION
Quality of vision is a complex entity that comprises objective and subjective assessment and depends on individual perception. Measurements of visual acuity, together with assessment of the optical clarity of the media and the state of the retina, provide ophthalmologists with a good indication of the quality of vision. However, despite similar results of ophthalmological examination, the individual perception of the quality of vision may vary between patients. Therefore, assessments of the subjective perception of the quality of vision and of vision-related quality of life are essential tools for the evaluation of patients with ophthalmological complaints and after ocular surgery. For example, secondary outcomes, including assessment of the quality of vision after cataract surgery, help to evaluate the effect of the treatment from the perspective of the patient and to go beyond the measurements performed in the examination room (Hecht et al., 2023). However, these secondary outcomes are still underused and are reported in only a minority of clinical studies (Hecht et al., 2023). The role of secondary outcomes is expected to grow in the coming decades with the current tendency to perform surgery at a younger age, even in the absence of significant cataract, for refractive error correction (Hecht et al., 2023; Lundstrom et al., 2021).
The Quality of Vision Questionnaire (QoV), developed in 2010 by McAlinden et al. (2010), is an instrument with good precision and reliability for evaluation of the perception of quality of vision in patients with refractive errors, with cataract or with prior refractive or cataract surgery. This questionnaire was developed using Rasch analysis, a gold standard method for evaluation of the performance of questionnaires and psychometric data analysis (Bond & Fox, 2015; Pesudovs et al., 2007). Compared to a traditional approach that assumes that all items and item response categories contribute in the same way to the measurement, Rasch analysis provides estimates of the difficulty of each item and each response category on a linear scale, together with indicators of the reliability of these estimates (Bond & Fox, 2015).
The QoV includes three scales (frequency, severity and bothersome) with 10 questions each, covering positive dysphotopsia (glare, halos, starbursts), image clarity (focusing difficulties, fluctuations in vision, blurred and hazy vision), quality of binocular vision (double vision, depth perception) and image distortion. It is well known that up to 20% of patients after cataract surgery may suffer from another unwanted optical phenomenon, called negative dysphotopsia, which is characterized by seeing a shadow in the peripheral visual field (Davison, 2000, 2002; Makhotkina et al., 2018; Masket & Fram, 2021). Three questionnaires measuring negative dysphotopsia have been suggested previously; however, these do not provide interval measurement, and the validity and reliability of these instruments have not been confirmed by Rasch analysis (Aslam et al., 2004, 2007; Jin et al., 2009; Kinard et al., 2013; Radford et al., 2007). Our experience shows that patients with negative dysphotopsia usually find it difficult to describe their symptoms and to assess their severity. Therefore, a standardized scale is required to obtain a reliable estimation of the impact of this phenomenon on the quality of vision in pseudophakic patients. We believe that the addition of negative dysphotopsia items to the original QoV questionnaire of McAlinden et al. may further improve the psychometric properties of this instrument in a pseudophakic population and provide a more comprehensive estimation of the quality of vision in this group.
To the best of our knowledge, neither a validated Dutch version of the QoV questionnaire nor any other well-constructed Dutch questionnaire measuring the quality of vision (as a uniform, unidimensional concept) has been published before. Therefore, the aim of this study was to evaluate the Dutch version of the QoV questionnaire in a Dutch population using Rasch analysis and to construct and validate a modified version of this questionnaire that includes negative dysphotopsia items.
The QoV questionnaire was translated from English to Dutch in accordance with the back and forward translation method suggested by Guillemin et al. (1993). Subsequently, the questionnaire was discussed with 10 potential responders to assess the comprehensibility of the translated version (cognitive debriefing), and after some minor revisions the final version was established.
To develop items assessing negative dysphotopsia after cataract surgery, we conducted two focus group interviews with patients, performed an extensive literature review and summarized clinical data of patients with negative dysphotopsia in our clinic. The focus group interviews were conducted according to standardized guidelines and were led by an experienced moderator (Krueger & Casey, 2000; Onwuegbuzie et al., 2009). During focus group interviews, we used photographic examples of negative dysphotopsia ('bright arc' and 'dark arc'), suggested by Aslam et al. (2004), and we allowed patients to modify the pictures by drawing their own negative dysphotopsia images. As a result, five items representing negative dysphotopsia were constructed and, afterwards, added to the QoV questionnaire version for pseudophakic patients (ND-QoV). The items were constructed using the original three scales format with four response categories for each question. Photographic examples of negative dysphotopsia were included to assist patients with recognition of the symptoms and to make the questionnaire less sensitive to misinterpretations (Figure 1).

| Subjects
The study population consisted of three groups: contact lens wearers (spherical and toric corrections, soft and hard types, monofocal and multifocal designs), patients with cataract and patients after cataract surgery. One hundred eleven patients with contact lenses were recruited from several Visser Contact Lenses Practices in the Netherlands. One hundred fifty-two patients with cataract and 141 patients after cataract surgery were recruited from ophthalmological departments of the University Medical Centers in Maastricht and Leiden. The inclusion criteria were an age of 21 years or older and the absence of ocular comorbidity affecting vision, such as glaucoma with visual field loss, macular degeneration beyond mild drusen or mild pigment alterations, severe amblyopia, central visual loss and physical or mental impairment that may hinder participation in the study. Fifteen patients were excluded because of more than 33% missing responses, which resulted in a final sample size of 389 patients. Monocular best corrected visual acuity at far distance (BCVA) was recorded and the best eye was chosen for analyses. For statistical analysis, Snellen fractions were converted into logMAR values.
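The Snellen-to-logMAR conversion mentioned above can be illustrated with a minimal sketch. It uses the standard definition of logMAR as the base-10 logarithm of the minimum angle of resolution, i.e. the reciprocal of the decimal acuity. The function names are hypothetical and are not taken from the study.

```python
import math


def snellen_to_logmar(numerator: float, denominator: float) -> float:
    """Convert a Snellen fraction such as 6/12 or 20/40 to logMAR."""
    if numerator <= 0 or denominator <= 0:
        raise ValueError("Snellen numerator and denominator must be positive")
    # The minimum angle of resolution is the reciprocal of the decimal acuity.
    return math.log10(denominator / numerator)


def decimal_to_logmar(decimal_acuity: float) -> float:
    """Convert a decimal acuity such as 0.5 to logMAR."""
    if decimal_acuity <= 0:
        raise ValueError("decimal acuity must be positive")
    return -math.log10(decimal_acuity)
```

With this convention, a better acuity gives a lower logMAR value, so 6/6 maps to 0.0 and 20/200 maps to 1.0.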
The study was approved by the local ethical committee and adhered to the principles of the Declaration of Helsinki. All patients provided written informed consent.

| Rasch analysis
Rasch analysis is one of the probabilistic models that allows ordering of persons according to their ability and items according to their difficulty on an interval scale (Bond & Fox, 2015). The Andrich rating scale model was used for analysis of polytomous data (multiple response categories) in this study (Andrich, 1978; Bond & Fox, 2015). This model provides an estimate of the difficulty of each item and estimates of the thresholds between categories. These thresholds are the same for all items, and the overall item difficulty is set at the point of equal probability of the highest and the lowest category for each item. In other words, each item has its own overall difficulty estimate, but the threshold structure is shared between items.
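To make the shared threshold structure concrete, the following is a minimal sketch of the category probabilities under the Andrich rating scale model. It is an illustration only and not the Winsteps implementation. The function name and the example threshold values are assumptions, and the thresholds are taken to be centred (summing to zero).

```python
import math
from typing import List, Sequence


def rsm_category_probabilities(
    theta: float, delta: float, thresholds: Sequence[float]
) -> List[float]:
    """Probabilities of categories 0..m for one person and one item.

    theta: person measure in logits.
    delta: overall item difficulty in logits.
    thresholds: the m thresholds shared by all items (assumed to sum to zero).
    """
    # Category 0 has cumulative logit 0; category k adds (theta - delta - tau_k).
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(value - top) for value in logits]
    total = sum(weights)
    return [weight / total for weight in weights]
```

With centred thresholds such as `[-1.0, 0.0, 1.0]`, a person whose measure equals the item difficulty is equally likely to choose the lowest and the highest category, which matches the definition of overall item difficulty given above.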
Rasch analysis was performed using Winsteps Version 4.3.1. The following principles and parameters of the Rasch analysis were considered: category ordering and function, targeting, fit statistics, person separation, unidimensionality and differential item functioning.
The analysis of the category function was performed to assess whether item response categories were well-ordered and well-observed in the data. Targeting was assessed using Wright item-person maps, which show whether the range of the persons' abilities was well-covered by the range of the items' difficulties. A difference of more than one logit between the mean item difficulty and the mean person ability was considered significant mistargeting (Pesudovs et al., 2007).
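The targeting criterion can be sketched as a simple comparison of means. The function names and the default limit argument are illustrative assumptions. The one-logit limit itself is the criterion stated above.

```python
from statistics import mean
from typing import Sequence


def targeting_offset(
    person_measures: Sequence[float], item_measures: Sequence[float]
) -> float:
    """Mean person measure minus mean item measure, in logits."""
    return mean(person_measures) - mean(item_measures)


def is_mistargeted(
    person_measures: Sequence[float],
    item_measures: Sequence[float],
    limit: float = 1.0,
) -> bool:
    """True when the absolute offset exceeds the limit (one logit by default)."""
    return abs(targeting_offset(person_measures, item_measures)) > limit
```

A negative offset means that the items describe more severe symptoms than the average respondent reports.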
Fit statistics represent the magnitude of the difference between the observed measurements and the Rasch model expectations, and they indicate whether each item provides a meaningful contribution to the construct being measured. Fit statistics (infit and outfit) were reported as mean-square (MNSQ) values, which indicate the size of the misfit. The infit statistic is information-weighted and therefore less sensitive to outliers than the outfit statistic. Infit and outfit MNSQ values between 0.6 and 1.4 were originally recommended for a rating scale by Wright et al. (1994). Later, stricter criteria were proposed for the evaluation of questionnaires in ophthalmology, with acceptable values between 0.7 and 1.3 (Pesudovs et al., 2007). A high fit statistic indicates misfit, i.e. the item is too erratic and might degrade the measurement, whereas a low fit statistic indicates that the item is overfitting, i.e. muting the measurement but not degrading it.
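The difference between the two statistics can be shown with a short sketch based on the usual textbook definitions: outfit is the unweighted mean of squared standardized residuals, and infit is the sum of squared residuals divided by the sum of model variances. The function names are hypothetical, and the model expectations and variances are assumed to come from a fitted Rasch model.

```python
from typing import Sequence


def outfit_mnsq(
    observed: Sequence[float],
    expected: Sequence[float],
    variance: Sequence[float],
) -> float:
    """Unweighted mean of squared standardized residuals."""
    z_squared = [
        (x - e) ** 2 / w for x, e, w in zip(observed, expected, variance)
    ]
    return sum(z_squared) / len(z_squared)


def infit_mnsq(
    observed: Sequence[float],
    expected: Sequence[float],
    variance: Sequence[float],
) -> float:
    """Information-weighted mean square: squared residuals over model variance."""
    squared_residuals = sum((x - e) ** 2 for x, e in zip(observed, expected))
    return squared_residuals / sum(variance)


def within_fit_range(mnsq: float, low: float = 0.7, high: float = 1.3) -> bool:
    """Check a mean square against the 0.7 to 1.3 range cited above."""
    return low <= mnsq <= high
```

An unexpected response from a person far from the item (small model variance) inflates the outfit more than the infit, because its squared standardized residual is not down-weighted.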
The person and item reliability indexes indicate whether the ordering of persons based on their ability and of items based on their difficulty is replicable (Bond & Fox, 2015). The person and item separation indexes indicate how well the persons are separated using the current set of items and how well the items are separated using the current sample of persons (Boone & Noltemeyer, 2017). For persons, a separation index of 1.5 is acceptable, 2.0 is good and 3.0 is excellent (Boone & Noltemeyer, 2017; Duncan et al., 2003). Higher indexes indicate higher measurement precision.
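Separation and reliability are two expressions of the same information. The sketch below uses the standard relation between the separation index G and the reliability R, namely R = G² / (1 + G²). The function names are assumptions made for illustration.

```python
import math


def reliability_from_separation(separation: float) -> float:
    """Reliability R = G^2 / (1 + G^2) for a separation index G."""
    if separation < 0:
        raise ValueError("separation must be non-negative")
    return separation ** 2 / (1.0 + separation ** 2)


def separation_from_reliability(reliability: float) -> float:
    """Separation G = sqrt(R / (1 - R)) for a reliability 0 <= R < 1."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must lie in [0, 1)")
    return math.sqrt(reliability / (1.0 - reliability))
```

Under this relation, the thresholds of 1.5, 2.0 and 3.0 quoted above correspond to reliabilities of roughly 0.69, 0.80 and 0.90.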
The Rasch model requires unidimensionality of the measurement, that is, all items should measure only one concept, for example, quality of vision but not the psychological impact of eye complaints. The dimensionality analysis provides information about the amount of variation in the sample that is explained by the measurement. Principal component analysis of Rasch residuals indicates how much unexplained information is present and whether there is any systematic variation that is not explained by the measurement. A first contrast with the strength of two items or more (eigenvalue > 2.0) indicates the possible presence of a second dimension (Bond & Fox, 2015).
Differential item functioning (item bias, DIF) indicates whether the Rasch item estimates remain stable across the different subgroups in the sample. A DIF of more than one logit was considered to be notable (Bond & Fox, 2015; Linacre, 2016).
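The DIF criterion can be sketched as a comparison of item difficulties estimated separately in two subgroups. The function names are hypothetical, and the item labels and numbers in the usage note are made up for illustration. They are not study data.

```python
from typing import Dict, List


def dif_contrasts(
    difficulty_group_a: Dict[str, float], difficulty_group_b: Dict[str, float]
) -> Dict[str, float]:
    """Difference in item difficulty (group A minus group B), in logits."""
    return {
        item: difficulty_group_a[item] - difficulty_group_b[item]
        for item in difficulty_group_a
        if item in difficulty_group_b
    }


def notable_dif(contrasts: Dict[str, float], limit: float = 1.0) -> List[str]:
    """Items whose absolute contrast exceeds the limit (one logit by default)."""
    return sorted(item for item, value in contrasts.items() if abs(value) > limit)
```

For example, with hypothetical difficulties `{"item_a": 0.4, "item_b": -0.3}` in one group and `{"item_a": -0.7, "item_b": -0.1}` in the other, only `item_a` would be flagged.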
The analysis was conducted in two steps. First, the Dutch version of the QoV questionnaire was analysed using a single rating scale model. Afterwards, the data from the QoV questionnaire were combined with the data from the negative dysphotopsia questionnaire (only applied to the pseudophakic population) and the combined questionnaire was analysed using a two-rating scale model.

| Validity, test-retest reliability and repeatability analysis
The validity of the questionnaire was assessed using Spearman correlation coefficients between the questionnaire scores, spherical equivalent of refraction (SEQ), BCVA and age of the patients. Additionally, the independent samples t-test was used to assess the differences in questionnaire scores, age, BCVA and SEQ between pseudophakic patients with and without negative dysphotopsia.
For repeatability assessment, 41 patients (from all three groups) completed the QoV questionnaire, and 13 of them who were pseudophakic also completed the ND questionnaire, a second time within 10-14 days after the first administration. The test-retest reliability was assessed with a two-way single-measure intraclass correlation coefficient (ICC) using SPSS software version 23.0 (SPSS Inc., Chicago, IL). For repeatability assessment, we used the 95% repeatability coefficient (Rc), that is, two times the standard deviation of the differences between two measurements (Bland & Altman, 1986; McAlinden et al., 2015).
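The repeatability coefficient as defined above can be sketched as follows. The function name is hypothetical, the multiplier of 2 follows the definition in the text, and the use of the sample standard deviation of the paired differences is an assumption about the exact estimator.

```python
from statistics import stdev
from typing import Sequence


def repeatability_coefficient(
    first: Sequence[float], second: Sequence[float], multiplier: float = 2.0
) -> float:
    """Multiplier times the standard deviation of test-retest differences."""
    if len(first) != len(second):
        raise ValueError("both administrations need the same number of scores")
    if len(first) < 2:
        raise ValueError("at least two paired scores are required")
    differences = [b - a for a, b in zip(first, second)]
    # Sample standard deviation (n - 1 in the denominator) of the differences.
    return multiplier * stdev(differences)
```

The result is expressed in the units of the questionnaire score, so a smaller value means that repeated administrations agree more closely.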

| RESULTS
Table 1 shows the characteristics of the 389 analysed patients. Fifteen patients with cataract or after cataract surgery had moderate ocular comorbidity affecting their vision that was registered after inclusion in the study: maculopathy (due to drusen, macular pucker, myopia) with visual acuity between 0.3 and 0.6 Snellen in one eye (N = 9), glaucoma with mild nasal step (N = 3), previous vitrectomy in the fellow eye (N = 1), unexpected amblyopia with visual acuity 0.3 (N = 1) and corneal opacity (N = 1). As the Rasch analysis repeated after exclusion of these patients (data not shown) did not show any changes in the estimates of Rasch parameters that affected the conclusions of this study, the data were reported for the whole sample of patients. During cataract surgery, a monofocal intraocular lens was used in 128 (95.5%) patients, while six patients (4.5%) received a diffractive multifocal intraocular lens.

| Rasch analysis of Dutch version of QoV questionnaire
The categories were well-ordered and increased monotonically. All three scales showed good item separation and reliability and an acceptable person separation (Table 2). Rasch analysis also confirmed the same amount of mistargeting (around two logits) as the original version. The infit statistics were acceptable for all items except the 'Double vision' item, which was one of the most debilitating symptoms in our population (Figure 2), and the highest response categories were not well-observed for several items (data not shown). Unidimensionality analysis showed that more than 50% of the variance was explained by the measurement for all scales. There was no indication of a second dimension, because the first contrast strength was no more than two eigenvalues.
There was a noticeable DIF for one item in the 'Frequency' scale, two items in the 'Severity' scale and three items in the 'Bothersome' scale (Table 3). DIF was mainly observed between contact lens users and patients with cataract or after cataract surgery. Only two items, 'Halo (Frequency scale)' and 'Focusing difficulties (Bothersome scale)', showed a small DIF, of 1.1 and 1.2 logits respectively, between cataract and pseudophakic patients.

| Rasch analysis of combined ND-QoV questionnaire
Two items, 'grey strip' and 'black corner', were deleted from the combined questionnaire because of an insufficient number of observations. The highest response categories of the remaining negative dysphotopsia items were not well-observed in our data. Hence, the highest three response categories were merged, which resulted in a dichotomous response scale. A two-rating scale model was afterwards applied to the ND-QoV questionnaire (13 items), with the four-category response structure shared by the 10 original items and the two-category structure shared by the three negative dysphotopsia items. Addition of negative dysphotopsia items to the questionnaire did not change the category probability curves of the original QoV items. The modified questionnaire had similar precision and mistargeting and remained unidimensional (Table 2, Figure 3). There was noticeable DIF similar to the unmodified version of the QoV (Table 4).
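The category merging described above amounts to a simple recoding. The sketch below assumes that the four response categories are coded 0 to 3, with 0 as the lowest category. This coding and the function name are assumptions and are not taken from the study data files.

```python
def dichotomise_response(response: int) -> int:
    """Merge the three highest of four categories into a single category.

    Assumed coding: 0 is the lowest category; 1, 2 and 3 are the higher ones.
    """
    if response not in (0, 1, 2, 3):
        raise ValueError("response must be coded 0, 1, 2 or 3")
    return 0 if response == 0 else 1
```

Applied to a list of responses, for example `[dichotomise_response(r) for r in responses]`, this yields the two-category scale used for the negative dysphotopsia items.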

| Validity, test-retest reliability and repeatability analysis
There was no correlation between age and the frequency, severity or bothersome scores of any version of the QoV questionnaire. There was also no correlation between SEQ and the questionnaire data. We found a mild positive correlation between BCVA and all three questionnaire scores (r = 0.3, p < 0.01).
Eighteen of the 134 pseudophakic patients reported one or more negative dysphotopsia symptoms in the ND-QoV questionnaire. There was no significant difference in age, SEQ or BCVA between these patients and those without negative dysphotopsia (Table 5). Patients with negative dysphotopsia had significantly lower quality of vision (i.e. higher score) for all scales for both original QoV and combined ND-QoV versions.
The two-way, single-measure intraclass correlation coefficients (ICC) were more than 0.9 for all scales of both the full and modified versions of the QoV (Table 2). The ICCs for all scales of the ND-QoV questionnaire were lower (from 0.6 to 0.7). Exploration of the data revealed one outlier in the pseudophakic group who responded in a more negative fashion on all 13 items during the second administration. The reason for these response changes could not be explained by the available clinical data. Because of the small sample size for the reliability analysis of the ND-QoV (13 pseudophakic patients), this outlier had a large influence. Exclusion of this patient led to an improvement of the ICCs to 0.8. The 95% Rc was between 7.1 and 8.4 units (on the 0-100 scale) for the original QoV (Table 2). For the ND-QoV version, the 95% Rc was between 9.1 and 12.2 units. Exclusion of the outlier led to an improvement of the 95% Rc to values between 6.0 and 7.1 units.

| DISCUSSION
In this study, we have constructed a Dutch version of the QoV questionnaire and evaluated it with Rasch analysis in patients wearing contact lenses, having cataract or a history of cataract extraction. Rasch analysis of the translated version confirmed good psychometric properties of the QoV questionnaire in a Dutch population. Although lower than in the original version, all three scales have shown good item separation and reliability, acceptable person separation and item fit statistics. Our validation sample did not include healthy spectacle wearers and patients after refractive surgery and was smaller (389 patients compared to 900 patients) than in the original paper (McAlinden et al., 2010). Because the variability of symptoms increases with sample size and with the presence of eye diseases, the smaller sample provides less information about the items, which may explain the lower summary statistics in the current paper (Bland & Altman, 1986). Differential item functioning was reported between healthy patients (contact lens users) and patients with a history of cataract (cataract, pseudophakic), comparable to the original study (McAlinden et al., 2010). The questionnaire data showed a good fit to the Rasch model, and therefore, the questionnaire provides linear unidimensional measurement of three different aspects of quality of vision (frequency, severity and bothersome).
Because the purpose of this study was to provide the estimations comparable to the original version in order to facilitate further comparisons between international studies, these patients were not included.
In this study, we also presented an extension of the QoV questionnaire for pseudophakic patients (ND-QoV), which includes questions about negative dysphotopsia. The first photographic questionnaire including negative dysphotopsia items, the Forced Choice Photographic Questionnaire for Photic Phenomena, was suggested by Aslam et al. in 2004 and had good repeatability but was not validated by Rasch analysis (Aslam et al., 2004). In the current study, we combined the negative dysphotopsia questions with the QoV questionnaire in order to provide a more reliable assessment of the subjective quality of vision in pseudophakic patients. We have found that inclusion of negative dysphotopsia did not change the QoV precision (Table 3). The items fitted the model well and provided additional steps on the person-item map (Figure 3). Principal component analysis has confirmed the unidimensionality of the ND-QoV.
Interestingly, the pseudophakic patients with negative dysphotopsia had significantly lower quality of vision in both QoV and ND-QoV. This likely indicates that Rasch analysis combines all questions into a single underlying trait, 'quality of vision', and since the presence of negative dysphotopsia adversely impacts it, a QoV questionnaire without negative dysphotopsia items still detects that quality of vision is disturbed. This speaks to the important concept that measurement is made at the latent trait level, not the question level. Therefore, the actual specific questions are less important than expected, although specific questions about negative dysphotopsia (in this case) should increase sensitivity to this issue. Alternatively, the difference observed with the original version may indicate that patients with negative dysphotopsia either also suffer more from other symptoms accompanying negative dysphotopsia or have difficulties with neuroadaptation to the pseudophakic state, leading to a more extensive perception of visual disturbances.

In this study, we have found a low to mild positive correlation between questionnaire scores and visual acuity that was lower than in the original version.
These small correlations support the construct validity of the questionnaire, as several symptoms (e.g. hazy vision, blurred vision, focusing difficulties) are related to visual acuity, whereas other symptoms, for example, depth perception and starbursts, represent other aspects of the perception of the quality of vision. Therefore, this supports the hypothesis that the questionnaire data can make an essential contribution to the assessment of the patients' quality of vision that goes beyond visual acuity measurements.
A limitation of this study is that we did not include healthy spectacle wearers and patients after refractive surgery. As a consequence, the population was different from that in the original study (both in size and content), which could lead to differences in item estimations. Therefore, we were not able to validate the questionnaire for administration in these patient groups.
The sample size of the group of pseudophakic patients was insufficient for reliable assessment of thresholds in a four-category response scale for negative dysphotopsia items, as only 18 patients were symptomatic and most of them had mild symptoms. Because of the small number of observations of severe symptoms, the three highest response categories for negative dysphotopsia items had to be merged to provide a dichotomous measurement. However, the endorsement of the higher response categories by a few patients in this study indicates the relevance of the excluded response options for patients with severe negative dysphotopsia.
Bearing in mind the low (less than 2%) prevalence of severe symptoms, a follow-up study with a large sample size including patients with variable expression of negative dysphotopsia may be required to validate the four-response category scale and to allow better discrimination between patients with a different extent of this phenomenon (Davison, 2000, 2002; Makhotkina et al., 2018).
In conclusion, Rasch analysis has shown that the translated version of the Quality of Vision questionnaire is a valid and reliable instrument in a Dutch population. The precision of the questionnaire can be further improved by addition of negative dysphotopsia items for the assessment of quality of vision in pseudophakic patients.

ACKNOWLEDGEMENTS
Presented as a poster at the XXXVIII Congress of the ESCRS, Amsterdam, The Netherlands, October 2020.
No author has a financial or proprietary interest in any material or method mentioned.

TABLE 2
Rasch analysis results and reliability and repeatability analysis for original English, translated and extended Dutch versions of QoV.

a McAlinden, C., K. Pesudovs, and J.E. Moore, The development of an instrument to measure quality of vision: the Quality of Vision (QoV) questionnaire. Invest Ophthalmol Vis Sci, 2010. 51(11): p. 5537-45.
b ND-QoV version contains 10 original items and three negative dysphotopsia items.
c Reliability and repeatability of ND-QoV were assessed in a subgroup of pseudophakic patients and reported with, and after exclusion (in brackets) of, one outlier.

FIGURE 2 Persons-item maps for the three scales of translated QoV questionnaire. M, mean of person or item distribution; S, one standard deviation from the person or item mean; T, two standard deviations from the person or item mean.

TABLE 3 QoV questionnaire: Items with DIF of >1 logit between groups.

FIGURE 3 Persons-item maps for the three scales of ND-QoV. M, mean of person or item distribution; S, one standard deviation from the person or item mean; T, two standard deviations from the person or item mean.

TABLE 4 ND-QoV questionnaire: Items with DIF of >1 logit between groups.
a Independent samples t-test.