Using surrogate biomarkers to improve measurement error models in nutritional epidemiology

Nutritional epidemiology relies largely on self-reported measures of dietary intake, errors in which give biased estimated diet–disease associations. Self-reported measurements come from questionnaires and food records. Unbiased biomarkers are scarce; however, surrogate biomarkers, which are correlated with intake but not unbiased, can also be useful. It is important to quantify and correct for the effects of measurement error on diet–disease associations. Challenges arise because there is no gold standard, and errors in self-reported measurements are correlated with true intake and each other. We describe an extended model for error in questionnaire, food record, and surrogate biomarker measurements. The focus is on estimating the degree of bias in estimated diet–disease associations due to measurement error. In particular, we propose using sensitivity analyses to assess the impact of changes in values of model parameters which are usually assumed fixed. The methods are motivated by and applied to measures of fruit and vegetable intake from questionnaires, 7-day diet diaries, and surrogate biomarker (plasma vitamin C) from over 25000 participants in the Norfolk cohort of the European Prospective Investigation into Cancer and Nutrition. Our results show that the estimated effects of error in self-reported measurements are highly sensitive to model assumptions, resulting in anything from a large attenuation to a small amplification in the diet–disease association. Commonly made assumptions could result in a large overcorrection for the effects of measurement error. Increased understanding of relationships between potential surrogate biomarkers and true dietary intake is essential for obtaining good estimates of the effects of measurement error in self-reported measurements on observed diet–disease associations. Copyright © 2013 John Wiley & Sons, Ltd.


Introduction
The exposure of interest in nutritional epidemiology is typically the long-term average or 'usual' daily intake of a given nutrient, food, or food group. However, there is no gold standard measurement, and studies of the association between dietary intake and disease rely heavily on self-reported measures of intake, which may be subject not only to random errors but also to errors that depend on the true exposure level and on person-specific biases. Errors in measures of dietary intake result in biased estimates of diet-disease associations. Random error causes associations to be underestimated and possible failure to detect associations [1]. Systematic error, on the other hand, can result in underestimated or overestimated associations [1]. It is important to try to quantify and correct for the effects of measurement error on observed diet-disease associations.
Self-reported measurements of dietary intake are obtained using food frequency questionnaires (FFQs) or using records of actual intake over a day or series of days [2]. Types of food record include [...]
Sometimes, $X_i$ and $T_i$ are appropriately transformed prior to using this approach. The estimate of $\theta_1$ found by using $E(T_i \mid X_i)$ in place of $T_i$ in the diet-disease model is equal to the uncorrected estimate, obtained using $X_i$ itself, divided by $\lambda_{XT}$ (approximately in the case of logistic and proportional hazards regression). We refer to the correction factor $\lambda_{XT}$ as the regression dilution ratio (RDR) and estimate it by $\hat{\lambda}_{XT} = \mathrm{cov}(T_i, X_i)/\mathrm{var}(X_i)$.
When $T_i$ cannot be observed even in a validation study, we can still estimate the RDR if additional exposure measurements are available. This is carried out by replacing $T_i$ in (1) with an error-prone but unbiased 'reference' measurement, that is, a measurement that is subject only to random error, which may be available in a validation sample within a cohort. This requires a model for the error in the main and reference measurements and a number of assumptions.
Suppose that the error in the main measurement $X_i$ can be modelled as
$$X_i = \alpha + \beta T_i + \epsilon_i, \tag{2}$$
where the errors $\epsilon_i$ have mean 0, have constant variance, and are independent of $T_i$ and of each other. Parameter $\beta$ represents errors dependent on true intake. When $\alpha = 0$ and $\beta = 1$, (2) is the classical measurement error model, and $X_i$ is an unbiased measure of $T_i$. Suppose that a second measurement $X'_i$ is available, which is subject only to classical measurement error, that is, $X'_i = T_i + \epsilon'_i$. In the case of classical measurement error in the main exposure, $X'_i$ may be a repeat measurement of $X_i$ obtained using the same instrument. Alternatively, if $X_i$ is subject to a nonclassical error, $X'_i$ may be a different type of measurement, for example, a less error-prone but more expensive measurement, which provides an unbiased measure of $T_i$ and is available in a validation study. Under the crucial assumption that the errors in $X_i$ and $X'_i$ are independent, we can estimate the RDR $\lambda_{XT}$ by a regression of $X'_i$ on $X_i$ [1,8]. We now put this into the context of a nutritional epidemiological study. Most commonly, researchers have compared FFQ measurements ($X_i$) with measurements obtained from food records ($X'_i$) available in a subset of the cohort to correct for the effects of error in the questionnaire measurements using the [...]
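The estimation of the RDR from a reference measurement can be illustrated with a small simulation. The following is a minimal sketch, assuming normally distributed true intake and errors; all parameter values and variable names are hypothetical and chosen only for illustration. It compares the slope from regressing the reference measurement on the main measurement with the slope that would be obtained if true intake were observable.

```python
import numpy as np

def regression_slope(y, x):
    """Least-squares slope from regressing y on x: cov(y, x) / var(x)."""
    c = np.cov(y, x)
    return c[0, 1] / c[1, 1]

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical parameter values (not taken from the paper).
alpha, beta = 0.5, 0.7        # intercept and scaling bias in the main measurement
sd_t, sd_eps, sd_ref = 1.0, 1.0, 0.8

T = rng.normal(0.0, sd_t, n)                        # true intake (unobservable in practice)
X = alpha + beta * T + rng.normal(0.0, sd_eps, n)   # main measurement, nonclassical error
X_ref = T + rng.normal(0.0, sd_ref, n)              # reference measurement, classical error

rdr_from_truth = regression_slope(T, X)          # cov(T, X) / var(X)
rdr_from_reference = regression_slope(X_ref, X)  # cov(X', X) / var(X)
rdr_theory = beta * sd_t**2 / (beta**2 * sd_t**2 + sd_eps**2)

print(f"RDR using true intake:      {rdr_from_truth:.3f}")
print(f"RDR using reference:        {rdr_from_reference:.3f}")
print(f"RDR implied by parameters:  {rdr_theory:.3f}")
```

The two empirical slopes agree only because the simulated errors in the main and reference measurements are independent; introducing a correlation between them would make the reference-based slope differ from the RDR.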

Plan of the paper
In this paper, we describe a measurement error model for self-reported measurements using surrogate biomarkers. We propose the use of sensitivity analyses to investigate the effects of certain commonly made assumptions about the types of errors in dietary measurements. In Section 2, we give an overview of measurement error models for two and three dietary measurements and describe our extended model for two types of self-reported measurement and a surrogate biomarker. This work was motivated by data on self-reported measures of fruit and vegetable intake from questionnaires and 7-day diaries and a surrogate biomarker (plasma vitamin C) in the EPIC-Norfolk study, a prospective UK study of diet and cancer with over 25 000 participants. This study is unique in that diet diary and plasma vitamin C measurements are available for a large number of participants at two time points. Previous studies of measurement error involving food record measurement and biomarkers have been limited by small numbers of subjects. Fruits and vegetables are of major interest in nutritional epidemiological research [35,36]. In Section 3, we illustrate the proposed model and sensitivity analyses using the EPIC-Norfolk data. To our knowledge, this is the first time that surrogate biomarkers have been considered for use in measurement error models for food intakes rather than nutrient intakes. This is important because the possibility of a recovery biomarker for a food seems remote. The main focus is on estimation of RDRs for use in correcting observed diet-disease associations for the effects of measurement error in the dietary assessment. In particular, in this paper, we show the effects on RDR estimates of deviations from two commonly made assumptions in measurement error models involving surrogate biomarkers. It is also of interest to make comparisons of the degrees of error in FFQ and 7-day diary (7DD) measurements of fruit and vegetable intake. We conclude with a discussion in Section 4.

Overview of models for FFQ, food record, and biomarker
Throughout the paper, the following notation will be used: $T_i$ is the true long-term average intake for individual $i$, $Q_{ij}$ the FFQ measurement for individual $i$ at time point $j$, $R_{ij}$ the food record measurement for individual $i$ at time point $j$, and $M_{ij}$ the biomarker measurement for individual $i$ at time point $j$. To simplify some of the notation, we let $X$ refer to any of the error-prone measurements, $\{Q, R, M\}$.
First, consider a measurement error model for FFQ and food record only, in which the food record is treated as the reference measure for the FFQ, that is, the food record is assumed to provide an unbiased measure of true intake $T_i$:
$$Q_{ij} = \alpha_Q + \beta_Q T_i + \epsilon_{Q_{ij}}, \qquad R_{ij} = T_i + \epsilon_{R_{ij}}. \tag{3}$$
A number of authors have considered models of a similar form to (3) [9,13,15,17,25]. The assumptions required to identify the parameters of model (3) depend on how many repeated measures of each type are available. If only one measurement of each type is available, $(Q_{i1}, R_{i1})$, we require the assumption that the errors $\epsilon_{Q_{i1}}$ and $\epsilon_{R_{i1}}$ are uncorrelated (or have some known correlation) to estimate the RDR. In this case, the RDR is $\lambda_{QT} = \mathrm{cov}(R_{i1}, Q_{i1})/\mathrm{var}(Q_{i1})$, and we can estimate it from a linear regression of $R_{i1}$ on $Q_{i1}$. Note, however, that not all of the individual parameters of the model can be estimated in this case, that is, the model is not fully identified. The availability of a second FFQ measurement, so that we have $(Q_{i1}, Q_{i2}, R_{i1})$, allows us to estimate all of the model parameters only under the additional assumption that the errors in repeated FFQ measurements $(\epsilon_{Q_{i1}}, \epsilon_{Q_{i2}})$ are uncorrelated. This is a highly undesirable assumption. A slight modification is that we can allow a nonzero correlation between the errors in FFQ and the food record measurements made at the same time point only ($\mathrm{corr}(\epsilon_{Q_{i1}}, \epsilon_{R_{i1}}) \neq 0$). In this case, we do not estimate the RDR $\lambda_{QT}$ as described earlier. The addition of a second food record measurement, so that the data are $(Q_{i1}, Q_{i2}, R_{i1}, R_{i2})$, allows estimation of a nonzero correlation between errors in repeated FFQs ($\mathrm{corr}(\epsilon_{Q_{i1}}, \epsilon_{Q_{i2}}) \neq 0$), provided we assume that errors in repeated food records are uncorrelated ($\mathrm{corr}(\epsilon_{R_{i1}}, \epsilon_{R_{i2}}) = 0$) and that errors in FFQ and food record have zero correlation ($\mathrm{corr}(\epsilon_{Q_{ij}}, \epsilon_{R_{ik}}) = 0$). Again, as a slight modification, we can allow for a nonzero correlation between errors in FFQ and food record measurements made at the same time point.
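To see why the assumption of uncorrelated errors matters, the following sketch computes, under a model with an unbiased food record and an FFQ with intercept and scaling bias, both the true RDR for the FFQ and the quantity that a regression of the food record on the FFQ actually estimates when the two errors have correlation `rho_qr`. The function name and all numerical values are hypothetical; this is an illustration of the algebra, not a procedure from the paper.

```python
def rdr_and_regression_target(beta_q, var_t, var_eq, var_er, rho_qr):
    """Return (true RDR, slope targeted by regressing R on Q).

    Assumed model: Q = alpha_Q + beta_Q * T + eps_Q,  R = T + eps_R,
    with corr(eps_Q, eps_R) = rho_qr and both errors independent of T.
    """
    var_q = beta_q**2 * var_t + var_eq
    true_rdr = beta_q * var_t / var_q                      # cov(T, Q) / var(Q)
    cov_rq = beta_q * var_t + rho_qr * (var_eq * var_er) ** 0.5
    return true_rdr, cov_rq / var_q                        # cov(R, Q) / var(Q)

# Hypothetical values, used only to show the direction of the bias.
for rho in (0.0, 0.1, 0.3, 0.5):
    true_rdr, target = rdr_and_regression_target(
        beta_q=0.5, var_t=1.0, var_eq=1.0, var_er=0.5, rho_qr=rho
    )
    print(f"rho_qr={rho:.1f}  true RDR={true_rdr:.3f}  regression slope={target:.3f}")
```

With positively correlated errors the regression slope exceeds the true RDR, so a correction based on it would be too small.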
Provided that two food record measurements are available, we fully identify the model without a second FFQ measurement under the same assumptions required for the situation with two measurements of each type. If only a single FFQ is available alongside two food records, then of course, we cannot estimate a correlation between errors in repeated FFQs. We can replace all assumptions of zero correlation described here by assumptions that the given correlation takes a known nonzero value. Kipnis et al. [15] discussed the use of sensitivity analyses to assess the effects of error correlations on RDRs. Authors have extended model (3) using recovery biomarker measurements [22-27,37], and we can write it in the following form:
$$Q_{ij} = \alpha_Q + \beta_Q T_i + \epsilon_{Q_{ij}}, \qquad R_{ij} = \alpha_R + \beta_R T_i + \epsilon_{R_{ij}}, \qquad M_{ij} = T_i + \epsilon_{M_{ij}}. \tag{4}$$
Under this model, we assume the errors in the biomarker ($\epsilon_{M_{ij}}$) to be independent of those in the self-reported measurements ($\epsilon_{Q_{ij}}$, $\epsilon_{R_{ij}}$), which is reasonable because of the nature of a recovery biomarker. With this assumption, the RDRs for using an FFQ or food record as the main measurement in a diet-disease model are $\lambda_{QT} = \mathrm{cov}(M_{ij}, Q_{ij})/\mathrm{var}(Q_{ij})$ and $\lambda_{RT} = \mathrm{cov}(M_{ij}, R_{ij})/\mathrm{var}(R_{ij})$, which can be estimated by regressions of $M_{ij}$ on $Q_{ij}$ and of $M_{ij}$ on $R_{ij}$, respectively. We can estimate these RDRs using only one measurement of each type. To estimate all of the parameters of model (4) requires a repeated biomarker measurement and the assumption that errors in repeated biomarker measurements are uncorrelated ($\mathrm{corr}(\epsilon_{M_{i1}}, \epsilon_{M_{i2}}) = 0$), which again is assumed reasonable because of the nature of a recovery biomarker. The preceding assumptions allow estimation of correlation between the errors in FFQ and food record ($\mathrm{corr}(\epsilon_{Q_{ij}}, \epsilon_{R_{ik}})$), as well as between repeated measures using the self-reported instruments [...]

The nature of surrogate biomarkers means that surrogate biomarker measurements cannot be assumed to have errors independent of $T_i$; hence, several investigators [13,14,25,31-34,37] have described measurement error models for FFQ, food record, and a surrogate biomarker of a form similar to
$$Q_{ij} = \alpha_Q + \beta_Q T_i + \epsilon_{Q_{ij}}, \qquad R_{ij} = T_i + \epsilon_{R_{ij}}, \qquad M_{ij} = \alpha_M + \beta_M T_i + \epsilon_{M_{ij}}. \tag{5}$$
For identifiability of models of this form, we must have one type of measurement that is assumed to have errors independent of $T_i$, or more generally, systematic errors of a known form. This has invariably been chosen to be the food record measurement despite evidence to the contrary [22,27]. Model (5) is identified under the same assumptions about error correlations as described for model (4). Under model (5), estimation of RDRs is more complex, and we will outline it further below.
In models of the forms described in (3), (4), and (5), some authors have parametrized the model slightly differently by separating the error terms ($\epsilon_{X_{ij}}$) into a random part and a person-specific part [15,23,24,26,27,33,34,37], at least in the self-reported measurements. For example, we could alternatively write model (5) in the form
$$Q_{ij} = \alpha_Q + \beta_Q T_i + q_i + e_{Q_{ij}}, \qquad R_{ij} = T_i + r_i + e_{R_{ij}}, \qquad M_{ij} = \alpha_M + \beta_M T_i + e_{M_{ij}}. \tag{6}$$
Spiegelman et al. [33] discussed this model in detail. The terms $q_i$ and $r_i$ represent person-specific error in the FFQ and food record, respectively, and $e_{Q_{ij}}$ and $e_{R_{ij}}$ are random errors. Notice that there is no person-specific error term in the part of the model for the surrogate biomarker. When there are at least two measurements of each type ($J \geq 2$), this model is identified under the assumptions that $e_{M_{ij}}$ is independent of $e_{Q_{ij}}$ and $e_{R_{ij}}$ for all $j$ and of $q_i$ and $r_i$. The person-specific errors in FFQ and food record may be correlated ($\mathrm{corr}(q_i, r_i) \neq 0$) but are independent of all other error terms. The random error terms in FFQ and food record may be correlated for measurements made at the same time point ($\mathrm{corr}(e_{Q_{ij}}, e_{R_{ij}}) \neq 0$) but must be uncorrelated otherwise ($\mathrm{corr}(e_{Q_{ij}}, e_{R_{ik}}) = 0$, $j \neq k$).
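The link between the two parametrizations can be made concrete. If the combined error in an FFQ at time point $j$ is the sum of a person-specific term and a random term, and the random terms at different time points are independent (an assumption made for this sketch), then the person-specific term alone induces the correlation between combined errors in repeated FFQs. The following minimal simulation, with hypothetical variances and names, checks the implied correlation $\mathrm{var}(q_i)/\{\mathrm{var}(q_i) + \mathrm{var}(e_{Q_{ij}})\}$.

```python
import numpy as np

def combined_error_correlation(var_person, var_random):
    """Correlation between combined errors (person-specific + random) at two time points,
    assuming random parts are independent over time and have equal variance."""
    return var_person / (var_person + var_random)

rng = np.random.default_rng(2)
n = 200_000
var_person, var_random = 0.6, 0.9   # hypothetical variances of q_i and e_Qij

q = rng.normal(0.0, np.sqrt(var_person), n)        # person-specific error q_i
e = rng.normal(0.0, np.sqrt(var_random), (n, 2))   # random errors at time points 1 and 2
eps = q[:, None] + e                               # combined errors at the two time points

empirical = np.corrcoef(eps[:, 0], eps[:, 1])[0, 1]
expected = combined_error_correlation(var_person, var_random)
print(f"empirical correlation: {empirical:.3f}, implied correlation: {expected:.3f}")
```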
We can also extend the preceding models to include adjustment for covariates, which is required to estimate RDRs when the underlying exposure-disease model includes covariates (Section 2.4). We further discuss the use of covariates in the following.
For a situation where two or more repeated measurements of each of $Q$, $R$, and $M$ are available, Rosner et al. [34] proposed extending the surrogate biomarker model in (5) by including time-specific true intake, $T_{ij}$, denoting average daily intake for individual $i$ at time point $j$. Rosner et al. [34] separated the errors in measurements into person-specific and random components and also included covariates $Z_{ij}$. Their model (7) is of the form [...] where $q_i$, $r_i$, and $m_i$ are person-specific error terms and $e_{Q_{ij}}$, $e_{R_{ij}}$, and $e_{M_{ij}}$ are random error terms. This differs from the previous model in (6) in its inclusion of a person-specific error term for the surrogate biomarker. Different assumptions regarding error correlations are required when a person-specific error term is included for the surrogate biomarker. Model (7) is identified under the assumptions outlined as follows. We assume the random error terms $e_{X_{ij}}$ to be independent of $T_{ij}$ and $Z_{ij}$ and of each other. We allow person-specific errors in the self-reported measurements to be correlated ($\mathrm{corr}(q_i, r_i) \neq 0$) but assume them to be independent of $m_i$. Note that the random effect term $m_i$ allows for correlation in the errors between repeated surrogate biomarker measurements. We estimate the variance of $m_i$ at the cost of not allowing correlation between random errors in FFQ and food record made at the same time point; that is, it is assumed that $\mathrm{corr}(e_{Q_{ij}}, e_{R_{ij}}) = 0$. This is in contrast to the model in (6). Kipnis et al. [23], Rosner et al. [34], and Spiegelman et al. [33] summarized measurement error models that have been used for FFQs and food records using recovery or concentration biomarkers.

The extended model
In this section, we describe a model that extends that of Rosner et al. [34]. We describe the notation for a study with at least two measurements of each type for each individual. However, provided there are at least two surrogate biomarker measurements, we can calculate all relevant model parameters when FFQ and food record measurements are available at only one time point.
As in the previous discussion, we let $T_{ij}$ denote the average daily intake for individual $i$ at time point $j$, where the expected value of $T_{ij}$ is a long-term average daily intake, $T_i$. A longitudinal true exposure is appropriate because individual dietary intake clearly fluctuates somewhat over time, and depending on the length of follow-up during which dietary measurements were obtained, it may be desirable to allow for this. For individual $i$ at time point $j$, let $Z^{T}_{ij}$ denote a vector of covariates associated with the true dietary intake, and let $Z^{Q}_{ij}$, $Z^{R}_{ij}$, and $Z^{M}_{ij}$ denote vectors of covariates associated with errors in the three types of dietary measurement. The four sets of covariates may differ, and they may be time dependent. The longitudinal measurement error model with covariate adjustment, model (8), is [...]
In the following, we discuss the model further, including assumptions required for identifiability. Table AI in Appendix A shows how the model parameters are identified. In this paper, we choose not to separate the errors in the dietary measurements into person-specific and random parts, in contrast to some previously suggested models, including models (6) and (7). Appendix A outlines an alternative version of model (8), which includes person-specific error terms.
In (8), the implicit model for long-term average daily intake is [...] where $\bar{Z}^{T}_{i}$ denotes the long-term average covariate value for individual $i$. When the $Z^{T}_{ij}$ are time constant, the terms $h_{ij}$ are within-person error terms representing the deviation in true dietary intakes at time point $j$ from the long-term average daily intake, $T_i$. In general, the deviation in true intakes at time $j$ from the usual intake is [...] We assume that the $h_{ij}$ have zero mean, are independent of $T_i$ and of each other ($\mathrm{corr}(h_{ij}, h_{ik}) = 0$, $j \neq k$), and are independent of all covariates and of errors $\epsilon_{X_{ij}}$. We let $\sigma^2_{T|Z}$ denote the variance of $T_i$ and $\sigma^2_{h_j}$ the variance of $h_{ij}$. We therefore allow the variability in true dietary intakes to differ across time points, through $\sigma^2_{h_j}$. Different types of dietary measurements that are made at the same time point have additional correlation via the $h_{ij}$ term, compared with dietary measurements made at different time points.
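The extra same-time-point correlation induced by $h_{ij}$ can be checked numerically. The following is a minimal sketch, assuming a stripped-down version of the model with no covariates or intercepts, independent measurement errors, and $T_{ij} = T_i + h_{ij}$; all parameter values and variable names are hypothetical. Under these assumptions the covariance between an FFQ and a food record is $\beta_Q \beta_R (\sigma^2_T + \sigma^2_h)$ at the same time point and $\beta_Q \beta_R \sigma^2_T$ at different time points.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical parameter values (illustration only).
beta_q, beta_r = 0.6, 0.8
sd_t, sd_h = 1.0, 0.5

T = rng.normal(0.0, sd_t, n)               # long-term average intake T_i
h = rng.normal(0.0, sd_h, (n, 2))          # within-person deviations h_ij, j = 1, 2
T_time = T[:, None] + h                    # time-specific true intake T_ij

# Measurement errors are simulated as independent, so any extra
# same-time-point covariance comes from h_ij alone.
Q = beta_q * T_time + rng.normal(0.0, 1.0, (n, 2))
R = beta_r * T_time + rng.normal(0.0, 0.7, (n, 2))

cov_same = np.cov(Q[:, 0], R[:, 0])[0, 1]  # FFQ and food record, both at time 1
cov_diff = np.cov(Q[:, 0], R[:, 1])[0, 1]  # FFQ at time 1, food record at time 2

print(f"same time point:      {cov_same:.3f} (theory {beta_q * beta_r * (sd_t**2 + sd_h**2):.3f})")
print(f"different time point: {cov_diff:.3f} (theory {beta_q * beta_r * sd_t**2:.3f})")
```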

Scaling and intercept parameters.
The scaling parameters $\beta_X$ reflect errors in dietary measurements that are associated with the true intake. We assume these to remain the same over time, although the model allows different intercept terms so that the mean measurements may vary over time. As in the simpler models described earlier, under model (8), we cannot estimate one of the sets of parameters $(\alpha_{Q_j}, \beta_Q)$, $(\alpha_{R_j}, \beta_R)$, and $(\alpha_{M_j}, \beta_M)$, and it has previously been assumed that $\alpha_{R_j} = 0$, $\beta_R = 1$.
In this paper, we use sensitivity analyses to assess the effects of varying the scaling parameter $\beta_R$ on estimates of other parameters. We discuss the selection of suitable values for consideration in sensitivity analyses in Section 2.5. When $\beta_R$ takes a value other than 1, we do not attempt to also fix $\alpha_{R_j}$ because these parameters are highly dependent on the dietary exposure, with the consequence that $\alpha_{Q_j}$, $\alpha_{M_j}$, and $\alpha_T$ cannot be estimated. These parameters only affect the other intercept parameters.

Error terms.
In model (8), the error terms $\epsilon_{X_{ij}}$ combine both person-specific and random error into one. Our use of combined errors is to some extent a personal preference. However, this formulation for the errors also allows for a little more flexibility in how we define the correlations between errors in longitudinal measurements (Appendix A). This is at the expense, of course, of being able to study person-specific and random sources of error separately. We do not focus on this aspect in this paper. We now outline the features of the error terms in model (8). We assume the errors $\epsilon_{X_{ij}}$ to arise from a normal distribution with mean 0 and variance $\sigma^2_{X_j}$. We assume all errors to be uncorrelated with $T_{ij}$ and with all covariates. We use the following notation for correlations between errors in repeated FFQ and food record measurements given the covariates, where $Z^{X}_{i} = \{Z^{X}_{ij}\}$: [...]
The error correlation structure summarized in (9) allows the correlation between repeated self-reported measurements made at different time points to change over the course of follow-up and correlations between errors in FFQ and food record to differ according to whether they are made at the same time point or at different time points. There are two prices for estimating the error correlations in (9). These are that we cannot estimate correlations between errors in self-reported and biological measurements or correlation between errors in repeated biomarker measurements. These error correlations therefore need to be handled either by assuming that they are zero or by using sensitivity analyses. We make the assumption (10) that the correlation between errors in self-reported and biological measurements is 0: [...] Inclusion of important covariates in the measurement error model makes this a reasonable assumption.
For example, if we are concerned that men and women report dietary intake differently for the same true level of intake and also that sex affects the correlation between true intake and the surrogate biomarker, then conditioning on sex eliminates the concern. We denote by $\rho_{MM}$ the correlation between repeated surrogate biomarker error terms: [...] The inclusion of covariates $Z^{M}_{ij}$ in the extended model may reduce correlation between errors in the repeated surrogate biomarker measurements; however, this will be highly dependent on the surrogate biomarker in question. Many potential surrogate biomarkers are affected by intrinsic individual characteristics, for example, genetics, not easily captured in the covariate adjustment, and it seems unlikely that all sources of correlation between repeated surrogate biomarker measurements could be accounted for solely by true intake and the chosen set of covariates [2,29]. From the preceding discussion, we are not happy to assume that $\rho_{MM} = 0$, and we propose using sensitivity analyses to assess the effects of the value of $\rho_{MM}$. We discuss the choice of suitable values in Section 2.5.
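The error correlation structure described above can be written down as a matrix for a study with two time points. The following sketch assembles the correlation matrix of the errors in $(Q_{i1}, Q_{i2}, R_{i1}, R_{i2}, M_{i1}, M_{i2})$ and checks that a chosen combination of values is admissible, which is a useful sanity check before running a sensitivity analysis. The function names, the parameter names, and the numerical values are all hypothetical, and the sketch assumes that same-time-point correlations between FFQ and food record errors are equal at the two time points.

```python
import numpy as np

def error_correlation_matrix(rho_qq, rho_rr, rho_qr_same, eta_qr_diff, rho_mm):
    """Correlation matrix of errors ordered as (Q1, Q2, R1, R2, M1, M2).

    Errors in self-reported measurements (Q, R) are taken to be uncorrelated
    with errors in the biomarker (M), as assumed in the text.
    """
    C = np.eye(6)

    def put(i, j, value):
        C[i, j] = C[j, i] = value

    put(0, 1, rho_qq)          # repeated FFQs
    put(2, 3, rho_rr)          # repeated food records
    put(0, 2, rho_qr_same)     # FFQ and food record, same time point (time 1)
    put(1, 3, rho_qr_same)     # FFQ and food record, same time point (time 2)
    put(0, 3, eta_qr_diff)     # FFQ at time 1, food record at time 2
    put(1, 2, eta_qr_diff)     # FFQ at time 2, food record at time 1
    put(4, 5, rho_mm)          # repeated biomarker measurements
    return C

def is_valid_correlation(C):
    """True if C is positive definite, i.e. an admissible correlation matrix."""
    return bool(np.all(np.linalg.eigvalsh(C) > 0))

C_ok = error_correlation_matrix(0.5, 0.3, 0.4, 0.2, 0.4)
C_bad = error_correlation_matrix(0.9, 0.9, 0.9, -0.9, 0.4)
print(is_valid_correlation(C_ok), is_valid_correlation(C_bad))
```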

Covariates.
There are three main purposes for covariate adjustment in the measurement error model: to allow individual characteristics to be associated with different degrees of error in self-reported measurements; to make some of the model assumptions more realistic, as discussed in the previous section; and to enable us to use regression calibration when the diet-disease model is adjusted for potential confounders. Relating to the third point, we must include any confounders in the diet-disease model in $Z^{T}_{ij}$. We discuss this in detail in Section 2.4. It may also be of some secondary interest to investigate the effects of certain covariates on true intake, errors of reporting in the self-reported measurements, and error in the surrogate biomarker.
In model (8), we are unable to estimate the parameters $\gamma_R$ because of the requirement for identifiability that one of the three measurement types has systematic bias of a known form. We could fix the parameters $\gamma_R$ in additional sensitivity analyses, but it is difficult to fix a potentially large number of parameters about which we have little information. We can show that the value of $\gamma_R$ only affects parameters $\gamma_T$, $\gamma_Q$, and $\gamma_M$ (Table AI). We assume here that reporting in food record measurements $R_{ij}$ is not dependent on covariates, that is, $\gamma_R = 0$. Provided that the variables of concern are included in $Z^{M}_{ij}$, our inability to estimate $\gamma_R$ does not affect the assumption in (10).

Comparison with earlier models.
Finally, we summarize the main differences between our model and that of Rosner et al. [34]. Our primary extension is the promotion of the use of sensitivity analyses involving the scaling parameter $\beta_R$ and the biomarker error correlation $\rho_{MM}$. We allow a random term $h_{ij}$ in the model for $T_{ij}$, which allows for true individual intake to fluctuate over time. Inclusion of this term naturally allows for dietary measurements made at the same time point to be more highly correlated than those made at different time points. In our model, the error terms $\epsilon_{X_{ij}}$ combine both person-specific and random errors and are allowed to have different variances over time. We allow errors in FFQ and food record measurements to have different, presumably higher, correlations when made at the same time point than when made at different time points. By not allowing this, Rosner et al. [34] were able to include a person-specific error term in the part of the model for the surrogate biomarker. In model (8), we allow for different sets of covariates to feature in different parts of the model, where Rosner et al. [34] assumed just one set of covariates. We should take care in the choice of covariates because the omission of important covariates that are unknown or unmeasured could induce correlations among $T_i$, $h_{ij}$, and $\epsilon_{X_{ij}}$ ($j = 1, \ldots, J$) and between $Z^{T}_{ij}$ and $h_{ij}$ or $T_i$. Note that the use of covariates $Z^{T}_{ij}$ in the measurement error model (8) is not to provide a model for true intake but rather to allow the estimation of covariate-adjusted RDRs and to improve model assumptions.

Fitting the measurement error model
One approach to fitting model (8) is by maximum likelihood assuming a multivariate normal distribution for the dietary measurements conditional on the covariates. The measurements may need to be transformed to meet the assumption of multivariate normality. This method can be computationally intensive if there are many covariates. In the later example, we use the method proposed by Rosner et al. [34] in which we first obtain the residuals, $\tilde{X}_{ij}$, from linear regressions of $X_{ij}$ on $Z^{X}_{ij}$ and $Z^{T}_{ij}$ and then use $\tilde{X}_{ij}$ in place of $X_{ij}$ in model (8) but with the covariate terms omitted. We then estimate parameters $\gamma_T$ by fitting a mixed-effects linear regression of $R_{ij}$ on $Z^{T}_{ij}$ and dividing the resulting estimates by $\beta_R$. We estimate parameters $\gamma_Q$ by fitting a mixed-effects linear regression model with response variable $Q_{ij} - \beta_Q R_{ij}/\beta_R$ and explanatory variables $Z^{Q}_{ij}$. We estimate parameters $\gamma_M$ in a similar way. To perform these regressions, $\beta_Q$ and $\beta_M$ are replaced by their estimated values.
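The first step of the residual-based fitting method can be sketched as follows. This is a minimal illustration of obtaining residuals from an ordinary least-squares regression of one measurement on a covariate matrix; the function name `residualize`, the simulated covariates (sex and age), and all coefficients are hypothetical. The later mixed-effects regressions are not shown, and the sketch pools covariates into a single matrix rather than distinguishing the separate covariate sets.

```python
import numpy as np

def residualize(x, Z):
    """Residuals from an ordinary least-squares regression of x on Z (intercept added)."""
    design = np.column_stack([np.ones(len(x)), Z])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return x - design @ coef

rng = np.random.default_rng(3)
n = 1_000

# Hypothetical covariates and an FFQ-like measurement that depends on them.
sex = rng.integers(0, 2, n)
age = rng.uniform(45, 75, n)
Z = np.column_stack([sex, age])
ffq = 300 + 40 * sex - 1.5 * age + rng.normal(0, 50, n)

ffq_resid = residualize(ffq, Z)

# By construction the residuals have mean zero and are orthogonal to each covariate.
print(f"mean of residuals: {ffq_resid.mean():.2e}")
print(f"cross-products with covariates: {Z.T @ ffq_resid}")
```

The residuals would then stand in for the original measurements in a covariate-free version of the measurement error model.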
An alternative approach to fitting models of the kind described in the preceding sections is an estimating equations approach based on method-of-moments principles, as described by Spiegelman et al. [33]. We comment further on the two approaches in Section 3.4.
Usually, only a subset of participants contributing to the diet-disease analysis will have all measurements involved in the measurement error model. Some previous studies have fitted measurement error models using only the data from the subset of individuals with each of the dietary measurements at each time point under consideration, for example, from a validation study, which is usually a small proportion of the total study population [22,34]. Clearly, this is inefficient, and furthermore, it relies on the questionable assumption that measurements are missing completely at random. In this paper, we use a 'full-cohort' approach to fitting the measurement error models, including all individuals with an incomplete set of measurements. This likelihood-based analysis is valid and efficient under the weaker assumption that measurements are missing at random [38].

Correction for measurement error in diet-disease models
In this section, we outline how RDRs are estimated under the extended measurement error model. Model (8) allows the dietary exposure to be defined as dietary intake at a particular time point $j$, $T_{ij}$, or as the long-term average intake, $T_i$. Diet-disease models typically also adjust for potential confounders, say $Z^{C}_{ij}$, which are assumed to be measured without error. Under the method of regression calibration for a covariate-adjusted diet-disease model, we replace true intake in the diet-disease model by its expectation, conditional on both the observed error-prone measurement and $Z^{C}_{ij}$. The regression calibration model used to find this expectation may take one of the forms [...] where $X_{ij}$ denotes the main error-prone exposure measurement available for all individuals being used to estimate the diet-disease association and $Z^{C}_{ij}$ are the covariate measurements from the same time point. Typically, in large-cohort studies, $X_{ij}$ is an initial FFQ measurement, or in case-control studies, it may be a food record measurement. To reiterate, the RDRs $\lambda_{X_j T_j|Z}$ and $\lambda_{X_j T|Z}$ indicate the effect of measurement error in the observed dietary measurement on the estimated diet-disease association, for example, a log odds ratio, when the interest is in time-specific intake or long-term true intake, respectively. The Fibrinogen Studies Collaboration [39] has previously suggested the use of time-dependent measurement error corrections. Model (8) accommodates calculation of the RDRs in the preceding models for FFQ or food record measurements at each time point $j$, provided $Z^{C}_{ij}$ is a subset of $Z^{T}_{ij}$. We can show that our inability to estimate the parameters $\gamma_R$ in model (8) does not affect the RDRs $\lambda_{X_j T_j|Z}$ and $\lambda_{X_j T|Z}$. Estimates of correlations between dietary measurements and true intake are also informative. Note that we can write $\mathrm{corr}(Y_i, T_i \mid Z) = \mathrm{corr}(Y_i, X_{ij} \mid Z)/\mathrm{corr}(X_{ij}, T_i \mid Z)$, where $Y_i$ denotes an outcome of interest.
The correlations $\mathrm{corr}(X_{ij}, T_i \mid Z)$ therefore determine the power of a study to detect diet-outcome associations using $X_{ij}$. We denote unconditional correlations by $C_{X_j T_j} = \mathrm{corr}(X_{ij}, T_{ij})$ and $C_{X_j T} = \mathrm{corr}(X_{ij}, T_i)$ and conditional correlations by $C_{X_j T_j|Z} = \mathrm{corr}(X_{ij}, T_{ij} \mid Z^{C}_{ij})$ and $C_{X_j T|Z} = \mathrm{corr}(X_{ij}, T_i \mid Z^{C}_{ij})$.
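The two corrections discussed in this section are simple ratios, and a short sketch makes their use concrete. The function names and all numerical inputs below are hypothetical and serve only to show how an observed association changes across a range of RDR values; they are not results from any study.

```python
def correct_association(observed_estimate, rdr):
    """Regression calibration correction: observed estimate (e.g. log odds ratio) / RDR."""
    if rdr == 0:
        raise ValueError("RDR must be nonzero")
    return observed_estimate / rdr

def deattenuate_correlation(corr_outcome_measurement, corr_measurement_truth):
    """corr(Y, T | Z) = corr(Y, X | Z) / corr(X, T | Z)."""
    if corr_measurement_truth == 0:
        raise ValueError("corr(X, T | Z) must be nonzero")
    return corr_outcome_measurement / corr_measurement_truth

observed_log_or = -0.10   # hypothetical observed log odds ratio per unit of intake

# RDR < 1 corresponds to attenuation; RDR > 1 corresponds to amplification.
for rdr in (0.3, 0.5, 0.8, 1.1):
    corrected = correct_association(observed_log_or, rdr)
    print(f"RDR={rdr:.1f}  corrected log OR={corrected:.3f}")

print(f"de-attenuated correlation: {deattenuate_correlation(0.06, 0.4):.2f}")
```

Because the corrected estimate is inversely proportional to the RDR, an RDR that is estimated too low leads directly to an overcorrected association.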
Appendix B outlines the calculation of the RDRs and correlations for the situation in which

Sensitivity analyses: choosing values for $\beta_R$ and $\rho_{MM}$
We have proposed the use of sensitivity analyses to assess the effects of different values of $\beta_R$ and $\rho_{MM}$ on the estimated measurement error model, in particular on the RDRs. In this section, we use results from other studies to inform us about plausible values for these parameters.
Measurement error models for an FFQ, a food record, and a recovery biomarker have been used in a small number of studies to investigate the structure of error in self-reported measures of total energy intake, where the recovery biomarker is obtained using doubly labelled water [19], and intakes of protein, potassium, and sodium, for which urinary measurements provide recovery biomarkers [20,21]. Energy-adjusted protein ('protein density') has also been considered. The models used were similar to (4), thus providing estimates of $\beta_R$. Table I summarizes the estimates of $\beta_R$; they range from 0.34 to 0.81 across studies, nutrients, and type of food record. We could use the results from these studies to inform us about the range of values for $\beta_R$ that may be plausible in our sensitivity analyses.
We have come across only one study in which estimates of $\rho_{MM}$ have been obtained. Rosner et al. [34] fitted their model (7) to measures of vitamin C intake from repeated FFQ and 7-day diaries, with plasma vitamin C as the surrogate biomarker, using data on 323 individuals from the EPIC-Norfolk study. They assumed the 7DD measurement to have no scaling bias ($\beta_R = 1$). These authors chose to estimate $\rho_{MM}$ and instead assumed that correlations between errors in FFQ and 7DD measurements are the same whether or not the measurements were made at the same time point, that is, $\rho_{QR_j} = \eta_{Q_j R_k}$, $\forall j, k$. They considered raw and calorie-adjusted vitamin C intake, with adjustment for sex, age, body mass index (BMI), height, smoking status, and use of vitamin C supplements in all parts of the model. Without covariate adjustment, the estimates of $\rho_{MM}$ in models for raw and calorie-adjusted vitamin C intake were 0.54 and 0.57, respectively, and in the covariate-adjusted models, the corresponding estimates of $\rho_{MM}$ were 0.32 and 0.39. Given the evidence from recovery biomarker studies that $\rho_{QR_j} \neq \eta_{Q_j R_k}$, $\forall j, k$, this approach does not seem appropriate in general and, aside from eliciting expert advice for specific surrogate biomarkers, it remains unclear how values of $\rho_{MM}$ should be chosen for a sensitivity analysis. Using a surrogate biomarker model similar to (5), Wong et al. [32] performed sensitivity analyses using values $\rho_{MM} = 0, 0.2, 0.4$ in simulation studies.
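A sensitivity analysis of this kind can be sketched with a method-of-moments calculation under a deliberately simplified model. The assumptions, which are made for this illustration only and are not the fitting procedure used in the paper, are as follows: true intake $T_i$ is constant over time, there are no covariates, each of $Q$, $R$, and $M$ is linear in $T_i$ with its own scaling parameter, errors in $M$ are uncorrelated with errors in $Q$ and $R$, and errors in the two biomarker measurements have correlation $\rho_{MM}$ and equal variance. Then $\beta_M^2 \sigma_T^2 = \{\mathrm{cov}(M_1, M_2) - \rho_{MM}\,\mathrm{var}(M)\}/(1 - \rho_{MM})$, and the RDR for the FFQ is $\mathrm{cov}(Q, M)\,\mathrm{cov}(R, M)/\{\beta_R\, \beta_M^2 \sigma_T^2\, \mathrm{var}(Q)\}$. The data below are simulated with hypothetical parameter values.

```python
import numpy as np

def rdr_ffq(q, r, m1, m2, beta_r, rho_mm):
    """Method-of-moments RDR for the FFQ, for fixed beta_R and rho_MM (simplified model)."""
    var_m = 0.5 * (np.var(m1, ddof=1) + np.var(m2, ddof=1))
    cov_mm = np.cov(m1, m2)[0, 1]
    a = (cov_mm - rho_mm * var_m) / (1.0 - rho_mm)   # estimates beta_M^2 * var(T)
    cov_qm = np.cov(q, m1)[0, 1]
    cov_rm = np.cov(r, m1)[0, 1]
    return cov_qm * cov_rm / (beta_r * a * np.var(q, ddof=1))

rng = np.random.default_rng(4)
n = 400_000

# Hypothetical "true" values used to simulate the data.
beta_q, beta_r_true, beta_m, rho_mm_true = 0.5, 0.6, 0.8, 0.2
T = rng.normal(0.0, 1.0, n)
q = beta_q * T + rng.normal(0.0, 1.0, n)
r = beta_r_true * T + rng.normal(0.0, 0.7, n)
z1, z2 = rng.normal(size=n), rng.normal(size=n)
m1 = beta_m * T + z1                                                  # biomarker, time 1
m2 = beta_m * T + rho_mm_true * z1 + np.sqrt(1 - rho_mm_true**2) * z2  # biomarker, time 2

true_rdr = beta_q / (beta_q**2 + 1.0)   # cov(Q, T) / var(Q) with var(T) = 1

# Sweep over values of beta_R and rho_MM treated as fixed in the analysis.
for b in (0.4, 0.6, 0.8, 1.0):
    row = "  ".join(f"{rdr_ffq(q, r, m1, m2, b, rho):.2f}" for rho in (0.0, 0.2, 0.4))
    print(f"beta_R={b:.1f}:  {row}")

rdr_at_truth = rdr_ffq(q, r, m1, m2, beta_r_true, rho_mm_true)
rdr_conventional = rdr_ffq(q, r, m1, m2, 1.0, 0.0)
print(f"true RDR={true_rdr:.2f}, at true values={rdr_at_truth:.2f}, "
      f"with beta_R=1 and rho_MM=0: {rdr_conventional:.2f}")
```

In this simulated setting, fixing $\beta_R = 1$ and $\rho_{MM} = 0$ gives an RDR well below the true value, which would translate into an overcorrected diet-disease association; smaller values of $\beta_R$ or larger values of $\rho_{MM}$ move the estimate upwards, in some cases above 1.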

Illustration: plasma vitamin C as a surrogate biomarker for fruit and vegetable intake in EPIC-Norfolk
In this section, we apply the methods outlined in Section 2 to data on intake of fruit and vegetables in the EPIC-Norfolk study, using plasma vitamin C as the surrogate biomarker. EPIC-Norfolk is a cohort of 25 639 individuals recruited during 1993-1997 from a population of individuals aged 45-75 years in Norfolk, UK [40]. During follow-up, study participants were invited to attend health checks at which dietary intake was assessed using an FFQ and a 7DD and blood samples were provided. Data are currently available from two health checks. Briefly, the first FFQ was mailed to study participants and returned either before or at the date of the first health check, which took place shortly after recruitment. At the first health check, the first day of the diary was completed as a 24-h recall with a trained interviewer and the remainder completed during subsequent days. The second health check took place 3-4 years later, when the FFQ and 7DD were handed out and later returned by post. At each health check, measures of average daily intake of fruit and vegetables (g/day) were derived from the FFQ and 7DD, and plasma vitamin C (μmol/l) was measured within a few days of the blood sample being provided. Bingham et al. have described the dietary assessment methods in detail [41].

Use of plasma vitamin C as a surrogate biomarker
Bates et al. [29] reported that 'of all the vitamins, vitamin C exhibits possibly the strongest and most significant correlation between intake and biochemical indices, so that its intake can be predicted with moderate precision from the wide range of biological values that are encountered within the population of a Western country'. Approximately 80-90% of vitamin C intake is absorbed when intake is below 100 mg/day, and absorption saturates at around 140 mg/day [42]. Plasma vitamin C has been suggested as a suitable surrogate biomarker for fruit and vegetable intake, as fruits and, to a lesser extent, vegetables are major contributors to dietary vitamin C [43][44][45], which in turn is correlated with plasma vitamin C [29]. However, plasma vitamin C is affected not only by vitamin C intake but also by absorption, metabolism, and genetics [46] and by individual characteristics, including sex, age, smoking status, and BMI [29,43,47,48].

Covariates
In model (8), we allow true fruit and vegetable intake to depend on sex, age, BMI, smoking status, and education level, which is used as an indicator of social class. These were chosen firstly because they are thought to be associated both with true dietary intake and with measurement errors and secondly because in studies of diet-disease associations, they would commonly feature in the set of potential confounders; hence, a regression calibration that is conditional on these variables is of general interest. Intake of fruit and vegetables can also differ across seasons of the year [49]. $Z^T_{ij}$ is therefore a vector of covariates for sex, age, BMI, smoking status, education level, and season of measurement.
The vector $Z^Q_{ij}$ contains covariates for sex, age, BMI, smoking status, and education level, which have been associated with reporting of fruit and vegetable intake on FFQs [50-52]. Errors in 7DD measurements could depend on similar covariates [53,54], but the parameters $\gamma_R$ cannot be estimated.
As noted earlier, plasma vitamin C has been observed to be affected by sex, age, smoking status, and BMI. In our data, there was a significant upward shift in the plasma vitamin C measurements during the period over which the second health check took place (Table II), which may relate to changes in the laboratory over time; the cause is not clear, and we do not elaborate on this here, but we include an adjustment for month and year of measurement. In model (8), $Z^M_{ij}$ therefore denotes a vector of covariates for sex, age, BMI, smoking status, and month and year of measurement.
The provision to allow different sets of covariates to be associated with true intake, self-reported intake, and errors in the surrogate biomarker was motivated by wanting to allow true dietary intake to be seasonal, whereas it did not seem plausible that season would affect dietary measurement errors. Similarly, we wanted to allow true intake and FFQ reporting, but not errors in the surrogate biomarker, to be associated with education level. Age, BMI, smoking status, and season were recorded at both health checks.

Application of model (8)
The use of a longitudinal exposure $T_{ij}$ in model (8) was motivated by the long period (3-4 years) between the repeated dietary measurements in the EPIC-Norfolk study. The use of error correlations between dietary measurements, which may differ over time, was motivated by knowledge about the timing and ordering of the self-reported measurements and the thought that self-reporting errors may change over a long period.
We assume a multivariate normal distribution for the dietary measurements to fit model (8). Plasma vitamin C measurements are approximately normally distributed on the untransformed scale, whereas FFQ and 7DD measurements are approximately normally distributed on the log scale. For individual $i$ at health check $j$ ($j = 1, 2$), we let $Q_{ij}$ and $R_{ij}$ denote log-transformed FFQ and 7DD measurements, respectively, and $M_{ij}$ denote plasma vitamin C. Normality also holds approximately for the residuals after adjustment of $Q_{ij}$, $R_{ij}$, and $M_{ij}$ for covariates $(Z^T_{ij}, Z^Q_{ij})$, $Z^T_{ij}$, and $(Z^T_{ij}, Z^M_{ij})$, respectively. Estimated RDRs for FFQ and 7DD measurements apply to log-scale fruit and vegetable intake. If untransformed intake is of interest in the diet-disease model, then we can apply a 'back-transformation', which is outlined in Appendix C.
A total of 25 604 individuals have at least one of six measurements (FFQ, 7DD, or plasma vitamin C at the first or second health check), and all six measurements are available for 2000 individuals. Processing of diet diaries is extremely expensive and time-consuming and is ongoing. The analyses are based on 25 275 individuals with at least one dietary measurement and complete covariate information at times of dietary measurement. Table II summarizes the number of individuals with each of the six measurements, and the means and standard deviations of the measurements. We treat very high (above the 75th percentile plus two times the interquartile range) plasma vitamin C measurements (105 at the first health check and 106 at the second health check) as missing because these could be due to the use of vitamin C supplements [55], but quantitative data on supplement use were not available. The covariates, excluding month and year of plasma vitamin C measurements and season of measurement, are summarized in Table III. For each type of dietary measurement, means and variances differ significantly between health checks 1 and 2 (Table II). In model (8), we therefore allow for different intercepts $\alpha_{Q_j}$, $\alpha_{M_j}$, $j = 1, 2$, and different error variances $\sigma^2_{X_j}$, $j = 1, 2$. We performed sensitivity analyses using different fixed values for $\beta_R$ and $\rho_{MM}$. As recorded in Table I, Day et al. [22] found values of $\beta_R$ of 0.47, 0.69, and 0.81 for sodium, potassium, and protein, respectively. Sodium intake is thought to be badly measured by the 7DD in EPIC-Norfolk because the instructions provided with the diary were not clear regarding reporting of salt added at the table and during cooking. Higher $\beta_R$ values for protein and potassium may therefore be more plausible. It was not clear what may be suitable values for $\rho_{MM}$. We chose $\beta_R = 1, 0.75, 0.5$ and, following Wong et al. [32], $\rho_{MM} = 0, 0.2, 0.4$.
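The rule used above to treat very high plasma vitamin C values as missing can be sketched as follows. This is a minimal illustration only: the function name `mask_high_values`, the use of NumPy's default percentile interpolation, and the example values are assumptions, not details taken from the study.

```python
import numpy as np

def mask_high_values(values, iqr_multiplier=2.0):
    """Set values above Q3 + iqr_multiplier * IQR to NaN (treated as missing).

    Hypothetical helper; the percentile definition is NumPy's default
    (linear interpolation), which may differ from the one used in the study.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.nanpercentile(values, [25, 75])
    cutoff = q3 + iqr_multiplier * (q3 - q1)
    masked = values.copy()
    masked[values > cutoff] = np.nan
    return masked, cutoff

# Made-up plasma vitamin C values, for illustration only
plasma = np.array([30.0, 40.0, 50.0, 60.0, 70.0, 200.0])
masked, cutoff = mask_high_values(plasma)
print(cutoff, masked)
```

Masking rather than deleting the record keeps the individual's other measurements (FFQ, 7DD) available to the likelihood, which matches the description of treating these values as missing.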
Although plasma vitamin C is regarded as a promising candidate for use as a surrogate biomarker, as discussed before, it is affected not only by the individual characteristics, which can be accounted for in a set of covariates, but also by intrinsic individual differences.

Results
We show estimates of the derived conditional RDRs and correlations in Table IV, main model parameters in Table V, and parameters associated with covariates in Table VI.

Regression dilution ratios and correlations.
Note firstly that our large sample size has enabled us to obtain parameter estimates with high precision. The parameters of the measurement model depend strongly on the assumptions we make about $\beta_R$ and $\rho_{MM}$. As a consequence, the degree of correction for measurement error using RDRs is heavily dependent on the model assumptions (Table IV). Depending on what is assumed about both $\beta_R$ and $\rho_{MM}$, the RDRs range from 0.15 to 0.98 for FFQ measurements and from 0.13 to 1.02 for 7DD measurements, using the version of the RDR suitable for when long-term intake ($T_i$) is the main exposure in the diet-disease model. RDRs for time-specific exposure ($T_{ij}$) tend to be a little higher. We might expect this because we would expect FFQ and 7DD measurements to provide better measures of intake at the time of completion than the long-term average intake. Interestingly, the differences between the two types of RDRs are similar for FFQs and 7DDs, suggesting that FFQ reporting is also biased towards recent intake.
When the FFQ is the main measurement, the RDRs are similar at the two health checks, whereas for the 7DD, the RDRs are consistently somewhat higher at health check 2. The reasons for this are unclear. Given that the 7DD is generally considered to be a superior measurement to the FFQ, and has been found to be so in recovery biomarker studies, it is surprising to find that RDRs for 7DD and FFQ are very similar in this study. This suggests that the degree and type of measurement error in FFQ and 7DD may differ considerably across different foods and nutrients.
Correlations of dietary measurements with $T_{ij}$ are independent of $\beta_R$. As $\rho_{MM}$ increases, the correlations for FFQ and 7DD increase, and those for plasma vitamin C decrease. Correlations between FFQ measurements and $T_{ij}$ are similar at the two health checks, whereas those between 7DD measurements and $T_{ij}$ are higher at the second health check. 7DD measurements are more highly correlated with true intake compared with the FFQ measurements, indicating higher power to detect diet-disease associations, even though the RDRs are similar.

Main model parameters.
The value of $\beta_R$ affects only the scaling parameters $\hat{\beta}_Q$ and $\hat{\beta}_M$ and the variance of true intake, whereas $\rho_{MM}$ affects all parameter estimates except $\hat{\beta}_Q$ (Table V). The ratio $\hat{\beta}_Q / \beta_R$ is estimated to be 0.66 regardless of the assumptions. Estimated error variances for FFQ measurements made at health checks 1 and 2 are practically identical, whereas for 7DD measurements, the variance is higher at health check 1. This could be due to changes in the way individuals reported their dietary intake on the 7DD at the two health checks or to systematic changes in fruit and vegetable intake. Estimated error variances are greater for 7DD than for FFQ, perhaps because of the short-term nature of 7DDs. The estimated variability of errors in plasma vitamin C measurements is markedly higher at health check 2. Recall from Section 3.2 that observed plasma vitamin C measurements were more variable at the second health check.
High correlations between errors in repeated FFQ ($\hat{\rho}_{Q_1 Q_2}$) and 7DD ($\hat{\rho}_{R_1 R_2}$) suggest strong individual tendencies to make specific types of reporting error on both instruments. The result that $\hat{\rho}_{Q_1 Q_2} > \hat{\rho}_{R_1 R_2}$ suggests that person-specific errors are more likely to persist across repeated FFQs than across repeated food records. Estimated correlations between errors in the FFQ and 7DD suggest a tendency for individuals to make similar types of error on both instruments. Error correlations for FFQ and 7DD at the same health check ($\hat{\rho}_{QR_j}$) are higher than for measurements made at different health checks.

Covariates.
The association of covariates $Z^T_{ij}$ with true intake ($\hat{\gamma}_T$) is independent of $\rho_{MM}$ (Table VI). Increasing age, being female, and a higher level of education are associated with higher fruit and vegetable intake. Former smokers and, to a greater degree, current smokers have significantly lower fruit and vegetable intake compared with never smokers. Intake differs significantly across seasons, being highest in summer and lowest in winter. Estimates $\hat{\gamma}_Q$ are independent of both $\beta_R$ and $\rho_{MM}$, and greater reporting error in the FFQ is significantly associated with increasing age, higher BMI, and being female.
Estimates $\hat{\gamma}_M$ depend on $\rho_{MM}$ but not $\beta_R$. Conditional on true fruit and vegetable intake, increasing age, higher BMI, and being male are strongly associated with a lower plasma vitamin C level across all values of $\rho_{MM}$. When $\rho_{MM} = 0$ and 0.2, former and current smokers have higher plasma vitamin C given $T_{ij}$ compared with never smokers, but when $\rho_{MM} = 0.4$, current smokers have a statistically significantly lower plasma vitamin C level than former and never smokers given $T_{ij}$. Other evidence that smokers have lower plasma vitamin C for the same vitamin C intake compared with nonsmokers [48] leads us to suggest that $\rho_{MM} = 0$ and 0.2 may be too low. The situation is complicated, however, because the underlying exposure is fruit and vegetable intake, not vitamin C intake, and also by our inability to estimate parameters $\gamma_R$, the values of which affect $\hat{\gamma}_M$.
A comparison with analyses without covariate adjustment (results not shown) shows that adjustment for covariates $Z^Q_{ij}$ has only a very minor effect on the error variances $\sigma^2_{Q_j}$ and correlations between errors in FFQ and 7DD. However, adjustment for $Z^M_{ij}$ greatly reduced the error variances $\sigma^2_{M_j}$, suggesting that the covariates account for a substantial proportion of variability in plasma vitamin C conditional on true fruit and vegetable intake. The former finding suggests that, had we been able to estimate $\gamma_R$, the results, including estimates $\hat{\gamma}_M$, would be only slightly changed.

Which sensitivity values are most plausible?
It is worth considering what might be a plausible set of values for $\beta_R$ and $\rho_{MM}$. The earlier results suggest that $\rho_{MM} = 0.4$ may be most plausible because it gives the expected negative association between smoking and plasma vitamin C. Higher values of $\rho_{MM}$ brought significant difficulties in the maximum likelihood estimation, suggesting a poor fit. There is little in these results to favour one value of $\beta_R$ over another. In the recovery biomarker study of Day et al. [22] (Table I), $\beta_R = 0.75$ is the mean of the estimated $\beta_R$ values for potassium and protein intake and may be considered a suitable choice to favour here.
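As a small bookkeeping sketch, the grid of sensitivity values used in this illustration and the favoured combination can be enumerated as below. The variable names are hypothetical, and fitting the measurement error model at each grid point is not shown.

```python
from itertools import product

# Fixed values used in the sensitivity analyses (from the text)
beta_r_values = [1.0, 0.75, 0.5]  # scaling bias in the 7DD, beta_R
rho_mm_values = [0.0, 0.2, 0.4]   # correlation of biomarker errors, rho_MM

# beta_R estimates for potassium and protein reported from Day et al. [22]
day_potassium, day_protein = 0.69, 0.81
favoured_beta_r = round((day_potassium + day_protein) / 2, 2)
favoured_rho_mm = 0.4

# Every combination at which the model would be refitted
grid = list(product(beta_r_values, rho_mm_values))
for beta_r, rho_mm in grid:
    is_favoured = (beta_r, rho_mm) == (favoured_beta_r, favoured_rho_mm)
    flag = "  <- favoured" if is_favoured else ""
    print(f"beta_R = {beta_r:.2f}, rho_MM = {rho_mm:.1f}{flag}")
```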

Subset analysis
The use of vitamin C supplements could distort the association between fruit and vegetable intake and plasma vitamin C [55]. Although detailed data were not available on supplement use, a binary indicator of vitamin C supplement use has been created from responses to a health and lifestyle questionnaire in which 11 382 (45%) were identified as vitamin C supplement users. Plasma vitamin C is significantly higher among vitamin C supplement users: for example, at health check 1, mean plasma vitamin C was 49.0 μmol/l (SD 18.9) among nonusers and 57.8 μmol/l (SD 19.2) among users. We separately refitted model (8) for users and nonusers of vitamin C supplements.
The conditional RDRs from the preceding subset analyses are shown in Table VII for the case where $\beta_R = 0.75$ and $\rho_{MM} = 0.4$. RDRs are considerably lower for vitamin C supplement users than for nonusers because the estimated variability in true intake and in error correlations is lower among supplement users. Adjustment for vitamin C supplements using a binary indicator, as in [34], is not appropriate because of the large range of doses of vitamin C that individuals may receive from supplements. Future analyses may be able to adjust for vitamin C supplement dose in the measurement error model.

Summary of findings
We have outlined a measurement error model for self-reported dietary intake from FFQs and food records using a surrogate biomarker. We place particular emphasis on using sensitivity analyses to assess the impact of two assumptions that are usually made in measurement error models using surrogate biomarkers: that error in food record measurements is independent of true intake ($\beta_R = 1$) and that errors in repeated surrogate biomarker measurements are uncorrelated given true intake and covariates ($\rho_{MM} = 0$). The impact of assumptions about $\beta_R$ in models using surrogate biomarkers does not appear to have been considered previously. Wong et al. [32] appear to be the only other authors to have considered the effects on the RDR of different values for $\rho_{MM}$, although they did not give a practical application.
Our extended model was motivated by and illustrated using data from the EPIC-Norfolk study on fruit and vegetable intake, measured using FFQs and 7DD and using plasma vitamin C as the surrogate biomarker. To our knowledge, this is the first use of a surrogate biomarker model for foods rather than nutrients. The model was fitted using data from over 25 000 individuals across two time points.
We showed that the choice of values for $\beta_R$ and $\rho_{MM}$ can have severe consequences for the estimated RDRs and hence for conclusions about the effects of measurement error on observed diet-disease associations, for example, a log odds ratio. The RDRs for FFQ measurements at health check 1 ranged from 0.15 to 0.95: that is, the measurement error in FFQs could result in anything from a very large attenuation in the log odds ratio to almost no attenuation. Our results indicate that we would make a potentially large overcorrection for the effects of measurement error under the usual assumptions that $\beta_R = 1$ and $\rho_{MM} = 0$, which resulted in the lowest RDRs. Our results also suggest that for fruit and vegetable intake, the degree of measurement error in FFQs and 7DDs is similar, although the 7DD measurements are a little more highly correlated with true intake.
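To make the practical consequence concrete, the sketch below applies the standard regression calibration approximation, in which the coefficient estimated using the error-prone measurement is roughly the RDR times the coefficient for true intake, so that a corrected estimate is the observed estimate divided by the RDR. The function name and the observed odds ratio of 0.90 are hypothetical; only the two RDR values come from the range reported above.

```python
import math

def correct_log_odds_ratio(observed_log_or, rdr):
    """Divide an observed log odds ratio by the RDR (regression calibration).

    Hypothetical helper; ignores uncertainty in the RDR itself.
    """
    if rdr <= 0:
        raise ValueError("RDR must be positive for this correction")
    return observed_log_or / rdr

# Hypothetical observed odds ratio per unit of log intake
observed_log_or = math.log(0.90)

# Extremes of the RDR range reported for the FFQ at health check 1
for rdr in (0.15, 0.95):
    corrected = correct_log_odds_ratio(observed_log_or, rdr)
    print(f"RDR = {rdr:.2f}: corrected OR = {math.exp(corrected):.3f}")
```

The same observed association therefore maps to very different corrected associations at the two ends of the range, which is the sense in which an RDR that is too low produces an overcorrection.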

Using surrogate biomarkers in practice
In light of our results, one might consider what can be gained by using surrogate biomarkers to estimate measurement error models for self-reported dietary measurements. The use of surrogate biomarkers allows us to relax the assumptions that errors in FFQs and food records are independent and that errors in repeated measurements using the same instrument are independent, which in the past have been made in validation studies involving only FFQ and diet diary. However, this comes at the price of assuming that errors in surrogate biomarker measurements are independent of errors in self-reported measurements conditional on covariates, and an inability to estimate the correlation between errors in repeated surrogate biomarker measurements, $\rho_{MM}$. In the illustration of the model, we showed that different assumptions about the values of these parameters can result in RDRs covering a wide range. This is an important result; it suggests that we should be highly sceptical about results obtained under the usual assumptions and cautious about drawing firm conclusions about the degree of error in self-reported measurements unless more information can be obtained about the relationship between the surrogate biomarker in question and true intake. There may, however, be other examples in which the values of the sensitivity parameters do not have such an extreme effect on RDR estimates. For some surrogate biomarkers, we may be happy to assume that $\rho_{MM}$ is close to 0 given the use of a carefully chosen set of covariates $Z^M_{ij}$.
Experiments to better understand the association between dietary intake and biological measurements will be invaluable in learning more about the values of $\beta_M$ and $\rho_{MM}$. Tasevska et al. [56] defined a new class of biomarkers called predictive biomarkers. These are biomarkers intermediate between recovery and concentration biomarkers. A predictive biomarker has a relationship with true intake that is more complex than that of a recovery biomarker but that is relatively stable and relates to true intake in a dose-response manner [56,57] such that the relationship can be estimated from a feeding study. There currently appears to exist only one such biomarker, for total sugar intake, which has been used by Tasevska et al. [57] to fit models similar to that in (8) (although without some of our extensions) but where the parameters in the part of the model for the biomarker ($M$) were assumed known, having been estimated in a feeding study. In particular, this enables the estimation of intake-dependent error in the food record measurement, $\beta_R$. Our model in (8) could be applied directly in this situation, but without the requirement for sensitivity analyses. Like recovery biomarkers, predictive biomarkers refer to intake of nutrients, rather than to food groups, such as fruit and vegetables. For surrogate biomarkers for food groups, for example, plasma vitamin C, it may be possible to gain information about the possible range of suitable values to consider for $\beta_M$ and $\rho_{MM}$ for use in the measurement error model. Measurement error models such as that described in this paper will become more useful still when more is learnt about potential surrogate biomarkers and when more predictive biomarkers are developed.

Limitations
We outline here some potential limitations of model (8). We assume for identifiability that the terms $h_{ij}$ are independent across time. This would not be the case if unobserved covariates were associated with the value of $h_{ij}$. We fitted model (8) again to the EPIC-Norfolk data under the assumption that the correlation between $h_{i1}$ and $h_{i2}$ is 0.5, which gave RDRs practically identical to those in Table IV. The $h_{ij}$ are also assumed independent of the covariates; hence, the model does not allow for the possibility that certain groups of people may have more variability in their dietary intake than others. This assumption may not have been of great importance in the illustration because the estimated within-person variability in true fruit and vegetable intake was small compared with the between-person variability. A related matter is that the variability of errors in dietary measurements is assumed independent of covariates, whereas it seems plausible that error variability may be affected by individual characteristics. Both of these extensions could be incorporated into model (8), and this would result in RDRs being dependent on covariates.
For surrogate biomarkers, it may be the case that the slope $\beta_M$ in model (8) in fact also depends on individual characteristics and that a random slope may be appropriate, say $\beta_{Mi}$. We believe that it would not be possible to estimate such a parameter in this model. However, interactions between covariates $Z^M_{ij}$ and $T_{ij}$ in the part of the model for the surrogate biomarker could be incorporated. As before, this would result in RDRs being dependent on covariates.
As outlined in Section 2.3, the measurement error model was fitted by maximum likelihood, assuming a multivariate normal distribution for the dietary measurements. In this example, we used the self-reported dietary measurements on the log-transformed scale. A possible disadvantage of the maximum likelihood approach is that it requires that the dietary exposure is appropriately transformed and then used on the transformed scale in the outcome model of interest (Appendix C). As noted, an alternative approach to fitting the measurement error model is to use the estimating equations approach outlined by Spiegelman et al. [33], which is based on the method of moments. The advantages of the estimating equations approach are that no distributional assumptions are required and hence that the choice of scale in the outcome model is not dictated by this. However, the estimating equations approach still makes assumptions about the scale on which the error models are assumed to hold. If the assumptions of multivariate normality are met, then the maximum likelihood approach will offer some gains in efficiency relative to the estimating equations approach. One of our motivations for using the maximum likelihood approach was that under model (8), there are multiple solutions for some parameters under the method of moments, although Spiegelman et al. [33] did suggest a weighted combination of estimates for use in this situation. A comparison of the two approaches under different measurement error settings may be warranted.
Finally, the methods presented in this paper have focused on studies in which a single dietary exposure of interest is measured with error. Analyses in nutritional epidemiology often involve more than one dietary exposure, all of which are likely to be measured with error. In particular, associations are often adjusted for total energy intake. Rosner et al. [58] extended the method of regression calibration to enable multivariate measurement error correction. For this situation, multivariate versions of models such as those considered in this paper are required. These are possible but require assumptions regarding correlations between errors in self-reported measures of different foods or nutrients, and models incorporating biomarkers have scarcely been used [59]. This presents a major challenge for measurement error correction methods in nutritional epidemiology. See, for example, Thompson et al. [17] and Zhang et al. [60] for multivariate measurement error models not incorporating biomarkers. Day et al. [61] investigated the potential implications of correlated errors and other factors in multivariate measurement error models. In the context of using a recovery biomarker, Carroll et al. [62] have considered multivariate measurement error models, although their focus was on using measurements of intake of multiple correlated nutrients to gain precision in a univariate regression calibration.

Conclusions
Estimation of RDRs, and therefore correction of diet-disease associations for the effects of measurement error, is highly sensitive to model assumptions. Depending on the assumptions made, we may conclude that the observed association between fruit and vegetable intake and disease is either grossly underestimated or even somewhat overestimated when using FFQs or food records as the main measurement. The common assumptions that the food record measurement is not subject to scaling bias ($\beta_R = 1$) and that errors in repeated surrogate biomarker measurements are independent ($\rho_{MM} = 0$) may result in a large overcorrection for measurement error. We should take extreme care when interpreting results from estimated measurement error models and the resulting corrected diet-disease associations. The use of sensitivity analyses seems vital to understanding the potential impact of error in self-reported dietary measurements on observed diet-disease associations. Gaining further knowledge of how potential surrogate biomarkers are associated with true dietary intake and development of more predictive biomarkers are essential for increasing our understanding of how error in dietary measurements affects observed diet-disease associations and for obtaining good estimates of the true effects of diet on health outcomes.

There is more than one set of estimates using the method of moments. *We cannot estimate the intercepts $\alpha_T$, $\alpha_{Q_j}$, and $\alpha_{M_j}$ unless $\alpha_{R_j}$ is specified, and we write these terms as unknown constants here.
Statist. Med. 2013, 32 3838-3861

$\Sigma_{Z^T_j}$ denotes the variance-covariance matrix for the vectors of covariates $Z^T_{ij}$, $\Sigma_{Z^Q_j}$ denotes the variance-covariance matrix for the vectors of covariates $Z^Q_{ij}$, and $\Sigma_{Z^T_j Z^Q_j}$ is the matrix of covariances between covariates $Z^T_{ij}$ and $Z^Q_{ij}$ with $(k, l)$th term $\mathrm{cov}(Z^T_{kij}, Z^Q_{lij})$. The corresponding RDRs using $T_i$, $\lambda_{Q_j T \mid Z}$ and $\lambda_{Q_j T}$, are calculated in the same way as for $T_{ij}$ but with $\sigma^2_{h_j}$ omitted in the numerators, and using $\bar{Z}^T_i$, the mean of $Z^T_{ij}$ across time points, in place of $Z^T_{ij}$. We find the RDRs for the food records by replacing $Q$ by $R$ and noting that we assume that $\gamma_R = 0$.
Conditional and unconditional correlations of FFQ measurements with $T_{ij}$ are denoted $C_{Q_j T_j \mid Z}$ and $C_{Q_j T_j}$, respectively. We calculate $C_{Q_j T \mid Z}$ in the same way as $C_{Q_j T_j \mid Z}$ but with $\sigma^2_{h_j}$ omitted in the numerator and $C_{Q_j T}$ in the same way as $C_{Q_j T_j}$ but with $\sigma^2_{h_j}$ omitted everywhere except in $\mathrm{var}(Q_{ij})$. We find the conditional and unconditional correlations of $T_i$ and $T_{ij}$ with $R_{ij}$ and $M_{ij}$ by replacing $Q$ by $R$ or $M$ and noting that $\gamma_R = 0$. We approximated variances of RDRs and correlations using the delta method. In the case of the unconditional RDRs and correlations, we assumed that estimates of $\gamma_T$, $\gamma_Q$, and $\gamma_M$ are independent of all other parameter estimates to obtain the variance estimates.
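The delta method mentioned above can be sketched generically as follows. This is an illustration under stated assumptions, not the calculation used for the tables: the function `delta_method_variance`, the numerical gradient, and the example of a simple ratio of two independent estimates are all hypothetical stand-ins for the actual RDR and correlation expressions.

```python
import numpy as np

def delta_method_variance(g, theta_hat, cov, eps=1e-6):
    """Approximate var(g(theta_hat)) by grad' * cov * grad.

    The gradient is taken by central differences, so g only needs to be
    a smooth scalar function of the parameter vector.
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    grad = np.empty_like(theta_hat)
    for k in range(theta_hat.size):
        step = np.zeros_like(theta_hat)
        step[k] = eps
        grad[k] = (g(theta_hat + step) - g(theta_hat - step)) / (2 * eps)
    return float(grad @ np.asarray(cov, dtype=float) @ grad)

def ratio(theta):
    """Hypothetical derived quantity: a ratio of two parameters."""
    return theta[0] / theta[1]

# Made-up estimates and standard errors, assumed independent
theta_hat = np.array([0.3, 0.6])
cov = np.diag([0.01 ** 2, 0.02 ** 2])
var_ratio = delta_method_variance(ratio, theta_hat, cov)
print(var_ratio)
```

A numerical gradient keeps the sketch short; with closed-form RDR expressions, analytic derivatives would normally be preferred.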

Appendix C. Measurement error correction on a transformed scale
Suppose that to fit the chosen measurement error model by maximum likelihood, a transformation, say $h(\cdot)$, is applied to the main dietary measurement $X$. The regression calibration model is therefore of the form $h(T) = \lambda_0 + \lambda_{XT} h(X) + e$, say, where $h(X)$ is normally distributed. Conditional on $X$, it follows that $h(T)$ is normally distributed with mean $\lambda_0 + \lambda_{XT} h(X)$ and variance $\sigma^2_e$, the variance of the residuals $e$. To perform a regression calibration based on dietary intake on the untransformed scale, we need to find the expectation $E(T \mid X)$ either by integration or by using a Taylor approximation. When $h(x) = \log(x)$, the required expectation can be found exactly and is given by

$$E(T \mid X) = \exp\{\lambda_0 + \lambda_{XT} \log(X) + \sigma^2_e / 2\},$$

where $\lambda_{XT}$ is the RDR found using transformed measurements and $\lambda_0$ and $\sigma^2_e$ can be estimated using the estimated parameters of the fitted measurement error model. We can therefore estimate the true diet-outcome association by fitting the model

$$E(Y \mid T) = g\left(\theta_0 + \theta_1 \exp\{\lambda_0 + \lambda_{XT} \log(X) + \sigma^2_e / 2\}\right),$$

where, for example, for a linear model, $g(x) = x$ and for a logistic model, $g(x) = e^x / (1 + e^x)$. It follows that $\theta_1$ can also be estimated by fitting the diet-outcome model using $X^{\lambda_{XT}}$ as the exposure measurement and then dividing the resulting parameter estimate by $\exp(\lambda_0 + \sigma^2_e / 2)$. This procedure requires estimation of $\lambda_0$, which depends on an estimate of $\alpha_T$, which cannot be estimated unless $\alpha_{R_j}$ is specified alongside $\beta_R$ in the sensitivity analyses. Calculating the variance of the resulting estimate to account for the variability in the estimates of $\lambda_0$, $\lambda_{XT}$, and $\sigma^2_e$ is more difficult than in the case where no back-transformation is required. We could obtain an estimate by bootstrapping.
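The back-transformation for the log scale can be checked with a small simulation. The sketch below assumes a linear outcome model and treats the calibration parameters (called `lam0`, `lam_xt`, and `sigma2_e` here) as known; all numerical values are made up, and in practice these parameters would be estimated from the measurement error model.

```python
import numpy as np

def back_transformed_slope(x, y, lam0, lam_xt, sigma2_e):
    """Estimate the outcome slope for untransformed intake (linear model).

    Regress y on x ** lam_xt, then divide the fitted slope by
    exp(lam0 + sigma2_e / 2), as in the procedure described above.
    """
    exposure = x ** lam_xt
    slope, _intercept = np.polyfit(exposure, y, 1)
    return slope / np.exp(lam0 + sigma2_e / 2)

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical calibration and outcome model parameters
lam0, lam_xt, sigma2_e = 0.5, 0.6, 0.09
theta0, theta1 = 1.0, 2.0

# Simulate log(T) = lam0 + lam_xt * log(X) + e, then a linear outcome in T
x = np.exp(rng.normal(1.0, 0.5, n))
t = np.exp(lam0 + lam_xt * np.log(x) + rng.normal(0.0, np.sqrt(sigma2_e), n))
y = theta0 + theta1 * t + rng.normal(0.0, 1.0, n)

theta1_hat = back_transformed_slope(x, y, lam0, lam_xt, sigma2_e)
print(theta1_hat)
```

Under these assumptions the recovered slope should be close to the value of `theta1` used in the simulation; a bootstrap over the whole procedure would be one way to obtain its variance.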