Keywords:

  • Chinese;
  • preference-based measure;
  • SF-6D;
  • standard gamble;
  • validity

ABSTRACT

Objectives:  The SF-6D is a preference-based measure of health (PBMH) derived from the SF-36 for economic evaluation. The aim of this study was to find out whether it was feasible, acceptable, reliable, and valid to use the standard gamble (SG) method to generate preference-based values for the SF-6D in a Chinese population.

Methods:  The SF-6D was translated into Chinese by forward and backward translations. Forty-nine states defined by the SF-6D were selected using an orthogonal design and grouped into seven sets. An age-sex stratified sample of 126 Chinese adults with low education levels valued a set of 7 and the pits (worst) SF-6D health states by the SG method. The data were modeled at the individual and mean levels to predict preference values for all SF-6D states. The quality of data and the predictive power of the models were compared with results from the United Kingdom.

Results:  All respondents completed the interviews; 3% found the process very difficult and 21% felt some degree of irritation or boredom. A total of 907 SG valuations (90% of 1008 observations) were useable for econometric modeling. There was no significant change in the test–retest values from 21 subjects. The mean-level main effects model achieved a good fit, with a mean absolute error of 0.054. Some differences between the Chinese and UK preference coefficients were found, especially in the physical functioning dimension. The range of SG values predicted by the HK function is slightly wider, with the pits state having a value of 0.152 compared to 0.271 in the UK.

Conclusion:  It was feasible, acceptable, reliable, and valid to value the SF-6D with the SG method in a Chinese population with relatively low education levels. The results supported the feasibility and validity of valuing PBMH in Asian populations. Further studies are required to determine whether the differences in the SF-6D scoring algorithms between the British and Chinese populations are important.


Introduction

There is increasing demand for the use of health-related quality of life (HRQOL) as a preference-based measure of health (PBMH) in health economic analyses to assist decision-making on health policy and resource allocation [1–4]. The Quality Adjusted Life Year (QALY) is recommended for use in cost-effectiveness analyses of medical interventions by both the National Institute for Clinical Excellence in the United Kingdom [5] and the Panel on Cost-effectiveness in Health and Medicine of the US Public Health Service [2]. To calculate QALYs, HRQOL needs to be valued on the full health to dead scale based on preferences obtained from the general population, where full health is one and being dead is zero. It is possible for health states to have a negative value where respondents regard them as worse than being dead [2,3,6,7].

A common way to derive a preference-based index in clinical studies has been to use a PBMH [8,9]. A PBMH has two components: a multidimensional health state classification that describes a patient's health in terms of a level on each dimension and a set of preference-based weights obtained from members of the general population. Widely used examples of PBMH include the EQ-5D [10,11], HUI [12,13], and the SF-6D [14,15]. These have usually been valued in their country of origin, though there is increasing interest in obtaining values from other countries [12,16,17].

The MOS 36-item Short-Form Health Survey (SF-36) is a widely used HRQOL measure [18,19]. Brazier et al. in the United Kingdom extracted items from the SF-36 to form the SF-6D Health Survey, which can be used to generate six-dimension (6D) multiattribute health states for preference-based valuation [14,15]. The SF-6D offers a method for deriving a preference-based index from any SF-36 data set. A preference-based scoring algorithm for the SF-6D was successfully derived from the general population in the UK by the standard gamble (SG) method. The SF-36 has been validated in Chinese populations in Hong Kong, Taiwan, and Singapore [20–22]. There were significant differences between the Chinese normative scores and those of the United States and UK, which suggested cultural differences in social expectations and values [22,23]. It is likely that significant differences in the preferences for different health states also exist, and the UK SF-6D preference-based values may not be applicable to the Chinese.

Two questions need to be answered before the SF-6D can be applied to Chinese populations. The first is whether Chinese people can generate preference-based values for multiattribute SF-6D health states, and the second is whether valuation of a representative sample of the SF-6D states could be modeled to produce a scoring algorithm for all possible SF-6D states. SG is the preferred method for preference-based valuation of health but most studies were carried out in Western populations, and little is known about the feasibility, reliability, and validity of this measurement method in Asian populations including the Chinese.

The aim of this pilot study was to find out whether it was feasible, acceptable, reliable, and valid to value SF-6D health states with SG in a Chinese population of relatively low education levels. If the results were positive, PBMH such as the SF-6D could be adapted to the world's largest ethnic group who can then be included in global and multiethnic pharmacoeconomic studies.

Methods

Subjects

Chinese adults aged 18 to 75 years who were registered with a primary care practice in Hong Kong were stratified into six age-sex groups (male or female aged 18–40, 41–60, or 61–75 years). A computer-generated random list was produced for each group, and a research nurse contacted people in list order to check their eligibility and willingness to participate in the study. People were excluded if they could not be reached after three telephone calls, refused to participate, could not communicate in Cantonese, or were not ambulatory. People who were illiterate, defined by self-reported inability to read the newspaper, were also excluded after the first 2 weeks of the study because they were found to have great difficulty in completing the exercises.

A total of 126 subjects, 21 from each of the six age-sex groups, were recruited from 867 people identified from the lists, giving an overall response rate of 14.5%. Most nonresponse was due to noncontact. We did not undertake a formal sample size calculation. However, previous experience with the SF-6D and more recent work in valuing the HUI2 have demonstrated that 15 values per state are sufficient to estimate an additive model. Three subjects from each group valued each of the seven sets of health states, giving a total of 18 subjects per health state. A convenience sample of 21 subjects representing different age-sex groups repeated the interview 4 to 8 weeks after the first interview.

The sociodemographic characteristics of the subjects are shown in Table 1. Compared with the general population [24], the study sample was older because it was age-stratified. It also had lower education levels (17 subjects had no formal schooling) and fewer professionals/associate professionals reflecting the patient population of a Government subsidized primary care practice, and fewer people were single because the subjects were older. The sample included eight (6.3%) illiterate subjects who were recruited in the first 2 weeks of the study.

Table 1.  Sociodemographic characteristics of subjects

                                       All subjects   Test–retest subjects   HK general adult population*
                                       (n = 126)      (n = 21)               (n = 5,148,653)
Mean (SD) age in years                 48 (17.6)      47 (17.5)              42.3
Male/female (%)                        50/50          48/52                  49/51
Education level (%)
 No formal education                   13.5           9.5                    8.4
 1–6 years                             28.6           23.8                   23.7
 7–13 years                            43.6           42.9                   52.7
 >13 years                             14.3           23.8                   15.2
Social class by occupation (%)
 Professional/assistant professional   5.6            4.8                    19.7
 Skilled workers                       36.5           38.1                   32.1
 Semi skilled/unskilled workers        34.1           38.1                   27.8
 Not in paid employment                23.8           19.1                   20.4
Marital status (%)
 Single                                23.8           19.0                   31.5
 Married or living as married          67.5           66.7                   60.6
 Divorced or separated                 3.2            0                      1.9
 Widowed                               5.6            14.3                   5.9

  • * Data from 2001 Hong Kong population Census. The distribution of occupations was calculated from 4,085,731 people whose occupation data were available.

  • Percentages may not add up to 100 because of rounding.

Data Collection Procedure

A Chinese version of the SF-6D Health Survey was developed by single forward and backward translations by professional translators. The English back translation was evaluated by the developer (Brazier) of the SF-6D, who confirmed that it was equivalent to the original. For such a large descriptive system, with 18,000 combinations of dimension levels, it is only possible to value a sample of the states. To select the states for valuation we used the orthoplan procedure in SPSS [25], which identifies an orthogonal main effects design that permits the estimation of an additive model to predict the value of all states defined by the SF-6D. Forty-nine states were randomly selected subject to three criteria: every level of each dimension was included at least once, different states contained different combinations of dimension levels, and the states selected had occurred in existing SF-36 data sets. These 49 states were grouped into seven sets. Each set consisted of seven health states of different levels of severity. A six-digit number represents each SF-6D health state; each digit denotes the level of one of the six SF-6D dimensions, in the following order from left to right: physical functioning (PF), role limitation (RL), social functioning (SF), bodily pain (BP), mental health (MH), and vitality (VT). The details of the structure and classification of the SF-6D health states are described in the article by Brazier et al. [15].
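As an illustration of the six-digit notation, the following Python sketch decodes a state code and counts the states in the descriptive system. The number of levels per dimension is not stated explicitly above; the values used here are an assumption read off the pits state 645655 (the worst level of each dimension), and the names SF6D_LEVELS, parse_state, and count_states are hypothetical.

```python
from math import prod

# Levels per dimension, in the left-to-right digit order used in the text.
# Assumption: the maxima are inferred from the pits state 645655.
SF6D_LEVELS = {"PF": 6, "RL": 4, "SF": 5, "BP": 6, "MH": 5, "VT": 5}


def parse_state(code: str) -> dict[str, int]:
    """Decode a six-digit SF-6D state code into {dimension: level}."""
    if len(code) != len(SF6D_LEVELS) or not code.isdigit():
        raise ValueError(f"expected six digits, got {code!r}")
    state = {}
    for (dim, max_level), digit in zip(SF6D_LEVELS.items(), code):
        level = int(digit)
        if not 1 <= level <= max_level:
            raise ValueError(f"{dim} level {level} is outside 1..{max_level}")
        state[dim] = level
    return state


def count_states() -> int:
    """Number of distinct states defined by the classification."""
    return prod(SF6D_LEVELS.values())


print(parse_state("645655"))
print(count_states())
```

Under these assumed level counts the product is 18,000, which agrees with the number of combinations quoted above.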

Three rounds of seven interview sessions were carried out from December 2002 to January 2003. In each session one set of health states was used and six subjects (one from each age-sex group) were interviewed individually. The seven sets of health states were used in rotation to minimize the interviewer learning effect. Each subject completed the following steps in order: (1) answered the Chinese version of the SF-6D; (2) ranked 10 health states (the allocated set of 7, the best and pits [worst] SF-6D health states, and death); (3) valued by SG each of the set of 7 states, in a random order to reduce bias from the order effect, and the pits/death SF-6D health states, using visual prop charts as shown in Figure 1; (4) answered a structured questionnaire on demography; and (5) evaluated the interview experience. The details of the ranking and SG procedures are described in the article by Brazier et al. [15].

Figure 1. Visual props for standard gamble exercises.

This study used the same interview protocol and schedule as the UK study. The interview schedule was translated into Chinese. One modification was made to the UK method by the addition of symbols to the health state cards to indicate the level of wellness of each SF-6D dimension, as shown in Figure 2, to help respondents interpret each health state better. The SG variant was that developed by the McMaster team that uses an iterative “ping pong” procedure [9,13].

Figure 2. An example of one of the Chinese SF-6D health state cards (003) with symbol aids (The original English descriptions are shown in brackets.). Notes: The maximum number of levels of each dimension is indicated by the total number of symbols, and the level of each dimension is indicated by the number of symbols colored with a larger number indicating a better level. Circles instead of stars were used for the dimension on role-limitation to highlight its difference from the other dimensions in that it differentiates between the limitation from physical and mental health.

All study instruments were in Chinese and administered by a trained interviewer in Cantonese. The interviewer rated the subject's understanding and concentration in the interview at the end of the interview.

Outcome Measures and Data Analysis

Feasibility was assessed by the completion rate of the interviews, proportion of states with useable values, duration of the interview, interviewer ratings on the subject's understanding and concentration, and subject ratings on the degree of difficulty of the exercises, quality of answers and the number of health dimensions considered before making the choice. Acceptability was assessed by subject ratings on the amount of effort and degree of irritation or boredom.

Test–retest reliability of the SF-6D health state preferences of 21 subjects was analyzed by the mean difference between test and retest results (statistical significance tested by paired t-test) and by the intraclass correlation coefficient (ICC).

The validity of the valuations was tested by the fitting of data into econometric models, model predictive ability and consistency of model coefficients with the ordinality of the SF-6D. The results were compared with those found in the UK population.

Data were not useable if the respondent valued all states the same, valued fewer than two states, or failed to value the pits state. The pits health state was valued on the full health-dead scale, where full health is one and dead is zero; it could take a negative value bounded by −1. Each respondent's values for the seven intermediate health states were then chained onto this scale using the respondent's value for the pits state. These adjusted SG values form the dependent variable (y) in the models discussed below.
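The chaining step is not written out here. The following is a minimal sketch, assuming the adjustment commonly described for the UK SF-6D valuation, SG_adj = SG + (1 − SG) × P, where SG is a state's value on the pits-to-full-health scale and P is the respondent's pits value on the dead-to-full-health scale. Both the formula and the function name chain_to_full_scale are assumptions, not taken from this article.

```python
def chain_to_full_scale(sg: float, pits: float) -> float:
    """Rescale an intermediate-state SG value onto the dead (0) to full health (1) scale.

    sg   -- value of the state on the pits (0) to full health (1) scale
    pits -- the respondent's value for the pits state, between -1 and 1
    Assumed formula: sg + (1 - sg) * pits.
    """
    if not 0.0 <= sg <= 1.0:
        raise ValueError("sg must lie between 0 and 1")
    if not -1.0 <= pits <= 1.0:
        raise ValueError("pits must lie between -1 and 1")
    return sg + (1.0 - sg) * pits


# A state valued halfway between pits and full health, with pits valued at 0.2.
print(chain_to_full_scale(0.5, 0.2))
```

With this form, a state valued as equivalent to full health (SG = 1) keeps a value of 1 whatever the pits value, and a state valued as equal to the pits takes the pits value.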

Modeling

All respondents with useable data were included whether or not they had missing values. Models have been estimated at the mean level; that is, the explanatory variables were used to estimate the mean value given to each of the states by the respondents that valued them. Models have also been estimated at the individual level that takes into account the variation across respondents using a random effects (RE) model. The modeling methods are the same as for the UK study and for a fuller account the reader should consult the article by Brazier et al. [15].

The explanatory variables have been classified into two groups. First, a set of binary dummy variables (xδλ) that describe each level λ and dimension δ of the health state. For example, x31 denotes dimension δ = 3 (social functioning), level λ = 1 (health limits social activities none of the time). For any given health state, xδλ will be defined as:

  • xδλ = 1 if, for this state, dimension δ is at level λ, and

  • xδλ = 0 if, for this state, dimension δ is not at level λ.

In all cases level λ = 1 acts as the baseline for each dimension.
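The dummy coding can be sketched in Python as follows. The level counts per dimension are an assumption inferred from the pits state 645655, and the names LEVELS, dummy_names, and state_dummies are hypothetical. Because level 1 is the baseline, no dummy is created for it.

```python
# (dimension, number of levels) in the digit order of the state code.
# Assumption: level counts are inferred from the pits state 645655.
LEVELS = (("PF", 6), ("RL", 4), ("SF", 5), ("BP", 6), ("MH", 5), ("VT", 5))


def dummy_names() -> list[str]:
    """Names of all dummies; level 1 of each dimension is the omitted baseline."""
    return [f"{dim}{lvl}" for dim, n in LEVELS for lvl in range(2, n + 1)]


def state_dummies(code: str) -> dict[str, int]:
    """Return the 0/1 dummy variables for a six-digit state code."""
    if len(code) != len(LEVELS) or not code.isdigit():
        raise ValueError(f"expected six digits, got {code!r}")
    x = dict.fromkeys(dummy_names(), 0)
    for (dim, n_levels), digit in zip(LEVELS, code):
        level = int(digit)
        if not 1 <= level <= n_levels:
            raise ValueError(f"{dim} level {level} is outside 1..{n_levels}")
        if level > 1:  # baseline level contributes no dummy
            x[f"{dim}{level}"] = 1
    return x


active = [name for name, value in state_dummies("321122").items() if value]
print(active)
```

State 111111 sets every dummy to zero, so its predicted value is the intercept alone.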

Second, there is a binary dummy variable (the “MOST” term) to take account of any additional effect on health state value when one or more dimensions of health are at the “most severe” level. “Most severe” is defined as level 3 for physical functioning, levels 3 and 4 for role limitation and levels 4 and 5 for social functioning, pain, and mental health, and level 5 for vitality.

In both cases the intercept in the regression model is restricted to equal unity. In theory this term represents the value of full health, i.e., when each dimension of the health state is at level 1. However, in practice estimates of the intercept are usually less than one [15]. The techniques employed in the valuation survey are based on the assumption that state 111111 is equal to one and death is equal to zero. For state 111111 to hold any other value would change the scale, hence the restriction is imposed.

The models presented in this article were estimated by ordinary least squares (OLS) for the mean level model, with the constant forced through unity, and by maximum likelihood for the RE model, with the “MOST” term included to account for interactions. Explanatory power for the OLS model is expressed in terms of an adjusted R-squared. However, the overall aim is to predict health state values, and this has been assessed in terms of the mean absolute error (MAE) and the proportion of predictions outside 0.05 and 0.10 ranges on either side of the actual value. Predictions were further tested in terms of bias (t-test). All analyses were carried out in STATA 8 and SPSS for Windows 12.0.
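The predictive-ability measures named above can be computed as in the following Python sketch. The function name prediction_summary and the example values are hypothetical and are not data from this study.

```python
def prediction_summary(observed: list[float], predicted: list[float]) -> dict[str, float]:
    """Mean absolute error and share of predictions more than 0.05 / 0.10 from the observed value."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    abs_errors = [abs(o - p) for o, p in zip(observed, predicted)]
    n = len(abs_errors)
    return {
        "mae": sum(abs_errors) / n,
        "share_ae_gt_0.05": sum(e > 0.05 for e in abs_errors) / n,
        "share_ae_gt_0.10": sum(e > 0.10 for e in abs_errors) / n,
    }


# Illustrative values only: three observed mean state values and model predictions.
print(prediction_summary([0.80, 0.60, 0.50], [0.78, 0.68, 0.62]))
```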

Results

Feasibility and Acceptability

All 126 respondents completed all parts of the questionnaire, giving a completion rate of 100%. Three subjects gave the same value for all the health states, suggesting that they probably did not understand the process. The mean time for completing the whole interview was 48.9 min (SD 17.8; range 20 to 145). Table 2 shows the interviewer and subject evaluations of the ranking and SG exercises. There was good concordance between the interviewer ratings and subject evaluations. Most subjects seemed to be able to perform and concentrate on the tasks. The majority (77.8%) of subjects said that they considered three or more dimensions in the SG decision, indicating that it was feasible to generate a preference-based value for a multiattribute health state. Although 3.2% of subjects found the task very difficult and 24.6% found it a bit difficult, very few (1.6%) thought the quality of their answers was poor.

Table 2.  Feasibility and acceptability of standard gamble (SG) and ranking exercises

                                          Interviewer rating
                                          Problem in performing task (%)   Effort and concentration (%)
Subject evaluation                        None     Some     A lot          Great    Some     Little
(proportion of subjects, N = 126)         (77.8)   (18.3)   (4.0)          (87.3)   (9.5)    (3.2)
Challenge level of task
 Easy (35.7%)                             27.0     8.7      0              29.4     5.6      0.8
 Neutral (36.5%)                          32.5     2.4      1.6            33.3     3.2      0
 Difficult (27.8%)                        18.3     7.2      2.4            24.6     0.8      2.4
Tried best to answer
 Yes (99.2%)                              77.8     17.5     4              86.5     9.5      3.2
 No (0.8%)                                0        0.8      0              0.8      0        0
Number of dimensions considered in SG
 One (2.4%)                               0.8      1.6      0              2.4      0        0
 Two (5.6%)                               3.2      2.4      0              4.0      1.6      0
 ≥Three (77.8%)                           67.5     9.5      0.8            73.0     4.8      0
 Not sure (14.3%)                         6.3      4.8      3.2            7.9      3.2      3.2
Quality of answers
 Very good/good (42%)                     36.5     5.6      0              38.9     3.2      0
 Average (56.4%)                          41.3     11.9     3.2            47.6     5.6      3.2
 Poor (1.6%)                              0        0.8      0.8            0.8      0.8      0
Felt bored or irritated
 Yes (21.4%)                              17.5     2.4      1.6            17.5     3.2      0.8
 No (78.6%)                               60.3     15.9     2.4            69.8     6.3      2.4

  • Percentages may not add up to total because of rounding.

The process was acceptable to most subjects with only 27 (21.4%) reporting some degree of irritation or boredom, which occurred during the ranking exercise in 12 (9.5%) and the SG exercises in 13 (10.3%) of them. Two subjects who felt irritated or bored could not decide when this occurred.

Test–Retest Reliability

The ranking of the best health state card as the top was consistent in both interviews for all 21 respondents. On the other hand, five (24%) subjects reversed the order of the pits and death cards between the first and second interviews (three ranked death the lowest in the first interview but the pits health state lowest in the second interview; and two ranked the pits health state the lowest in the first interview but death the lowest in the second interview). There were 158 paired health state values for the assessment of the test–retest reliability after exclusion of the 10 unpaired pits/death health card values. The mean difference was −0.026 (95% CI −0.069 to 0.017), which was not statistically significant by the paired t-test (t = −1.174, P = 0.242). The ICC was 0.787 (95% CI 0.708 to 0.844), which was above the standard of 0.7 for group comparison [26].
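For readers who wish to reproduce this kind of check, a minimal Python sketch of the mean paired difference and paired t statistic is given below. The direction of the difference (first interview minus second), the function name paired_t, and the example values are assumptions for illustration and are not the study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first: list[float], second: list[float]) -> tuple[float, float]:
    """Return (mean difference, t statistic) for paired values, with n - 1 degrees of freedom."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two pairs of equal-length data")
    diffs = [a - b for a, b in zip(first, second)]
    se = stdev(diffs) / sqrt(len(diffs))  # standard error of the mean difference
    if se == 0:
        raise ValueError("all differences are identical; t is undefined")
    d_bar = mean(diffs)
    return d_bar, d_bar / se


# Illustrative values only: three states valued at two interviews.
d_bar, t = paired_t([0.5, 0.6, 0.7], [0.6, 0.6, 0.9])
print(round(d_bar, 3), round(t, 3))
```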

SF-6D Valuations

Each of the 126 subjects valued seven SF-6D health states and pits/death, resulting in 1008 observations (882 nonpits values and 126 pits/death values). The health state values were evenly distributed between the 49 health states generated by orthoplan. Fourteen subjects (three of whom valued all states the same) failed to provide values for the pits state, and this resulted in further state values being excluded because they could not be adjusted (although for states regarded as equivalent to full health this does not prevent the values from being used because pits adjustment makes no difference). In all there were 907 (90%) useable health state values. Table 3 shows the mean values and number of useable values of the 49 randomly selected and pits health states of the SF-6D. The mean health state values ranged from 0.094 for pits (645655) to 0.95 for the best state (211111) included in the study.

Table 3.  Mean SG values of the SF-6D health states in HK Chinese

SF-6D state   Mean        SD           Number of subjects
211111        0.9511029   0.19557202   17
121212        0.8480469   0.23562106   16
133132        0.8182500   0.23258485   15
332411        0.7767917   0.24650219   15
321122        0.7697794   0.26932578   17
113411        0.7636765   0.30580276   17
241531        0.7625000   0.24747517   16
312332        0.7547794   0.27352096   17
213323        0.7519853   0.27595822   17
412152        0.7353676   0.25749270   17
122233        0.7239338   0.27763649   17
232111        0.7219531   0.27640473   16
132524        0.6859559   0.31837740   17
235224        0.6848162   0.26826189   17
124125        0.6800000   0.27896747   17
341123        0.6733594   0.22667374   16
142154        0.6632353   0.32026724   17
221452        0.6466912   0.29377558   17
135312        0.6364844   0.26115737   16
421314        0.6340625   0.18641994   16
425131        0.6296094   0.27767702   16
414522        0.6275000   0.28420531   17
534113        0.6181618   0.28724381   17
522321        0.6086719   0.25416918   16
511114        0.6086458   0.36760367   18
115653        0.5987500   0.30147878   17
545422        0.5955882   0.37378321   17
432621        0.5918750   0.32314588   14
431443        0.5856985   0.34849662   17
443215        0.5760833   0.22909511   15
144341        0.5757500   0.22548846   15
633122        0.5682031   0.28411558   16
212145        0.5577344   0.29843209   16
131542        0.5571094   0.29907347   16
622513        0.5631618   0.35196994   17
523551        0.5542500   0.37587927   15
323644        0.5525735   0.34145898   17
334251        0.5425781   0.30739445   16
122425        0.5353125   0.31822375   16
611221        0.5315809   0.36160194   17
512242        0.5307031   0.29561446   16
111621        0.5264063   0.28821428   16
631355        0.5263235   0.30488102   17
224612        0.5147500   0.21962031   15
642612        0.4946875   0.28224675   16
315515        0.4931250   0.26787357   16
614434        0.4864167   0.39980823   15
531635        0.4612500   0.22514926   15
625141        0.3375000   0.24341323   15
645655*       0.0935268   0.41353434   112

  • * The pits state (645655) was valued by all subjects but data were missing from 14 subjects.

  • SG, standard gamble.

The skewness in the data can be seen at the individual level. A histogram of the 907 individual health state values is shown in Figure 3. It shows that negative values did occur but that, as in the UK, a large proportion of values (20%) lay between 0.9 and 1.0. There were few valuations at 1.0 (37/907), indicating that most respondents were willing to risk a worse health state to have a chance of a better state.

Figure 3. Histogram for adjusted standard gamble (SG) valuations (n = 907). Notes: In total, 907 (90%) of 1008 observations from 126 subjects each of whom valued seven SF-6D health states and pits/death were useable after exclusion of data from 101 observations for which the adjusted value could not be calculated because the subjects did not value pits. Each column indicates the number of observations that had SG scores within the range bounded by the column.

Modeling

The results of the RE and OLS modeling of the valuation data are presented in Table 4, including the beta coefficients estimated for each dimension level, model predictive ability (MAE and number of absolute errors greater than 0.05 or 0.10), and the number of inconsistent preference-based coefficients, compared with results from the UK. All the beta coefficients had the expected negative sign in the models and 21 out of 23 were significant at the 10% level in the RE model. The MAE in the RE model was 0.089 compared to 0.078 in the UK. The only significant inconsistency was found between MH2 (−0.069) and MH3 (−0.042) in the RE model; MH3 is hypothesized to be a lower health state level than MH2 (and so should have a larger negative coefficient), but the reverse was found in the model. The UK model had four such inconsistencies. As in the UK, there was evidence of some bias (t ≠ 0) in the predictions of the RE model. The results of the mean level OLS model were better in terms of model predictive ability, as indicated by a lower MAE (0.054) and lower proportions with absolute errors greater than 0.05 or 0.10, and were slightly better than the UK results. There was also one inconsistency, this time between MH5 (−0.095) and MH4 (−0.128), in that MH5 had a smaller negative coefficient although it was hypothesized to be a lower level of mental health than MH4.

Table 4.  SF-6D health state HK and UK models*

                  RE model (constant forced to unity)    OLS mean model (constant forced to unity)
                  Hong Kong (1)     UK (2)               Hong Kong (3)     UK (4)
PF2               −0.020            −0.058               −0.060            −0.060
PF3               −0.045            −0.051               −0.073            −0.020
PF4               −0.066            −0.088               −0.099            −0.060
PF5               −0.132            −0.061               −0.157            −0.063
PF6               −0.219            −0.160               −0.232            −0.131
RL2               −0.032            −0.056               −0.065            −0.057
RL3               −0.009            −0.076               −0.053            −0.068
RL4               −0.049            −0.078               −0.067            −0.066
SF2               −0.030            −0.066               −0.052            −0.071
SF3               −0.027            −0.048               −0.036            −0.084
SF4               −0.083            −0.066               −0.113            −0.093
SF5               −0.115            −0.109               −0.131            −0.105
PAIN2             −0.048            −0.042               −0.075            −0.048
PAIN3             −0.022            −0.046               −0.068            −0.034
PAIN4             −0.081            −0.055               −0.082            −0.070
PAIN5             −0.092            −0.103               −0.103            −0.107
PAIN6             −0.155            −0.178               −0.183            −0.181
MH2               −0.069            −0.043               −0.069            −0.057
MH3               −0.042            −0.055               −0.037            −0.051
MH4               −0.138            −0.115               −0.172            −0.121
MH5               −0.146            −0.125               −0.098            −0.140
VIT2              −0.024            −0.040               −0.026            −0.094
VIT3              −0.062            −0.030               −0.031            −0.069
VIT4              −0.076            −0.040               −0.060            −0.069
VIT5              −0.117            −0.087               −0.137            −0.106
N                 907               3518                 50                249
adj R2            †                 †                    0.952             0.508
Inconsistencies§  1                 4                    1                 5
MAE               0.089             0.078                0.054             0.074
AE > 0.05         34/50 (68%)       122/249 (49%)        20/50 (40%)       118/249 (47%)
AE > 0.10         20/50 (40%)       59/249 (24%)         8/50 (16%)        52/249 (21%)
t                 −8.01             −6.717               ‡                 ‡

  • * Coefficients shown in bold in the original table are significant at t0.10.

  • † No R2 statistics (GEE estimation).

  • ‡ Mean is zero by definition.

  • § Inconsistencies, number of significant coefficients whose rank order is not consistent with that of the health state levels.

  • adj R2, adjusted R2; AE, absolute errors; GEE, generalized estimating equations; MAE, mean absolute error; OLS, ordinary least squares; RE, random effects.

The SF-6D preference value of a health state can be calculated by adding the (negative) coefficients of the state's dimension levels to 1. Using the OLS mean model, the minimum SF-6D value is 0.152 with the HK function, which is lower than the 0.271 obtained with the UK function.
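The calculation can be illustrated with the following Python sketch, which uses the OLS mean model coefficients from columns (3) and (4) of Table 4. The function name sf6d_value is hypothetical, and the sketch includes no “MOST” term because Table 4 lists no coefficient for it.

```python
# Digit order of the state code, using the dimension labels of Table 4.
DIMS = ("PF", "RL", "SF", "PAIN", "MH", "VIT")

# OLS mean model coefficients from Table 4, column (3) Hong Kong.
HK_OLS = {
    "PF2": -0.060, "PF3": -0.073, "PF4": -0.099, "PF5": -0.157, "PF6": -0.232,
    "RL2": -0.065, "RL3": -0.053, "RL4": -0.067,
    "SF2": -0.052, "SF3": -0.036, "SF4": -0.113, "SF5": -0.131,
    "PAIN2": -0.075, "PAIN3": -0.068, "PAIN4": -0.082, "PAIN5": -0.103, "PAIN6": -0.183,
    "MH2": -0.069, "MH3": -0.037, "MH4": -0.172, "MH5": -0.098,
    "VIT2": -0.026, "VIT3": -0.031, "VIT4": -0.060, "VIT5": -0.137,
}

# OLS mean model coefficients from Table 4, column (4) UK.
UK_OLS = {
    "PF2": -0.060, "PF3": -0.020, "PF4": -0.060, "PF5": -0.063, "PF6": -0.131,
    "RL2": -0.057, "RL3": -0.068, "RL4": -0.066,
    "SF2": -0.071, "SF3": -0.084, "SF4": -0.093, "SF5": -0.105,
    "PAIN2": -0.048, "PAIN3": -0.034, "PAIN4": -0.070, "PAIN5": -0.107, "PAIN6": -0.181,
    "MH2": -0.057, "MH3": -0.051, "MH4": -0.121, "MH5": -0.140,
    "VIT2": -0.094, "VIT3": -0.069, "VIT4": -0.069, "VIT5": -0.106,
}


def sf6d_value(code: str, coefficients: dict[str, float]) -> float:
    """Add the coefficients of the state's dimension levels to 1 (level 1 adds nothing)."""
    if len(code) != len(DIMS) or not code.isdigit():
        raise ValueError(f"expected six digits, got {code!r}")
    value = 1.0
    for dim, digit in zip(DIMS, code):
        if digit != "1":
            # A KeyError here means the level does not exist for this dimension.
            value += coefficients[f"{dim}{digit}"]
    return value


print(round(sf6d_value("645655", HK_OLS), 3))
print(round(sf6d_value("645655", UK_OLS), 3))
```

Applied to the pits state 645655, the two coefficient sets reproduce the minimum values of 0.152 (HK) and 0.271 (UK) quoted above.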

Discussion

The results of this pilot study supported the feasibility, reliability, and validity of using the SG method to generate preference-based health values for the SF-6D in a Chinese population of relatively low education levels. SG is the recommended method of preference-based health valuation because it is most consistent with the original von Neumann-Morgenstern theory [9]. However, there was concern about whether SG could be applied to people with low education levels because it requires respondents to think in abstract terms of probability [27]. Our study sample included a large proportion of people who had less than 6 years of formal education, with 17 (13.5%) of them having none and eight (6.3%) being illiterate, all but three (2.4%) of whom were able to complete the exercises and give valid values. Lo et al. also used the SG method to generate preference values for the rhinitis symptom utility index in Chinese patients with allergic rhinitis, but their subjects were younger (aged 18–61 years) and better educated (74% had >6 years' education) [28].

We found that the use of symbols to indicate the relative level of each dimension on the health state card was effective in helping subjects with low literacy levels to grasp the condition described. On the other hand, the use of symbols could introduce bias. The choice of symbols was particularly challenging for the SF-6D, which has different numbers of levels for different dimensions. We used the same “star” symbol for all dimensions except that of role-limitation (RL) in order not to highlight any particular dimension. A different symbol in the form of two rows of circles, one for physical and one for mental health effect, was used for the RL dimension because its composition was very different from those of other dimensions. It was explained very clearly to the subjects that a level with more “stars” in one dimension did not imply it was better than a level in another dimension that had fewer “stars,” and each dimension level should be considered in the context of the total number of levels in that particular dimension.

The proportion (90%) of useable data in our study was higher than that obtained in the UK study (70%) [15], mainly because we used only two very experienced interviewers who ensured that respondents valued the pits/death state along with all the others. There was a wide range of health state values, although most were within a range of 0.4–0.9, which showed that subjects were discriminating between health states. The mean value of the pits health state (0.094) was much lower than that found in the UK (0.21). On the other hand, a lower proportion of subjects (14/126 = 11%) than in the UK (166/611 = 28%) thought the pits health state was worse than death [15]. This indicates possible differences in health preferences between the two cultures and deserves further investigation.

As in the UK, the mean health state values were broadly consistent with the health levels of the SF-6D. It was interesting to note that state 321122 (0.7697794) had a higher value than state 232111 (0.7219531) even though the former had more dimensions with lower levels than the latter. This showed the complexity of preference-based health valuation in that one dimension (e.g., role limitation) might be valued much more highly than other dimensions and the relative trade-offs between different health dimensions might not be the same at different levels.

The performance of the HK model compared very favorably with that of the UK in terms of its predictive ability measured by the MAE and the number of absolute errors greater than 0.05 or 0.10. The results supported the validity of preference-based valuation by SG of the SF-6D in a Chinese population for the generation of scoring algorithms applicable to the Chinese. More sophisticated models could be tested with data obtained from a larger sample of health states.

There was only one inconsistency between the model coefficients and the SF-6D hypothesized levels in each model: between levels 4 and 5 of the mental health (MH) dimension for the mean OLS model, and between MH2 and MH3 for the RE model. The number of inconsistencies was smaller than that found in the UK, further supporting the validity and quality of the data from the Chinese population. This is very encouraging given the smaller size of the HK sample compared to the UK.

The HK Chinese SF-6D preference-based coefficients found in this study tended to be larger (bigger discount in preference) than those found in the UK, but this finding was not consistent across all dimensions or levels. Nearly all the Chinese coefficients in the physical functioning (PF) and bodily pain (BP) dimensions were larger than those found in the UK, and some of the differences were quite large (>0.05). On the other hand, the Chinese coefficients for the vitality (VT) dimension tended to be smaller than those found in the UK. The range of the SF-6D values predicted by the HK function is wider, suggesting that it might have less floor effect than the UK algorithm. However, this result should be interpreted with some caution because it is based on a small pilot sample. Further studies are needed to confirm the population differences and determine whether they are important.

Limitations

The feasibility and validity of generating preference values with the SG method provide only indirect evidence of the validity of the concept of PBMH in the Chinese population. Further qualitative exploratory studies are needed to confirm the concept of PBMH and whether it has the same meaning in Chinese culture as in Western culture.

The study sample was small and the response rate was low, which limit the generalizability of the preference values found. The order of rotation of the seven sets of health states was not randomized, and we cannot exclude a possible order effect, although it should be minimal. The small number of states valued might affect the accuracy of the econometric modeling. Therefore, the SF-6D preference-based coefficients generated from this pilot study should not be regarded as representative of the general population of HK or any other Chinese population. Further studies with a larger and more representative sample from the general population are needed to determine the HK Chinese-specific SF-6D scoring algorithm.

The ranking exercise was carried out before the SG exercises in this study to familiarize subjects with the different health states and to encourage them to think in terms of relative preferences [9]. Unfortunately, it became an obstacle in the study process, especially for people who could not read. As a result, we had to exclude illiterate subjects from the study after the first 2 weeks, even though they were able to carry out the SG. The exclusion of people with low literacy levels could bias the valuation results because people from different socioeconomic backgrounds may have different health preferences. Further studies should be carried out to find out how illiterate people could be included in preference valuation studies and whether omitting the ranking exercise would affect the results of SG.

Conclusion

This pilot study confirmed that it was feasible, acceptable, reliable, and valid to generate preference values with the SG method for the SF-6D in a Chinese population with relatively low education levels. The results are very encouraging, and suggest that the concept of PBMH and preference measurement by SG may also be applicable to other Asian populations with low education levels.

The performance of the econometric models derived from the Chinese data compared favorably with that of the models obtained in the United Kingdom. The findings support the applicability of the SF-6D to Chinese populations, which has considerable potential because its parent HRQOL measure, the SF-36, is widely used in this ethnic group, which makes up nearly a quarter of the world's population.

Some differences were found in the SF-6D preference-based coefficients between the Chinese and UK populations. Further studies are required to establish the Chinese-specific SF-6D scoring algorithm and to find out whether population differences in preference values are clinically important.

We would like to thank Daisy Chou, June Chau, and Annie Cheung for their help in subject recruitment, data collection and analysis. All authors contributed to the planning of the study, supervision of the data collection, data analysis, and writing of the article.

Source of financial support: The study was funded by a seed grant from the Center on Behavioral Health, the University of Hong Kong. The study was approved by the Ethics Committee of the Faculty of Medicine, the University of Hong Kong (EC 1757-01).

References

  • 1
    Russell LB, Gold MR, Siegel JE, et al. The role of cost-effectiveness analysis in health and medicine. JAMA 1996;276:1172–7.
  • 2
    Weinstein MC, Siegel JE, Gold MR, et al. Recommendations of the Panel on Cost-Effectiveness in Health and Medicine. JAMA 1996;276:1253–8.
  • 3
    Weinstein MC, Stason WB. Foundations of cost-effectiveness analysis for health and medical practices. N Engl J Med 1977;296:716–21.
  • 4
    Patrick DL, Erickson P. Applications of health status assessment to health policy. In: Spilker B, ed. Quality of Life and Pharmacoeconomics in Clinical Trials. Philadelphia, PA: Lippincott-Raven Publishers, 1996.
  • 5
    National Institute for Clinical Excellence (NICE). Guide to the Methods of Technology Appraisal. London: National Health Services, 2004.
  • 6
    Feeny DH, Torrance GW, Labelle R. Integrating economic evaluations and quality of life assessments. In: Spilker B, ed. Quality of Life and Pharmacoeconomics in Clinical Trials. Philadelphia, PA: Lippincott-Raven Publishers, 1996.
  • 7
    Palmer S, Byford S, Raftery J. Types of economic evaluation. BMJ 1999;318:1349.
  • 8
    Torrance GW. Utility approach to measuring health-related quality of life. J Chronic Dis 1987;40:593–600.
  • 9
    Torrance GW, Feeny DH. Utilities and quality-adjusted life years. Int J Technol Assess Health Care 1989;5:559–75.
  • 10
    EuroQol Group. EuroQol - a new facility for the measurement of health-related quality of life. Health Policy 1990;16:199–208.
  • 11
    Brooks R, EuroQol Group. EuroQol: the current state of play. Health Policy 1996;37:53–72.
  • 12
    Furlong WJ, Feeny DH, Torrance GW, et al. The Health Utilities Index (HUI) system for assessing health-related quality of life in clinical studies. Ann Med 2001;33:375–84.
  • 13
    Furlong W, Feeny D, Torrance GW, et al. Guide to Design and Development of Health State Utility Instrumentation. Centre for Health Economics and Policy Analysis Paper 90–9. Hamilton, ON: McMaster University, 1990.
  • 14
    Brazier J, Usherwood T, Harper R, et al. Deriving a preference-based single index from the UK SF-36 Health Survey. J Clin Epidemiol 1998;51:1115–28.
  • 15
    Brazier J, Roberts J, Deverill M. The estimation of a preference-based measure of health from the SF-36. J Health Econ 2002;21:271–92.
  • 16
    Tsuchiya A, Ikeda S, Ikegami N, et al. Estimating an EQ-5D population value set: the case of Japan. Health Econ 2002;11:341–53.
  • 17
    Badia X, Roset M, Herdman M, et al. A comparison of United Kingdom and Spanish general population time trade-off values for EQ-5D health states. Med Decis Making 2001;21:7–16.
  • 18
    Ware JE, Snow KK, Kosinski M, et al. SF-36 Health Survey Manual & Interpretation Guide. Boston: The Health Institute, New England Medical Center, 1993.
  • 19
    Ware JE. SF-36 health survey update. Spine 2000;25:3130–9.
  • 20
    Lam CLK, Gandek B, Ren XS, et al. Tests of scaling assumptions and construct validity of the Chinese (HK) version of the SF-36 Health Survey. J Clin Epidemiol 1998;51:1139–47.
  • 21
    Thumboo J, Fong KY, Machin D, et al. A community-based study of scaling assumptions and construct validity of the English (UK) and Chinese (HK) SF-36 in Singapore. Qual Life Res 2001;10:175–88.
  • 22
    Tseng H, Lu JF, Gandek B. Cultural issues in using the SF-36 Health Survey in Asia: results from Taiwan. Health Qual Life Outcomes 2003;1:72–80.
  • 23
    Lam CLK, Lauder IJ, Lam TP, et al. Population based norming of the Chinese (HK) version of the SF-36 Health Survey. H K Pract 1999;21:460–70.
  • 24
    Census & Statistics Department Hong Kong. Main Tables of the 2001 Population Census. Hong Kong: Census & Statistics Department, HKSAR, 2002;1346.
  • 25
    SPSS Inc. SPSS for Windows 12.0. Chicago, IL: SPSS Inc, 2003.
  • 26
    McDowell I, Newell C. The theoretical and technical foundations of health measurement. In: McDowell I, Newell C, eds. Measuring Health: A Guide to Rating Scales and Questionnaires. New York: Oxford University Press, 1996.
  • 27
    Torrance GW. Social preferences for health states: an empirical evaluation of three measurement techniques. Socio-Econ Plan Sci 1976;10:129–36.
  • 28
    Lo PSY, Tong MCF, Revicki RA, et al. Rhinitis symptom utility index (RSUI) in Chinese subjects: a multiattribute patient-preference approach. Qual Life Res 2006;15:877–87.