Development of a work instability scale for rheumatoid arthritis




To explore the concept of work instability (WI), a state in which the consequences of a mismatch between an individual's functional abilities and the demands of his or her job could threaten continuing employment if not resolved, in people with rheumatoid arthritis (RA). To develop the Work Instability Scale (WIS).


WI in people with RA was explored through qualitative interviews, which were then used to generate items for the WIS.


Through Rasch analysis and validation against a gold standard of expert vocational assessment, a short 23-item, self administered, RA WIS was developed.


The WIS can be scored in 3 bands indicating low, medium, and high risk of work disability.


Work disability (WD), premature work cessation, is a common outcome in rheumatoid arthritis (RA). It has been reported that approximately one-third of patients will not be working 3 years following diagnosis (1), although the figure may be lower in the US (2). However, 1 study has reported a 31% rate of WD just 1 year after onset (3). Although the high risk of WD in RA is well recognized, the majority of studies in this area have concentrated on the prevalence of WD and risk factors (3–5). Very few studies have focused on identifying those individuals with RA most at risk of WD and the process of job retention. The development of new effective but expensive treatments for RA (6, 7) requires reliable measures of treatment outcome, and WD is one of the most costly outcomes of RA in terms of both direct and indirect costs.

It has been reported that it is more difficult to return people who have previously lost work to employment than to retain newly disabled people in their customary work (8, 9). In the UK, 80% of people of working age moving onto the state Incapacity Benefit will never return to work. For those who do eventually stop working, this may be the final outcome of a long and sometimes complicated process involving adaptations within jobs and changes in job tasks (10). If individuals are having to make adjustments or job changes, this usually indicates that there is a mismatch between their functional capabilities and the demands of the job. This state may be called work instability (WI).

WI as a term is virtually unrecognized in the literature, although it has previously been described as a construct to take account of RA-related work changes (11). WI in RA has also been considered by some authors in terms of an interruption or cessation of work resulting from the consequences of RA (12). However, it could be argued this is extreme WI because it has necessitated a break from work either temporarily or permanently. A number of investigators have explored work performance problems (13–15). A number of reasons for these potential problems have been suggested: inappropriate management of disability; boredom and frustration; lack of support; fluctuating medical conditions; deteriorating disabilities; and poor self esteem (14).

Both the terms work instability and work performance problems relate to an individual's ability to fulfil their normal work tasks so, consequently, the basic concepts overlap. It could be argued that using the word “problems” indicates doubts and difficulties and that “performance” is rarely used without an adjective implying judgement, for example good or bad performance (13). The term work instability is preferred because instability may be viewed as a transient state, potentially reversible. It is during this state that the individual is vulnerable to job loss and arguably this is the time at which interventions are of greatest preventive importance in relation to work. The following definition of WI is offered: Work instability is a state in which the consequences of a mismatch between an individual's functional abilities and the demands of his or her job can threaten continuing employment if not resolved.

The concept of WI arose from the recognition that the vocational impact of functional incapacity must relate to the individual's work demands. The term was coined to describe the extent of any mismatch between functional incapacity and work demands at a point in time, and its potential impact on job retention and security. Because the term is intended to describe the effects of an association between 2 variables (functional incapacity and work demands), it was considered that a theoretical framework for classification in relation to certain diagnostic groups could be devised. Further, it was considered that a system of WI classification or rating could lead to the identification of whether, when, or what interventions might facilitate or enable retention at work, with its consequent psychosocial and economic benefits. Such interventions could include clinical management, vocational rehabilitation, psychosocial support, and ergonomics applications, including work organization issues; task, equipment, and tool design; and the provision of enabling devices.

Currently, the only way to assess those at risk of WD is with a vocational assessment by an experienced therapist, which is rarely accessible to rheumatologists in the UK. If a valid and reliable Work Instability Scale (WIS) could be produced that could indicate the level of risk of WD, this would provide clinicians with an effective screening tool to facilitate early, appropriate referral for job retention measures. We have developed such a scale, validated against a gold standard of full vocational assessment.



Subjects for all the stages of the study were recruited from rheumatology clinics in West Yorkshire, UK using existing databases of early arthritis patients. Centers for recruitment included a large teaching hospital (Leeds) and smaller district hospitals (Bradford, Pontefract and Pinderfields Hospital in Wakefield). All subjects had confirmed RA (American College of Rheumatology, formerly American Rheumatism Association, criteria) of 4 years or less duration (16). Ethical approval for the study was obtained.


The design of the study incorporated 5 stages (Figure 1).

Figure 1. Stages of the study. WIS = Work Instability Scale; HAQ = Health Assessment Questionnaire.

Qualitative interviews to generate items for the scale.

A sample of 85 consecutive patients was identified as fulfilling the study criteria. These subjects were sent a detailed information sheet and subsequently contacted by phone by 1 of the interviewers. Purposive sampling of the 85 potential subjects was used to ensure that a wide spread of occupational groups across both sexes was included (Figure 2). A total of 49 subjects fulfilling the entry criteria were interviewed. Subjects were coded by occupation as sedentary (seated for the majority of the working day), light physical, or heavy physical according to the duties they reported having to complete in a typical working day.

Figure 2. Occupations of subjects at time of qualitative interview.

Semistructured, in-depth, taped interviews were undertaken. The interviews concentrated on the factors relevant to the patients' struggle to continue in work following the onset of RA. The majority of interviews were conducted in the subjects' homes out of usual working hours, although some interviews were carried out at the worksite depending on individual preferences. Interviews ranged from 40 to 60 minutes in duration. Those who had been unable to work for longer than 6 months and had therefore moved onto the UK Incapacity Benefit were excluded. Because data were collected mostly from subjects currently in work, this study does not suffer from the recall bias criticized in previous studies by Reisine and colleagues (17). Signed consent was obtained prior to the interview.

Full, typed transcripts of the interview tapes were produced and then analyzed using a grounded theory approach and specialized computer software (18, 19). The aim of this stage of the work was to identify the main themes reported by patients in relation to maintaining their work, and within those themes, to identify potential items for the WIS. Once the qualitative analysis was complete, 4 members of the research team from different professional backgrounds worked through the transcript analyses identifying short statements indicative of WI to form items on the first draft WIS. There was more than 75% agreement between the different members of the team, although the qualitative interviewers tended to highlight more statements. This produced an initial list of 100 statements. Where the same concept was duplicated, the simplest wording was chosen. Seventy-six statements were finally identified as potential items. All the statements selected were chosen to be applicable to all occupational groups.

First postal questionnaire.

A postal questionnaire was sent to 625 patients. The aims of this stage of the study were to test the scaling properties of the draft WIS, to facilitate item reduction, and to provide preliminary evidence of construct validity. It was felt a 76-item draft WIS would be too long, so the 76 potential items were formed into 2 draft WI scales. Each of the 2 scales had 46 items: 16 common items and 30 unique items. The postal questionnaire also included the Health Assessment Questionnaire (HAQ) (20) and general questions relating to disease duration, current symptoms, perceived quality of life (visual analog scale), and employment status.
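The item arithmetic behind the 2 draft scales (16 common items plus 30 unique items each, covering all 76 candidates) can be checked with a minimal sketch. The item numbering and the choice of which items are common are hypothetical here; the study reports only the counts.

```python
def build_forms(n_items=76, n_common=16):
    """Split item indices into 2 overlapping draft forms.

    Assumption: items are simply numbered 1..n_items and the first
    n_common are shared; the real allocation is not described.
    """
    items = list(range(1, n_items + 1))
    common = items[:n_common]
    unique = items[n_common:]
    half = len(unique) // 2
    form_a = common + unique[:half]
    form_b = common + unique[half:]
    return form_a, form_b


form_a, form_b = build_forms()
print(len(form_a), len(form_b))        # 46 46
print(len(set(form_a) | set(form_b)))  # 76
print(len(set(form_a) & set(form_b)))  # 16
```

The shared items are what would allow responses from the 2 forms to be placed on a common scale in a later analysis.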

Vocational assessments acting as the gold standard (criterion validity).

A sample of 38 subjects who returned the questionnaire and were in work agreed to participate in a full vocational assessment. These were representative of the larger sample by age and sex. These vocational assessments were completed by 2 professionals who were both chartered physiotherapists and registered ergonomists. The vocational assessments followed an agreed format (see Table 1) and took between 30 minutes and 1 hour to complete. To ensure consistency between the 2 assessors, a 2-day program was carried out in which they observed each other completing assessments, and also carried out assessments on the same patients blind to each other's scores. At the end of these 2 days of assessments (7 subjects assessed), the 2 experts achieved 100% agreement.

Table 1. Core elements in the full vocational assessment for work instability
Health situation
 Medical history
 Details of current condition
 Current symptoms
  Aggravating and easing factors
  Pattern of symptoms
 Clinical management
Work situation
 Job description
  Task analysis
  Hours worked/shift patterns/rest breaks
  Work organization/task variety/control
  Management culture/style
 Postural analysis
 Physical work factors
 Getting to work and access in and around the workplace
 Other perceived stressors

The experts carried out full vocational assessments on the remaining 31 subjects blind to the results of the postal questionnaire. At the end of each assessment, they gave the subject a WI level of 0–4 representing increasing risk of WD. This scoring system was devised following consultation with the experts who regularly carry out this type of assessment and reflects their opinion following an assessment covering all the areas listed in Table 1. The explanation beneath each level of WI reflects the experts' view of the level of intervention recommended to assist the individual in terms of job retention. For WI scores of 3 and 4, the experts' view is that either some aspects of the job or the majority of the job is unsuitable for the individual given his or her current functional limitation(s) (Table 2). The questionnaire responses were then validated against the result of the gold standard assessment.

Table 2. Levels of work instability (WI) used by experts completing gold standard vocational assessments
Level 0 WI
 No problems at work.
Level 1 WI
 Minor problems at work requiring advice only.
Level 2 WI
 Modifications to work practices and/or provision of alternative equipment required.
Level 3 WI
 Some aspects of the job are unsuitable. For the remainder, modifications to work practices and/or provision of alternative equipment required.
Level 4 WI
 Mismatch is such that even with changes, modifications, and alternative equipment, the majority of the job is unsuitable and the individual is unlikely to cope with current work tasks.

Discriminant and construct validity.

Responses to the 31 questionnaires completed by the patients involved in the criterion validity stage were examined to see which items discriminated across the categories of risk. Each item of the draft WIS was tested for its potential to discriminate across the various levels of risk defined by the experts. Items that showed an increasing proportion of positive responses as the level of risk increased were retained. The data relating to these items from the first postal questionnaire were then fitted to the 1 parameter item response theory model, the Rasch model (21). This statistical approach was used to define and quantify the construct of WI.
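The screening rule described above (retain items whose proportion of positive responses rises with the expert-rated level of risk) might look like the following minimal sketch. The response data are invented for illustration, and treating "increasing" as non-decreasing with a strictly higher proportion at the top level than at the bottom is an assumption, not the authors' stated criterion.

```python
def proportion_by_level(responses, levels, n_levels=5):
    """Proportion of 'yes' (1) answers at each expert WI level 0..n_levels-1."""
    props = []
    for level in range(n_levels):
        answers = [r for r, lvl in zip(responses, levels) if lvl == level]
        props.append(sum(answers) / len(answers) if answers else None)
    return props


def discriminates(props):
    """Assumed rule: proportions never fall and end higher than they start."""
    observed = [p for p in props if p is not None]
    if len(observed) < 2:
        return False
    rising = all(a <= b for a, b in zip(observed, observed[1:]))
    return rising and observed[0] < observed[-1]


# Hypothetical data: 10 subjects, 2 per expert level.
levels = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
good_item = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
flat_item = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

print(proportion_by_level(good_item, levels))  # [0.0, 0.5, 0.5, 1.0, 1.0]
print(discriminates(proportion_by_level(good_item, levels)))  # True
print(discriminates(proportion_by_level(flat_item, levels)))  # False
```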

The Rasch model is a unidimensional model that asserts that the easier the item, the more likely it will be affirmed, and the more able the person, the more likely he or she will affirm an item compared with a less able person. It assumes that the probability that a given respondent gives a “correct” answer to a particular item is a logistic function of the relative distance between the item location parameter and the respondent location parameter. In other words, the probability that a person will affirm an item is a logistic function of the difference between, in this instance, the person's level of WI (θ) and the level of WI represented by the item (b), and only a function of that difference,

$$p_i(\theta) = \frac{e^{\theta - b_i}}{1 + e^{\theta - b_i}}$$

where $p_i(\theta)$ is the probability that respondents with ability $\theta$ will affirm item $i$, and $b_i$ is the difficulty parameter b of item $i$.
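The logistic relationship described above, in which the probability of affirming an item depends only on the difference between θ and b, can be illustrated with a minimal sketch. The function name and the example values are illustrative and do not come from the study.

```python
import math


def rasch_probability(theta, b):
    """Probability of affirming an item under the 1-parameter (Rasch) model.

    theta: person location (here, level of WI), in logits.
    b: item location (level of WI represented by the item), in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


print(rasch_probability(0.0, 0.0))            # 0.5
print(round(rasch_probability(1.0, 0.0), 3))  # 0.731
print(round(rasch_probability(0.0, 1.0), 3))  # 0.269
# Only the difference matters: (2, 1) and (1, 0) give the same probability.
print(rasch_probability(2.0, 1.0) == rasch_probability(1.0, 0.0))  # True
```

When the person and item locations coincide the probability is 0.5, which is why item locations can be read as the level of WI at which an item is as likely as not to be affirmed.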

From this, the expected pattern of responses to an item set is determined given the estimated θ and b. When the observed response pattern coincides with or does not deviate too much from the expected response pattern, then the items constitute a true Rasch scale (22). Fit to the Rasch model is assessed by a number of overall tests and by tests of fit for individual items. Overall item and person fit statistics approximate a normal distribution with a mean of 0 and standard deviation of 1 when data fit the model. A third statistic looks at item trait interaction, testing that the hierarchical ordering of the items remains the same for discrete groups of patients across the trait. This is reported as a chi-square statistic, and probability should be greater than 0.01 (no significant difference). Taken with confirmation of local independence of items (no residual associations in the data after the Rasch trait has been removed), this confirms unidimensionality (23, 24).
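The comparison of observed and expected responses can be sketched with standardized residuals, a common building block of Rasch fit statistics. This is a simplified illustration under assumed person and item locations; it is not the specific fit statistic computed by the software used in the study, which applies further transformations.

```python
import math


def rasch_probability(theta, b):
    """Expected probability of affirming an item (Rasch model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def standardized_residuals(responses, theta, difficulties):
    """(observed - expected) / sqrt(variance) for each yes/no item."""
    residuals = []
    for x, b in zip(responses, difficulties):
        p = rasch_probability(theta, b)
        residuals.append((x - p) / math.sqrt(p * (1.0 - p)))
    return residuals


# Hypothetical person at theta = 0 answering 3 items of rising difficulty.
res = standardized_residuals([1, 1, 0], theta=0.0, difficulties=[-1.0, 0.0, 1.0])
print([round(r, 3) for r in res])  # [0.607, 1.0, -0.607]
```

Small residuals like these indicate a response pattern close to what the model expects; large absolute values would flag misfitting responses.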

The emerging WIS was also tested for differential item functioning (DIF) for age and sex. The basis of the DIF approach lies in the item response function, the S-shaped trace of the proportion of individuals at the same level of WI who affirm a particular item. Under the assumption that the WI is unidimensional and that the item measures WI specifically, then (except for random variations) the same curve is found irrespective of the nature of the group for whom the items are plotted (25). Items that displayed DIF by age and sex were removed from the scale.

Finally, the scale scores were examined across the 5 levels of WI used by the experts, and values were examined to determine appropriate cut points to identify those needing modifications at work. Sensitivity and specificity were calculated using these cut points on the gold standard sample.
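Sensitivity and specificity at a candidate cut point can be computed as in the following minimal sketch. The scores and reference judgments below are invented for illustration; how the expert levels were dichotomized into "needs modifications" is an assumption left to the caller.

```python
def sensitivity_specificity(scores, needs_modification, cut_point):
    """Sensitivity and specificity of 'score >= cut_point' against a reference.

    needs_modification: booleans from the reference (gold standard) assessment.
    """
    tp = fn = fp = tn = 0
    for score, needs in zip(scores, needs_modification):
        positive = score >= cut_point
        if needs and positive:
            tp += 1
        elif needs:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical sample of 8 subjects.
scores = [3, 8, 12, 15, 19, 21, 9, 11]
needs = [False, False, True, True, True, True, True, False]

sens, spec = sensitivity_specificity(scores, needs, cut_point=10)
print(round(sens, 2), round(spec, 2))  # 0.8 0.67
sens, spec = sensitivity_specificity(scores, needs, cut_point=17)
print(round(sens, 2), round(spec, 2))  # 0.4 1.0
```

Raising the cut point trades sensitivity for specificity, which is why 2 cut points yield 3 score bands.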

Reliability (test–retest questionnaire).

The shortened RA WIS was sent twice, 2 weeks apart, to 229 patients to determine the test–retest reliability of the revised scale.


Qualitative interviews for item generation.

Of a total of 85 potential subjects approached, 49 subjects (58%) fulfilling the criteria and matching the theoretical sampling frame agreed to be interviewed; 4 transcripts were unusable due to technical problems, so results are based on the 45 remaining transcripts. Mean age was 40 years (range 26–55 years), and 27 subjects (60%) were female. Disease duration varied from 2 months to 4 years (mean 2.3 years). All participants were in paid work, although 4 were off sick when interviewed. The largest occupational group was sedentary office-based work (n = 20); 19 were in light physical work, such as retail and light assembly work; 6 were in heavy physical work, including a chemical engineer (Figure 2).

Analysis was started as soon as the first 5 interviews were transcribed, so that an iterative process led to an expanding thematic structure to the data analysis as the interviews progressed. The main themes that emerged were the importance of job flexibility, good working relationships, and symptom control. From the analysis, 76 statements with simple yes/no responses were identified as potential items for the first draft WIS. Table 3 shows 4 examples of the type of statements made by subjects that were used as items on the WIS. These examples reflect increasing risk of WD and illustrate how items on the WIS are applicable to all occupational groups.

Table 3. Four examples of statements from the interview transcripts used as items in the final rheumatoid arthritis Work Instability Scale
I'm getting up earlier because of the arthritis.  
I can get my job done, I'm just a lot slower.  
I don't have the stamina to work, like I used to.  
I feel I may have to give up work.  

First postal questionnaire.

In the second stage, 625 questionnaires, each including 1 of the 46-item draft WI scales, were sent out and 475 were returned (response rate 76%). The 2 versions of the draft WIS were color coded (yellow and blue) and the sample was randomly allocated to a color of questionnaire. The response rates for the different colors were identical. There was no significant difference between those returning blue or yellow forms in age (t = 0.573, P = 0.567) or sex (χ² = 1.02, P = 0.334). There was also no significant difference in disease duration across color of questionnaire (t = 0.509, P = 0.611). Of the 475 potential subjects who returned the questionnaire, 206 (43%) were in work; these can be considered full study participants. Mean age of those in work was 44 years (range 16–60 years), 73% were female, and the mean HAQ score was 1.2. Thirty-eight of this group who were in work volunteered for the next stage of full vocational assessment.

Gold standard vocational assessments.

Thirty-one subjects remained for this stage of the study following assessment of 7 subjects during the 2-day process to ensure consistency between assessors. The subjects were representative of the larger sample in the previous (postal) stage of the study by age and sex and had a range of occupations (18 sedentary occupations, 11 light physical, and 2 heavy work); their mean HAQ score was 0.59.

The experts completing these assessments identified 5 levels of WI, each representing progressively increasing risk of WD (Table 2). Thirty-six items from the potential 76 items were found to discriminate across these 5 categories. Data from the subjects from the larger postal survey relating to these 36 items were then fitted to the Rasch model, which identified 23 items on a single construct of WI that were free from item bias for age and sex. Fit to the model was confirmed by excellent item fit (mean 0.056, SD 0.092) and person fit (mean −0.062, SD 0.595) statistics. An item trait interaction chi-square of 34.1 (degrees of freedom = 46; P = 0.90) showed the classic property of invariance for the scale. That is, the hierarchical ordering of items remained the same across the trait of WI.

A score of 10 or more on the 23-item WI scale was shown to have 82% sensitivity to the need for workplace modifications, and a score of 17 or more gave 95% specificity. The RA WIS can therefore be scored in 3 bands. A score less than 10 indicates low WI, indicative of low risk of WD. Scores in the range of 10–17 indicate moderate WI, indicative of medium risk of WD. Scores above 17 are high scores of WI and these individuals can be considered at high risk of WD.
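The 3-band scoring described above can be expressed as a short function. This is a sketch, assuming each of the 23 yes/no items contributes 1 point so that totals run from 0 to 23; it follows the band wording given here (10–17 moderate, above 17 high), which places a score of exactly 17 in the moderate band even though specificity was reported for a cut of 17 or more.

```python
def wis_band(score):
    """Map a total RA WIS score to a risk band.

    Assumption: score is the count of affirmed items, 0..23.
    """
    if not 0 <= score <= 23:
        raise ValueError("score must be between 0 and 23")
    if score < 10:
        return "low"
    if score <= 17:
        return "moderate"
    return "high"


for s in (9, 10, 17, 18):
    print(s, wis_band(s))
# 9 low
# 10 moderate
# 17 moderate
# 18 high
```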

Postal test–retest questionnaire.

Postal test–retest questionnaires were sent at 2-week intervals to an additional 229 potential subjects; 123 returned both questionnaires, a return rate of 53%. Fifty-one (41%) of the subjects returning both completed questionnaires were working. Analysis of their responses on the 23 items of the WIS showed a correlation of 0.89 (Spearman's rho) indicating a high degree of reliability for the RA WIS.
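Test–retest agreement of this kind is summarized by Spearman's rank correlation between first and second scores. A minimal standard-library sketch follows; the paired scores are invented for illustration and are not study data.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical WIS totals for 5 respondents on 2 occasions.
first = [4, 9, 12, 17, 20]
second = [5, 13, 8, 16, 21]
print(round(spearman_rho(first, second), 3))  # 0.9
```

Because it works on ranks, the coefficient reflects whether respondents keep the same ordering on both occasions rather than whether their totals are identical.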


Work instability may be defined as a state arising from a mismatch between an individual's functional abilities and the demands of the job, the consequences of which could threaten continuing employment if not addressed. Currently there is no reliable way of screening for, or monitoring, WI in RA despite the high levels of WD in this population. A related instrument, the Work Limitations Questionnaire, has been recently published (26) that aims to measure the on-the-job impact across a range of chronic health problems.

The project reported here has produced a simple 23-item disease-specific RA WIS. The items on the scale have been derived from a validated technique involving qualitative interviews with relevant patients. The latest item response techniques have been used to assess the scaling properties of the instrument and examine the scale for differential item functioning. The scale has also been validated against a gold standard of full vocational assessment by occupational health experts in the absence of a comparable validated instrument. It appears likely that the 23-item RA WIS produced will demonstrate cross-cultural validity because a recent qualitative study in New York that interviewed predominantly sedentary workers found many of the same themes to be relevant (27).

Currently, too many people with RA face a potentially devastating change in lifestyle if the impact of the disease is not contained during the few months following onset. With the nonsteroidal antiinflammatory based therapeutic cascade being replaced by early aggressive treatment and the emergence of new treatments, particularly the effective but expensive tumor necrosis factor (TNF) therapies, the risk of job loss in RA may diminish. To evaluate the overall cost effectiveness of these therapies with an annual cost up to £10,000 ($15,000) per patient, a reliable measure sensitive to changes in WI will be essential. Maximizing job retention is important for a number of reasons; there are significant financial and psychosocial costs to individuals forced to leave the workforce early, with a fall in self esteem and an undermining of independence (28–30). It is also important for society as a whole, because there are increased claims for unemployment and disability benefits and lower tax revenues. Employers who lose staff with valuable experience and expertise may incur additional costs for recruitment and training of new staff.

The 3 main themes identified from the qualitative interviews (job flexibility, good working relationships, and symptom control) intuitively make sense but also lend support to a multifaceted and holistic approach to disease management. Employers looking for ways of supporting patients in work can be encouraged to promote flexibility (of working time, task variations, and time pressure) and to support good working relationships through team development. The physician seeking to improve WI can target symptom control directly.

The vocational assessment carried out by an ergonomically trained and experienced therapist is used as a reference standard. Vocational assessment within the UK Employment Service, Disability Service, is a scarce resource, is costly in professional time, and usually requires travel to the regional center. The development of a short, self administered questionnaire with this sensitivity and specificity is encouraging because it is likely to be useful both as a screening and a monitoring measure for WI. When used as a screening test, 4 of 5 individuals with a score of 10 or more will require workplace modifications. With a score of 17 or more, only 1 in 20 will not need such modifications.

The work reported in this article has not included sequential use of the WIS in patients with progressive disabling RA or “before and after” scores to assess the effect of either change in treatment or workplace adaptations on reducing the mismatch between ability and job demands that the scale is measuring. We acknowledge that so far the instrument has been tested on a relatively small sample from the north of England; consequently, the generalizability of the scale is currently unknown and requires further testing. Cross-cultural studies will need to be undertaken to test the validity of the RA WIS elsewhere. In addition, studies on the responsiveness of the instrument are required.

However, we believe there are a number of settings in which this instrument will be useful. In the outpatient clinic or therapy setting it will offer a simple, self-report questionnaire that may be used as a screening tool to alert the clinician to the need for more detailed work assessment. It may also be useful in the employment setting to assist, in the UK, the Disability Employment Adviser or other employment adviser in predicting work suitability and in seeking interaction with health care professionals who may be in contact with the client. As a research tool it may provide researchers with a way of measuring this very important aspect of RA-related disability. This will be particularly important in studies that seek to introduce the major indirect costs associated with loss of work into the cost effectiveness calculation for expensive treatments such as the biologic TNF therapies.

Copies of the full RA WIS scale with guidance notes and instructions for scoring are available from the corresponding author.


We thank all the patients and staff who helped during the project.