Translation, adaptation and psychometric testing of a tool for measuring nurses’ attitudes towards research in Indonesian primary health care

Abstract Aim The purpose of this study was to translate, adapt and psychometrically test the Nurses’ attitudes towards and awareness of research and development within nursing (ATRAD‐N) version II for measuring nursing research and research utilization in Indonesian primary health care nurses. Design Cross‐sectional survey. Method The translation process was conducted by applying the forward and back‐translation method. Adaptation and content validity were assessed by six experts in Indonesia. The psychometric testing was performed using factor analysis and Cronbach's alpha coefficient on a sample of 92 primary health care nurses in South Kalimantan, Indonesia in 2013. Results The translated instrument showed acceptable content validity with an index of .97. The factor analysis (Principal Component Analysis with Direct Oblimin rotation) obtained a five‐factor structure that differed from those identified in previous studies. The cumulative percentage of variance was 56.5%. The Cronbach's alpha coefficient for individual factors ranged from .719 to .884. The resulting form of the Indonesian ATRAD‐N was found to have acceptable content validity and homogeneity reliability but not construct validity in Indonesian settings.

Bjorkstrom and Hamrin (2001) report the development of the instrument, which was based on a review of literature concerning nursing research and two previous studies conducted in Sweden. In several tests, the instrument was found to have acceptable measures of reliability and validity. In 2005, the instrument was modified and tested to focus on primary health care settings (Nilsson Kajermo et al., 2013). All of the psychometric tests of the instrument have been conducted in developed countries.

| BACKGROUND
Research utilization, the use of research evidence to inform practice, has become a main concern in nursing practice worldwide as the evidence-based practice movement extends from a focus on "what is evidence and how can it be summarized" to "how can evidence be used to inform daily clinical practice" (Estabrooks, 2009; Estabrooks, Wallin & Milner, 2003b; Schneider, 2013). All nurses, even those in rural areas, should be able to use scientific evidence to guide their practice (Olade, 2004). However, research utilization is a complex process that requires synergistic efforts to be successfully implemented (Mehrdad, Salsali & Kazemnejad, 2008). Estabrooks (2009) outlined several determinants influencing research utilization: individuals, organization and innovation.
Research utilization is a new concept in Indonesian nursing and research modules have only recently been added to the nursing curriculum (Indonesian National Nurses Association, 2005). There are also few publicly available data that can be used to inform nursing research and research utilization in developing countries like Indonesia (Pearson & Jordan, 2010). As a result, the implementation of evidence-based practice is complex and difficult for Indonesian nurses. As in other developing countries, several factors contribute to this situation (Mehrdad et al., 2008; Tsai, 2000). Poor quality of education and lack of strategies to enhance the use of research findings are two common barriers in research utilization and research participation (McKenna, Ashton & Keeney, 2004; Oh, 2008; Tsai, 2000).
The primary health care system in Indonesia is the Pusat Kesehatan Masyarakat (Puskesmas) (public health centers), the functional health organization unit (Abdullah, Hort, Abidin & Amin, 2012). Puskesmas are front line health service institutions that have responsibility for providing comprehensive and integrated services to the community (Ministry of Health-Republic of Indonesia, 2012). In collaboration with other related sectors, the centres implement national and regional health programs, including those dealing with health promotion, illness prevention, treatment of diseases and rehabilitation to all community groups (Department of Health-Government of Indonesia, 1990; Ministry of Health-Republic of Indonesia, 2012). Nurses are the main health care professionals at the Puskesmas and they carry out most of the national health programs (Assan, Assan, Assan & Smith, 2009; Hennessy, Hicks, Hilan & Kawonal, 2006). Therefore, primary health care nurses have a crucial responsibility for managing the delivery of safe and effective health programs in Indonesia (Hennessy et al., 2006). The context and situation presented above indicate the importance of carrying out a study to assess the attitudes of Indonesian primary health care nurses towards nursing research and the use of research to guide their practice. Such a study will enable us to understand the factors that influence nursing research utilization in Indonesian primary health care settings and help Indonesian nurses to participate in research. This research requires a reliable and valid instrument to measure the variable of interest (attitudes towards research and research utilization) in the context of Indonesian primary health care settings.

| Aim
The aim of this research was to translate, adapt and test the psychometric properties of ATRAD-N in Indonesian primary health care nurses. The objectives of the study were to:
1. translate a previously developed questionnaire, ATRAD-N, from the source language (English) to the target language (Indonesian)
2. evaluate and adapt the questionnaire in terms of items, instruction for administration and scoring rules
3. estimate the content and construct validity of the translated questionnaire and its homogeneity reliability

| Method
The translation process was conducted systematically by applying the forward and back-translation method (Beaton, Bombardier, Guillemin & Ferraz, 2000; Gudmundsson, 2009; Sousa & Rojjanasrirat, 2011), involving two native Indonesian speakers fluent in English and a bilingual translator who blindly back-translated the preliminary translated instrument into English.
Six experts with backgrounds in community health nursing from various universities in Indonesia reviewed the instrument based on local information, context and the culture where the instrument was to be applied. Items 36-39 were deleted because they were not relevant to Indonesian settings. The outcome of the adaptation process was the development of a final Indonesian version of the instrument (Indonesian ATRAD-N). Ten new items relating to biographical details were generated to assess basic information of the respondents.
The Lynn method of content validity scale was used to quantify the indicators of content appropriateness and relevance provided by the experts in this study (Devon et al., 2007;Lynn, 1986). The experts' endorsement was collected and the Content Validity Index (CVI) score was estimated for individual scale items and the entire scale. For a panel of six experts, the level of endorsement required to retain an item based on the proportion of the experts would be a minimum of .83, at the .05 level of significance (Lynn, 1986;Wynd, Schmidt & Schaefer, 2003). Figure 1 illustrates the translation, adaptation and content validity process applied to the Indonesian ATRAD-N.
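As an illustration of the item-level calculation described above, the following minimal Python sketch computes a CVI per item as the proportion of experts endorsing it and applies the .83 retention rule for a six-expert panel. The 4-point relevance scale (ratings of 3 or 4 counted as endorsement), the item names and all ratings are assumptions made for illustration; they are not the study data, and averaging item CVIs is only one possible way of obtaining a scale-level index.

```python
from typing import Dict, List


def item_cvi(ratings: List[int], relevant_from: int = 3) -> float:
    """Proportion of experts whose rating counts as an endorsement.

    Assumes a 4-point relevance scale where 3 and 4 mean 'relevant'.
    """
    return sum(r >= relevant_from for r in ratings) / len(ratings)


def scale_cvi_average(item_cvis: List[float]) -> float:
    """Scale-level CVI taken as the mean of the item-level CVIs (an assumption)."""
    return sum(item_cvis) / len(item_cvis)


# Hypothetical ratings from six experts (illustrative only).
ratings_by_item: Dict[str, List[int]] = {
    "item_a": [4, 4, 3, 4, 4, 3],  # 6 of 6 experts endorse
    "item_b": [4, 3, 4, 2, 4, 3],  # 5 of 6 experts endorse
    "item_c": [4, 2, 3, 4, 1, 3],  # 4 of 6 experts endorse
}

# With six experts, at least five endorsements (5/6, about .83) are required.
THRESHOLD = 5 / 6

i_cvi = {name: item_cvi(r) for name, r in ratings_by_item.items()}
# A tiny tolerance avoids a floating-point miss on exactly 5/6.
retained = [name for name, value in i_cvi.items() if value >= THRESHOLD - 1e-9]
s_cvi = scale_cvi_average([i_cvi[name] for name in retained])

for name, value in i_cvi.items():
    print(f"{name}: I-CVI = {value:.2f}")
print("retained:", retained, "scale CVI (average):", round(s_cvi, 2))
```

In this sketch an item endorsed by four of six experts obtains a CVI of .67 and falls below the retention threshold.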
The process of psychometric testing included validity and reliability tests. Construct validity testing was conducted using factor analysis and homogeneity reliability testing was conducted using Cronbach's alpha coefficient (Devon et al., 2007;Gillespie & Chaboyer, 2013). Cronbach's alpha was calculated for individual factors and the entire scale.
The Indonesian ATRAD-N included both positively and negatively worded statements. The negatively phrased items were reverse scored in data analysis. Returned questionnaires with more than 10% unanswered items were excluded. For questionnaires with less than 10% of items unanswered, missing data were imputed using mean estimation. All data were gathered using hard-copy questionnaires. The data were coded and entered into a Microsoft Excel spreadsheet and then exported into the Statistical Package for the Social Sciences (SPSS) Version 20.0 for data cleaning, reverse scoring and further analysis.
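The data preparation steps above can be sketched in Python as follows. This is only an illustration of the logic, not the SPSS procedure used in the study. It assumes that "mean estimation" refers to replacing a missing value with the mean of the observed responses to that item, that questionnaires with exactly 10% missing items are kept, and that reverse scoring on the 1-5 scale maps a response x to 6 - x; the responses and the choice of negatively worded item are hypothetical.

```python
from typing import List, Optional, Sequence

Row = List[Optional[float]]  # None marks an unanswered item


def reverse_score(value: float, low: int = 1, high: int = 5) -> float:
    """Reverse a Likert response, e.g. 1 <-> 5 and 2 <-> 4 on a 1-5 scale."""
    return low + high - value


def missing_fraction(row: Sequence[Optional[float]]) -> float:
    """Share of items left unanswered in one questionnaire."""
    return sum(v is None for v in row) / len(row)


def clean_responses(rows: Sequence[Row], negative_items: Sequence[int],
                    max_missing: float = 0.10) -> List[Row]:
    # 1. Exclude questionnaires with more than 10% unanswered items.
    kept = [list(r) for r in rows if missing_fraction(r) <= max_missing]
    # 2. Reverse score the negatively worded items (column indices).
    for row in kept:
        for j in negative_items:
            if row[j] is not None:
                row[j] = reverse_score(row[j])
    # 3. Replace remaining gaps with the item mean (assumed imputation rule).
    n_items = len(kept[0]) if kept else 0
    for j in range(n_items):
        observed = [row[j] for row in kept if row[j] is not None]
        if not observed:
            continue  # nothing to impute from
        item_mean = sum(observed) / len(observed)
        for row in kept:
            if row[j] is None:
                row[j] = item_mean
    return kept


# Hypothetical responses to a 10-item questionnaire (illustrative only).
raw = [
    [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    [5, None, 3, 4, 5, 1, 2, 3, 4, 5],     # 10% missing: kept and imputed
    [None, None, 3, 4, 5, 1, 2, 3, 4, 5],  # 20% missing: excluded
    [3, 4, 3, 4, 5, 1, 2, 3, 4, 5],
]
cleaned = clean_responses(raw, negative_items=[0])
print(len(cleaned), cleaned[1][:2])
```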
FIGURE 1 Translation, adaptation and content validity process of the Indonesian ATRAD-N instrument
Exploratory factor analysis was performed using Principal Component Analysis (PCA). The factors obtained were then rotated using Direct Oblimin rotation (Pallant, 2011). The criterion for the significance of factor loadings was set at .55 based on the sample size of 92 respondents (Hair, Anderson, Tatham & Black, 1995). Criteria used in determining factor extraction included the eigenvalue >1 rule, a cumulative percentage of variance of 50-60%, the scree test and parallel analysis (Hair et al., 1995; Kootstra, 2004; Pallant, 2011; Pett, Lackey & Sullivan, 2003; Williams, Brown & Onsman, 2010).
The internal consistency of the instrument was measured using Cronbach's alpha coefficient, comparing each item in the scale with all other items. A minimum score of .70 was set to ensure adequate reliability (Gillespie & Chaboyer, 2013). Mean scores for each factor and for the total scale were compared across demographic groups using independent-samples t-tests. The correlation between factors derived from the factor analysis was measured using the Spearman rank-order correlation.
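For readers unfamiliar with the coefficient, a minimal sketch of Cronbach's alpha is given below, using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The response matrix is hypothetical and serves only to show the calculation; the study itself used SPSS.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    k = len(rows[0])  # number of items
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five nurses to a three-item factor (illustrative only).
responses: List[List[int]] = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 4],
    [3, 3, 4],
]

alpha = cronbach_alpha(responses)
MINIMUM_ALPHA = 0.70  # threshold stated in the text
print(f"alpha = {alpha:.3f}, adequate = {alpha >= MINIMUM_ALPHA}")
```

The same function can be applied to the items of a single factor or to all items of the scale.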

| Participants
The subjects of the psychometric tests were recruited using a nonprobability sampling method, from primary health care nurses working in 34 public health centres in the cities of Banjarbaru and Banjarmasin, South Kalimantan, Indonesia. In performing factor analysis, Devon et al. (2007) suggested five subjects per item of the questionnaire.
With 34 items in the questionnaire, we expected to have 170 respondents. However, 25 centres declined to take part in this study due to heavy workloads or, in some cases, because they felt this study would not benefit them. A total of 92 nurses completed the instrument in 2013.

| Instrument
We critically examined available tools and selected ATRAD-N based on its adaptation for use in primary health care settings and its published validity and reliability to measure nurses' attitudes towards research and research utilization. The current ATRAD-N consists of 35 items rated on a Likert-type (1-5) scale, with responses ranging from "do not agree at all" (1) to "agree to a very great extent" (5), and four items using a rating-type scale. A higher total score indicates a more positive attitude towards research and research utilization.

| Characteristics of the respondents
A demographic profile of the 92 respondents is presented in Table 1.
Most respondents (69.6%) were female and most (78.2%) were aged between 20 and 40 years. They were predominantly (71.7%) educated at the diploma level. Almost 59% of the respondents also had research-related education. The mean response scores for individual items in the questionnaire (following reverse scoring) ranged from 2.1 to 4.3. A higher score indicated a more positive attitude towards research and research utilization.

| Results of content validity process
The instrument was designated as valid by the experts with a CVI of .97 for the entire scale. One item, "nursing education programs are too research based" (CVI = .67), was dropped because it did not achieve the .83 level of endorsement required to establish content validity. Thus, the final version of the instrument in Indonesian consisted of 34 items from the ATRAD-N and 10 items relating to biographical details of respondents.

| Results of factor analysis
The 34 items in the questionnaire were subjected to PCA. Prior to performing factor analysis, the data were assessed using two statistical measures: Bartlett's test of sphericity (Bartlett, 1954) and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy (Kaiser, 1974) generated by SPSS. Bartlett's test was statistically significant (χ² = 1766.7, df = 561, p < .0001), which indicated that the correlation matrix was not an identity matrix. The KMO value was .759, which exceeds the recommended value of .6 and meets the "middling" criterion (Kaiser, 1974), which judges our sample size as sufficient to perform factor analysis.
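The two statistics can be sketched with NumPy as shown below. This is a minimal illustration based on the standard textbook formulas, not a reproduction of the SPSS output: Bartlett's statistic is −(n − 1 − (2p + 5)/6) × ln|R| with p(p − 1)/2 degrees of freedom, and the overall KMO compares squared correlations with squared partial correlations. The data are simulated (92 hypothetical respondents, six items sharing one common factor) and the function names are illustrative.

```python
import numpy as np


def bartlett_sphericity(data: np.ndarray):
    """Bartlett's chi-square statistic and its degrees of freedom."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, log_det = np.linalg.slogdet(corr)  # log of the determinant of R
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return chi_square, df


def kmo_overall(data: np.ndarray) -> float:
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations obtained from the inverse correlation matrix.
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    a2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + a2)


# Simulated data (not the study data): 92 respondents, 6 correlated items.
rng = np.random.default_rng(42)
latent = rng.normal(size=(92, 1))
simulated = latent + rng.normal(size=(92, 6))

chi_square, df = bartlett_sphericity(simulated)
kmo = kmo_overall(simulated)
print(f"chi-square = {chi_square:.1f}, df = {df}, KMO = {kmo:.3f}")

# For 34 items the degrees of freedom are 34 * 33 / 2 = 561, as reported above.
df_34_items = 34 * 33 // 2
```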
The factor analysis used three iterative analyses. The criterion for the significance of factor loadings was set at .55 based on the sample size of 92 respondents (Hair et al., 1995). The first iteration of the PCA identified the presence of 10 components with eigenvalues >1 explaining 73% of the cumulative percentage of variance. After Direct Oblimin rotation, the pattern matrix showed 10 factors and five components on which only one item loaded (items 3, 7, 8, 9 and 10). Several items (n = 15) did not load on any factors. The results of the parallel analysis indicated only five components with eigenvalues greater than the criterion value for a randomly generated data matrix of 34 items with 100 respondents. It was decided to retain five components for further investigation.
The second iteration of the PCA was run by adding commands to force the items to load onto five components. The pattern matrix showed five components with two to six items loading on each component, explaining 55.2% of the cumulative percentage of variance. However, 12 items did not load on any of the components. Each of these was evaluated for possible deletion. It was decided that item 3 "in the nursing area too much is written and there is too much talk about research and development" could be deleted due to its low communality index (.273). The third iteration of the PCA was performed using 33 items and extracted five components. The cumulative percentage of variance was 56.5% with components 1, 2, 3, 4 and 5 contributing 30.35, 8.23, 6.39, 6.00 and 5.39%, respectively. The five factors were labelled as follows: Factor 1 "Participation and utilization of nursing research", Factor 2 "Nursing professional development", Factor 3 "Language of nursing research", Factor 4 "Developing capacity of nurses" and Factor 5 "Need of nursing research".
To interpret these components, Direct Oblimin rotation was performed. The rotated solution revealed a simpler structure, with seven items that did not load on any of the components. These seven items were consequently removed from further testing, resulting in a 26-item instrument. All unloaded items are listed in Table 3.
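The logic of the parallel analysis mentioned above can be sketched as follows: eigenvalues of the observed correlation matrix are compared with eigenvalues obtained from random data of the same dimensions, and only the leading components that exceed the random criterion are retained. This is a minimal NumPy illustration under stated assumptions; the software used in the study is not specified here, the use of the 95th percentile (rather than the mean) of the random eigenvalues is an assumption, and the data are simulated with a known two-factor structure.

```python
import numpy as np


def sorted_eigenvalues(data: np.ndarray) -> np.ndarray:
    """Eigenvalues of the correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data: np.ndarray, n_sims: int = 200,
                      percentile: float = 95.0, seed: int = 0):
    """Return the number of components to retain and both eigenvalue series."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = sorted_eigenvalues(data)
    random_eigs = np.empty((n_sims, p))
    for s in range(n_sims):
        # Random normal data with the same number of respondents and items.
        random_eigs[s] = sorted_eigenvalues(rng.normal(size=(n, p)))
    criterion = np.percentile(random_eigs, percentile, axis=0)
    n_retain = 0
    for obs, crit in zip(observed, criterion):
        if obs <= crit:
            break  # stop at the first component that fails the criterion
        n_retain += 1
    return n_retain, observed, criterion


# Simulated data (not the study data): 92 respondents, 8 items, 2 factors.
rng = np.random.default_rng(1)
factors = rng.normal(size=(92, 2))
loadings = np.zeros((2, 8))
loadings[0, :4] = 0.8  # items 1-4 load on the first factor
loadings[1, 4:] = 0.8  # items 5-8 load on the second factor
simulated = factors @ loadings + 0.6 * rng.normal(size=(92, 8))

n_retain, observed, criterion = parallel_analysis(simulated)
kaiser_count = int(np.sum(observed > 1))  # eigenvalue >1 rule, for comparison
print("parallel analysis retains", n_retain, "components; Kaiser rule:", kaiser_count)
```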

| Results of internal consistency (homogeneity reliability test)
The homogeneity reliability of the instrument's 26 items was measured using Cronbach's alpha coefficient. Bjorkstrom and Hamrin (2001) showed that the ATRAD-N questionnaire has good internal consistency, with a Cronbach's alpha coefficient of .940. In this study, the overall Cronbach's alpha coefficient of the 26-item instrument was .902. Considering that this study extracted different factors from those of Bjorkstrom and Hamrin (2001), it was decided not to compare the Cronbach's alpha coefficient for each factor between the two studies.

| Results of bivariate analysis
A series of independent-sample t tests was conducted to compare the questionnaire scores for several dichotomous socio-demographic factors (Table 5). Two socio-demographic factors had a significant difference in mean total factor scores: level of education and access to Internet in the workplace. Nurses who were educated at university level had a higher mean value than those who were educated at non-university level (p = .003). Likewise, nurses who had access to the Internet had a higher mean value than those with no Internet access (p = .017).
Further analysis was conducted to describe the strength and direction of the relationships between the factors using Spearman rank-order correlation coefficients (Table 6).
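The test statistic behind such a comparison can be sketched as follows. This minimal example computes the independent-samples t statistic with a pooled variance and its degrees of freedom; whether the equal-variance or the unequal-variance form was reported in the study is not stated, so the pooled form is an assumption, and the two groups of total scores are hypothetical. The p value would then be read from the t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t(group_a: Sequence[float], group_b: Sequence[float]) -> Tuple[float, int]:
    """Independent-samples t statistic (equal variances assumed) and its df."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * variance(group_a)
                       + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t_value = (mean(group_a) - mean(group_b)) / standard_error
    return t_value, n_a + n_b - 2


# Hypothetical total scores (illustrative only, not the study data).
university = [100, 105, 110]
non_university = [90, 95, 100]

t_value, df = pooled_t(university, non_university)
print(f"t({df}) = {t_value:.3f}")
```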
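For reference, the Spearman rank-order coefficient is the Pearson correlation computed on ranks, with tied values receiving the average of their ranks. The sketch below illustrates this in plain Python; the factor scores are hypothetical and the function names are illustrative.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the block of tied values
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank-order correlation between two score vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores on two factors for five nurses (illustrative only).
factor_1 = [12, 15, 15, 20, 22]
factor_2 = [30, 28, 35, 40, 38]
rho = spearman_rho(factor_1, factor_2)
print(f"rho = {rho:.3f}")
```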

| DISCUSSION
TABLE 2 (Continued). Major loadings for each item are in bold.
Several studies have focused on assessing instruments designed to measure and assess research utilization in practice and individual factors associated with research utilization (Estabrooks & Wallin, 2004; Frasure, 2008; Squires et al., 2008, 2011). This study contributed to the development of a valid and reliable instrument to measure nursing research and research utilization in a developing country. In the first stage, the original English version of the ATRAD-N was translated into Indonesian. The guidelines developed by Beaton et al. (2000), Gudmundsson (2009) and Sousa and Rojjanasrirat (2011) were combined for cross-cultural adaptation of a self-report instrument to achieve a quality translation. Although Sousa and Rojjanasrirat (2011) stress the importance of using translators with knowledge of health care terminology, this was not possible in the current study because translation services with this particular specification were limited. However, no item was found to be difficult to translate as the concepts were not specifically grounded in medical or nursing knowledge. A small number of items had minor semantic and idiomatic discrepancies between the languages, but those items were revised during discussions with the research team.
An important issue to highlight in this discussion is the factor structure of the instrument. The factor structure described by Bjorkstrom and Hamrin (2001) differed from the one obtained in this study. Instead of using a Maximum Likelihood extraction method, we used PCA with Direct Oblimin rotation to replicate the construct validity and find the most psychometrically sound and acceptable approach in this study. Careful consideration was also given to the sample size and correlations among factors when choosing the factor extraction method. It was also necessary to run three iterations of the factor analysis and to delete one item during those iterative analyses, resulting in a 33-item scale.
The factor loading cut-off of .55 used in this study was higher than those used in previous studies (.32-.40) (Bjorkstrom & Hamrin, 2001; Marshall et al., 2007; Nilsson Kajermo et al., 2013). The higher factor loading cut-off was necessary to maintain a strict power level of 80% and .05 significance with the sample size of 92 respondents. This significance level for the interpretation of factor loadings was determined following the approach outlined by Hair et al. (1995) and helped to ensure the validity of our findings despite a lower than anticipated sample size.
Seven items did not load on any of the extracted factors because their factor loadings were <.55. It could be argued that these items were not adequately explained by the construct and therefore failed to be represented in the factor structure. Marshall et al. (2007) also encountered problems maintaining construct validity of the ATRAD-N instrument, due to "abstract constructs". Further, Frasure (2008), in a systematic review, found that the ATRAD-N did not clearly declare its theoretical framework, which is important to define a construct of the instrument.
Variations in the ATRAD-N questionnaire construct are apparent in different settings. Marshall et al. (2007) were unable to present adequate factor structure of the instrument because their factor structure accounted for only 28.3% of the total variance. Nilsson Kajermo et al. (2013) found a three-factor structure that grouped items based on positively and negatively worded items. Perhaps items in the questionnaire are interpreted differently among the varied nursing settings.
For example, in the original study by Bjorkstrom and Hamrin (2001), the items "The nursing profession is a practical profession and does not have to include research" and "Further training in research and research-based studies is not important for the future" loaded to a factor labelled "the profession", whereas in this study those two items loaded to a factor labelled "need of nursing research". It is unclear whether these two items were about the profession or nursing research. Further refinement and retesting of this instrument would improve its construct validity.
The Cronbach's alpha coefficient for individual factors of the 26-item instrument ranged from .719 to .884, suggesting good internal consistency of the instrument. None of the items had corrected item-total correlation scores <.3, indicating that each item correlated well with the total value. However, two items ("I think the questions in this questionnaire are important" (α = .800) and "Participating in development work in nursing does not benefit nursing skills" (α = .792)) had higher individual Cronbach's alpha if item deleted scores than their total factor scores. Removing those items from the instrument may increase the reliability of those factors.
It is interesting to note that the overall Cronbach's alpha score for the questionnaire in this study was >.9, as in the studies of Bjorkstrom and Hamrin (2001) and Nilsson Kajermo et al. (2013). Experts disagree about the ideal score of Cronbach's alpha to determine homogeneity reliability. According to Gillespie and Chaboyer (2013), scores <.7 indicate lack of correlation between items in the instrument and according to Devellis (2003) scores >.9 indicate redundancy of one or more items. Devellis (2003) suggests that an instrument with Cronbach's alpha score >.9 should be shortened because of this strong correlation between items. Some items in the instrument used in this study may be too similar (for example, "The language used in nursing research is too complicated" and "The language of scientific articles are too complex for me"), and it may be better to review the items for redundancy. However, this extra analysis should be interpreted with caution until it can be confirmed with further studies.
TABLE 3 Unloaded items with factor loadings <.55 from the final iteration
The results of this study indicate a difference in psychometric properties of the ATRAD-N between the source language (English) and the target language (Indonesian). The adaptation and psychometric testing of the instrument for use in Indonesian primary health care settings did not mirror previous study findings. In its present form, the Indonesian translation of the ATRAD-N should be used with some caution as further investigation of the psychometric properties of the instrument is required. Studies with more respondents should be undertaken to better establish the validity and reliability of the instrument.

| Limitations
The sample size of this study was small (n = 92) given the number of items (n = 34) in the translated questionnaire. Even though there is no agreement on an acceptable ratio of cases to variables for factor analysis, a general rule of thumb from the literature is a minimum of five cases for each variable to be analyzed (Devon et al., 2007; Hair et al., 1995; Kootstra, 2004; Williams et al., 2010). However, confidence in our findings is increased by the results of the Bartlett's test of sphericity and the KMO assessment of "middling" for sampling adequacy, which judges our sample size as sufficient to perform factor analysis.

| CONCLUSION
The respondents for psychometric testing in this study were recruited from a different geography, culture and context than those of previous studies. Following translation, adaptation and psychometric testing, it was found that the ATRAD-N instrument showed acceptable content validity and homogeneity reliability, but not construct validity in Indonesian settings. Thus, further development, refinement and retesting of the instrument would be essential to produce a psychometrically sound instrument.
TABLE 6 Spearman rank-order correlation coefficients among the total and individual factors