Keywords:

  • clinical competence;
  • robotics;
  • laparoscopy;
  • computer simulation;
  • education, medical

Abstract

Objectives

  • To evaluate three standardized robotic surgery training methods (inanimate, virtual reality and in vivo) for their construct validity.
  • To explore the concept of cross-method validity, where the relative performance of each method is compared.

Materials and Methods

  • Robotic surgical skills were prospectively assessed in 49 participating surgeons, classified as follows: ‘novices/trainees’: urology residents with previous experience of <30 cases (n = 38); and ‘experts’: faculty surgeons with previous experience of ≥30 cases (n = 11).
  • Three standardized, validated training methods were used: (i) structured inanimate tasks; (ii) virtual reality exercises on the da Vinci Skills Simulator (Intuitive Surgical, Sunnyvale, CA, USA); and (iii) a standardized robotic surgical task in a live porcine model with performance graded by the Global Evaluative Assessment of Robotic Skills (GEARS) tool.
  • A Kruskal–Wallis test was used to evaluate performance differences between novices and experts (construct validity).
  • Spearman's correlation coefficient (ρ) was used to measure the association of performance across inanimate, simulation and in vivo methods (cross-method validity).

Results

  • Novice and expert surgeons had previously performed a median (range) of 0 (0–20) and 300 (30–2000) robotic cases, respectively (P < 0.001).
  • Construct validity: experts consistently outperformed residents with all three methods (P < 0.001).
  • Cross-method validity: overall performance of inanimate tasks significantly correlated with virtual reality robotic performance (ρ = −0.7, P < 0.001) and in vivo robotic performance based on GEARS (ρ = −0.8, P < 0.0001).
  • Virtual reality performance and in vivo tissue performance were also found to be strongly correlated (ρ = 0.6, P < 0.001).

Conclusions

  • We propose the novel concept of cross-method validity, which may provide a method of evaluating the relative value of various forms of skills education and assessment.
  • We externally confirmed the construct validity of each featured training tool.

Abbreviations

GEARS, Global Evaluative Assessment of Robotic Skills; FLS, Fundamentals of Laparoscopic Surgery

Introduction


With the growing adoption of minimally invasive surgery, many urology residency programmes have recognized the need to strengthen laparoscopic and robotic surgical training. Surveys of both US and Canadian urology trainees point to dissatisfaction in minimally invasive surgery exposure and training during residency [1, 2].

In recent years, attention has focused increasingly on the need for training tools for robotic surgery. Despite the rapid and widespread clinical adoption of robotic surgery, methods for training and establishing competency have been slow to develop. At present, no validated or standardized curriculum exists for training in basic robotic surgical skills [3]. As has happened with the Fundamentals of Laparoscopic Surgery (FLS) [4], significant efforts have been directed toward the development of basic robotic exercises for training in robotic surgery [5, 6]. Recently, virtual reality simulation has also been developed and validated in three predominant platforms: the Robotic Surgical Simulator (Simulated Surgical Systems, Williamsville, NY, USA) [7, 8]; the dv-Trainer (Mimic Technologies, Seattle, WA, USA) [9-13]; and the da Vinci Skills Simulator (Intuitive Surgical, Sunnyvale, CA, USA/Mimic Technologies) [14-17]. For clinical training and evaluation, a validated assessment tool of robotic skills, the Global Evaluative Assessment of Robotic Skills (GEARS), has been created to determine clinical competency and to monitor robotic skills acquisition, and has previously been validated in the clinical setting [18].

Traditional validation of training tools entails a stepwise progression of evaluation: face validity (realism of tool); content validity (utility as training tool); construct validity (ability to discern between novice and expert performance); concurrent validity (performance correlation with the ‘gold standard’). The challenge in robotic surgery today is the lack of an accepted ‘gold standard’ training method. Clinical robotic training (operating on real patients with expert supervision), as the default method of training, may not be the ideal starting point for novice robotic surgeons unfamiliar with the da Vinci interface. While concurrent validity traditionally assesses a novel training tool against the gold standard (by correlation of comparative performance), it may be more practical to evaluate a novel tool against other tools or methods that are being developed simultaneously. By correlating performance across different training methods, one may infer the relative utility of new robotic training tools (cross-method validity).

Inanimate tasks, virtual reality simulation, and in vivo training are three components of robotic training that have been independently developed and tested. In the present study, we externally evaluate these three standardized training methods for their construct validity and explore the concept of cross-method validity by correlating relative performance across the different methods.

Materials and Methods


Urology residents and expert urological robotic surgeons were recruited to participate in a hands-on training course, during which performance data were prospectively recorded, at the National Urology Residents Preceptorship in Laparoscopic and Robotic Surgery hosted at our institution in October 2011. Participants were categorized a priori as expert robotic surgeons or novices. Experts were defined as having completed ≥30 robotic cases as primary surgeon (the minimum threshold for the expert standard) [19], while novices/trainees were defined as having completed <30 cases. A protocol for training exercise data collection was approved by the institutional review board. In vivo animal exercises followed a training protocol approved by the Institutional Animal Care and Use Committee.

Three methods of robotic surgical training were selected for this study based on previous rigorous validation: four structured inanimate exercises [5], robotic simulation (virtual reality) [14-17], and in vivo robotic skills assessment (porcine model) using GEARS [18].

Structured Fundamental Inanimate Robotic Skills Tasks (FIRST)

We used a set of four standardized, inanimate training tasks designed to measure robotic skills [5]. The robotic tasks are shown in Fig. 1 and included bimanual horizontal mattress suturing, sharp resection of a clover pattern, peg placement on a three-dimensional dome platform, and needle placement around a circular target. Scoring was based on efficiency and precision: error penalties were combined with time to completion to generate an overall performance score, with a lower score representing better performance. A previous study has established face and construct validity of these tasks in a small cohort of experts and trainees [5].

Figure 1. Fundamental Inanimate Robotic Skills Tasks (FIRST).
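For illustration, the scoring scheme described above (time to completion plus error penalties, with a lower score being better) can be sketched as follows; the error categories and penalty weights are hypothetical placeholders, not the published values for these tasks.

```python
# Minimal sketch of a FIRST-style overall score: completion time plus weighted
# error penalties, where a lower score represents better performance.
# The error categories and penalty weights below are illustrative assumptions.

ERROR_PENALTY_SECONDS = {
    "missed_target": 30,    # hypothetical penalty per missed target
    "tissue_damage": 60,    # hypothetical penalty per tissue-damage event
    "dropped_needle": 15,   # hypothetical penalty per dropped needle
}

def first_task_score(time_seconds: float, errors: dict[str, int]) -> float:
    """Combine completion time with weighted error penalties (lower = better)."""
    penalty = sum(ERROR_PENALTY_SECONDS.get(kind, 0) * count
                  for kind, count in errors.items())
    return time_seconds + penalty

# Example: a 240-s run with two missed targets scores 300
print(first_task_score(240, {"missed_target": 2}))
```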

Virtual Reality Simulation

For the simulated component of our training programme, we incorporated the da Vinci Skills Simulator installed on the da Vinci Surgeon Console. Based on Mimic Technologies’ virtual reality software, the da Vinci Skills Simulator has been subjected to comprehensive face, content, construct, concurrent and predictive validation studies [14-17]. Participants completed four virtual reality exercises selected for their ability to discriminate between expert, intermediate and novice performance based on previous validation studies at our institution: Peg Board Level 2; Ring and Rail Level 2; Suture Sponge Level 3; and Tubes (Fig. 2).

Figure 2. Virtual reality: Mimic exercises on the da Vinci Skills Simulator.

In vivo Assessment Using GEARS

Participants in this study completed a standardized in vivo task designed to emulate a component of a surgical procedure in a porcine model using the da Vinci robot. The objective of the task was to suture a loop of bowel to the peritoneum overlying the kidney, secured by a square knot (Fig. 3). Performance was independently evaluated by two expert robotic surgeons using the six metrics of GEARS [18]: depth perception; bimanual dexterity; efficiency; force sensitivity; autonomy; and robotic control, for a maximum performance score of 30 points. Each expert reviewer was oriented to the GEARS grading system before scoring, and the mean of their composite scores was calculated for statistical analysis.

Figure 3. GEARS performance measurement of in vivo robotic tasks. (A) Study participant is instructed to secure a loop of small intestine to the peritoneum overlying the kidney. (B) Task is completed when the bowel is securely affixed.
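As a rough sketch of the GEARS scoring described above (six domains summed to a maximum of 30, with the two expert raters' composite scores averaged), the following assumes the standard 1-5 rating per domain; the function and field names are ours, not part of the published tool.

```python
# Illustrative GEARS composite calculation: six domains, each rated 1-5,
# summed to a maximum of 30; the two raters' composites are then averaged.

GEARS_DOMAINS = ("depth_perception", "bimanual_dexterity", "efficiency",
                 "force_sensitivity", "autonomy", "robotic_control")

def gears_composite(ratings: dict[str, int]) -> int:
    """Sum the six domain ratings into a composite score (maximum 30)."""
    assert set(ratings) == set(GEARS_DOMAINS), "all six domains must be rated"
    assert all(1 <= r <= 5 for r in ratings.values()), "ratings are 1-5"
    return sum(ratings.values())

def mean_of_raters(rater_a: dict[str, int], rater_b: dict[str, int]) -> float:
    """Average the two expert raters' composite scores, as used in the analysis."""
    return (gears_composite(rater_a) + gears_composite(rater_b)) / 2
```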

Participant performance data and questionnaire responses were prospectively collected and analysed. A 15-point pre-study questionnaire captured self-reported participant demographic information. All statistical analyses were completed using SAS, Version 9.2 (SAS Institute Inc., Cary, NC, USA). Kruskal–Wallis and Wilcoxon rank-sum tests were used to examine the differences in non-parametric continuous variables between novices and experts (construct validity [Fig. 4]). Spearman's rank correlation coefficient (ρ) was used to measure the association of performance across inanimate, simulation, and in vivo methods (cross-method validity). Statistical significance was set at P < 0.05 and all P values were two-sided.

Figure 4. Study design.
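The comparisons described above can be reproduced in open-source software; the sketch below uses SciPy equivalents of the SAS procedures (Kruskal–Wallis and rank-sum tests for construct validity, Spearman's ρ for cross-method correlation) with made-up placeholder scores.

```python
# Illustrative reproduction of the statistical tests described above,
# using SciPy in place of SAS. All score arrays are fabricated placeholders.
import numpy as np
from scipy import stats

novice_scores = np.array([985, 1020, 870, 1105])  # hypothetical inanimate scores
expert_scores = np.array([593, 540, 610])

# Construct validity: rank-based comparison of novice vs expert performance
h_stat, p_kw = stats.kruskal(novice_scores, expert_scores)
u_stat, p_rs = stats.mannwhitneyu(novice_scores, expert_scores,
                                  alternative="two-sided")

# Cross-method validity: Spearman's rho between two methods' scores
inanimate = np.array([985, 870, 740, 760, 593])
simulator = np.array([60, 75, 80, 68, 90])        # hypothetical percentile scores
rho, p_corr = stats.spearmanr(inanimate, simulator)

print(f"Kruskal-Wallis P = {p_kw:.3f}, rank-sum P = {p_rs:.3f}")
print(f"Spearman rho = {rho:.2f} (P = {p_corr:.3f})")
```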

Results


Demographics

A total of 49 participants completed all three training methods. The robotic surgical experience of the robotic novices (n = 38) and expert robotic surgeons (n = 11) is summarized in Table 1. While trainees had a median (range) of 1 (0–4) years of robotic experience, the expert surgeons had significantly more robotic experience (5 [2–9] years; P < 0.001). Similarly, trainees had completed a median (range) of 0 (0–20) robotic cases as primary surgeon, and experts had completed a median (range) of 300 (30–2000) cases (P < 0.001).

Table 1. Participant demographics

|                                               | Trainee: n = 38 | Expert: n = 11 | P      |
|-----------------------------------------------|-----------------|----------------|--------|
| Median (range) age, years                     | 31 (28–41)      | 42 (34–53)     | <0.001 |
| Gender                                        |                 |                |        |
| Median (range) years in practice              |                 | 7 (0–28)       |        |
| Median (range) postgraduate year of training  | 5 (3–8)         |                |        |
| Median (range) laparoscopic experience, years | 2 (0–6)         | 9 (2–15)       | <0.001 |
| Median (range) laparoscopic experience, cases | 8 (0–60)        | 425 (10–1500)  | <0.001 |
| Median (range) robotic experience, years      | 1 (0–4)         | 5 (2–9)        | <0.001 |
| Median (range) robotic experience, cases      | 0 (0–20)        | 300 (30–2000)  | <0.001 |

Individual Method Performance and Construct Validation

Experts consistently outperformed trainees in each method (P ≤ 0.005 [Table 2]), affirming the construct validity of each featured training tool. Expert surgeons outperformed novices overall on all four inanimate tasks, for which a lower performance score is better (593 vs 985, P < 0.001). Similarly, experts scored higher than novices on the da Vinci Skills Simulator, based on an algorithm-based composite percentile score (87 vs 73%, P < 0.001). Finally, experts outperformed novices on the in vivo exercise, based on expert scoring using GEARS (30 vs 20, P < 0.001).

Table 2. Construct validation: performance data

|                                                              | Trainee: n = 38     | Expert: n = 11     | P      |
|--------------------------------------------------------------|---------------------|--------------------|--------|
| Median (sd; range) inanimate tasks performance score*        | 985 (210; 492–1347) | 593 (124; 348–700) | <0.001 |
| Median (sd; range) robotic simulator, mean score†            | 73 (10; 56–92)      | 87 (12; 77–94)     | <0.001 |
| Median (sd; range) in vivo task based on GEARS, total score† | 20 (5; 11–27)       | 30 (0.4; 29–30)    | <0.001 |

*Lower score is better. †Higher score is better.

Correlation Across Methods

Overall performance of inanimate tasks significantly correlated with virtual reality robotic performance (ρ = −0.7, P < 0.001) and in vivo robotic performance based on GEARS (ρ = −0.8, P < 0.001 [Table 3]). Simulation performance and in vivo performance were also found to be strongly correlated (ρ = 0.6, P < 0.002).

Table 3. Cross-method validation: performance correlation data. Spearman's correlation coefficient (ρ) between composite/individual exercises and other methods

|                                                    | Fundamental Inanimate Robotic Skills Tasks (FIRST) | Virtual reality simulation | In vivo task using GEARS scoring |
|----------------------------------------------------|----------------------------------------------------|----------------------------|----------------------------------|
| Fundamental Inanimate Robotic Skills Tasks (FIRST) |                                                    | −0.7                       | −0.8                             |
| Horizontal mattress suture                         |                                                    | −0.6                       | −0.8                             |
| Pattern cut                                        |                                                    | −0.6                       | −0.5                             |
| Three-dimensional ‘Dome and Peg’                   |                                                    | −0.7                       | −0.7                             |
| Circular target                                    |                                                    | −0.5                       | −0.7                             |
| Virtual reality simulation                         | −0.7                                               |                            | 0.6                              |
| Peg Board                                          | NS                                                 |                            | NS                               |
| Suture Sponge                                      | −0.6                                               |                            | 0.5                              |
| Ring and Rail                                      | −0.5                                               |                            | 0.6                              |
| Tubes                                              | −0.6                                               |                            | 0.6                              |
| In vivo task using GEARS scoring                   | −0.8                                               | 0.6                        |                                  |

All relationships P < 0.001, except where designated. NS, not significant.

Table 3 gives further details of the correlations of individual inanimate and virtual reality exercises with the composite scores of the other methods. Performance of each inanimate task had significant correlations with the mean simulation scores (ρ = −0.5 to −0.7, P < 0.003). Likewise, each inanimate task significantly correlated with the in vivo method (ρ = −0.5 to −0.8, P < 0.002). In particular, performance of the three-dimensional ‘Dome and Peg’ inanimate exercise was strongly and significantly correlated with both virtual reality and in vivo robotic performance (ρ = −0.7 for both, P < 0.001). The strongest correlation was noted between inanimate task performance and in vivo robotic performance (ρ = −0.8, P < 0.001).

Performance on three of the four simulator exercises significantly correlated with the composite inanimate scores (ρ = −0.5 to −0.6, P < 0.001) and in vivo scores (ρ = 0.5 to 0.6, P < 0.001).

Discussion


To our knowledge, this is the first study in a single setting to simultaneously correlate the performance of expert and novice/trainee surgeons across inanimate, virtual reality and, in particular, in vivo platforms. Cross-correlation of individually validated tools is a novel concept (cross-method validity) that we propose as a method to provide comparative assessment of novel training tools and to establish internal consistency of a training curriculum; by the latter, we mean that each component of the educational programme is inter-related and directly supports the global goal of robotic surgical skills acquisition. In the present study, we confirmed the construct validity of each of the three robotic training methods and demonstrated significant cross-method correlation in a diverse cohort of both expert surgeons and novice operators.

External validation of construct validation outcomes in large, diverse cohorts is a valuable exercise, especially for emerging and novel training methods, before their widespread adoption. Additionally, while a traditional validation step may compare a novel method with an established gold standard method (concurrent validity), robotic training has thus far lacked an established training method. With several training methods being explored and developed simultaneously (i.e. inanimate, virtual reality and in vivo), an alternative or additional validation step may be to prospectively compare them against each other: cross-method validity. While it may seem intuitive that training methods demonstrating construct validity would also have cross-method validity between them, such a relationship has not previously been demonstrated.

As individual training tools are developed for robotic surgery, residency programmes must determine which of these should be integrated into their robotic surgery curriculum. The American College of Surgeons has proposed a model for surgical skills acquisition, which includes expert demonstration and error avoidance, proficiency-based practice, and structured assessment [20]. As part of the pre-clinical component of robotic training, we propose a multi-method approach (inanimate, virtual reality, and in vivo) to robotic surgery training, which would provide trainees with the opportunity to develop and demonstrate their proficiency with basic robotic surgical skills before proceeding to the clinical arena.

Inanimate training requires minimal additional cost once a robot is available at an institution. Simply constructed homemade materials or, preferably, validated standardized task kits (e.g. the Fundamental Inanimate Robotic Skills Tasks described in the present study) can be used to provide hands-on experience with the controls and handling of the robotic instruments. The featured inanimate exercises are analogous to the widely practised FLS tasks for laparoscopic training. Standardized, validated training tasks and methods of evaluation are important for establishing consistent performance outcomes. One limitation of this method is the requirement of an available robotic system for training, which may pose an accessibility challenge at high-volume centres.

Virtual reality simulation is a novel and emerging method for robotic surgery training. While it involves an additional cost for either a stand-alone simulator (the Robotic Surgical Simulator or the dv-Trainer) or an add-on simulation unit for the robotic surgeon console (da Vinci Skills Simulator), it can allow familiarization with the robotic interface and facilitate training of basic surgical skills (i.e. needle handling). Currently, all commercially available simulators that have been extensively validated in the literature [7-17] are limited to basic skills training. Virtual reality has the potential to play a larger role in training once cognitive-based and procedure-specific modules (i.e. prostatectomy and partial nephrectomy) are developed. Initial validation studies already suggest that extended simulation training of basic skills has an impact on real tissue surgical performance [16].

In vivo training in the animal model is perhaps the most sophisticated training method before intraoperative clinical training. High-fidelity simulation becomes important once basic skills have been acquired and procedural learning begins, but it is expensive, requiring a dedicated training robot and an animal facility that few programmes can afford. Because of cost constraints, in vivo animal training is likely to be limited to advanced procedural training at select centres. At institutions where a robotic animal laboratory is available, in vivo training should be performed and assessed using a standardized assessment tool to track progress and proficiency. GEARS has previously been validated in the clinical model [18]. In the present study, GEARS was used to assess performance on a standardized task that required both delicate tissue handling and suturing, which is currently not replicable in inanimate or virtual reality training. We confirmed that this assessment tool can reliably differentiate trainee and robotic expert performance (P < 0.001). Future efforts will be directed toward the development and validation of procedure-based competency assessment tools, including hilar dissection, tumour excision and renorrhaphy for partial nephrectomy, and bladder neck transection, pedicle control, nerve dissection and reconstruction for prostatectomy.

Clinical training involving patients should follow the establishment of proficiency with the above-mentioned methods. There are several challenges to robotic clinical training, most stemming from the robotic interface, which precludes hands-on teaching and limits control to a single surgeon. Clinical assessment tools, such as GEARS [18], can provide informative feedback on trainee performance and serve as a method to evaluate the clinical outcome of a skills training programme.

In the present study, we provide a novel assessment method (cross-method validity) that may help in the development of an integrated robotic surgery curriculum. To maximize efficiency, synchronized development of standardized training tools into a training curriculum is needed, with best practices established through rigorous, performance data-based validation efforts. We propose evaluation of standard trainee-expert performance characteristics on new training tools using traditional forms of validation (i.e. face, content, construct and concurrent validities) as well as comparative assessment (cross-method validity) as described in the present study. While training and assessment are two different domains, they are integral components of the education process, and a well-designed and validated training tool may also be used as an assessment tool. As further performance data are accumulated for different levels of experience, proficiency benchmarks for skills evaluation can be generated.

The different methods, whether inanimate, virtual reality or in vivo, should concurrently target the same global skill sets; therefore, observation of performance correlation across methods is relevant and can serve to internally evaluate the individual components of the training programme for their value both as training and as assessment tools. For example, the simplest featured virtual reality task, ‘Peg Board’, probably has limited utility as it did not exhibit a significant correlation with the other two methods (inanimate and in vivo), while performance of the three more sophisticated tasks demonstrated significant correlation with the other methods (Table 3). In addition, performance of the inanimate tasks showed the strongest correlation with in vivo robotic performance, supporting the important role of such inanimate exercises in a robotic training programme. Accordingly, cross-method correlation can be used to select the most useful training and assessment tools when constructing a robotic training curriculum. Further efforts are needed to demonstrate the ability of different training methods to result in better clinical outcomes (predictive validity).

The present study is not without limitations. Its primary limitation is that it provides a static, snapshot assessment of robotic training in a single, although broad, cohort. Longitudinal studies to assess the impact of the training programme on robotic skill acquisition are needed. Another limitation may be the lack of comparison of deconstructed skills across methods; this reflects a limitation of today's robotic training tools, i.e. the lack of equally developed assessment metrics. For example, current limitations of virtual reality simulation prevent certain skills comparisons (i.e. suturing and realistic tissue deformation). Furthermore, inanimate task training, even FLS, presently lacks validated complex metrics beyond time and error counts. We expect that, with growing attention and sophistication in training tools, these limitations will be addressed. The present study provides a global comparative assessment of skills.

Efforts to integrate a robotics training programme with didactic and cognitive components are under way, incorporating the three methods and prospectively tracking trainee performance over time. As minimum standards of proficiency are defined, the tools used in the present study may assist in establishing benchmarks for competency and credentialing for robotic surgical privileging. Further validation of the cross-method concept is actively being pursued, with broader application across multiple novel training platforms.

In conclusion, the present findings externally confirm the construct validity of the featured training methods and demonstrate a significant performance correlation across virtual reality, inanimate, and in vivo settings. We present the concept of cross-method validation of individual training tasks, which may provide a method of comparatively evaluating novel tools developed for robotic surgery training.

Conflict of Interest


None declared. Funding for the study was received from Ethicon Inc. and Intuitive Surgical.

References

  1. Duchene DA, Moinzadeh A, Gill IS, Clayman RV, Winfield HN. Survey of residency training in laparoscopic and robotic surgery. J Urol 2006; 176: 2158–2166; discussion 2167
  2. Preston MA, Blew BD, Breau RH, Beiko D, Oake SJ, Watterson JD. Survey of senior resident training in urologic laparoscopy, robotics and endourology surgery in Canada. Can Urol Assoc J 2010; 4: 42–46
  3. Lee JY, Mucksavage P, Sundaram CP, McDougall EM. Best practices for robotic surgery training and credentialing. J Urol 2011; 185: 1191–1197
  4. Sroka G, Feldman LS, Vassiliou MC, Kaneva PA, Fayez R, Fried GM. Fundamentals of laparoscopic surgery simulator training to proficiency improves laparoscopic performance in the operating room - a randomized controlled trial. Am J Surg 2010; 199: 115–120
  5. Goh AC, Joseph RA, O'Malley M, Miles BJ, Dunkin BJ. Development and validation of inanimate tasks for robotic surgical skills assessment and training. J Urol 2010; 183 (2010 Suppl.): 516
  6. Arain NA, Dulan G, Hogg DC et al. Comprehensive proficiency-based inanimate training for robotic surgery: reliability, feasibility, and educational benefit. Surg Endosc 2012; 26: 2740–2745
  7. Seixas-Mikelus SA, Kesavadas T, Srimathveeravalli G, Chandrasekhar R, Wilding GE, Guru KA. Face validation of a novel robotic surgical simulator. Urology 2010; 76: 357–360
  8. Seixas-Mikelus SA, Stegemann AP, Kesavadas T et al. Content validation of a novel robotic surgical simulator. BJU Int 2011; 107: 1130–1135
  9. Sethi AS, Peine WJ, Mohammadi Y, Sundaram CP. Validation of a novel virtual reality robotic simulator. J Endourol 2009; 23: 503–508
  10. Kenney PA, Wszolek MF, Gould JJ, Libertino JA, Moinzadeh A. Face, content, and construct validity of dV-trainer, a novel virtual reality simulator for robotic surgery. Urology 2009; 73: 1288–1292
  11. Lerner MA, Ayalew M, Peine WJ, Sundaram CP. Does training on a virtual reality robotic simulator improve performance on the da Vinci surgical system? J Endourol 2010; 24: 467–472
  12. Korets R, Mues AC, Graversen JA et al. Validating the use of the Mimic dV-trainer for robotic surgery skill acquisition among urology residents. Urology 2011; 78: 1326–1330
  13. Lee JY, Mucksavage P, Kerbl DC, Huynh VB, Etafy M, McDougall EM. Validation study of a virtual reality robotic simulator - role as an assessment tool? J Urol 2012; 187: 998–1002
  14. Finnegan KT, Meraney AM, Staff I, Shichman SJ. da Vinci Skills Simulator construct validation study: correlation of prior robotic experience with overall score and time score simulator performance. Urology 2012; 80: 330–335
  15. Hung AJ, Zehnder P, Patil MB et al. Face, content and construct validity of a novel robotic surgery simulator. J Urol 2011; 186: 1019–1024
  16. Hung AJ, Patil MB, Zehnder P et al. Concurrent and predictive validation of a novel robotic surgery simulator: a prospective, randomized study. J Urol 2012; 187: 630–637
  17. Kelly DC, Margules AC, Kundavaram CR et al. Face, content, and construct validation of the Da Vinci Skills Simulator. Urology 2012; 79: 1068–1072
  18. Goh AC, Goldfarb DW, Sander JC, Miles BJ, Dunkin BJ. Global evaluative assessment of robotic skills: validation of a clinical assessment tool to measure robotic surgical skills. J Urol 2012; 187: 247–252
  19. Patel VR, Tully AS, Holmes R, Lindsay J. Robotic radical prostatectomy in the community setting - the learning curve and beyond: initial 200 cases. J Urol 2005; 174: 269–272
  20. Scott DJ, Dunnington GL. The new ACS/APDS Skills Curriculum: moving the learning curve out of the operating room. J Gastrointest Surg 2008; 12: 213–221