Tubal evaluation in the investigation of subfertility: A structured comparison of tests
Fallopian tube obstruction is thought to play a role in 12% to 33% of subfertile couples.1,2 Assessing the patency of the fallopian tubes is, therefore, an important part of the work-up of a subfertile couple. There are several tests available for this purpose, including hysterosalpingography, laparoscopy and dye test, selective salpingography and hysterosalpingo-contrast sonography (Table 1). These tests differ in inter- and intra-observer reliability,3 diagnostic accuracy4 in predicting blockage or other tubal disease, prognostic information for treatment-independent pregnancy, potential complications and costs. Moreover, some of these tests allow assessment of tubal function (e.g. measurement of tubal perfusion pressures during selective salpingography5), permit evaluation of other pelvic pathology that may have an impact on fertility (e.g. laparoscopy to assess for endometriosis)6 or improve pregnancy rates in their own right (e.g. oil-based hysterosalpingography7). Currently, there is wide variation in the choice, combination and ordering of tubal evaluation tests among assisted conception units. While there could be several reasons for such variation in practice, one reason may be a lack of collated information that allows an easy comparison of the available tests. We, therefore, reviewed tubal evaluation tests for their inter- and intra-observer reliability, accuracy in predicting tubal blockage and disease, prognostic value for treatment-independent pregnancies, effectiveness in improving pregnancy outcome and possible complications. We limited our review to visual tests only; non-visual tests such as chlamydial antibody testing were not considered.
Table 1. Tubal evaluation in the investigation of subfertility: a structured comparison of tests.
|Hysterosalpingography||Inter-observer reliability8||Reference standard: laparoscopy and dye test|
|Proximal tubal occlusion||0.85||20 diagnostic accuracy studies11 found LR +ve varied between 1.7 and 72, and LR −ve between <0.1 and 0.7 for diagnosis of tubal patency. Three studies judged HSG and laparoscopy independently. Pooled meta-analysis of these gave a sensitivity of 65% and specificity of 83% for diagnosis of tubal patency. Accuracy for detection of peritubal disease was poor.|
|Distal tubal occlusion||0.69|
|Intra-observer reliability8||Typical likelihood ratios:|
|Proximal tubal occlusion||0.89||Proximal tubal occlusion||6.0||0.60|
|Distal tubal occlusion||0.72||Distal tubal occlusion||2.1||0.43|
|Peritubal adhesions||0.65||Peritubal adhesions||1.8||0.61|
|Laparoscopy and Dye test||No data identified.||This is considered the reference standard in evaluating tubal patency. There is some justification for this as a large prospective study found the prognostic information from laparoscopy and dye test is better than HSG13 (see Table 2).|
|Selective salpingography and tubal catheterisation||No data identified.||SS has been compared to laparoscopy and dye test in a randomised controlled trial,34 although published data do not allow the calculation of summary accuracy measures. The authors reported that:|
| ||1. SS is better at predicting proximal tubal occlusion.|
|2. There is no difference in predicting distal tubal occlusion.|
|3. Laparoscopy and dye test is better in predicting peritubal disease.|
|Hysterosalpingo-contrast sonography||Intra-observer reliability40||Reference standard: laparoscopy and dye test|
|Weighted kappa for bilateral patency, unilateral patency and bilateral occlusion:||LRs for tubal occlusion from two large diagnostic accuracy studies41,42 that had prospective and consecutive enrolment are given below:|
|Kappa for right tube for patency or occlusion:|
|Kappa for left tube for patency or occlusion:|
|Inter-observer reliability||Study||LR +ve||LR −ve|
|No data identified.||Hamilton et al.41||7.3||0.33|
| ||Dijkman et al.42||2.9||0.75|
| ||The accuracy of HyCoSy may depend on experience. Dijkman et al.42 recalculated the LRs omitting the first 50 cases, and found the accuracy of HyCoSy was similar to that of HSG.|
|Falloposcopy||No data identified.||No data identified.|
|Fertiloscopy||No data identified.||A prospective comparison with laparoscopy reported the following agreement# between the two tests:|
|Right tube (0.91), left tube (0.86), right ovary (0.80), left ovary (0.84), peritoneum (0.78), uterus (0.86), tubal patency (0.80)|
This review summarises the published literature on the subject. However, as this review addresses a broad range of features of a number of tests, we focussed on representativeness rather than on comprehensiveness in our literature searches. Whenever systematic reviews existed to inform aspects of tests, we used these rather than performing our own reviews. We did not undertake updating of any existing reviews. When reviews did not exist, we focussed on large and valid studies rather than aiming to capture all the literature on the subject. Searches for systematic reviews and primary articles were conducted in the Cochrane Library (2003: issue 4), MEDLINE (1966–2004) and EMBASE (1980–2004) using a combination of keywords to represent tests (‘hysterosalpingo*’, ‘laparoscopy and dye’, ‘chromotubation’, ‘hydrotubation’, ‘dye insufflation’, ‘chromopertubation’, ‘chromolaparoscopy’, ‘selective salpingography’, ‘tubal catheterisation’, ‘tubal catheterization’, ‘hysterosalpingocontrast sonography’, ‘salpingosonography’, ‘hycosy’, ‘salpingoscopy’, ‘falloposcopy’ and ‘fertiloscopy’) and test features (e.g. reliability, reproducibility, accuracy, sensitivity, specificity, likelihood ratios, effects, effectiveness, adverse events and complications).
Reliability is defined as the extent to which repeated measurements under similar conditions on the same individual or test material agree with each other.3 Reliability has several near-synonyms including repeatability, reproducibility and consistency, and is broadly divided into ‘inter-observer reliability’ when two or more observers compare measurements, and ‘intra-observer reliability’ when one observer makes measurements on two or more occasions. Accuracy is defined as the extent to which the test actually measures what it is purporting to measure, and is obtained from a comparison of the index test to a recognised reference or ‘gold’ standard.4 Prognosis is defined as the capacity of a particular test to predict the occurrence of treatment-independent pregnancy. Effectiveness relates to changes in clinically important outcomes such as pregnancy rates related to the use of a test. Randomised controlled trials would represent the best study design for assessing effectiveness. However, in the absence of trials, we considered other forms of evidence such as cohort studies. Complications included major and minor adverse events.
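To make these definitions concrete, the following minimal Python sketch shows how the two summary measures used in Table 1 are commonly computed: Cohen's kappa for agreement between two readings, and likelihood ratios from sensitivity and specificity. The function names and the agreement counts are hypothetical and chosen only for illustration; the sensitivity (65%) and specificity (83%) are the pooled hysterosalpingography values quoted in Table 1.

```python
def cohen_kappa(both_abnormal, only_first, only_second, both_normal):
    """Cohen's kappa for two binary readings of the same cases.

    Arguments are the four cells of the 2x2 agreement table
    (hypothetical counts in the example below).
    """
    n = both_abnormal + only_first + only_second + both_normal
    observed = (both_abnormal + both_normal) / n
    first_abnormal = (both_abnormal + only_first) / n
    second_abnormal = (both_abnormal + only_second) / n
    # Agreement expected by chance, from the marginal proportions.
    expected = (first_abnormal * second_abnormal
                + (1 - first_abnormal) * (1 - second_abnormal))
    # Undefined when expected == 1 (all readings in one category).
    return (observed - expected) / (1 - expected)


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-); undefined when specificity is exactly 0 or 1."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical agreement table for 100 films read twice.
kappa = cohen_kappa(both_abnormal=40, only_first=5, only_second=5, both_normal=50)
# Pooled HSG sensitivity and specificity as quoted in Table 1.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.65, specificity=0.83)
print(f"kappa = {kappa:.2f}")
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```

With these inputs the pooled figures correspond to an LR +ve of about 3.8 and an LR −ve of about 0.42.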
Studies were selected in a two-stage process. Firstly, the titles and abstracts from the electronic searches were scrutinised and full manuscripts of all citations that were likely to be relevant were obtained. Secondly, final inclusion or exclusion decisions were made on examination of the full manuscripts. All reviews and primary studies were appraised for quality (findings not reported—available from authors on request) and only those that were judged to have avoided serious bias were included in this review. As this was a subjective assessment, disagreements about study inclusion occurred, and such disagreements were resolved by consensus between the reviewers.
The included reviews and primary studies were summarised and necessary quantitative results were extracted from the articles to inform the various predefined aspects of the tests we were interested in.
Hysterosalpingography is an X-ray based contrast test that assesses the uterine cavity and the fallopian tubes. Reliability between observers (inter-observer reliability) is almost perfect for proximal occlusion, substantial for distal obstruction and hydrosalpinx and moderate to poor for adhesions8,9 (Table 1). Reliability within observers (intra-observer reliability) is almost perfect for proximal tubal occlusion, and substantial for distal obstruction, hydrosalpinx and adhesions8 (Table 1). A comparative study assessing the reliability of clinicians versus radiologists for detecting abnormalities on hysterosalpingography films found that clinicians were more reliable in diagnosing tubal obstruction and hydrosalpinx, while radiologists more reliably detected subtler findings such as uterine adhesions.10
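The descriptive labels used for reliability (‘almost perfect’, ‘substantial’, ‘moderate’, ‘fair’) match the widely used Landis and Koch bands for kappa. The review does not name the scale it applied, so the mapping below is an assumption; the sketch applies it to the hysterosalpingography kappa values listed in Table 1.

```python
def interpret_kappa(kappa):
    """Map a kappa value to a Landis and Koch descriptive label.

    Assumption: the review's wording follows these conventional bands.
    """
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# HSG kappa values as listed in Table 1.
table_1_kappas = {
    "inter-observer, proximal occlusion": 0.85,
    "inter-observer, distal occlusion": 0.69,
    "intra-observer, proximal occlusion": 0.89,
    "intra-observer, distal occlusion": 0.72,
    "intra-observer, peritubal adhesions": 0.65,
}
for finding, kappa in table_1_kappas.items():
    print(f"{finding}: {kappa:.2f} ({interpret_kappa(kappa)})")
```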
An abnormal hysterosalpingography has moderate accuracy in predicting (ruling in) proximal tubal blockage and hydrosalpinges11 (Table 1). Both abnormal and normal hysterosalpingography test findings have poor accuracy in ruling in or ruling out distal tubal obstruction and peritubal adhesions.11 Unlike the abnormal test, a normal hysterosalpingography has poor accuracy in ruling out a hydrosalpinx11 (Table 1).
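The practical meaning of ‘ruling in’ and ‘ruling out’ can be illustrated by converting a pre-test probability into a post-test probability with a likelihood ratio. The sketch below uses the typical values for proximal tubal occlusion in Table 1, read here as an LR +ve of 6.0 and an LR −ve of 0.60; the 20% pre-test probability is a hypothetical figure chosen only for the example.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


pre_test = 0.20  # hypothetical pre-test probability of proximal occlusion
after_abnormal = post_test_probability(pre_test, 6.0)   # LR +ve from Table 1
after_normal = post_test_probability(pre_test, 0.60)    # LR -ve from Table 1
print(f"after abnormal HSG: {after_abnormal:.0%}")
print(f"after normal HSG: {after_normal:.0%}")
```

Under this assumed starting point, an abnormal result raises the probability to about 60%, whereas a normal result lowers it only to about 13%, which is consistent with the test being more useful for ruling in than for ruling out.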
The treatment-independent pregnancy rate for one-sided tubal occlusion on hysterosalpingography is reduced slightly, with a fecundity rate ratio of 0.80, whereas in the case of bilateral occlusion at hysterosalpingography, the chance of pregnancy is substantially decreased, with a fecundity rate ratio between 0.30 and 0.49 (Table 2).12–14
Table 2. Prognosis, effectiveness and complications of tubal evaluation tests for the investigation of subfertility: A structured comparison.
|Hysterosalpingography||The adjusted fecundity rate ratio (FRR) for one-sided tubal occlusion at HSG was 0.80. For two-sided tubal occlusion the FRR varies between 0.30 and 0.49.12,13||HSG oil-soluble media vs no intervention 7:||Pelvic infection (1–3%)18|
|Pregnancy: OR 3.57||Intravasation (6.9%)17|
|(95% CI 1.8 to 7.2) [2 RCTs]||Lipo-granuloma (a chronic inflammatory reaction in the tubes)|
|HSG water-soluble media vs no intervention: no RCTs7|| |
|HSG oil-soluble media vs water-soluble media7:|| |
|Live birth: OR 1.5 (95% CI 1.1 to 2.1) [2 RCTs]|| |
|Pregnancy: OR 1.2 (95% CI 0.95 to 1.6) [5 RCTs]|| |
|Water-soluble plus oil-soluble media vs water-soluble media alone. No benefit [2 RCTs].7|| |
|Laparoscopy and Dye test||The adjusted FRR of one-sided tubal occlusion at laparoscopy was 0.51, whereas two-sided tubal occlusion showed an FRR of 0.15.13||No data identified.||Several large observational studies22–26 examining complications in gynaecological laparoscopy exist:|
|Vascular injury: 0.2/1000|
|Bowel injury: 0.4–0.7/1000|
|Urological injury: 0.1/1000|
|Methylene blue toxicity.|
|Selective salpingography and tubal catheterisation||Tubal perfusion pressures (TPP) measured at SS can classify women into three prognostic groups5:||No randomised evidence exists. A systematic review35 of 9 observational studies found that in women with proximal tubal blockage, SSTC achieved ongoing pregnancy (>20 weeks) rates between 9% and 40%.||Tubal perforation has been reported in 0% to 10% of cases.37,38 However, perforations normally require no additional treatment.|
|1. ‘good’ [both tubes <300, or one tube <300 and the other 300–500 mmHg]||Women with TPP in the ‘poor’ range (see previous column)5 who, following TC, moved to the ‘good’ prognostic group achieved an age-adjusted 4 year, cumulative non-IVF/ICSI conception rate of 92%.36 Following TC, if women remained ‘poor’ or only moved to ‘mediocre’ from an original ‘poor’, then their age-adjusted 4 year, cumulative non-IVF/ICSI conception rates remained at around 45%.36||Infection|
|2. ‘mediocre’ [both tubes 300–500, or one tube <300 and the other >500 mmHg], and|| ||Minor bleeding|
|3. ‘poor’ [both tubes >500, or one tube 300–500 and the other >500 mmHg]|| ||Vasovagal reactions|
| || ||Ectopic pregnancy (probably due to tubal disease rather than SSTC)|
|Age-adjusted 4 year, cumulative non-IVF/ICSI conception rates were 74%, 56% and 30% respectively.5|| ||The excess risks of cancer for women due to the radiation exposure during SSTC have been estimated at four to thirteen per million procedures.65|
|Hysterosalpingo-contrast sonography||No data identified.||No data identified.||Abdominal and shoulder pain: 7–1044,66|
| || ||Vasovagal reaction|
|Falloposcopy||A score47 based on the location, nature and extent of tubal luminal disease gave the following result48:||There is evidence that re-canalisation of proximally occluded tubes can be achieved in up to 53% of tubes.50||Pin-point perforation of the tube: 5.1%48|
|Score||Spontaneous pregnancy rate|| || |
|<20||28%|| || |
|21–30||12%|| || |
|>30||0%|| || |
|Fertiloscopy||No data identified.||Rectal perforation56|
| ||Injury to the uterus|
| ||Epiploon hernia|
Hysterosalpingography can be performed with water- or oil-soluble contrast media. A Cochrane review found that hysterosalpingography with oil-soluble contrast media is associated with higher pregnancy rates than either no intervention or hysterosalpingography with water-soluble contrast media7 (Table 2). This improvement in pregnancy rates is likely to occur soon after the procedure; there is evidence that the benefit diminishes beyond 12 months post-procedure.15 The odds of obtaining satisfactory images are reduced with oil-soluble contrast media when compared with water-soluble contrast media for both the uterine cavity (OR 0.28, 95% CI 0.16–0.48) and the tubal ampulla (OR 0.05, 95% CI 0.03–0.07).7 Moreover, the odds of intravasation are higher with oil-soluble contrast media when compared with water-soluble contrast media (OR 8.5, 95% CI 3.3–21.9),7 although there were no significant sequelae from this. Although oil-soluble contrast media were associated with reports of anaphylaxis and deaths in the early days of their use, severe adverse events have become virtually absent since the introduction of fluoroscopy.16,17 None of the participants in the eight randomised controlled trials in the Cochrane review suffered any serious adverse events.7
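For readers less familiar with the odds ratios and confidence intervals quoted above, the sketch below shows how an odds ratio and an approximate 95% confidence interval can be derived from a single two-by-two table using the standard log odds ratio (Woolf) method. The counts are hypothetical, and this single-table calculation is not the pooling method of the Cochrane review, which combines several trials.

```python
import math


def odds_ratio_with_ci(events_a, non_events_a, events_b, non_events_b, z=1.96):
    """Odds ratio of group A versus group B with a Woolf-type interval.

    All four cells must be non-zero for the standard error to be defined.
    """
    odds_ratio = (events_a * non_events_b) / (non_events_a * events_b)
    standard_error = math.sqrt(
        1 / events_a + 1 / non_events_a + 1 / events_b + 1 / non_events_b
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical trial: 30/100 pregnancies with intervention, 10/100 without.
estimate, lower, upper = odds_ratio_with_ci(30, 70, 10, 90)
print(f"OR {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1, as in this hypothetical table, indicates a difference unlikely to be due to chance alone; an interval that spans 1, such as the pregnancy OR of 1.2 (0.95 to 1.6) in Table 2, does not.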
The main complication of hysterosalpingography is the occurrence of pelvic infection, which is reported in 1–3% of all cases.18 The risk of infection can be decreased by an adequate medical history for previous infections, chlamydial testing of the cervix and prophylactic antibiotics.19
As hysterosalpingography is reliable, reasonably accurate (for detecting proximal tubal disease, but not distal tubal disease) and is associated with improved pregnancy rates, as well as being generally safe and inexpensive (see Table 2), a recent National Institute for Clinical Excellence (NICE, UK) guideline has recommended that ‘women who are not known to have comorbidities (such as pelvic inflammatory disease, previous ectopic pregnancy or endometriosis) should be offered hysterosalpingography to screen for tubal occlusion’.20
Laparoscopy and dye test
Laparoscopy and dye test (also termed dye hydrotubation, dye insufflation, dye pertubation, chromopertubation or chromolaparoscopy) is widely considered the gold standard test for investigating tubal patency. Additionally, it allows assessment for peritubal disease, adhesions and endometriosis. This has led to a recommendation by the NICE (UK) that women suspected of having comorbidities (such as endometriosis and pelvic inflammatory disease) should undergo laparoscopy so that pelvic and tubal pathology can both be assessed.20
We identified no studies of reliability of laparoscopy and dye test in the assessment of tubal patency, and this is probably due to the fact that performing repeated invasive surgical procedures to assess reliability is likely to be ethically unacceptable. There is good evidence laparoscopic findings have better prognostic value in predicting fertility compared with hysterosalpingography13,21(Table 2), thus supporting the use of laparoscopy and dye test as the reference test in studies evaluating tests for tubal patency. However, women do conceive after laparoscopy has demonstrated bilateral tubal occlusion, thus providing evidence that laparoscopy and dye test can give false results, for example, a false positive result with a temporary tubal spasm or poor operator technique, and thus may not be the ideal reference test.
Apart from the assessment of the fallopian tubes, laparoscopy also allows the inspection of the pelvis for endometriosis. Minimal to mild endometriosis can be treated which can result in improved fertility prospects.6 However, laparoscopy is an invasive procedure and is associated with morbidity22–26 and, very rarely, mortality.22–24 Only a few small studies have specifically examined the incidence of complications from laparoscopy and dye test. However, data from several large observational studies examining complications of general gynaecological laparoscopy may be applicable to laparoscopy and dye test.22–26 Complication rates for diagnostic laparoscopy have been reported to be between 0.06% and 0.20%, with the most significant complications being vascular, intestinal and urological injuries22–26(Table 2). Anaesthetic complications and methylene blue toxicity27 have also been reported, but are rare.
Selective salpingography and tubal catheterisation
The first attempt to unblock a fallopian tube transcervically was with the use of a whalebone guided by tactile sensation, reported by Smith in 1849.28 Transcervical tubal cannulation has since been achieved under hysteroscopic,29 sonographic30 or fluoroscopic guidance,31–33 although fluoroscopic selective salpingography and tubal catheterisation are the most common procedures. In selective salpingography, each tube is cannulated and then flushed with a contrast agent. If proximal tubal blockage is identified, a guide-wire is passed through the selective salpingography catheter into the fallopian tube to achieve recanalisation (tubal catheterisation). Thus, a see-and-treat approach is possible in an outpatient setting with selective salpingography and tubal catheterisation.
We identified no studies that evaluated the reliability of selective salpingography and tubal catheterisation. There is evidence from a randomised controlled trial that selective salpingography is more accurate than laparoscopy and dye test in predicting proximal tubal occlusion, that the two tests are equivalent in predicting distal tubal occlusion, and that laparoscopy and dye test is better than selective salpingography in predicting peritubal disease.34
The effectiveness of selective salpingography and tubal catheterisation in improving fertility rates has not been evaluated in randomised trials. However, there is substantial observational evidence35 to support the effectiveness of selective salpingography and tubal catheterisation in women with proximal tubal occlusion, prompting a NICE guideline20 recommendation to use selective salpingography and tubal catheterisation (or hysteroscopic tubal cannulation) for women with proximal tubal occlusion. Although tubes may be shown to be patent anatomically, they may still have poor function.5,36 Tubal perfusion pressure, which may be an indicator of tubal function, has been shown to have prognostic value in predicting conception,5 and reductions in tubal perfusion pressure with tubal catheterisation are associated with improved conception rates36(Table 2).
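The prognostic grouping by tubal perfusion pressure described in Table 2 can be written as a small classification rule. The sketch below follows the thresholds and the age-adjusted 4 year cumulative non-IVF/ICSI conception rates given in the table; the function names are hypothetical, and treating readings of exactly 300 or 500 mmHg as part of the 300–500 band is an assumption, since the table does not state how boundary values were handled.

```python
def pressure_band(mmhg):
    """Band a single tube's perfusion pressure: 0 (<300), 1 (300-500), 2 (>500)."""
    if mmhg < 300:
        return 0
    if mmhg <= 500:
        return 1
    return 2


# Pairs of bands (sorted) mapped to the prognostic groups in Table 2.
GROUP_BY_BANDS = {
    (0, 0): "good",
    (0, 1): "good",
    (1, 1): "mediocre",
    (0, 2): "mediocre",
    (1, 2): "poor",
    (2, 2): "poor",
}

# Age-adjusted 4 year cumulative non-IVF/ICSI conception rates from Table 2.
CONCEPTION_RATE = {"good": 0.74, "mediocre": 0.56, "poor": 0.30}


def tpp_prognostic_group(left_mmhg, right_mmhg):
    """Classify a woman by the perfusion pressures of both tubes."""
    bands = tuple(sorted((pressure_band(left_mmhg), pressure_band(right_mmhg))))
    return GROUP_BY_BANDS[bands]


group = tpp_prognostic_group(250, 400)
print(group, CONCEPTION_RATE[group])
```

The six band combinations cover every possible pair of readings, so the rule always returns one of the three groups.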
No major complications have been reported with selective salpingography and tubal catheterisation,36–38 and most minor complications associated with selective salpingography and tubal catheterisation can be managed conservatively39(Table 2).
Hysterosalpingo-contrast sonography allows the assessment of uterine cavity outline and tubal patency. This investigation involves the transcervical injection of echogenic medium (air with saline or air-filled albumin microspheres), the course of which is followed in real time by transvaginal ultrasound. There is no anaesthetic requirement and there is no ovarian irradiation.
Substantial intra-observer agreement was found for bilateral and right tubal patency and occlusion (kappa >0.6, Table 1), although for left tubal patency and occlusion, intra-observer agreement was only fair (kappa 0.37).40 The investigators hypothesised that this could have been due to (a) a difference in the prevalence of tubal occlusion between the right and left sides; (b) limited time for the left-side examination, as the investigation began on the right side and then moved to the left; and (c) an effect of the investigator's right-handedness on the performance of the test.40 No information was identified on inter-observer reliability.
Both abnormal and normal hysterosalpingo-contrast sonography results have shown variable accuracies in different studies,41–45 although the accuracy of hysterosalpingo-contrast sonography was found to be comparable to that of hysterosalpingography42,43(Table 1). We identified no data on possible therapeutic effects of hysterosalpingo-contrast sonography. Abdominal and shoulder pain from peritoneal irritation were the most common complications.44,45
Falloposcopy is the visual examination of the fallopian tube lumen with a microendoscope.46 It is performed via a transcervical approach, and allows the assessment of the tubes from the tubal ostium to the fimbriae. At first, the technique involved the insertion of the falloposcope through a flexible cannula into the tube under hysteroscopic visualisation. The procedure was carried out with the help of continuous fluid irrigation through the flexible cannula (coaxial delivery system). More recently, a miniature tubular balloon system that is rolled out along the fallopian tube lumen by the use of hydraulic pressure, which concurrently carries the falloposcope forward (linear eversion system), has been described. This can be used without the aid of a hysteroscope, under sedation or with local anaesthesia.
A scoring system47 of falloposcopic findings based on the location, nature and extent of tubal luminal disease predicts spontaneous pregnancy48,49(Table 2). However, as with salpingoscopy, doubt exists about the clinical significance of many falloposcopic findings.
We identified no data on the inter- and intra-observer reliability or the diagnostic accuracy of falloposcopic testing against a reference standard. There is no randomised trial evidence on the potential therapeutic effects of falloposcopy, although there is evidence that recanalisation can be achieved in over half of proximally occluded tubes,50 which is likely to improve spontaneous conception rates.
Technical failures with the procedure are common51,52: a large prospective study found only 57% of the enrolled women received a complete falloposcopic evaluation.52 The most common complication with falloposcopy is pin-point perforation of the tube,48 and this can generally be managed conservatively. A small comparative study found that the pain levels experienced with falloposcopy were lower compared with hysterosalpingography.53
Fertiloscopy is the combination of transvaginal hydrolaparoscopy dye test, optional salpingoscopy, and finally, hysteroscopy performed under local anaesthesia or sedation in an office setting.54 Procedures such as ovarian drilling, biopsy and limited adhesiolysis can also be performed during fertiloscopy. Using a combined Veres needle-trocar system, abdominal distension is achieved by transvaginal instillation of warm saline. This allows the aquaflotation and inspection of the tubo-ovarian structures in their natural position.
We found no data on the treatment-independent prognostic value of fertiloscopy, its inter- and intra-observer reliability or possible therapeutic effects. A study that evaluated agreement between fertiloscopy and laparoscopy found an almost perfect agreement for most diagnostic items.55 The incidence of bowel injury, in a series of 3667 fertiloscopy procedures, was reported at 0.65%.56 Most of these cases (92%) were successfully managed conservatively.56 Other complications include epiploon hernia and injury to the rectum and the uterus, which may again be managed conservatively.55
In this review, we summarise the test features of the main visual tubal patency tests. The inter- and intra-observer reliability, diagnostic accuracy, treatment-independent prognostic value, therapeutic effects and complications for each test have been provided where the data existed.
Although we took a systematic approach in the searching, selection and synthesis of the literature presented in this article, it should be noted that a comprehensive literature review was not undertaken, resulting in the possibility of non-identification of some relevant studies. However, the studies included in this review are likely to be representative of the available evidence, and therefore provide a sound summary of various aspects of the tubal evaluation tests. Although we took a pragmatic decision not to undertake updating of existing systematic reviews, in the case of the accuracy of hysterosalpingography, as the existing review11 is eight years old, we searched for large and valid diagnostic accuracy studies published since the original review. We did not find any studies that were at variance with the findings of the original review.11
When more than one study reported on a test feature, it was often noted that there was heterogeneity (lack of consistency) in the findings. There may be several reasons for this, including differences in the spectrum of women under investigation, the techniques of performing the tests, the thresholds for defining abnormality and the operator experience. However, two additional issues that could explain the heterogeneity are biases in the study design,57 and the choice of the reference or ‘gold’ standard. Laparoscopy and dye test has generally been used as the gold standard in most accuracy studies, although, as discussed earlier, this test itself is likely to be inaccurate in many instances.
Our review focussed on visual tests of tubal patency. Non-visual tests such as patients' history58 and chlamydial antibody titres59 can also be used as diagnostic tests to inform about the tubal status. Data on the diagnostic accuracy of the patients' history are conflicting, with some suggesting useful information,58,60,61 whereas others suggest that such a history is not useful.62 Large cohort studies are needed to resolve this issue. A systematic review summarising the available evidence until 1996 showed chlamydial antibody titres to have a diagnostic accuracy that was similar to that of hysterosalpingography.59 However, the poor methodological quality of some of the included studies and the probability of publication bias need to be considered while reviewing this evidence.
The findings that tubal perfusion pressures (at selective salpingography),5 and direct visual scoring (at falloposcopy)48 predict fertility prospects suggest that assessment of tubal function may be of importance in addition to assessing patency. Assessment of function assumes importance as there appears to be an effective treatment, tubal catheterisation, which not only reduces the tubal perfusion pressures in those with high pressures, but also results in improved pregnancy rates.36
The findings of our review are consistent with, and support, the recommendations of the NICE guideline on fertility management.20 Hysterosalpingography, being a reliable, reasonably accurate and safe test with the potential of improving pregnancy outcome in its own right, should be the screening test in women with no other suspected comorbidities such as endometriosis and pelvic infection. If such comorbidities are suspected, then laparoscopy and dye test may be a more suitable test, as it would also allow a fuller assessment of the pelvis, and treatment if indicated. In women with proximal tubal blockage, selective salpingography with tubal catheterisation should be offered as this may improve the prospects of pregnancy.
Prevalence of tubal disease varies in different population subgroups. For example, while hysterosalpingography-identified bilateral tubal block is found in 24% of couples with secondary infertility with normal sperm and an ovulatory cycle, the figures for those couples with male factor subfertility, or anovulation, are 5% and 7%, respectively. Therefore, in the latter groups, assessment of fallopian tubes could reasonably be delayed in the diagnostic work-up. If these women do not conceive with simple interventions such as clomiphene citrate, then there will be the opportunity to combine tubal assessment with effective therapeutic procedures such as ovarian drilling63 at the time of laparoscopy and dye test.
Finally, in this review, we have identified several areas where further research may be required, particularly in relation to hysterosalpingo-contrast sonography, falloposcopy and fertiloscopy (Table 1). However, such research should take into account the duration of infertility and differences in the spectrum of patients, and should use an appropriate research design3,4,64 that minimises bias and maximises external validity.