Paula D. James, Division of Hematology, Department of Medicine, Room 2025, Etherington Hall, Queen’s University, Kingston, ON, Canada K7L 3N6. Tel.: +613 533 2946; fax: +613 533 6855. E-mail: firstname.lastname@example.org
Summary. A personal history of excessive mucocutaneous bleeding is a key component in the diagnosis of a number of mild bleeding disorders, including von Willebrand disease (VWD), platelet function disorders (PFDs), and coagulation factor deficiencies. However, the evaluation of hemorrhagic symptoms is a well-recognized challenge for both patients and physicians, because the reporting and interpretation of bleeding symptoms is subjective. As a result, bleeding assessment tools (BATs) have been developed and studied in a variety of clinical settings. This work was pioneered by a group of Italian researchers, and the resultant ‘Vicenza Bleeding Questionnaire’ stands as the original BAT. In this review, we will discuss the modifications of the Vicenza Bleeding Questionnaire that have taken place over the years, as well as the validation studies that have been published. Other BATs that have been developed and published will be reviewed, as will the special situations of assessing pediatric bleeding and menorrhagia. Lastly, the clinical utility of BATs will be discussed, including remaining challenges and future directions for the field.
A personal history of excessive mucocutaneous bleeding is a key component in the diagnosis of a number of mild bleeding disorders, including von Willebrand disease (VWD), platelet function disorders (PFDs), and coagulation factor deficiencies. However, the evaluation of hemorrhagic symptoms is a well-recognized challenge for both patients and physicians, because the reporting and interpretation of bleeding symptoms is subjective. Significant symptoms may be overlooked because they are considered to be normal, and minimal or trivial symptoms may be given undue consideration. The risk of this second issue is highlighted by the high frequency of bleeding symptoms reported by the general population [1,2]. In response to these challenges, a number of attempts have been made to standardize bleeding histories in an effort to: (i) improve diagnostic accuracy and thus avoid unwarranted laboratory testing; (ii) predict the risk of bleeding in an individual patient; (iii) describe symptom severity; and (iv) inform treatment. In this article, we will review the evolution of bleeding assessment tools (BATs), review the published literature focusing on the application of these tools, and discuss the remaining challenges.
Over the years, multiple investigators have made attempts to standardize bleeding histories by identifying questions that best distinguish between affected and unaffected individuals. In 1995, Sramek et al. published their experience with a bleeding questionnaire that was administered to patients known to have a bleeding disorder and to a group of normal controls. The most informative questions in terms of discrimination were about bleeding following traumatic events such as tonsillectomy or dental extraction (but not childbirth) and the presence of a bleeding disorder in a family member. Interestingly, these questions were only discriminatory in a screening setting, and not in a referral setting, perhaps because a referral population is composed of a preselected group of individuals with highly prevalent symptoms. In 2005, the ISTH Scientific and Standardization Committee (SSC) on von Willebrand factor (VWF) established a set of provisional criteria for the diagnosis of VWD type 1, including the threshold that must be met for mucocutaneous bleeding symptoms to be considered significant. Since that time, the field has increasingly focused on quantitative assessments of bleeding, and on the need for standardization.
Building on the ISTH provisional criteria, a group of investigators from Vicenza, Italy, led by Rodeghiero, developed and validated a Vicenza-based BAT for the diagnosis of type 1 VWD in a primarily adult population. Each bleeding symptom is scored from 0 (absence or trivial symptoms) to 3 (symptom requiring medical intervention), and the overall bleeding score is determined by summing the scores for all of the bleeding symptoms. The results of this study showed that having at least three hemorrhagic symptoms or a bleeding score of 3 in males and 5 in females was very specific (98%) for the bleeding history of type 1 VWD, although less sensitive (69%).
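The summation described above can be illustrated with a minimal Python sketch. The symptom names and the example grades are hypothetical, and the code assumes that the reported cut-offs (3 in males, 5 in females) are read as "at least" that value and that a hemorrhagic symptom is counted when its grade is above 0; neither assumption is stated explicitly in the text.

```python
from typing import Dict

MIN_GRADE, MAX_GRADE = 0, 3  # grade range of the original 0 to 3 scoring system


def bleeding_score(grades: Dict[str, int]) -> int:
    """Sum the per-symptom grades into an overall bleeding score."""
    for symptom, grade in grades.items():
        if not MIN_GRADE <= grade <= MAX_GRADE:
            raise ValueError(f"grade for {symptom!r} must be between 0 and 3")
    return sum(grades.values())


def is_positive(grades: Dict[str, int], sex: str) -> bool:
    """Apply the rule 'at least three symptoms, or a score at the sex-specific cut-off'.

    Assumption: the cut-offs are interpreted as score >= 3 (males) or >= 5 (females).
    """
    cutoff = 3 if sex == "male" else 5
    # Assumption: a symptom counts as present when its grade is greater than 0.
    n_symptoms = sum(1 for grade in grades.values() if grade > 0)
    return n_symptoms >= 3 or bleeding_score(grades) >= cutoff


# Hypothetical patient: two symptoms, overall score of 3.
example = {"epistaxis": 2, "cutaneous bleeding": 1, "post-dental extraction bleeding": 0}
print(bleeding_score(example), is_positive(example, "male"), is_positive(example, "female"))
```

In this sketch, the same set of grades can be positive for a male and negative for a female, which reflects the sex-specific cut-offs reported in the study.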
In an attempt to improve the sensitivity of this bleeding score, the scoring system was revised to increase the range of possible grades from −1 (absence of bleeding after significant hemostatic challenge, such as two dental extractions or operations) to 4 (symptoms requiring the most significant medical intervention, such as infusion of clotting factor concentrates or surgery to control bleeding). This −1 to 4 version was used for the European Molecular and Clinical Markers for the Diagnosis and Management of type 1 VWD (MCMDM-1 VWD) Study, and the resultant bleeding score was shown to be strongly inversely correlated with VWF level (P < 0.001, based on three multiple regression models). Additionally, higher bleeding scores were associated with an increasing likelihood of VWD, and scores specifically related to spontaneous mucocutaneous bleeding predicted an increased risk of future bleeding following surgery or dental extraction.
A condensed version of the MCMDM-1 VWD Bleeding Questionnaire was then developed by removing from the full version all of the details that do not directly affect the bleeding score. This version was then prospectively analyzed in three studies: one in the primary care setting and two in referral populations. In the primary care setting, the Condensed MCMDM-1 VWD Bleeding Questionnaire showed a sensitivity of 100%, a specificity of 87%, a positive predictive value (PPV) of 0.20 and a negative predictive value (NPV) of 1 for the diagnosis of VWD. Interobserver reliability was confirmed by two observers who administered the questionnaire an average of 3 months apart (intraclass correlation coefficient = 0.81, P < 0.001). In a study published by Tosetto et al. in 2011, the same condensed bleeding questionnaire was evaluated in a referral population. The data showed that the sensitivity for a mild bleeding disorder varied widely, depending on the reason for referral (25–47%). The specificity ranged from 81% to 98% in the different referral groups, and the PPV was 0.03–0.78. The NPV was again shown to be high (0.82–0.99), meaning that a negative or normal bleeding score can help to exclude a clinically significant inherited bleeding disorder. The Condensed MCMDM-1 VWD Bleeding Questionnaire was also studied in a group of 30 women presenting with menorrhagia, and was able to distinguish those with a bleeding disorder from those without a bleeding disorder (sensitivity 85%, specificity 90%, PPV 0.89, NPV 0.86), and was also able to distinguish disease severity; women with type 3 VWD had the highest bleeding scores.
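The wide range of PPVs reported above, despite broadly similar specificities, is largely a consequence of disease prevalence in the population tested. The following Python sketch applies Bayes' rule to a given sensitivity, specificity and prevalence. The prevalence values in the loop are hypothetical and chosen only for illustration; they are not the prevalences observed in the studies reviewed here.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")

    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    if true_pos + false_pos == 0.0 or true_neg + false_neg == 0.0:
        raise ValueError("predictive value undefined: no positive or no negative results")

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the primary care setting;
# the prevalences are hypothetical.
for prevalence in (0.01, 0.03, 0.10, 0.50):
    ppv, npv = predictive_values(1.00, 0.87, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

With sensitivity and specificity held constant, the PPV in this sketch rises steeply as prevalence increases, whereas the NPV remains high. This is one reason why results from a referral population cannot be transferred directly to a primary care setting.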
As mentioned above, an additional area of interest for research involving bleeding quantification lies in differentiating bleeding severity between different disorders. The data in Table 1 show that, in general, mucocutaneous bleeding symptoms are reported more frequently by patients with type 3 VWD than by patients with type 2 and type 1 VWD, although there is a great deal of overlap. Interesting work has evaluated these subtype differences by comparing bleeding symptoms between type 3 VWD obligate carriers (OCs) and normal controls. Type 3 OCs reported more epistaxis, cutaneous bleeding and postsurgical bleeding than normal controls, further highlighting the heterogeneity of symptoms in VWD.
Table 1. Prevalence (%) of bleeding symptoms in patients with von Willebrand disease (VWD) and in healthy individuals [3,5,35–37]
Column groups: Normals (n = 500, n = 341, n = 215); All types of VWD (n = 264); Type 1 VWD (n = 671, n = 84); Type 2 VWD (n = 497); Type 3 VWD (n = 348, n = 66)
[Table body not available; surviving row labels: Post-dental extraction bleeding; Bleeding from minor wounds]
NA, not available.
In order to consolidate the knowledge obtained from these published studies and the work described below in pediatrics, and to develop a consensus bleeding assessment tool, a Working Party sponsored by the VWF and Perinatal/Pediatric Hemostasis Subcommittees of the ISTH SSC was established in 2008. This group, with input from the Women’s Health Issues in Thrombosis and Haemostasis SSC, published the ISTH BAT in 2010. Studies to validate this new tool are ongoing, including the necessary psychometric evaluations. Criticisms of the previously published BATs center on their scoring of only the worst single bleeding episode; as a result, the frequency of bleeding symptoms is not accounted for, and a plateau effect is seen when the questionnaire is administered to individuals with severe bleeding disorders. The ISTH BAT was specifically designed to extend the utility of the earlier BATs by incorporating information on both symptom frequency and severity. A web-based version of the ISTH BAT is freely available through Rockefeller University, with the objective of encouraging investigators to share data (https://bh.rockefeller.edu/ISTH-BATR/). The evolution of the Vicenza-based BATs can be found in Fig. 1, a review of the primary publications in Table 2, and a comparison of the different scoring systems in Table S1.
Table 2. Primary Vicenza-based and other bleeding assessment tools
Ages (years) studied, mean/median (range)
CHAT, Clinical History Assessment Tool; MCMDM-1 VWD, European Molecular and Clinical Markers for the Diagnosis and Management of type 1 VWD; NA, not available; OC, obligate carrier; PBAC, Pictorial Bleeding Assessment Chart; RU-BHQ, Rockefeller University – Bleeding History Questionnaire; VWD, von Willebrand disease.
In addition to the BATs derived from the Italian group’s work, a number of other tools have been developed and published. A comprehensive ontology-backed system was developed at Rockefeller University (Rockefeller University – Bleeding History Questionnaire [RU-BHQ]) that facilitates the collection and collation of detailed, standardized bleeding histories. This bleeding questionnaire is web-based and freely available. To date, the results of the administration of this questionnaire to 500 normal individuals have been reported, and data collection for individuals with type 1 VWD is ongoing. Disease-specific tools have also been studied, including a questionnaire specific for the Quebec Platelet Disorder.
Studies have shown that 5–10% of women seek medical attention for heavy menstrual periods at some point during their reproductive life, and that up to 15% of those have an underlying bleeding disorder [15–18]. Despite this, the average delay from onset of bleeding symptoms to the diagnosis of a bleeding disorder has been reported to be 16 years. Additionally, as can be seen in Table 1, menorrhagia is the second most commonly reported bleeding symptom overall by patients with VWD, and the symptom most commonly reported by women. Therefore, tools designed specifically for the assessment of patients with menorrhagia are valuable. The Pictorial Bleeding Assessment Chart (PBAC) allows women to track the number of pads or tampons used for a menstrual period, as well as the degree of soiling. On the basis of that information, a score is generated, and PBAC scores of ≥ 100 correlate with menorrhagia, defined as ≥ 80 mL of menstrual blood loss. More recently, a screening tool for bleeding disorders in women with menorrhagia was developed and tested by Philipp et al. [21,22] on a population of women with PBAC scores of ≥ 100 and normal pelvic examination findings. The tool, which consists of 11 questions about bleeding symptoms and family history, has a sensitivity of 89% for a bleeding disorder. This was improved to 93% by adding iron deficiency, and to 95% when the PBAC score cut-off was increased to > 185. An important detail of this study, however, is that, of the 217 women enrolled, 154 had a bleeding disorder (which is much higher than the published prevalence of a bleeding disorder in other studies), raising concern about the widespread applicability of the results. A review of these tools can be found in Table 2.
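To make the PBAC calculation concrete, a minimal Python sketch is shown below. The point values assigned to each degree of soiling, to clots and to flooding are not given in the text above; those used here are the values commonly cited for the original chart and should be treated as an assumption to be verified against the chart actually in use. The diary counts are hypothetical.

```python
from typing import Dict, Optional

# Assumed point values (not stated in this review); verify against the chart in use.
PAD_POINTS = {"light": 1, "moderate": 5, "saturated": 20}
TAMPON_POINTS = {"light": 1, "moderate": 5, "saturated": 10}
CLOT_POINTS = {"small": 1, "large": 5}
FLOODING_POINTS = 5

MENORRHAGIA_CUTOFF = 100  # PBAC score >= 100, as described in the text


def _weighted_sum(counts: Optional[Dict[str, int]], points: Dict[str, int]) -> int:
    """Multiply each count by its point value, rejecting unknown categories."""
    total = 0
    for category, count in (counts or {}).items():
        if category not in points:
            raise ValueError(f"unknown category: {category!r}")
        if count < 0:
            raise ValueError("counts must be non-negative")
        total += points[category] * count
    return total


def pbac_score(pads: Optional[Dict[str, int]] = None,
               tampons: Optional[Dict[str, int]] = None,
               clots: Optional[Dict[str, int]] = None,
               flooding_episodes: int = 0) -> int:
    """Compute a PBAC score for one menstrual period from diary counts."""
    if flooding_episodes < 0:
        raise ValueError("flooding_episodes must be non-negative")
    return (_weighted_sum(pads, PAD_POINTS)
            + _weighted_sum(tampons, TAMPON_POINTS)
            + _weighted_sum(clots, CLOT_POINTS)
            + FLOODING_POINTS * flooding_episodes)


# Hypothetical diary for a single period.
score = pbac_score(
    pads={"light": 4, "moderate": 6, "saturated": 2},
    tampons={"moderate": 3, "saturated": 1},
    clots={"small": 2},
    flooding_episodes=1,
)
print(score, score >= MENORRHAGIA_CUTOFF)
```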
Assessing bleeding symptoms in children presents unique challenges. The issue of overlap of symptoms between normal individuals and those affected with mild bleeding disorders also exists in children, particularly for bruising and epistaxis. An additional consideration is that bleeding symptoms manifest in distinctly different ways in children and adults. Some of the classic bleeding symptoms in adults (i.e. menorrhagia and postsurgical bleeding) are clearly not prevalent in the pediatric population. A child with a bleeding disorder may not have had surgery, or (in the case of girls) reached the age of menarche; however, the child may still have symptoms that cause difficulty and merit treatment. For example, umbilical stump bleeding or bleeding at the time of circumcision may be important early markers of a bleeding disorder, but may be overlooked and not investigated. In order to address these issues, tools have been developed that are specific to pediatrics.
An epistaxis scoring system was published in 1988 by Katsanis et al. This scoring system results in a child with recurrent nosebleeds being classified as either ‘mild’ or ‘severe’, on the basis of characteristics such as frequency and duration of epistaxis. Children classified as ‘severe’ were more likely to have a family history suggestive of a bleeding diathesis, to be anemic and iron-deficient, to have undergone nasal cauterization, and to have had laboratory coagulation abnormalities identified. In 2000, the hemostasis research group from the Hospital for Sick Children in Toronto, Ontario published their ‘in-house’ pediatric BAT, and they followed this up in 2004 with a second publication confirming the reliability and reproducibility of this questionnaire. After administration of this bleeding questionnaire, children are classified as either ‘bleeders’ or ‘non-bleeders’, depending on whether or not any one of a number of mucocutaneous bleeding symptoms meets the criteria to be considered significant (e.g. recurrent nosebleeds requiring medical treatment or leading to anemia). This questionnaire was compared with the ISTH provisional consensus criteria for significant mucocutaneous bleeding in a group of children with VWD, and was found to be less stringent and therefore perhaps more useful in a pediatric setting.
As a result of the endorsement of the Vicenza-based questionnaires by the ISTH, and with the goal of standardization across a range of ages, Bowman et al. created the Pediatric Bleeding Questionnaire (PBQ) by adding pediatric-specific bleeding symptoms to the MCMDM-1 VWD Bleeding Questionnaire, maintained the same scoring system, and tested it in a variety of settings. Their work showed that the PBQ had a sensitivity of 83% and a specificity of 79% for VWD. Additionally, the PPV was low, at 0.14, but the NPV was very high, at 0.99, making this an effective tool with which to decide which children do not require blood tests. The receiver operating characteristic curve was very good, with an area under the curve of 0.88 (P = 0.002), showing that the PBQ can accurately distinguish between affected and unaffected children. A review of the main pediatric bleeding questionnaires can be found in Table 3. Subsequently, the PBQ was also tested in children previously known to have an inherited bleeding disorder, and was able to: (i) distinguish disease severity in children with different subtypes of VWD (P < 0.0001); and (ii) highlight age-related increases in bleeding scores in VWD as bleeding challenges are encountered with increasing age. The PBQ was also used to identify the pattern of bleeding symptoms in children with PFDs.
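The area under the receiver operating characteristic curve quoted above can be read as the probability that a randomly chosen affected child has a higher bleeding score than a randomly chosen unaffected child, with ties counted as one half. The Python sketch below computes this quantity directly from two lists of scores. The scores are hypothetical and are not taken from the PBQ study; this is an illustration of the concept, not of the analysis performed by the original authors.

```python
from typing import Sequence


def auc(affected: Sequence[float], unaffected: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Each (affected, unaffected) pair contributes 1 if the affected score is
    higher, 0.5 if the scores are tied, and 0 otherwise.
    """
    if len(affected) == 0 or len(unaffected) == 0:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for a in affected:
        for u in unaffected:
            if a > u:
                wins += 1.0
            elif a == u:
                wins += 0.5
    return wins / (len(affected) * len(unaffected))


# Hypothetical bleeding scores for illustration only.
affected_scores = [3, 5, 4]
unaffected_scores = [0, 1, 3]
print(round(auc(affected_scores, unaffected_scores), 2))
```

An area of 0.5 corresponds to a score that does not discriminate at all, and an area of 1.0 to complete separation of the two groups.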
Three independent studies have evaluated the diagnostic utility of the PBQ since its original publication; two confirmed its efficacy [29,30], whereas the third did not, although the methods of analysis differed among the studies.
Evaluating clinical utility
When evaluating the clinical utility of the various BATs, it is critically important to keep in mind the specific objective and setting of use. In general, the tools reviewed in this article have been directed towards two main clinical objectives: (i) to act as a screening tool in both the primary and tertiary care settings for individuals being investigated for the first time for an inherited bleeding disorder; and (ii) to act as a standardized way of describing disease characteristics and of assessing disease severity.
With regard to using BATs as a screening tool for bleeding disorders, it is important to recognize how specific study populations can affect the results, particularly for sensitivity. This is particularly important when bleeding symptoms from individuals known to have a bleeding disorder are recorded after diagnosis, when prophylactic treatments might have been given. Each of the primary Vicenza-based publications dealt with this potential source of bias in different ways; in the original Rodeghiero 2005 publication, OCs of type 1 VWD (rather than index cases) were studied, eliminating the possibility of inflating the sensitivity by studying known bleeders. In the 2006 Tosetto publication (full MCMDM-1 VWD), only bleeding symptoms present before the diagnosis of type 1 VWD, or symptoms from individuals who did not receive hemostatic prophylaxis, were used to compute the bleeding score. In both the Bowman 2008 (Condensed MCMDM-1 VWD) and Bowman 2009 (PBQ) studies, individuals presenting for the first time for investigation of VWD were included, and laboratory levels of VWF and factor VIII were used as the diagnostic gold standard [7,26]. Additionally, specificity can be affected by the definition of controls; the 2005 Rodeghiero study used age-matched and gender-matched controls who were in good health and had never been referred for evaluation of hemorrhagic symptoms. Normal laboratory testing was not required. Controls in all three of the other primary Vicenza-based studies were healthy individuals who had never sought medical attention for bleeding symptoms and who had normal VWF levels [6,7,26].
Undoubtedly, the main focus of the Vicenza-based BATs presented in this review has been VWD, but there are a few notable exceptions. The Condensed MCMDM-1 VWD Bleeding Questionnaire has been studied prospectively as a screening tool for PFDs, and the sensitivity, specificity, PPV and NPV are 86%, 65%, 0.50 and 0.92, respectively. The 2011 Tosetto study (which used the Condensed MCMDM-1 VWD Bleeding Questionnaire) evaluated patients newly referred for all hemorrhagic disorders, and presented data on VWD, PFDs, and FXI deficiency, as well as senile purpura and Rendu–Weber–Osler disease. The analysis of diagnostic utility was reviewed previously in the section on Vicenza-based tools. This study concluded that BATs in conjunction with the activated partial thromboplastin time improve the evaluation of patients suspected of having a mild bleeding disorder, even in a low-prevalence setting. Finally, as mentioned, the PBQ has been studied in 23 children with PFDs; this was purely a descriptive study, and analysis of diagnostic utility was not performed.
The original Vicenza bleeding questionnaire was designed to be used before diagnosis; however, as mentioned, a number of studies have been performed evaluating the performance of these tools as a standardized way of describing disease severity. Of critical importance for this indication is the impact of the diagnosis of a bleeding disorder on the natural history of the disease. Following diagnosis, patients are typically given hemostatic prophylaxis prior to invasive procedures or surgical operations, and investigators need to take care not to include these treatments in the calculated bleeding score. Failure to do so will result in false elevations of the overall bleeding score.
There are differences in the published studies reviewed here because of heterogeneity of patient populations and methods of analysis, but, in general, our ability to predict who is not going to bleed is far superior to our ability to predict who is going to bleed. In some settings, this may be useful; however, in others it challenges us to continue to work to optimize our tools. It is plausible that the expectation that one bleeding assessment tool can serve both clinical objectives well in a variety of clinical settings is far too ambitious. Additionally, many of the existing tools are too long to be of value in a busy clinical practice, and additional study is also required to identify the most discriminatory questions from the perspective of screening, and the most useful questions in terms of assessing disease severity.
Ongoing challenges and future directions
Despite the well-recognized ideal of standardization of BATs, we have reviewed at least 10 different versions in this report, most with multiple independent publications. Additionally, the best scoring system, even among the Vicenza-based BATs, remains a subject of debate. To date, two publications have addressed this issue, comparing the 0 to +3 with the −1 to +4 scoring system. Neither publication showed clear superiority of one over the other, although, for use as a screening tool, particularly in the pediatric population, eliminating the −1 scores was advantageous [33,34]. Further study is necessary to definitively resolve the debate.
An additional issue, particularly for children who have not experienced hemostatic challenges, is the long-term clinical behavior of patients assessed with BATs. The tools are useful for predicting the diagnosis of VWD, but studies evaluating whether or not the tools can directly predict future bleeding episodes are lacking. This may be less of a concern for bleeding scores in the adult population, where the clinical behavior of accumulated exposures to hemostatic challenges is captured.
Our ability to address critical clinical questions, such as how to optimize treatment on the basis of the risk of bleeding for various situations, is dependent on studies with significant sample sizes. One potential approach to this challenge is to create a system that would allow the merging of existing datasets, rather than setting out to undertake additional prospective studies. Such an approach is currently underway, utilizing the bioinformatics capabilities at Rockefeller University. Through international collaboration, it is possible that our collective legacy data could help to direct our future treatment protocols. Ultimately, the goal of this field is to improve care for individuals with inherited bleeding disorders. We envision a web-based system, accessible by interested researchers and clinicians, that presents the best questions based on extensive study in the most efficient manner, no matter what the clinical setting or patient presenting complaint.
Disclosure of Conflict of Interests
The authors state that they have no conflict of interest.