Fresh blood for everyone? Balancing availability and quality of stored RBCs


Walter Dzik, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA.
Tel.: (+1) 617 726 3715; fax: (+1) 617 726 6832


Summary  Effective, prolonged ex vivo storage of red blood cells is an essential requirement for inventory management of each nation’s blood supply. Current blood storage techniques are the product of a century of research. Blood undergoes metabolic and structural deterioration during prolonged ex vivo storage. Current regulatory requirements for the limit of blood storage do not consider oxygen delivery or clinical outcomes. Recently, concern that stored blood may produce adverse outcomes in recipients has sparked renewed interest in studies that can evaluate the clinical impact of fresh vs. stored blood. In 2008, a retrospective study of outcomes among transfused patients undergoing cardiac surgery suggested that blood stored for more than 14 days prior to transfusion led to higher perioperative mortality compared with blood stored for less than 14 days. Here, we critically examine the details of the above-mentioned study. Numerous substantial flaws in data analysis and presentation may have led to an erroneous conclusion about the effect of blood storage age on perioperative mortality. Given the fundamental importance of a safe and adequate blood supply to national healthcare, the question of the proper storage age for blood should be studied using a prospective study design.

The preservation of human red blood cells (RBCs) outside the body is a rather remarkable phenomenon when you think about it. Although suspended animation of humans is the stuff of science fiction cinema, this futuristic world is an everyday occurrence in Transfusion Medicine. Modern blood preservation was established through pioneering work in several nations. In 1916, Rous and Turner demonstrated that RBCs mixed with citrate and dextrose could be stored in a refrigerated state for several days and successfully retransfused to rabbits (Klein & Anstee, 2005). The technique was applied in 1918 to Allied troops in France by Robertson. Huge demands for stored blood during the Spanish Civil War and World War II led to the development of acid–citrate–dextrose solution by Loutit and others. In the 1960s, Simon pioneered the use of adenine additives. Further modifications were made following advances by Greenwalt, Hogman, Valeri, Heaton, Moroff and many others who applied increasingly sophisticated understanding of RBC metabolism to the science of blood preservation (Klein & Anstee, 2005).

Nevertheless, RBCs suffer under the strain of ex vivo storage and undergo biochemical, structural, enzymatic, morphological and functional deterioration. Chief among these are reduced post-transfusion survival; loss of 2,3 diphosphoglycerate (DPG); depletion of ATP; reduction in Na+–K+ gradients; increase in osmotic fragility and membrane changes including microvesiculation and haemolysis (Hess, 2006). Although many of these changes are reversible following transfusion, it is very reasonable to predict that there is a limit to RBC storage beyond which transfusion is unfavourable. Identifying that limit is an important scientific priority and is central to all strategies that maintain a national blood inventory. Without an adequate inventory of blood, there would be no modern surgery, chemotherapy or intensive care medicine. A safe and adequate blood supply is a national medical treasure frequently taken for granted.

In parallel with the development of the science of blood preservation came regulatory guidance for the acceptable duration of blood storage. The hallmark measure has been RBC survival of an aliquot of cells obtained at the maximal duration of preservation. After an aliquot of stored cells is radiolabelled and injected into an autologous donor–recipient, the percentage of injected cells circulating 24 h after transfusion is measured. One expects to observe at least 75% recovery of the stored cells after 24 h. This storage requirement has the advantage of demonstrating in vivo survival rather than using an in vitro surrogate measure. It is clearly sensitive to storage-related deterioration of cells, as demonstrated by the decline in survival following gamma irradiation and storage of blood. However, the test has numerous shortcomings. These include the failure to require data on the duration of cell survival following transfusion; dependence on only a small aliquot of injected blood rather than a therapeutic dose; acceptance of up to 25% in vivo destruction of the injected product and most importantly, a standard based only on the ‘presence’ of the cells and not on their functional capability. Proving that stored blood will be present in the circulation after transfusion is good, but it would be far better to demonstrate that the cells were delivering oxygen to tissues. The increasing availability of new techniques to assess tissue oxygenation holds great promise as a method that will permit assessment of the functional properties of stored blood after transfusion. Once these methods are refined and applied, we should look forward to an improved biological qualification of the proper storage limit.

Concern that stored blood may meet current regulatory guidelines but may not perform normally in the microcirculation or, worse yet, might even be deleterious to tissue oxygenation is not new. Recently, this concern has produced a resurgence of interest in ‘fresh’ blood transfusion. The psychological appeal of this notion is enormous and should not be underestimated by anyone evaluating the published literature in this field. Many clinicians simply want to prove that fresh blood is better because they feel it must be so – perhaps because the ability to store human tissue for a prolonged period ex vivo is, in fact, such an astonishing achievement. The appeal of fresh blood has spawned a series of retrospective data reviews analysing patient outcomes in relation to the duration of blood storage (Purdy et al., 1997; Zallen et al., 1999; Vamvakas & Carven, 2000; Mynster & Nielsen, 2001; Offner et al., 2002; Leal-Noval et al., 2003; Basran et al., 2006; van de Watering et al., 2006). This year, the potential adverse consequences of transfusing stored blood received even more attention following publication in the New England Journal of Medicine of a paper entitled ‘Duration of red-cell storage and complications after cardiac surgery’ (Koch et al., 2008). Let us take a closer look at this work.

The authors report a retrospective study analysing data collected on patients who underwent cardiac surgery at the Cleveland Clinic in the United States. The data spanned the years from 1998 to 2006. This detail alone bears noting because in recent years, cardiac surgery has been applied to increasingly high-risk patients. Thus, any comparisons made between patients should consider the year of surgery. The authors retrieved data on 2872 patients who were transfused with 8802 units of RBCs, all of which were <14 days of storage age (fresher blood). The comparison group consisted of 3130 patients transfused with 10 782 RBCs, all of which were stored >14 days (older blood). The survival of the two groups of patients was compared using Kaplan–Meier survival analysis. The authors concluded that among patients undergoing cardiac surgery, transfusion of RBCs stored for >14 days was associated with more complications and with reduced short-term and long-term survival. Was their conclusion justified?

At first glance, some complications of surgery, listed in Table 1, appear more commonly among those who received older storage-age blood. There is no doubt that in-hospital mortality, need for prolonged ventilation support, development of renal failure and multi-organ dysfunction were statistically more frequent among the patients who received blood stored for >14 days. However, the magnitude of the differences was actually rather small for a study of this size. More importantly from a statistical standpoint, the paper reports 20 comparisons of complications with no correction made for multiple comparisons. (It is possible, of course, that even more than 20 comparisons were actually made but that only 20 were selected for presentation in the manuscript.) Whenever multiple comparisons are made, one must consider the statistical fact that excessive comparisons may lead to a statistical difference by chance alone. Several methods exist to ‘correct’ or compensate for multiple comparisons. One of the simplest, the Bonferroni correction, is to divide the value used for significance (the ‘alpha’, generally 0·05) by the number of comparisons and to require that any difference meet that new alpha standard. In this case, 0·05 divided by 20 would give P = 0·0025. This would be a conservative estimate if other comparisons, not presented in the manuscript, were actually made. Using this revised P value, most of the previously significant differences disappear, including in-hospital mortality and multi-organ failure. The difference in the proportion of individuals with renal insufficiency becomes of marginal statistical significance. Only the difference in requirement for prolonged ventilatory support maintains statistical significance.
Despite these statistical shortcomings, the general sense is that those who received older storage-age blood suffered more medical complications compared with those receiving younger storage-age blood, which leads us to ask whether these two groups were suitable for comparison in the first place.

Table 1.  Synopsis of complications found to be significantly different between those receiving younger storage-age blood and those receiving older storage-age blood in the study by Koch et al.*,†

Complication | Storage age <14 days, n (%) | Storage age >14 days, n (%) | P value
In-hospital death | 49 (1·7) | 88 (2·8) | 0·004
Ventilation >72 h | 160 (5·6) | 304 (9·7) | <0·001
Respiratory insufficiency | 177 (6·2) | 278 (8·9) | <0·001
Renal failure | 45 (1·6) | 84 (2·7) | 0·003
Septicaemia | 80 (2·8) | 125 (4·0) | 0·01
Multi-organ failure | 7 (0·2) | 23 (0·7) | 0·007

*The original table presented 20 total comparisons; the other 14 were not found to be significantly different. Adapted from Koch et al. (2008).
†Comparisons are unadjusted for differences between the two groups. The χ2 test or Fisher’s exact test was used for comparisons.
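The effect of the multiple-comparisons correction described above can be checked directly. The short sketch below (illustrative only, using the P values as reported in Table 1 and treating the reported ‘<0·001’ conservatively as 0·001) applies the Bonferroni-adjusted threshold of 0·05/20 to the six nominally significant outcomes.

```python
# Sketch: applying a Bonferroni correction to the six nominally
# significant comparisons in Table 1 (20 comparisons in total).
# P values are as reported by Koch et al. (2008); "<0.001" is
# conservatively coded as 0.001.

ALPHA = 0.05
N_COMPARISONS = 20
bonferroni_alpha = ALPHA / N_COMPARISONS  # 0.0025

reported_p = {
    "In-hospital death": 0.004,
    "Ventilation >72 h": 0.001,
    "Respiratory insufficiency": 0.001,
    "Renal failure": 0.003,
    "Septicaemia": 0.01,
    "Multi-organ failure": 0.007,
}

for outcome, p in reported_p.items():
    verdict = "survives" if p < bonferroni_alpha else "fails"
    print(f"{outcome}: P = {p} -> {verdict} correction at alpha = {bonferroni_alpha}")
```

Only the two outcomes reported as ‘<0·001’ clear the corrected threshold; in-hospital death, renal failure, septicaemia and multi-organ failure do not.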

Prospective, randomized trials are so much more difficult to perform than retrospective analysis of data already present in computer databases. For one thing, prospective randomized trials require considerable prior planning and regulatory approval and require informed consent of each participant. The enormous effort invested in prospective randomized trials is warranted by their one exquisite and unique attribute, namely that the outcomes of the groups are truly suited for comparison. Randomization balances the many unknowable effects that influence biological outcomes. When one randomizes between treatment ‘A’ and ‘B’, all other factors affecting outcome should, by chance, be balanced between the two groups. Randomization is the best protection against the influence of these other factors that confound comparisons of two groups. The study by Koch et al. was not a randomized trial, and so our attention turns to whether the two comparison groups were equal for clinical factors other than the storage age of blood. This information is summarized in Table 2.

Table 2.  Characteristics of patients in the two comparison groups in the report of Koch et al.*

Characteristic | Storage age <14 days (n = 2872), n/n (%) | Storage age >14 days (n = 3130), n/n (%) | P value
Blood group of patient |  |  | <0·001
 Group A | 992/2860 (34·7) | 1542/3120 (49·4) |
 Group B | 303/2860 (10·6) | 449/3120 (14·4) |
 Group O | 1456/2860 (50·9) | 949/3120 (30·4) |
 Group AB | 109/2860 (3·8) | 180/3120 (5·8) |
Leucoreduction used |  |  | <0·001
 Yes | 1037 (36·1) | 1723 (55) |
 No | 1724 (60) | 1050 (33·5) |
 Mixed | 111 (3·9) | 357 (11·4) |
Abnormal left ventricular function | 1662 (57·9) | 1975 (63·1) | <0·001
Mitral regurgitation | 1842 (64·1) | 2105 (67·3) | 0·01
Peripheral vascular disease | 1563 (54·4) | 830 (58·5) | 0·002

*The table lists those characteristics found to be statistically different between the two comparison groups at baseline. P values determined by the χ2 test. Adapted from Koch et al. (2008).
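The reported P < 0·001 for the ABO imbalance is easy to verify from the counts in Table 2. The sketch below (standard library only) computes the χ2 statistic of independence for the 2 × 4 ABO table and compares it against the critical value for 3 degrees of freedom at alpha = 0·001 (approximately 16·27); the statistic comes out far above it.

```python
# Sketch: chi-squared test of independence on the ABO counts in
# Table 2 (Koch et al., 2008), using only the standard library.

# Observed counts: rows = storage <14 days, >14 days; cols = A, B, O, AB
observed = [
    [992, 303, 1456, 109],   # <14 days (2860 patients typed)
    [1542, 449, 949, 180],   # >14 days (3120 patients typed)
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

CRITICAL_DF3_P001 = 16.27  # chi-squared critical value, df = 3, alpha = 0.001
print(f"chi2 = {chi2:.1f}; statistic exceeds the P = 0.001 critical value: "
      f"{chi2 > CRITICAL_DF3_P001}")
```

A statistic an order of magnitude above the critical value underscores just how severe the ABO imbalance between the two ‘comparison groups’ actually was.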

As seen in Table 2, the two groups were quite different in the distribution of ABO groups among the patients. The younger storage-age group was populated with 53% group O patients. In contrast, the older storage-age group had only 31% group O patients, P < 0·001. The ABO distribution is likely to be important because von Willebrand factor is known to be lowest among group O individuals. Coronary syndromes (a likely driver of survival in this population of patients) have been shown to be more extensive in group A individuals compared with group O individuals (Ketch et al., 2008). As expected, the strong ABO imbalance in the patients was reflected in a matching ABO imbalance of the RBC units transfused, with a much higher proportion of group O RBCs being administered to the younger storage-age group.

Indeed, issues of ABO inventory management are likely to actually underlie the fundamental allocation of the two comparison groups. The fact that group O units tend to have a younger average storage age at the time of issue will come as no surprise to any Transfusion Medicine audience. Thus, at the time of retrospective collection of data, it is not surprising that patients who exclusively received RBCs stored <14 days were disproportionately weighted with group O recipients. Although blood storage age appears prominently in the title of the manuscript, the actual comparison groups are strongly driven by ABO and it would have been far better had the authors stratified their comparisons according to recipient ABO group.

Putting aside the issue of ABO, Table 2 also demonstrates that the two ‘comparison groups’ were, in fact, substantially different for several other factors that would be expected to influence the outcome being compared. For example, the group receiving older storage-age blood had a statistically higher proportion of individuals with poor left ventricular function, with peripheral vascular disease and with mitral regurgitation. In addition, a much higher percentage of patients receiving older blood were transfused with leucoreduced blood, suggesting that the cohort of patients receiving older storage-age blood underwent surgery more recently and thus may have represented higher risk patients. When one sees striking differences between the baseline characteristics of the comparison groups for factors that were assessed, one can expect that other factors (not assessed) were likely to be different as well. Thus, the final comparisons of this study bear little or no resemblance to those that would have been obtained had a randomized study been performed.

Sick patients receive more transfusions. Thus, massive transfusion is always correlated with mortality. Any study that focuses on death or severe clinical complications among transfused patients must be sure that the proportion of massively transfused subjects is similar in the two groups being compared. Indeed, in an earlier report, these same authors demonstrated that fatal outcomes in cardiac surgery patients were concentrated among those receiving ≥9 units of RBCs (Koch et al., 2006). Thus, in the current paper, it would have been helpful had the authors demonstrated that the proportion of heavily transfused patients was equal in their two ‘comparison’ groups. Unfortunately, they do not present these data. Rather, they report that the overall distribution of RBC usage was similar in the two groups, but this information is statistically dominated by the many patients who received only 1–3 units of RBCs. In their report, the graph displaying the distribution of blood usage shows that a greater proportion of patients in the older storage-age group received ≥9 units of RBCs. Thus, it is possible that the ‘comparison groups’ were not balanced for the proportion of very sick individuals receiving a large ‘dose’ of blood.

Finally, we turn to the main outcome measure, the Kaplan–Meier graph of survival. This graph, illustrated in colour in the journal, shows that the group receiving older storage-age blood has a poorer survival than the group receiving younger storage-age blood. The key to this graph is in the legend beneath the figure. There it states in rather fine print: ‘In this un-adjusted comparison, the percentage of patients receiving …’. This is a critical detail of the manuscript. What does ‘unadjusted comparison’ mean? It means that the Kaplan–Meier graph shown to the readers of the journal displays survival outcomes for the two groups without ‘correcting for’ or ‘adjusting for’ the baseline differences between the two ‘comparison groups’. Thus, the effects on survival of worse left ventricular function, more peripheral and valvular heart disease and higher proportions of group A individuals are each included in the shape of the survival curve. Given this lack of adjustment, the fact that the two lines are labelled simply ‘newer blood’ and ‘older blood’ is misleading. Indeed, the line with reduced survival in the Kaplan–Meier graph would have been more accurately labelled: ‘those who received older blood and had worse left ventricular function and more peripheral vascular disease and more valvular heart disease and were more weighted with group A’s and may have required more transfusion support’. Thus, the principal graphic of this paper represents a comparison between two groups that really may not be suitable for comparison and may represent an ‘apples vs. oranges’ comparison. To be fair to the authors, they did perform a logistic regression analysis designed to adjust for confounding baseline variables. We can only presume that this failed to show a difference in survival between the two groups following adjustment for the baseline differences, otherwise adjusted survival curves would have been presented.
Instead, their adjusted analysis focused on a ‘composite outcome’ contrived by the authors to include any 1 of 20 adverse events. The results showed that this composite outcome was more frequent among those receiving older storage-age blood after adjusting for several baseline variables. Although the adjusted analysis on the composite outcome gives more credence to the conclusion that older storage-age blood is deleterious, it is unfortunate that the graphic display shows unadjusted data, that results were not stratified by ABO group or left ventricular function, that date of surgery was not considered and that a host of unmeasured baseline differences between the two groups could not be accounted for in their adjustments.

During the bloodshed of 20th century wars, when RBC preservation technology was developed, the world was in desperate need of any method that might extend the storage age of blood. But in the current century, when patients and practitioners in wealthy nations expect a safe and plentiful blood supply, scrutiny of former assumptions underlying the acceptable period for blood storage is an understandable and healthy enterprise. Indeed, those who are stewards of the blood supply have an obligation to demonstrate that the RBCs provided do indeed benefit the recipient and do not cause harm. However, medical researchers, physicians and the public should not take the nation’s blood inventory for granted. Before any changes are made that threaten the availability of supply, clinicians and researchers have an obligation to all patients to investigate blood storage questions in a rigorous, scientific manner without serious methodological flaws and without preconceived bias. The work of Koch et al. does not come close to measuring up to that obligation.

The truth will be found, of course, through the conduct of a prospective randomized trial that assigns recipients randomly to receive either younger storage-age or older storage-age blood. The best design for this trial would not compare patients on either side of a single date of storage (such as day 14) as Koch did. Furthermore, comparing fresh blood with mid-storage aged (‘standard’) blood is also a flawed design because a negative study outcome (no difference) would not answer the question of whether prolonged storage is disadvantageous; and an outcome favouring only fresh blood would promote an unrealistically short shelf-life that would threaten availability of the national blood supply. Rather, a non-inferiority trial design could compare patients receiving the two extremes of blood storage – e.g. comparing outcomes among those randomly assigned to receive exclusively blood stored 10 days or less with those assigned to receive exclusively blood stored 30 days or more. Comparing groups who are given the two extremes of blood storage is methodologically the best choice to answer the question regarding storage age and is ethically appropriate, given the current equipoise regarding potential risks of younger storage-age blood (graft vs. host disease) and older storage-age blood (impaired oxygen delivery). If an adequately powered study based on this simple design failed to show any clinically important difference between the two groups, then the storage age question should be put to rest. If the group assigned to receive older storage-age blood (or younger!) fared worse, then the allowable storage age for blood used for all patients could be adjusted accordingly.
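To give a sense of the scale such a non-inferiority trial would require, the sketch below applies the standard normal-approximation sample-size formula for two proportions. All inputs are illustrative assumptions, not figures from the text: a baseline mortality of 2%, a non-inferiority margin of 1 percentage point, one-sided alpha of 0·025 and 90% power.

```python
# Sketch: rough per-arm sample size for a non-inferiority comparison of
# two extremes of blood storage age. Event rate, margin, alpha and power
# are illustrative assumptions, not figures from the text.
from math import ceil
from statistics import NormalDist

def noninferiority_n(p_event: float, margin: float,
                     alpha: float = 0.025, power: float = 0.90) -> int:
    """Per-arm n, assuming both arms share the true event rate p_event."""
    z_a = NormalDist().inv_cdf(1 - alpha)  # one-sided alpha
    z_b = NormalDist().inv_cdf(power)
    n = (z_a + z_b) ** 2 * 2 * p_event * (1 - p_event) / margin ** 2
    return ceil(n)

n_per_arm = noninferiority_n(p_event=0.02, margin=0.01)
print(f"Roughly {n_per_arm} transfused patients per arm under these assumptions")
```

Under these assumed inputs the requirement runs to several thousand patients per arm, which is why ‘adequately powered’ is the operative phrase: a trial restricted to the two extremes of storage age maximizes the contrast being tested for a given sample size.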

In the meantime, there are other ways to examine this issue. Perhaps investigators should re-examine data sets from previously conducted randomized transfusion trials. For example, prospective randomized trials of leucoreduced vs. non-leucoreduced blood will at least compare outcomes among transfusion recipients with balanced clinical comorbidities. By extracting subsets of patients from these studies and grouping them according to those who received exclusively younger storage-age vs. exclusively older storage-age blood, some insight may be obtained into the effect of storage age among recipients who are more suitable for direct comparison. In addition, data sets can be examined that group patients into ordinal data groups (youngest storage-age, middle storage-age and oldest storage-age) to examine whether any observed outcomes are progressively affected by increasing duration of storage. This approach, which treats storage age more as a continuous variable, would be better than the approach used by Koch, which treated storage age as a two-sided dichotomous variable. In addition to the reanalysis of existing data sets, it may prove useful to model the impact of different expiration dates on the availability of supply. Finally, we may wish to give more attention to the outcomes among patients who undergo RBC exchange transfusion. In my hospital programme, we use ‘standard age’ blood for RBC exchanges given to patients with severe sickle cell syndromes. Some of these individuals are outpatients receiving exchange transfusions following prior central nervous system events. It is striking to consider that these patients undergo massive transfusion (>10 units) over a period of approximately 100 min, using blood unselected for storage age, and certainly suffer no obvious adverse events. This observation alone underscores the weakness of studies that suggest causality (rather than correlation) between transfusion and acute mortality.
With the informed consent of RBC exchange patients, one might alternately transfuse either all younger storage-age or all older storage-age blood during a series of RBC exchanges. This may prove to be a valuable model because each patient could then serve as his or her own control.

How should we respond to requests for ‘fresh’ blood for cardiac patients? In light of the many flaws in the Koch paper, I would strongly urge hospitals and clinical groups not to make a policy favouring fresh blood for cardiac surgery patients at this time. This is based on two concerns: first, when critically examined, the evidence from the Koch paper that fresher blood is actually better is weak, and it is entirely possible that fresh blood is worse for cardiac surgery patients. In the 1980s, when Japan emphasized the use of fresh blood for cardiac surgery, the nation witnessed an explosive increase in the frequency of transfusion-associated graft vs. host disease (Takahashi et al., 1994). This event highlighted the fact that allogeneic blood is still a form of organ transplantation and that storage age affects more than just RBC function. Preventing this complication by universal gamma irradiation would impart an even greater storage lesion to RBCs. Second, if indeed there is an advantage to younger storage-age blood, then there is no reason to believe that such an advantage is restricted to cardiac surgery patients. In my view, it is ethically incorrect to assign a large proportion of the blood supply to one group to confer a presumed advantage when this implies that, by default, other patients will receive blood presumed to be disadvantaged. So, if we are to shorten the shelf-life of RBCs, we will need to do this for all transfusion recipients unless data prove that the advantage only applies to a particular group of recipients. Given the fragile balance of blood supply and demand on which modern medicine depends, the decision to reduce the period of allowable blood storage is an important one that deserves data generated from carefully designed prospective randomized clinical trials. Let us do the studies and find out the truth.