The Pearson correlation coefficient (PCC) and Manders' overlap coefficient (MOC) are used to quantify the degree of colocalization between fluorophores. The MOC was introduced to overcome perceived problems with the PCC. The two coefficients are mathematically similar, differing in the use of either the absolute intensities (MOC) or of the deviation from the mean (PCC). A range of correlated datasets, which extend to the limits of the PCC, only evoked a limited response from the MOC. The PCC is unaffected by changes to the offset while the MOC increases when the offset is positive. Both coefficients are independent of gain. The MOC is a confusing hybrid measurement that combines correlation with a heavily weighted form of co-occurrence, favors high intensity combinations, downplays combinations in which either or both intensities are low, and ignores blank pixels. The PCC only measures correlation. A surprising finding was that the addition of a second uncorrelated population can substantially increase the measured correlation, demonstrating the importance of excluding background pixels. Overall, since the MOC is unresponsive to substantial changes in the data and is hard to interpret, it is neither an alternative to nor a useful substitute for the PCC. The MOC is not suitable for making measurements of colocalization either by correlation or co-occurrence. © 2010 International Society for Advancement of Cytometry
The quantification of colocalization between two fluorescence channels broadly divides into two categories: (1) methods that simply consider the presence of both fluorophores in individual pixels, which we call co-occurrence, and (2) those that examine the relationship between the intensities, which we call correlation. The two categories are different: full co-occurrence is compatible with zero correlation, while a high correlation can be found among the co-occurring pixels even when co-occurrence is rare.
The co-occurrence of fluorophores may simply reflect physicochemical similarities between two fluorescent molecules or antigens: hydrophobic molecules will partition into membranes, hydrophilic molecules to the cytoplasm while amphiphilic molecules are mostly found at interfaces. Co-occurrence can be quantified by expressing the number of co-occurring pixels as a fraction of the total number or by using the M1 and M2 coefficients which, separately for each fluorophore, record the fraction of the total fluorescence that co-occurs (1).
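The two co-occurrence measures described above can be sketched in a few lines. This is a minimal illustration, not the original implementation: the function names and the four-pixel dataset are hypothetical, and "present" is assumed to mean an intensity above zero after background removal.

```python
def manders_m1_m2(red, green):
    """Return (M1, M2) for two equal-length lists of pixel intensities.

    M1: fraction of the total red intensity found in pixels where green > 0.
    M2: fraction of the total green intensity found in pixels where red > 0.
    """
    if len(red) != len(green):
        raise ValueError("channels must have the same number of pixels")
    red_total, green_total = sum(red), sum(green)
    if red_total == 0 or green_total == 0:
        raise ValueError("each channel needs some signal")
    red_coloc = sum(r for r, g in zip(red, green) if g > 0)
    green_coloc = sum(g for r, g in zip(red, green) if r > 0)
    return red_coloc / red_total, green_coloc / green_total


def cooccurring_pixel_fraction(red, green):
    """Fraction of pixels in which both fluorophores are present."""
    both = sum(1 for r, g in zip(red, green) if r > 0 and g > 0)
    return both / len(red)


# Hypothetical four-pixel example (assumption: background already removed).
red = [10, 20, 0, 30]
green = [5, 0, 8, 15]
m1, m2 = manders_m1_m2(red, green)
pixel_fraction = cooccurring_pixel_fraction(red, green)
```

In this toy dataset half of the pixels contain both fluorophores, yet M1 and M2 differ from that fraction and from each other, because each weights the co-occurring pixels by the intensity of one fluorophore.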
A correlation between the intensities could reflect a direct molecular interaction or an indirect interaction, with a third molecule or with subdomains of a cellular compartment. The variability of the fluorescence and therefore the potential for correlation arises from inhomogeneities within a domain. A correlation between two fluorophores is likely to be of greater biological significance than co-occurrence, though any change in colocalization that can be related to an experimental intervention is of interest. There is a need to measure colocalization and the accuracy with which measurements can be made sets the limits for an observable physiological response.
Two measures of correlation appear in most software, the Pearson correlation coefficient (PCC) and Manders' overlap coefficient (MOC) (1). The PCC is a well-established measure of correlation, originating with Galton in the late 19th century (2), but named after a colleague, and has a range of +1 (perfect correlation) to −1 (perfect but negative correlation) with 0 denoting the absence of a relationship. Its application to the measurement of colocalization between fluorophores is relatively recent (3). The MOC lacks the pedigree of the PCC and was created to meet perceived deficiencies in the PCC, principally that the PCC “is not sensitive to differences in signal intensity between the components of an image caused by different labeling with fluorochromes, photobleaching or different settings of amplifiers” and “the negative values of the correlation coefficient (PCC) are difficult to interpret when the degree of overlap is the quantity to be measured” (1), much-repeated claims (4, 5).
The two measures are mathematically similar, differing only in the use of either the absolute intensities (MOC) or the departure from the mean (PCC) in both the numerator and the denominator. The numerator is the sum of the products of the two intensities (which we will for convenience refer to as red and green) in homologous pixels and the denominator computes the maximal product, corresponding to perfect colocalization. The method works because the numerator is maximized when the relative intensities of the two fluorophores coincide: high with high and low with low, while combinations of high with low reduce the sum of their products. The denominator acts to limit the range of the coefficients: 0 to +1 for the MOC and −1 to +1 for the PCC.
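Written out in the usual notation, the description above corresponds to the following pair of expressions. The symbols are chosen here for illustration: R_i and G_i are the red and green intensities of pixel i, and the barred quantities are the mean intensities of each channel.

```latex
\[
\mathrm{PCC} = \frac{\sum_i (R_i - \bar{R})(G_i - \bar{G})}
                    {\sqrt{\sum_i (R_i - \bar{R})^2 \,\sum_i (G_i - \bar{G})^2}},
\qquad
\mathrm{MOC} = \frac{\sum_i R_i G_i}
                    {\sqrt{\sum_i R_i^2 \,\sum_i G_i^2}}
\]
```

Setting both means to zero in the PCC expression gives the MOC expression, which is the sense in which the two differ only in the use of absolute intensities or departures from the mean.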
Two other measures of correlation have been used to quantify colocalization, the intensity correlation quotient (ICQ) (6, 7) and the Spearman rank correlation (SRC) (8, 9), both derived from the PCC. The SRC is a well-established statistical test and is simply the PCC applied to ranked data: intensities are replaced by the order in which they occur. The ICQ goes a step further than the SRC and only considers the sign: whether each of the two intensities is above or below its respective mean intensity. The numerator is the number of pairs of intensities that have a common sign, either minus and minus or plus and plus. The denominator is just the number of intensity pairs. This would give the ICQ method a range from 0 to 1 but, to align negative correlations with a negative coefficient, 0.5 is subtracted from the calculated values, creating a range from −0.5 to +0.5 (6).
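A minimal sketch of the SRC and ICQ as described above may help to make the definitions concrete. The function names and data are hypothetical. Two details are assumptions rather than statements from the text: tied intensities receive their average rank, and a pixel lying exactly on a mean is counted as not matching.

```python
import math


def pcc(x, y):
    """Pearson correlation: products of deviations from the mean."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def average_ranks(values):
    """Replace each value by its rank (1 = lowest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def src(x, y):
    """Spearman rank correlation: the PCC applied to ranked data."""
    return pcc(average_ranks(x), average_ranks(y))


def icq(x, y):
    """Fraction of pixels whose two deviations share a sign, minus 0.5."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    matches = sum(1 for a, b in zip(x, y) if (a - mx) * (b - my) > 0)
    return matches / n - 0.5


# Hypothetical monotonic but nonlinear relationship (green = red squared).
red = [1, 2, 3, 4, 5]
green = [1, 4, 9, 16, 25]
pcc_value, src_value, icq_value = pcc(red, green), src(red, green), icq(red, green)
```

For this dataset the SRC reaches its maximum because the ranks agree perfectly, whereas the PCC falls slightly short of +1 because the relationship is not linear.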
It is hard to visually assess the degree of colocalization from a pair of images even when they are overlaid. A more informative alternative is to display the intensities of the pairs of homologous pixels in a scattergram (Fig. 1A). Each axis covers the intensity range of one of the fluorophores and the scattergram shows the frequency of occurrence of each pair of intensities, which reveals any correlation between the fluorophores.
Figure 1. Colocalization, scattergrams, and regions of interest (ROI). A: Two images and their log frequency distribution histograms. For each pixel in the pair of fluorescent images, the two intensities are used as the coordinates of an entry in the scatterplot. This shows the relationship between the two fluorophores. Pixels from the whole area, including areas outside the cell, are included. A grayscale look-up table shows the frequency of occurrence of each pair of intensities. Note that a white background has been used for the scattergrams and that the fluorescent images have been contrast stretched for display purposes, but that the histograms and scattergrams show the original distribution. B: Correlation measurements and the background intensity. C: Scattergrams for different ROIs (inserted top left), showing which pixels were included in the analysis: the nuclear ROI is speckled because an intensity threshold (mean plus twice the standard deviation) was also employed.
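The construction of a scattergram can be sketched as a frequency count over intensity pairs. This is an illustrative sketch only: the function names are hypothetical and the seven-pixel dataset uses a three-level intensity scale purely to keep the table small.

```python
from collections import Counter


def scattergram(red, green):
    """Count how often each (red, green) intensity pair occurs."""
    if len(red) != len(green):
        raise ValueError("channels must have the same number of pixels")
    return Counter(zip(red, green))


def as_grid(counts, max_intensity):
    """Dense table: rows indexed by green intensity, columns by red intensity."""
    size = max_intensity + 1
    return [[counts.get((r, g), 0) for r in range(size)] for g in range(size)]


# Hypothetical pair of seven-pixel images with intensities 0 to 2.
red = [0, 0, 1, 1, 2, 2, 2]
green = [0, 0, 1, 2, 2, 2, 2]
counts = scattergram(red, green)
grid = as_grid(counts, max_intensity=2)
```

For display, the counts would typically be mapped to a grayscale, possibly after taking logarithms so that rare intensity pairs remain visible.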
Given the appearance of several reviews on colocalization (4, 10–12) and related literature, it is surprising that a critical comparison of the methods used to measure colocalization using correlation has not been undertaken. This we seek to remedy.
It is self-evidently worthwhile to quantify colocalization but the plethora of available coefficients (PCC, MOC, ICQ, SRC, M1, M2, k1, and k2) and their differing meanings can be confusing. We have made a detailed examination of two of the coefficients used for correlation measurements, the PCC and its derivative the MOC, to establish how they work and whether they are useful. The two coefficients are almost identical and differ only in the use of the absolute intensity, by the MOC, or the deviation from the mean, by the PCC, a small but significant change.
The MOC was created as an improvement on the PCC, to be “…especially applicable when the intensities of the fluorescence of detected antigens differ” (12) and because the PCC “is not sensitive to differences in signal intensity between the components of an image caused by different labeling with fluorochromes, photobleaching or different settings of amplifiers” (1), claims that have been repeated uncritically (5, 11). We find that both the PCC and MOC are, within wide limits, independent of the magnitude of the signal. Therefore, the major claim made for the MOC falls.
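The independence from signal magnitude can be checked numerically. The sketch below uses hypothetical data and function names, and models a change in gain as multiplication of one channel by a constant, which is an assumption about how "magnitude of the signal" would be simulated.

```python
import math


def pcc(x, y):
    """Pearson correlation: products of deviations from the mean."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def moc(x, y):
    """Manders overlap: products of the absolute intensities."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


red = [1, 2, 3, 4, 5]
green = [2, 1, 4, 3, 6]
dim_green = [0.25 * g for g in green]  # same signal at a quarter of the gain

pcc_change = abs(pcc(red, dim_green) - pcc(red, green))
moc_change = abs(moc(red, dim_green) - moc(red, green))
```

The constant multiplies the numerator and the denominator of each coefficient equally, so it cancels and both changes are zero to within rounding.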
Two further coefficients, k1 and k2, were derived from the MOC, by using only the intensity of one fluorophore in the denominator (1). The product of the intensities of both fluorophores is then related to the intensity of a single fluorophore, hence the need for two coefficients. Absolute intensity is embedded in the k1 and k2 coefficients, but image acquisition is almost always adjusted to fit the detector's response range and the actual intensities have little meaning. k1 and k2 really require the actual number of molecules present in each pixel. Even comparisons between cells imaged under standard conditions are problematic because uptake and expression of fluorescent molecules vary widely. The k1 and k2 coefficients seem to have no advantages over the M1 and the M2 coefficients that were concurrently launched (1). M1 and M2 calculate for each fluorophore the fraction of the total intensity that co-occurs. The absolute values of k1 and k2 would only become meaningful if intensities were replaced by an estimate of the number of molecules present. However, even photon counting methods are not used routinely in biological imaging and estimating the number of molecules is difficult.
Offset has a differential effect on the PCC and the MOC. The PCC is completely independent of shifts in the signal but the MOC can either be increased or, more surprisingly, decreased by positive offsets. The MOC works because the product of the intensities (the numerator) is less than or equal to the denominator. A positive offset increases the numerator more than it increases the denominator and the MOC rises. A fall in the MOC after a positive offset to one of the images is therefore unexpected. It arises when the correlation is negative and high intensities in one image correspond to low intensities in the second image, and vice versa. Then, the increase in the numerator is less than the increase in the denominator and the MOC falls. An increase in the MOC followed by a progressive fall is also possible when a single image is offset. This occurs with low intensities, low enough that the product with homologous pixels is nearly zero, and when a small increase, say from 1 to 2, has a bigger effect on the numerator than on the denominator. In this limited sense, the MOC is sensitive to the absolute signal. It is therefore important that the offset be set correctly, i.e., that zero fluorescence produces zero detection. A correct offset also matters because the position of the intercept contains useful information: a line that does not pass through the origin indicates that part of the fluorescence is independent of the second fluorophore and of the correlation between the two fluorophores. Since the PCC does not report the size of any offset, this could be considered a limitation of the PCC.
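The contrast between the two coefficients under a positive offset can be illustrated with a small numerical sketch. The data, the offset of 10 and the function names are hypothetical, and the example covers only a positively correlated dataset with the same offset added to both channels. It does not attempt to reproduce the fall in the MOC described above.

```python
import math


def pcc(x, y):
    """Pearson correlation: products of deviations from the mean."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def moc(x, y):
    """Manders overlap: products of the absolute intensities."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


red = [1, 2, 3, 4, 5]
green = [2, 1, 4, 3, 6]

offset = 10  # hypothetical detector offset added to every pixel
red_off = [r + offset for r in red]
green_off = [g + offset for g in green]

pcc_plain, pcc_offset = pcc(red, green), pcc(red_off, green_off)
moc_plain, moc_offset = moc(red, green), moc(red_off, green_off)
```

The offset shifts both means by the same amount, so the deviations used by the PCC are untouched, while the absolute intensities used by the MOC all grow and push it closer to +1.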
The inclusion of uncorrelated pixels with low intensities, which emulate background pixels, has a profound effect on the PCC but leaves the MOC unchanged. The PCC becomes more positive and the effect on low or negatively correlated PCCs is substantial even when the percentage of background pixels is very small. The practical consequence is that the accurate measurement of the PCC requires the exclusion of background pixels, which should be standard practice. The failure to exclude pixels devoid of fluorescence transforms an apparently uncorrelated relationship into a highly positive PCC (10). The corollary is that pixels with intensities close to the mean affect the MOC but not the PCC.
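The effect of background pixels can be reproduced with a toy dataset. Everything in the sketch below is hypothetical: five foreground pixels with a perfect negative correlation, joined by four near-blank pixels whose intensities are uncorrelated with each other.

```python
import math


def pcc(x, y):
    """Pearson correlation: products of deviations from the mean."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def moc(x, y):
    """Manders overlap: products of the absolute intensities."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


# Foreground: high red goes with low green (perfect negative correlation).
fg_red = [100, 110, 120, 130, 140]
fg_green = [140, 130, 120, 110, 100]

# Background: intensities of 0 or 1 with no relationship between channels.
bg_red = [0, 1, 0, 1]
bg_green = [0, 0, 1, 1]

all_red, all_green = fg_red + bg_red, fg_green + bg_green

pcc_fg, pcc_all = pcc(fg_red, fg_green), pcc(all_red, all_green)
moc_fg, moc_all = moc(fg_red, fg_green), moc(all_red, all_green)
```

The background pixels sit far from both means, so their deviations dominate the PCC and turn a strongly negative coefficient into a strongly positive one. Their products are close to zero, so the MOC barely moves.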
Combining two positively correlated populations appreciably reduces both the PCC and the MOC, although the absolute change in the MOC is smaller. This is a limitation of both coefficients. The coefficients summarize what may actually be a complex relationship that might include differently correlated subpopulations and nonlinear relationships. The PCC underestimates nonlinear relationships and the Spearman rank correlation is a viable alternative (8, 9). The original images and the scattergram should always be examined, even though visual assessment is imperfect (13).
When a scattergram suggests a complex relationship, it is tempting to select and then separately analyze any subpopulation (14). However, this is a fundamental error in data analysis, since the selection of the subpopulation is based on the very relationship for which an objective measurement is required. A legitimate alternative is to select biologically meaningful areas for analysis, e.g., individual cells rather than a tissue or to separate the cytoplasm from the nucleus. This might initially involve selecting a distribution from the scattergram and establishing its spatial origin in the specimen, but if a physiologically relevant area is highlighted then all the pixels in that area must be considered in the correlation analysis, i.e., if the “interesting” pixels come from the cell nucleus it is not legitimate to analyze only the selected pixels.
The explanation for the different properties and sensitivities of the PCC and the MOC lies in the different weighting given to the intensities of the two fluorophores. Since the PCC is based on differences from the mean, intensity pairs near the mean are of little consequence whereas those at the extremes of the intensity range are highly influential (8), hence the consequences of including background pixels. In the MOC, combinations of high intensities carry significant weight while combinations in which one or both of the pair are of low intensity have little influence on the numerator and a small influence on the denominator. This seems an attractive feature for a correlation coefficient, but a strong correlation requires that a match exists across the whole intensity range, including low intensities, and the MOC is blind in this region. One high-intensity pair can produce a MOC that is almost unaffected by any number of blank or low-intensity combinations, which undermines its value as an overlap coefficient and makes the MOC a poor measure of co-occurrence. The biggest difference between the MOC and the PCC is apparent in the pattern of weightings for the numerator and denominator: they are similar for the MOC but differ for the PCC. The ratio of the numerator to the denominator shows one main axis for the MOC but two axes for the PCC, one strongly negative. This makes the PCC an effective measure of correlation. The different pattern of weighting explains the quite different meanings of a coefficient of zero: the PCC reports zero when there is no relationship between the intensities whereas the MOC reports zero only when the two fluorophores totally avoid each other.
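The blindness of the MOC to blank pixels can be shown with an extreme, hypothetical case: a single bright pair of intensities accompanied by 99 pixels that are empty in both channels. The function names and intensities are illustrative only.

```python
import math


def moc(x, y):
    """Manders overlap: products of the absolute intensities."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def cooccurring_pixel_fraction(red, green):
    """Fraction of pixels in which both fluorophores are present."""
    both = sum(1 for r, g in zip(red, green) if r > 0 and g > 0)
    return both / len(red)


# One bright pixel pair plus 99 pixels that are blank in both channels.
red = [200] + [0] * 99
green = [180] + [0] * 99

overlap = moc(red, green)
fraction = cooccurring_pixel_fraction(red, green)
```

Blank pixels add nothing to either the numerator or the denominator, so the MOC stays at its maximum although only 1% of the pixels contain both fluorophores.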
The SRC is attractive because, unlike the PCC, it does not require a normally distributed population, a prerequisite that many biological specimens may not meet. The SRC also detects nonlinear correlations and is less sensitive to outlying datapoints than the PCC (8, 9). It might be good practice to compare SRC and PCC and examine the raw data should they differ.
A new correlation method that counts only whether intensities are above or below the mean has been developed (6). The ICQ method simply expresses the number of matching pixels, when both are either above or below their mean, as a fraction of the total, and then subtracts 0.5. The subtraction ensures that negative correlations have a negative quotient, within a −0.5 to +0.5 scale. This scale differs from the more common −1 to +1 generally used for correlation. A remedy is simply to double the ICQ (15). The ICQ is a simple and therefore intelligible coefficient. The disadvantage is that pixels marginally above the mean carry exactly the same weight as pixels with more extreme intensities. Therefore, the ICQ is sensitive to changes in pairs of pixels that fall near the mean intensity of either fluorophore. By comparison, the PCC is almost unaffected by changes in this subset of pixels. Surprisingly, the ICQ was not rigorously compared with established coefficients when introduced (6). The ICQ performed well over the range of correlations produced by changing the copy fraction, being similar to the PCC and SRC. A tendency to flip between values was seen when a single pixel was moved, and examination of the weightings suggests that there are datasets that could undergo substantial changes without affecting the ICQ. The ICQ is nevertheless an interesting innovation.
A mistake often arises when two fluorophores that do not co-occur, with perhaps one in the cytoplasm and the other in the nucleus, are nonetheless tested for correlation. The PCC then reports a negative correlation, whereas the MOC reports a plausibly low value, the one occasion it delivers. This PCC is clearly spurious but these negative correlations are not always recognized as artifacts (10). It is important to differentiate between a true negative correlation, where high intensities are matched with low intensities, and this “not in the same place” error. The lack of co-occurrence could be detected by the M1 and M2 coefficients. We strongly advocate thresholding to exclude pixels that do not contain both fluorophores and the separate analysis of biologically distinct regions. Automatic thresholding, either using the idea that the background pixels are uncorrelated (16) or based on the background mean and standard deviation, is an alternative to operator-controlled thresholding.
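Thresholding based on the background mean and standard deviation might be sketched as follows. The function names and data are hypothetical, and the multiplier k = 2 is an assumption borrowed from the "mean plus twice the standard deviation" threshold mentioned for the ROI analysis in Figure 1. The background statistics are assumed to come from a cell-free area of each image.

```python
import statistics


def background_threshold(background, k=2.0):
    """Threshold at the background mean plus k standard deviations."""
    return statistics.mean(background) + k * statistics.pstdev(background)


def select_cooccurring(red, green, red_thr, green_thr):
    """Keep only pixel pairs in which both channels exceed their threshold."""
    return [(r, g) for r, g in zip(red, green) if r > red_thr and g > green_thr]


# Hypothetical intensities sampled from a cell-free area of each image.
red_background = [1, 2, 3, 2, 2]
green_background = [0, 1, 2, 1, 1]

# Hypothetical pixels from the region to be analyzed.
red = [2, 50, 3, 80, 60, 1]
green = [1, 40, 70, 2, 55, 0]

red_thr = background_threshold(red_background)
green_thr = background_threshold(green_background)
kept = select_cooccurring(red, green, red_thr, green_thr)
```

Only the retained pairs would then be passed to a correlation coefficient, while the discarded fraction of each channel would be reported separately, for example through M1 and M2.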
The MOC is considered to be easier to interpret than the PCC since it only reports positive values (4, 5). Since negative correlations can arise, for example from an enzyme that converts a fluorescent molecule into a nonfluorescent form, or from quenching, FRET or localized avoidance, it seems appropriate to record them. The original case for the MOC, “the negative values of the correlation coefficient (PCC) are difficult to interpret when the degree of overlap is the quantity to be measured” (1), is much more restricted and includes the important caveat “when the degree of overlap is of interest.” Like many caveats, this one has been overlooked in the discussion of the PCC and the MOC (4, 5).
The question arises as to what specifically “overlap” refers to in the context of the MOC. It remains undefined in the original article (1), unless the equation for the MOC is taken to be the definition and “an overlap coefficient equal to 0.5 implies that 50% of both components of the image overlap” is accepted, a claim for which there appears to be no justification. The assumption is that overlap is some measure of the degree of similarity in the distribution of two fluorophores, but the MOC is a curious hybrid measure combining elements of correlation with a highly weighted form of co-occurrence. It is in no way comparable with either the percentage of pixels in which co-occurrence is found or the M1 and M2 coefficients, which report the fraction of each fluorophore's intensity that co-occurs.
It has been suggested that a threshold exists for values of the PCC (10, 12) and the MOC (12) that marks biologically meaningful colocalization and, conversely, below which colocalization is deemed unimportant. It has been stated that no conclusions can be drawn from a PCC between −0.5 and 0.5 (10) and the MOC's threshold is apparently 0.6, for which no supporting evidence or rationale has been presented (12). Our results show that a MOC below 0.6 cannot be obtained even from datasets that show minimal or even negative correlation and that low values of the PCC have biological meaning (17). Even after randomly shuffling the pixel intensities, the MOC can still return values above 0.6 while randomization, predictably, reduces the PCC to zero but, more surprisingly, leaves the ICQ positive (15). However, since the PCC and MOC are graded measures, the very idea of a threshold is strange (18), especially for values close to the nominal threshold, where a minor shift in the measurement would reverse the interpretation. The relevant biological consideration is whether the measured colocalization is changed experimentally. Even small changes can be important: “half an eye is just 1% better than 49% of an eye” (Richard Dawkins) and “information is any difference that makes a difference” (Gregory Bateson) (19). A more pertinent consideration is the accuracy and precision with which measurements can be made. It is acknowledged that the quality of the images influences the accuracy of colocalization measurements (1, 16, 20) and that noise reduces the measured colocalization. A correction for noise has been demonstrated for the PCC and SRC (8, 21). The MOC is as insensitive to noise as to most other features of the data.
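The shuffling test mentioned above can be sketched on synthetic data. All names and values are hypothetical: 1000 pixels with intensities from 100 to 199, a perfectly correlated second channel, and a fixed random seed chosen arbitrarily so that the run is repeatable.

```python
import math
import random


def pcc(x, y):
    """Pearson correlation: products of deviations from the mean."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def moc(x, y):
    """Manders overlap: products of the absolute intensities."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


rng = random.Random(0)  # arbitrary fixed seed (assumption, for repeatability)

red = [100 + (i % 100) for i in range(1000)]
green = list(red)             # perfectly correlated copy
shuffled_green = list(green)
rng.shuffle(shuffled_green)   # destroys the pixel-by-pixel relationship

pcc_before, moc_before = pcc(red, green), moc(red, green)
pcc_after, moc_after = pcc(red, shuffled_green), moc(red, shuffled_green)
```

With these intensities, even the least favorable pairing of the two channels, highest with lowest, gives a MOC of about 0.93. No rearrangement of the pixels can therefore bring it below 0.6, whereas the PCC of the shuffled data falls close to zero.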
Overall, the PCC and the MOC produce values that differ widely for both the simulated datasets we have employed and with biological images (15, 22) and there is little correlation between these two measures of correlation. The PCC does measure correlation, the degree to which the intensity variations of one fluorophore follow variations in the second fluorophore, but since only pixels containing both signals are analyzed, the PCC should be qualified by the M1 and M2 coefficients, which report the fraction of the total intensity that co-occurs (1, 23). The MOC provides a highly weighted measure of co-occurrence, is also affected by correlation and is sensitive to offset. For measurements of co-occurrence, the MOC should be replaced by M1 and M2. Given that colocalization is well supplied with coefficients, it would be productive to abandon the MOC and the related k1 and k2 pair of coefficients. The PCC, SRC, and perhaps the ICQ provide useful measures of correlation (Table 1).
Table 1. Comparison of correlation coefficients
| |PCC|MOC|ICQ|SRC|
|---|---|---|---|---|
|Theoretical range|−1 to +1|0 to +1|−0.5 to +0.5|−1 to +1|
|Offset background subtraction|Unimportant|Important|Unimportant|Unimportant|
|Weighting|Departure from the mean|Magnitude|None|Departure from mean rank|
|Inclusion of background pixels|Sensitive|Insensitive|Sensitive|Sensitive|
|Inclusion of midrange pixels|Insensitive|Sensitive|Sensitive|Slightly sensitive|
|Sensitivity to correlation|Good|Poor|Good|Good|
Ri,coloc is the intensity of the red fluorophore in pixels where the green fluorophore is present.
Gi,coloc is the intensity of the green fluorophore in pixels where the red fluorophore is present.