The Met Office has used imagery from the European Meteosat geostationary satellites for many years, initially for forecaster guidance but increasingly for quantitative use of the cloud products in the nowcasting system, Nimrod (Golding, 1998). One of the key requirements is to make the products available within minutes of the measurement. More recently the direct assimilation of the cloud products and radiances from the geostationary satellites has been developed (e.g. Munro et al., 2004; Kelly and Francis, 2008), which requires that the data be accurately pre-processed. The European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) provides some products in near real time (see http://www.eumetsat.int/Home/Main/Image_Gallery/Real_Time_Imagery/index.htm). However, to meet the timeliness requirements, the locally received data are processed at the Met Office, which also allows the flexibility to tailor products for the United Kingdom and other areas of interest to Met Office customers.
A significant milestone in the provision of geostationary imagery over the European and African regions was the launch by EUMETSAT of the first Meteosat Second Generation (MSG) satellite in 2002. This satellite carried new-technology sensors including the Spinning Enhanced Visible and Infrared Imager (SEVIRI) radiometer. Data from SEVIRI have been available since 2004, and will continue to play a major role in the nowcasting and short-range forecasting capabilities at the Met Office until at least 2018, by which time the third generation of Meteosat satellites is scheduled to have started operations. Compared to the first generation of Meteosat imagers, SEVIRI provides imagery at significantly higher spatial and temporal resolutions, and senses radiation in 12 different spectral channels in the solar and infrared spectra, compared with the three channels on the earlier satellites. There is scope, therefore, for significant improvement in our ability to derive quantitative products for use in nowcasting systems, numerical weather prediction (NWP) applications and as imagery for forecasters by fully exploiting the SEVIRI data.
Section 2 of this paper briefly introduces the SEVIRI instrument. The methodology to identify cloud contaminated pixels in the SEVIRI data is documented in Section 3. Section 4 examines the relative effectiveness of the cloud tests applied to the imagery. Sections 5 and 6 compare the Met Office SEVIRI mask with that of the Nowcasting Satellite Applications Facility (SAFNWC), and with the cloud mask derived from Moderate Resolution Imaging Spectroradiometer (MODIS) imagery. Finally, Section 7 summarizes the results and Section 8 concludes the paper.
2. Summary of the SEVIRI instrument
SEVIRI is a radiometer on MSG that measures radiation reflected from and emitted by the Earth's surface and its atmosphere at more wavelengths and greater temporal and spatial resolution than the imager on any other geostationary meteorological satellite. MSG is positioned above approximately 0° latitude, 0° longitude, covering a region including Europe, Africa, the Middle East and the North and South Atlantic. MSG-1 was launched in August 2002, and became operational (as Meteosat-8) in January 2004. MSG-2 was launched in December 2005, and, as Meteosat-9, was declared the prime 0° satellite in April 2007. The other two satellites in the series, MSG-3 and MSG-4, are due to be launched in 2011 and 2013 respectively.
A summary of the 12 different spectral channels is given in Table I. Four of the channels (i.e. those centred at 0.6, 0.8 and 1.6 µm plus the High Resolution Visible (HRV) channel) are located in the visible/near-infrared part of the spectrum (i.e. day time only use), with the 3.9 µm channel acting as a solar + thermal infrared channel during the day and a thermal infrared-only channel during night time. The remaining seven channels are thermal infrared only, and therefore behave in the same way by day or night. The spatial resolution of the data is 3 km (at sub-satellite point) for 11 of the channels, with the HRV channel having a spatial resolution of 1 km though not with full disc coverage. The full disc is viewed once every 15 min, enabling monitoring of rapidly evolving phenomena, and potentially aiding the weather forecaster in the swift recognition and prediction of dangerous weather phenomena such as intense convection, heavy rain, fog and rapid cyclogenesis. More details on the SEVIRI instrument can be found in Schmetz et al. (2002).
Table I. Summary of Spinning Enhanced Visible and Infrared Imager (SEVIRI) channels (table body not recovered; surviving headings: central wavelength (µm); IR + solar)
The MSG image data are received at the Met Office continuously via the EUMETCast system (EUMETSAT, 2006), and are routinely processed by the operational processing system to generate single channel images (e.g. visible and infrared), false-colour (RGB) images and a variety of other derived products. Some of the imagery is available on the Met Office web site at www.metoffice.gov.uk/weather/satellite/index.html
3. Cloud mask determination
The nowcasting system requirement at the Met Office is for cloud products sampled every 30 min and generated within 3 min of the nominal end of the measurement period. MSG has a set of channels designed to provide accurate cloud-top pressure and amount estimates. The first step in deriving cloud parameters is to produce a cloud mask which identifies clear, partly cloudy and fully cloudy fields of view using techniques previously developed for the Advanced Very High Resolution Radiometer (AVHRR) (e.g. Saunders and Kriebel, 1988). However, an important development for processing MSG radiances is the inclusion of clear-sky simulated radiances, produced by inputting NWP model profiles to a fast radiative transfer model. This allows observed and simulated radiances to be compared: the differences between them should be small for clear-sky pixels. A summary of the cloud detection tests applied to SEVIRI radiances is given in Table II. They encompass spectral differencing using various channel combinations, with the measured differences compared against the simulated differences. Spatial coherence and visible reflectance information are also used. A pixel is flagged as cloud contaminated if one or more tests return a positive result. If no tests indicate the presence of cloud the pixel is flagged as clear. However, if there are missing raw satellite data (for example due to errors in the transmission or reception system) such that one or more tests cannot be applied to a pixel and no other tests have indicated cloud, the pixel is flagged as undetermined.
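The flagging rules in the preceding paragraph can be illustrated with a minimal sketch. The names `PixelFlag` and `combine_tests` are hypothetical and are not taken from the operational code; tests that are not applicable to a pixel by design (for example night-only tests in day time) are assumed to be omitted from the input rather than passed as missing.

```python
from enum import Enum
from typing import Optional, Sequence


class PixelFlag(Enum):
    CLEAR = "clear"
    CLOUD = "cloud contaminated"
    UNDETERMINED = "undetermined"


def combine_tests(results: Sequence[Optional[bool]]) -> PixelFlag:
    """Combine the outcomes of all cloud tests applied to one pixel.

    Each entry is True (test indicates cloud), False (test indicates no
    cloud) or None (test could not be applied because raw data are missing).
    """
    # One positive test is enough to flag the pixel as cloud contaminated.
    if any(r is True for r in results):
        return PixelFlag.CLOUD
    # No test indicated cloud, but at least one could not be applied.
    if any(r is None for r in results):
        return PixelFlag.UNDETERMINED
    return PixelFlag.CLEAR
```

For example, `combine_tests([False, None])` yields the undetermined flag, whereas `combine_tests([True, None])` yields cloud contaminated because a positive result takes precedence over missing data.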
Table II. Summary of cloud mask tests (columns: test; applicability; condition; ancillary data required). Only the entries recoverable from the source are listed; the test conditions are described in Section 3.
Snow: day time and twilight only (θsol≤85°).
Fog/low cloud: (i) night and twilight.
SST: SSTretrieved − T* < threshSST; pre-calculated regression coefficients, background skin temperature (T*).
8.7 µm: pre-calculated regression coefficients.
10.8 µm spatial coherence: land (night only) and sea only; evaluated in surrounding 3 × 3 block of pixels.
HRV spatial coherence: day time and twilight only; land and sea only; evaluated in 3 × 3 block of high resolution pixels.
0.8 µm spatial coherence: day time and twilight only (θsol≤85°); evaluated in surrounding 3 × 3 block of pixels.
Visible threshold: separate conditions for sea and land; EUMETSAT clear sky reflectance data.
Visible/NIR ratio: separate conditions for sea and land; day time and twilight only (θsol≤89°); land and sea only.
Twilight temporal differencing: twilight only (75° < θsol < 90°); cloud mask from T-60 min and T-45 min prior slots.
In order to evaluate the performance of the cloud mask, quantitative comparisons have been carried out against other cloud masks, as described in Sections 5 and 6. In addition to this, qualitative comparisons with SEVIRI imagery are made by eye. In the discussions below, the term ‘false positive’ is used to indicate a clear pixel wrongly flagged as cloudy as assessed by such qualitative comparisons. Likewise ‘false negatives’ refer to cloudy pixels incorrectly flagged as clear.
In the following, the tests make use of the observed reflectances at 0.6, 0.8 and 1.6 µm and in the HRV channel, the observed brightness temperatures (BTs) for the 3.9, 8.7, 10.8 and 12.0 µm channels, and the corresponding background clear-sky brightness temperatures, which are calculated from a recent forecast profile using the RTTOV fast radiative transfer model (Saunders et al., 1999). Hourly forecast fields are available and these are interpolated onto 15 min intervals for use by RTTOV. EUMETSAT have derived a SEVIRI pixel-resolution land surface emissivity (LSE) map (Lutz and König, 2008) from the University of Wisconsin-Madison/Cooperative Institute for Meteorological Satellite Studies (UW/CIMSS) high resolution MODIS LSE atlas (Seemann et al., 2008). The pixel LSE values from the EUMETSAT atlas are averaged over NWP model grid points and the resulting emissivities are input to RTTOV. Currently RTTOV Version 9 is being used, which allows the radiative transfer calculations to be carried out on the background model levels rather than requiring interpolation onto the 43 levels standard to previous versions of RTTOV. Throughout the cloud processing algorithms, simulated brightness temperatures and model data (where they are used explicitly) are linearly interpolated onto pixel locations. The interpolation takes account of surface type so that wherever possible the land/sea classification of the pixel matches that of all the interpolated model data points.
The observed reflectance values in each visible channel have been normalized by a factor that depends on the solar zenith angle, θsol. For θsol ≤ 85° the factor is 1/cos θsol, the commonly applied multiplier which accounts for the varying local solar zenith angle at the surface. It is modified at high solar zenith angles in order to avoid excessively high reflectance values near the terminator: for θsol > 85° the factor increases linearly with θsol at the gradient of 1/cos θsol at θsol = 85°.
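A minimal sketch of this normalization factor follows, assuming that the commonly applied multiplier is 1/cos θsol and that the linear extension above 85° is continuous with it at 85°. The function name `normalization_factor` and the constant `LIMIT_DEG` are illustrative only.

```python
import math

LIMIT_DEG = 85.0  # solar zenith angle above which the factor is extended linearly


def normalization_factor(theta_sol_deg: float) -> float:
    """Multiplier applied to visible reflectances (sketch, assumed form)."""
    if theta_sol_deg <= LIMIT_DEG:
        return 1.0 / math.cos(math.radians(theta_sol_deg))
    limit = math.radians(LIMIT_DEG)
    value_at_limit = 1.0 / math.cos(limit)
    # Gradient of 1/cos(theta) with respect to theta (per radian) at the limit.
    gradient = math.sin(limit) / math.cos(limit) ** 2
    return value_at_limit + gradient * (math.radians(theta_sol_deg) - limit)
```

Under these assumptions the factor remains finite at the terminator (roughly 23 at θsol = 90°), whereas the unmodified 1/cos θsol diverges there.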
The tests carried out on each pixel depend upon the time of day and the surface type. Day time is defined by θsol < 80°, twilight by 80°≤θsol < 90° and night time by θsol ≥ 90°. A pixel land/sea mask distinguishes between land, sea and coastal pixels, the latter being those pixels which contain both land and sea surfaces. There are eight gradations of coastal pixels representing the ratio of land to sea within each pixel. A further pixel land surface type map derived from the International Geosphere-Biosphere Programme (IGBP) surface atlas (Belward, 1996; Francis, 2004) is used to identify specific surface types (e.g. ‘barren’ surfaces) and inland water bodies.
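The time-of-day definitions above translate directly into a small classifier. This is a sketch only; the function name `illumination_class` is hypothetical.

```python
def illumination_class(theta_sol_deg: float) -> str:
    """Classify a pixel by solar zenith angle using the limits given in the text."""
    if theta_sol_deg < 80.0:
        return "day"
    if theta_sol_deg < 90.0:
        return "twilight"
    return "night"
```

Note that the boundary values fall in the later class: 80° is twilight and 90° is night.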
Cloud tests which employ the visible channels can incorrectly identify cloud near Sun glint. This mostly affects water surfaces, but desert regions at high viewing angles, such as the Arabian Peninsula, can also be affected. Two parameters are calculated to mitigate this. The first is the satellite zenith angle at the centre of the Sun glint area (the solar specular reflection point). The second, θglint, is a proxy for the intensity of the Sun glint for a given pixel based on geometrical considerations, with θglint = 0° indicating the solar specular reflection point. θglint is defined by:
where θsat is the satellite zenith angle, and θscat is the scattering angle defined for the range (0°, 180°) from backward scattering (0°) to forward scattering (180°) directions. Sea and inland water pixels are tagged as being affected by Sun glint if θglint < 25°. Some cloud tests employ further conditions to identify Sun glint-affected pixels.
There follows a description of each of the tests applied to identify cloud contaminated pixels. The tests are also tabulated in Table II.
3.1. Snow test
This test is based on the SAFNWC snow test developed by Derrien and Le Gléau (2005) and is applied over land where θsol≤85°. It requires the observed reflectances at 0.6, 0.8 and 1.6 µm, the observed brightness temperatures at 3.9, 10.8 and 12.0 µm, and the RTTOV-calculated background clear-sky brightness temperature at 10.8 µm.
In this test, a pixel is flagged as being snow-contaminated if all the following conditions hold true:
where the snow test thresholds and offsets are pre-defined, and one further offset relates to the gross test (see below). The first two conditions exploit the lower reflection of snow and ice compared to water clouds at 1.6 and 3.9 µm. Conditions (5) and (6) are based on the fact that snow and ice are expected to have slightly lower temperatures than snow- and ice-free land surfaces beneath a clear sky. Condition (7) discriminates between thin cirrus cloud and snow and ice (see the thin cirrus test below). The final condition attempts to prevent shadows being mistaken for snow or ice.
3.2. Gross test
This is the most general cloud test, and is applied at all times and for all surface types. It has been used in many previous cloud detection schemes, such as the APOLLO scheme (Kriebel et al., 2003). The basic premise of the test is that if the observed brightness temperature at 10.8 µm is colder than that which one would expect given the background cloud-free model profile, then this is likely to be due to the presence of cloud. Hence, a pixel is flagged as being cloud contaminated if the observed 10.8 µm brightness temperature is lower than the background clear-sky 10.8 µm brightness temperature by more than a pre-defined offset, which can vary with time of day and underlying surface type.
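The gross test can be sketched as a single comparison, assuming the condition takes the form "observed 10.8 µm BT below the background clear-sky BT minus the offset". The function and argument names are hypothetical, and the offset value in the example is a placeholder rather than an operational setting.

```python
def gross_test(bt_obs_108: float, bt_bg_108: float, offset: float) -> bool:
    """Return True if the pixel is flagged as cloud contaminated.

    bt_obs_108: observed 10.8 micron brightness temperature (K)
    bt_bg_108:  simulated clear-sky 10.8 micron brightness temperature (K)
    offset:     pre-defined offset (K); assumed chosen by time of day and surface
    """
    return bt_obs_108 < bt_bg_108 - offset


# Illustrative call with a placeholder offset of 5 K.
flagged = gross_test(bt_obs_108=270.0, bt_bg_108=285.0, offset=5.0)
```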
3.3. Thin cirrus test
This is another widely used test (e.g. Inoue, 1985; Kriebel et al., 2003; Derrien and Le Gléau, 2005) which exploits the higher absorption by ice clouds at 12.0 µm compared to 10.8 µm: the attenuation by high, thin clouds of radiation emitted by the surface is greater at 12.0 µm than at 10.8 µm. Therefore, the difference in observed brightness temperatures in these channels is expected to be larger than the difference in the corresponding simulated brightness temperatures for pixels containing such cloud. The test is applied at all times and for all surface types. A pixel is flagged as being cloud contaminated if the observed 10.8 µm minus 12.0 µm brightness temperature difference exceeds the corresponding simulated clear-sky difference by more than a pre-defined offset, which is larger over barren surfaces to account for spectral variation in surface emissivity. An additional condition must be satisfied over land surfaces:
This reduces the risk of false positives over very warm, moist, cloud-free areas as noted by Derrien and Le Gléau (2005).
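The main condition of the thin cirrus test can be sketched as a comparison of observed and simulated channel differences. This is an illustration under the assumption that the test compares the two differences against a single offset; the additional land-surface condition is not included because its form is not recoverable here, and all names are hypothetical.

```python
def thin_cirrus_test(
    bt_obs_108: float,
    bt_obs_120: float,
    bt_bg_108: float,
    bt_bg_120: float,
    offset: float,
) -> bool:
    """Return True if the split-window difference suggests thin, high cloud."""
    observed_diff = bt_obs_108 - bt_obs_120   # observed 10.8 minus 12.0 micron BT
    simulated_diff = bt_bg_108 - bt_bg_120    # same difference for clear sky
    return observed_diff - simulated_diff > offset
```

Using the simulated difference as the reference, rather than a fixed threshold, lets the test adapt to the clear-sky water vapour signal at each pixel.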
3.4. Fog/low-cloud test
This test is also common to other cloud masks (e.g. Eyre et al., 1984; Saunders and Kriebel, 1988; Derrien and Le Gléau, 2005). It is applied to night time and twilight pixels only over all surface types. The emissivity of liquid water clouds is larger at 10.8 µm than at 3.9 µm so the observed brightness temperatures are expected to differ by a greater amount than the simulated clear-sky brightness temperatures for pixels containing liquid water clouds. A pixel is flagged as being cloud contaminated if:
where the offset and the threshold are both pre-defined. Barren surface types also exhibit higher emissivity at 10.8 µm so a larger offset is used for these surface types to reduce the risk of confusing such surfaces with cloud.
False positives are observed to occur over certain surface types such as barren regions and tropical rainforest, particularly over central Africa and tropical South America. In order to reduce the number of false positives, the test is not applied over these particular surface types if:
Liquid water cloud has a much higher emissivity at 8.7 µm than at 3.9 µm so this condition rarely rejects genuine cloud, whereas the land surfaces in question have more similar emissivities in the case of tropical forest or lower emissivities at 8.7 µm for arid surfaces.
Water clouds also exhibit lower emissivity at 8.7 µm than at 10.8 µm which can be exploited over vegetated (i.e. non-barren) surfaces at high latitudes (for example, Derrien and Le Gléau, 2005) at all times using the 8.7 µm channel. A pixel is flagged as being cloud contaminated if:
where the offset and the threshold are again both pre-defined.
3.5. Mixed scenes test
The mixed scenes test is similar in principle to the thin cirrus test, and again has a precedent in other cloud masks (e.g. Saunders and Kriebel, 1988; Derrien and Le Gléau, 2005). It detects thin, high clouds over all surface types at night time only using the fact that ice clouds absorb more readily at 12.0 µm than at 3.9 µm. A pixel is flagged as being cloud contaminated if:
where the offset is pre-defined.
3.6. Sea surface temperature test
This test is applied to sea pixels only. A pixel is flagged as being cloud contaminated if SSTret − T* < ΔTSST, the condition listed for this test in Table II, where T* is the skin temperature from the background field, ΔTSST is a pre-defined offset, and SSTret is the retrieved sea surface temperature, calculated from the following regression relationship:
where a0–4 are pre-defined regression coefficients, S = (1/cosθsat)− 1 and θsat is the satellite zenith angle. The regression coefficients were derived from radiative transfer simulations based on a reference set of atmospheric profiles (Chevallier, 1999).
3.7. 8.7 µm test
This test is also applied to sea pixels only. A pixel is flagged as being cloud contaminated if:
where ΔTB(8.7) is a pre-defined offset, and the predicted 8.7 µm brightness temperature is obtained from the following regression relationship:
where b0–2 are pre-defined regression coefficients. The regression coefficients were calculated as for the sea surface temperature test.
3.8. 10.8 µm spatial coherence test
This test is also common to other masks including those of Saunders and Kriebel (1988) and Derrien and Le Gléau (2005). It makes use of the observation that broken cloud and cloud edges typically exhibit higher spatial variability in their thermal characteristics than the surface beneath, particularly over oceans. A pixel is flagged as being cloud contaminated if the standard deviation of the observed 10.8 µm brightness temperature in a 3 × 3 box centred on the pixel in question exceeds a pre-defined threshold, which varies with underlying surface type. The test is not applied to land surfaces in daylight and twilight, and is not applied to coastal pixels at all.
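The 3 × 3 spatial coherence calculation can be sketched as follows. The use of the population standard deviation, the handling of image edges and the function names are assumptions made for illustration; the threshold passed in is a placeholder.

```python
from statistics import pstdev
from typing import List, Optional


def local_std_3x3(bt: List[List[float]], row: int, col: int) -> Optional[float]:
    """Standard deviation of the 3 x 3 box centred on (row, col).

    Returns None at the image edge, where a full box is not available.
    """
    n_rows, n_cols = len(bt), len(bt[0])
    if not (1 <= row < n_rows - 1 and 1 <= col < n_cols - 1):
        return None
    box = [bt[r][c] for r in range(row - 1, row + 2) for c in range(col - 1, col + 2)]
    return pstdev(box)


def spatial_coherence_test(
    bt: List[List[float]], row: int, col: int, threshold: float
) -> Optional[bool]:
    """True if the local variability exceeds the threshold, None if not applicable."""
    sigma = local_std_3x3(bt, row, col)
    return None if sigma is None else sigma > threshold
```

A uniform sea surface gives a standard deviation of zero, whereas a single cold (cloudy) pixel in the box raises it sharply, which is the behaviour the test relies upon.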
3.9. High resolution visible (HRV) spatial coherence test
Each standard SEVIRI pixel corresponds precisely to a 3 × 3 block of HRV pixels. This test uses the standard deviation of reflectances in such a 3 × 3 HRV block to identify cloud in the corresponding SEVIRI pixel. For some pixels (defined below), this standard deviation is normalized by the mean of the nine HRV reflectances. This normalization can reduce the occurrence of false positives and allows for cloud detection under Sun glint conditions. For the purposes of this test, Sun glint is defined by θglint < 40°.
For all day time sea pixels and twilight Sun-glint-affected sea pixels, a SEVIRI pixel is flagged as being cloud contaminated if:
where the pre-defined threshold varies according to whether or not pixels are affected by Sun glint.
For twilight sea pixels unaffected by Sun-glint, a SEVIRI pixel is flagged as being cloud contaminated if:
where the threshold is pre-defined.
For land pixels in day or twilight, a SEVIRI pixel is flagged as being cloud contaminated if:
where the pre-defined threshold is larger for barren surface types. For land pixels Condition (26) requires that all nine of the HRV reflectances should exceed 0.1, which reduces false positives caused by the edges of inland water bodies and by cloud shadows.
The test is not applied to snow-contaminated pixels or to coastal pixels. A further check is made on the surface types of the 3 × 3 block of standard resolution SEVIRI pixels around the pixel under test: they must either be all land or all water (i.e. sea or inland water) for the test to be applied. This further reduces the incidence of false positives at the edges of inland water bodies.
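The quantities used by the HRV spatial coherence test can be sketched as below. The sketch assumes an HRV array that is co-registered with the standard-resolution grid so that SEVIRI pixel (row, col) maps to HRV rows 3·row to 3·row + 2 and columns 3·col to 3·col + 2; this indexing, the population standard deviation and all function names are illustrative assumptions. The 0.1 reflectance limit for land pixels is taken from the text.

```python
from statistics import mean, pstdev
from typing import List, Optional, Tuple

MIN_LAND_REFLECTANCE = 0.1  # all nine HRV reflectances must exceed this over land


def hrv_block(hrv: List[List[float]], row: int, col: int) -> List[float]:
    """Nine HRV reflectances belonging to the SEVIRI pixel at (row, col)."""
    return [hrv[3 * row + dr][3 * col + dc] for dr in range(3) for dc in range(3)]


def hrv_statistics(block: List[float]) -> Tuple[float, Optional[float]]:
    """Return (standard deviation, standard deviation normalized by the mean)."""
    sigma = pstdev(block)
    mu = mean(block)
    # The normalized value is undefined when the mean reflectance is not positive.
    return sigma, (sigma / mu if mu > 0 else None)


def land_block_usable(block: List[float]) -> bool:
    """Land-pixel requirement that every HRV reflectance exceeds 0.1."""
    return all(r > MIN_LAND_REFLECTANCE for r in block)
```

Whether the raw or the normalized standard deviation is compared with the threshold depends on the surface type and illumination, as set out in the conditions above.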
3.10. 0.8 µm spatial coherence test
This test uses the standard deviation of 0.8 µm reflectances in the 3 × 3 block around the pixel in question. It is similar in principle to the HRV spatial coherence test and is useful in capturing some of the small-scale cloud in those regions where HRV data are unavailable. This test is applied only to sea pixels. To reduce false positives around cloud edges the test is not applied if:
where the condition involves the reflectance of the central pixel and the mean 0.8 µm reflectance in the 3 × 3 block of pixels.
To reduce false positives due to noise in the channel the test is not applied if:
Reflectances are calculated under the assumption that the underlying surface (or cloud) behaves approximately as a Lambertian reflector. Under certain bi-directional configurations (notably at high illumination and viewing angles), this assumption can be violated sufficiently that calculated reflectance values may exceed 1.0. To reduce false positives due to these cases the test is not applied if:
where the maximum is taken over the reflectances of the 3 × 3 block of pixels.
To reduce false positives due to noise in the channel near the terminator the test is not applied if:
To reduce false positives due to cloud shadows in twilight conditions the test is not applied if:
The severity of Sun glint increases as the solar specular reflection point approaches the edge of the disc. The satellite zenith angle at the solar specular reflection point, described previously, is used to distinguish cases of ‘moderate’ and ‘severe’ Sun glint. Moderate Sun glint is defined by:
and severe Sun glint is defined by:
In cases of moderate Sun-glint, the test is not applied if:
where the minimum is taken over the reflectances of the 3 × 3 block of pixels. This reduces the occurrence of false positives due to cloud shadows caused by high illumination angles.
For daytime pixels unaffected by Sun glint, a pixel is flagged as being cloud contaminated if:
where the threshold is pre-defined.
For all twilight pixels where θsol≤85° (as per condition (30)), and for day time pixels affected by moderate Sun glint, a pixel is flagged as being cloud contaminated if:
where the pre-defined threshold for the normalized standard deviation varies according to the time of day and whether or not the pixel is affected by Sun glint. The test is not applied to pixels affected by severe Sun glint.
3.11. Visible threshold test
The observation that clouds frequently have higher reflectances than the underlying surface is commonly used for day time cloud detection (e.g. Saunders and Kriebel, 1988). Over the sea, a pixel is flagged as being cloud contaminated if:
where the reflectance threshold is pre-defined.
Over coasts, a pixel is flagged as being cloud contaminated if:
where the reflectance threshold is again pre-defined. In both Equations (37) and (38), the θsol dependence was empirically derived to account for increased clear-sky reflectances at higher solar zenith angles.
Over land, the EUMETSAT clear sky reflectance map (CRM) product (EUMETSAT, 2008) is used to generate dynamic thresholds. A pixel is flagged as being cloud contaminated if:
where the condition uses the clear-sky 0.6 µm reflectance from the CRM and a pre-defined offset. The nominal CRM product time is 1200 UTC: although these data may be used effectively over much of the disc throughout many slots each day, some restrictions are required to avoid false positives in the cloud mask due to varying illumination conditions. The CRM data are used only for slots between 0500 UTC (0400 in the northern hemisphere winter) and 2000 UTC. A higher offset is used for slots before 0600 and after 1800, and also at high solar zenith angles to mitigate false positives caused by enhanced forward scattering. In addition, certain barren surfaces (for example the Middle East region) can similarly exhibit strong forward scattering and so fixed thresholds (independent of the CRM data) are employed over these regions under problematic illumination conditions. Finally, if the 0.6 µm CRM reflectance exceeds 0.60 for a pixel, the fixed thresholds are used, as such high CRM values are often symptomatic of either snow or cloud contamination in the CRM product. In the cases where the CRM data are not used, a pixel is flagged as cloud contaminated if:
where the threshold is pre-defined.
The test is not applied to any pixels contaminated by snow or ice, or to sea or inland water pixels affected by Sun glint.
3.12. Visible/near-infrared ratio test
This is another test used by Saunders and Kriebel (1988). It is applied to pixels where θsol≤89°, over both sea and land (but not coasts). It exploits the fact that water typically has a smaller reflectance at 0.8 µm than at 0.6 µm, while the opposite is true for vegetated land surfaces. In contrast, clouds have a similar reflectance in the two channels. Over the sea or over inland water, a pixel is flagged as being cloud contaminated if:
where the θsol-dependent term is given by:
Over the land, a pixel is flagged as being cloud contaminated if:
where the θsol-dependent term is given by:
The θsol dependencies in Equations (42) and (44) were derived empirically. The test is restricted to specific land surface types as it can introduce false positives particularly over water-logged or sparsely vegetated surfaces. False positives can also occur around rivers and lake edges: these are reduced by requiring the pixels in the 3 × 3 block around the pixel under test to be either all land or all water (i.e. sea or inland water). The test is not applied at all to sea or inland water pixels affected by Sun glint. In addition, the test is not applied to sea pixels with θglint < 40° and for which HRV data are available on the grounds that the HRV test captures essentially the same cloud with much lower risk of false positives due to Sun glint. The test is also not applied to snow-contaminated pixels or to coastal pixels.
3.13. Twilight temporal differencing test
Derrien and Le Gléau (2007) devised a temporal differencing algorithm which is designed to capture low cloud and fog in twilight conditions. Daytime detection of low cloud is efficiently performed by visible tests. At night time the differing emissivity of liquid water clouds at 3.9 µm compared to 10.8 µm is exploited. In twilight conditions detection of low clouds is difficult because the visible channels become noisy approaching the terminator and the 3.9 µm channel is contaminated by solar radiation.
The temporal differencing algorithm is applied in two parts. First, a seeding process takes place in which twilight pixels (those for which 80°≤θsol < 90°) in the current slot that were classified as cloudy in the cloud mask from 1 h ago and whose thermal characteristics have not changed significantly are flagged as seed pixels. Second, once all pixels have been examined a region growing process occurs in which seed groups are expanded to encompass adjacent pixels which share the thermal and visible characteristics of the seed group. Some modifications to the original algorithm were made to account for differences between the SAFNWC and Met Office cloud masks.
The seeding process in the Met Office mask requires the pixel to have been flagged as cloudy both in the T-45 min and T-60 min cloud masks. Any pixels flagged by the thin cirrus, mixed scenes or 8.7 µm cloud tests in the T-60 min mask are rejected from the seeding process since these tests typically detect high-level cloud. Any pixels flagged only by one or more of the spatial coherence tests in the T-60 min mask are also rejected since these tests often flag sub-pixel cloud and cloud edges. Such pixels will often pass the conditions of the seeding process regardless of whether cloud actually remains in the pixel in the current mask. An exception to this last rule is made for sea pixels where the 0.8 µm reflectance exceeds 0.30 in the current mask since this is a good indication that the pixel still contains cloud.
The seeding process is vulnerable to false positives in the prior cloud mask. For this reason snow-contaminated pixels are excluded from the seeding process, as are barren land surface types and coastal pixels. False positives are more common for such pixels than for others.
Sea pixels are flagged as seeds if they satisfy the above conditions and if:
where the conditions involve the 10.8 and 12.0 µm brightness temperatures of the pixel 1 h earlier and two pre-defined thresholds.
Land pixels are flagged as seeds if, in addition to the above conditions, they satisfy:
where the conditions involve the 8.7 µm brightness temperature of the pixel 1 h earlier and two pre-defined thresholds. All seed pixels are then flagged as cloud contaminated in the cloud mask.
Once all pixels have been through the seeding process, the region growing algorithm may be invoked. This relies heavily on information in the 0.6 µm channel. Some combinations of viewing and illumination geometry result in this information becoming less reliable as an indicator of cloud. These are typically cases of enhanced forward or backward scattering and generally occur when the terminator is near the edge of the disc, near Sun glint, and at high viewing angles. Steps are therefore taken to avoid performing the region growing in these circumstances.
The region growing routine is only called if a condition on the satellite zenith angle at the solar specular reflection point is satisfied. This parameter is used as a measure of how close the terminator is to the edge of the disc, as well as an indicator of severe Sun glint.
Groups of adjacent seed pixels which number fewer than nine are discarded from the region growing. For those seed groups which remain, the average reflectance at 0.6 µm and the average 10.8 µm BT of the seed group are calculated. Any seed groups which fail a pre-defined condition on these group averages are discarded from the region growing. This avoids some region growing on seed groups which result from false positives in the prior mask. It also helps to ensure the seed group is clearly discernible in the 0.6 µm channel, which reduces the chance of false positives resulting from the region growing. Pixels adjacent to a seed group are added to the group provided they are not snow-contaminated and the following conditions are satisfied:
Snow-contaminated pixels can lead to false positives due to their higher reflectances and so are excluded. Conditions (49) and (50) restrict the region growing to the twilight region and mitigate false positives which can result from strong forward scattering. Condition (51) avoids false positives being added due to higher observed reflectances at large viewing angles when the terminator approaches the edge of the disc. Conditions (52) and (53) avoid false positives which occur near Sun glint. Condition (54) ensures that only pixels with temperatures similar to, or slightly colder than, the seed group are added in the region growing process. The final condition exploits the visible information available in the twilight region, looking for pixels which have higher reflectances than the average of the seed group. In this last condition, the smallest allowed reflectance for pixels to be added to the seed group is set to 0.45 for barren surfaces and 0.30 over other surface types. The condition also includes an empirically derived multiplier, greater than or equal to 1.0, which increases linearly with the solar zenith angle and is applied only when the terminator approaches the edge of the disc. This multiplier reduces the risk of false positives being introduced to the mask due to strong forward or backward scattering, or noise in the 0.6 µm channel, close to the terminator. When a new pixel is added to the seed group, the pixels adjacent to it are also processed. In this way the region growing algorithm ‘fills in’ missing cloud and accounts for cloud motion between the prior slot and the current one. Finally, the pixels added by the region growing process are flagged as cloud contaminated in the cloud mask.
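The structure of the region growing step can be illustrated with a breadth-first sketch. It assumes 8-connected adjacency and represents the thermal, visible and geometric acceptance conditions by a caller-supplied predicate `accept`, which is a hypothetical stand-in for those conditions. Only the minimum group size of nine is taken from the text.

```python
from collections import deque
from typing import Callable, Iterator, List, Set, Tuple

Pixel = Tuple[int, int]
MIN_GROUP_SIZE = 9  # seed groups with fewer pixels are discarded


def neighbours(p: Pixel, n_rows: int, n_cols: int) -> Iterator[Pixel]:
    """Yield the (up to eight) pixels adjacent to p that lie inside the image."""
    r, c = p
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < n_rows and 0 <= c + dc < n_cols:
                yield (r + dr, c + dc)


def seed_groups(seeds: Set[Pixel], n_rows: int, n_cols: int) -> List[Set[Pixel]]:
    """Split seed pixels into groups of adjacent pixels, dropping small groups."""
    remaining = set(seeds)
    groups = []
    while remaining:
        start = remaining.pop()
        group, queue = {start}, deque([start])
        while queue:
            for q in neighbours(queue.popleft(), n_rows, n_cols):
                if q in remaining:
                    remaining.remove(q)
                    group.add(q)
                    queue.append(q)
        if len(group) >= MIN_GROUP_SIZE:
            groups.append(group)
    return groups


def grow(
    group: Set[Pixel], n_rows: int, n_cols: int, accept: Callable[[Pixel], bool]
) -> Set[Pixel]:
    """Return the pixels added to a seed group by region growing."""
    added: Set[Pixel] = set()
    queue = deque(group)
    while queue:
        for q in neighbours(queue.popleft(), n_rows, n_cols):
            if q not in group and q not in added and accept(q):
                added.add(q)
                queue.append(q)  # neighbours of a newly added pixel are processed too
    return added
```

Because each newly added pixel is placed on the queue, its own neighbours are examined in turn, which is how the growth can follow cloud that has moved since the prior slot.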
3.14. Partial cloud
Pixels that are flagged as cloudy by one or more of the spatial coherence tests (10.8 µm, HRV and 0.8 µm) and are not flagged by any other test are designated partially cloudy. These pixels are excluded from the downstream cloud processing which generates products such as cloud top height.
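The partial cloud rule can be expressed as a subset check on the names of the tests that flagged a pixel. The test names and the function `classify_pixel` are hypothetical labels chosen for this sketch.

```python
from typing import Set

SPATIAL_TESTS = frozenset({"spatial_10.8", "spatial_hrv", "spatial_0.8"})


def classify_pixel(flagged_by: Set[str]) -> str:
    """Classify a pixel from the set of cloud tests that flagged it."""
    if not flagged_by:
        return "clear"
    # Flagged only by spatial coherence tests: designated partially cloudy.
    if flagged_by <= SPATIAL_TESTS:
        return "partially cloudy"
    return "cloudy"
```

A pixel flagged by both a spatial coherence test and, say, the gross test is therefore treated as cloudy and passed to the downstream cloud processing.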
4. Effectiveness of cloud detection tests
To illustrate the relative effectiveness of each test Table III lists the proportion of all cloud flagged by each test over all slots on 16 November 2009. The table also lists the amount of cloud flagged uniquely by each test as a proportion of the total cloud flagged by the test. In the following discussion this quantity will be referred to as the relative unique cloud fraction (relative in the sense that it is measured as a proportion of the cloud flagged by the test rather than all cloud in the mask). It is useful to know what proportion of all cloud in an image is captured by each test. However, some cloud tests are only applied to a subset of all pixels (for example the SST and 8.7 µm tests are only applied to sea pixels), and some tests are designed to detect specific types of cloud (such as the fog/low cloud test). Therefore the relative unique cloud fraction provides a fairer measure of the comparative usefulness of each test.
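The two statistics described above can be computed as in the following sketch, in which each pixel is represented by the set of tests that flagged it. The function name, the dictionary keys and the test names in the example are illustrative assumptions, not values from Table III.

```python
from typing import Dict, List, Set


def cloud_test_statistics(pixels: List[Set[str]]) -> Dict[str, Dict[str, float]]:
    """Per-test share of all cloud and relative unique cloud fraction (both in %)."""
    cloudy = [p for p in pixels if p]  # pixels flagged by at least one test
    total = len(cloudy)
    stats: Dict[str, Dict[str, float]] = {}
    for name in sorted(set().union(*cloudy)):
        flagged = sum(1 for p in cloudy if name in p)
        unique = sum(1 for p in cloudy if p == {name})
        stats[name] = {
            # Proportion of all cloudy pixels flagged by this test.
            "pct_of_total_cloud": 100.0 * flagged / total,
            # Proportion of this test's pixels flagged by no other test.
            "pct_unique_of_flagged": 100.0 * unique / flagged,
        }
    return stats


example = cloud_test_statistics(
    [{"gross"}, {"gross", "thin_cirrus"}, {"thin_cirrus"}, set(), {"gross"}]
)
```

In this example the gross test flags three of the four cloudy pixels (75%), and two of those three are flagged by no other test, giving a relative unique cloud fraction of about 67%.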
Table III. Illustrating the relative effectiveness of cloud tests. Statistics based on all slots from 16 November 2009.
Columns: pixels flagged as % of total cloud; pixels flagged uniquely as % of all pixels flagged by test.
Row labels recovered: 10.8 µm spatial coherence; HRV spatial coherence; 0.8 µm spatial coherence.
In the interests of clarity, Table III does not break the test statistics down by time of day or surface type. However, since more cloud tests are applied to day time pixels than to night time pixels (as a result of the visible data being used during the day), it is generally true that more pixels are detected by tests uniquely at night than in the day time. Similarly, all of the cloud tests are applied to sea pixels whereas only a subset are applied to land pixels, so the chance of a cloudy pixel being detected by only one test is generally lower over sea than land. The relative homogeneity of sea surfaces compared to land also means that cloud test thresholds may be set to allow more cloud to be flagged over sea compared to land, while still avoiding false positives. This also increases the likelihood that more than one test will detect a given cloud contaminated pixel over sea compared to land. It follows that individual tests become more important to the mask at night time and over land.
The gross test, being the most general in principle and being applied to all surface types at all times of day, captures the largest proportion of the cloud (82.55%) and a relatively large fraction of this cloud (10.23%) is flagged uniquely by this test. It flags 85.91% of all night time cloud, and 78.55% of all day time cloud. Over land this difference is likely to result from the NWP model failing to represent land surface temperature extremes well in all cases: at night model surface temperatures may be too warm in some regions which can result in false positives appearing in the cloud mask, while in the daytime the model surface temperatures may sometimes be too cold resulting in cloud being missed from the mask.
The thin cirrus test is useful for capturing semi-transparent ice cloud which may often be missed by other tests. It is most useful in the day time over land (it captures 28.08% of all cloud in this category, 22.15% of which is uniquely flagged by this test). The mixed scenes test detects 25.80% of all night time cloud, but most of this cloud is also captured by other tests (in particular the thin cirrus and gross tests). Over sea, high cloud is often captured by the 8.7 µm and SST tests. The result is that the thin cirrus and mixed scenes tests each flag very little cloud uniquely over ocean.
The fog/low cloud test has the largest relative unique cloud fraction of all tests (16.54%). This is not surprising since low cloud is not well captured by any other test at night time, particularly over land. The twilight low-cloud temporal differencing algorithm complements the fog/low cloud test. Being designed to detect a very specific type of cloud, this only flags 1.78% of all cloudy pixels. The impact of the test on the cloud mask is dependent on synoptic conditions: the test can be very beneficial in cases of extensive low stratus which are often not fully captured by the other cloud tests under twilight conditions. Figure 1 shows an example of this.
The HRV spatial coherence test has a comparatively high unique cloud fraction of 9.48%. This test captures 30.94% of day time cloud over sea with a relative unique cloud fraction of 10.26%. This compares with 13.58% of day time cloud over land, 7.33% of which is uniquely identified by this test. The HRV channel is able to resolve sub-pixel cloud which the other channels cannot detect, and, as with other tests, the test thresholds may be set rather lower over ocean than land due to the greater homogeneity of the background resulting in more cloud being flagged over ocean.
The 0.8 µm spatial coherence test does not generally add much cloud to the mask beyond that flagged by the HRV test where HRV data are available. However, in areas where the HRV data are unavailable the 0.8 µm test is particularly useful over sea in capturing some of the small cumulus missed by other tests.
The 10.8 µm spatial coherence test flags a large proportion of all cloud (60.56%), with 11.35% of this being flagged by this test alone. This test is particularly effective for sea pixels, flagging 77.26% of all cloud over ocean, and 11.92% of this uniquely. These latter pixels mostly contain broken cloud and cloud edges.
The SST test also flags a large proportion of cloud over oceans (68.84%), but only 0.34% of this is flagged by this test alone for the reasons given above. Similarly, the 8.7 µm test also flags only a small amount of cloud uniquely.
The use of the clear-sky reflectance product is beneficial for the visible threshold test, allowing for particularly efficient thresholds over land while admitting few false positives (Sections 5 and 6). The test flags 59.94% of all day time cloud over land, and 9.09% of these pixels are detected by this test alone. Over sea this test flags 49.33% of day time cloud, but very little of this is flagged only by this test.
The thresholds in the visible/NIR ratio test admit less cloud over land surfaces than sea, and as a result the test flags more cloud over sea than land (57.92% compared to 34.30%). In fact the test is not applied over certain land surface types as areas which are sparsely vegetated or which are covered by substantial amounts of surface water can result in false positives. The relative unique cloud fraction for day time land pixels is 5.88% compared to 0.25% for day time sea.
An important caveat for this analysis is that it does not take false positives into account: pixels wrongly flagged by a test will often only be flagged by that one test, and so false positives in the mask will tend to increase the count of uniquely flagged pixels erroneously. The following sections compare the Met Office SEVIRI cloud mask with that of the SAFNWC and with the MODIS cloud mask. These comparisons indicate those tests most likely to introduce false positives.
5. Comparison against the Nowcasting Satellite Applications Facility (SAFNWC) SEVIRI cloud mask
The Nowcasting Satellite Applications Facility (SAFNWC) also generates a SEVIRI cloud mask based on a series of threshold tests (Derrien and Le Gléau, 2005). A number of cloud tests are common to the SAFNWC and Met Office masks. However, a significant way in which the masks differ is in how test thresholds are determined: the SAFNWC mask employs pre-calculated look-up tables derived from large numbers of off-line radiative transfer simulations on a wide range of model profiles. The look-up tables are indexed by geometrical variables such as solar zenith angle and satellite zenith angle, NWP fields such as surface temperature and total column water vapour, and other ancillary data such as elevation and climatological data. This contrasts with the Met Office cloud mask in which thresholds are derived from in-line clear-sky radiative transfer simulations based on recent NWP forecast fields, ancillary datasets, and empirically derived relationships with variables such as solar zenith angle, as detailed in Section 3.
Table IV compares the Met Office and SAFNWC masks for all slots on 16 November 2009. ‘Cloud matches’ refers to the percentage of pixels flagged by each test which were cloudy in the SAFNWC mask. This is a measure of the skill of the test against the SAFNWC mask. ‘Cloud matches as % of all cloud’ gives the proportion of all the SAFNWC cloud which is flagged by the test, and likewise ‘cloud mismatches as % of all clear’ gives the percentage of SAFNWC clear pixels flagged as cloudy by the test. In the following discussion it should be borne in mind that the SAFNWC mask does not represent the truth in terms of cloudy/clear classification. Agreement between the masks does not guarantee that both masks are correct. Likewise, ‘cloud mismatches’ do not always represent false positives in the Met Office mask. The comparison of the two masks serves to highlight those cloud tests which may be introducing false positives into the cloud mask.
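The three statistics defined above can be computed per test as follows. This is a sketch under assumed inputs: two equal-length sequences of booleans, one giving the pixels flagged by a single test and one giving the pixels that are cloudy in the reference mask. The function and key names are hypothetical.

```python
def _pct(numerator, denominator):
    """Percentage, or None when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else None


def compare_with_reference(test_flags, reference_cloudy):
    """Compare one cloud test against a reference mask, pixel by pixel."""
    flagged = matches = mismatches = ref_cloud = ref_clear = 0
    for test_flag, ref_flag in zip(test_flags, reference_cloudy):
        if ref_flag:
            ref_cloud += 1
        else:
            ref_clear += 1
        if test_flag:
            flagged += 1
            if ref_flag:
                matches += 1
            else:
                mismatches += 1
    return {
        # Of the pixels flagged by the test, the share cloudy in the reference.
        "cloud_matches": _pct(matches, flagged),
        # Share of all reference cloud that the test flags.
        "cloud_matches_of_all_cloud": _pct(matches, ref_cloud),
        # Share of reference clear pixels that the test flags as cloudy.
        "cloud_mismatches_of_all_clear": _pct(mismatches, ref_clear),
    }


test_flags = [True, True, True, False, False, True, False, False]
reference = [True, True, False, True, False, True, False, False]
result = compare_with_reference(test_flags, reference)
```

Note that the first statistic is normalised by the number of pixels the test flags, while the other two are normalised by the reference mask, so a test that flags very few pixels can score highly on the first and poorly on the second.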
Table IV. Comparison of the Met Office cloud tests with the Nowcasting Satellite Applications Facility (SAFNWC) cloud mask over all slots on 16 November 2009.
Columns: cloud matches (%); cloud matches as % of all cloud; cloud mismatches as % of all clear.
Row labels recovered: 10.8 µm spatial coherence; 0.8 µm spatial coherence.
‘Cloud matches’ refers to the percentage of pixels flagged by each test which were cloudy in the SAFNWC mask. ‘Cloud matches as % of all cloud’ gives the proportion of all the SAFNWC cloud which is flagged by the test, and likewise ‘cloud mismatches as % of all clear’ gives the percentage of SAFNWC clear pixels flagged as cloudy by the test.
The gross test flags the largest proportion of SAFNWC clear pixels (5.09%). For land pixels 92.79% of the flagged pixels agree with the SAFNWC mask compared to 98.61% over sea. The difference is due to the reliance of the gross test on accurate model surface temperature, and, in general, land surface temperatures are not characterized as well as sea surface temperatures in NWP models. If the model surface temperatures are too warm, the thresholds will be set too high thus increasing the risk of false positives. This situation is not uncommon over parts of Africa at night.
The twilight temporal differencing algorithm exhibits comparatively low skill among the tests, with 92.34% of the pixels flagged by the test being cloudy in the SAFNWC mask. False positives (due to other tests) in the prior cloud mask are often perpetuated by the seeding process. This is the most common source of false positives introduced by the temporal differencing scheme. At high viewing and solar zenith angles, atmospheric dust and aerosol can produce higher reflectances in the 0.6 µm channel and this can cause the region growing process to add cloud-free pixels to the mask. For example, false positives occur with greater frequency around the Middle East region where the atmosphere is frequently laden with sufficient quantities of dust to cause problems for the algorithm.
The comparison suggests that the fog/low cloud test has the lowest skill of all the tests: of all pixels flagged by the test, 85.74% are cloudy in the SAFNWC mask, though there is a large difference between the value for sea pixels—89.18%—and that for land pixels—75.31%. In fact, while false positives do occur, there are occasions in which the fog/low cloud test flags cloud missed by the SAFNWC mask. Figure 2 shows such an example. Over sea, the majority of false positives introduced by this test occur at high viewing angles. This may be a result of the CO2 ‘limb-cooling’ effect in the 3.9 µm channel which is not captured well by the radiative transfer simulations since RTTOV is being used at the limits of its viewing angle specification towards the edge of the Earth disc.
The HRV test also appears to have relatively low skill, but the 2009 version of the SAFNWC mask used in this comparison does not make use of HRV data. On the basis of comparisons with visible imagery, it is believed that the majority of the SAFNWC clear pixels flagged by this test are in fact cloud-contaminated. Figure 3 shows an example. The 2010 release of the SAFNWC cloud mask will exploit HRV spatial coherence information (Derrien et al., 2009).
Many pixels flagged by the 0.8 µm spatial coherence test which are clear in the SAFNWC mask are also in fact cloud contaminated (for example Figure 4). Thus the skill of this test is somewhat higher than indicated in Table IV.
The 10.8 µm spatial coherence test also flags a relatively large proportion (2.56%) of clear SAFNWC pixels. The SAFNWC mask also employs a version of the 10.8 µm test and the discrepancies are mostly located around cloud edges and areas of small cumulus since it is these areas where small differences between the thresholds used by each mask have the greatest impact.
Overall, 95.06% of the pixels flagged by the thin cirrus test are cloudy in the SAFNWC mask. There is extremely good agreement over sea where 99.18% of flagged pixels are cloudy in the SAFNWC mask. It is somewhat worse over land (92.75%), where warm, clear regions with high water vapour content can result in false positives.
The visible threshold test agrees very well with the SAFNWC over both land and sea, with 97.38% and 99.28%, respectively, of flagged pixels matching the SAFNWC mask. Before the CRM data were used, fixed thresholds were applied over land surfaces: it was found that the CRM data allowed a significant amount of extra cloud to be detected with very few additional false positives introduced (Hocking et al., 2009). The good agreement between this test and the SAFNWC mask over land suggests the SAFNWC approach compares very favourably with the use of the clear-sky reflectance data.
There is also good agreement between the visible/NIR ratio test and the SAFNWC mask, with 91.53% and 99.31% of flagged pixels being cloudy in the SAFNWC for land and sea respectively. The lower value for land pixels is due to the greater chance of false positives with this test over land surfaces mentioned earlier.
The mixed scenes and SST tests agree with the SAFNWC mask to a very high degree. As noted above, they add only a small amount of cloud to the mask that is not flagged by any other test: the result of the comparison is not then surprising, since most of the pixels they flag as cloudy are corroborated by other cloud tests.
Overall the Met Office and SAFNWC masks give the same clear/cloudy classification to 92.57% of the pixels tested.
6. Comparison against the MODIS cloud mask
The Moderate Resolution Imaging Spectroradiometer (MODIS) carried onboard the polar orbiting Terra and Aqua satellites has 36 channels in visible, near-infrared and thermal infrared wavelengths. All channels have a spatial resolution of 1 km or finer at the sub-satellite point. The Level 2 MODIS cloud mask product (Ackerman et al., 2006) is generated from a series of threshold tests, some of which have analogues in the Met Office SEVIRI mask. MODIS pixels which are processed successfully by the cloud detection algorithm are flagged as high confidence clear (>99% confidence level for clear), confidently clear (>95%), uncertain (>66%), or cloudy. The MODIS mask does not represent the ‘truth’ in terms of cloud-contamination, but nevertheless is a useful product for the purposes of validating the SEVIRI cloud mask.
The Met Office and SAFNWC SEVIRI cloud masks were compared with the MODIS cloud mask for all MODIS granules which fell within the MSG field-of-view on 16 November 2009. Each MODIS granule is scanned over a 5 min period, while each SEVIRI full-disc scan takes (nominally) 15 min. The three granules scanned during a given 15 min SEVIRI scan interval were matched up with that SEVIRI slot. The maximum temporal difference between the MODIS and SEVIRI data was therefore 15 min.
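The temporal match-up can be sketched by rounding each granule start time down to the start of the 15 min interval that contains it. The sketch assumes that SEVIRI slots are aligned to quarter-hours and that a granule is assigned by its start time; both are assumptions made for illustration, and the function name is hypothetical.

```python
from datetime import datetime


def seviri_slot_start(granule_start):
    """Start of the 15 min interval containing a MODIS granule start time.

    Assumes slots begin at :00, :15, :30 and :45 past the hour.
    """
    slot_minute = granule_start.minute - granule_start.minute % 15
    return granule_start.replace(minute=slot_minute, second=0, microsecond=0)


# Three consecutive 5 min granules fall in the same assumed slot.
granules = [datetime(2009, 11, 16, 10, m) for m in (30, 35, 40)]
slots = {seviri_slot_start(g) for g in granules}
```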
To compare the SEVIRI and MODIS masks, each MODIS pixel was mapped onto the nearest SEVIRI pixel. Each SEVIRI pixel for which at least 10% of the mapped MODIS pixels were flagged cloudy was flagged as cloud-contaminated. Each SEVIRI pixel for which at least 90% of the mapped MODIS pixels were confidently clear (i.e. with greater than 95% confidence) was flagged as clear. SEVIRI pixels for which neither condition was true were excluded from the comparison: this removes SEVIRI pixels for which too many of the mapped MODIS pixels were classified as uncertain. The cut-off for cloudy pixels was set at 10% because the SEVIRI HRV channel has a similar resolution to MODIS at the sub-satellite point (1 km), so both instruments would be expected to resolve clouds on this scale (where HRV data exist), and an HRV pixel is approximately 10% of the size of a normal SEVIRI pixel. The comparison was restricted to SEVIRI pixels with MSG satellite zenith angles of 75° or less.
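The aggregation rule for a single SEVIRI pixel can be sketched as below. The class labels are hypothetical strings chosen for this example. Two interpretations are assumed: that ‘confidently clear’ covers both the >95% and >99% confidence classes, and that the cloudy condition is checked first when a pixel meets both cut-offs exactly.

```python
def classify_seviri_pixel(modis_classes, cloudy_cutoff_pct=10, clear_cutoff_pct=90):
    """Classify one SEVIRI pixel from the MODIS pixels mapped onto it.

    modis_classes : list of labels, each one of "cloudy", "uncertain",
                    "confident_clear" or "high_confidence_clear"
    Returns "cloudy", "clear", or None if the pixel is excluded.
    """
    n = len(modis_classes)
    if n == 0:
        return None
    n_cloudy = sum(c == "cloudy" for c in modis_classes)
    n_clear = sum(c in ("confident_clear", "high_confidence_clear")
                  for c in modis_classes)
    # Integer comparisons avoid floating-point rounding at the cut-offs.
    if 100 * n_cloudy >= cloudy_cutoff_pct * n:
        return "cloudy"
    if 100 * n_clear >= clear_cutoff_pct * n:
        return "clear"
    return None
```

With nine MODIS pixels mapped to one SEVIRI pixel, a single cloudy MODIS pixel (about 11%) is enough to mark the SEVIRI pixel as cloud-contaminated, which reflects the reasoning given for the 10% cut-off.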
The resulting MODIS mask was compared on a pixel-by-pixel basis with the SEVIRI masks. Some differences between the masks can be ascribed to errors in registration between MODIS and SEVIRI pixels which are largely due to parallax effects, and also to the difference in scan times.
Table V shows the results of the comparison for each Met Office cloud test with the MODIS mask. The terms ‘cloud hits’, ‘clear hits’, ‘false positives’ and ‘false negatives’ in Tables V and VI are used for convenience, and are defined with respect to the MODIS cloud mask rather than the ‘truth’. Note that due to Terra and Aqua over-pass times there are very few match-ups covering twilight illumination conditions and so there are insufficient data to compare the output of the twilight temporal differencing algorithm with the MODIS mask. The results are consistent with those in Table IV for most of the tests. One notable exception is the fog/low cloud test, which shows much higher skill against the MODIS mask, supporting the idea that the true skill of the test is higher than indicated by the comparison with the SAFNWC mask. The HRV spatial coherence test also appears more skilful when compared to MODIS, though still only 92.76% of pixels flagged by the test were cloudy in the MODIS mask. This may be partly a result of errors in registration between the MODIS and SEVIRI pixels: much of the cloud flagged by this test is fragmented, which will be more susceptible to registration errors than more extensive areas of contiguous cloud.
Table V. Comparison of the Met Office cloud tests with the Moderate Resolution Imaging Spectroradiometer (MODIS) cloud mask for all MODIS granules falling within a Meteosat Second Generation (MSG) satellite zenith angle of 75° on 16 November 2009.
Columns: cloud hits (%); cloud hits as % of all cloud; false positives as % of all clear.
Row labels recovered: 10.8 µm spatial coherence; HRV spatial coherence; 0.8 µm spatial coherence.
Table VI. Summary of the comparison of the Met Office and the Nowcasting Satellite Applications Facility (SAFNWC) Spinning Enhanced Visible and Infrared Imager (SEVIRI) masks against the MODIS cloud mask.
Columns: cloud hits (%); clear hits (%); false positives (%); false negatives (%).
Cloud and clear hits show the proportion of cloudy/clear SEVIRI pixels which agree with the Moderate Resolution Imaging Spectroradiometer (MODIS) mask. False positives show the proportion of pixels flagged as cloudy which were clear in the MODIS mask, and false negatives show the proportion of pixels flagged as clear which were cloudy in the MODIS mask.
The 10.8 µm spatial coherence test shows lower skill against the MODIS mask. As with the HRV test, many discrepancies between the masks occur with fragmented cloud and around cloud edges due to registration errors. However, the 10.8 µm test does introduce some false positives around cloud edges.
The thin cirrus and mixed scenes tests also show lower skill against the MODIS mask, especially over land. However, the MODIS cloud detection algorithm identifies some semi-transparent cloud as clear which the Met Office and SAFNWC masks flag as cloud. The MODIS mask has an additional flag for thin cirrus cloud detected under day time conditions using its 1.38 µm channel. This flag indicates high cloud which is sufficiently thin that it may be corrected for when retrieving surface properties using visible channels. By considering MODIS pixels with this flag set to be cloudy, the thin cirrus hit rate against MODIS rises to 94.54%. (With this extra thin cirrus flag, the proportion of all pixels for which the Met Office and MODIS masks agree drops by 0.41% and for the SAFNWC mask it drops by 0.36%.) Some of the remaining discrepancies appear to be due to thin cirrus not captured by the MODIS mask at night (Figure 5). Over sea the thin cirrus and mixed scenes tests agree well with MODIS (cloud hit rates of 98.59% and 99.33% respectively).
Table VI shows the overall statistics for the comparison between the Met Office and MODIS masks, and the SAFNWC and MODIS masks. The Met Office mask agrees with the MODIS mask for 90.65% of all pixels tested. The SAFNWC mask agrees with the MODIS mask for 89.69% of pixels tested. The results indicate that the Met Office mask is flagging slightly more cloud correctly (as compared to MODIS) than the SAFNWC mask while introducing a similar proportion of false positives. This demonstrates that overall the Met Office SEVIRI mask is performing as well as or better than the SAFNWC mask when compared to the MODIS cloud mask. The SAFNWC mask sometimes captures semi-transparent cloud more effectively than the Met Office cloud mask (as seen in Figure 5), while the Met Office mask can be more efficient at detecting low cloud at night (Figure 2, for example).
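The summary statistics of Table VI, together with the overall agreement quoted above, can be derived from a two-by-two contingency count. The sketch below follows the definitions given in the table footnote, with both masks supplied as equal-length sequences of booleans; the function and key names are assumptions made for this example.

```python
def mask_summary(mask_cloudy, reference_cloudy):
    """Summarise a cloud mask against a reference mask, pixel by pixel."""
    both_cloudy = mask_only = ref_only = both_clear = 0
    for m, r in zip(mask_cloudy, reference_cloudy):
        if m and r:
            both_cloudy += 1
        elif m:
            mask_only += 1   # cloudy in the mask, clear in the reference
        elif r:
            ref_only += 1    # clear in the mask, cloudy in the reference
        else:
            both_clear += 1
    n_cloudy = both_cloudy + mask_only
    n_clear = both_clear + ref_only
    total = n_cloudy + n_clear

    def pct(a, b):
        return 100.0 * a / b if b else None

    return {
        "cloud_hits": pct(both_cloudy, n_cloudy),
        "false_positives": pct(mask_only, n_cloudy),
        "clear_hits": pct(both_clear, n_clear),
        "false_negatives": pct(ref_only, n_clear),
        "agreement": pct(both_cloudy + both_clear, total),
    }


seviri = [True] * 4 + [False] * 6
modis = [True, True, True, False, False, False, False, False, True, True]
summary = mask_summary(seviri, modis)
```

Under these definitions cloud hits and false positives sum to 100%, as do clear hits and false negatives, because each pair is normalised by the same count of SEVIRI pixels.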
An analysis of the relative effectiveness of the tests has been presented which indicates those tests that are most valuable to the mask. The results show that the gross test flags the most cloud overall and a substantial proportion of this is not captured by any other test over both land and sea. The 10.8 µm spatial coherence and sea surface temperature (SST) tests also flag a large proportion of cloud over sea, though while the 10.8 µm test is useful for detecting broken cloud and cloud edges (particularly at night), the SST test flags only a small amount of cloud uniquely. The High Resolution Visible (HRV) spatial coherence test captures small-scale cloud missed by other tests (again especially useful over sea), and where HRV data are unavailable the 0.8 µm spatial coherence test captures some of this cloud. Most cloud tests applied over land detect significant amounts of cloud which no other test captures due to the greater difficulties in cloud detection over relatively heterogeneous land surfaces (as compared to sea). This is especially true of the gross and thin cirrus tests, and the visible threshold and visible/near-infrared ratio tests. The visible tests also flag a large amount of cloud over sea, but much of this is also captured by other tests. The fog/low cloud test and twilight low-cloud temporal differencing algorithm are both valuable for capturing low cloud under night time and twilight conditions. The 8.7 µm and mixed scenes tests add relatively little cloud to the mask beyond the other tests, but are retained in the interests of detecting as much cloud as possible and function as a ‘back-up’ in cases where other channel data are missing from an image.
Comparisons with the Nowcasting Satellite Applications Facility (SAFNWC) Spinning Enhanced Visible and Infrared Imager (SEVIRI) cloud mask and the MODIS cloud mask have been used to validate the cloud mask and to indicate which tests are most likely to introduce false positives. In general, false positives are more common over land surfaces. The gross test sometimes flags too much cloud over land, especially at night, as a result of errors in model surface temperatures. The thin cirrus test is vulnerable to false positives over land in regions of high water vapour content. The visible/near-IR ratio test also introduces some false positives over land surfaces: this is more common at high solar zenith angles. The 10.8 µm spatial coherence test can flag too many pixels around cloud edges. The fog/low cloud test introduces some false positives, but not as many as the comparison against the SAFNWC mask would suggest. Finally, the twilight temporal differencing algorithm is vulnerable to false positives from other tests in the prior cloud mask. Despite these differences, the Met Office and SAFNWC masks show good agreement, especially over ocean. Compared against the MODIS cloud mask for 1 day, the Met Office and SAFNWC masks agree with MODIS for 90.65% and 89.69% of pixels respectively.
Data from Meteosat Second Generation (MSG) are used by the Met Office for the production of imagery products, nowcasting and assimilation in numerical weather prediction (NWP). Processing of the raw data received via EUMETCast is undertaken to produce a range of quantitative products for the nowcasting system. A fundamental pre-processing step is the identification of cloud contaminated pixels. This paper has described the threshold tests applied to MSG imagery in real-time to distinguish cloudy, partially cloudy, and clear pixels. The tests are applied to each pixel individually. If at least one test returns a positive result, the pixel in question is flagged as cloud-contaminated. If all tests return a negative result, the pixel is flagged as clear.
An important aspect of the Met Office cloud detection is the use of simulated clear-sky brightness temperatures (BTs) based on recent NWP forecast fields. By using the difference between observed and simulated BTs in the cloud tests, atmospheric effects are accounted for (notably CO2 absorption in the 3.9 µm channel), enabling relatively tight thresholds to be used. The benefit of this approach is seen where these tests detect cloud missed by alternative cloud detection schemes. This is particularly true of the fog/low cloud test, which captures low cloud missed by the SAFNWC mask as noted above. Furthermore, improvements to NWP models over time are expected to have a positive impact on those tests which make use of the simulated data.
Developments to existing cloud tests are planned for the future. For example, EUMETSAT internally produce clear-sky reflectance data valid at 2-hourly intervals throughout the day. When this product is disseminated routinely, it will be used to improve the visible threshold test by better capturing the diurnal variations in bi-directional surface reflectance, particularly in arid regions which often exhibit greater anisotropy.
Validation against the SAFNWC SEVIRI cloud mask and the MODIS cloud mask together with manual comparisons against MSG imagery provide assurance that the quality of the mask is sufficient for operational use. Further comparison/validation exercises will be conducted in the future, for example, with the next version of the SAFNWC cloud mask.
Finally, new techniques such as those developed by Merchant et al. (2005) will be investigated to see if further improvements can be made.
The authors thank Jean-Pierre Olry at Météo-France for providing the SAFNWC cloud mask data. The SAFNWC mask images shown are copyright SATMOS (INSU/CNRS-METEO-FRANCE). The MODIS data were obtained from the Level 1 and Atmosphere Archive and Distribution System (LAADS).