The Low Frequency Array (LOFAR) is a synthesis radio telescope covering the 10–240 MHz range. LOFAR is the first operational pathfinder to the Square Kilometre Array (SKA), a future radio telescope envisaged to be at least an order of magnitude more sensitive than current instruments. LOFAR exploits the same hierarchical beamforming structure as envisaged for the SKA phased array systems. In this paper, we describe the system requirements imposed by calibratability, i.e., the ability to perform a proper self-calibration of the instrument, for the low frequency regime. We derive requirements on station size, aperture efficiency and side lobe level. We also discuss the impact of the polarimetric response of the stations. We discuss the LOFAR design choices made to satisfy these requirements and indicate their implications for SKA phased array systems. This demonstrates that calibratability imposes requirements complementary to those based on imaging requirements and that self-calibratability has a significant impact on configuration design considerations.
 On 12 June 2010, the Low Frequency Array (LOFAR) was officially inaugurated. LOFAR is an aperture synthesis radio telescope covering the 10–240 MHz frequency range [de Vos et al., 2009]. In an aperture synthesis radio telescope like LOFAR, signals from individual stations (subarrays with their own beamformer) are correlated. The correlation values, commonly referred to as visibilities in the radio astronomical community, can be transformed into a synthesis image. The station beam shape is used to track a specific area of the sky and determines the angular size, or field-of-view (FoV), of the image while the largest distance between two stations determines the image resolution.
 LOFAR is the first operational pathfinder to the Square Kilometre Array (SKA) using stations based on aperture array (AA) technology. In the context of the SKA, the term aperture array is commonly used to denote phased array antenna systems, in which single antenna elements or compound elements are used to sample the telescope aperture directly. The SKA is a future radio telescope envisaged to be an order of magnitude more sensitive than current instruments [Dewdney et al., 2009]. The SKA will consist of a number of synthesis arrays, each exploiting a different receiver technology to support observations in a different part of the radio spectrum. Two of these synthesis arrays will likely exploit aperture array technology. LOFAR provides an excellent starting point for the development of one of these synthesis arrays, the AA-low system envisaged to cover the 70–450 MHz range.
 In this paper, we start with a short overview of the LOFAR system to define the context for calibratability of a synthesis array system that uses beamforming at station level as well as at compound element level. The AA-low system is likely to have a very similar top level system design. We describe the current status of calibration at each level of the beamforming hierarchy and its limitations in section 3. We will see that proper self-calibration of a synthesis array requires estimation of the complex valued gain of each station in a number of directions. This implies that each station should have enough sensitivity to observe the relevant number of sources within the timescales dictated by ionospheric variations. We will also see that the number of directions for which we can solve for a complex valued gain per station is limited to 5–10 sources. The station beam should thus be matched to the ionospheric patch size to ensure that 3–5 directions are sufficient to interpolate the direction dependent gain values using an appropriate model for the station beam pattern and the ionosphere. Low station side lobes are required to reduce the number of source signals detected outside the main beam area. This ensures that the 3–5 sources in the main beam and a few strong sources in the side lobes can all be properly treated. This leads to requirements on station size, aperture efficiency and side lobe level as discussed in section 4. We also discuss the LOFAR design choices made to satisfy these requirements and indicate their implications for the AA-low system. This demonstrates that calibratability imposes requirements complementary to those based on imaging requirements and that calibratability has a significant impact on configuration design considerations. 
Although much progress has been made in these areas, a calibration and imaging working group has been formed to continue the investigation of these issues within the context of the aperture array verification program (AAVP). This is briefly introduced in an outlook toward the future in section 5, before we conclude this paper.
2. A Brief Description of LOFAR
 LOFAR is a multipurpose radio telescope [de Vos et al., 2009] that uses phased array technology at various levels. It exploits two types of antennas, the low band antenna (LBA) and the high band antenna (HBA), to cover the 10–90 MHz and 110–240 MHz ranges, respectively. The HBAs are grouped in tiles, a compound element consisting of a 1.25-m spaced regular 4 × 4 array of HBAs, whose signals are combined in an analog beamformer. These receive elements (LBAs and HBA tiles) are connected to the station back end in which the signals are digitized and beamformed to form a station beam. The station beamformer can handle a signal bandwidth of 48 MHz, which can either be pointed in one direction or distributed over multiple directions to increase the instantaneous field-of-view (FoV). The resulting station beam signals are sent over an optical fibre network to the central processing facility where the signals are processed further.
 LOFAR exploits three types of stations: core stations, remote stations and European stations. Currently (June 2011), 21 of the planned 24 core stations are operational. These stations are located in the central area with 3 km diameter near the village of Exloo in the Netherlands. Each core station has 96 LBAs, of which 48 can be used at the same time in either a compact configuration with 32 m diameter or one of four larger configurations with 82 m diameter. The 48 HBA tiles at the core stations are arranged in two 24-tile subarrays with 31 m diameter, whose signals can be beamformed separately, thus effectively doubling the number of stations in the core area.
 The remote stations are located in the Netherlands and provide baselines up to ∼80 km. In June 2011, 7 of the planned 16 remote stations were operational. Each of them consists of 96 LBAs, which can be configured in the same way as the LBAs in the core stations, and a single 48-tile HBA array with 41 m diameter. The long baselines (up to ∼1200 km) are provided by the international stations, which are spread throughout Europe. These stations consist of a 65-m 96-element LBA station and a 57-m 96-tile HBA station and have a back end with twice as many input channels as the Dutch stations, thereby allowing beamforming of either all 96 LBAs or all 96 HBA tiles. This provides the additional sensitivity needed to calibrate the long baselines. In June 2011, 8 of the 11 planned European stations were operational.
 The station back end provides not only a digital beamformer, but also a station correlator to correlate the signals from the LBAs or HBA tiles. These data are used for station calibration and can be used to determine the sensitivity of the station after calibration [Wijnholds and Van Cappellen, 2011]. Those measurements indicate that the Dutch LBA and HBA stations have a peak sensitivity of 2.7 · 10⁻³ m²/K per dipole at 57 MHz and 3.7 · 10⁻² m²/K per tile at 162 MHz, respectively, giving a station peak sensitivity of 0.13 m²/K at 57 MHz and 1.8 m²/K at 162 MHz.
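The station figures quoted above follow from the per-element values by simple multiplication. The short check below assumes 48 active dipoles or tiles per Dutch station, as described earlier, and ideal combination of the element signals; the function name is illustrative only.

```python
def station_sensitivity(per_element_m2_per_k, n_elements):
    """Peak station sensitivity A/T in m^2/K, assuming elements add ideally."""
    return per_element_m2_per_k * n_elements

# Per-element values taken from the text; 48 elements per Dutch station.
lba_station = station_sensitivity(2.7e-3, 48)   # at 57 MHz
hba_station = station_sensitivity(3.7e-2, 48)   # at 162 MHz
print(f"LBA station: {lba_station:.2f} m^2/K, HBA station: {hba_station:.1f} m^2/K")
```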
 A crucial aspect of the system design of LOFAR, and probably the SKA aperture arrays, is the hierarchical beamforming scheme. At the bottom of this hierarchy, we have the element beam pattern of a single antenna, at the top, we have the synthesis array beam. The analog tile beamformer combines the signals of 16 HBAs. The digital station beamformer combines the signals from all HBA tiles or all LBAs within a station to form a station beam, whose output can be combined with the beamformed signals from other stations to form the array beam of the synthesis array. The sensitivity of the beam increases with each level while the FoV becomes smaller. The array beam formed by combining the station signals determines the instantaneous resolution of the synthesis telescope, while the station beam determines its instantaneous FoV, which is ∼65 square degrees at 60 MHz for the compact LBA array, ∼40 square degrees at 30 MHz for the large LBA array and ∼6.3 square degrees at 150 MHz for a 48-tile HBA station.
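The FoV values quoted above can be roughly reproduced from the station diameter and observing frequency. The sketch below is an approximation under stated assumptions: it takes the half power beam width as λ/D and the FoV as the area of a circle with that angular diameter. The exact beam definition behind the quoted numbers is not given in the text, so agreement within a few percent is all that should be expected.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light

def fov_sq_deg(diameter_m, freq_mhz):
    """Approximate FoV: area of a circular beam with HPBW = lambda / D."""
    wavelength_m = C_M_PER_S / (freq_mhz * 1e6)
    hpbw_deg = math.degrees(wavelength_m / diameter_m)
    return math.pi / 4.0 * hpbw_deg ** 2

cases = {
    "compact LBA array, 32 m at 60 MHz": (32.0, 60.0),
    "large LBA array, 82 m at 30 MHz": (82.0, 30.0),
    "48-tile HBA station, 41 m at 150 MHz": (41.0, 150.0),
}
for label, (diameter_m, freq_mhz) in cases.items():
    print(f"{label}: {fov_sq_deg(diameter_m, freq_mhz):.1f} square degrees")
```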
 A crucial consequence of beamforming is that the station beam shape changes with elevation due to projection of the station aperture on the sky. Furthermore, the station beam rotates with respect to the observed field, which also moves through the polarized element beam patterns of the antenna elements. We need corrections for each of these three effects if we use Fourier inversion to make a synthesis image using the correlations between the station signals.
3. Calibration Strategies
 All beams in the beamforming hierarchy need to be steered and accurately known to allow high dynamic range and high fidelity imaging over the large FoV provided by the station beams. Beamforming requires calibration of the received signals at each level of this hierarchy. Due to differences in available baseline lengths, sensitivity and FoV, each level requires a different calibration strategy. This implies that each level should be designed in such a way that it can indeed be calibrated. In section 4, we show that this calibratability requirement has a profound impact on the configuration of the elements within a station array and the stations within the synthesis array. Wijnholds et al. provide an overview of the distinct calibration challenges at each level of the beamforming hierarchy, potential solutions and a number of open issues. In the next sections, we will briefly touch upon each level in the beamforming hierarchy to show where LOFAR stands now and which conditions should be met to ensure calibratability of the system.
3.1. Calibration of the HBA Tiles
 The analog tile beamformer is implemented as a 5-bit beamformer with 0.47-ns true time delays providing a maximum delay of 14.6 ns. This design gives a residual phase error with a sawtooth pattern and larger errors at elevations below 30 degrees for the corner elements of a tile. The production tolerances of the delay lines are ±0.1 ns. Since we cannot fine tune the available settings, the HBA tiles are steered using a table with available delay settings that is based on the design of the system. There are two methods to verify the operation of the tiles in more detail:
 1. We can turn on just one randomly selected element in each tile, thus making an array of 48 (96 for the European stations) HBAs that can be calibrated using the same multisource calibration algorithm used for the LBA array as described by Wijnholds. This scheme has already been used to produce some all-sky images with a single HBA station. By repeating this calibration for different delay settings and different elements, all signal paths in the HBA tiles can be individually characterized.
 2. We can also use the holographic measurement technique developed by Hampson and Smolders. In their scheme, different beams (superpositions of antenna signals) are formed while the voltage response toward a single calibration source is measured. If a sufficient number of superpositions is formed, the response of the individual antennas can be reconstructed.
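To illustrate the delay quantization of the analog tile beamformer described at the start of this section, the sketch below computes the geometric delays for a 4 × 4 tile with 1.25-m spacing and rounds them to 5-bit settings with 0.47-ns steps. It is a simplified geometric model, not the actual LOFAR delay table: the coordinate convention, the choice to reference all delays to the element that needs the least delay, and all function and variable names are assumptions made for illustration.

```python
import math

C_M_PER_S = 299_792_458.0
STEP_NS = 0.47          # delay step from the text
MAX_CODE = 2 ** 5 - 1   # 5-bit setting: codes 0..31, i.e. up to about 14.6 ns
SPACING_M = 1.25        # element spacing within a tile

def tile_delay_codes(az_deg, el_deg):
    """Return (required_ns, codes, error_ns) per element for one pointing."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    # Horizontal components of the unit vector toward the source (x east, y north).
    lx = math.cos(el) * math.sin(az)
    ly = math.cos(el) * math.cos(az)
    offsets = [(i - 1.5) * SPACING_M for i in range(4)]
    # Projection of each element position on the source direction, in ns of light travel.
    proj = [(x * lx + y * ly) / C_M_PER_S * 1e9 for y in offsets for x in offsets]
    # Elements closer to the source see the wavefront first and need more delay.
    required = [p - min(proj) for p in proj]
    # Round to the nearest available setting and clip to the 5-bit range.
    codes = [min(MAX_CODE, round(r / STEP_NS)) for r in required]
    errors = [c * STEP_NS - r for c, r in zip(codes, required)]
    return required, codes, errors

for el_deg in (60.0, 45.0, 20.0):
    required, codes, errors = tile_delay_codes(45.0, el_deg)
    worst = max(abs(e) for e in errors)
    print(f"el={el_deg:4.1f} deg: max required {max(required):5.2f} ns, "
          f"max code {max(codes):2d}, worst error {worst:4.2f} ns")
```

In this simplified geometry the residual error stays within half a delay step as long as the largest required delay fits in the 5-bit range. For pointings along the tile diagonal at low elevation the corner elements need more than the maximum delay, so their error grows beyond the quantization level.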
3.2. Station Calibration
 For station calibration, the station correlator provides all correlations between the individual antennas in LBA mode and between all tiles in HBA mode, but only over a narrow subband. The full bandwidth is obtained by selecting different subbands in consecutive short term integrations. The main goal of station calibration is to determine the electronic gain differences between the signal paths in the station, such that these differences may be compensated in the station beamformer to provide a nominal station beam. Accurate station calibration also enables effective nulling of RFI and high resolution direction-of-arrival estimation techniques. Early prototype systems proved to be invaluable to discover the challenges imposed by an actual system [Wijnholds, 2010]. Based on the experience gained from the prototype stations, a station calibration pipeline was developed with a detector for unsuitable data before calibration and a detector for erroneous results after calibration.
 The calibration algorithm solves for (1) a direction independent complex valued gain for each signal path; (2) a direction dependent gain toward each calibration source common to all antennas; and (3) a noise covariance matrix that includes additive terms for all baselines shorter than four wavelengths and for baselines with visibilities affected by crosstalk.
 The algorithm was developed using a model based approach in which the received signals were described by a single matrix equation. This allowed a mathematical description of the estimation problem, which was used to find closed form solutions for the aforementioned subsets of parameters. Wijnholds and van der Veen [2009a, 2009b] have shown that a weighted alternating least squares algorithm, in which these closed form solutions are used while iterating over the parameter subsets, yields a solution that is not only statistically optimal but also computationally efficient. This algorithm uses spatial filtering to suppress the signal from the Galactic plane, using the point source model for the calibration sources only for intrastation baselines longer than four wavelengths. This implies that even the central elements of the station array need to form a baseline larger than four wavelengths to any of the other elements.
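For orientation, a covariance data model of the kind commonly used for such multisource array calibration problems can be written as follows. The notation is an assumption introduced here for illustration and is not taken from this text: R is the array covariance matrix measured by the station correlator for P receiving elements, G holds the direction independent complex gains, A contains the geometric array response toward the Q calibration sources, Σ_s is a diagonal matrix of apparent source powers (absorbing the direction dependent gain common to all elements), and Σ_n is the noise covariance matrix, whose off-diagonal entries are nonzero only for the short or crosstalk-affected baselines.

```latex
\mathbf{R} = \mathbf{G}\,\mathbf{A}\,\boldsymbol{\Sigma}_{s}\,\mathbf{A}^{H}\,\mathbf{G}^{H} + \boldsymbol{\Sigma}_{n},
\qquad
\mathbf{G} = \mathrm{diag}(g_{1}, \ldots, g_{P}),
\qquad
\boldsymbol{\Sigma}_{s} = \mathrm{diag}(\sigma_{1}, \ldots, \sigma_{Q})
```

In this form, the three parameter subsets listed above correspond to G, Σ_s and Σ_n, respectively.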
 The gain solutions suggest an amplitude correction that is constant over frequency in a given mode and a linear phase correction representing a specific time delay for each signal path. This model is fitted to the gain solutions and allows interpolation and extrapolation to subbands for which no calibration data were available, e.g., due to RFI. Comparison between the fitted model and the data and repeated calibration indicates that a fixed calibration table gives an RMS phase error of 1.5° and an RMS amplitude error of 1%. Assuming that the errors are independent, this results in a phase error of 0.2° and an amplitude error of 0.14% in the main beam voltage response of a 48-element station. These errors include systematic errors due to coupling and crosstalk. They give a degradation of the beam pattern that is negligible compared to other effects in the data. We therefore conclude that a fixed calibration table already provides a sufficiently accurate station beam pattern.
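The step from per-element errors to main beam errors uses the usual 1/√N averaging of independent errors. A minimal check, with an illustrative function name:

```python
import math

def main_beam_error(per_element_rms, n_elements):
    """RMS error of the summed voltage response for independent element errors."""
    return per_element_rms / math.sqrt(n_elements)

phase_error_deg = main_beam_error(1.5, 48)   # from an RMS phase error of 1.5 degrees
amp_error_pct = main_beam_error(1.0, 48)     # from an RMS amplitude error of 1 percent
print(f"phase: {phase_error_deg:.2f} deg, amplitude: {amp_error_pct:.2f} %")
```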
 An obvious extension to the current station calibration would be to include the directional response of the tiles. Mitchell et al. have developed a calibration strategy for the Murchison Widefield Array (MWA) that takes into account the direction dependent response of the individual compound elements. This may provide a good starting point for the development of a similar scheme for the LOFAR stations or the SKA AA-low system. MWA is currently being deployed in the Western Australian Desert and will ultimately consist of 512 compound antennas comparable with the LOFAR HBA tiles [Lonsdale et al., 2009]. However, these compound antennas are placed in a random sparse array configuration, which limits mutual coupling effects to the 16 elements within a compound antenna and makes them equal for all tiles. Another route to be explored is to include a parameterized mutual coupling model in the formulation of the calibration problem. Wijnholds has shown that the effect of mutual coupling can be described by a linear transformation acting on the received signals. In a regular array, the matrix describing this transformation must have a certain structure based on symmetry arguments. Lanne has shown that such a structure can be exploited to take mutual coupling into account in the array calibration.
3.3. Calibration of the Synthesis Array
 At synthesis array level, it is no longer possible to assume that the direction dependent effects are the same for all stations. This makes calibration at this level a challenging task. The approach taken by the LOFAR project is described by Noordam and analyzed in more detail by van der Tol et al. They showed mathematically that a calibration problem with a complex valued gain correction per station per direction is intractable in general, because the parameters are no longer identifiable. Based on simulations by van der Tol et al., we can conclude that only 5–10 sources can be handled properly in 3–4 iterations over the data to iteratively refine the source model. This calls for a parameterization of the calibration problem that exploits continuity of the calibration parameters over time, frequency and position in the FoV. Wijnholds et al. provide an overview of routes that are being considered.
 Currently (June 2011), the LOFAR calibration at array level only solves for a single direction independent complex valued gain per station and uses successive estimation and subtraction or “peeling” to remove the strong sources in the FoV [Noordam, 2002]. In the “peeling” approach, spatial filtering techniques are used to estimate the response toward the strongest source, which is then subtracted. This procedure is then repeated for the next strongest source, and so on. At the end, we have collected information on the response toward the strongest sources, which can be used to fit, e.g., an ionospheric model.
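The successive estimate-and-subtract loop can be illustrated with a toy example. The sketch below is not the LOFAR implementation: it assumes a one-dimensional array, noise-free visibilities, point sources at known positions, and it estimates only an apparent flux per source with a simple matched filter, whereas actual peeling solves for station gains toward each source. All names and numbers are hypothetical.

```python
import cmath
import math

def visibilities(sources, baselines):
    """Model visibilities for point sources given as (flux, direction cosine l)."""
    return [sum(s * cmath.exp(-2j * math.pi * u * l) for s, l in sources)
            for u in baselines]

def peel(vis, baselines, directions):
    """Estimate and subtract sources one by one, in the order given."""
    residual = list(vis)
    estimates = []
    for l in directions:
        # Phase-rotate toward the source and average: a simple matched filter.
        steer = [cmath.exp(2j * math.pi * u * l) for u in baselines]
        flux = sum(v * w for v, w in zip(residual, steer)).real / len(baselines)
        estimates.append(flux)
        # Subtract the estimated contribution before moving to the next source.
        residual = [v - flux * w.conjugate() for v, w in zip(residual, steer)]
    return estimates, residual

baselines = [float(u) for u in range(1, 201)]              # in wavelengths
true_sources = [(10.0, 0.0), (3.0, 0.13), (1.0, -0.21)]    # strongest first
vis = visibilities(true_sources, baselines)
estimates, residual = peel(vis, baselines, [l for _, l in true_sources])
print([round(f, 2) for f in estimates])
```

Each estimate is slightly biased by the side lobe response toward the sources that have not yet been removed, which is why the strongest source is treated first.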
 Although the latter step has not been taken yet, the LOFAR project is in a good position to do so based on past research by project members. Station based direction dependent gains were introduced in the calibration model to handle variations in the station beam patterns and the ionospheric propagation conditions. The latter have received considerable attention. Intema  has proposed to fit a phase screen model to the puncture points, which are the points at which the lines-of-sight from the stations to the peeled sources intersect the phase screen. van der Tol  has shown that the coefficients of the base functions describing the phase screen can be estimated in a robust way using a maximum a posteriori estimator. He also demonstrated that a set of data dependent base functions can be found using the Karhunen-Loève transform.
 Recently, Smirnov [2011b] has demonstrated that differential direction dependent gain amplitudes can be extracted for a number of calibration sources within the FoV in data from an actual WSRT observation on 3C147. This is an important step in view of the puncture points needed to constrain the instantaneous beam shape. However, since most astronomical sources are unpolarized, these puncture points may be insensitive to a unitary Jones matrix ambiguity that has been identified by Hamaker . The solutions may give perfect reconstruction for each puncture point, but may lead to errors in the interpolation required for the reconstruction of sources between the puncture points.
 This discussion about the identifiability of the polarimetric response in the puncture points raises the question whether puncture points are the appropriate route to find the ionospheric phase screen. It is interesting to note that Mitchell et al. have developed a scheme based on source position shifts for the MWA and that the station calibration developed by Wijnholds solves for the apparent source power and, if required, apparent source positions to handle the large scale effects of the ionosphere. A Cramér-Rao bound analysis shows that source positions and source powers are identifiable while the antenna or station phase and polarimetric response along a given line of sight are not. Proper calibration and imaging will therefore require us to choose our parameterization carefully.
4. Calibratability Requirements
4.1. Impact of the Ionosphere
 The ionosphere can have a significant impact on the propagation of radio waves at LOFAR's operating frequencies. Currently used ionospheric models either describe it as a turbulent phase screen or in terms of traveling ionospheric disturbances (TID). If we use the Kolmogorov turbulence model, we can define a coherence size, the diameter of a circular area over which the RMS phase variance is at most one radian squared, which scales with wavelength λ as λ^(−6/5) [Thompson and Bregman, 2006]. An important characteristic of the Kolmogorov model [see van der Tol, 2009, and references therein] is that the maximum phase change over the length scale is 2π radian, but if the circular diameter is increased to ∼3 times the length scale and a best fit phase plane is subtracted, the phase variance over this area will be 1 radian squared. The result is that an object appears shifted by the tilt and blurred by the residual phase errors even in an exposure shorter than the coherence time.
 The TIDs are caused by acoustic density waves in the lower ionosphere. The associated fluctuations in electron density produce a wave pattern in the phase screen whose amplitude scales linearly with wavelength. For the Netherlands, a typical period of ∼10 minutes and an assumed propagation speed of 0.15 km/s at a height of 200 km [van Velthoven, 1990] give a typical wavelength of 90 km, which corresponds to an extent of 24° on the sky. A single sine like wave pattern can be subdivided in six intervals ([−π/6, π/6], [π/6, 3π/6], [3π/6, 5π/6], etc.) of 1/6th of a wavelength (or 4° on the sky) that can be approximated by either a linear or a parabolic phase screen. Although these are only low order polynomials, this implies that direction dependent self-calibration for wide field synthesis imaging is needed to reach the thermal noise floor.
 The impact of the ionosphere on low frequency observations has been studied using observations in the 138–157 MHz band with the WSRT by Bernardi et al.  and data from the VLA Low-frequency Sky Survey (VLSS) at 74 MHz [Cohen et al., 2007] while much better data will become available from observations with LOFAR covering baselines from 100 m to 1200 km. Based on the available data, we can identify the TIDs as a recognized physical phenomenon with a characteristic size and period that not only matches the characteristic phenomenological description of Kolmogorov turbulence, but could form the basis for physics based simple phase screen modeling. For the stations, we then need a size that gives a station beam width of 4° irrespective of frequency, which is in contrast with the characteristic coherence size according to the Kolmogorov model that predicts scaling almost proportional with frequency.
4.2. Station Size
 From the previous section, we can conclude that the station beam size should be about 4° to ensure that the ionospheric phase screen over the FoV can be described by a low order polynomial. The main beam size can be measured as the distance between the first nulls, which is about 2.4λ/D, where D denotes the diameter of the station, or as the half power beam width, which is about λ/D for a completely filled aperture with uniform taper. The latter criterion is a less stringent criterion than the first thereby accepting potentially large variations in the ionospheric phase screen in the main beam area between the half power contour and the first null. The size of the beam scales proportional to λ in either case, which means that the lowest frequency to be observed defines the station size. If we take the half power beam width as a measure of main beam size, the 65-m European LBA stations are matched to the ionospheric patch size down to 66 MHz and the 82-m Dutch LBA stations down to 52 MHz. This shows that observations in the lower part of the LBA band should be done under good ionospheric conditions to be successful. The same argument shows that the beam width of the 31-m 24-tile HBA stations in the core is matched to the ionospheric coherence size down to 138 MHz, while the 48-tile and 96-tile stations are sufficiently large for their beams to be matched over the full range of operating frequencies.
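The matching frequencies quoted above follow from setting the half power beam width λ/D equal to the 4° patch size. The sketch below reproduces them under that assumption; the function name and default argument are illustrative.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light

def lowest_matched_freq_mhz(diameter_m, patch_deg=4.0):
    """Lowest frequency at which HPBW = lambda / D does not exceed patch_deg."""
    longest_wavelength_m = diameter_m * math.radians(patch_deg)
    return C_M_PER_S / longest_wavelength_m / 1e6

stations = [
    ("European LBA station", 65.0),
    ("Dutch LBA station (large configuration)", 82.0),
    ("24-tile HBA core subarray", 31.0),
    ("48-tile HBA station", 41.0),
    ("96-tile HBA station", 57.0),
]
for label, diameter_m in stations:
    print(f"{label}, {diameter_m:.0f} m: matched down to "
          f"{lowest_matched_freq_mhz(diameter_m):.0f} MHz")
```

For the 48-tile and 96-tile stations the resulting frequency lies below the 110 MHz lower edge of the HBA band, consistent with the statement that these stations are matched over their full operating range.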
 The station size also affects the calibratability of the receiving elements within the station. The model based station calibration currently used to produce the station calibration tables for LOFAR applies a four wavelength baseline restriction to the data to filter signals on large spatial scales like the Galactic plane [Wijnholds, 2010]. Such a baseline restriction enforces a station diameter of at least eight wavelengths to allow calibration of the receiving elements in the center of the array. Since the Dutch stations allow the operators to select a compact LBA array consisting of 46 LBAs in a 32-m diameter area in the center of the array, two LBAs have been placed at the edge of the station field to allow calibration of this compact array. This station calibratability argument shows that the 82-m Dutch stations can be calibrated down to 30 MHz, while the 65-m European stations can be calibrated down to 40 MHz. Fortunately, we can extrapolate the calibration results to the lowest observing frequencies.
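The eight wavelength criterion translates directly into a lowest calibratable frequency per station diameter. A minimal sketch, assuming the central element must reach an element at the station edge over a baseline of at least four wavelengths (function and parameter names are illustrative):

```python
C_M_PER_S = 299_792_458.0  # speed of light

def lowest_calibratable_freq_mhz(diameter_m, min_baseline_wavelengths=4.0):
    """Lowest frequency at which the station radius spans the minimum baseline."""
    # Radius >= 4 wavelengths is equivalent to diameter >= 8 wavelengths.
    longest_wavelength_m = diameter_m / (2.0 * min_baseline_wavelengths)
    return C_M_PER_S / longest_wavelength_m / 1e6

for diameter_m in (82.0, 65.0):
    print(f"{diameter_m:.0f} m station: "
          f"{lowest_calibratable_freq_mhz(diameter_m):.1f} MHz")
```

This gives about 29 MHz for the 82-m stations and about 37 MHz for the 65-m stations, in line with the rounded values of 30 and 40 MHz quoted above.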
 This station size requirement can potentially be reduced by using redundancy calibration. Redundancy calibration exploits the regularity in the array configuration by using the fact that the same baselines formed by different antenna pairs probe the same spatial structure and should thus measure the same visibility. This idea was originally proposed by Noordam and de Bruyn and developed further by Wieringa. The key assumption is that all receiving elements in the array have the same reception pattern. This assumption is also made in the currently used model based calibration scheme, which only solves for a direction dependent gain that is equal for all elements. In practice, however, the tile beam patterns may differ due to electromagnetic coupling between antenna elements in adjacent tiles and production tolerances. The applicability of redundancy calibration to aperture arrays was studied by Noorishad et al. Although the first experimental results looked promising, it turns out that both redundancy and model based calibration break down if the dominant signal is detected via one of the tile side lobes instead of the main beam [Noorishad et al., 2011]. The side lobes differ significantly between tiles and should therefore be modeled taking into account a direction dependent response per element. P. Noorishad et al. (Redundancy calibration in phased array stations, submitted to Astronomy and Astrophysics, 2011) are currently developing quantitative limits to the applicability of redundancy calibration, which can be a valuable input to decisions on the calibration strategy and station configuration of future instruments.
4.3. Aperture Efficiency
 If the main beam size is appropriately matched to the ionospheric patch size, direction dependent gain information for 3–5 sources within the FoV should be sufficient to characterize the ionospheric phase screen and basic deviations of the station beam pattern, such as mis-pointings. In view of the variability of the ionosphere, LOFAR is specified to support an update rate of 10 s for the direction dependent gain parameters. If a station can detect three sources within this interval, we can solve for a planar phase screen. This can be extended to a curved phase screen if the station can detect five sources.
 Based on the source statistics given by Scheers, we can find a requirement on the aperture efficiency or filling factor for low frequency instruments as discussed by Wijnholds et al. The filling factor can be defined as the ratio of the cumulative effective area of all receiving elements in the station and the physical area, which we assume to be 10⁴ m². We further assume a typical fractional bandwidth of 20% and the aforementioned 10 s integration time. Although the system temperature at the lowest frequencies is dominated by the sky noise, we have added a contribution of 50 K to account for the antenna and receive electronics, which dominates at the highest frequencies.
Figure 1 shows the estimated number of detected 5σ sources, of which at least 3 are needed for direction dependent gain calibration, as a function of filling factor for a number of frequencies in the LOFAR and AA-low frequency range. Note that to realize the same filling factor, a different number of antennas is required at each frequency, i.e., this plot does not describe the situation of a practical system with a given number of antennas within a given physical area in which the effective area, and hence filling factor, varies. The curves for 50 and 100 MHz almost overlap, because the system temperature is sky noise limited. This causes the decrease in FoV to be compensated by an increase in sensitivity, making the number of sources within the FoV almost constant over this range. In this range, a filling factor as small as 0.1 would just suffice. In LOFAR, this criterion is met up to 50 MHz by the sparsest possible configuration of the Dutch LBA stations. The filling factor of the compact LBA array is 0.4 at 60 MHz decreasing to 0.2 at 90 MHz. The filling factor of the 65-m European stations is 0.1 at 90 MHz improving to 0.25 at 60 MHz and an almost filled aperture at 30 MHz.
 At higher frequencies, the calibration becomes more challenging, because the system is no longer sky noise dominated. As a result, higher filling factors are required. The filling factor should increase from 0.1 at 100 MHz to at least 0.2 and preferably 0.4 at 400 MHz. The LOFAR HBA tiles provide a completely filled aperture below 138 MHz and a filling factor of 0.33 at the highest operating frequency of 240 MHz, thus easily satisfying this filling factor requirement.
4.4. Dense and Sparse Operating Regimes
 The required aperture efficiency is a key argument in the choice between a dense or a sparse aperture array. In this section we briefly mention a number of other considerations. We start this section with a summary of some important properties of phased arrays in the context of low frequency radio astronomy. The antenna spacing in a narrowband array design is typically ∼λ/2 to avoid grating lobes over the entire scan range. If a phased array is used over a broad frequency range, the array will become sparse at the highest frequencies, thereby narrowing the width of the main beam and causing appearance of grating lobes when scanning beyond a certain zenith angle. At the lowest operating frequencies, the effective area is approximately equal to the physical area of the array, but the effective impedance of the antenna elements varies. As a result, the receiver temperature does not only vary with frequency, but also with scanning angle.
 Below 300 MHz, the sky noise temperature is proportional to λ^2.6, while the effective area of the antennas in a sparse configuration scales with λ^2. As a result, the sensitivity, determined by the ratio of effective area and system temperature, increases toward higher frequencies for a sparse station array. This even compensates for the fact that most celestial sources become weaker with frequency. Above 300 MHz, the receiver temperature of about 50 K starts to dominate the system temperature. This shows that a sparse array is a very attractive option at frequencies below 300 MHz, but that the sensitivity decreases rapidly once the array becomes dense at the lowest operating frequencies. This attractive behavior is specific to a sky noise dominated system and does not hold for a system in which the receiver noise dominates.
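Written out, the scaling argument for a sparse, sky noise dominated array reads as follows, where A_eff denotes the effective area, T_sys the system temperature and ν the observing frequency (symbols introduced here for illustration):

```latex
\frac{A_{\mathrm{eff}}}{T_{\mathrm{sys}}}
\;\propto\; \frac{\lambda^{2}}{\lambda^{2.6}}
\;=\; \lambda^{-0.6}
\;\propto\; \nu^{0.6}
```

This relation holds only while the elements act as isolated antennas and the sky noise dominates. Once the array becomes dense, the effective area is capped by the physical area, so A_eff/T_sys falls off as λ^(−2.6) toward longer wavelengths.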
 The trade-off between a dense and a sparse configuration does not only depend on the required filling factor but also on the desired array characteristics. These have been studied for dense vs. sparse and regular vs. irregular arrays by van Cappellen et al. . This study shows that a sparse regular array is unattractive for radio astronomical applications in view of the presence of grating lobes and the non-smooth variation of the average element beam pattern over the FoV. A sparse irregular array “scrambles” the grating lobes, producing a ring of high side lobes at the same distance from the main beam where grating lobes would normally appear, and it has a smooth average beam pattern. These attractive properties explain why a randomized exposhell configuration was chosen for the LBA station array of LOFAR. An exposhell configuration has a constant number of elements in annuli with exponentially increasing width.
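 The definition of an exposhell configuration given above can be turned into a small generator, shown here as a sketch only. It places a fixed number of elements at random positions in each of a series of annuli whose boundaries grow geometrically. All parameter values are hypothetical, and the sketch does not enforce a minimum element separation or any other constraint that the actual LBA layout may have used.

```python
import numpy as np


def exposhell_layout(n_shells, per_shell, r_min, growth, seed=0):
    """Randomized exposhell layout.

    Each annulus between r_min * growth**k and r_min * growth**(k + 1)
    receives `per_shell` elements at random radius and azimuth, so the
    annulus width increases exponentially while the element count per
    annulus stays constant. Returns an array of (x, y) positions.
    """
    rng = np.random.default_rng(seed)
    edges = r_min * growth ** np.arange(n_shells + 1)
    xs, ys = [], []
    for r_in, r_out in zip(edges[:-1], edges[1:]):
        r = rng.uniform(r_in, r_out, per_shell)
        phi = rng.uniform(0.0, 2.0 * np.pi, per_shell)
        xs.append(r * np.cos(phi))
        ys.append(r * np.sin(phi))
    return np.column_stack([np.concatenate(xs), np.concatenate(ys)])


# Usage with assumed values: 6 annuli of 8 elements, innermost radius 2 m
layout = exposhell_layout(n_shells=6, per_shell=8, r_min=2.0, growth=1.5)
print(layout.shape)
```

 Because the area of each annulus grows while the element count stays fixed, the element density falls off with radius, which is what produces the space taper discussed in section 4.5.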
 This study also showed that dense arrays, both regular and irregular, have many attractive properties, such as the absence of grating lobes and a smoothly varying sensitivity over position and frequency. Unfortunately, the mutual coupling between the elements increases at lower frequencies and causes variation in impedance and effective receiver noise temperature. An array with a broad frequency range that is dense at the highest operating frequency also has a much smaller effective area at the lowest operating frequency than a sparse array of the same cost. The regular arrangement of HBA antennas in a station is therefore only dense below 138 MHz, which introduces the disadvantages of a sparse regular array at higher frequencies. One of these disadvantages is the presence of grating lobes. In the next section we discuss the measures taken to reduce their impact on the image quality of a synthesis array with phased array stations.
4.5. Suppression of Sources Outside the Main Beam
 A high aperture efficiency does not only improve the sensitivity toward the calibrators within the FoV, but also the sensitivity toward strong sources like Cas A seen in the side lobes and toward sources in the grating lobes, which could be as strong as the calibrators in the main beam. A detailed analysis by van der Tol et al.  showed that only 5–10 sources with sufficient intensity can actually be handled properly in 3–4 iterations over the data to iteratively refine the source model. Since we want to use 3–5 sources to characterize the station beams and the ionosphere, it is safe to assume that we can only handle a few sources seen by the station side lobes. Hence, source detections outside the FoV should be limited to the strongest sources in the sky, such as Cas A, Cyg A, the Sun and an occasional RFI source. Calibratability of the array thus requires that the side and grating lobes are sufficiently suppressed. In LOFAR, three measures are taken to suppress the side lobe response: station rotation, tapering and frequency averaging.
4.5.1. Station Rotation
 The influence of interfering sources outside the FoV can be reduced by lowering the side lobe level of the average station beam pattern of the entire synthesis array. Rotation of the station configurations with respect to each other is a very effective strategy, as demonstrated in Figure 2. Figure 2 (top) shows the average station power beam pattern for nine 24-tile HBA stations at 180 MHz assuming an isotropic element beam pattern. In this case, the grating response is as sensitive as the main beam. Figure 2 (bottom) shows the response for nine stations rotated in steps of 10° with respect to each other, which suppresses the grating response by 20 dB in a synthesis observation. The rationale behind this method is that a grating lobe of one station coincides with a side lobe or even a null of another station. Averaging over all interferometer contributions according to their weight results in the effective station beam pattern of a synthesis observation. Rotation also ensures that an interfering source only affects a limited number of interferometers at a given instant. The corresponding data can often be discarded with only limited impact on further processing.
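 The effect of station rotation on the grating response can be illustrated with a toy model. The sketch below is not a reproduction of Figure 2: it assumes nine hypothetical stations that each consist of a 5 × 5 regular grid of isotropic elements with 5 m spacing, pointed at zenith and observed at 180 MHz. It evaluates the response in the direction of a grating lobe of the unrotated station, both as the average of the station power beams and as the average over all interferometers of the product of the two station voltage beams. All names and values are assumptions.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def square_grid(n_side, spacing_m):
    """Element positions (m) of a centred n_side x n_side regular grid."""
    idx = (np.arange(n_side) - (n_side - 1) / 2) * spacing_m
    x, y = np.meshgrid(idx, idx)
    return np.column_stack([x.ravel(), y.ravel()])


def rotated(pos, angle_deg):
    """Rotate a station layout about its centre by angle_deg."""
    a = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    return pos @ rot.T


def voltage_beam(pos, l, m, freq_hz):
    """Normalised zenith-pointed array factor toward direction cosines (l, m)."""
    phase = 2 * np.pi * freq_hz / C * (pos[:, 0] * l + pos[:, 1] * m)
    return np.exp(1j * phase).mean()


def mean_power_beam(stations, l, m, freq_hz):
    """Average of the station power beams."""
    return float(np.mean([abs(voltage_beam(p, l, m, freq_hz)) ** 2 for p in stations]))


def mean_cross_beam(stations, l, m, freq_hz):
    """Average of b_p * conj(b_q) over all station pairs with p != q."""
    b = np.array([voltage_beam(p, l, m, freq_hz) for p in stations])
    n = len(b)
    total = abs(b.sum()) ** 2 - (abs(b) ** 2).sum()
    return float(total / (n * (n - 1)))


freq, spacing = 180e6, 5.0
base = square_grid(5, spacing)
l_grating = (C / freq) / spacing  # first grating lobe of the unrotated grid
aligned = [base] * 9
rotated_set = [rotated(base, 10 * k) for k in range(9)]
for name, stations in (("aligned", aligned), ("rotated", rotated_set)):
    p = mean_power_beam(stations, l_grating, 0.0, freq)
    x = abs(mean_cross_beam(stations, l_grating, 0.0, freq))
    print(name, 10 * np.log10(p), 10 * np.log10(x))
```

 For aligned stations both measures equal the main beam response. For rotated stations the interferometer average drops much further than the power average, because at a grating lobe of one station the other station of each pair contributes only a side lobe or a null.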
 This method is used throughout the LOFAR system. Since the LBA station arrays have a randomized exposhell configuration, they do not have rotational symmetry, so the orientations of the LBA station arrays are distributed over 360°. The HBA station arrays have fourfold symmetry. Distribution of the station orientation angles over 90° therefore already provides maximum suppression. In LOFAR, the station configuration is rotated while the antennas are counter-rotated, such that all antennas are aligned to a constant axis to preserve the polarimetric properties of the station. This is discussed in more detail in section 4.6.
4.5.2. Tapering
 The side lobes close to the main beam can usually be suppressed by either a space taper (decreasing density of receiving elements toward the edges) or tapering in the beamformer (decreasing weights toward the edges of the array). Figure 3 shows an example of a space taper: due to the exposhell configuration of the LBA array, the spatial density of elements decreases outward, thus tapering off the aperture. This suppresses a large range of side lobes close to the main beam by 5–10 dB. Tapering can also be a powerful tool in combination with station rotation, because high side lobes and grating lobes are then averaged out by side lobes that are even lower than in the untapered case.
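 Tapering in the beamformer can be illustrated with a one-dimensional sketch. It assumes a hypothetical uniform linear array of 32 elements at half-wavelength spacing and compares uniform weights with a parabolic weight taper. The peak side lobe level is read from a zero-padded FFT of the weights, and the aperture efficiency of the taper is computed as (Σw)²/(NΣw²). This is a simplification of a real two-dimensional station and is not the LBA space taper of Figure 3.

```python
import numpy as np


def peak_sidelobe_db(weights, oversample=64):
    """Peak side lobe level (dB relative to the main beam) of a linear array
    with real, non-negative weights, pointed at broadside."""
    n = len(weights)
    pattern = np.abs(np.fft.fft(weights, n * oversample))
    pattern /= pattern[0]  # main beam maximum sits at index 0
    half = pattern[: len(pattern) // 2]
    k = 0
    while k < len(half) - 1 and half[k + 1] < half[k]:
        k += 1  # walk down the main lobe to its first minimum
    return float(20 * np.log10(half[k:].max()))


def taper_efficiency(weights):
    """Aperture efficiency of a weight taper relative to uniform weighting."""
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / (len(w) * (w ** 2).sum()))


n = 32
x = (np.arange(n) - (n - 1) / 2) / (n / 2)  # element positions scaled to (-1, 1)
uniform = np.ones(n)
parabolic = 1.0 - x ** 2
for name, w in (("uniform", uniform), ("parabolic", parabolic)):
    print(name, peak_sidelobe_db(w), taper_efficiency(w))
```

 The taper trades aperture efficiency for lower near-in side lobes and a broader main beam, the same trade-off that is quantified for a parabolic taper in section 4.7.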
4.5.3. Time and Frequency Averaging
 Tapering does not work for grating lobes (regular arrays) and far side lobes (irregular arrays) in sparse arrays. Fortunately, the sources at a large distance from the main beam have a considerably different geometrical delay between stations. By applying appropriate filtering on the phase variations over time and frequency of the signals that are correlated between stations, we can suppress signals outside the FoV considerably. This effect is exploited in the demixing algorithm proposed by van der Tol et al. . This approach seems to provide sufficient suppression of strong sources such as Cyg A in a 6-h observation with 30 MHz bandwidth using the Dutch HBA stations. This is promising since subtraction of Cyg A requires extensive source modeling, a route explored by S. Yatawatta (Radio astronomical image deconvolution using prolate spheroidal wave functions, arxiv.org/abs/1101.2830, 2011).
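 The physical effect behind this filtering can be sketched for the frequency axis. Averaging a visibility over a rectangular band of width B attenuates a source whose geometrical delay differs by τ from that of the phase center by |sin(πBτ)/(πBτ)|. The sketch below evaluates this for a one-dimensional geometry. It is not the demixing algorithm itself, and the baseline length, offsets and bandwidth in the usage lines are assumed values.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def bandwidth_attenuation(baseline_m, sin_offset, bandwidth_hz):
    """Amplitude attenuation of a source after averaging the visibility over
    a rectangular band, for a 1-D geometry.

    sin_offset is the difference in sin(zenith angle) between the source
    and the phase centre, measured along the baseline.
    """
    delay_s = baseline_m * sin_offset / C
    # np.sinc(x) is sin(pi x) / (pi x)
    return float(abs(np.sinc(bandwidth_hz * delay_s)))


# Usage: 1 km baseline, 1 MHz averaging bandwidth (assumed values)
for offset in (0.0, 0.001, 0.1, 0.5):
    print(offset, bandwidth_attenuation(1000.0, offset, 1e6))
```

 Sources close to the phase center are essentially unaffected, while sources far outside the FoV are strongly attenuated, and the attenuation grows with baseline length. An analogous argument holds for averaging over time, where the difference in fringe rate takes the role of the delay.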
 Although Carozzi et al.  have argued that polarimetric diversity among array elements may improve the wide-field polarimetric performance of the array, an array of elements with the same polarimetric characteristics is usually preferred. The advantage of a polarimetrically homogeneous array is that the array can be considered as two single polarization arrays, each consisting of receiving elements that capture signals with the same polarization. The signals from these two arrays can be processed independently using a scalar, i.e., single polarization, measurement equation instead of a full polarimetric measurement equation. We will refer to this approach as the bi-scalar approach as opposed to the full polarimetric approach. The disadvantage of a polarimetrically homogeneous array, as pointed out by Carozzi et al. , is that all antennas in the array may have poor polarimetric performance for the same specific directions of arrival.
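 The difference between the two approaches can be made concrete with a small sketch. It assumes the usual 2 × 2 matrix form of the measurement equation, V = J_p B J_q^H, with J_p and J_q the Jones matrices of the two receiving paths and B the source coherency matrix, and compares it with two independent scalar equations, one per polarization. The function names and the numerical values are hypothetical.

```python
import numpy as np


def full_polarimetric_visibility(jones_p, brightness, jones_q):
    """2x2 visibility matrix V = J_p B J_q^H."""
    return jones_p @ brightness @ jones_q.conj().T


def bi_scalar_visibility(gains_p, brightness, gains_q):
    """xx and yy visibilities from two independent scalar equations,
    V_xx = g_px B_xx conj(g_qx) and V_yy = g_py B_yy conj(g_qy)."""
    return gains_p * np.diag(brightness) * np.conj(gains_q)


# Assumed example: weakly polarized source, complex gains per polarization
B = np.array([[1.0, 0.05 + 0.02j], [0.05 - 0.02j, 0.9]])
g_p = np.array([1.1 * np.exp(0.3j), 0.9 * np.exp(-0.2j)])
g_q = np.array([0.95 * np.exp(-0.1j), 1.05 * np.exp(0.4j)])

# Without polarization leakage the Jones matrices are diagonal
V_full = full_polarimetric_visibility(np.diag(g_p), B, np.diag(g_q))
V_bi = bi_scalar_visibility(g_p, B, g_q)
print(np.allclose(np.diag(V_full), V_bi))

# With leakage (off-diagonal Jones terms) the bi-scalar model is only approximate
leak = np.array([[0.0, 0.05], [-0.05, 0.0]])
V_leaky = full_polarimetric_visibility(np.diag(g_p) + leak, B, np.diag(g_q) + leak)
print(np.abs(np.diag(V_leaky) - V_bi))
```

 When all elements share the same polarimetric response, the leakage terms are common to all stations and can be corrected once for all visibilities, which is what makes the bi-scalar approach attractive.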
 In most imaging observations, the field of interest is tracked for a few hours. During the observation, such a field traces out a path through the voltage beam patterns of the antennas in each station. This implies that if these antennas have poor polarimetric performance for a limited fraction of the visible sky, this will only affect the data for a limited amount of time during the observation. In view of this consideration, the advantage of being able to avoid first order polarimetric corrections per complex visibility sample was deemed more important for efficient image formation with LOFAR than the disadvantage of poor polarization discrimination in some directions. It was therefore decided that the antennas of all stations within the core area should have the same orientation and that the antennas throughout the LOFAR system would be oriented as parallel as reasonably possible considering the curvature of the Earth. This allows a single polarimetric correction to be applied to all visibilities observed at the same instant, or to a snapshot image, assuming that all antennas have the same polarimetric response.
 Since the LOFAR station configurations are rotated copies of each other, the antennas have to be counter-rotated with respect to the station configuration to ensure their parallelism over the synthesis array. The electromagnetic coupling between neighboring antennas will therefore differ from one station to another. This may reduce the similarity between the station beam patterns. This is regarded as a second order effect in the LOFAR post-processing pipelines. However, it still needs to be demonstrated that such second order effects do not prevent us from reaching the thermal noise limit in the final imaging products.
4.7. Requirements for AA-Low in SKA
 The discussion above shows that matching to the ionospheric coherence size imposes the most demanding requirement on the station size. If matching of the half power beam width is deemed sufficient, this would require a 61-m station, assuming a lowest operating frequency of 70 MHz. This is already sufficiently large to allow a four wavelength baseline restriction to be used in station calibration. This station size requirement may become more stringent if it is decided to match the diameter of the ring of nulls around the main beam to the ionospheric patch size, or if broadening of the beam by tapering of the station is taken into account. For example, if parabolic tapering is applied, the half power beam width grows to 1.28λ/D. This decreases the aperture efficiency to 75%, but the FoV increases by a factor 1.64 [Bregman, 2004]. As a result, the survey speed is only reduced by a factor 0.93, but the side lobes are reduced considerably. Low side lobes may help to suppress detection of sources outside the main beam as discussed in section 4.5, which facilitates the calibration and image processing.
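 The numbers quoted for parabolic tapering can be checked with a short calculation, given here as a sketch under stated assumptions. It assumes a circular aperture with weights 1 − r² (r being the normalized radius), an FoV that scales with the square of the half power beam width expressed in units of λ/D, and a survey speed proportional to the FoV times the square of the effective area. The function names are hypothetical.

```python
import numpy as np


def circular_taper_efficiency(taper, n=100_000):
    """Aperture efficiency |int w dA|^2 / (A int w^2 dA) of a radially
    symmetric taper w(r) on a circular aperture of unit radius,
    evaluated with the midpoint rule."""
    r = (np.arange(n) + 0.5) / n
    w = taper(r)
    integral_w = 2 * np.pi * np.sum(w * r) / n
    integral_w2 = 2 * np.pi * np.sum(w ** 2 * r) / n
    return float(integral_w ** 2 / (np.pi * integral_w2))


def survey_speed_factor(hpbw_factor, efficiency):
    """Relative survey speed, assuming FoV ~ HPBW^2 and speed ~ FoV * A_eff^2."""
    return hpbw_factor ** 2 * efficiency ** 2


eff = circular_taper_efficiency(lambda r: 1.0 - r ** 2)
print("aperture efficiency:", eff)
print("FoV factor:", 1.28 ** 2)
print("survey speed factor:", survey_speed_factor(1.28, eff))
```

 Under these assumptions the efficiency comes out at 0.75 and the FoV factor at about 1.64, as quoted in the text, while the survey speed factor comes out at about 0.92, close to the quoted 0.93. The small difference presumably reflects rounding of the inputs.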
Figure 1 indicates that the aperture efficiency of an AA-low station should increase from 0.1 at 70 MHz to at least 0.2 but preferably 0.4 at 400 MHz. This requirement can potentially be relaxed if a larger contiguous FoV is created by multibeaming. However, such an approach needs further study and will reduce the multibeaming flexibility of the instrument.
 Although LOFAR is designed to exploit the advantage of parallel dipole orientation throughout the array, Carozzi et al.  have made a strong case for polarization diversity. In view of recent developments in polarimetric calibration [see, e.g., Wijnholds, 2010; Smirnov, 2011a, 2011b; Warnick et al., 2011], which build upon earlier work by Hamaker et al.  and Sault et al. , and the fact that polarimetric and other direction dependent corrections will be required anyway, a fully polarimetric signal processing scheme may be more attractive for the SKA aperture array systems than the bi-scalar approach commonly used to date. The experience gained from LOFAR is expected to play a crucial role in making this trade-off.
5. Next Steps
 The international efforts toward the design and the technology development for aperture arrays for the SKA are coordinated in the context of the Aperture Array Verification Programme (AAVP, www.ska-aavp.eu). The programme is executed by institutes in Europe and Western Australia in association with South Africa and with United States observers. A major goal of AAVP is to demonstrate complete performance of phased array stations operating up to a frequency of 1.4 GHz using a substantial demonstrator system. The demonstrator system is envisaged to consist of a sparse AA-low array operating between 70 and 450 MHz and a dense AA-mid array covering the 400–1400 MHz range.
 One of the key issues that should be addressed to attain the goals of the AAVP is the challenge of designing a system that can ultimately achieve 70 dB dynamic range. Dynamic range is understood here as the ratio between the brightness of the strongest object in a synthesis image and the average level of noise and artifacts in the image. The feasibility of achieving 70 dB dynamic range should be demonstrated by prototype systems and predicted for the full array in simulation. This will not only require a well designed system that can be properly calibrated, but more importantly effective calibration and imaging algorithms that limit the required processing capacity to what can be provided by affordable processing platforms. These issues are investigated by the calibration and imaging working group.
 In this paper, we discussed how calibratability arguments have influenced the design of the LOFAR system considerably and indicated some implications for the SKA AA-low system. Despite all the work that has already been done, it is still very hard to quantify the required beam shape accuracy and the required station beam stability to achieve the 70-dB dynamic range envisaged for the SKA system. These values are required to derive requirements on the setting accuracy in lower levels of the beamforming hierarchy and on the accuracy and stability of the antenna hardware by means of an error propagation analysis.
 Although this paper shows that high dynamic range imaging with LOFAR is feasible using self-calibration techniques, no images are available yet in which all required aspects are fully implemented, so it has not yet been proven that all aspects are fully understood. In particular, the predictability of the shape of a tracking station beam, including mutual coupling effects and ionospheric disturbances, is crucial to reach the 70 dB dynamic range criterion. Predictability not only requires a proper system model with parameters that can be accurately estimated by self-calibration at specific instants, but also requires stability and known steering effects between these instants. If the system can be designed in such a way that the envisaged dynamic range is feasible, the next question will be how calibration and imaging should be implemented to achieve it.
 Ultimately, the dynamic range will be limited by the effective noise floor in the image. As discussed by Wijnholds and van der Veen , the effective noise floor consists of three components: thermal noise, estimation noise and source confusion. Estimation noise stems from the fact that estimation of calibration parameters extracts information from the data that can no longer be used to construct the image. Wijnholds and van der Veen  have already demonstrated that the effective noise in the image increases with the number of calibration parameters that are estimated. Depending on, e.g., the sensitivity and stability of the system, we may thus have to design calibration mechanisms into the system to prevent the extraction of calibration parameters from causing an unacceptable increase in the image noise.
6. Conclusions
 In this paper, we gave an overview of the current status of the LOFAR system and its calibration strategy. LOFAR is the first operational telescope to exploit a hierarchical beamforming structure similar to the one that is likely to be used in the SKA aperture array systems. We identified the calibratability of the system at each level as a key factor to the success of the instrument. From the calibratability requirement, we derived a number of system requirements for LOFAR and the AA-low system:
 1. The station beam size should be matched to the ionospheric patch size.
 2. The aperture efficiency should have a certain minimum value to observe sufficient self-calibration sources in the main beam to allow a full ionospheric phase screen solution for the station beam of each telescope. For AA-low, for example, it should be at least 10% at 70 MHz increasing to 20% but preferably even 40% at 450 MHz.
 3. The side and grating lobe responses should be suppressed to a level that the self-calibration loop at array level only has to deal with the strongest sources in the sky, like Cas A and the Sun, and an occasional RFI source.
 4. Although LOFAR exploits parallel alignment of the dipoles to facilitate the calibration, an analysis by Carozzi et al.  suggests that polarization diversity may improve system performance. This needs to be reassessed for the SKA aperture arrays in view of recent advances in polarimetric calibration and efficient image construction.
 This demonstrates that calibratability arguments have a separate impact on the design of an imaging instrument that is as profound as the impact of the imaging requirements that usually attract most attention.