Holographic Optical Elements for Augmented Reality: Principles, Present Status, and Future Perspectives

Holography refers to the process of recording a complete wave field of interfered coherent beams into a medium (hologram), which can be used to reproduce the original wave field. Since its invention by Dennis Gabor in 1948, the quality of recorded holograms has been dramatically improved due to advancements in materials and recording methodology. As a result of the unique property of wavefront manipulation, holographic optical elements (HOEs) have since found pervasive applications in the fields of data storage, solar concentration, imaging, and display. Lately, augmented reality (AR) has gained unprecedented research interest in both academia and industry because of its potential to become the next-generation display, which could fundamentally transform our daily lives. The basic concept of AR is to seamlessly blend virtual digital content with the real surrounding environment. However, the see-through capability of AR, along with the requirement of delivering high-fidelity images to the viewer’s eyes, poses great challenges to optical designs in terms of field of view (FOV), eye box size, image contrast ratio, and generation of correct focus cues, to name a few. Generally, AR systems with traditional geometric optics are based on a partial-mirror combination. Freeform surfaces can be adopted for aberration correction and higher image quality, but the tradeoff between system form factor and the product of FOV and eye box remains a major obstacle toward commercialization. Recently, HOE-based AR systems have gained increasing momentum because of the diverse functions of HOEs and large degrees of freedom in design and choice of materials. Various systems have been proposed to resolve the issues related to focus cue generation, system form factor, FOV, and eye box size. In this review, we will first introduce holography methods and the underlying physics of HOE formation. Next, we will describe some unique optical properties of HOEs and their functionalities. After that, we will briefly review the applications of HOEs in several AR display systems and discuss their pros and cons. Finally, we will cast some perspectives on future developments of HOEs for AR displays.


Introduction
Holography refers to the process of recording a complete wave field of interfered coherent beams into a medium (hologram), which can be used to reproduce the original wave field. Since its invention by Dennis Gabor in 1948, the quality of recorded holograms has been dramatically improved due to advancements in materials and recording methodology. [1] As a result of the unique property of wavefront manipulation, holographic optical elements (HOEs) have since found pervasive applications in the fields of data storage, [2] solar concentration, [3] imaging, [4] and display. [5] Lately, augmented reality (AR) has gained unprecedented research interest in both academia and industry because of its potential to become the next-generation display, which could fundamentally transform our daily lives. [5a,6] The basic concept of AR is to seamlessly blend virtual digital content with the real surrounding environment. However, the see-through capability of AR, along with the requirement of delivering high-fidelity images to the viewer's eyes, poses great challenges to optical designs in terms of field of view (FOV), eye box size, image contrast ratio, and generation of correct focus cues, to name a few. Generally, AR systems with traditional geometric optics are based on a partial-mirror combination. [5a] Freeform surfaces can be adopted for aberration correction and higher image quality, but the tradeoff between system form factor and the product of FOV and eye box remains a major obstacle toward commercialization. Recently, HOE-based AR systems have gained increasing momentum because of the diverse functions of HOEs and large degrees of freedom in design and choice of materials. [5b,6b] Various systems have been proposed to resolve the issues related to focus cue generation, system form factor, FOV, and eye box size.
In this review, we will first introduce holography methods and the underlying physics of HOE formation. Next, we will describe some unique optical properties of HOEs and their functionalities. After that, we will briefly review the applications of HOEs in several AR display systems and discuss their pros and cons. Finally, we will cast some perspectives on future developments of HOEs for AR displays.

Holography Methods
The unique wavefront-regeneration property of an HOE results from recording the interference pattern of the object wave and reference wave. The recording medium, responding to the intensity or polarization direction of the electric field, records the distribution of interfering fringes by converting it to a physical grating pattern, through the modulation of the transmittance, refractive index, or molecular orientation. When a reference beam is incident on the recorded medium, the object wave can be reconstructed with the diffraction of local grating patterns. Depending on how the recording medium responds to light, the holography methods can be categorized into intensity holography and polarization holography.

Intensity Holography
In intensity holography, the recording materials are sensitive to the light intensity of the interfering field. With respect to the type of modulation, holograms can be classified into amplitude holograms and phase holograms. For amplitude holograms, a commonly used material is photographic film with silver halide emulsions. [1] The recording of amplitude gratings is similar to the exposure process of picture-taking in a film camera. The silver halide, after absorbing the energy in high-intensity regions, forms nanoscale silver particles in the medium, which can later be developed to form permanent patterns, as Figure 1a shows. The intensity information of the interfering light is thus transferred to transmittance modulation.
Phase holograms are generally based on the modulation of a material's refractive index. Among various approaches, including photorefractive materials, dichromated gelatin, and photoresists, [1a,b] photopolymers have the advantages of low cost, low scattering, high resolution, and simplicity of fabrication, and are therefore widely used. [3,5b] Holographic photopolymers rely on a light-intensity-dependent polymerization rate and on the diffusion of monomers during the recording process. As shown in Figure 1b, in the high-intensity regions, the monomers absorb photons and form interconnecting chains (polymerization). The consumption of monomers causes further monomers to diffuse from dark regions to bright regions, leading to increased density and refractive index in the bright regions. The intensity information of the interfering field is thereby recorded as index modulation in the material. Another type of holographic material, called holographic polymer-dispersed liquid crystal (HPDLC), [7] adopts a similar mechanism of monomer diffusion and polymerization, but also contains a liquid crystal (LC) that makes the grating dynamically switchable. [7d,e] During the formation process, the monomers diffuse to bright regions and then polymerize, while the LCs diffuse to dark regions and form droplets with random director orientations, as shown in Figure 1c. The monomer concentration in an HPDLC is usually as high as ≈70 wt%, and thus the formed LC droplets are on the ≈100 nm scale, so they do not scatter visible light. When a voltage is applied, the LC directors inside the droplets are reoriented along the electric field direction, as shown in Figure 1d. If the refractive indices of the polymer and LC are chosen to match, $n_{polymer} = n_o$ (the ordinary refractive index of the LC), then the whole structure becomes transparent. In this state the index modulation vanishes, so the grating is switched off.

Polarization Holography
In recent years, HOEs based on polarization holography have attracted extensive interest due to their high efficiency, polarization selectivity, electrically switchable capability, and high image quality. [8] Unlike intensity holography, which records the intensity of interfering beams, polarization holography records the polarization state of an electric field based on photoinduced anisotropy. In polarization holography, usually recording beams with orthogonal polarization states are used, which results in a spatially varying polarization field. The basic mechanism of photoinduced anisotropy formation is the photoisomerization of azobenzene molecules. Namely, the molecules repeatedly go through trans-cis isomerizations and are reoriented perpendicular to the long axis of general elliptical polarization state, in which the absorption of light is minimal. Such a linear polarization state has the best reorienting capacity, so most polarization holography materials produce linear anisotropy. Similar to intensity holography, volume holograms record the spatial field in polarization holography, [9] but the required light intensity is usually very high because the recording medium is a solid LC polymer with an isotropic initial state. To induce an appreciable birefringence would require a large exposure dosage. Lately, photoalignment polarization holography (PAPH) has received much attention because of its low fabrication complexity and high quality of optical elements. [10] In PAPH, the recording of wavefront through photoinduced anisotropy only occurs in a thin layer of azo dye molecules, so the required light intensity is relatively low. Such a patterned photoalignment layer is later used to align the LC material placed on top.
Figure 1. Schematics of various intensity holography recording processes. The dark-bright fringes in each diagram indicate the high (bright) and low (dark) light intensities of interfering fringes. a) Formation of a silver-halide-based amplitude hologram. b) Formation of a holographic photopolymer. The arrows indicate the moving directions of monomers. c) Formation of an HPDLC. The arrow inside the LC droplet indicates the averaged director orientation. d) Reorientation of LC droplets in an HPDLC with an applied voltage.
The basic principle of PAPH is shown in Figure 2a. When two circularly polarized (CP) beams with opposite handedness interfere, the electric field on the plane exhibits a linear polarization pattern whose direction varies sinusoidally along the x-axis, which can be shown by the following equation
$$\begin{bmatrix} 1 \\ i \end{bmatrix} e^{-i k_0 \sin\theta \cdot x} + \begin{bmatrix} 1 \\ -i \end{bmatrix} e^{i k_0 \sin\theta \cdot x} = 2 \begin{bmatrix} \cos(k_0 \sin\theta \cdot x) \\ \sin(k_0 \sin\theta \cdot x) \end{bmatrix} \tag{1}$$
where $k_0$ is the wavenumber and θ is the incident angle. The photoalignment material, which tends to align perpendicular to the electric field direction, records the pattern. Then the LC in contact with the patterned photoalignment material replicates the pattern and forms a functional HOE. In the early stage of PAPH, a nematic liquid crystal (NLC) was used to form the HOE, [10d,11] as shown in Figure 2b. When an NLC layer satisfying the half-wave phase retardation condition is patterned following the photoalignment layer, it reconstructs the original wavefront under CP illumination. This can be explained by the Jones matrix of the patterned half-wave plate: the handedness of the incident CP light is flipped, and the phase information is restored. Such an HOE is often referred to as a geometric phase optical element (GPOE) or Pancharatnam-Berry optical element (PBOE). [10d,11d] In this case, the LC directors would maintain the alignment pattern along the z-direction. However, because LC directors inherently tend to align uniformly to lower the free energy, this type of LC configuration is unstable because the NLC is deformed along the x-direction. Its structure starts to distort when the pattern period is smaller than the NLC layer thickness. [12] A direct consequence is that the diffraction angle of PBOEs is relatively small (≤10°).
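The polarization algebra above can be checked numerically. The following is a minimal sketch, not taken from the source: it evaluates the left-hand side of Equation (1) and applies the standard Jones matrix of a half-wave plate whose optic axis sits at a local angle φ, written up to a global phase. The function names and the sample values of the phase and of φ are illustrative assumptions.

```python
import cmath
import math

def interference_field(phase):
    """Left-hand side of Eq. (1); `phase` stands for k0*sin(theta)*x."""
    ex = cmath.exp(-1j * phase) + cmath.exp(1j * phase)
    ey = 1j * cmath.exp(-1j * phase) - 1j * cmath.exp(1j * phase)
    return ex, ey

def half_wave_plate(phi):
    """Jones matrix (up to a global phase) of a half-wave plate with axis at angle phi."""
    c, s = math.cos(2 * phi), math.sin(2 * phi)
    return ((c, s), (s, -c))

def apply(matrix, vec):
    """Multiply a 2x2 Jones matrix with a 2-component Jones vector."""
    (a, b), (c, d) = matrix
    return (a * vec[0] + b * vec[1], c * vec[0] + d * vec[1])

phase = 0.7  # arbitrary sample value of k0*sin(theta)*x (assumption)
ex, ey = interference_field(phase)
# Both components come out real and equal to 2*cos(phase), 2*sin(phase):
# the summed field is linearly polarized along the direction (cos, sin).
print(ex, ey)

phi = 0.3  # arbitrary local director angle in radians (assumption)
cp_in = (1 / math.sqrt(2), 1j / math.sqrt(2))
cp_out = apply(half_wave_plate(phi), cp_in)
# The output has the form exp(2j*phi) * (1, -1j)/sqrt(2):
# opposite handedness, multiplied by a phase factor of 2*phi.
print(cp_out)
```

The phase 2φ picked up by the output tracks the local director angle, which is how the patterned half-wave layer restores the recorded wavefront while flipping the handedness.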
Lately, the PAPH approach has been extended to cholesteric liquid crystals (CLCs) to form CLC optical elements (CLCOEs). [6b,10c,10d,12b,13] Unlike an NLC, which tends to align uniformly, a CLC tends to form a helical structure to reach its lowest free energy state. Therefore, when a layer of CLC is in contact with the sinusoidal alignment pattern, the bulk CLC tends to tilt so that it maintains its helical structure while matching the k-vector of the bottom pattern, [12b] as Figure 2c shows. The tilt angle follows from this k-vector matching condition. As a result, there exists a transitional region where the LC directors change from the bottom planar alignment to a volume-tilted helical structure. The deformation leads to nonzero free energy in the region. Such a transitional region is usually very thin (≈10 nm), [12b] so its contribution to the total free energy is minimal. The self-forming CLC helical structure is very sensitive to the handedness of the input CP light. For example, if the incident left-handed circularly polarized (LCP) light has the same handedness as the CLC structure, it will experience a strong Bragg reflection and be diffracted into the first reflection order. On the other hand, the right-handed circularly polarized (RCP) light will simply pass through the structure unaffected. Due to the inherent stability of the CLC structure, the diffraction angle a CLCOE can accommodate is very large (≈70°) because there is no alignment issue as in PBOEs.

Properties of HOE
Before we dive into a detailed discussion of various HOEs and their properties, it is necessary to define some basic parameters. Because the local region of a general HOE can be regarded as a grating, we will first focus on the diffraction behavior of gratings. For the volume grating shown in Figure 3a, two basic parameters are the slant angle α and the grating pitch Λ. These two parameters together define a grating vector $\vec{k}_G$ with length $k_G = 2\pi/\Lambda$. When the incoming light interacts with the grating, it will be diffracted into multiple orders in the most general case. The diffraction angle of each order follows from the grating equation, in which m is the diffraction order, λ is the wavelength, $n_1$ and $n_2$ are the refractive indices in the input and output regions, and $\Lambda_x = \Lambda/\cos\alpha$ is the grating pitch in the x-direction. Diffraction efficiency is defined as the ratio of the power in a given order to the total input power. When the input and output light k-vectors $\vec{k}_{in}$ and $\vec{k}_{out}$ form a triangular relation with the grating vector $\vec{k}_G$, as plotted in Figure 3b, the Bragg condition is satisfied. At the Bragg condition, maximum interaction between input light and grating occurs, so the diffraction efficiency for that order is usually the highest.
When the k-vectors do not perfectly satisfy the Bragg condition due to deviation of incident angle or wavelength, the diffraction efficiency drops. However, this efficiency decrease is dependent not only on the degree of deviation but also on the index modulation within the grating. Understanding such a grating behavior is important to optimize HOEs for AR displays where the incident angle and wavelength usually vary in a large range. In Section 3.2, we will discuss this subject in detail.
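As a rough illustration of the grating relations described above, the sketch below evaluates the grating equation in the commonly used form n2·sin θ_m = n1·sin θ_in + mλ/Λ_x, together with the Bragg wavelength of a reflection grating, λ_B = 2nΛ·cos θ_B, where θ_B is the angle between the ray and the grating vector inside the medium. The latter follows from the triangular k-vector relation when the input and output k-vectors have equal length. The sign convention, the function names, and the sample numbers are assumptions made for illustration and may differ from the convention of the original equation.

```python
import math

def diffraction_angles(wavelength_nm, pitch_x_nm, theta_in_deg,
                       n1=1.0, n2=1.0, max_order=2):
    """Propagating orders from n2*sin(theta_m) = n1*sin(theta_in) + m*lambda/pitch_x.

    The sign convention is an assumption; evanescent orders are skipped.
    Returns a dict {order m: output angle in degrees}.
    """
    angles = {}
    for m in range(-max_order, max_order + 1):
        s = (n1 * math.sin(math.radians(theta_in_deg))
             + m * wavelength_nm / pitch_x_nm) / n2
        if abs(s) <= 1.0:
            angles[m] = math.degrees(math.asin(s))
    return angles

def bragg_wavelength_nm(pitch_nm, n_avg, angle_to_grating_vector_deg):
    """Bragg wavelength of a reflection grating: 2*n*pitch*cos(theta_B)."""
    return 2.0 * n_avg * pitch_nm * math.cos(math.radians(angle_to_grating_vector_deg))

# Hypothetical example: 1000 nm in-plane pitch, 500 nm light, normal incidence in air.
for order, angle in diffraction_angles(500.0, 1000.0, 0.0, max_order=1).items():
    print(f"order {order:+d}: {angle:.1f} deg")

# A 175 nm Bragg pitch in a medium of index 1.58, with the ray along the grating vector.
print(f"Bragg wavelength: {bragg_wavelength_nm(175.0, 1.58, 0.0):.0f} nm")
```

Moving the ray away from the grating vector lowers cos θ_B and therefore shifts the Bragg wavelength toward the blue, which is the coupling between angular and spectral response discussed above.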

Transmissive and Reflective HOEs
Depending on the configuration of the reference beam and reconstructed beam, both transmissive HOEs (t-HOEs) and reflective HOEs (r-HOEs) can be fabricated, as shown in Figure 4a,d. The main difference between t-HOEs and r-HOEs is the grating period in the z-direction. As shown in Figure 4b,e, the components of the grating k-vector in the x- and z-directions are $k_{Gx} = k_0 \sin\theta$ and $k_{Gz} = k_0 (1 - \cos\theta)$ for the t-HOE, where $k_0$ is the wavenumber. For the r-HOE, the components of the k-vector in the x- and z-directions are $k_{Gx} = k_0 \sin\theta$ and $k_{Gz} = k_0 (1 + \cos\theta)$. When the deflection angle is small, the z-component of the grating vector is much larger for the r-HOE than for the t-HOE, which means a much smaller fringe period and thus a higher requirement for grating resolution and scattering suppression.
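To make the difference in z-period concrete, the sketch below compares the two cases. It assumes $k_{Gz} = k_0(1 - \cos\theta)$ for the t-HOE and $k_{Gz} = k_0(1 + \cos\theta)$ for the r-HOE, the latter obtained by reversing the z-component of the output beam. The function name and the sample wavelength and deflection angle are hypothetical.

```python
import math

def grating_z_period(wavelength, theta_deg, reflective):
    """Fringe period along z, 2*pi/k_Gz, with k_Gz = k0*(1 -/+ cos(theta)).

    `wavelength` is the wavelength inside the medium (any length unit);
    the result is returned in the same unit.
    """
    theta = math.radians(theta_deg)
    factor = 1.0 + math.cos(theta) if reflective else 1.0 - math.cos(theta)
    if factor == 0.0:
        return math.inf  # no z-modulation at all (t-HOE with zero deflection)
    return wavelength / factor  # 2*pi/k_Gz with k0 = 2*pi/wavelength

# Hypothetical example: 525 nm light in a medium of index 1.58, 10 degree deflection.
wavelength_in_medium = 525.0 / 1.58
for label, reflective in (("t-HOE", False), ("r-HOE", True)):
    period = grating_z_period(wavelength_in_medium, 10.0, reflective)
    print(f"{label}: z-period = {period:.0f} nm")
```

Under these assumptions the r-HOE period along z approaches half the wavelength in the medium for small deflection angles, while the t-HOE period grows without bound, which is why the reflective case places the stricter demand on material resolution.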
For volume holograms in both intensity and polarization holography, the fabrications of t-HOEs and r-HOEs mainly differ in the configuration of recording beams. As Figure 4c shows, if the recording beam and reference beam are on the same side of the sample, then the resultant interfering fringes have a large period in the z-direction, which forms t-HOEs. When the two beams are on the opposite sides of the sample, the fringes with small period in the z-direction form r-HOEs, as Figure 4f shows.
In PAPH, because the photoalignment layer only records the information in the x-y plane, the fabrications of t-HOEs and r-HOEs only differ in the overcoated LC material. As discussed previously, when using an NLC or a CLC with low chirality, the grating k-vector in the z-direction is small and therefore results in t-HOEs. When a high-chirality CLC is used, the Bragg structure brings a large k-vector in the z-direction and therefore produces r-HOEs.
For the application of optical combiners in AR, r-HOEs are more widely used because of the requirement of good see-through ability. As shown in Figure 4g, for t-HOEs, the biggest issue is the diffraction of environmental light, which causes stray light to enter the viewer's eye and results in a ghost image of real objects. In contrast, for r-HOEs, stray light from the environment is reflected backward and does not influence the see-through view, as shown in Figure 4h. Such a reflective property also allows the display source to be placed on the viewer's side, which can effectively reduce the size of the whole system. Still, there are some specific applications of t-HOEs in AR, such as the input coupler of waveguide displays [6b,14] and the sandwiching of t-HOEs to form combiners free of stray light. [15] But r-HOEs generally enable wider applications as the direct combiner. Thus, from here on we will focus on r-HOEs.

Diffraction Properties
www.advancedsciencenews.com www.adpr-journal.com
The diffraction properties of various HOEs mainly depend on the refractive index modulation δn. The index modulation of a photopolymer-type volume HOE is usually small (≈0.02), so it needs a thickness of ≈15 μm to produce efficient Bragg diffraction. For an HPDLC-type HOE, because the LC birefringence can reach Δn ≥ 0.2, the index modulation δn can be higher (≈0.1). More specifically, if we assume the LC directors within droplets in Figure 1c are randomly distributed, then the macroscopic behavior of LC droplets is isotropic. The averaged refractive index can be calculated as [16]
$$\bar{n} = \sqrt{\frac{n_e^2 + 2 n_o^2}{3}}$$
The index modulation is $\delta n = \bar{n} - n_p = \bar{n} - n_o$. If we use $n_e = 1.7$ and $n_o = 1.5$, then we get δn = 0.07. Note that δn can be further increased if the switching property is not required, which means $n_p$ does not have to be the same as $n_o$ and it can be further reduced. For a CLCOE in PAPH, because the diffraction is based on the helical structure of the CLC itself, the index modulation is equal to the LC birefringence, $\delta n = n_e - n_o = \Delta n$.
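The HPDLC estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the isotropic droplet average is the root mean square of one extraordinary and two ordinary index contributions, a form that reproduces the quoted δn = 0.07. The function name is hypothetical.

```python
import math

def droplet_average_index(n_e, n_o):
    """Isotropic average index of randomly oriented LC droplets (assumed RMS form)."""
    return math.sqrt((n_e ** 2 + 2.0 * n_o ** 2) / 3.0)

n_e, n_o = 1.7, 1.5          # LC indices quoted in the text
n_avg = droplet_average_index(n_e, n_o)
delta_n = n_avg - n_o        # polymer index matched to n_o (switchable case)
print(f"average index = {n_avg:.4f}, index modulation = {delta_n:.2f}")
```

For comparison, a CLCOE built from the same LC would use the full birefringence, n_e − n_o = 0.2, as its index modulation.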
For HOEs, the spatial variation of the fringe period is generally much smaller than the fringe period itself, so the local region can be treated as a grating. Therefore, we study the basic diffraction properties of gratings of the aforementioned HOEs. For comparison, both the spectral and angular dependences of the first-order diffraction efficiency are calculated using rigorous coupled wave analysis (RCWA). [17] The index distributions in HPDLCs and photopolymer HOEs follow a sinusoidal pattern, with δn = 0.02 for photopolymers and δn = 0.07 for HPDLCs. For CLCOEs, $n_e = 1.7$ and $n_o = 1.5$ are used for calculation. The grating configuration is the same for all three HOEs. Namely, the Bragg pitch is 175 nm and the slant angle between the Bragg surface and the substrate is 20°. Both the input and output media are glass substrates with a refractive index of 1.58. The comparison of these three types of grating is plotted in Figure 5a,b. Because the spectral response depends on the incident angle and the angular response depends on the wavelength, we choose, as a guide, a wavelength and an incident angle that separately produce the maximum full width at half maximum (FWHM) in the angular and spectral responses. Namely, the spectral response in Figure 5a is at a 20° incident angle, and the angular response in Figure 5b is at λ = 525 nm. As shown in Figure 5, both the spectral and angular bandwidths increase with the index modulation δn. Among these three types of gratings, the CLCOE has the largest spectral bandwidth (≈100 nm) and angular bandwidth (≈60°) in glass. The spectral and angular bandwidths of the HPDLC-type grating are ≈25 nm and 25° in glass, while for the photopolymer type the values are ≈8 nm and 15° in glass.
Another thing to note is that the polarization state of incident light does not influence the results for HPDLC- and photopolymer-type gratings. This is because the studied gratings are of reflection type. The efficiency is generally insensitive to the difference in phase modulation between transverse electric (TE) and transverse magnetic (TM) polarizations when the grating thickness is large enough. For the CLCOE type, the situation is quite different. For CP light with the same handedness as the CLC (RCP here), the efficiency is high, as plotted in Figure 5. But for the opposite handedness (LCP), the efficiency drops to zero. This extreme polarization selectivity is a unique feature of CLCOEs compared to other types of HOEs.
To further investigate the dependence of grating efficiency on sample thickness, the spectral responses with different sample thicknesses for each type of grating are also plotted in Figure 5c-e. For the CLCOE type, a sample thickness of ≈2 μm can already yield an efficiency >90%, while for the HPDLC and photopolymer types, the required thicknesses are 6 and 15 μm, respectively.
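For a quick plausibility check of the thickness values, one can swap the RCWA used in the text for Kogelnik's two-wave coupled-wave result for a lossless, unslanted, sinusoidal reflection grating at Bragg incidence, η = tanh²(πδn·d / (λ·cos θ)). The sketch below makes that substitution explicitly. It treats δn as the amplitude of the index modulation, ignores the 20° slant, and does not apply to the CLCOE, whose reflection is handedness-selective. The function name is hypothetical.

```python
import math

def kogelnik_reflection_efficiency(delta_n, thickness_um, wavelength_um, theta_deg=0.0):
    """Peak efficiency of an unslanted reflection grating at Bragg incidence.

    Kogelnik coupled-wave approximation (not RCWA): tanh^2(pi*dn*d/(lambda*cos(theta))),
    with theta the propagation angle inside the medium.
    """
    nu = math.pi * delta_n * thickness_um / (
        wavelength_um * math.cos(math.radians(theta_deg)))
    return math.tanh(nu) ** 2

wavelength_um = 0.525
for name, delta_n, thickness_um in (("photopolymer", 0.02, 15.0), ("HPDLC", 0.07, 6.0)):
    eta = kogelnik_reflection_efficiency(delta_n, thickness_um, wavelength_um)
    print(f"{name}: dn = {delta_n}, d = {thickness_um} um -> efficiency = {eta:.2f}")
```

Because the efficiency depends only on the product δn·d in this approximation, a material with a larger index modulation reaches the same efficiency in a proportionally thinner film, consistent with the trend reported above.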

Types of HOEs
According to the designed functions, HOEs can be categorized into grating HOEs, [10c,12b,13a,b,18] lens HOEs, [13c,e,19] lens-array HOEs, [19e,20] and diffuser HOEs. [20c,21] Most of the previously reported HOEs are fabricated using photopolymers, but they can also be fabricated using HPDLCs with the same recording configurations. The recording processes are shown in Figure 6. Fabrication of grating HOEs is the simplest, requiring only a collimated recording beam and a reference beam, as Figure 6a shows. The angle between these two beams and the wavelength define the volume fringe pattern, and therefore the diffraction behavior of the recorded HOE. For other types of HOEs, a template is usually needed to produce the desired wavefront. Such a template is usually placed near the recording medium to preserve the original wavefront, as shown in Figure 6b-d. For a diffuser HOE, the recording beam has a random wavefront after the diffuser template and therefore forms randomized fringes when interfering with the reference beam, yet the see-through ability is still not affected. Because the angular response of photopolymer HOEs is narrow, only light close to the Bragg condition, that is, light with an incident angle nearly the same as that of the reference beam, can be scattered. Another important feature of photopolymer HOEs is the capacity to record multiple holograms in one sample. [19f,g,22] The recording process is simply a sequence of the required exposures. Taking a lens-diffuser HOE as an example, in the first exposure process, the dosage is usually set at a relatively low level (e.g., half of the saturation dosage), so the monomers are not depleted and can be used for the next HOE recording. Then the exposure process of the diffuser HOE can have a high dosage to fully consume the remaining monomers.
Naturally, this multiplexed hologram has a tradeoff between the efficiencies of two separate HOEs, which is determined by the distribution ratio of monomers into each HOE. For the multiplexing of more than two HOEs, the working principle remains similar. For CLCOEs using PAPH, the fabrication configuration has more varieties. Because only the wavefront in plane is recorded, the configuration with two beams at the same side or on opposite sides can produce the same wavefront. The difference lies in the polarization state of the recording beams. The grating component in a CLCOE is often referred to as polarization volume grating (PVG). [10c,13b] Two types of recording processes of PVG are plotted in Figure 6e,f. When two CP beams with opposite handedness are on the same side of the photoalignment layer, they form a sinusoidal pattern, as discussed previously. If these two beams are on opposite sides of the photoalignment layer, the handedness of the two beams needs to be the same to produce the same pattern.
In addition to PVGs, various lenses can also be fabricated using PAPH; such a lens is referred to as a polarization volume lens (PVL). [13e] The PVL is also a useful photonic device based on patterned CLCs. More specifically, such a CLC is precisely aligned in the horizontal plane to provide a parabolic phase profile and twisted in the vertical direction along a slanted helical axis. Compared to the radius of lens curvature (several centimeters), the lens period (several hundreds of nanometers) is too short to be observed visually. Therefore, when we zoom into a very small area to observe the alignment under a polarizing optical microscope, the pattern looks more like a grating than a lens. Generally, at a macroscopic level, the PVL has a parabolic phase profile, but when we focus on each small area, the LC structure looks like a PVG with a linear phase profile. Therefore, the PVL can be treated as the combination of a reflective PVG and a concave/convex lens. The exposure setups are sketched in Figure 6g,h. For two beams on the same side of the sample, the template lens is placed at 2f distance from the sample so as not to influence the reference beam, as shown in Figure 6g. For beams on opposite sides of the sample, the template lens can be placed near the sample, as shown in Figure 6h.
For CP incident light (here we take LCP as an example), the off-axis incident beam is converged to a point by the PVL whose helical twist has the same handedness as the incident light. As discussed earlier, the diffraction efficiency of a PVL is highly dependent on the film thickness. Normally, we need several micrometers (about ten pitches) to establish the Bragg condition and obtain >90% diffraction efficiency. Moreover, the bandwidth of the diffraction spectrum and angular response is directly related to the LC material used and can be easily controlled by using an LC with an appropriate δn. Thanks to the maturing of the LCD industry, nowadays the LC birefringence can cover a wide range (from 0.05 to 0.4). Compared to holographic lenses, the PVL has more flexibility in material selection, so its bandwidth is also easier to adjust.
Figure 6. Recording processes of HOEs. Schematics of recording configuration of photopolymer- and HPDLC-type a) grating HOEs, b) lens HOEs, c) lens-array HOEs, and d) diffuser HOEs. Configuration of PAPH recording setup using e) CP beams with opposite handedness on the same side of the sample and f) CP beams with same handedness on opposite sides of the sample. The recording configuration of lens HOEs in PAPH g) with template lens located at 2f distance from the sample and h) with template lens placed near the sample.

HOEs for AR Displays
Several AR display systems have adopted HOEs as combiners.
To evaluate an AR display, factors such as FOV, eye box size, form factor, light efficiency, and 3D capability should be considered. To clearly understand the performance of an AR system, we first give a brief introduction of these factors. The FOV determines the virtual image size perceived by the viewer's eye. The human eye has a large FOV: ≈160° in the horizontal and 130° in the vertical direction for each eye (monocular vision). The overlapped binocular vision still has a 120° FOV in the horizontal direction. Therefore, for an AR display to have a decent viewing experience, a modest estimate is that an 80° by 80° (100° diagonal) FOV is required. The eye box size determines the spatial range in which the eye can be placed when seeing the image without vignetting or total disappearance. The eye box should be large enough to accommodate users with different eye locations and wearing positions. Form factor is another aspect concerning the wearing comfort. For a comfortable daily wearing experience, a glasses-like form factor is favored. Another display quality is light efficiency, which is often related to image brightness and contrast ratio. For a virtual image to be observable in a bright ambience, an optical combiner with high efficiency along with a bright image source is required. Finally, for an AR display to deliver vivid virtual objects to a viewer's eye, 3D capability should be considered. Table 1 summarizes and compares the strengths and weaknesses of various display systems adopting HOE combiners. It should be noted that for each display system, the performance of a specific parameter mentioned previously can be improved, but often at the cost of other parameters. Here, the evaluation of each parameter listed in Table 1 is based on general system performance while considering the potential cost of improvement.
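The quoted diagonal figure can be checked with a short calculation. The sketch below assumes a flat (rectilinear) virtual image plane, so that the half-angle tangents of the horizontal and vertical FOV add in quadrature. The function name is hypothetical.

```python
import math

def diagonal_fov_deg(h_fov_deg, v_fov_deg):
    """Diagonal FOV of a rectangular image on a flat virtual image plane."""
    th = math.tan(math.radians(h_fov_deg) / 2.0)
    tv = math.tan(math.radians(v_fov_deg) / 2.0)
    return 2.0 * math.degrees(math.atan(math.hypot(th, tv)))

print(f"80 x 80 deg -> diagonal = {diagonal_fov_deg(80.0, 80.0):.1f} deg")
```

Under this assumption an 80° by 80° field corresponds to a diagonal of roughly 100°, in line with the estimate above.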

Projection Combiner
In a projection AR system, [20c,21] the displayed image is directly projected onto the combiner, which is usually a diffuser HOE made with a photopolymer. The image light is then scattered and forms an image with focus on the combiner. As Figure 7a shows, the light projected from the image source forms an image on the diffuser HOE. To achieve a full-color display, the diffuser HOE has to be recorded three times using red, green, and blue (RGB) lasers. Usually the image source is a 2D display such as liquid crystal on silicon (LCOS) or a digital micro-mirror device (DMD), and the image focus after the projection lens is set to be on the diffuser plane to obtain the best image resolution. This projection system can accommodate more than one diffuser HOE due to the angular selectivity of HOEs. The multiple image planes are able to construct a 3D image with the proper image content on each plane. [21b] As shown in Figure 7b, diffusers 1 and 2 respond to incoming light with different incident angles and do not interfere with each other. Therefore, diffuser 1 only displays the image from projector 1, and diffuser 2 only displays the image from projector 2. The spatial separation between these two diffusers forms a multiplane display system. Through optimizing the display contents, the light field of the 3D object can be decomposed into two planes.
However, because the image focus is directly on the HOE, the viewer should keep a certain distance (≈1 m) from the HOE to get a good viewing experience. This means the projection system can be used in AR applications for fixed scenes such as exhibitions and commercials. For near-eye displays (NEDs), the combiner is usually close to the viewer's eye, so the projection system is not suitable.

Free-Space Combiner
AR systems with free-space combination are usually intended for NEDs and have been implemented in commercial products, such as Meta 2 and DreamGlass. A key optical element in this system is a partial reflector, which is used not only as a combiner to guide the light to the human eye but also as magnifying optics for additional optical power. Conventional free-space combiners usually use partial mirrors, which leads to a tradeoff between the see-through capability and image brightness.
Alternatively, lens HOEs can also be used as combiners. [19b,c,g,23] In this case, the aforementioned tradeoff in brightness vanishes because only a small portion of the environment light is diffracted by the HOE, whereas most of the display light can be diffracted into the viewer's eye if the Bragg condition is satisfied. As shown in Figure 7c, this type of system is usually pupil-forming, which means it uses relay optics to first relay the original image to an intermediate position and then delivers the relayed image to the viewer's eye with the lens HOE. The image source can be a conventional 2D display or a 3D image source such as a digital holographic display using a spatial light modulator (SLM) and a laser light source. Usually, the virtual image is far from the viewer, which means the relayed image is near the focus of the lens HOE. Because of the diffractive nature of the HOE and the off-axis system configuration, aberrations such as coma and astigmatism are large and need to be tackled with sophisticated optical design. These aberrations depend on the FOV and eye box design.
Although all the previously discussed HOEs (photopolymer, HPDLC, CLCOE) can be used in this system, their differences in angular response should be considered in the system-level design. [19b] In Figure 7d, the black dashed lines indicate the recording beam configuration, with point F being the focal point of the template lens. A local region of a lens HOE can be regarded as a grating and therefore has an angular range within which the incident rays can be efficiently diffracted. Rays within this angular range (red rays) are diffracted with high efficiency and delivered to the eye as intended, whereas rays outside it (blue rays) pass through the HOE without being diffracted. This angular selectivity means that only a portion of the whole FOV can be delivered to the viewer's eye. In this sense, HOEs with a large angular bandwidth, such as PVLs, are more suitable for practical applications.
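The angular-selectivity argument can be illustrated with a small 2D ray sketch. It is only a geometric toy model, not the authors' analysis: the HOE is assumed to lie on the line y = 0, the angular response is assumed to be a hard window of ±2°, and all dimensions (focal distance, aperture, image-point positions) are hypothetical. For each position on the HOE, the code compares the direction of the recording ray from the focal point F with the direction of the playback ray from an image point.

```python
import math

def angular_deviation_deg(hoe_x_mm, focus, point):
    """Angle (degrees) between the recording ray and the playback ray at one HOE position.

    The HOE lies on the line y = 0; `focus` and `point` are (x, y) tuples in mm with y > 0.
    """
    fx, fy = focus
    px, py = point
    recording = math.atan2(hoe_x_mm - fx, fy)  # ray from the focal point F to the HOE
    playback = math.atan2(hoe_x_mm - px, py)   # ray from the image point to the HOE
    return abs(math.degrees(playback - recording))

def delivered_fraction(point, focus, aperture_mm, half_bandwidth_deg, samples=201):
    """Fraction of sampled HOE positions where the ray stays inside the angular window."""
    xs = [-aperture_mm / 2 + aperture_mm * i / (samples - 1) for i in range(samples)]
    inside = sum(
        1 for x in xs
        if angular_deviation_deg(x, focus, point) <= half_bandwidth_deg
    )
    return inside / samples

# Hypothetical numbers, chosen only for illustration.
FOCUS = (0.0, 30.0)    # focal point F, 30 mm from the HOE
POINT_A = (1.0, 30.0)  # image point near F: small angular deviation
POINT_B = (6.0, 30.0)  # image point far from F: large angular deviation

for name, point in (("A", POINT_A), ("B", POINT_B)):
    frac = delivered_fraction(point, FOCUS, aperture_mm=20.0, half_bandwidth_deg=2.0)
    print(f"point {name}: {frac:.0%} of the aperture diffracts efficiently")
```

Under these assumptions, the point near F stays inside the window over the whole aperture, whereas the point far from F falls outside it everywhere. Widening `half_bandwidth_deg` mimics an HOE with a larger angular bandwidth.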
In the pupil-forming system, there is another tradeoff between the FOV and eye box (or exit pupil). As in conventional partial-mirror optical systems, this tradeoff is due to the conservation of étendue, which equals the product of the FOV and eye box. Here we do not consider the influence of an HOE's angular selectivity, which can further reduce the system étendue. The étendue is determined by multiplying the size of the image source (display panel) by the numerical aperture (NA) of the relay system. A larger étendue implies larger optics, which can be problematic for NEDs because of the compromised wearing comfort.
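As a rough numerical illustration of this tradeoff, the sketch below uses a one-dimensional form of the étendue (display width times relay NA) and assumes that it is conserved as eye box width times the sine of the half FOV. The 1D simplification, the function names, and the panel and NA values are assumptions made for illustration and are not taken from the works reviewed here.

```python
import math

def etendue_1d(display_width_mm, relay_na):
    """1D etendue of the image source: panel width times relay numerical aperture."""
    return display_width_mm * relay_na

def eye_box_mm(display_width_mm, relay_na, fov_deg):
    """Eye box width allowed by etendue conservation for a given full FOV (1D model)."""
    half_fov_rad = math.radians(fov_deg / 2)
    return etendue_1d(display_width_mm, relay_na) / math.sin(half_fov_rad)

# Hypothetical 10 mm wide panel and a relay system with NA = 0.2.
for fov in (30, 40, 60, 80):
    print(f"FOV {fov:2d} deg -> eye box {eye_box_mm(10.0, 0.2, fov):.2f} mm")
```

For a fixed panel and relay NA, every increase in FOV shrinks the eye box. In this model the only way to enlarge both is to increase the étendue itself, which means a larger panel or faster (and therefore bulkier) relay optics.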

Integral Imaging Combiner
Integral imaging (InI) is a promising technique to achieve a naked-eye autostereoscopic 3D display. [24] In an InI display, usually a lens array is used to convert the light from display pixels to rays with arbitrary spatial angles. As shown in Figure 8a, to display a virtual 3D object, we can perform a reverse ray trace on the spatial points (A and B) and turn on the corresponding pixels on the display panel. Then the light field on those points can be approximated with discrete emitting rays. The distance between the display panel and lens array is usually equal to the lens focus (case 1) to ensure collimated light after the lens and therefore a large depth of field (DOF). This configuration also produces the maximum view number, which equals the number of discrete rays emitted from each spatial point. The resolution, which is inversely proportional to the view number, is the lowest in this case.
Figure 7. Working principles of AR systems with projection combiner and free-space combiner. a) The projection AR system with one diffuser HOE. b) The projection AR system with two diffuser HOEs and two image projectors. The projectors and HOEs work in separate spatial angles, so there is no crosstalk in between. The images on diffuser HOEs are separate in space and can be used to produce a 3D image. c) Sketch of the free-space system. The relay system relays the original image to near the focus of the lens HOE, which then delivers the image to the viewer's eye. d) Illustration of the influence of angular selectivity on the free-space system. The black dashed lines stand for the recording wavefront of the HOE. Point F indicates the focus. Point A is near the focal point, so the angular deviation is small. Point B is far from the focal point and has a large angular deviation. Rubik's Cube is a registered trademark and used by permission of Rubik's Brand Ltd, www.rubiks.com.
Based on the principle of InI, the lens array can be replaced by a lens-array HOE to form an InI-AR system. [5b,20b,d,21a] A typical configuration of an InI-based AR system is plotted in Figure 8b. The projection system is used to relay the original image from the image source to near the focus of the lens-array HOE, similarly to the free-space combination system. The relayed image then works in the same way as shown in Figure 8a and produces the light field to display 3D virtual objects.
If the distance between the lens array and display is not equal to the lens focal length (case 2), the view number decreases and the image formed by the lens can therefore have a higher resolution. However, the light after the lens array is no longer collimated and has a divergence angle. This means the image has a limited DOF, outside of which the 3D image is significantly blurred. To overcome the DOF issue, laser beam scanning (LBS) can be used. [5b] As shown in Figure 8c, the LBS source generates laser rays with different intensities to show an image. The laser beams can be considered to have a large DOF even after the lens-array HOE. Therefore, the projection system is no longer required and the problem of DOF is also solved. Note that in this case the reference wave in the recording of the lens-array HOE is no longer a plane wave but a spherical wave whose focal point matches the position of the LBS scanning point.
Figure 8. Illustration of the working principles of an InI-based AR system. a) Basic principle of an InI system. The red (blue) pixels on the display panel form divergent rays on spatial point A (B). The whole image content of the virtual 3D object can be rendered similarly. b) System configuration of the InI-AR system based on projection system and lens-array HOE. c) Sketch of the InI-AR system based on LBS image source. Rubik's Cube is a registered trademark and used by permission of Rubik's Brand Ltd, www.rubiks.com.
Although the InI-based AR system can deliver a 3D light field with correct focus cues, it suffers from low resolution. The resolution loss is proportional to the view number of the system, which represents how many discrete rays are emitted from a single spatial point. In the previously discussed cases, the view number equals the lens pitch divided by the display pixel pitch (case 1) or the magnification of the lens array (case 2). [25] The larger the view number, the smoother the light field and thus the more natural the defocus blur, but at the same time the higher the resolution loss. This tradeoff between resolution and view number is inherent because the total information (pixel number) is fixed. Overcoming this issue remains a major challenge for InI systems.
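The fixed information budget can be made concrete with a short calculation for case 1, where the view number is the lens pitch divided by the pixel pitch. The sketch below treats one dimension only, and the panel size and pitches are hypothetical values chosen for illustration.

```python
def view_number_case1(lens_pitch_um, pixel_pitch_um):
    """View number in case 1: display pixels covered by one lens (1D)."""
    return lens_pitch_um / pixel_pitch_um

def pixels_per_view(panel_pixels, view_number):
    """Spatial samples left for each view once the panel pixels are shared among views."""
    return panel_pixels / view_number

PANEL_PIXELS = 1920    # hypothetical panel width in pixels
PIXEL_PITCH_UM = 10    # hypothetical pixel pitch in micrometers

for lens_pitch_um in (50, 100, 200):
    views = view_number_case1(lens_pitch_um, PIXEL_PITCH_UM)
    resolution = pixels_per_view(PANEL_PIXELS, views)
    print(f"lens pitch {lens_pitch_um} um: {views:.0f} views, {resolution:.0f} pixels per view")
```

The product of view number and pixels per view always equals the panel pixel count, so a smoother light field can only be bought with spatial resolution.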

Maxwellian View Combiner
A Maxwellian display, or retinal scanning display, adopts the principle of the Maxwellian view, [26] which directly forms a focus-free image on the retina. As shown in Figure 9a, the collimated image light is focused by a lens, with the focal spot located on the eye lens. Therefore, no matter how the optical power of the eye lens changes, the final image on the retina always stays in focus. Adapting a Maxwellian display to AR is straightforward: the focusing lens in Figure 9a is simply replaced with a lens HOE. [19a,e–g,22b,27] As shown in Figure 9b, the light source (a light-emitting diode (LED) or laser) is collimated with a lens and passes through an amplitude SLM, which is typically an LCOS (the reflective optical layout is omitted here). The modulated light is then focused by the lens HOE and forms the Maxwellian viewing point. Similar to InI-AR, an LBS image source can also be adopted here to simplify the system. As Figure 9c shows, the laser rays generated by the LBS already carry the intensity information of the image, so the lens HOE can directly converge the laser beams to form a Maxwellian view. The lens HOE in this case should be recorded with a spherical reference wave whose focus matches the LBS source.
The FOV of the Maxwellian view is determined by the NA of the lens HOE, which can be pushed to a large value without fundamental limitation. A Maxwellian-AR display with a horizontal FOV as large as 80° has been demonstrated. [27b] The biggest problem of Maxwellian systems, though, is the small eye box. Because the size of the viewing point is generally smaller than the eye pupil diameter (≈4 mm), the eye box size of a single Maxwellian view equals the eye pupil diameter. This small eye box degrades the wearing experience because a tiny misalignment makes the image disappear completely. Several approaches have been proposed to enlarge the eye box. The first is to use a ray duplicator, as shown in Figure 9d. The ray duplicator multiplies the incoming rays, which can be achieved by gratings or geometric optics with partial mirrors. [27a,28] The duplicated rays then form multiple viewing points, which expand the effective eye box. The second approach is to use backlight modulation. As shown in Figure 9e, different light source points (A and B) correspond to different final viewing points. In principle, A and B can be lit simultaneously, [19e] in which case the viewing points are duplicated. It is also feasible to light one source at a time, [19g,27d] which requires eye tracking to determine which light source corresponds to the viewing point within the eye pupil. This way, the system is more energy-efficient. The third approach is to use a multiplexed lens HOE, [22b] as shown in Figure 9f. The light incident on the multiplexed HOE directly generates several viewing points that correspond to the multiple focal points in the recording process.
Although the eye box can be enlarged by these approaches, choosing the separation between viewing points is a tricky issue. Normally, we do not want more than one viewing point within the pupil because that results in a ghost image, which disappears only when the eye focuses at infinity. This can severely compromise the viewing experience. However, when the separation between viewing points is too large, there can be a gap in the eye box where the image completely vanishes, which is also undesirable. One might think the best solution is to set the separation between viewing points equal to the eye pupil diameter. In principle this should resolve both issues, but the diameter of the eye pupil is unfortunately not fixed; it changes in response to the ambient brightness. Thus, achieving a perfect separation of viewing points is challenging.
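The ghosting and gap conditions can be checked with a simple one-dimensional count of how many viewing points fall inside the pupil. This is a minimal sketch under stated assumptions: the viewing points are treated as ideal points on a regular 1D grid, and the separation, pupil diameters, and eye positions are hypothetical values.

```python
def points_in_pupil(pupil_center_mm, pupil_diameter_mm, separation_mm, n_points=5):
    """Count viewing points strictly inside the pupil (1D; points centered on x = 0)."""
    positions = [(k - (n_points - 1) / 2) * separation_mm for k in range(n_points)]
    radius = pupil_diameter_mm / 2
    return sum(1 for p in positions if abs(p - pupil_center_mm) < radius)

def describe(count):
    """Map the number of viewing points in the pupil to the perceived result."""
    if count == 0:
        return "gap (image vanishes)"
    if count == 1:
        return "single image"
    return "ghost image"

SEPARATION_MM = 4.0  # hypothetical spacing between viewing points

# The same separation behaves differently as the pupil constricts or dilates.
for pupil_mm in (2.0, 4.0, 6.0):
    for center_mm in (1.0, 2.0):
        count = points_in_pupil(center_mm, pupil_mm, SEPARATION_MM)
        print(f"pupil {pupil_mm} mm at x = {center_mm} mm: {describe(count)}")
```

With the separation fixed at 4 mm in this toy model, a constricted 2 mm pupil lands in gaps and a dilated 6 mm pupil admits two points, which is the difficulty described above.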

Waveguide Combiner
Waveguide displays have the advantages of a glasses-type form factor and large design freedom to achieve high image performance. [6b,14] Therefore, they are regarded as a promising approach for commercial products. Currently, the commercial products that use this approach include Microsoft HoloLens 1 and 2 and Magic Leap 1. The term "waveguide" here refers to a glass substrate with a thickness of around 1 mm and should not be confused with the conventional term in integrated photonics. Because they can be fabricated as large-angle gratings, HOEs have been widely applied in waveguide displays. [6b,10d,14,18,29] The basic working principle is shown in Figure 10a. The image source is usually a 2D display panel such as an LCOS, DMD, or micro-OLED. The light from the panel is first collimated by a lens system and then diffracted by an incoupler grating. The diffraction angle is large enough to trap the diffracted light in the waveguide by total internal reflection (TIR). Here, both reflective and transmissive gratings can be used, provided that the diffraction angle is large enough and the efficiency is high. After propagating a distance in the waveguide, the light encounters the outcoupler grating, which couples the light out to the viewer's eye. The outcoupler is usually a reflective grating to avoid stray light, as discussed previously.
Because of the multiple outcouplings, the eye box in waveguide displays can be enlarged without sacrificing the FOV, which means the conservation of étendue is broken. However, that also raises the issue of light uniformity and efficiency. Normally, to maintain good uniformity, the grating efficiency should be low so that the light intensity stays relatively invariant across the successive outcouplings. But a relatively low efficiency means the image is not bright enough and is likely to be washed out by the ambient light. Therefore, the tradeoff between uniformity and efficiency is an important aspect to consider in the system design. For a system with moderate efficiency, maintaining good uniformity across the whole FOV and eye box is a challenging engineering problem because of the multiple TIRs and outcouplings. As shown in Figure 10b, for a photopolymer grating, to ensure uniform light output, the efficiency of the outcoupling grating should follow a gradient, with low efficiency at the beginning and high efficiency at the end. [6b,14] Still, when the different propagating angles are also considered, the situation becomes quite complex because the gap between consecutive TIRs differs and the outcoupling position varies with angle. Thus, optimizing the gradient-efficiency distribution can improve the uniformity only to some extent. In addition, fabricating a gradient-efficiency photopolymer grating requires extra exposure and masking steps, which, combined with the angular-multiplexing process, can be quite complicated and costly.
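The need for a gradient can be seen from a simple energy budget. The sketch below assumes a lossless waveguide, a single propagating angle, and a discrete number of encounters with the outcoupler; none of these simplifications comes from the cited works. If a fraction eta_k of the remaining light is coupled out at encounter k, equal outputs require eta_k = p / (1 - k p), where p is the share of the input light coupled out at each encounter and k starts from zero.

```python
def outcoupled_intensities(efficiencies, input_intensity=1.0):
    """Intensity coupled out at each successive encounter with the outcoupler."""
    remaining = input_intensity
    outputs = []
    for eta in efficiencies:
        outputs.append(remaining * eta)
        remaining *= 1.0 - eta
    return outputs

def gradient_efficiencies(n_encounters, total_extraction):
    """Efficiency profile that gives the same output at every encounter."""
    if not 0.0 < total_extraction <= 1.0:
        raise ValueError("total_extraction must be in (0, 1]")
    per_encounter = total_extraction / n_encounters
    return [
        min(1.0, per_encounter / (1.0 - k * per_encounter))
        for k in range(n_encounters)
    ]

N_ENCOUNTERS = 5  # hypothetical number of outcoupling events across the eye box

constant = outcoupled_intensities([0.1] * N_ENCOUNTERS)
graded = outcoupled_intensities(gradient_efficiencies(N_ENCOUNTERS, total_extraction=0.5))

print("constant efficiency:", [round(x, 4) for x in constant])
print("gradient efficiency:", [round(x, 4) for x in graded])
```

The constant profile decays from one encounter to the next, whereas the gradient profile keeps the output flat. Once several propagating angles with different TIR gaps share the same grating, a single profile can no longer be flat for all of them, which is the limitation noted above.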
To use PVG as an outcoupler, a different mechanism called polarization management [6b,29c] can be considered because the fabrication of a gradient-efficiency PVG is difficult. As shown in Figure 10c, the method entails a polarization management layer (PML) at the bottom of the waveguide. The PML is basically a layer of an LC polymer with spatially varying director direction, which can be fabricated by the photoalignment method with a prepatterned mask. Recall that the efficiency of the PVG is extremely sensitive to the input polarization state. With the PML we are able to manage the polarization state after each TIR and optimize the PML to achieve good uniformity. However, similar to the case with gradient-efficiency grating, the optimization across the FOV and eye box still encounters the complex case of different TIR gaps and can only improve the uniformity partially.
The FOVs in waveguide displays are restricted by two limits on the propagating angle in the waveguide. The lower limit is the critical angle of the glass substrate, which is related to the material refractive index $n$ by $\theta_{\min} = \sin^{-1}(1/n)$. A larger refractive index helps to lower this limit. The upper limit comes from system-level design considerations. A large propagating angle means a large gap between two consecutive TIRs and causes severe light nonuniformity. Normally, the propagating angle does not exceed 80°. In a practical design, we also need to consider the angular range of HOEs. As discussed earlier, a photopolymer-based grating usually has an angular range of only several degrees. Therefore, to fully cover the range of light angles in the waveguide, several photopolymer gratings have to be multiplexed to enlarge the angular range. [29b]
Another issue in waveguide displays is the full-color capability. To fully understand the design challenges of full color, it is necessary to review the basics of grating diffraction. When light encounters the periodic index modulation of a grating, it is diffracted into multiple orders, as described by Equation (4). For a fixed grating pitch, the diffraction angle varies with wavelength. If the RGB channels are to be fabricated in a single waveguide through the stacking or multiplexing of RGB gratings, then it is necessary to eliminate the crosstalk between these channels. For example, if the red light is diffracted not only by the red grating but also by the green grating, the two diffraction angles differ, as the preceding analysis shows, and the light diffracted by the green grating becomes stray light, which severely compromises the image quality. To eliminate crosstalk, the reflection bands across the entire FOV have to be separated for the RGB colors. [18a] This is relatively easy to do when the angular bandwidth is small (≈10°). But if a larger FOV is required, then the separation is very difficult because a broader angular bandwidth normally means a broader spectral bandwidth. In this case, more than one waveguide should be adopted, which increases the system volume.
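Both the angular limits and the color dependence can be explored with a few lines of code. The sketch below uses the standard first-order grating equation in a common textbook form, n sin(θ_wg) = sin(θ_air) + λ/Λ, which may differ in notation and sign convention from Equation (4). The refractive indices, wavelengths, and grating pitch are hypothetical, material dispersion is neglected, and the angular bandwidth of the HOE itself is ignored.

```python
import math

def critical_angle_deg(n):
    """Lower limit of the propagating angle: theta_min = asin(1 / n)."""
    return math.degrees(math.asin(1.0 / n))

def in_guide_angle_deg(wavelength_nm, pitch_nm, n, incident_deg=0.0):
    """First-order diffraction angle inside the waveguide, or None if evanescent."""
    sine = (math.sin(math.radians(incident_deg)) + wavelength_nm / pitch_nm) / n
    if abs(sine) > 1.0:
        return None
    return math.degrees(math.asin(sine))

def symmetric_fov_deg(n, max_angle_deg=80.0):
    """Largest FOV in air, centered on normal incidence, for one grating pitch.

    In-guide angles between the critical angle and max_angle_deg map to
    sin(theta_air) between 1 - lambda/pitch and n*sin(max) - lambda/pitch.
    Centering this interval on zero gives sin(FOV / 2) = (n*sin(max) - 1) / 2.
    """
    half_sine = (n * math.sin(math.radians(max_angle_deg)) - 1.0) / 2.0
    return 2.0 * math.degrees(math.asin(half_sine))

for n in (1.5, 1.8):
    print(f"n = {n}: critical angle {critical_angle_deg(n):.1f} deg, "
          f"max symmetric FOV {symmetric_fov_deg(n):.1f} deg")

# One fixed pitch sends R, G, and B to different in-guide angles (normal incidence).
PITCH_NM = 440.0
for color, wavelength_nm in (("R", 635.0), ("G", 532.0), ("B", 457.0)):
    angle = in_guide_angle_deg(wavelength_nm, PITCH_NM, n=1.5)
    print(f"{color} ({wavelength_nm:.0f} nm): {angle:.1f} deg in the waveguide")
```

In this idealized model, a higher-index substrate widens the usable angular window, and a single pitch spreads the three colors over tens of degrees inside the waveguide. Light of one color diffracted by a grating intended for another color would therefore exit at the wrong angle and appear as stray light.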
Finally, all previous discussions on waveguide displays involve only collimated light because only gratings are used as the incoupler and outcoupler. The output image is therefore located at infinity. However, for a vivid AR experience, it is necessary to display 3D virtual objects at finite depth with correct focus cues. This can be achieved by replacing the grating with a lens HOE, such as a photopolymer lens or PVL. As shown in Figure 10d, if the grating HOE is replaced by a lens HOE, the outcoupled light is no longer collimated but divergent. Because the TIR light is collimated and therefore has a fixed incident angle on the lens HOE, all the divergent outcoupled light appears to diverge from a fixed point, which is the focal point of the recording lens if the incident angle is the same as that of the recording plane wave. For other pixels with different incident angles, the focal point shifts in the horizontal direction, forming a focal plane. This way, an image plane with finite focus can be generated. To generate multiple focal planes for a high-fidelity 3D image, however, we again need to stack multiple waveguides with different focuses.
Figure 10. Illustration of the waveguide display. a) Sketch of the basic principle. The red and green rays correspond to different pixels on the display panel. Different incoupling angles result in different propagating angles in the TIR process and therefore different outcoupling angles. b) The method of light-uniformity management with a gradient-efficiency grating. c) The polarization management method with uniform PVG and PML. d) The approach to generate finite depth with a lens HOE.

Conclusion
In this review, we first introduced the fundamentals of intensity holography and polarization holography and then compared the optical properties of the resulting HOEs. Finally, we reviewed the applications of HOEs in various AR systems. Generally, photopolymer-based HOEs have relatively narrow spectral and angular bandwidths, but the capacity to record multiple HOEs in one film can alleviate these issues and enable unique functions such as Maxwellian viewing point duplication and RGB recording in one film. This multiplexing capability will continue to impact AR displays as recording materials are optimized to improve efficiency and image quality. CLCOEs are a recently developed type of HOE; they exhibit the unique properties of polarization sensitivity and broad angular and spectral bandwidths. Their simple fabrication process would help lower the cost. When combined with an active polarization-switching LC device, CLCOEs also have the potential to achieve other novel functionalities, such as a varifocal lens in a very compact form factor.