Towards a solid‐state light detection and ranging system using holographic illumination and time‐of‐flight image sensing

Light detection and ranging (LIDAR) systems are finding their way into an increasing number of applications including autonomous vehicles. Image sensors based on single‐photon avalanche diode (SPAD) arrays have demonstrated great promise for LIDAR, typically using flood illumination or raster‐scanning of a small spot. Holographic illumination, based on phase‐only liquid crystal on silicon (LCoS) spatial light modulators, has been successfully applied to display systems such as automotive head up displays (HUDs). It holds great promise as the method of illumination for a LIDAR system. We present a novel solid‐state LIDAR architecture using holographic projection and single‐photon detection. We report a simple experiment that demonstrates proof of concept. The results of the experiment show that the novel architecture can improve the depth accuracy of 3D imaging compared to conventional LIDAR systems. We also indicate several key performance advantages of the novel architecture that will increase as the system scales to higher performance.


| INTRODUCTION
Light detection and ranging (LIDAR) systems are generally composed of an illumination sub-system and a detection one. Several criteria can be used to categorize each of them.
The illumination may be diffused across the entire scene (flood illumination) or concentrated in arrays of spots that are regular or form patterns (structured illumination). It may be a continuous modulated waveform or a stream of discrete pulses. The hardware may be solid-state or have moving parts. 1–3 Detection can be based on a single-pixel detector or on array-based image sensors. The detector may directly measure the propagation time of pulses of light (direct time of flight [D-ToF]) or may infer the time indirectly based on measurements of the phase delay between the transmission and reception of modulated waveforms (indirect time of flight [I-ToF]). 2,4 Various detector technologies are available, such as avalanche photodiodes (APDs), SPADs, silicon photomultipliers (SiPMs), and complementary metal oxide semiconductor (CMOS) active pixel sensors (APS). 1,5 The optimal approach is highly application dependent. Reported systems display a wide range of approaches. 1,2,5,6 Noteworthy applications are in the automotive industry, where the autonomous driving market is growing rapidly. 7 LIDAR systems are finding applications in technologies for autonomous vehicular capabilities, such as advanced driver assistance systems (ADAS). 2,8
In this paper, we describe a novel system with unique features that we consider advantageous. Our implementation is a solid-state D-ToF LIDAR that features dynamic holographic projection of arrays of spots, with individual control of the power in each spot, and depth-imaging using a CMOS SPAD array. We refer to it as "holographic single-photon LIDAR" (HSP-LIDAR), and we illustrate it diagrammatically in Figure 1. We emphasize that the system reported here is based on existing components and is intended to demonstrate proof of principle. This will support future efforts to quantify the potential of this technology and develop pathways to future systems optimized at a component and system level for a range of applications.
Efficiency is critical when limitations on the available illumination are imposed, such as eye safety considerations in automotive LIDAR. 2 Implementing a SPAD-based LIDAR that makes efficient use of the available illumination poses challenging design trade-offs. Flood illumination is the simplest to implement but is relatively inefficient. The efficiency can be improved by concentrating the light in arrays of spots rather than diffusing it across the entire scene. 1 Conventional systems typically rely on fixed optical elements to illuminate the scene in a pre-defined manner. Sub-optimal performance results when regions of high interest receive equal illumination to ones of less interest. It is highly desirable to illuminate the scene in a non-uniform manner and to do so adaptively to cope with dynamic scenes.
Our projection sub-system is based on a custom reflective phase-only liquid crystal on silicon (LCoS) spatial light modulator (SLM), which is typically used in holographic display systems. 9 We use the SLM as a programmable diffractive optical element (DOE) to dynamically generate structured illumination with high optical efficiency.
The experiments reported here demonstrate the compatibility of illumination patterned by a phase-only LCoS-SLM with detection by CMOS SPADs. The programmable nature of SLMs presents a pathway to using smart illumination schemes using conventional LIDAR detectors. We can realize the benefits of both structured illumination and of D-ToF systems, rather than having to trade-off between the two.
Recent work demonstrated the use of SLMs with APDs in a scanning LIDAR setup. 10 Despite their similarities, the nature of APDs is quite different from that of SPADs. Key advantages of SPADs include their high sensitivity, which enables detection in the single-photon regime, and their ability to be densely integrated in large arrays. 11 Our imaging sub-system is based on the "QuantiCam" 192 × 128 SPAD array, 12 which is used for counting and timing the photons of our laser illumination that return after being reflected from the scene. SPAD arrays show high potential in D-ToF LIDAR systems, thanks to their high sensitivity and temporal precision. 5
FIGURE 1 HSP-LIDAR for holographic illumination
Using SPADs with existing display technologies to improve upon conventional measurement techniques has been reported. 13 Similarly, we argue that the combination of SPAD-based imaging with SLM-based holographic illumination in a single system is a well-matched pairing with significant potential for LIDAR. These devices enable a unique combination of single-photon counting with dynamic structured illumination.
A key performance parameter of LIDAR systems is the accuracy of distance measurements in dynamic scenes, which depends on parameters that vary over time and across the scene such as absolute distance, reflectance, and ambient light. We demonstrate our system's advantage by presenting an early representative application of it in operation. We image a scene with objects of differing reflectance and demonstrate that our system can concentrate the laser illumination on a spot of interest to improve the ToF distance measurement at that location. We show that our system makes efficient use of the available illumination and can achieve more accurate distance measurements, compared to uniform illumination. In this paper, we report an initial open-loop implementation, as a stepping-stone towards an implementation of a truly adaptive closed-loop system that can realize the full benefits that HSP-LIDAR may offer (Figure 2).

| OVERVIEW OF THE PROJECTION SYSTEM
Our projection system is based on a multi-level reflective phase-only LCoS-SLM that is illuminated by collimated light from a pulsed-laser diode. It can project illumination patterns with a resolution of up to 512 × 512. In this work, we limit the size to 128 × 128 to match the number of pixels used in the SPAD image sensor. The frame rate of our SLM can reach 120 Hz, but here we demonstrate the projection of static illumination patterns.
Arrays of spots can be generated by replicating a single laser source using DOEs or by using multiple laser sources, such as vertical cavity surface emitting lasers (VCSELs), arranged in arrays. 1 However, each of these approaches imposes limitations on the LIDAR system.
Conventional DOEs are fixed optical elements. Typical VCSEL arrays may allow some adaptivity, as the output power of the elements can be individually modulated. However, the maximum illumination power of any spot is limited by the VCSEL. Reducing the illumination power of any VCSEL causes a reduction in the total output illumination power.
Our holographic system dynamically projects any pattern that is formatted as a 128 × 128 greyscale 8-bit image. The SLM modulates the laser light, while efficiently maintaining the total output power. 9 Phase-only SLMs require that the image is converted to a phase-only hologram ("kinoform"). This is a Fourier-transform representation of the image that contains all its information in the phase distribution, while the amplitude distribution is fixed to unity. This can be done by applying computer generated holography (CGH) techniques, such as the iterative Fourier-transform algorithm (IFTA). 14 These techniques "transfer" the image information content from the spatial domain to the Fourier domain's phase distribution. The intricacies of this process are beyond the scope of this paper, but the interested reader may refer to relevant literature. 14
Since the SLM is a modulating optical element, it can be combined with any coherent light source. SPAD-based D-ToF typically requires a pulsed illumination source that can generate nanosecond-width pulses and can operate at MHz pulse repetition rates. Our system works at 25 MHz with 1-ns pulses.
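To make the kinoform idea concrete, the following is a minimal sketch of a Gerchberg-Saxton style IFTA, which is one common member of that family and not necessarily the variant used in this work. It assumes NumPy, an idealized SLM with continuous phase and no pixel effects, and single-pixel spots at arbitrary positions. All function names, spot positions, and the example power weights are illustrative assumptions.

```python
import numpy as np

def spot_target(weights, positions, size=128):
    """Build a size x size target intensity image with one bright pixel per spot."""
    target = np.zeros((size, size))
    for weight, (row, col) in zip(weights, positions):
        target[row, col] = weight
    return target

def ifta_kinoform(target_intensity, iterations=50, seed=0):
    """Return a phase-only hologram whose Fourier transform approximates the target."""
    rng = np.random.default_rng(seed)
    target_amp = np.sqrt(target_intensity / target_intensity.sum())
    # Start in the replay (image) plane with the target amplitude and a random phase.
    field = target_amp * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, target_amp.shape))
    for _ in range(iterations):
        # Hologram plane: discard the amplitude, keep only the phase (unit amplitude).
        holo_phase = np.angle(np.fft.ifft2(field))
        # Replay plane: propagate the phase-only hologram forward.
        replay = np.fft.fft2(np.exp(1j * holo_phase))
        # Re-impose the target amplitude, keep the phase produced by the hologram.
        field = target_amp * np.exp(1j * np.angle(replay))
    return holo_phase

def replay_fractions(holo_phase, positions):
    """Fraction of the total replay-field power that lands in each spot."""
    intensity = np.abs(np.fft.fft2(np.exp(1j * holo_phase))) ** 2
    intensity /= intensity.sum()
    return [float(intensity[row, col]) for row, col in positions]

# Hypothetical 2 x 2 grid with 55% of the power requested in the first spot.
positions = ((40, 40), (40, 88), (88, 40), (88, 88))
weights = (0.55, 0.15, 0.15, 0.15)
phase = ifta_kinoform(spot_target(weights, positions))
fractions = replay_fractions(phase, positions)
print([round(f, 3) for f in fractions], round(sum(fractions), 3))
```

The printed fractions show how much of the replay-field power reaches each spot. Any shortfall from the requested weights is power diffracted elsewhere in the replay field, which a real system would need to characterize.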
LIDARs typically operate at infrared (IR) wavelengths. To demonstrate the feasibility and advantages of our novel architecture quickly and conveniently, we have used an existing illumination sub-system designed for HUDs that operates at 645 nm. The same operation and design principles can be applied in future to IR illumination.

| OVERVIEW OF THE DETECTION SYSTEM
The detection system is based on an array of CMOS SPADs (also termed Geiger-mode APDs), configured for D-ToF depth-imaging. D-ToF is based on the simple principle that the speed of light $c$ is constant, so the propagation time of a light pulse depends only on the distance it travels. The light must travel the measurement distance twice (outwards and return), so the distance corresponding to the ToF measurement value $t_{\mathrm{ToF}}$ must be halved to obtain the correct distance $d_{\mathrm{ToF}}$, as shown in Equation (1):
$$d_{\mathrm{ToF}} = \frac{c \, t_{\mathrm{ToF}}}{2} \tag{1}$$
FIGURE 2 The HSP-LIDAR system setup
The key feature of SPADs for time-of-flight imaging is their temporal resolution, which allows the propagation time of photons to be measured at picosecond resolution, equating to ranging with centimeter-scale resolution. 5 Our detector is the "QuantiCam" SPAD-array image sensor that has 33-ps temporal resolution, which equates to 0.5-cm distance resolution, 12 based on Equation (1). The precision of the distance measurements is affected by time jitter inherent to the sensor and is also affected by experimental parameters, such as the duration of the laser pulse, received photon rate from the reflected laser illumination, and ambient noise. 4 Precision can be improved by accumulating distance measurements over multiple laser cycles, based on a technique known as time correlated single-photon counting (TCSPC), as we discuss further later in this section. In this work, under optimum conditions, distance measurement precision is approximately 2 cm.
The QuantiCam device is a 192 × 128 array of SPAD pixels. Each pixel contains a single 14-bit time-to-digital converter (TDC) element that acts as a "stopwatch." In other words, it produces a timestamp value that represents the ToF associated with a photon-detection event, with respect to the moment at which a pulse was emitted from the pulsed-laser projection system.
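As a quick numerical check of the round-trip relation described above (distance equals the speed of light times the time of flight, halved), the short sketch below converts the 33-ps timing resolution into a distance step. The function and variable names are illustrative. The last value is a derived illustration and not a figure from this work: it assumes that one 25-MHz laser period bounds the unambiguous round-trip time.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def tof_to_distance(t_tof_s):
    """Convert a round-trip time of flight (seconds) into a one-way distance (metres)."""
    return C_M_PER_S * t_tof_s / 2.0

# One 33-ps timing step corresponds to roughly half a centimetre.
bin_width_s = 33e-12
distance_step_m = tof_to_distance(bin_width_s)
print(f"distance per timing step: {distance_step_m * 100:.3f} cm")

# Assumption: the round trip must complete within one 25-MHz laser period.
laser_period_s = 1.0 / 25e6
max_unambiguous_m = tof_to_distance(laser_period_s)
print(f"distance covered by one laser period: {max_unambiguous_m:.2f} m")
```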
Each SPAD pixel has its own memory element that can store a single timestamp produced from its TDC. When a pixel has timestamped a photon, it saturates, after which subsequent photons are ignored until the pixel has been reset. Due to this TDC saturation, a sensor "image" is made up of zero or one timestamp values per pixel. The pixels of the whole array are reset simultaneously after the data have been read out to prepare for the next image acquisition.
In high ambient light environments, TDC saturation can cause significant performance degradation because the ambient light noise can trigger most TDCs, causing signal photons to go undetected. To keep things simple, in this work, we use relatively low photon rates to ensure that saturation effects do not influence our results. In use cases where saturation is a concern, sensors with alternative TDC architectures and techniques such as temporal photon windowing can be used that avoid TDC saturation, limit its effect, or compensate for it. 4,15,16 The distance measurements are based on the aforementioned TCSPC technique. We give a brief overview of relevant aspects here, while a detailed explanation can be found in relevant literature. 3,4,17 In a single acquired image from the sensor, the timestamps of some pixels are from photons of the laser illumination (signal detection events), and others are from ambient light or internal SPAD noise (noise detection events). When operating at low photon rates, some pixels may not have a photon-detection event at all, in which case they do not produce a timestamp value.
Acquiring a sensor image with missing pixel timestamps is inefficient. If timestamps for all pixels are required, then additional images must be acquired, resulting in an increased total acquisition time. To avoid this issue, it is typical in TCSPC to emit multiple laser pulses for the acquisition of a single sensor image, which results in an increased probability of a detection occurring. 18 Our system operates in this way. It acquires a single image by being exposed to a pulsed laser operating at 25 MHz during an exposure window of 100 μs.
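The benefit of emitting many pulses per image can be illustrated with simple arithmetic. The sketch below counts the pulses in one exposure and estimates the chance that a pixel records at least one detection. It assumes independent pulses with a fixed per-pulse detection probability; the value used is purely hypothetical and not a measured property of the system, and the function names are illustrative.

```python
def pulses_per_exposure(repetition_rate_hz, exposure_s):
    """Number of laser pulses emitted during one exposure window."""
    return round(repetition_rate_hz * exposure_s)

def detection_probability(per_pulse_probability, n_pulses):
    """Probability of at least one detection, assuming independent pulses."""
    return 1.0 - (1.0 - per_pulse_probability) ** n_pulses

n_pulses = pulses_per_exposure(25e6, 100e-6)   # 25 MHz laser, 100 us exposure
p_single = 0.001                               # hypothetical per-pulse probability
p_image = detection_probability(p_single, n_pulses)
print(n_pulses, round(p_image, 3))
```

Because each pixel stores only one timestamp per image, a higher detection probability also raises the risk of the saturation effects discussed earlier, so the two must be balanced.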
To compensate for noise, we accumulate multiple depth-frames and build a histogram of the SPAD photon event timestamps, for each pixel. Each histogram bin represents a photon timestamp value or equivalently a ToF distance value. Noise from ambient light is typically temporally uncorrelated, so it spreads out across the histogram bins, while the laser photon timestamps concentrate in a few histogram bins and give rise to a peak. The bin location of the peak represents the ToF distance of the target. The size of the peak (photon counts) represents the strength of the signal compared to the noise. Figure 3 shows a typical TCSPC histogram when the number of histogrammed events is large and the signal pulse is relatively strong. The x-axis represents the time bins of the histogram, converted to units of distance using Equation (1). The signal pulse manifests as a peak that is easily discernible from the noise floor. Accurate identification of the peak is straightforward in this case.
FIGURE 3 Typical TCSPC histogram with a strong signal pulse, discernible above the noise floor
Our system uses a simple peak-finding algorithm to identify the location of the signal peak in the TCSPC histogram. It selects the bin with the largest count, after filtering the histogram data with a Gaussian-weighted moving-average filter, whose width is chosen to approximately match the full width at half maximum (FWHM) of the laser pulse peak. Filtering this way is a simple approach to improve detection by suppressing lone noise peaks and generating a prominent peak at dense areas with many counts, which is where the signal pulse location is expected. More advanced algorithms exist to locate the signal pulse that can tolerate more demanding conditions, at the expense of increased complexity. 4,19,20
Figure 4 shows a typical TCSPC histogram under more demanding conditions. The number of histogrammed events is low, and the signal pulse produces a peak that is close to the noise floor.
Differences from the earlier scenario are that some bins have zero counts and there are several bins with the maximum number of counts (in this case five counts). Discerning the signal pulse and selecting the correct bin is challenging.
The Gaussian-weighted moving-average filter is helpful in these circumstances. The filtered output (amber trace) that is displayed in Figure 4 overlaid on the raw data shows only a single peak bin. While our simple peak-finding algorithm fails to detect the correct peak in the raw data and causes a distance measurement error, it succeeds when applied to the filtered data.
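The effect of the filter on a sparse histogram can be reproduced with a few lines of code. This is a minimal sketch of a Gaussian-weighted moving average followed by a largest-bin search, in the spirit of the description above. The kernel truncation at three standard deviations, the zero padding at the histogram edges, and the synthetic histogram are all assumptions made for illustration.

```python
import math

def gaussian_kernel(fwhm_bins):
    """Normalized Gaussian weights whose FWHM is given in histogram bins."""
    sigma = fwhm_bins / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    radius = max(1, math.ceil(3.0 * sigma))  # truncate at about 3 sigma (assumption)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(histogram, kernel):
    """Gaussian-weighted moving average; bins outside the histogram count as zero."""
    radius = len(kernel) // 2
    n_bins = len(histogram)
    smoothed = []
    for i in range(n_bins):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < n_bins:
                acc += weight * histogram[idx]
        smoothed.append(acc)
    return smoothed

def peak_bin(values):
    """Index of the largest value; the first one wins when several bins tie."""
    return max(range(len(values)), key=lambda i: values[i])

# Synthetic sparse histogram: a lone noise spike of 5 counts at bin 5 and a
# broader signal pulse centred on bin 26 whose tallest bin also holds 5 counts.
histogram = [0] * 40
histogram[5] = 5
histogram[12] = 1
histogram[33] = 1
histogram[24:29] = [3, 4, 5, 4, 3]

raw_peak = peak_bin(histogram)
filtered_peak = peak_bin(smooth(histogram, gaussian_kernel(fwhm_bins=3.5)))
print(raw_peak, filtered_peak)
```

On this synthetic data, the raw search returns the lone spike because it is the first bin to reach the maximum count, whereas the filtered search returns the centre of the cluster.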
If the conditions deteriorate further (due to fewer signal events in the histogram or increased noise events), the measurement error increases. When the peak is close to, but remains discernibly above the noise floor, the error remains relatively low. As the peak sinks further into the noise floor, the error increases rapidly, because a random peak across the bin range is selected by the peak detection algorithm.
It is worth observing that when the signal peak is prominent such as that depicted in Figure 3, increasing the laser signal strength further does not improve the distance measurement. In fact, the measurement can tolerate a reduction in signal strength with its location accuracy being unaffected.

| IMPROVING TOF-ACCURACY IN A REGION OF INTEREST BY HOLOGRAPHIC LIGHT REDISTRIBUTION
We choose an illumination grid with a small number of spots to simplify this proof-of-principle experiment. Specifically, here we generate illumination patterns composed of four spots in a 2 × 2 grid arrangement. We arbitrarily select spot (1,1) in the grid and gradually shift the illumination power from the other spots into the selected one. We scan the illumination intensity in the individual spot of interest from 0% to 100% in linear steps of 2.5%, resulting in 41 illumination distribution patterns (numbered 0 to 40).
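The sweep described above can be written down as a list of per-spot power fractions. The sketch below assumes that the power not assigned to spot (1,1) is shared equally among the other three spots, which is an assumption, as is the convention that spot (1,1) has index 0. The function name is illustrative.

```python
def sweep_patterns(n_patterns=41, n_spots=4, target_index=0):
    """Per-spot power fractions for each pattern in the sweep (pattern 0 to 40)."""
    patterns = []
    for k in range(n_patterns):
        target_fraction = k / (n_patterns - 1)          # 0%, 2.5%, ..., 100%
        other_fraction = (1.0 - target_fraction) / (n_spots - 1)  # equal share (assumption)
        fractions = [other_fraction] * n_spots
        fractions[target_index] = target_fraction
        patterns.append(fractions)
    return patterns

patterns = sweep_patterns()
print(len(patterns))   # number of patterns in the sweep
print(patterns[10])    # pattern 10: uniform illumination, 25% per spot
print(patterns[40])    # pattern 40: all power in spot (1,1)
```

Under this assumption, pattern 10 corresponds to the uniform illumination point and pattern 40 to the case where all power is concentrated in a single spot.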
We generate the intensity profile of the phase-only holograms to investigate how accurately they represent the desired illumination patterns. Figure 5 plots the fractional power programmed in each spot (blue), the fractional power measured in each spot (red), and the modulus of the difference (amber). Some minor non-linearities are discernible across the sweep. These artifacts most likely arise due to the high level of regularity of the input images. These 2 × 2 spot patterns are regular patterns with highly non-random phase, which is known to affect the convergence performance of IFTA. 14
We project the illumination patterns onto a large white screen that has a small low-reflectance "black tape" target positioned at its center, as shown in Figure 6. Spot (1,1) of the hologram pattern is projected on the low-reflectance target. The remaining three spots are projected on the large white screen. We capture multiple depth-images with the SPAD array for each linear step of the illumination intensity, for subsequent analysis.
FIGURE 4 Typical TCSPC histogram with a weak signal pulse, annotated with a Gaussian-weighted moving-average filter output
In order to obtain absolute measurements of the illumination intensity of projected spots, we place a power meter in the scene, such that it measures spots of interest and then repeat the projection of each pattern. Figure 7 shows the power meter measurements.
The illumination power in spot (1,1) increases with relatively small deviations from true linearity, as expected from the intensity profile results of the phase holograms (Figure 5). Concurrently, the illumination power in the remaining spots reduces, but the sum of the power in all four spots remains approximately constant. This verifies that the total power is redistributed among the spots.
FIGURE 6 The scene of a white screen with a "black tape" target, illuminated by a 2 × 2 grid of spots, at 10-cm separation
FIGURE 7 Spot power measurements as a function of hologram power in spot (1,1)
The power meter measurements of Figure 7 describe the illumination power emitted by the projection system, towards the scene. On the other hand, the SPAD array detections depend on the power returned from the scene, so they vary based on the reflectance of objects in the scene. As an example of imaging the illumination reflected from the scene, Figure 8 shows a typical intensity image from the QuantiCam SPAD array, where each pixel value is the number of photons detected. This example uses a hologram pattern where 75% of the illumination intensity is concentrated in spot (1,1). Spot (1,1) appears dimmer than the others in the figure due to the lower reflectance of the "black tape" target that it is projected on.
The purpose of concentrating the illumination on the low-reflectance target spot is to improve the accuracy of its ToF distance measurement. It is appealing to be able to do so while simultaneously maintaining, as far as possible, the accuracy in the distance measurements of the remaining spots.
We generate histograms from SPAD depth-frames and analyze the data corresponding to each spot separately. In each case, we investigate the mean of the counts in the peak bin and the standard deviation of the location of the histogram peak bin. We make the measurements in a dark (low noise) lab environment and add artificial noise for the analysis. This ensures that the effect of TDC saturation and SPAD dead-time is kept low, while allowing a controllable amount of noise. Figure 9 shows the mean of the counts in the peak bin, which gives an indication of signal strength as the illumination power incident on spot (1,1) varies. Figure 10 shows the standard deviation of the location of the peak bin, which indicates the accuracy of the distance measurements. Figure 9 shows that as we progressively shift the illumination power into spot (1,1), the photon count increases in spot (1,1) and reduces in the rest. Figure 10 shows that as this occurs, the standard deviation of the distance measurement reduces for spot (1,1) and increases for the rest. For all spots, the distance measurement accuracy is not affected much by a reduction of illumination, until the counts decrease below a noise-dependent level. Then, uncertainty grows rapidly. By comparing Figures 9 and 10, we can identify the count of signal photons required for successful detection. In the case of all four spots, detection is successful when the signal photon counts are approximately 95 or higher.
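The two statistics used above, the mean count in the peak bin and the standard deviation of the peak-bin location, can be illustrated with a small Monte Carlo sketch. It is a heavily idealized model and not the analysis pipeline of this work: all signal counts fall in a single bin, the artificial noise is uniform over the bins, and the peak is taken from the raw histogram. The bin count, signal bin, noise level, and trial count are hypothetical.

```python
import random
import statistics

def simulate_peaks(signal_counts, noise_counts, n_bins=200, signal_bin=120,
                   trials=50, seed=1):
    """Return (peak-bin locations, peak-bin counts) over repeated noisy histograms."""
    rng = random.Random(seed)
    locations, counts = [], []
    for _ in range(trials):
        histogram = [0] * n_bins
        histogram[signal_bin] += signal_counts        # idealized signal: one bin
        for _ in range(noise_counts):
            histogram[rng.randrange(n_bins)] += 1     # uniform artificial noise
        peak = max(range(n_bins), key=lambda i: histogram[i])
        locations.append(peak)
        counts.append(histogram[peak])
    return locations, counts

strong_locations, strong_counts = simulate_peaks(signal_counts=100, noise_counts=400)
weak_locations, weak_counts = simulate_peaks(signal_counts=0, noise_counts=400)

strong_std = statistics.pstdev(strong_locations)
weak_std = statistics.pstdev(weak_locations)
print(statistics.mean(strong_counts), strong_std)
print(statistics.mean(weak_counts), weak_std)
```

When the signal stands well above the noise, every trial selects the same bin and the standard deviation collapses. Without a signal, the selected bin wanders across the whole range, mirroring the rapid growth in uncertainty described above.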

| ANALYSIS OF THE RESULTS
It is insightful to use a threshold-based "pass/fail" metric for interpreting the results based on whether each individual distance measurement is deemed "sufficiently" accurate. Figure 11 shows the fraction of detection events that are deemed "successful". It also shows the mean "success" fraction across all spots, which provides a simple, clear, and actionable performance metric.
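A threshold-based metric of this kind can be sketched as follows. The tolerance, the reference distance, and the example measurements are hypothetical, since the threshold used for Figure 11 is not stated here, and the function name is illustrative.

```python
def success_fraction(distances_m, true_distance_m, tolerance_m):
    """Fraction of distance measurements within the tolerance of the true distance."""
    passes = sum(1 for d in distances_m if abs(d - true_distance_m) <= tolerance_m)
    return passes / len(distances_m)

# Hypothetical repeated measurements for two spots at a true distance of 1.00 m.
measurements = {
    "spot (1,1)": [1.00, 1.01, 0.99, 1.50],   # one outlier from a misplaced peak
    "spot (1,2)": [1.00, 1.00, 1.00, 1.00],
}
tolerance_m = 0.02  # hypothetical pass/fail threshold

per_spot = {name: success_fraction(values, 1.00, tolerance_m)
            for name, values in measurements.items()}
mean_success = sum(per_spot.values()) / len(per_spot)
print(per_spot, mean_success)
```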
We can relate the results here to our earlier discussion of TCSPC histograms. We interpret the light redistribution results of Figure 10 by bearing Figures 3 and 4 in mind. When spot (1,1) is given a low fraction of the illumination power, the histogram peak-finding algorithm is unable to distinguish the signal pulse from the noise floor and frequently fails to locate it, resulting in a high standard deviation in the distance measurement. As the illumination power is transferred into spot (1,1) from the remaining spots, the signal pulse becomes discernible. This happens at approximately 45% of the illumination power. Further increasing power does not result in a noticeable improvement.
We turn our attention to the remaining spots, whose illumination power is reduced along the sweep. They tolerate a higher reduction in illumination power compared to spot (1,1), with marginal consequences in the detection performance, because of their higher reflectance. As power is transferred away from them and into spot (1,1), their signal pulses remain detectable until a significant power reduction is reached.
A key observation lies in Figures 10 and 11 in the region where spot (1,1) contains between approximately 45% and 80% of the illumination power. In this region, the distance estimation is at its most accurate for all four spots (standard deviation less than 2 cm). Here, our non-uniform illumination maximizes the number of spots that have a reliable distance measurement.
We can compare the performance of our HSP-LIDAR system with equivalent Flash LIDAR and Scanning LIDAR, using the same results. A typical Flash LIDAR operates at the uniform illumination point, as depicted in Figure 10, where each of the four spots would receive 25% of the available power, thus resulting in an inaccurate measurement. Our system outperforms this when it distributes the illumination optimally.
A typical scanning LIDAR concentrates 100% of the illumination power in a single spot. In Figure 10, this is the point on the far right for spot (1,1) and on the far left for the remaining spots. All measurements would be accurate, but illumination is used very inefficiently, since each spot receives far more illumination than needed for an accurate measurement. In addition, a scanning LIDAR requires a separate measurement per spot, which means four separate measurements in this case. In comparison, our system requires only a single measurement. It should be noted that while a lower number of measurements is beneficial, the scene imaging rate depends on the time for each individual measurement, which can vary significantly between different applications and LIDAR implementations. 2,21 A limit to the update rate of the illumination pattern in our approach is set by the modulation speed of the SLM. Recently, LCoS SLMs with sub-millisecond refresh rates have been demonstrated. 22,23
The above observations highlight the "in principle" performance advantage of our flexible illumination system over typical alternatives. It is worth noting that our system is scalable. In fact, our performance advantage is more pronounced for grids larger than the 2 × 2 one used here.
In the case of Flash LIDAR with uniform illumination, a larger grid means that spots receive an even smaller fraction of the available power. In Figure 10, this would push the uniform illumination operating point further to the left, where spot (1,1) measurements are inaccurate.
In the case of scanning LIDAR, while measurements remain accurate, scaling up the grid size requires a quadratic increase in the number of measurements. Scaling up a grid by a factor of $m$, from $N \times N$ spots ($N^2$ measurements) to $(mN) \times (mN)$, requires $m^2 N^2$ measurements. In contrast, as our system scales up in grid size, the required number of measurements remains one.
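The scaling argument can be tabulated with simple arithmetic. The sketch below assumes one measurement per spot for scanning, an equal power share per spot for flood illumination, and a single measurement for the holographic approach, as described above. The function names are illustrative.

```python
def scanning_measurements(n, m=1):
    """Measurements needed to scan an (m*n) x (m*n) grid one spot at a time."""
    return (m * n) ** 2

def uniform_fraction_per_spot(n, m=1):
    """Share of the total power each spot receives under uniform illumination."""
    return 1.0 / (m * n) ** 2

HOLOGRAPHIC_MEASUREMENTS = 1  # a single measurement, independent of grid size

for m in (1, 2, 4):
    print(2 * m, scanning_measurements(2, m), uniform_fraction_per_spot(2, m),
          HOLOGRAPHIC_MEASUREMENTS)
```

For the 2 × 2 grid used here, scanning needs four measurements and uniform illumination gives each spot 25% of the power. Scaling the grid by a factor of four raises these to 64 measurements and roughly 1.6% per spot.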
FIGURE 11 Fraction of successful signal-pulse location detections

| CONCLUSION AND IMPACT
We have demonstrated a novel LIDAR system that can dynamically adjust structured illumination patterns to illuminate and image a scene efficiently. To our knowledge, our HSP-LIDAR system is unique in combining holographic structured illumination and single-photon counting D-ToF imaging technologies.
Our detection system benefits from using SPAD arrays that are based on CMOS image sensor manufacturing processes. The rapid pace of development of the field will enable huge performance improvements and will allow future systems to use custom SPAD designs, with integrated electronics optimized for the architecture. 3,24 The patterns are projected as phase-only holograms based on diffraction, so the total output illumination power is maintained for all patterns. Light is not lost when the pattern changes. It is redistributed for each new frame.
Our system can generate virtually any structured illumination pattern. It holographically "redirects" the illumination from regions of the scene where it is not required and delivers it to where it is needed. It can concentrate all the illumination power in a single spot (like raster-scan illumination in typical Scanning LIDAR), it can project it uniformly (like flood illumination in typical Flash LIDAR), or it can operate anywhere in between. This flexibility creates huge potential in LIDAR.
We envisage a future system that will dynamically concentrate the illumination in regions of interest. For instance, in automotive applications, interesting regions may be pedestrians in proximity or road hazards. Starting with a scene-dependent illumination pattern, our system will interrogate the whole scene, identify points of interest, and efficiently guide more of the laser light to those points. It will repeat this process continuously and adapt to changes in the scene.
The versatility and programmability of our system point towards smart illumination schemes. Paths forward are likely to incorporate scene-aware illumination, neural network-based adaptive imaging, object detection, and tracking.
This report demonstrates proof-of-principle results using our novel system architecture. Improved performance can be obtained by optimizations at a component level and at a system level. Recent reports on optimizing LCoS-SLMs 25 for LIDAR and, separately, SPADs 26 for LIDAR, suggest that these technologies have a promising future in the field. The architecture we propose will benefit from these improvements.