Universal Approach for Calibrating Large‐Scale Electronic and Photonic Crossbar Arrays

Analog electronic and photonic crossbar arrays have been emerging as energy‐efficient hardware implementations to accelerate computationally intensive general matrix–vector and matrix–matrix multiplications in machine learning (ML) algorithms. However, the inevitable nonuniformity in large‐scale electronic and optoelectronic devices and systems prevents scalable deployment. Herein, a calibration approach is reported that enables accurate calculations in crossbar arrays despite hardware imperfections. This approach is experimentally validated in a small‐scale free‐space photonic crossbar array based on cascaded spatial light modulators, and its scalability and universality are demonstrated in various large‐scale electronic and photonic crossbar arrays. The improved performance of calibrated crossbar arrays in ML model inference is further demonstrated by classifying images of handwritten digits.


Introduction
The widespread application of machine learning (ML) algorithms, such as computer vision, [1,2] the discovery of new materials and molecules, [3,4] and chip design, [5] calls for energy-efficient hardware to accelerate the most computation-intensive matrix-vector multiplication (MVM) operations in these algorithms. New processors leveraging analog approaches are promising low-energy alternatives to their digital counterparts. [6] For example, in-memory computing with electronic crossbar arrays of nonvolatile memories, such as those made from resistive switching materials [7][8][9][10] and phase-change materials, [11][12][13][14] has attracted extensive research interest in disciplines ranging from materials to devices and systems. Furthermore, photonic architectures have recently emerged as a new hardware platform to accelerate MVM operations in a highly parallel and energy-efficient manner, thanks to the parallelism and multiplexing of photons and nearly zero static energy consumption. Both two-dimensional (2D) photonic integrated circuits [15][16][17][18][19][20] and three-dimensional (3D) free-space optical systems [21][22][23][24][25][26] have been demonstrated. For example, photonic crossbar arrays incorporating electro-optic reconfigurable units with nonvolatile materials [20,27,28] can enable photonic in-memory computing.
In addition to successful system demonstrations, material and device innovation keeps advancing the frontier of emerging high-performance ML hardware accelerators. For example, the unique electronic and optoelectronic properties of 2D materials and their heterostructures have enabled a variety of electronic and photonic memristors with novel synaptic functionalities, [29][30][31][32][33][34] as well as compact high-speed electro-optic modulators. [35,36] However, the biggest challenge associated with these nanomaterials and their devices is the inevitable nonuniformity and imperfection of devices in large-scale arrays. For specific ML tasks such as the classification of handwritten digits, the practical system can be incorporated into the training process so that the model becomes aware of real hardware physics and noise for accurate deployment. [37] However, it is still unclear whether and how intrinsically imperfect, novel nanomaterial-based large-scale hardware accelerators, such as crossbar arrays, can be deployed so that they are mathematically isomorphic to general MVM operations in ML algorithms and other applications, such as combinatorial optimization. [38]
Here, we report a universal approach to calibrating large-scale electronic and photonic crossbar arrays for accurate mathematical calculations with imperfect hardware components. Instead of directly training physical hardware for specific ML models, the developed calibration algorithm can achieve the operation-by-operation mathematical isomorphism of MVM operations in an imperfect crossbar array. In addition to a small-scale proof-of-concept experimental verification in a free-space photonic crossbar array, we demonstrate that the developed algorithm is not only scalable to large arrays but also universally applicable to general analog electronic and photonic crossbar arrays. Furthermore, we deploy the calibrated crossbar arrays to an ML model for classifying handwritten digits in the Modified National Institute of Standards and Technology (MNIST) dataset and show that the calibrated hardware can achieve high accuracies that are close to those obtained with general-purpose graphics processing units (GPUs). Our results offer a new strategy for accurately performing calculations with intrinsically noisy analog ML hardware accelerators.
DOI: 10.1002/aisy.202300147

Results
Figure 1a illustrates the general architecture of analog crossbar arrays to perform MVM operations, using the multiplication of a 4 × 1 input vector and a 4 × 4 matrix as an example. Physical carriers, such as electric voltage and current in electronic crossbar arrays and light intensity and photocurrent in photonic crossbar arrays, are selected to represent mathematical information. Vector and matrix encoders map the mathematical elements of the input vector and matrix to externally controllable physical quantities, such as the conductance of resistive switching components and the transmittance of reconfigurable electro-optic components. The output from the input vector encoders is fanned out to a 2D array of matrix encoders. After passing through both vector and matrix encoders, the selected physical carrier is modulated by the product of the physical quantities represented by both encoders. Each line of output from the 2D matrix encoders is then combined and summed on an array of output decoders, which convert the physical output to mathematical information.
Figure 1b,c illustrates two examples of electronic and photonic crossbar arrays. In an electronic crossbar array of size $n \times m$ shown in Figure 1b, a list of input voltages ($V_1$ to $V_m$) that encode input vector elements is applied across $m$ parallel electrical wires. The $n$ parallel output electrical wires are orthogonal to the input wires. An array of resistive switching devices sits at the intersections of input and output wires. The conductance ($G_{ij}$) of each device is tunable with external control. The current on each output wire is thus proportional to the sum of the products of input voltages and the corresponding tunable conductances, which follows MVM operations. Similarly, in an integrated photonic crossbar array in Figure 1c, the input vector is encoded into a list of light intensities ($I_1$ to $I_m$) propagating in $m$ optical waveguides. For each of these $m$ waveguides, an array of splitters is carefully designed to equally distribute light into $n$ output waveguides, [20] which are further connected with an array of electro-optic reconfigurable components with tunable transmittance ($T_{ij}$). The photocurrent at the end of each output waveguide is also proportional to the sum of the products of input light intensities and tunable transmittances, which follows MVM operations.
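The ideal readout described above can be written as a short sketch. It assumes a perfectly uniform array in which output wire $i$ collects $\sum_j G_{ij} V_j$; the function name `crossbar_mvm` and the numerical values are hypothetical and serve only as an illustration.

```python
def crossbar_mvm(conductance, voltages):
    """Ideal crossbar readout: current on output wire i is sum_j G[i][j] * V[j]."""
    if any(len(row) != len(voltages) for row in conductance):
        raise ValueError("each conductance row needs one entry per input wire")
    return [sum(g * v for g, v in zip(row, voltages)) for row in conductance]


# Hypothetical 2 x 3 array (n = 2 output wires, m = 3 input wires), arbitrary units.
G = [[1.0, 2.0, 0.5],
     [0.0, 1.0, 4.0]]
V = [2.0, 1.0, 0.5]
currents = crossbar_mvm(G, V)
```

The same expression applies to the integrated photonic array if the conductances are replaced by transmittances $T_{ij}$ and the voltages by intensities $I_j$.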
For the experimental demonstration of a physical crossbar system, we construct a free-space 2 × 2 photonic crossbar array implemented through cascaded spatial light modulators (SLMs) and a camera, as shown in Figure 1d. The first SLM is a reflective SLM (R-SLM) and encodes the information of a 2 × 1 input vector $v$. The second SLM is a transmissive SLM (T-SLM) and encodes the information of a 2 × 2 matrix $W$. The spatial modulation of the R-SLM and T-SLM occurs during light reflection and transmission, respectively. The driving voltage of each modulation unit is generally denoted as "gray level" or "grayscale" with 0 representing the minimum voltage, 1 representing the maximum voltage, and any bit-quantized values between 0 and 1 representing intermediate voltages. To minimize crosstalk, groups of electro-optic modulator pixels on both SLMs (e.g., 60 × 60 pixels on the R-SLM), which are defined as active regions, are tunable for encoding vectors and matrices. The gray levels of pixels within the same active region are changed together. Active regions are separated by nonactive pixels, whose gray levels are kept constant for minimum transmittance. Furthermore, an accurate active alignment process is employed to identify the corresponding active regions on both SLMs and the camera, which further helps reduce the crosstalk between active regions; see Experimental Section for a more detailed description of the experimental setup. The elements $v_1$ and $v_2$ of $v$ are physically represented by the electrically controllable optical transmittances $T_{v,11}$ and $T_{v,12}$ of a row of two active regions in the R-SLM. The same information of $v_1$ and $v_2$ is also physically represented in the second row of two active regions ($v_{21}$ and $v_{22}$). The regions in the same column have the same gray levels. This replication of a 1D vector across a 2D SLM removes the need for optical fan-out components, such as lenses, and facilitates optical alignment. Similarly, all four matrix elements $w_{11}$, $w_{12}$, $w_{21}$, $w_{22}$ in $W$ are physically represented by the optical transmittances $T_{w,ij}$ ($i, j = 1, 2$) of active regions in the T-SLM. When the collimated incoherent light sheet passes through the cascaded SLMs, the output light intensity is proportional to the product of the optical transmittances of the corresponding modulators, which fulfills the multiplication in MVM operations. A camera at the end captures the generated image, and the summation in MVM operations is done electronically by adding the corresponding readings from camera pixels. Specifically, the sum of camera readings on the first row is proportional to $w_{11}v_1 + w_{12}v_2$, which is the first element ($o_1$) of the output vector $Wv$.
To represent the bipolar elements of vectors and matrices with positive physical quantities (e.g., voltage, current, transmittance) in physical crossbar arrays, each element $v_i$ and $w_{ij}$ ($i, j = 1, 2$) can be represented as the difference of two positive values, so that the bipolar output vector can be obtained from MVM operations on positive quantities. We consider all elements in $W$ and $v$ to be in the range $[-1, 1]$, and the encoding of elements is based on the tunable ranges of vector and matrix encoders. In a perfect system, the responses of all vector and matrix encoders are the same, and the interconnects distribute and collect signals equally and with integrity across rows and columns. For example, the input voltage $V_1$ in Figure 1b is assumed to be the same on the tunable conductances $G_{11}, G_{21}, \ldots, G_{n1}$; the input light intensity $I_1$ in Figure 1c is assumed to be the same on the tunable transmittances $T_{11}, T_{21}, \ldots, T_{n1}$; and the input light sheet is assumed to be uniform in our prototype free-space photonic crossbar in Figure 1d. Furthermore, the response curves of tunable conductance and transmittance for each practical reconfigurable electronic and photonic component are assumed to be the same.
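One way to realize the difference-of-two-positive-values idea is sketched below. It is an assumption for illustration, not necessarily the exact encoding used in the experiment: each signed number is split into a non-negative positive part and a non-negative negative part, four non-negative MVMs are evaluated, and their signed combination recovers $Wv$. All function names are hypothetical.

```python
def split_bipolar(x):
    """Write x as (pos, neg) with pos, neg >= 0 and x = pos - neg."""
    return (x, 0.0) if x >= 0 else (0.0, -x)


def unipolar_mvm(matrix, vector):
    """MVM restricted to non-negative entries, as a physical array would see it."""
    assert all(e >= 0 for row in matrix for e in row) and all(e >= 0 for e in vector)
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def bipolar_mvm(W, v):
    """Combine four non-negative MVMs into the signed product W v."""
    Wp = [[split_bipolar(w)[0] for w in row] for row in W]
    Wn = [[split_bipolar(w)[1] for w in row] for row in W]
    vp = [split_bipolar(e)[0] for e in v]
    vn = [split_bipolar(e)[1] for e in v]
    pp, nn = unipolar_mvm(Wp, vp), unipolar_mvm(Wn, vn)
    pn, np_ = unipolar_mvm(Wp, vn), unipolar_mvm(Wn, vp)
    # (Wp - Wn)(vp - vn) = Wp vp + Wn vn - Wp vn - Wn vp
    return [a + b - c - d for a, b, c, d in zip(pp, nn, pn, np_)]


W = [[0.5, -1.0], [-0.25, 0.75]]
v = [1.0, -0.5]
signed = bipolar_mvm(W, v)
direct = [sum(w * e for w, e in zip(row, v)) for row in W]
```

Here `signed` and `direct` agree, which shows that hardware limited to positive quantities can still produce a bipolar output vector.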
However, the assumption that devices and interconnects behave uniformly and ideally is difficult to satisfy in large-scale systems. For example, in the 2 × 2 crossbar in Figure 2a, when only the vector encoder for $v_1$ is tuned and all other encoders are fixed, the output readouts for $o_1$ and $o_2$ can have different signal scales and shapes because of imperfect interconnects. The exact sources of such imperfections depend on the specific type of crossbar array being considered, such as a nonuniform input beam profile and modulation unit insertion loss in the SLM-based free-space photonic crossbar array, nonuniform conductance of memristors in the electronic crossbar array, and the deviation of the splitting ratio of manufactured splitters from design and nonuniform modulation unit insertion loss in the integrated photonic crossbar array. Such imperfections can lead to inaccurate vector encoding. In addition, the tunable ranges for the pairs of vector and matrix encoders that contribute to the same output readout can differ because of the nonuniformity in encoder responses. For example, there are two pairs of encoders, ($v_1$, $w_{11}$) and ($v_2$, $w_{12}$), for the $o_1$ output readout. Since the vector and matrix encoders are controlled separately, the output readout from the encoder pair for $v_1$ and $w_{11}$ can be expressed as a variable-separable function $v(x)w(y)$, where $x$ and $y$ represent the external stimuli. Although a full characterization of $v(x)w(y)$ to perform calculations using the bipolar representation generally requires a 2D parameter sweep, the separation of variables reduces the characterization to four linear sweeps. Specifically, they are the sweep of $v(x)$ when $w(y)$ is at its minimum ($w_{\min}$), the sweep of $v(x)$ when $w(y)$ is at its maximum ($w_{\max}$), the sweep of $w(y)$ when $v(x)$ is at its minimum ($v_{\min}$), and the sweep of $w(y)$ when $v(x)$ is at its maximum ($v_{\max}$). As shown in Figure 2b, these four sweeps (two red and two blue arrowed lines) enclose a 2D surface. The four vertices of the surface, $w_{\max}v_{\max}$, $w_{\max}v_{\min}$, $w_{\min}v_{\max}$, and $w_{\min}v_{\min}$, define the tunable range $F_{R1} = w_{\max}v_{\max} + w_{\min}v_{\min} - w_{\max}v_{\min} - w_{\min}v_{\max} = (w_{\max} - w_{\min})(v_{\max} - v_{\min})$, which encodes the mathematical information of $v_1 = 1$ and $w_{11} = 1$ under the bipolar representation. Other $v_1 w_{11}$ values are represented proportionally as $v_1 w_{11} F_{R1}$. Similarly, there is a tunable range $F_{R2}$ for $v_2 w_{12}$, which is not necessarily the same as $F_{R1}$. The difference between $F_{R1}$ and $F_{R2}$ can lead to ambiguity when converting the physical quantities from the output readout to mathematical values. Ultimately, all imperfections in hardware systems can lead to inaccurate MVM operations.
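The tunable range can be computed directly from the four corner readouts. The sketch below uses made-up corner values for two encoder pairs that feed the same output, to show how $F_{R1}$ and $F_{R2}$ can differ; the function name `tunable_range` is hypothetical.

```python
def tunable_range(v_min, v_max, w_min, w_max):
    """F_R from the four corner readouts of a separable response v(x) * w(y)."""
    return w_max * v_max + w_min * v_min - w_max * v_min - w_min * v_max


# Hypothetical corner values for two encoder pairs feeding the same output o_1.
F_R1 = tunable_range(v_min=0.1, v_max=0.9, w_min=0.05, w_max=0.85)
F_R2 = tunable_range(v_min=0.2, v_max=0.8, w_min=0.10, w_max=0.70)

# The four-corner sum factorizes into (w_max - w_min) * (v_max - v_min).
factorized_1 = (0.85 - 0.05) * (0.9 - 0.1)
```

With these numbers the first pair spans 0.64 readout units and the second only 0.36, so the same physical readout would map to different mathematical values on the two columns unless a common reference is chosen.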
We develop a universal calibration approach that utilizes only the existing output readout instead of creating additional measurement probes at all grid points, whose number is proportional to the square of the vector length. Right after the manufacturing of crossbar arrays, we first perform the four sweeps described above for each pair of vector and matrix encoders, so that we obtain $4 \times N^2$ calibration curves for an $N \times N$ crossbar array. These curves are stored as a lookup table in memory space proportional to $N^2$ and then accessed to determine the driving signals of vector and matrix encoders during MVM calculations. Note that access to a lookup table can be very efficient, for example by using hash tables with $O(1)$ complexity. Thus, the additional overhead introduced by accessing the lookup table of calibration curves during MVM operations is negligible. We then select the minimum tunable range along the same row (e.g., $\min(F_{R1}, F_{R2})$ for $o_1$) to represent the mathematical value one. The tunable ranges of other columns are clipped to this minimum value, and other mathematical numbers are also proportionally scaled based on this minimum value. Thus, any mathematical value can be physically represented in all columns along the same row, and the output readout can be converted to numbers unambiguously.
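The row-wise choice of a common reference can be sketched as follows. The table `F` of tunable ranges and the helper names are hypothetical, and the example only illustrates the scaling logic, not the full lookup-table machinery.

```python
def row_reference_ranges(tunable_ranges):
    """For each output row, pick the smallest tunable range as mathematical 'one'."""
    return [min(row) for row in tunable_ranges]


def readout_to_value(readout, reference_range):
    """Convert a differential physical readout into a mathematical number."""
    return readout / reference_range


# Hypothetical F_R values for a 2 x 2 array: F[i][j] belongs to the pair (v_j, w_ij).
F = [[0.64, 0.36],
     [0.50, 0.55]]
ref = row_reference_ranges(F)            # one reference per output row

# Target products w_0j * v_j for row 0, each within [-1, 1].
products = [0.5, -0.25]
# After clipping, every column in the row represents its product on the same scale.
physical = [p * ref[0] for p in products]
o_1 = readout_to_value(sum(physical), ref[0])
```

Because every column in row 0 is limited to the smallest range (0.36 here), the summed readout divides cleanly by one reference and yields the expected 0.25.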
Furthermore, for a given input vector element, such as 0.5, we determine input driving signals based on the obtained modulation curves from both $o_1$ and $o_2$, so that they yield the output readout corresponding to the proportionally scaled value between the minimum and maximum readout on each curve. For example, in Figure 2c, as the encoder for $v_1$ is swept and the matrix encoders are all at their maximum output, the readout signals from $o_1$ and $o_2$ have different shapes and tunable ranges, $v_{R1}$ and $v_{R2}$. From these two curves, $v_1@w_{11,\max}$ and $v_1@w_{21,\max}$, which are accessible from the stored curves, we can determine $x_1$ and $x_2$ so that $v_{o_1}(x_1) = 0.5\,v_{R1}$ and $v_{o_2}(x_2) = 0.5\,v_{R2}$. Then, we choose the larger of these two driving signals (e.g., $x_2$ in Figure 2c) to drive the $v_1$ encoder. Since matrix encoders have the smallest granularity of tunability, we can adjust all matrix encoders to the physical quantities that yield the correct output readout based on the minimum tunable range. For example, the encoder for $w_{11}$ needs a smaller value to generate the correct $o_1$ output readout based on $\min(F_{R1}, F_{R2})$, since the input vector encoder produces a larger value than the expected $0.5\,v_{R1}$. This matrix encoder tuning is guaranteed to be achievable because of the two strategies of minimum tunable range and maximum vector driving signal. As a result, our calibration approach treats hardware imperfection as a whole from a pure input-output perspective and focuses only on obtaining an achievable, accurate final output readout, so that it can mitigate the effects of different kinds of hardware imperfection altogether. The overall flow of our calibration approach is summarized in Figure 2d. If the operating behavior of a crossbar array gradually changes in the long term, such a system can be treated as newly manufactured, noncalibrated hardware, and the recalibration can be done following the same flow in Figure 2d.
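The selection of the vector driving signal can be illustrated with two made-up sweep curves. The sketch assumes monotonically increasing curves sampled at discrete drive levels and uses a simple linear search; the helper `drive_for_fraction` and all numbers are hypothetical.

```python
def drive_for_fraction(curve, fraction):
    """Smallest sweep index whose readout reaches min + fraction * (max - min).

    `curve` is a monotonically increasing list of readouts from one stored sweep.
    """
    lo, hi = curve[0], curve[-1]
    target = lo + fraction * (hi - lo)
    for index, value in enumerate(curve):
        if value >= target:
            return index
    return len(curve) - 1


# Hypothetical sweeps of the v_1 encoder seen from outputs o_1 and o_2.
curve_o1 = [0.0, 0.3, 0.6, 0.8, 0.9, 1.0]   # saturating shape
curve_o2 = [0.0, 0.1, 0.2, 0.4, 0.7, 0.8]   # slower onset, smaller range
x1 = drive_for_fraction(curve_o1, 0.5)
x2 = drive_for_fraction(curve_o2, 0.5)
x_drive = max(x1, x2)                        # larger driving signal wins

# Fraction actually delivered to o_1 at the chosen drive; it is >= 0.5,
# so the w_11 encoder has to be turned down by the ratio below.
delivered_o1 = (curve_o1[x_drive] - curve_o1[0]) / (curve_o1[-1] - curve_o1[0])
w11_correction = 0.5 / delivered_o1
```

Choosing the larger drive means no output is ever under-driven, so every required correction factor is at most one and can be absorbed by lowering the corresponding matrix encoder.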
To demonstrate the calibration approach in the free-space SLM-based photonic crossbar array, we experimentally measure four sweeping curves for each pair of vector and matrix encoder active regions, as shown in Figure 2e,f. For example, we set the gray levels of the second SLM to give the smallest transmittance, which is not necessarily zero. We then sweep the gray levels of active regions of the first SLM, such as $v_{11}$ and $v_{21}$ in Figure 1d, simultaneously, while other regions, such as $v_{12}$ and $v_{22}$, are set to gray levels yielding a small transmittance. The same sweeping processes are repeated for $v_{12}$ and $v_{22}$, as well as for the matrix encoders; see Figure S3, Supporting Information, for illustrations. Although both SLMs offer 8-bit precision (256 available gray levels) in hardware, we use 64 gray levels (6-bit precision), which are normalized to the range [0, 1]. Clearly, these modulation curves are nonuniform. Figure 2g displays an image captured on the camera when the light passes through each pair of active regions under the same gray levels on both SLMs. The nonuniform brightness is also clearly observed.
We perform the MVM calculations of 1000 randomly generated 2 × 2 matrices $W$ and 2 × 1 input vectors $v$, where each element is drawn uniformly at random from the range $[-1, 1]$. The measured values of the elements of output vectors ($\tilde{o}$) and the expected values obtained from standard digital computers ($o$) are used to define the calculation error as $(\tilde{o} - o)/o \times 100\%$. Figure 3a shows the scatter plot of measured and expected values, which lies roughly along the line $y = x$. The corresponding histogram of the calculation error distribution is shown in Figure 3b. To gain insight into the relatively large errors, we decompose the distribution by the range of expected values. For clarity, we only display the contribution of positive expected values. The large positive errors (cyan area in Figure 3b) correspond to expected values in the range [0, 0.2] (cyan dots in Figure 3a); the golden area and dots correspond to the range [0.2, 1]; and the black area and dots cover the full value range $[-2, 2]$. It is clear that small values contribute to large calculation errors, since the expected value is in the denominator of the error calculation, which magnifies errors. However, in neural networks, small weights do not affect network performance much and can be pruned. [39] As a result, the large calculation errors for small numbers are not important for accurately performing ML tasks. In contrast, if all modulation curves are assumed to be identical without any calibration process, the measured and expected values display larger errors, as shown in Figure 3c,d.
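A brief numerical illustration of why small expected values dominate the error tail: the sketch below applies the relative error definition to one hypothetical absolute deviation (0.02) at two different expected values. The function name and numbers are illustrative assumptions.

```python
def percent_error(measured, expected):
    """Relative calculation error in percent: (measured - expected) / expected * 100."""
    if expected == 0:
        raise ZeroDivisionError("relative error is undefined for an expected value of 0")
    return (measured - expected) / expected * 100.0


# The same absolute deviation (0.02, a hypothetical number) at two expected values.
err_small = percent_error(0.07, 0.05)   # expected value close to zero
err_large = percent_error(1.02, 1.00)   # expected value close to one
```

The deviation is identical in both cases, yet the relative error is about 40% for the small expected value and about 2% for the large one.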
Furthermore, we demonstrate the scalability and universality of the calibration approach. From the experimentally measured modulation curves shown in Figure 2e,f, 64 gray levels (6 bit) are randomly selected from 256 gray levels (8 bit) to generate 100 × 100 pairs of modulation curves for vector and matrix encoders. Various levels of device response uniformity can be generated; see Experimental Section for more details. The uniformity is defined as one minus the ratio of the difference between the maximum and minimum tunable ranges to the maximum tunable range. A uniformity of 100% corresponds to the ideal case. Figure 4a displays the standard deviation of calculation errors as a function of device response uniformity in the SLM-based crossbar. Although the error increases with decreasing device uniformity, the calculation is always more accurate with calibration than without. In addition, an electronic crossbar array (Figure 1b) with the memristor conductance reported in the study of Li [40] and an integrated photonic crossbar array (Figure 1c) with the nonvolatile phase change material transmittance reported in the study of Li et al. [41] are modeled with device response nonuniformity; see Experimental Section for more details. As shown in Figure 4b,c, the developed calibration approach also improves the calculation accuracy in these arrays.
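The uniformity metric defined above is simple to compute from a list of tunable ranges. The sketch below uses hypothetical range values; `uniformity` is an illustrative name.

```python
def uniformity(tunable_ranges):
    """1 - (max - min) / max over all tunable ranges; 1.0 is the ideal case."""
    largest, smallest = max(tunable_ranges), min(tunable_ranges)
    return 1.0 - (largest - smallest) / largest


ranges = [0.64, 0.36, 0.50, 0.55]   # hypothetical tunable ranges
u = uniformity(ranges)
u_ideal = uniformity([0.5, 0.5, 0.5])
```

For these values the uniformity is 56.25%, while identical ranges give 100%.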
Finally, we deploy these crossbar arrays for an ML task of classifying handwritten digit images in the MNIST dataset. The MVM operations implemented using crossbar arrays can be extended to calculate general matrix-matrix multiplication (GEMM) through block matrix multiplications. [42] We model the error distributions of the MVM calculations in crossbar arrays by adding noise to a standard GEMM function, such as the PyTorch matmul function. The noise is modeled as a random variable following a Cauchy distribution, whose parameters are fitted so that the noisy multiplication function generates the same calculation error distribution as the experimental measurement; see Figure 5a. We build a multilayer perceptron (MLP) neural network model for the classification task; see Experimental Section for the details of the MLP model. The MLP model is trained on a general-purpose GPU, and the calculations of all linear layers during inference are performed through the noisy GEMM function. As shown in Figure 5b-d, the calibrated SLM-based photonic crossbar array, electronic crossbar array, and integrated photonic crossbar array all display higher and closer-to-ideal classification accuracies than the noncalibrated systems.
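A noisy GEMM of this kind can be sketched in plain Python. The text does not specify how the Cauchy noise enters the product, so the sketch assumes it acts as a relative error on each output entry; the location and scale values are placeholders rather than fitted parameters, and `noisy_matmul` is a hypothetical stand-in for a wrapped matmul function.

```python
import math
import random


def cauchy_sample(rng, location, scale):
    """Draw from a Cauchy distribution by inverting its CDF."""
    return location + scale * math.tan(math.pi * (rng.random() - 0.5))


def noisy_matmul(A, B, location=0.0, scale=0.01, rng=None):
    """Matrix product whose entries carry a Cauchy-distributed relative error."""
    rng = rng or random.Random()
    rows, inner, cols = len(A), len(B), len(B[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            exact = sum(A[i][k] * B[k][j] for k in range(inner))
            # Assumption: noise is multiplicative, i.e., a relative error.
            row.append(exact * (1.0 + cauchy_sample(rng, location, scale)))
        out.append(row)
    return out


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.5, -1.0], [1.5, 2.0]]
exact = noisy_matmul(A, B, scale=0.0)          # scale 0 recovers the exact product
noisy = noisy_matmul(A, B, scale=0.01, rng=random.Random(0))
```

Replacing the matrix products of the linear layers with such a function during inference emulates the calculation errors of the crossbar while the rest of the model stays unchanged.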

Conclusion
In summary, we demonstrated a universal approach for calibrating large-scale electronic and photonic crossbar arrays with hardware imperfections to perform accurate MVM calculations. In contrast to physics-aware training approaches, which treat hardware systems as "black boxes" and directly incorporate the physical output quantities from hardware systems for a specific ML task, the developed calibration approach achieves operation-by-operation mathematical isomorphism. Hence, the calibrated crossbars can be deployed not only for ML tasks but also for other applications involving MVM operations, such as combinatorial optimization.

Experimental Section
Experimental Setup: The schematic and photo of the free-space optical crossbar array experimental setup are shown in Figure S1a,b, Supporting Information. The incoherent light beam was from a red light-emitting diode with a center wavelength of 625 nm and a linewidth of 17 nm (Thorlabs M625L4). The incident light was coupled into a multimode fiber with a numerical aperture of 0.39 and then collimated by a 2 in. off-axis parabolic mirror with a focal length of 152.4 mm. The diameter of the collimated beam was 1.2 cm. A reflective SLM (R-SLM, corresponding to SLM 1 in the main text) and a transmissive SLM (T-SLM, corresponding to SLM 2 in the main text) were used to encode vectors and matrices, respectively. A linear polarizer (LP1) with the transmission axis perpendicular to the workbench (i.e., the polarization direction is perpendicular to the workbench) was used to configure the input light polarization state for the R-SLM. A 50:50 beam splitter was placed in front of the R-SLM (Meadowlark Optics, 1920 × 1152) to route the reflected light beam through the following T-SLM (HOLOEYE LC 2012). The array size of the R-SLM was 1920 × 1152 with a pixel size of 9.2 μm × 9.2 μm and a filling factor of 95.7%. The array size of the T-SLM was 1024 × 768 with a pixel size of 36 μm × 36 μm and a filling factor of 58.0%. A linear polarizer (LP2) was placed in front of the T-SLM to configure the output light polarization state from the R-SLM for the largest transmission modulation. At the output end of the T-SLM, there was another linear polarizer (LP3). The rotation angles of LP2 and LP3 were optimized to maximize the modulation range of the T-SLM.
Output images were taken by a CMOS camera (Thorlabs CS165MU). The exposure time of the camera was adjusted according to the input light intensity to avoid saturation. Ten image frames were taken and averaged. A 2f spatial filter, consisting of a pair of lenses (focal lengths = 25 and 35 mm) and an iris, was placed in front of the camera to remove any coherent diffraction effect, which could lead to crosstalk between camera pixels and thus wrong calculations. In particular, such crosstalk-induced calculation inaccuracy is hard to calibrate because these errors are input dependent. An area of 60 × 60 pixels on the R-SLM was used to represent one element in vectors. This area roughly corresponded to an area of 20 × 20 pixels on the T-SLM, which was used to represent one element in weight matrices. These groups of pixels on the R-SLM and T-SLM are called active regions, and we performed a search algorithm for the active regions on the T-SLM to achieve precise alignment; see the following paragraph for more details. The averaged intensity in an area of 20 × 20 pixels in each bright region on the camera was added accordingly to calculate the elements in output vectors and implement MVM operations.
Four active regions on the R-SLM were first chosen around the center region of the input light beam profile. Specifically, each active region contained 60 × 60 pixels, and the horizontal and vertical spacings between regions were set to 150 pixels. The relative positions of the beam and active regions are illustrated in Figure S2a, Supporting Information. First, we loaded a two-vertical-line pattern on the R-SLM (i.e., the corresponding pixels were turned on) with one line connecting active regions 1 and 3 and the other connecting active regions 2 and 4; see red dashed rectangles in Figure S2a, Supporting Information, and the image in Figure S2b, Supporting Information. We swept the T-SLM from left to right and obtained the blue curve in Figure S2c, Supporting Information, which consists of the contribution of open pixels and the beam profile background of closed pixels. To obtain the beam background profile, we turned off all pixels on the R-SLM and T-SLM. We then swept the T-SLM from left to right, turning on one column at a time. As a result, we obtained the orange curve in Figure S2c, Supporting Information, which consists of the beam profile background on those closed pixels. By subtracting the orange curve from the blue curve, we obtained the green curve in Figure S2c, Supporting Information. From left to right, the first onset of rapidly increasing camera reading corresponds to the left edges of active regions 1 and 3. Then, the beginning of the first plateau corresponds to the right edges of active regions 1 and 3. Similarly, the next onset of rapidly increasing camera reading corresponds to the left edges of active regions 2 and 4, and the beginning of the plateau corresponds to the right edges of active regions 2 and 4.
As a result, all vertical edges of active regions were determined. By loading a two-horizontal-line pattern on the R-SLM (blue dashed rectangles in Figure S2a, Supporting Information, and the image in Figure S2b, Supporting Information) and performing similar sweep procedures from bottom to top, we further determined the horizontal edges of active regions (Figure S2d, Supporting Information).
Large-Scale Device Response Generation: Both the R-SLM and T-SLM have 8-bit precision (256 gray levels). We swept the gray levels of a pair of aligned active regions on both SLMs to obtain a 256 × 256 light intensity table. The column index was the gray level of one SLM and the row index was the gray level of the other SLM. We randomly picked 10 000 groups of 64 × 64 matrix data from this table to generate 100 × 100 pairs of 6-bit modulation curves for vector and matrix encoders. For the electronic crossbar shown in Figure 1b, the conductance at the intersection was taken from the memristor conductance described in Figure 3.4 in the study of Li [40] for matrix encoders, which have 6-bit precision. The vector encoders were assumed to have a perfectly linear response with infinite precision. The modulation range for a pair of vector and matrix encoders was scaled by multiplying by a random number uniformly distributed in the range $[a, 1]$ with $a < 1$. By selecting different values of $a$ (e.g., $a = 0.5$), 100 × 100 pairs of 6-bit modulation curves for vector and matrix encoders were generated with different nonuniformity. Similarly, for the integrated photonic crossbar shown in Figure 1c, the transmittance at the intersection was taken from a nonvolatile phase change material described in Figure 2b in the study of Li et al. [41] for matrix encoders, which have 34 available levels (about 5-bit precision). The vector encoders and nonuniform device responses were generated using the same approach as for the electronic crossbar.
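The random scaling of modulation ranges can be sketched as follows. The base curve here is an idealized linear 6-bit response used only as a placeholder for the measured or literature curves, and `scaled_curves` is a hypothetical helper.

```python
import random


def scaled_curves(base_curve, n_rows, n_cols, a, seed=0):
    """Copy one base modulation curve to every encoder pair, each scaled by U[a, 1]."""
    if not 0.0 <= a < 1.0:
        raise ValueError("a must satisfy 0 <= a < 1")
    rng = random.Random(seed)
    curves = []
    for _ in range(n_rows):
        row = []
        for _ in range(n_cols):
            factor = rng.uniform(a, 1.0)   # one random scale per encoder pair
            row.append([factor * point for point in base_curve])
        curves.append(row)
    return curves


base = [level / 63 for level in range(64)]   # hypothetical linear 6-bit response
curves = scaled_curves(base, n_rows=4, n_cols=4, a=0.5)
ranges = [max(c) - min(c) for row in curves for c in row]
```

Smaller values of `a` spread the modulation ranges more widely and therefore lower the device response uniformity; a 100 × 100 array follows by setting `n_rows` and `n_cols` to 100.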
MLP Model for the MNIST Classification: The MLP model consisted of three dense layers, with 144, 64, and 10 neurons from input to output. To fit the MNIST dataset images into this model, we resized the images to a resolution of (12, 12, 1) to match the input size of the MLP model. Each dense layer used a sigmoid activation function and a bias tensor. The activation function of the output layer was LogSoftmax. The total number of training iterations was 10, with the learning rate set to 0.001. The Adam optimizer with the CrossEntropyLoss loss function was used. The implementations were constructed using PyTorch v1.9.0 and run on an Nvidia RTX 3090 Turbo GPU.

Figure 1. Overview of crossbar arrays. a) Schematic of the general architecture of crossbar arrays to perform matrix-vector multiplication. Illustrations of b) an electronic crossbar array, c) an integrated photonic crossbar array, and d) a prototype 2 × 2 free-space optical crossbar array implemented through cascaded spatial light modulators.

Figure 2. Calibration approach. Illustrations of a) a generic 2 × 2 crossbar with different potential hardware imperfections, b) the modulation 2D mapping and curves for a pair of vector and matrix encoders, and c) calibration process for vector encoders. d) Flowchart of calibration and calculation processes. The intensity modulation curves of e) the SLM 1 and f) the SLM 2 under different conditions in the free-space SLM-based photonic crossbar array. The grayscale range is normalized and the intensity data are directly exported from camera readings with arbitrary units. g) One captured image on the camera showing light intensity nonuniformity.

Figure 3. Experimental results of the free-space SLM-based photonic crossbar array. a) Scatter plot of measured and expected multiplication results of 1000 randomly generated matrices and vectors. The measured results are obtained after calibration. b) The error distribution for (a). The cyan dots in (a) and area in (b) indicate the expected values in the range [0, 0.2]. Golden area and dots represent the range [0.2, 2]. Black area and dots cover the full value range $[-2, 2]$. c) Scatter plot and d) error distribution for the measurements with and without calibration.

Figure 4. Large-scale and universal calibration. The standard deviation of calculation error as a function of uniformity for a) a 100 × 100 SLM-based crossbar array, b) a 100 × 100 electronic crossbar array, and c) a 100 × 100 integrated photonic crossbar array with and without calibration.

Figure 5. Machine learning application demonstrations. a) Error distributions obtained from the SLM-based crossbar experiment and the noisy GEMM calculation. Classification accuracies of the MNIST dataset as a function of uniformity from b) the SLM-based photonic, c) electronic, and d) integrated photonic crossbar arrays with and without calibration (red and blue dotted lines). Black dashed lines indicate the accuracy obtained from a general-purpose graphics processing unit without any noise.