In-memory eigenvector computation in time O(1)

In-memory computing with crosspoint resistive memory arrays has gained enormous attention for accelerating matrix-vector multiplication in data-centric applications. By combining a crosspoint array with feedback amplifiers, it is possible to compute matrix eigenvectors in one step, without algorithmic iterations. In this work, the time complexity of the eigenvector computation is investigated through a feedback analysis of the crosspoint circuit. The results show that the computing time of the circuit is determined by the mismatch degree of the eigenvalues implemented in the circuit, which controls the rising speed of the output voltages. For a dataset of random matrices, the time for computing the dominant eigenvector in the circuit is constant across matrix sizes, namely the time complexity is O(1). The O(1) time complexity is also supported by simulations of PageRank on real-world datasets. This work paves the way for fast, energy-efficient accelerators for eigenvector computation in a wide range of practical applications.

In-memory computing with crosspoint resistive memory arrays enables fast, parallel matrix-vector multiplication (MVM), [1] which is an elementary operation in several algebraic problems, for instance, the training and inference of neural networks, [2,3] signal and image processing, [4,5] and the iterative solution of linear systems [6] or differential equations. [7] In such implementations, the crosspoint MVM is executed for several iteration cycles according to the algorithmic workflow, which raises issues in terms of processing time and energy efficiency. Recently, a crosspoint memory circuit architecture has been proposed and demonstrated for solving matrix equations in one step, including solving linear systems and computing eigenvectors. [8] Although the one-step solution overcomes the inefficiencies of the iterative approach, the underlying time complexity of the circuit needs to be rigorously evaluated to assess its computing performance.
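To make the crosspoint MVM principle concrete, the following is a minimal numerical sketch: each device at row i and column j stores a conductance G[i, j], and applying a voltage vector to the columns yields row currents I = G V by Ohm's and Kirchhoff's laws, in a single physical step. The conductance and voltage values here are illustrative assumptions, not data from this work.

```python
import numpy as np

# Crosspoint MVM sketch: device (i, j) stores conductance G[i, j];
# applying column voltages V produces row currents I = G @ V in one step.
G = np.array([[0.8, 0.2, 0.5],
              [0.1, 0.9, 0.3],
              [0.4, 0.6, 0.7]])  # conductances (e.g., in mS), assumed values
V = np.array([0.2, 0.1, 0.3])    # applied voltages (V), assumed values

I = G @ V                        # the MVM executed by physical laws
print(I)
```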
Eigenvector calculation is a fundamental problem in a broad scope of computing scenarios, e.g., webpage ranking, [9] facial recognition, [10] dynamic analysis, and the solution of differential equations in fields such as physics and chemistry. [11] In the conventional computing paradigm, the dominant eigenvector (the eigenvector corresponding to the largest eigenvalue) of a matrix can be calculated by the power iteration method with a time complexity of O(kN^2), where N is the matrix size and k is the number of iterations. [12] In this work, we show that the crosspoint eigenvector circuit computes the dominant eigenvector in constant time, i.e., with time complexity O(1).

Computing an eigenvector means solving the matrix equation

Ax = λx,    (1)

where A is a square matrix, λ is an eigenvalue of A, and x is the unknown eigenvector corresponding to λ. To solve Equation 1, matrix A is mapped onto the conductance matrix of a crosspoint memory array, which plays the role of a feedback network in a circuit (Figure 1a). The feedback configuration is enabled by transimpedance amplifiers (TIAs) and analog inverters, and the output voltages of the circuit represent the eigenvector x. Note that λ is real and positive in Figure 1a, which is always the case for the largest eigenvalue of a positive matrix, according to the Perron-Frobenius theorem. [13] For a negative eigenvalue, the inverters in the circuit should be removed, and the absolute value of λ is mapped by the feedback conductance of the TIAs. [8] In Figure 1a we consider a positive matrix, since the conductance of a resistive memory device can only be positive. For matrices containing negative elements, two crosspoint arrays are needed to split the matrix into the difference of two positive matrices. [8] The in-memory calculation of eigenvectors was conducted in an array of resistive switching memory (RRAM) devices, in which the conductance can be changed by the formation and dissolution of a conductive filament via local migration of ionized defects. [14] The RRAM conductance can be continuously tuned, thus enabling analog storage in a crosspoint array for in-memory matrix computation. [5,8] In the conventional power iteration method, the MVM is executed through element-wise multiply-accumulate operations and a number of iterations are required, resulting in a high computational complexity. [12,15,16] In the eigenvector circuit, by contrast, the MVM is instantaneously executed by physical laws in the crosspoint array, and the discrete iterations are eliminated in favor of a higher computational speed.
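For comparison, a minimal sketch of the conventional power iteration follows: each iteration performs one O(N^2) MVM, and k iterations are needed until convergence, giving the O(kN^2) cost quoted above. The tolerance, iteration cap, and random test matrix are illustrative assumptions.

```python
import numpy as np

def power_iteration(A, tol=1e-9, max_iter=10_000):
    """Dominant eigenvector by power iteration: one O(N^2) MVM per iteration."""
    x = np.ones(A.shape[0])
    lam = 0.0
    for k in range(1, max_iter + 1):
        y = A @ x                    # the O(N^2) matrix-vector multiplication
        lam_new = np.linalg.norm(y)  # eigenvalue estimate (positive matrix)
        x = y / lam_new              # normalize to avoid overflow
        if abs(lam_new - lam) < tol * lam_new:
            return x, lam_new, k     # eigenvector, eigenvalue, iteration count
        lam = lam_new
    return x, lam, max_iter

A = np.random.rand(100, 100)         # random positive matrix (Perron-Frobenius)
x, lam, k = power_iteration(A)
print(lam, k)                        # k is the iteration count in O(kN^2)
```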
To analyze the time complexity, the eigenvector circuit is described as a block diagram, where A is the N × N coefficient matrix and Λ is a diagonal matrix defined as Λ = λI_N, where I_N is the N × N identity matrix. Considering the single-pole OA model, [17] namely A_OA(s) = A_0/(1 + s/ω_0), where A_0 is the DC open-loop gain and ω_0 is the pole frequency, the loop equation of the circuit becomes a second-order matrix equation in the Laplace variable s (Equation 4), where the insignificant terms have been omitted, due to the fact that A_0 is usually much larger than 1. The inverse Laplace transform of Equation 4 yields a second-order differential equation in the time domain (Equation 5), which describes the time response of the eigenvector circuit. To study the computing time of the circuit, Equation 5 is converted into a first-order differential equation [18] by defining the auxiliary variable

u = (1/(A_0 ω_0)) dv/dt,    (6)

which, together with Equation 5, leads to a pair of coupled first-order equations. The two equations of Equation 6 are merged into a single first-order system (Equation 7), where 0_N is the N × N zero matrix. By defining the 2N × 2N block matrix H according to Equation 8, which is associated with matrix A, and defining the 2N × 1 vector w = [v; u], Equation 7 becomes:

dw/dt = A_0 ω_0 H w.    (9)

According to the finite difference (FD) method, Equation 9 can be expressed as:

w(t + Δt) = (I_2N + αH) w(t),    (10)

where I_2N is the 2N × 2N identity matrix, Δt is the incremental time step, and α is a dimensionless constant defined as α = A_0 ω_0 Δt.
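Equation 10 lends itself to a direct numerical sketch, shown below. The block form of H used here, built from the normalized feedback matrix A/λ' − I_N, is an illustrative assumption rather than the circuit's exact Equation 8; the values of α, the mismatch degree, and the random matrix are likewise assumed.

```python
import numpy as np

# FD model of Equation 10: w(t+Δt) = (I_2N + αH) w(t). The block structure
# of H below (zero/identity on top, M = A/λ' − I_N at the bottom left) is an
# assumed stand-in for the paper's Equation 8, chosen for illustration.
rng = np.random.default_rng(0)
N = 50
A = rng.random((N, N))                     # random positive matrix
lam1 = np.max(np.linalg.eigvals(A).real)   # largest (Perron) eigenvalue
delta = 0.02                               # mismatch degree δ, assumed value
lam_impl = (1 - delta) * lam1              # implemented λ' = (1 − δ)λ

M = A / lam_impl - np.eye(N)               # normalized feedback matrix (assumed)
H = np.block([[np.zeros((N, N)), np.eye(N)],
              [M, np.zeros((N, N))]])

alpha = 5e-3                               # α = A0*ω0*Δt, assumed small constant
w = rng.random(2 * N)                      # initial state (noise)
for _ in range(50_000):
    w = w + alpha * (H @ w)                # FD update of Equation 10
    w /= np.linalg.norm(w)                 # keep the state bounded (illustration)

v = w[:N]                                  # output voltages approach the eigenvector
residual = np.linalg.norm(A @ v - lam1 * v) / np.linalg.norm(v)
print(residual)                            # small residual: v solves Equation 1
```

Note how the growth of w along the dominant eigendirection of I_2N + αH plays the role of the rising output voltages in the circuit: the larger the dominant eigenvalue of H, the fewer FD steps (i.e., the shorter the transient) before v aligns with the eigenvector.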
For Equation 9 to have a nontrivial solution, the spectral radius of matrix I_2N + αH has to be larger than 1, which implies that the highest eigenvalue (or the largest real part of the eigenvalues) λ_h of matrix H must be positive, assuming the eigenvalues of H are ranked in descending order of their real parts. This condition on λ_h is satisfied if the implemented eigenvalue λ' is slightly smaller than the largest eigenvalue λ of A, namely λ' = (1 − δ)λ, where δ is the mismatch degree. To assess the circuit dynamics, we simulated the time evolution of v(t) from the FD model in Equation 10 for the experimental matrix in Figure 1c. According to Equation 10, the speed of the eigenvector circuit is controlled by λ_h of matrix H, namely, the larger the λ_h, the faster the computation. To study the computing time of the circuit, we conducted a series of simulations by varying the eigenvalue mismatch for the eigenvector computation in Figure 1c. One concern about the computing time analysis is the parasitic wire resistance in the crosspoint array. [19] To investigate the impact of wire resistance on the time complexity of the circuit, we considered the interconnect parameters at the 65 nm node adopted from the ITRS (International Technology Roadmap for Semiconductors) table. [20]

As a practical case study, we addressed the PageRank of a real-world dataset. The PageRank algorithm is widely used for ranking webpages in search engines, [9] and for link prediction and recommendation in social media. [21] PageRank amounts to calculating the dominant eigenvector of a transition matrix, [22] which can be naturally accelerated by the crosspoint eigenvector circuit. We adopted the Harvard500 database, [23] which contains 500 webpages related to Harvard University, to be ranked according to their connections. In the PageRank of a webpage network, the citations among webpages give a citation matrix C, defined as follows: if page j contains a link to page i, the citation element c_ij is set to 1, otherwise c_ij = 0. More pages citing the same page indicates that the latter is more important; likewise, citation by important pages raises the importance of the cited page. Figure 5a shows the citation matrix of Harvard500, which is a sparse logical matrix. To rank the webpages by their importance, a transition matrix M is defined according to:

M = pS + (q/N)E,    (11)

where S is the column-normalized citation matrix, E is the N × N matrix of ones, N = 500 is the number of pages, p = 0.85 is the random walk probability, and q = 1 − p is the probability of randomly picking a page. A uniform probability 1/N is assigned to the column of any page that has no outgoing links. [23] The transition matrix is a stochastic matrix, whose largest eigenvalue is always 1 and whose dominant eigenvector gives the importance scores of the webpages. [22] The resulting transition matrix for Harvard500 is illustrated in Figure S5 (Supporting Information).
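The transition-matrix construction of Equation 11 can be sketched as follows on a tiny synthetic link matrix standing in for the Harvard500 citation matrix; the 4-page example and variable names are illustrative.

```python
import numpy as np

# Build M = p*S + (q/N)*E from a citation matrix C, where S column-normalizes
# C and dangling pages (no outgoing links) get a uniform column of 1/N.
p = 0.85                          # random-walk probability
q = 1 - p                         # probability of randomly picking a page
C = np.array([[0, 1, 0, 0],       # C[i, j] = 1 if page j links to page i
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)  # page 3 is dangling (synthetic example)
N = C.shape[0]

S = np.empty((N, N))
for j in range(N):                # column-normalize; dangling pages get 1/N
    col_sum = C[:, j].sum()
    S[:, j] = C[:, j] / col_sum if col_sum > 0 else 1.0 / N
M = p * S + q / N * np.ones((N, N))   # stochastic: every column sums to 1

vals, vecs = np.linalg.eig(M)     # largest eigenvalue of a stochastic matrix is 1
k = np.argmax(vals.real)
scores = np.abs(vecs[:, k].real)  # dominant eigenvector = importance scores
print(round(vals[k].real, 6), scores / scores.sum())
```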
The transition matrix was stored in the crosspoint array, and the largest eigenvalue (equal to 1) was mapped in the feedback conductance with a mismatch degree δ to compute the eigenvector in the circuit. Regarding the solution accuracy of PageRank, the comparison between the simulated importance scores and the ideal ones for the Harvard500 database is shown in Figure S7 (Supporting Information), indicating a good consistency between the two solutions. In particular, we ranked the top 10 pages for the ideal case and the four simulated cases, showing that all the ideal top 10 pages are preserved in the top 10 places of the simulations, except for the case with δ = 0.04, where one page was missed. We also studied the wire resistance issue for the PageRank of Harvard500 subsets, with the results shown in Figure S8 (Supporting Information). The parasitic wire resistance causes a small increase of the computing time for relatively large N, thus introducing an N-dependence into the time complexity of the eigenvector computation. These results suggest a careful choice of δ in the circuit implementation to achieve the best trade-off between computing time and solution accuracy. A strategy of dynamically tuning δ might be adopted to achieve both high speed and high accuracy: a large δ can be used in the initial phase to accelerate the transition of the output voltages, and δ can then be reduced in the later stages to fine-tune the final solution.
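Under the same assumed feedback-matrix model used in the earlier sketch, the dynamic tuning strategy can be rendered as a schedule of decreasing δ values; the schedule itself (0.08 → 0.02 → 0.005) and all parameters are illustrative assumptions.

```python
import numpy as np

# Dynamic-δ sketch: run the assumed FD model with a large mismatch degree δ
# first (fast transient), then shrink δ so λ' approaches λ1 (fine tuning).
rng = np.random.default_rng(1)
N = 50
A = rng.random((N, N))
lam1 = np.max(np.linalg.eigvals(A).real)
alpha = 5e-3                                  # α = A0*ω0*Δt, assumed
w = rng.random(2 * N)

for delta in (0.08, 0.02, 0.005):             # assumed tuning schedule
    M = A / ((1 - delta) * lam1) - np.eye(N)  # feedback matrix for current δ
    H = np.block([[np.zeros((N, N)), np.eye(N)],
                  [M, np.zeros((N, N))]])
    for _ in range(20_000):
        w = w + alpha * (H @ w)               # FD update of Equation 10
        w /= np.linalg.norm(w)

v = w[:N]
print(np.linalg.norm(A @ v - lam1 * v) / np.linalg.norm(v))  # final residual
```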
As the mismatch degree δ is generally kept small to maintain the eigenvector accuracy, the computation may suffer from conductance variation, i.e., the feedback conductance values of the TIAs being slightly different from one another. In this case, the diagonal matrix Λ = λI_N in the associated matrix H of Equation 8 is replaced by Λ' = diag(λ^(1), λ^(2), ..., λ^(N)), where λ^(i) is the i-th practical implementation of the nominal eigenvalue λ, as sketched below. We simulated the PageRank of Harvard500 in the presence of such variations. With such a low time complexity, this work supports the significant time/energy efficiency gains of in-memory computing for big-data analytics in a wide range of real-world applications.
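Under the same assumed feedback-matrix model, the effect of conductance variation can be probed by perturbing each implemented eigenvalue λ^(i) independently; the 1% spread is an assumed variation level, not a measured one.

```python
import numpy as np

# Conductance-variation sketch: each TIA implements its own λ(i), modeled as
# the nominal λ' times a small random factor, so Λ' = diag(λ(1), ..., λ(N))
# replaces λ'I_N in the assumed feedback matrix M = Λ'^{-1} A − I_N.
rng = np.random.default_rng(2)
N = 50
A = rng.random((N, N))
vals_A, vecs_A = np.linalg.eig(A)
idx = np.argmax(vals_A.real)
lam1, x_ideal = vals_A[idx].real, vecs_A[:, idx].real  # ideal dominant pair

delta = 0.02                                           # mismatch degree, assumed
lam_i = (1 - delta) * lam1 * (1 + 0.01 * rng.standard_normal(N))  # 1% spread

M = np.diag(1.0 / lam_i) @ A - np.eye(N)   # feedback matrix with varied λ(i)
vals_M, vecs_M = np.linalg.eig(M)
x_var = vecs_M[:, np.argmax(vals_M.real)].real         # perturbed eigenvector

cos = abs(x_ideal @ x_var) / (np.linalg.norm(x_ideal) * np.linalg.norm(x_var))
print(1 - cos)   # misalignment between ideal and perturbed dominant eigenvectors
```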

Experimental Section
Experimental Devices: The RRAM devices characterized in this work employ a 5-nm-thick HfO2 film as the switching layer. The HfO2 dielectric layer was deposited