We first give a brief introduction to the ID proposed by Liberty et al. [Liberty et al., 2007]. Suppose C is a complex m × n matrix and k is a positive integer with k ≤ m and k ≤ n. Then there exist a complex k × n matrix P and a complex m × k matrix B whose columns constitute a subset of the columns of C such that
1. some subset of the columns of P makes up the k × k identity matrix;
2. no element of P has an absolute value greater than 1;
3. the spectral norm of P satisfies $\|P\|_2 \le \sqrt{k(n-k)+1}$;
4. the least (that is, the k-th greatest) singular value of P is at least 1; and
5. when k < m and k < n,

$$ \| B\,P - C \|_2 \;\le\; \sqrt{k(n-k)+1}\;\sigma_{k+1}, $$

where σ_{k+1} is the (k + 1)-st greatest singular value of C.
Based on these statements, an approximation

$$ C_{m\times n} \;\approx\; B_{m\times k}\, P_{k\times n} \qquad (3) $$

can be derived when the exact rank of C_{m×n} is greater than k but the (k + 1)-st greatest singular value of C_{m×n} is small.
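As a concrete check of these properties, SciPy's scipy.linalg.interpolative module implements exactly this decomposition; the sketch below is illustrative only (the test matrix and the sizes m, n, k are our own choices, not from the reference):

```python
import numpy as np
import scipy.linalg.interpolative as sli

rng = np.random.default_rng(0)

# Test matrix with known, rapidly decaying singular values sigma_i = 10^(1-i).
m, n, k = 100, 80, 10
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 10.0 ** -np.arange(n)
C = (U * s) @ V.T

# Rank-k ID: C ~= B @ P with B = C[:, idx[:k]], a column subset of C.
idx, proj = sli.interp_decomp(C, k)
B = C[:, idx[:k]]
P = sli.reconstruct_interp_matrix(idx, proj)

# Property 1: the columns idx[:k] of P form the k x k identity matrix.
assert np.allclose(P[:, idx[:k]], np.eye(k))

# Property 2 holds with bound 1 in exact arithmetic; practical algorithms
# guarantee a bound of 2.
print("max |P_ij|   =", np.abs(P).max())

# Property 5: the spectral-norm error is comparable to sigma_{k+1}.
err = np.linalg.norm(C - B @ P, 2)
print("||C - BP||_2 =", err, " vs bound", np.sqrt(k * (n - k) + 1) * s[k])
```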
The ID employs randomness to reach the decomposition described in equation (3). It begins by generating a random vector ω with independent Gaussian entries and forming the product y = ω^H C, where the superscript H denotes the adjoint (conjugate transpose). The vector y can be regarded as a random sample of the action of C. Repeating this sampling process l (l > k) times gives

$$ y^{(i)} = \left( \omega^{(i)} \right)^{H} C, \qquad i = 1, 2, \cdots, l. \qquad (4) $$
Owing to the randomness, the set {ω^(i) : i = 1, 2, ⋯, l} of random vectors is almost surely linearly independent, and almost surely no nontrivial linear combination of them falls in the null space of C^H. Therefore, provided l is at least the rank of C, the samples span the row space of C, and an orthonormal basis for it can be produced simply by orthonormalizing the sample vectors. To this end, rewrite equation (4) in the compact form

$$ Y_{l\times n} = \Omega_{l\times m}\, C_{m\times n}, \qquad (5) $$

where the i-th row of Ω_{l×m} is (ω^{(i)})^H.
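In code, the sampling step of equations (4) and (5) is a single matrix product; a minimal sketch (the sizes and the stand-in matrix C are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 100, 80, 10
l = k + 10                       # modest oversampling, l > k (see below)

# Stand-in for the complex m x n matrix to be decomposed.
C = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# l independent Gaussian test vectors omega^(i), stacked so that
# row i of Omega is (omega^(i))^H.
Omega = rng.standard_normal((l, m)) + 1j * rng.standard_normal((l, m))

# Equation (5): row i of Y is the sample y^(i) = (omega^(i))^H C of equation (4).
Y = Omega @ C                    # l x n sample matrix
```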
Employing a numerically stable method for the orthonormalization, such as the pivoted QR factorization, whose orthonormal factor provides such a basis, one obtains a k × n interpolation matrix P such that

$$ Y_{l\times n} \;\approx\; L_{l\times k}\, P_{k\times n}, $$
where the columns of L_{l×k} constitute a subset of the columns of Y. That is, there exists a set of integers i_1, i_2, ⋯, i_k such that, for each j = 1, 2, ⋯, k, the j-th column of L is the i_j-th column of Y. Collecting the corresponding columns of C into a complex m × k matrix B, so that, for each j = 1, 2, ⋯, k, the j-th column of B is the i_j-th column of C, yields the factors B and P of equation (3).
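The column selection and interpolation just described can be sketched with a pivoted QR factorization of the small sample matrix Y; the helper below (names are our own) returns the indices i_1, ⋯, i_k, the skeleton B, and the interpolation matrix P:

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

def id_from_samples(C, Y, k):
    """Rank-k ID of C from the l x n sample matrix Y = Omega @ C.

    Returns (cols, B, P) with cols = (i_1, ..., i_k), B = C[:, cols],
    and P the k x n interpolation matrix satisfying Y ~= Y[:, cols] @ P.
    """
    # Pivoted QR: Y[:, piv] = Q @ R, most significant columns ordered first.
    _, R, piv = qr(Y, mode="economic", pivoting=True)
    cols = piv[:k]

    # Interpolation coefficients: Y[:, piv[k:]] ~= Y[:, cols] @ T with
    # T = R11^{-1} R12; scatter [I, T] back into the original column order.
    T = solve_triangular(R[:k, :k], R[:k, k:])
    P = np.empty((k, Y.shape[1]), dtype=Y.dtype)
    P[:, piv[:k]] = np.eye(k, dtype=Y.dtype)
    P[:, piv[k:]] = T

    return cols, C[:, cols], P
```

For a matrix whose (k + 1)-st singular value is small, applying this to the sample matrix Y from the previous sketch yields ‖C − BP‖₂ on the order of σ_{k+1}, realizing the approximation of equation (3).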
The ID algorithm typically requires [Liberty et al., 2007]

$$ C_{\mathrm{ID}} = l\, C_{C^{H}} + O(k\, l\, n) $$

floating-point operations, where C_{C^H} is the cost of applying C^H to a vector; the first term accounts for forming the sample matrix Y, and the second for its pivoted QR factorization.
As shown by Liberty et al. [Liberty et al., 2007], l = k + 5 or l = k + 10 is sufficient. In practice, the rank k is rarely known in advance, so the ID is usually implemented in an adaptive fashion in which the number of samples is increased until the error satisfies the desired threshold ε_ID, as discussed in Section 5.2. This at most doubles the cost [Liberty et al., 2007]. Owing to the randomness, the ID can fail, but the probability of failure is very small [Liberty et al., 2007]. In short, compared with a classical pivoted QR factorization of C itself, the cost is greatly reduced, since only the small matrix Y needs to be factorized.
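A hedged sketch of such an adaptive loop, reusing id_from_samples from the sketch above (the doubling schedule and the direct error check are illustrative choices, not the exact procedure of [Liberty et al., 2007]):

```python
import numpy as np

def adaptive_id(C, eps_id, l0=20, slack=10, rng=None):
    """Increase the sample count l until ||C - B P||_2 <= eps_id.

    Doubling l on each failed pass bounds the total work by roughly
    twice the cost of the final, successful pass.
    """
    rng = rng or np.random.default_rng()
    m, n = C.shape
    l = l0
    while True:
        Omega = rng.standard_normal((l, m))     # fresh Gaussian test matrix
        Y = Omega @ C                           # l x n sample matrix
        k = max(l - slack, 1)                   # keep l = k + slack samples
        cols, B, P = id_from_samples(C, Y, k)
        err = np.linalg.norm(C - B @ P, 2)      # direct check for clarity;
                                                # cheaper estimators exist
        if err <= eps_id or l >= min(m, n):
            return cols, B, P
        l = min(2 * l, min(m, n))               # double the sample count
```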
In some cases, it is more efficient to construct the matrix Ω_{l×m} so that it consists of l uniformly randomly selected rows of the product of the discrete Fourier transform matrix and a random diagonal matrix [Liberty et al., 2007].
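A sketch of this construction (often called a subsampled randomized Fourier transform; the function name is our own): with Ω = R F D, the product Y = Ω C can be formed with FFTs in O(mn log m) operations instead of the O(lmn) required by a dense Gaussian Ω.

```python
import numpy as np

def srft_samples(C, l, rng=None):
    """Y = Omega @ C with Omega = R @ F @ D, where D is an m x m diagonal
    matrix with random unit-modulus entries, F is the m x m DFT matrix,
    and R keeps l rows chosen uniformly at random."""
    rng = rng or np.random.default_rng()
    m = C.shape[0]
    d = np.exp(2j * np.pi * rng.random(m))     # random phases (diagonal of D)
    Z = np.fft.fft(d[:, None] * C, axis=0)     # F @ (D @ C) via column-wise FFT
    rows = rng.choice(m, size=l, replace=False)
    return Z[rows]                             # R: uniform row subsample
```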
3.2. Approximating a Matrix by ID