Reconstructing a rotor from initial and final frames using characteristic multivectors: With applications in orthogonal transformations

If an initial frame of vectors {e_i} is related to a final frame of vectors {f_i} by, in geometric algebra (GA) terms, a rotor, or in linear algebra terms, an orthogonal transformation, we often want to find this rotor given the initial and final sets of vectors. One very common example is finding a rotor or orthogonal matrix representing rotation, given knowledge of initial and transformed points.


INTRODUCTION
In a Geometric Algebra approach, orthogonal transformations are carried out via rotors, R, which act two-sidedly on general objects, M, within the algebra via M → R M R̃. Often we want to be able to determine R given initial and final information about a frame of vectors which has been subject to the transformation. Specifically, given an initial frame of vectors {e_i} and a final frame {f_i} that we know to be related via f_i = R e_i R̃, we would like to recover the rotor R. Results for this in 3D Euclidean space and 4D spacetime have been known for some years (Hestenes & Sobczyk 1 and Doran and Lasenby 2), but recently, Shirokov 3 has given a general result which in principle can be used to find R from a knowledge of {e_i} and {f_i} in any dimension and any signature of metric. This is, however, subject to some caveats: firstly, as presented, it assumes the {e_i} form a standard orthonormal set. In fact, it should be possible to recover R for any starting set of vectors, as long as they span the complete space. Secondly, the proof given by Shirokov 3 is several pages long and quite detailed, relying on aspects of matrix algebra that are somewhat extraneous to a 'pure GA' approach. In this paper, we present a GA proof of the main result which can be carried out in just a few lines, and which immediately generalises to non-orthonormal frames of vectors; re-examining the important result in Shirokov 3 from a pure GA point of view thus provides a much shorter proof of a more general result. Furthermore, in the process of doing this, we will find some very interesting connections with topics in a Geometric Algebra approach to linear algebra which have been pioneered and drawn attention to by David Hestenes 1 but which have gained little visibility as yet, despite being potentially very important. This concerns the subject of characteristic multivectors and their associated use in a GA version of the Cayley-Hamilton theorem. We shall see that our reformulation of the result in Shirokov 3 brings in both of these elements in interesting ways, pointing towards yet further generalisation.
Given that all Lie groups can be represented in a rotor formulation (Doran et al. 4) and that the concept of an 'initial state' being mapped to a 'final state' is ubiquitous in both classical and quantum physics, as well as in engineering, the range of application of formulas such as we discuss here is potentially very wide. However, in this paper, we illustrate the results we have found in just two example applications: the notion of orthogonal transforms in signal processing, where we will look at a rotor formulation of the Haar transform and the 4 × 4 discrete cosine transform (DCT), and the concept of finding the closest orthogonal transform to a given non-orthogonal transform. As well as achieving this aim, of finding an underlying rotor structure for the transform, the approach taken here allows novel concepts such as 'fractional' transforms, or indeed continuous interpolation between classic transform states. Such an approach would obviously generalise to other families of orthogonal wavelet transforms as well, again raising interesting possibilities for future work. This paper is structured as follows. Section 2 describes the concept of characteristic multivectors as given in Hestenes and Sobczyk 1 and their relation to the characteristic polynomial and the Cayley-Hamilton theorem. Section 3 then gives the result and proof of how we construct a rotor relating a frame and its transform from these characteristic multivectors; we start by assuming orthogonal frames and then generalise to non-orthogonal frames. Section 4 considers comparisons between this method and other results in the literature. Section 5 looks at applications of our method to classical orthogonal transforms used ubiquitously in image processing, and at how it also provides a method for straightforwardly obtaining the 'closest' orthogonal transform to a transform that is assumed to have been formed by noise added to an underlying orthogonal transform.

CHARACTERISTIC MULTIVECTORS
Characteristic multivectors were introduced in Chapter 3 of Hestenes and Sobczyk. 1 They generalise the information about linear functions that ordinary matrix algebra provides in the form of the 'trace' and 'determinant' of a matrix, to a set of further quantities which are also invariant under change of basis. As we shall briefly describe below, the scalar parts of these quantities figure in the Cayley-Hamilton theorem and in the 'characteristic polynomial' for a linear function, and hence have been in effective use for a long time. However, the full characteristic multivectors consist of all even grades up to the largest even grade in the space, and hence potentially carry a good deal more information than just the scalar parts which have been used up to now.
We will define them here using the notion of simplicial derivatives, which are defined in Chapter 2 of Hestenes and Sobczyk. 1 These are generalisations of the vector derivative, ∂_c, which, for a vector c and a basis {e_i} (with reciprocal frame {e^i}), can be defined as

∂_c = e^i ∂/∂c^i, where c = c^i e_i,

with sum over repeated indices assumed here and throughout unless otherwise stated. This definition is prototypical of all the derivatives we will define here, in that it is independent of the basis used in its definition. If we instead used a frame {f_k}, with reciprocal frame {f^k}, and chose to write the components of c via c = c^k f_k, then we would find ∂_c = f^k ∂/∂c^k. This frame-independence extends to the simplicial derivatives we now define, and although many of the definitions below could be stated in terms of the combinatorics of products of frame elements, we find it helpful to motivate their original definitions in terms of derivatives, the geometrical nature of which is clear. Once one has these initial definitions in place, one can then proceed to a representation in terms of explicit frames, which need not be orthonormal. For example, for the vector derivative just discussed, we can note the identity of any of the forms

∂_c (⋯) c = e^i (⋯) e_i = f^k (⋯) f_k,

where the dots represent the quantities sandwiched in the implied sums. Note finally, in terms of preliminaries, that in what follows juxtaposition represents the geometric product between quantities in the algebra, and that inner, outer and scalar products take precedence, in terms of ordering, over geometric products.
So suppose we have a set of linearly independent vectors v_1, v_2, … , v_p, which live in an m-dimensional space. We define a simplicial variable v_(p) = v_1 ∧ v_2 ∧ … ∧ v_p and a simplicial derivative ∂_(p) relative to this simplicial variable, formed as the ordered outer product of the individual vector derivatives ∂_{v_p} ∧ … ∧ ∂_{v_1}. Here, for each ∂_{v_i} in the product, we are using the concept of the vector derivative, as defined above. Now, if f is a vector-valued linear function of a vector a living in an m-dimensional space, V_m, and the output f(a) lives in the same space (the simplest case), then we can define the rth simplicial derivative of f as follows.
We let {a_μ}, μ = 1, … , m, be a frame for the space and {a^μ} its reciprocal frame. We then look at simplicial variables built from these frame vectors, of the form a_{μ_1} ∧ … ∧ a_{μ_r}; the rth simplicial derivative of f is then (with no sum over repeated indices) the corresponding ∂_(r) applied to the outermorphism f_(r). The derivative form of this expression tells us that the resulting objects will not depend on whether the frame {a_μ} is orthonormal, or even orthogonal, and that the definitions will work in non-Euclidean signatures, since (as long as reciprocal frames can be defined) this aspect is taken care of in the definition of the vector derivative.
Nevertheless, for practical computations, it is useful to expand the ∂_{a_{μ_1}} … ∂_{a_{μ_r}} combinations using frames in the manner described above. For clarity, we could first do this using a general frame {e_i} and its reciprocal {e^i}; at that point we can notice that we may as well use the already-given frame {a_μ} in place of the {e_i}, and also impose an ordering on sub- and superscripts which will remove the 1∕r! factor. Also we will denote b_μ = f(a_μ) for μ = 1, … , m. Thus we obtain our final form for this specialised rth simplicial derivative, which we will call the rth characteristic multivector of the linear function f:

∂_(r) f_(r) = Σ (b_{μ_1} ∧ … ∧ b_{μ_r})(a^{μ_r} ∧ … ∧ a^{μ_1}),    (4)

where the sum is over all sets of r indices such that 0 < μ_1 < … < μ_r ≤ m. The point about these multivector quantities is that they provide invariant information about the function f. The invariance is in the sense that any frame {a_μ} could be chosen, and we would still get the same objects; they are therefore in some sense 'intrinsic' to the space V_m and the function f. This is similar to what happens in Lie group theory, where given a basis set of generators {g_i} say, i = 1, … , n, if we form the reciprocal basis {g^i} then the quadratic combination g_i g^i is a well-known invariant for the group, known as the quadratic Casimir invariant, and can be shown to be independent of the initial choice of basis. Examples of computing the Casimir invariants for the group SU(3) in a geometric algebra approach are briefly discussed in a companion paper in this collection, 5 but we note that it turns out they are not directly equivalent to characteristic multivectors as discussed here, since they only involve a subset of the elements of the space (viz. the particular bivectors representing the generators).
This completes what we need to say about the definition of characteristic multivectors in a GA context. However, in order to give further understanding of characteristic multivectors as given in their final form (4), in a context which may be useful to those approaching Clifford algebras in a more mathematical way, we can note their relation to the exterior algebra spaces Λ^r(V), where V is the vector space on which the Clifford algebra is based. These are normally given by exterior products of orthogonal vectors, and hence for this purpose, we will temporarily restrict the frame {a_μ} to be orthogonal.
What we are doing in (4) is forming the extension or outermorphism of the linear function f, which initially acts on single vectors, to act on the element a_{μ_1} ∧ … ∧ a_{μ_r} of Λ^r(V). We now form the geometric product of this with the further element of Λ^r(V) defined by a^{μ_r} ∧ … ∧ a^{μ_1}. Such a geometric product can be defined ultimately via a tensor product V ⊗ V quotiented by an ideal expressing the requirement that the action of a quadratic form on a vector yields a scalar. Finally we take the trace over the corresponding frames and dual frames, to obtain a frame-independent answer. Some elements of this technique for extracting information from f appear in the proposals by Sergei Winitzki for doing linear algebra in a coordinate-independent way using just the exterior algebra, 6 and an interesting item for future work will be to investigate the links between his methods and those of Hestenes and Sobczyk, in the form we have described here.

The characteristic polynomial and the Cayley-Hamilton theorem
We can now employ the results of the previous section to look at the characteristic polynomial and the Cayley-Hamilton theorem. These use just the scalar parts of the various simplicial derivatives. As shown in Hestenes and Sobczyk 1 Section 3-2, the characteristic polynomial C_f(λ) of a linear function f is given by

C_f(λ) = Σ_{s=0}^{m} (−λ)^{m−s} ∂_(s) * f_(s),

where ∂_(s) * f_(s) means 'take the scalar part of the quantity ∂_(s) f_(s)' and ∂_(0) * f_(0) is taken as 1. λ is the scalar argument of the polynomial function. If λ is an eigenvalue of f, that is, f(a) = λa, then it is a root of the characteristic polynomial, that is, we will have C_f(λ) = 0. The Cayley-Hamilton theorem states that a linear function satisfies its own characteristic equation, which then tells us that

Σ_{s=0}^{m} (−1)^{m−s} (∂_(s) * f_(s)) f^{(m−s)}(a) = 0

for any input vector a. Here f^{(r)} is the r-fold application of f and f^{(0)}(a) is interpreted as a.
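Since the scalar parts ∂_(s) * f_(s) are (up to sign) the sums of the s × s principal minors of any matrix representing f, the characteristic polynomial and the Cayley-Hamilton property can be checked numerically with ordinary linear algebra. The following sketch is a numerical illustration only, not part of the GA machinery; it uses numpy's np.poly for the coefficients of det(λI − A):

```python
import numpy as np

def eval_poly_at_matrix(coeffs, A):
    """Evaluate p(A) = c0*A^n + c1*A^(n-1) + ... + cn*I via Horner's scheme."""
    n = A.shape[0]
    P = np.zeros_like(A, dtype=float)
    for c in coeffs:
        P = P @ A + c * np.eye(n)
    return P

# coefficients of the characteristic polynomial det(lam*I - A);
# up to signs these are the scalar parts d_(s) * f_(s)
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
coeffs = np.poly(A)                 # lam^2 - 5*lam + 6  ->  [1, -5, 6]
P = eval_poly_at_matrix(coeffs, A)  # Cayley-Hamilton: C_f(A) should vanish
```

Here P vanishes identically, mirroring the Cayley-Hamilton statement above with f replaced by matrix multiplication.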

THE BASIC RESULT FOR RECONSTRUCTING A ROTOR
We now let f be such a rotor transformation, that is, f(a) = R a R̃ where R R̃ = 1. Let us suppose that the set {a_i}, i = 1, … , n, form an n-dimensional frame, and b_i = f(a_i) = R a_i R̃ are what the a_i's are mapped to under f.
Theorem 1. Our basic claim is that R is a scalar multiple of the sum of the characteristic multivectors of f, that is,

R = β Σ_{r=0}^{n} ∂_(r) f_(r),    (7)

where β is a real scalar, and the ∂_(r) f_(r) are given in terms of the 'input' and 'output' frames by Equation (4). This holds in any dimension and signature provided that there are no null basis vectors.
Proof. We start with the case where the frames are orthonormal, and write a_i = e_i, i = 1, … , n. The sum on the r.h.s. of (7) is then (again with the sum over the repeated indices restricted by 0 < μ_1 < … < μ_r ≤ n)

Σ_r Σ (f_{μ_1} ∧ … ∧ f_{μ_r})(e^{μ_r} ∧ … ∧ e^{μ_1}) = R [ Σ_r Σ (e_{μ_1} ∧ … ∧ e_{μ_r}) R̃ (e^{μ_r} ∧ … ∧ e^{μ_1}) ].    (8)

We will thus have proved the result in (7) if we can show that the double sum within the brackets in the last line of (8) is just a scalar. If we divide the overall 2^n-dimensional GA space into vectors a, bivectors B, trivectors T, quadrivectors Q, and so on up to Q_n say, where n is the dimension of the space, then it is easy to see this quantity is the sum of R̃ sandwiched between every basis element of the algebra and the reverse of its reciprocal,

Σ_J E_J R̃ Ẽ^J,    (9)

where {E_J}, J = 1, … , 2^n, is a complete basis for the algebra. We see that Equation (9) is true from the definition of the differentiation with respect to an r-vector, X = X^K e_K, where the {e_K} are the basis r-vectors:

∂_X = e^K ∂/∂X^K,    (10)

which is precisely what the inner sum in (8) gives us. At this stage, it is clear that the rotor nature of R̃ is irrelevant, and we in fact seek to show that Σ_J E_J M Ẽ^J is a scalar for a general even element M, which will then make the l.h.s. of Equation (8) proportional to R. We can make this easier by recognising that what we are now doing is forming a derivative with respect to all 2^n independent elements in the space. In particular, if we write this set of 2^n elements symbolically as E_J, J = 1, … , N, where N = 2^n is the dimension of the entire space, then what we want to show is that, in the usual multivector derivative notation, ∂_E M E = N ⟨M⟩ for any even element M, where E is the set of all {E_J}. This means that we want ∂_E ⟨M⟩_r E = 0 for each even grade r > 0 in the space (as this operation preserves grade). This is easy to establish as follows.
Rewriting the sum so that each term sandwiches M between a basis element and its inverse, we can see that the result we want would follow if each even basis element of the space, apart from the scalar identity, commutes with N∕2 of the elements and anticommutes with the other N∕2. In this case, the sandwiching above would result in everything cancelling except scalar quantities. This result is not difficult to show for M even, but it is in fact more general and holds for any element of the space, not just even elements. As this result could be useful in other contexts, the proof for a general element is given in Appendix A. The overall result is therefore that the sum of the characteristic multivectors equals 2^n ⟨R⟩ R. □

An important special case
We can see that an important special case will arise, and the above method will fail to return a rotor, if it happens that the rotor we are trying to recover has zero scalar part, since then the sum of characteristic multivectors will just return 0.
There is nothing wrong with the above mathematics in this case; it is just that the factor in front of R in the last line of Equation (8), which we now know to equal 2^n ⟨R⟩, will be 0, and hence we cannot invert it to recover R.
As an example of a rotor with zero scalar part, we can consider 180° spatial rotations. For example, in Euclidean space, the rotor R = e_1 e_2 rotates both the e_1 and e_2 axes by 180°, whilst leaving the other axes alone, and clearly has no scalar part.
It is also possible to have rotors with neither a scalar nor a bivector part. For example, in Euclidean 4D space, the pseudoscalar I = e_1 e_2 e_3 e_4 is a rotor, since it satisfies I Ĩ = 1. This quantity rotates all four axes by 180°.
If we have a rotor R with no scalar part, but possessing a non-zero bivector part, then by multiplying it with each basis bivector B_i in turn, we must certainly at some point reach a (combined) rotor R B_i with a non-zero scalar part. Of course, if we only have available information about the initial and final frames, and not the rotor itself, then we cannot explicitly form R B_i. However, we can simulate the effect that the B_i would have on the final frame by noting that if B_i = e_j e_k, say, has negative square, then it will flip the jth and kth axes by 180°, and if it has positive square it will flip all the axes except the jth and kth ones by 180°.
This then provides us with an algorithm to deal with the case where we take the sum of characteristic multivectors, as in the l.h.s. of Equation (8), but find this gives 0. We now form alternative sums, where the signs of the b_i's are flipped in accordance with the object we are conceptually multiplying the desired rotor R on the right by. As described, there will be two flips if the object is a negative-square bivector, and n − 2 flips for a positive-square one. For each such object tried we can ask whether we then get a rotor S with a non-zero scalar part. If we do, the process terminates, and if the bivector basis element concerned was B_i say, we can form the desired R from S by multiplying on the right by B_i^{-1}, since S = R B_i. If we have gone through the entire bivector basis and still not found an S with non-zero scalar part, then we go to the grade-4 basis elements that could be rotors, and work through these, flipping the signs of the b_i's according to the effects that sandwiching in this grade-4 element and its reverse would have on the initial {e_i} set of basis vectors. If this fails, then we go to the grade-6 basis elements which could be rotors, and so on.
Since a rotor cannot actually be zero, this process must terminate at some point, and we will have succeeded in recovering R.
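In the 3D special case this algorithm has a familiar quaternion analogue: the trace-based quaternion-from-matrix formula degenerates for 180° rotations (where 1 + trace = 0), and one can recover by composing the target frame with a known 180° axis flip and undoing it afterwards. A minimal numpy sketch of that special case (function names ours; quaternions stored as (w, x, y, z)):

```python
import numpy as np

def quat_mul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def quat_from_matrix(M):
    """Trace-based recovery; returns None when 1 + trace(M) vanishes,
    which happens exactly for 180-degree rotations."""
    w = 1.0 + np.trace(M)
    if np.isclose(w, 0.0):
        return None
    q = np.array([w, M[2, 1] - M[1, 2], M[0, 2] - M[2, 0], M[1, 0] - M[0, 1]])
    return q / np.linalg.norm(q)

def quat_from_matrix_robust(M):
    """If the direct formula degenerates, compose M with a known 180-degree
    axis flip (a sign change of the target frame), recover the quaternion of
    the composite, then undo the flip."""
    q = quat_from_matrix(M)
    if q is not None:
        return q
    flips = [(np.diag([1.0, -1.0, -1.0]), np.array([0.0, 1.0, 0.0, 0.0])),  # 180 deg about x
             (np.diag([-1.0, 1.0, -1.0]), np.array([0.0, 0.0, 1.0, 0.0])),  # 180 deg about y
             (np.diag([-1.0, -1.0, 1.0]), np.array([0.0, 0.0, 0.0, 1.0]))]  # 180 deg about z
    for Rf, qf in flips:
        q = quat_from_matrix(M @ Rf)
        if q is not None:
            # M @ Rf corresponds to q_M * qf, so undo with qf's inverse
            return quat_mul(q, qf * np.array([1.0, -1.0, -1.0, -1.0]))
    raise ValueError("could not recover a quaternion")
```

A 180° rotation cannot be perpendicular to all three coordinate axes at once, so at least one of the three flips always succeeds, mirroring the termination argument above.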

Extension to non-orthogonal frames
In Section 3 we made the statement that Equations (7) and (8) (evaluating the simplicial derivatives from the 'input' and 'output' frames as given in Equation 4) held for any start and end set of frames and in any signature and dimension; however, the proof we gave there used orthogonal frames. We now note that there is nothing in the proof that specified the signature of the space (provided no basis elements square to zero), and that it holds in any dimension. Additionally, the important point about the characteristic multivectors of a transformation is that provided the frame {a_μ} spans the space, the characteristic multivectors themselves are independent of the choice of frame, and depend just on the transformation f and the space. This means the proof goes through unchanged for non-orthogonal frames. This can be seen specifically in Equations (9) and (10), where all that matters is that we have the R̃ sandwiched by the frame (and its bivector, trivector, … , extensions) and its reciprocal; there is nothing that requires them to be orthogonal.

COMPARISON WITH EXISTING RESULTS
This section will look at some existing results for recovering rotors from initial and final frames, in particular, the results in Shirokov. 3 Note that in this section we again adopt the Einstein summation convention, such that repeated indices are summed over.

Rotors in Euclidean 3D
If we have a frame (not necessarily orthogonal) {e_i} in 3D Euclidean space which is rotated by a rotor R to a frame {f_i} (so that f_i = R e_i R̃), it is well known that we can recover the rotor via the following simple expression:

R = β (1 + f_i e^i),    (15)

where the constant β ensures that R R̃ = 1. This is undoubtedly the simplest form for recovering the required rotation. This result first appeared in Hestenes and Sobczyk 1 and can also be found in Doran and Lasenby. 2 If we were now to use our characteristic multivector formula for recovering the rotor, we have

R ∝ 1 + f_i e^i + (f_j ∧ f_k)(e^k ∧ e^j) + (f_1 ∧ f_2 ∧ f_3)(e^3 ∧ e^2 ∧ e^1),

where i, j, k = 1, 2, 3 and j < k. Initially, this does not look like Equation (15), but in fact it is not difficult to show that they are equivalent. Take one of the bivector-bivector terms, say (f_1 ∧ f_2)(e^2 ∧ e^1). Recall that in 3D Euclidean space the reciprocal frame vectors can be written in terms of outer products of the frame vectors and the inverse of the frame trivector. We are therefore able to rewrite this term using I_3, the unit pseudoscalar in Euclidean 3-space, and the scalar frame volume factors E and F of the initial and final frames. It can be shown that E = 1∕F, which then leads to the bivector-bivector terms collapsing. Applying the same logic to the other terms leads us to the conclusion that the bivector-bivector contribution reproduces the vector-vector contribution. We then note that e_i f^i = e^i f_i; this is true in any space, since we can write e_i f^i = (e_i R e^i) R̃ and e^i f_i = (e^i R e_i) R̃, and the two frame contractions e_i R e^i and e^i R e_i are equal. This therefore tells us that the bivector-bivector term is the same as the vector-vector term.
For the trivector-trivector term we see that it reduces to the product of the frame volume factors E and F as given in Equation (17). Therefore, this term is equal to 1, and we hence have a total expression in which R ∝ 1 + f_i e^i, as expected, in agreement with the standard result (15).
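The 3D formula has a direct matrix/quaternion counterpart: identifying the rotor's scalar part with 1 + trace(M) and its bivector part (dualised to a vector) with the antisymmetric part of the rotation matrix gives the classic trace method for extracting a unit quaternion. A short numpy sketch under that identification (function name ours, standard quaternion convention assumed):

```python
import numpy as np

def rotor_from_frames_3d(E, F):
    """Recover the unit quaternion (3D rotor) taking the frame in the columns
    of E to the frame in the columns of F.  The scalar part of 1 + f_i e^i
    becomes 1 + trace(M), and the remaining components come from the
    antisymmetric part of M = F E^{-1}, the matrix with M e_i = f_i."""
    M = F @ np.linalg.inv(E)
    w = 1.0 + np.trace(M)          # degenerates for 180-degree rotations
    v = np.array([M[2, 1] - M[1, 2], M[0, 2] - M[2, 0], M[1, 0] - M[0, 1]])
    q = np.concatenate(([w], v))
    return q / np.linalg.norm(q)   # normalisation plays the role of beta
```

For an orthonormal initial frame E is just the identity, and the result is the half-angle quaternion of the rotation, as expected from R = β(1 + f_i e^i).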

Rotors in 4D spacetime
Now consider a spacetime rotor R such that f_i = R e_i R̃, i = 1, … , 4, where the signature of the space is (+, −, −, −). It is again well known (see Hestenes & Sobczyk 1 and Doran & Lasenby 2) that the rotor can be recovered by a remarkably simple formula of the same type. For our characteristic multivector formula to work we need the bivector-bivector, trivector-trivector and 4-vector-4-vector terms to jointly cancel out the scalar term and give a multiple of the vector-vector term. This is indeed exactly what happens; depending on the nature of the rotation, the bivector-bivector and trivector-trivector parts are multiples of the vector-vector part or multiples of the bivector part of the vector-vector part, while the 4-vector-4-vector part gives a scalar. Showing this theoretically is more involved than the calculations in Section 4.1 but follows the same lines of argument.
It is interesting to note that the above relations between terms in our characteristic multivector expression can often be related via the characteristic polynomial.This will be discussed further elsewhere.

Formula in Hestenes and Sobczyk 1 for orthogonal transformations
Chapter 3 in Hestenes and Sobczyk 1 describes how linear algebra is dealt with in GA. Here, we see that if two orthonormal frames {e_k} and {f_k} are related by a rotor R, where k = 1, 2, … , n (f_k = R e_k R̃) and the space is Euclidean, we can write R as a product of commuting rotors, R = R_1 R_2 ⋯ . Here, each R_k is a rotation in an elemental plane, so the overall rotor is made up of rotations in orthogonal planes. These planes can be shown to be the planes formed by the eigen-bivectors (which can also be formed from the complex eigenvectors in a standard matrix decomposition). In a later section, we will look specifically at orthogonal transformations and show how the characteristic multivector approach compares to this plane-wise decomposition. We also note here that the recent paper by Roelfs and de Keninck 7 shows how we can decompose either a rotor R or a bivector B into mutually commuting simple rotors or simple bivectors, respectively.

Shirokov's formulation
Shirokov 3 gives the main result in his equation (6.1). This is that if two orthonormal frames, {e_a} and {f_a}, are related by a rotor S, that is, f_a = S e_a S^{-1}, where a = 1, … , n, then, provided a certain scalar normalisation factor (his M) is non-zero, S is recovered (up to scale) as a sum over all multi-index products of the transformed frame vectors with the corresponding reciprocal basis elements. Here, the notation e^{a_1 … a_k}, k ≤ n, is explained in his equation (2.4), which has the definition e_{a_1 … a_k} = e_{a_1} e_{a_2} ⋯ e_{a_k}. We should note carefully that in Shirokov 3 the notation for upper multi-indices is such that it is assumed that e^A = (e_A)^{-1}, which therefore gives e^{a_1 … a_k} = (e_{a_1 … a_k})^{-1} = e^{a_k} ⋯ e^{a_1} and not e^{a_1} ⋯ e^{a_k} (which might be assumed from the way the lumped downstairs indices work), because e^{a_i} = (e_{a_i})^{-1}. We thank Dmitry Shirokov for this clarification concerning the notation.
This formula is therefore the same as our formula in Equation (7) for the orthogonal case. As the result is written entirely in terms of orthonormal frames, it is difficult to extend it to non-orthonormal frames, but it does hold for arbitrary signature, provided no basis vectors square to 0, and arbitrary dimension. Note that we have also given the extension to cases where M = 0.

APPLICATIONS TO ORTHOGONAL TRANSFORMS
In this section we will look at expressing some conventionally very important orthogonal transforms as rotors; this will enable us to create some novel linear mappings, such as fractional transformations.
As examples, we will look at the 2 × 2 Haar transform and the 4 × 4 discrete cosine transform (DCT). Both have been extremely important over the years in image processing and image coding; one reason for their importance is that they are energy-preserving transforms.
Recall that a real n × n matrix Q is orthogonal/orthonormal if its rows/columns are orthonormal vectors, so that Q Q^T = Q^T Q = I, where I is the n × n identity matrix and we are working in an nD Euclidean space. Let us look at orthogonal transforms in two ways. First: let an orthogonal basis in this space be {e_i}, i = 1, … , n. The matrix Q will take this set of n basis vectors to a set of n orthogonal vectors {f_i}, where the {f_i} are the columns of Q.
If we are given {e_i} and {f_i}, we would like to recover the rotor R which takes {e_i} to {f_i} (assuming one exists). Second: for an n × n matrix, consider each pixel as a dimension. We then act two-sidedly (we can think of this as acting on columns and rows sequentially) on an n × n array of pixels, call this X, to produce a transformed block of pixels Y via Y = Q X Q^T (recall, we noted that this is an energy-preserving transform: y^T y = x^T x). We can standardly write X as a linear combination of basis arrays M_{ij} = q_i^T q_j, where each M_{ij} is an n × n array and q_i is the ith row of Q.
• Now take each M_{ij} and unwrap it row by row to form a vector; call this m_k, with k running over the (i, j) pairs. It is not hard to show that the set {m_k} is an orthogonal set if the matrix Q is orthogonal.
• We then look for the rotor which takes the set of pixels, each one being a basis vector, to the orthogonal set of vectors {m_k}, k = 1, … , n².
• In most image processing/coding applications, we learn a lot about the effects of our transforms by decomposing into basis functions, so this approach may well yield something interesting.
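The Method-2 construction above is easy to check numerically: the unwrapped set {m_k} is orthonormal whenever Q is, and the coefficients of the two-sided transform are exactly the inner products with the unwrapped basis arrays. A small numpy sketch (function name ours):

```python
import numpy as np

def unwrapped_basis(Q):
    """Form the n^2 basis arrays M_ij = outer(q_i, q_j) from the rows of an
    orthogonal matrix Q, unwrapping each one row by row into a vector m_k."""
    n = Q.shape[0]
    return np.array([np.outer(Q[i], Q[j]).ravel()
                     for i in range(n) for j in range(n)])

# example: a 2x2 orthogonal Q (here the Haar matrix)
Q = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)
M = unwrapped_basis(Q)            # 4 unwrapped vectors m_k of length 4
X = np.arange(4.0).reshape(2, 2)  # a 2x2 'pixel block'
Y = Q @ X @ Q.T                   # the two-sided, energy-preserving transform
```

Here M @ M.T is the 4 × 4 identity, and Y.ravel() equals M @ X.ravel(), confirming both bullet points above.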
We will now look at some explicit examples.

Example 1: The 2x2 Haar transform/Haar wavelet
The Haar transform has, over the years, been an important tool in image processing. It is also the simplest form of wavelet that displays desirable characteristics. The form of the 2 × 2 Haar transform, T, most commonly used is

T = (1/√2) [ 1  1 ; 1  −1 ],

from which we can see that the Haar transform has lowpass and highpass components.
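The lowpass/highpass behaviour and the energy-preservation property are immediate to verify (illustrative only):

```python
import numpy as np

# 2x2 Haar transform: the first row is the lowpass (averaging) filter,
# the second row the highpass (differencing) filter
T = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)

x = np.array([5.0, 3.0])
y = T @ x        # [(5 + 3)/sqrt(2), (5 - 3)/sqrt(2)]
```

Since T is orthogonal, y carries the same energy as x, split into an average and a difference channel.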
Recall that we form the familiar Haar basis functions by taking t_i^T t_j, where t_i is the ith row of T; these are illustrated in Figure 1, which shows the high- and low-pass basis functions of the 2 × 2 Haar transform.
We now look for the 4D rotor R which performs the required mappings (unwrapping row by row); note that we could choose to do this unwrapping in other ways, for example, column by column.
Using the characteristic multivector formula in 4D (noting that orthonormality implies e^i = e_i), we obtain an expression for the reverse rotor involving the bivector combination e_12 + e_24 − e_13 − e_34 + e_23. Note 1: if you apply the formula in Equation (8), you find ⟨R⟩_0 = 0; to get around this we can apply a simple rotor to the f_i's, work out the new rotor and unwrap the simple rotor, as described earlier. Note 2: if you unwrap the basis function column by column, you get a different rotor, which is related to the above by a rotor that simply permutes (and in some cases negates) the f_i's.

Fractional transforms
Having expressed the Haar transform as a rotor, we can now investigate what fractional transforms might look like. The prescription for generating the fractional rotor is simply to extract the bivector, B, from the rotor, R, and then to create a rotor R_α from αB, where α ∈ [0, 1]. In order to extract the bivector, we first choose how we wish to express our rotor as a function of the bivector, for example, exponential, Cayley, or outer exponential. Below we give the prescription and examples using the Cayley form, as the inversion is easy to illustrate, but it is likely that the exponential form may produce more interesting fractional transforms.
• Write the extracted rotor R as a function of B using the Cayley transform, R = (1 + B)(1 − B)^{-1}, and invert to give B in terms of R: B = (R + 1)^{-1}(R − 1).
• Create a new rotor using αB, α ∈ [0, 1]: R_α = (1 + αB)(1 − αB)^{-1}.
We can now look at the properties of these new transforms generated by the R_α.
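The same prescription can be sketched at the matrix level, where the Cayley transform maps skew-symmetric matrices (the matrix analogue of bivectors) to orthogonal matrices; this is an analogy to the rotor computation rather than the rotor computation itself, and the function names are ours:

```python
import numpy as np

def cayley_bivector(Q):
    """Matrix analogue of extracting B from R via the Cayley transform:
    A = (I - Q) inv(I + Q) is skew-symmetric when Q is orthogonal
    (requires that Q has no eigenvalue -1)."""
    I = np.eye(Q.shape[0])
    return (I - Q) @ np.linalg.inv(I + Q)

def cayley_rotor(A, alpha=1.0):
    """Fractional transform Q_alpha = (I - alpha*A) inv(I + alpha*A)."""
    I = np.eye(A.shape[0])
    return (I - alpha * A) @ np.linalg.inv(I + alpha * A)
```

Here cayley_rotor(A, 0.0) is the identity and cayley_rotor(A, 1.0) recovers Q, with every intermediate Q_α orthogonal; note that in the Cayley form α scales the tangent of the half-angle rather than the rotation angle itself.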
Example 1. As a first example, look at α = 0.5 in the 2 × 2 Haar transform. Taking f_i = R_α e_i R̃_α and forming the orthogonal matrix F from the f_i's (where we are unwrapping the pixels to form vectors, i.e., Method 2) gives a matrix with entries a = 0.68, b = 0.32, c = 0.24√2, d = 0.04. If this F now acts on a 2 × 2 block of pixels that we unwrap to give a vector x = [x_1, x_2, x_3, x_4], and we rewrite the resulting vector y = Fx as a 2 × 2 block of pixels, we obtain the new basis elements shown in Figure 2.
Example 2. As our next example we look at the 4 × 4 discrete cosine transform (DCT). The 4 × 4 DCT is the basis of the compression standard JPEG XR. The transform matrix, T_DCT, is standardly given by

T_DCT = (1/2) [ 1            1            1            1      ;
                √2 cos(π/8)  √2 cos(3π/8) −√2 cos(3π/8) −√2 cos(π/8) ;
                1            −1           −1           1      ;
                √2 cos(3π/8) −√2 cos(π/8) √2 cos(π/8)  −√2 cos(3π/8) ].

The basis functions/arrays of T_DCT are shown in the left image of Figure 3; we have 16 4 × 4 basis arrays, which show increasing horizontal and vertical frequencies as we move from the top left to the bottom right. The application of the 4 × 4 DCT to the Lenna image is shown in the right image of Figure 3 (reordered to view low-low frequency components at the top left and high-high frequency components at the bottom right).
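The matrix above is the standard orthonormal DCT-II, which can be generated and checked directly from its closed form (a sketch; the function name is ours):

```python
import numpy as np

def dct_matrix(n=4):
    """Orthonormal DCT-II matrix: entry (k, m) is
    c_k * sqrt(2/n) * cos((2m + 1) k pi / (2n)), with c_0 = 1/sqrt(2)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    T = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    T[0] /= np.sqrt(2.0)
    return T

T = dct_matrix(4)   # rows are the 1D basis functions; outer products of
                    # pairs of rows give the 16 4x4 basis arrays of Figure 3
```

The rows are orthonormal, so T @ T.T is the identity and the two-sided transform is energy preserving.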
Here, we will use Method 1 to look for the rotor which takes an orthogonal 4D Euclidean frame {e_i} to {f_i}, where the {f_i} are the columns of the DCT. To do this we will use the method of characteristic multivectors, which yields a rotor whose bivector part is clearly an eigen-bivector but not an eigenblade. As noted in Section 4.3, orthogonal transformations can be written as a composition of rotors which represent rotations in orthogonal planes; Hestenes and Sobczyk 1 also showed how these relate to the complex eigenvalues/vectors of the matrix.
For the 4 × 4 DCT we have 4 complex eigenvalues and 4 complex eigenvectors. Taking the two which are not simply conjugates of each other results in two eigen-bivectors, B_1 and B_2, where B_1 and B_2 commute. It is then not hard to show that the sum of these commuting blades, which are the eigenblades, gives the eigen-bivector formed via the characteristic multivectors. An alternative approach would be to take the B formed from the R and split it into commuting blades via, for example, techniques outlined in Roelfs and de Keninck. 7 Having expressed the DCT as a rotor, we can again investigate what fractional transforms look like by following the procedure below (again, while we use the Cayley transform here, other functions of the bivector can be used):
• Use the Cayley transform to find B given R.
• Create a new rotor R_α using αB, α ∈ [0, 1].
Figure 4 shows the basis functions for three fractional DCTs: α = 0.1, 0.5, 0.7. As expected, for α = 0.1 each basis function is approximately a delta function at a given pixel, while the other basis functions are more difficult to interpret, but are moving more towards the case of α = 1 shown in Figure 3.
What uses might we put such fractional transforms to? One possibility is, for a given image, to compress using a transform optimised for that particular image; the decoder would only need to know one extra scalar, α. However, finding an optimal α would perhaps be difficult (it would perhaps need to be learned!).

FORMING THE 'CLOSEST' ORTHOGONAL TRANSFORM
One common task which occurs in a number of fields (e.g., computer vision, molecular dynamics, robotics) is to find, given noisy sets of vectors (which span the space), the rotor which takes one set to the other. Another version of this (which can be viewed as having noise only on the target set of vectors) is to find the 'closest' orthogonal matrix to a given non-orthogonal matrix.
This problem has been addressed multiple times in the literature (Kabsch, 8 Horn et al, 9 Lasenby et al, 10 Wu et al 11 ), and in this volume the problem has also been approached from a particular decomposition viewpoint by Sarabandi and Thomas 12 (also to appear in this volume; citation when available).
Given two sets of noisy vectors, {e i } and {f i } (or one set of vectors mapping to another set with added noise), where the true (noise-free) vectors are related via an underlying rotor, can we use the characteristic multivector formula directly (with the noisy vectors) to get the closest [in some sense] orthogonal transform/rotor relating the sets? Using the characteristic multivector formula will produce a rotor for Euclidean signatures (this will be shown elsewhere, but it is easy to verify symbolically for a given dimension), but how does this rotor relate to, for example, the rotor produced by the standard SVD formula, which we know should be optimal under Gaussian noise (as it satisfies the least squares criterion)?
Here, we present a preliminary investigation of this (a more in-depth study will be presented elsewhere) by asking the following question: given a 4 × 4 matrix, M, find the rotor relating {e i } and {f i }, i = 1, .., 4 (the formula will always give a valid rotor) and investigate the nature of this rotor. For a given rotor R, our simulations will add both uniform and Gaussian noise to the vectors {f i }, where f i = Re i R̃ (adding noise to each of the components of the vector), and will then compare the recovered rotor to R.

Some preliminary results
In this section we will compare the standard Singular Value Decomposition (SVD) method for finding the 'optimal' rotor between two sets of vectors with the Characteristic Multivector (CM) method described in earlier sections. This will be done for the 4D case and for the case of an orthogonal set {e i } mapping to a non-orthogonal set {f i }, which is equivalent to finding the closest orthogonal matrix to a given non-orthogonal matrix (so that its columns are the {f i }). Following the presentation of results by Sarabandi and Thomas, 12 and in order to compare with their findings at some later stage, the generation of the 4 × 4 orthogonal matrices to which noise is added follows the same procedure of sampling points on a 4-sphere (see below).
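For reference, the SVD projection onto the nearest rotation can be sketched as follows. This is the standard orthogonal Procrustes construction, not code from the paper; `nearest_rotation_svd` is an illustrative name.

```python
import numpy as np

def nearest_rotation_svd(M):
    """Closest rotation to M in the Frobenius norm (orthogonal Procrustes):
    M = U S V^T gives R = U V^T; flip one singular direction if det < 0
    so that the result is a proper rotation (det = +1)."""
    U, _, Vt = np.linalg.svd(M)
    if np.linalg.det(U @ Vt) < 0:
        U[:, -1] = -U[:, -1]
    return U @ Vt

# Example: perturb a known rotation and project back onto the rotation group.
rng = np.random.default_rng(1)
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
M_noisy = R_true + 0.01 * rng.standard_normal((2, 2))
R_est = nearest_rotation_svd(M_noisy)
```

This is the UV^T construction referred to later, and the baseline against which the CM method is compared below.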
Using double precision arithmetic, we repeat the procedure outlined below for both uniform and Gaussian noise. For the uniform case, additive noise is taken from a uniform distribution in the interval [−a, a], with a in the interval [0, 0.1] with 100 intermediate noise levels. For the Gaussian case, additive noise is taken from a normal distribution with zero mean and standard deviation σ, where again σ takes 100 evenly spaced values between 0 and 0.1. The simulation then proceeds as follows:
1. Generate two sets of 100 quaternions, q r and q l, using the 'Method of choosing a point on the 4-sphere' as presented in Marsaglia. 13 This method chooses a point according to a uniform distribution on the surface of the unit 4-sphere.
2. Any 4D rotation (orthogonal) matrix can be written as a commutative product of two orthogonal matrices (see Cayley 14 and Kim and Rote 15 ). In geometric algebra terms, this involves splitting the bivector B (such that, for example, the rotor R corresponding to the rotation can be written as exp B) into commuting blades B 1 and B 2 (B = B 1 + B 2 ), so that our two commuting rotors are exp B 1 and exp B 2 . In matrix terms, this can be done by decomposing into right- and left-isoclinic matrices as in Kim and Rote. 15 Since each of these commuting rotors or right- and left-isoclinic matrices can be written as quaternions, it can also be said that any 4D rotation can be represented by a pair of quaternions, or a double quaternion. These double quaternions are then converted to 4D rotation matrices by writing each as a Van Elfrinkhof matrix (Elfrinkhof, 16 Sarabandi et al 17 ).
3. Add uniform/Gaussian distributed noise to each element of the 4D rotation matrices generated.
4. Compute the 4D 'nearest' rotation matrices using both the MATLAB 18 in-built function for SVD and the CM method. This is done using double precision arithmetic for both methods. For the rotor obtained from the CM method, also convert to a rotation matrix.
5. Compute the maximum and the average squared Frobenius norm of the difference between the rotation matrices without noise and the obtained nearest rotation matrices from the SVD and CM methods, respectively. The Frobenius norm is ‖A‖_F = (Σ_{i,j=1}^{N} a_ij²)^{1/2}, where a ij is the (i, j)th element of the matrix and N is the matrix size, 4 in this case.
6. Compute the maximum and the average orthogonality error obtained using the nearest rotation matrices from the SVD and CM methods; if R or R is the original rotor/rotation matrix and R * or R * is the estimated rotor/rotation matrix, the orthogonality error is given by

Figure 5 shows the results for uniform noise; both the maximum and mean Frobenius norm (over 100 trials) are shown. We see that the mean results are very similar, but note that the CM method produces a more variable maximum error for higher noise levels. However, we see in Figure 6 that the CM method produces a marginally 'more orthogonal' rotor, but also note that here we are looking at machine precision.
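Steps 1 and 2 above can be sketched numerically as follows. This is a hedged illustration: the function names and the particular left/right-isoclinic matrix layouts are ours, following the Van Elfrinkhof double-quaternion form, and are not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit_quaternion():
    """Uniform point on the unit sphere in R^4; normalising a vector of
    four independent Gaussians gives the same distribution as Marsaglia's
    method of choosing a point on the 4-sphere."""
    q = rng.standard_normal(4)
    return q / np.linalg.norm(q)

def left_isoclinic(q):
    """Matrix of left quaternion multiplication v -> q v."""
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])

def right_isoclinic(q):
    """Matrix of right quaternion multiplication v -> v q."""
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w,  z, -y],
                     [y, -z,  w,  x],
                     [z,  y, -x,  w]])

def random_4d_rotation():
    """4D rotation as a commuting product of left- and right-isoclinic
    matrices built from a pair of unit quaternions (double quaternion)."""
    return left_isoclinic(random_unit_quaternion()) @ right_isoclinic(random_unit_quaternion())

R = random_4d_rotation()   # orthogonal, det +1
```

The two isoclinic factors commute because left and right quaternion multiplication commute by associativity, mirroring the commuting rotors exp B 1 and exp B 2.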
Figures 7 and 8 show the equivalent results for Gaussian noise where, again, the SVD produces a marginally better average Frobenius error while the CM method is marginally more orthogonal. We note that if single precision arithmetic is used, the errors are significantly higher for the SVD method, but the CM method results are affected less. We also note that while we know the SVD solution is minimising the sum of the squared differences between points, we do not know what the CM method is minimising. We will show in future work that, while the SVD appears to perform better in the noise cases considered above, the CM method appears to be more robust in cases where the initial point correspondences are unknown, so that registration involves estimation of both the correspondences and the transformation.
CONCLUSIONS
As stated in the introduction, this paper was inspired by the work of Shirokov, 3 who essentially gave the rotor formula in Equation 7 for restricted cases; our aim was to give a more general formulation (one that works for non-orthogonal frames and general non-degenerate signatures) and a proof that requires only a knowledge of basic geometric algebra. We also took inspiration from the work of Sarabandi and Thomas 12 for an investigation into the nature of the characteristic multivector sum when the start and end vectors are not related by a rotor.
Given any set of frame vectors {e i } (working in any dimension and any signature, provided no basis vector is null) and a set of frame vectors {f i } which are related to the original frame via a rotor R, i.e., f i = Re i R̃, we have shown how to recover R via a closed-form expression (Equation 7). Expressing orthogonal transformations as rotors provides us with a variety of possibilities, such as defining fractional transforms by interpolating the rotor; some illustrative examples are given for the Haar and DCT transforms. Finally, we note that if the {e i } and {f i } are not related by a rotor, the characteristic multivector (CM) formula still gives a rotor in the Euclidean case. This rotor appears to be closely related to the UV T (or VU T ) construction obtained from the SVD; in fact, we believe that one possibility is that it is the optimal rotor obtained when minimising over the difference of all geometric objects, not just points. We also show that if we use the CM method to obtain the 'closest' orthogonal matrix to a noisy orthogonal matrix, the CM method performs similarly (in terms of Frobenius norm and orthogonality measure) to the SVD under double precision arithmetic. In more complex cases, for example where we do not know point correspondences, it is likely that the CM method will be more robust.
The hope is that now this method has been shown to work in practice and that it is very easily implementable, more diverse applications will be found.

FIGURE 2 The basis functions for the α = 0.5 fractional Haar transform

FIGURE 3 Left: The 16 basis functions of the 4 × 4 DCT. Right: The 4 × 4 DCT applied to the Lenna image

FIGURE 5 The maximum (left) and mean (right) Frobenius norm between the original (R) and estimated (R * ) rotation matrices with the SVD and CM methods (over 100 trials), against noise level. The x-coordinate indicates noise taken from a uniform distribution on [−x, x] [Colour figure can be viewed at wileyonlinelibrary.com]

FIGURE 6 The maximum (left) and mean (right) orthogonality error between the original (R) and estimated (R * ) rotation matrices with the SVD and CM methods (over 100 trials), against noise level. The x-coordinate indicates noise taken from a uniform distribution on [−x, x]

FIGURE 7 The maximum (left) and mean (right) Frobenius norm between the original (R) and estimated (R * ) rotation matrices with the SVD and CM methods (over 100 trials), against noise level. The x-coordinate indicates noise taken from a normal distribution with zero mean and standard deviation x