The model-free or soft-modelling resolution of multivariate data sets is a surprisingly simple idea. The task is best represented graphically:
Typically, the data matrix D is a collection of absorption spectra, where the ns spectra, measured at nl wavelengths, form the rows. According to Beer–Lambert's law, this matrix can be decomposed into the product of two smaller matrices C and A, leaving a matrix E of experimental error or variance unexplained by the model: D = C × A + E. The columns of the matrix C can then be interpreted as the concentration profiles of the nc components, and the matrix A contains, row-wise, their molar absorption spectra. Naturally, all elements of C and of A must be non-negative; there are no negative concentrations, nor are there negative molar absorptivities. Ideally, the matrix E contains only noise. Assuming E to be zero, the aforementioned equation is a system of ns × nl equations, one equation for each element of D, with nc × (ns + nl) unknowns, the elements of C and A.
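To make the dimensions concrete, the following is a minimal sketch that simulates such a data matrix and counts equations and unknowns. Everything specific in it is an assumption made for illustration only: the sizes, the first-order reaction used for the concentration profiles, the Gaussian band shapes, the noise level, and the helper name `gaussian`.

```python
import numpy as np

def gaussian(x, centre, width):
    """Unit-height Gaussian band, a stand-in for a molar absorption spectrum."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

ns, nl, nc = 50, 80, 2           # spectra, wavelengths, components (arbitrary sizes)

# Concentration profiles (ns x nc): assumed first-order reaction X -> Y.
t = np.linspace(0.0, 1.0, ns)
k = 5.0                          # hypothetical rate constant
C = np.column_stack([np.exp(-k * t), 1.0 - np.exp(-k * t)])

# Molar absorption spectra (nc x nl): two assumed, overlapping Gaussian bands.
wl = np.linspace(0.0, 1.0, nl)   # arbitrary wavelength axis
A = np.vstack([gaussian(wl, 0.3, 0.10), gaussian(wl, 0.6, 0.15)])

# Noise matrix E and the simulated measurement D = C x A + E.
rng = np.random.default_rng(0)
E = 1e-3 * rng.standard_normal((ns, nl))
D = C @ A + E

n_equations = ns * nl            # one equation per element of D
n_unknowns = nc * (ns + nl)      # elements of C plus elements of A
print(n_equations, n_unknowns)   # 4000 260
```

Even for this small example, the equations outnumber the unknowns by more than an order of magnitude.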
Often the number of equations is larger, sometimes much larger, than the number of unknowns, so the first question is whether there is a solution at all. The answer is yes: one solution must be the real concentration profiles and spectra. However, the reality is not known in advance, so the second question is how the solution can be found, that is, how the system of equations can be solved. A third important question is whether the solution is unique.
There are several algorithms for the calculation of the unique solution or of one of the possible solutions, one of the most widespread being the ALS (alternating least squares) algorithm (e.g. at http://www.mcrals.info/). Alternative approaches have been published.
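The alternating idea can be sketched in a few lines. This is a simplified illustration, not the implementation distributed at the website above: non-negativity is imposed here by simply clipping negative values to zero after each unconstrained least-squares step, whereas production codes typically use proper non-negative least squares. The function name `als`, the synthetic noise-free data, and the initial guess taken from two columns of D are all assumptions.

```python
import numpy as np

def als(D, C0, n_iter=500, tol=1e-12):
    """Alternate between least-squares estimates of A and C (clipped at zero)."""
    C = np.asarray(C0, dtype=float).copy()
    prev_ssq = np.inf
    for _ in range(n_iter):
        # Given C, solve C x A = D for A, then remove negative absorptivities.
        A = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
        # Given A, solve for C (via the transposed system), then clip again.
        C = np.clip(np.linalg.lstsq(A.T, D.T, rcond=None)[0].T, 0.0, None)
        ssq = float(np.sum((D - C @ A) ** 2))
        if abs(prev_ssq - ssq) <= tol * ssq:   # residual no longer changing
            break
        prev_ssq = ssq
    return C, A, ssq

# Synthetic, noise-free two-component data (assumed shapes, for illustration).
t = np.linspace(0.0, 1.0, 50)
wl = np.linspace(0.0, 1.0, 80)
C_true = np.column_stack([np.exp(-5.0 * t), 1.0 - np.exp(-5.0 * t)])
A_true = np.vstack([np.exp(-0.5 * ((wl - 0.3) / 0.10) ** 2),
                    np.exp(-0.5 * ((wl - 0.6) / 0.15) ** 2)])
D = C_true @ A_true

# Initial guess: the traces at two wavelengths near the band maxima.
C0 = D[:, [24, 48]]
C_fit, A_fit, ssq = als(D, C0)
print("relative residual:", ssq / np.sum(D ** 2))
```

A small residual only shows that the returned C and A reproduce D; it says nothing about whether they are the true profiles and spectra.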
Unique solutions are rather the exception. Generally, these model-free algorithms converge to one arbitrary solution out of the range of all feasible solutions, and there is a risk that the user takes this result as fact, unaware that it is only one out of a range of feasible solutions.
If there is no unique solution, things start to get more challenging. The complication of having a range of feasible solutions is commonly known as ‘rotational ambiguity’. This is an unfortunate expression; ‘transformational ambiguity’ would be a much more appropriate and descriptive term.
The situation can be written as D = C × A + E = (C × T) × (T⁻¹ × A) + E: any matrices Ĉ and Â, where Ĉ = C × T and Â = T⁻¹ × A, are solutions as much as C and A themselves, as long as they fulfil the restrictions imposed by the physics and chemistry of the process; for example, all elements of Ĉ and Â must be non-negative. T is a general transformation matrix, which has to be non-singular, that is, invertible.
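A small numerical sketch illustrates the point. The matrices C and A and the two transformation matrices below are made-up values, and the helper names `transform` and `is_feasible` are hypothetical. Both transformations reproduce C × A exactly, but only one of them respects non-negativity.

```python
import numpy as np

# Made-up two-component example: 4 "spectra" at 3 "wavelengths".
C = np.array([[1.0, 0.0],
              [0.6, 0.4],
              [0.2, 0.8],
              [0.0, 1.0]])
A = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.8]])

def transform(C, A, T):
    """Return C_hat = C x T and A_hat = inv(T) x A for an invertible T."""
    return C @ T, np.linalg.inv(T) @ A

def is_feasible(C_hat, A_hat, tol=1e-12):
    """Check the non-negativity restriction on both factors."""
    return bool(np.all(C_hat >= -tol) and np.all(A_hat >= -tol))

T_ok = np.array([[1.0, 0.0],
                 [0.3, 1.0]])    # mild mixing: stays non-negative
T_bad = np.array([[1.0, 0.0],
                  [0.8, 1.0]])   # stronger mixing: A_hat turns negative

for name, T in [("T_ok", T_ok), ("T_bad", T_bad)]:
    C_hat, A_hat = transform(C, A, T)
    same_product = np.allclose(C_hat @ A_hat, C @ A)
    print(name, "same product:", same_product,
          "feasible:", is_feasible(C_hat, A_hat))
```

With T_ok, the transformed pair is a different but equally valid solution, which is exactly the ambiguity described above; T_bad fits the data just as well but is excluded by the non-negativity restriction.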
Unfortunately, the determination of the complete range of feasible solutions is anything but easy and straightforward. Whereas the general solution for the two-component case was given a long time ago, comprehensive methods for three and more components have only been developed recently and are not yet included in readily available packages. On the other hand, approximate estimates of the range of feasible solutions are available, for example, MCR-Bands (MCR, multivariate curve resolution) on the aforementioned website.
The number of active researchers in the field of determination of the range of feasible solutions is small. Some of them attended the XIII Chemometrics in Analytical Chemistry (CAC) conference in Budapest and, upon invitation by Róbert Rajkó, we met after the conference on the 30th of June 2012 in Szeged, Róbert's hometown in the south of Hungary.
Subjects that were discussed include the following:
- The reliable and quick detection of rotational ambiguity. Iterative algorithms such as the ALS algorithm usually converge to one of the feasible solutions, and many users will take the result as a ‘fact’ without realising that it is only one of a possibly wide range of solutions. Detection of rotational ambiguity and an appropriate warning are required.
- Improved algorithms for the complete computation of rotational ambiguity.
- The reduction of rotational ambiguity by applying well-founded restrictions that go beyond the usual non-negativity of responses and concentrations. Introducing hard constraints is a clear improvement, but it is reasonable to mention that there are not many data sets amenable to such restrictions.
- The representation of the results of the analysis of rotational ambiguity. This is a very important aspect: these analyses will only find application in the wider world of chemistry if researchers can present the results in a form other than a figure that shows a band rather than a line for a spectrum or a concentration profile. Some kind of numerical output would be preferable; comparisons with published results, for example, would then be much simpler.
All these subjects are still a matter of discussion among researchers, with no definitive solutions yet to all of the aforementioned remaining problems. The good side is that we all went home with these questions on our minds, and naturally, we need a follow-up meeting to look at the results of our investigations. The group is open for suggestions for the location. Presently, the favoured place is Seal Rocks (32°26′9.48″S; 152°31′55.42″E), a little fishing village on the Pacific, near Marcel's home. Unfortunately, the rather large distance for most other interested chemometricians is a significant disadvantage.
We named our meeting the first intercontinental meeting on rotational ambiguity, FIMORA. Obviously, the next meeting, the second intercontinental meeting on rotational ambiguity, will be called SIMORA.