Abstract
This paper presents a clusterwise simultaneous component analysis for tracing structural differences and similarities between data of different groups of subjects. This model partitions the groups into a number of clusters according to the covariance structure of the data of each group and performs a simultaneous component analysis with invariant pattern restrictions (SCA-P) for each cluster. These restrictions imply that the model allows for between-group differences in the variances and the correlations of the cluster-specific components. As such, clusterwise SCA-P is more flexible than the earlier proposed clusterwise SCA-ECP model, which imposed equal average cross-products constraints on the component scores of the groups that belong to the same cluster. Using clusterwise SCA-P, a finer-grained, yet parsimonious picture of the group differences and similarities can be obtained. An algorithm for fitting clusterwise SCA-P solutions is presented and its performance is evaluated by means of a simulation study. The value of the model for empirical research is illustrated with data from psychiatric diagnosis research.
1. Introduction
Behavioural researchers often examine whether the underlying structure of a set of variables differs between known groups of subjects. To this end one may, firstly, perform a separate principal component analysis (PCA: Jolliffe, 1986; Pearson, 1901) for each group (e.g., McCrae & Costa, 1997). This implies that, for each group, the variables are reduced to a smaller number of components (see Table 1) which explain as much of the variance in the data as possible. The resulting group-specific loading matrices represent the relations between the variables and the components and yield insight into the structure of the variables within the different groups. This approach leaves plenty of freedom to trace differences between the groups, but it may be hard to gain insight into the structural similarities. Besides, when the number of groups is large, comparing all the loading matrices is practically infeasible.
Table 1. Restrictions imposed by the different component methods for modelling the within-group structure of multivariate data from different groups.

| Method | Component loadings | Component variances | Component correlations |
|---|---|---|---|
| PCA by group (Jolliffe, 1986) | Free | Free | Free |
| Clusterwise SCA-P (this paper) | Equal for all groups in the same cluster | Free | Free |
| Clusterwise SCA-ECP (De Roover et al., 2011) | Equal for all groups in the same cluster | Equal for all groups in the same cluster | Equal for all groups in the same cluster |
| SCA-P (Timmerman & Kiers, 2003) | Equal for all groups | Free | Free |
| SCA-ECP (Timmerman & Kiers, 2003) | Equal for all groups | Equal for all groups | Equal for all groups |
Secondly, one may perform simultaneous component analysis (SCA: Kiers, 1990; Kiers & ten Berge, 1994a; Timmerman & Kiers, 2003). In SCA, the data of all groups are modelled simultaneously, assuming that the same components underlie the data of the different groups and thus that a common loading matrix can be used to summarize the data. As such, SCA is much more parsimonious than the separate PCA strategy and sheds light on the structural similarities of the groups. On the downside, having only one loading matrix for all groups makes it hard to trace structural differences between the groups. Specifically, the only differences that can be detected are between-group differences in the variances of the components (across subjects within a group) and in the correlations between the components. Which of these differences can be uncovered depends on the SCA variant used (Timmerman & Kiers, 2003). In the most constrained variant, called SCA-ECP (i.e., with equal average cross-products constraints), component correlations and variances must be equal across the groups, which implies that there is no room for structural differences between the groups (see Table 1). Using the most general variant, SCA-P (i.e., with invariant pattern constraints), one can trace differences in component correlations as well as variances (see Table 1).
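To make the SCA-P idea concrete, the following is a minimal sketch, not the fitting procedure of any particular software package. It assumes that the data have already been centred per group and relies on the commonly stated result that an SCA-P solution can be obtained from a PCA of the vertically concatenated group data. The function name `fit_sca_p` and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np

def fit_sca_p(groups, n_components):
    """Sketch of SCA-P: one PCA on the vertically stacked group data.

    groups: list of (N_i x J) arrays, assumed to be centred per group.
    Returns a list of group-specific score matrices and one common
    (J x Q) loading matrix.
    """
    stacked = np.vstack(groups)                       # (sum of N_i) x J
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)
    n_total = stacked.shape[0]
    # Scores are scaled to unit mean square over ALL subjects; within a
    # single group their variances and correlations remain free.
    scores = np.sqrt(n_total) * u[:, :n_components]
    loadings = vt[:n_components].T * s[:n_components] / np.sqrt(n_total)
    # Cut the stacked scores back into one block per group.
    boundaries = np.cumsum([g.shape[0] for g in groups])[:-1]
    return np.split(scores, boundaries), loadings

# Synthetic illustration (made-up data): three groups with different spread.
rng = np.random.default_rng(0)
groups = [rng.normal(scale=sd, size=(20, 5)) for sd in (0.5, 1.0, 2.0)]
groups = [g - g.mean(axis=0) for g in groups]         # centre per group
scores, loadings = fit_sca_p(groups, n_components=2)
group_variances = [f.var(axis=0) for f in scores]     # free to differ
```

Because only the loadings are shared, the entries of `group_variances` can differ between groups, which is exactly the freedom that SCA-ECP removes.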
Recently, a generic modelling strategy that encompasses both SCA and separate PCA as special cases was proposed that deals with the disadvantages of these approaches: clusterwise SCA (De Roover et al., 2011). In clusterwise SCA, the different groups of subjects are assigned to a limited number of mutually exclusive clusters and the data within each cluster are modelled with SCA. Thus, groups that are classified into the same cluster share a loading matrix, whereas groups that are assigned to different clusters have different loading matrices. Note that, although factor-analytic alternatives exist for PCA and SCA (e.g., Dolan, Oort, Stoel, & Wicherts, 2009; Lawley & Maxwell, 1962), no factor-analytic counterpart exists for clusterwise SCA, that is, no model is available that provides a clustering of the groups of subjects based on the differences and similarities in factor loading structure.
Within the clusterwise SCA framework, one specific model had already been developed: clusterwise SCA-ECP, which uses the most constrained SCA variant, SCA-ECP, within each cluster. Hence, clusterwise SCA-ECP imposes a very strict concept of structural similarity (see Table 1). First, within each cluster, the correlations among the component scores are constrained to be equal for all groups. This is less ideal if some groups have the same component structure, but differ strongly with respect to component correlations. In such cases, clusterwise SCA-ECP would require additional clusters to adequately summarize the data.
Second, in clusterwise SCA-ECP the variances of the component scores are constrained to be one for each group. This is too restrictive if one is interested in modelling between-group differences in variability across subjects. For example, when a personality questionnaire is administered to several groups of subjects, the ‘neuroticism’ personality trait may underlie the data of all groups, but the variance of this component can be different for groups of healthy persons and clinical groups. In this case, thoughtless application of clusterwise SCA-ECP could even result in inappropriate model estimates. To avoid such problems, the model could be fitted to autoscaled data (i.e., data in which each variable is standardized by group). However, this type of preprocessing has the clear disadvantage that the between-group differences in variability are lost.
To meet the need for a clusterwise SCA model that allows for within-cluster differences in component variances and correlations, we introduce clusterwise SCA-P which models the data within a cluster with SCA-P. Thus, compared to clusterwise SCA-ECP, clusterwise SCA-P is based on a less strict concept of structural similarity which only concerns the component loadings (see Table 1).
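The clusterwise idea can be illustrated with a small sketch that evaluates a given partition of the groups. It is not the algorithm of Section 3. It assumes per-group centred data, treats the within-cluster SCA-P fit as a PCA of the stacked data of the groups in that cluster, and uses the sum of the within-cluster residual sums of squares as the criterion. The names `sca_p_residual`, `clusterwise_loss` and `simulate`, and the synthetic data, are hypothetical.

```python
import numpy as np

def sca_p_residual(groups, n_components):
    """Residual sum of squares of a rank-Q PCA of the stacked groups."""
    s = np.linalg.svd(np.vstack(groups), compute_uv=False)
    return float(np.sum(s[n_components:] ** 2))

def clusterwise_loss(groups, labels, n_components):
    """Sum of the within-cluster residuals for one partition of the groups."""
    loss = 0.0
    for c in sorted(set(labels)):
        members = [g for g, lab in zip(groups, labels) if lab == c]
        loss += sca_p_residual(members, n_components)
    return loss

# Synthetic illustration (made-up data): groups 0 and 1 share one loading
# matrix, groups 2 and 3 share another.
rng = np.random.default_rng(0)
load_a = np.eye(6)[:, :2]
load_b = np.eye(6)[:, 2:4]

def simulate(loadings):
    scores = rng.normal(size=(30, 2))
    data = scores @ loadings.T + 0.1 * rng.normal(size=(30, 6))
    return data - data.mean(axis=0)                   # centre per group

groups = [simulate(load_a), simulate(load_a), simulate(load_b), simulate(load_b)]
loss_true = clusterwise_loss(groups, [0, 0, 1, 1], n_components=2)
loss_mixed = clusterwise_loss(groups, [0, 1, 0, 1], n_components=2)
```

In this toy setting the partition that puts groups with the same loadings together leaves only noise unexplained, so `loss_true` should be clearly smaller than `loss_mixed`.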
The remainder of this paper is organized as follows. In Section 2 the clusterwise SCA-ECP model is recapitulated and the new clusterwise SCA-P model is introduced. Section 3 describes the loss function and an algorithm for clusterwise SCA-P analysis, followed by a model selection heuristic. In Section 4 an extensive simulation study is presented to evaluate the performance of this algorithm and model selection heuristic. In Section 5 clusterwise SCA-P is applied to data from psychiatric diagnosis research. In Section 6 we conclude with a few points of discussion, including directions for future research.
5. Application
In this section, we illustrate clusterwise SCA-P by applying it to data from psychiatric diagnosis research. In this field, the structure of diagnostic categories is extensively investigated, given the heavy criticism of standard diagnostic systems such as the different versions of the DSM (Kendell & Jablensky, 2003; Kendler, 1990; Zachar & Kendler, 2007). Specifically, as these systems define a diagnostic category by indicating which pattern of symptoms is typical for patients that belong to this category, a number of questions can be raised: one can wonder (1) whether clinicians agree about the extent to which different symptoms apply, (2) whether some structure can be discerned in the opinions of clinicians who disagree (do they disagree on the presence of single symptoms that seem randomly selected or on the presence of meaningful types of symptoms?), and (3) whether for some categories clinicians agree more than for others.
To shed light on these questions, we applied clusterwise SCA-P to data that were collected by Mezzich and Solomon (1980). These authors asked 22 clinicians to imagine a typical patient for four diagnostic categories: manic-depressive depressed (MDD), manic-depressive manic (MDM), simple schizophrenic (SS) and paranoid schizophrenic (PS). These categories are part of the nomenclature of mental disorders (DSM-II) issued in 1968 by the American Psychiatric Association. Subsequently, the 22 clinicians rated each archetypal patient on 17 psychopathological symptoms, on a Likert scale from 0 (absent) to 6 (extremely severe). As such, a data set of 88 patients (22 clinicians × 4 categories) by 17 symptoms was obtained (see Mezzich & Solomon, 1980), where each patient belonged to one of the four diagnostic categories. Considering the diagnostic categories as the groups and the patients as the subjects, nested within the groups, we centred the data for each diagnostic category separately and standardized the symptoms across categories (see Section 2.1). In this way the mean symptom profiles of the four diagnostic categories are removed from the data, but the information on the amount of disagreement for each category is retained.
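The preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the exact definition is given in Section 2.1, and the divisor used for the pooled standard deviation (here the total number of subjects) is an assumption. The function name `preprocess` and the synthetic ratings are hypothetical.

```python
import numpy as np

def preprocess(groups):
    """Centre each variable per group, then rescale each variable by its
    standard deviation computed over the subjects of ALL groups."""
    centred = [g - g.mean(axis=0) for g in groups]
    # Assumption: pooled standard deviation with divisor N (all subjects).
    pooled_sd = np.sqrt(np.mean(np.vstack(centred) ** 2, axis=0))
    return [g / pooled_sd for g in centred]

# Synthetic illustration (made-up ratings): 4 groups of 22 subjects,
# 17 variables, with different means and different spread per group.
rng = np.random.default_rng(1)
raw = [rng.normal(loc=m, scale=sd, size=(22, 17))
       for m, sd in ((0.0, 1.0), (3.0, 2.0), (1.0, 0.5), (2.0, 1.5))]
data = preprocess(raw)
```

Group means are removed, but because every group is divided by the same pooled standard deviation, a group with more spread in the raw data keeps more spread afterwards. Standardizing within each group instead would discard exactly that information.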
To these data we fitted clusterwise SCA-P models with Q varying from 1 to 6 and C varying from 1 to 4 (i.e., the number of diagnostic categories). In Figure 4, the VAF percentage of the obtained solutions is plotted. The model selection procedure (see Section 3.3) suggests retaining two clusters, since the average scree ratio is maximal for the solutions with two clusters (Table 6, top). With two as the number of clusters, the solution with three components has the highest scree ratio (Table 6, bottom). Therefore, we decided to retain the solution with two clusters and three components.
Table 6. Scree ratios for the number of clusters C given the number of components Q (top), and for the number of components Q given two clusters (bottom), for the archetypal patients data. The maximal scree ratio in each column is highlighted in bold face.

| No. of clusters | Q = 1 | Q = 2 | Q = 3 | Q = 4 | Q = 5 | Q = 6 | Average |
|---|---|---|---|---|---|---|---|
| C given Q | | | | | | | |
| 2 | 1.41 | 1.28 | 1.49 | 1.68 | 1.85 | 2.00 | 1.62 |
| 3 | 1.16 | 1.29 | 1.16 | 1.06 | 1.13 | 1.21 | 1.17 |
| Q given C = 2 | | | | | | | |
| 3 | | 1.21 | 1.29 | 1.17 | 1.27 | | |
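The selection step can be retraced from the tabulated values. The sketch below only copies the scree ratios from Table 6 and does not recompute them from the data. It assumes that the four values in the bottom part of the table belong to Q = 2 to 5, as their column positions suggest. All variable names are hypothetical.

```python
from statistics import mean

# Scree ratios copied from Table 6 (top): one list per number of clusters C,
# with entries for Q = 1..6.
sr_clusters = {
    2: [1.41, 1.28, 1.49, 1.68, 1.85, 2.00],
    3: [1.16, 1.29, 1.16, 1.06, 1.13, 1.21],
}
# Scree ratios copied from Table 6 (bottom), assumed to belong to Q = 2..5.
sr_components = {2: 1.21, 3: 1.29, 4: 1.17, 5: 1.27}

# Step 1: pick the number of clusters with the largest average scree ratio.
average = {c: round(mean(values), 2) for c, values in sr_clusters.items()}
best_c = max(average, key=average.get)

# Step 2: given that number of clusters, pick the number of components
# with the largest scree ratio.
best_q = max(sr_components, key=sr_components.get)

print(average, best_c, best_q)  # {2: 1.62, 3: 1.17} 2 3
```

The averages reproduce the last column of the table, and the two steps lead to the solution with two clusters and three components.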
In the solution selected, the partition matrix P (not shown) reveals that the PS and SS categories are assigned to the first cluster and the MDD and MDM categories to the second cluster. Therefore, these clusters can be called ‘schizophrenia’ and ‘manic depression’, respectively.
The varimax rotated component loadings of these two clusters are displayed in Table 7. In the schizophrenia cluster, the first component can be labelled ‘grandiosity’ since this is the only symptom with a very strong loading on the component. Given the high loadings for ‘tension’, ‘depressive mood’, and ‘guilt feelings’, the second component of this cluster is named ‘affective symptoms’. On the third component motor and behavioural symptoms such as ‘mannerisms and posturing’, ‘hallucinatory behaviour’ and ‘motor retardation’ load high; therefore, it is labelled ‘behavioural symptoms’.
Table 7. Varimax rotated loadings for the clusterwise SCA-P solution for the archetypal patients data with two clusters and three components. Loadings which are larger than ± .50 are highlighted in bold face.

| Symptom | Cluster 1 (Schizophrenia): Grandiosity | Cluster 1 (Schizophrenia): Affective symptoms | Cluster 1 (Schizophrenia): Behavioural symptoms | Cluster 2 (Manic depression): Blunted affect | Cluster 2 (Manic depression): Anxiety | Cluster 2 (Manic depression): Cognitive symptoms |
|---|---|---|---|---|---|---|
| Depressive mood | 0.17 | 0.87 | 0.24 | 0.00 | 0.14 | −0.08 |
| Excitement | −0.24 | 0.59 | 0.32 | −0.03 | −0.08 | 0.05 |
| Guilt feelings | 0.02 | 0.79 | 0.13 | −0.06 | 0.47 | −0.21 |
| Anxiety | 0.14 | 0.63 | −0.14 | −0.02 | 0.91 | 0.05 |
| Tension | 0.05 | 0.81 | 0.01 | 0.45 | −0.18 | 0.43 |
| Somatic concern | 0.31 | 0.62 | 0.12 | −0.13 | 0.84 | 0.12 |
| Conceptual disorganization | −0.05 | 0.65 | 0.44 | 0.36 | 0.01 | 0.65 |
| Unusual thought content | 0.39 | 0.43 | 0.33 | 0.27 | 0.09 | 0.92 |
| Hallucinatory behaviour | 0.32 | 0.33 | 0.61 | −0.38 | 0.03 | 0.69 |
| Mannerisms and posturing | 0.05 | 0.20 | 1.00 | 0.09 | 0.30 | 0.16 |
| Motor retardation | −0.01 | 0.04 | 1.09 | 0.17 | 0.17 | 0.09 |
| Grandiosity | 0.88 | 0.16 | 0.01 | 0.30 | 0.01 | 0.28 |
| Uncooperativeness | 0.53 | −0.13 | 0.36 | 0.29 | 0.59 | 0.51 |
| Suspiciousness | 0.45 | 0.17 | 0.07 | −0.28 | 0.06 | 1.01 |
| Hostility | 0.37 | 0.00 | −0.08 | 0.60 | −0.30 | 0.07 |
| Blunted affect | −0.40 | −0.04 | 0.39 | 0.95 | 0.01 | −0.12 |
| Emotional withdrawal | −0.21 | 0.17 | 0.33 | 0.48 | 0.41 | −0.01 |
In the manic depression cluster, the first component is called ‘blunted affect’, because of the high loading of this symptom. The ‘somatic concern’ and ‘anxiety’ symptoms have high loadings on the second component, which is thus labelled ‘anxiety’. On the third component cognitive symptoms such as ‘conceptual disorganization’, ‘suspiciousness’ and ‘unusual thought content’ load high; therefore it is named ‘cognitive symptoms’.
The variances and correlations of the component scores are presented in Table 8. From this table, it can be concluded that the variances of the component scores differ substantially between the diagnostic categories that belong to the same cluster. Specifically, in the schizophrenia cluster, the variance on the ‘behavioural symptoms’ component is larger for the simple schizophrenic patients (1.58) than for the paranoid schizophrenic patients (0.42). This indicates a relatively large disagreement among the clinicians about the severity of behavioural symptoms in simple schizophrenic patients. Likewise, in the manic depression cluster, the variance on the ‘blunted affect’ component is much larger for MDD (1.67) than for MDM (0.33), which points to strong disagreement about the extent to which manic-depressive depressed patients are characterized by ‘blunted affect’. These differences in the amount of disagreement about the symptoms of PS and SS, on the one hand, and MDM and MDD, on the other hand, may be explained by the fact that the symptoms of SS and MDD are mostly ‘negative’ (i.e., normal aspects of a person's behaviour disappear), such as mental and motor retardation, reduction of interest, apathy and impoverishment of interpersonal relations. In contrast, PS and MDM are psychiatric disorders with very salient ‘positive’ symptoms (i.e., abnormal symptoms that are added to the behaviour), such as hallucinations, aggression, talkativeness, accelerated speech and motor activity. Therefore, it is not surprising that there is less disagreement about the symptoms of these disorders than about the symptoms of SS and MDD.
Table 8. Variances and correlations of the component scores by diagnostic category for the clusterwise SCA-P solution for the archetypal patients data with two clusters and three components.

| Cluster | Diagnostic category | Component | Variance | Correlation with | Correlation with |
|---|---|---|---|---|---|
| Cluster 1: Schizophrenia | | | | Affective symptoms | Behavioural symptoms |
| | Simple schizophrenia | Grandiosity | 0.96 | .22 | .21 |
| | | Affective symptoms | 1.15 | | −.23 |
| | | Behavioural symptoms | 1.58 | | |
| | Paranoid schizophrenia | Grandiosity | 1.04 | −.24 | −.40 |
| | | Affective symptoms | 0.96 | | .51 |
| | | Behavioural symptoms | 0.42 | | |
| Cluster 2: Manic depression | | | | Anxiety | Cognitive symptoms |
| | Manic depression, depressive | Blunted affect | 1.67 | .04 | −.08 |
| | | Anxiety | 1.14 | | .09 |
| | | Cognitive symptoms | 0.85 | | |
| | Manic depression, manic | Blunted affect | 0.33 | −.12 | .16 |
| | | Anxiety | 0.86 | | −.09 |
| | | Cognitive symptoms | 1.15 | | |
Table 8 also shows the correlations between the component scores for each of the four diagnostic categories. In general, these component correlations are rather low. This indicates that the opinion of clinicians on symptoms of one type is quite independent of their opinion on symptoms of another type.
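Quantities of the kind reported in Table 8 can be derived from the component scores of a single group, as in the sketch below. It is an illustration only: the function name `score_summary` and the synthetic scores are hypothetical, and the use of the divisor N for the variances is an assumption.

```python
import numpy as np

def score_summary(scores):
    """Variances and correlation matrix of one group's component scores.

    scores: (N_i x Q) array holding the component scores of one group.
    """
    cov = np.cov(scores, rowvar=False, bias=True)     # divisor N (assumed)
    variances = np.diag(cov)
    sd = np.sqrt(variances)
    correlations = cov / np.outer(sd, sd)
    return variances, correlations

# Synthetic illustration (made-up scores): 22 subjects, 3 components with
# deliberately unequal spread.
rng = np.random.default_rng(2)
scores = rng.normal(size=(22, 3)) * np.array([1.0, 2.0, 0.5])
variances, correlations = score_summary(scores)
```

Applying such a summary to each group separately gives one set of variances and one correlation matrix per diagnostic category, which is what allows differences to be compared within a cluster.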
We conclude that clusterwise SCA-P allows us to formulate fine-grained yet parsimonious answers to the three research questions outlined above. (1) The psychiatrists indeed disagree on the symptoms of the four disorders. (2) The specific symptoms for which disagreement exists can be grouped into meaningful types, which differ between the schizophrenia and the manic-depressive disorders. (3) The amount of disagreement about the types of symptoms differs between the categories within a cluster. More specifically, the clinicians disagree more about the disorders with negative symptoms (MDD and SS) than about the disorders with positive symptoms (MDM and PS).
6. Discussion
In this paper, the clusterwise SCA-P model was proposed for detecting and modelling structural differences and similarities between data of several groups. Clusterwise SCA-P is more flexible than clusterwise SCA-ECP, as clusterwise SCA-P allows component variances and correlations to vary freely within each cluster. Therefore, clusterwise SCA-P may result in more comprehensive and/or more parsimonious solutions (in terms of the number of clusters) than clusterwise SCA-ECP. For the sake of clarity, we focused on data from different groups of subjects in this paper. However, clusterwise SCA is also applicable to multivariate time series data from multiple subjects (for illustrative applications, see De Roover et al., 2011; De Roover, Ceulemans, & Timmerman, 2011).
We see at least three possible directions for further research. First, in this paper, the number of components was fixed across the clusters. Due to this restriction, differences in the nature of the underlying dimensions are captured rather than differences in the number of underlying dimensions. This is often not ideal. For example, in personality psychology, personality trait structure is often defined by five dimensions (Goldberg, 1990). However, some authors claim that in some cultures extra dimensions might be needed to adequately describe the structure of personality (Diaz-Loving, 1998). Therefore, in future research it would be useful to allow the number of components to vary between clusters. This generalization is not as straightforward as it may seem, as it would result in non-trivial problems with respect to the model estimation. Meanwhile, researchers can use the following strategy: inspect the within-cluster component models of the clusterwise SCA solution obtained and look for signs of overextraction (e.g., one of the components is determined by only one variable, or has low loadings for all variables) and, when indicated, fit an SCA solution with a lower number of components to the data of the groups that belong to the cluster at hand.
Second, clusterwise SCA clusters the groups on the basis of the within-group structures, ignoring between-group differences in variable means. However, these differences in means could reveal interesting additional information. Therefore, one may consider developing an extension of clusterwise SCA in which the group means are modelled as well. Such an extension has already been described for SCA (Timmerman, 2006), which implies a PCA of the group means next to an SCA of the within-group structure. Alternatively, one could model the group means by reduced K-means (Bock, 1987; de Soete & Carroll, 1994; Timmerman, Ceulemans, Kiers, & Vichi, 2010), which would entail a clustering of the groups as well as a dimension reduction of the variables.
Third, it may be useful to introduce group-specific weights to correct for the unwanted dominance of some groups (for an overview of possible weighting strategies, see Van Deun, Smilde, van der Werf, Kiers, & Van Mechelen, 2009). For instance, one may want to give more weight to the data of smaller groups, to avoid the analysis results being primarily influenced by the larger groups.