Abstract
This paper discusses the benefits of using Latent Class Analysis (LCA) rather than K-means Cluster Analysis or Hierarchical Clustering as a way to understand differences among visitors in museums, and is part of a larger research program directed toward improving the museum-visit experience. For our comparison of LCA and K-means Clustering, we use data collected from 190 visitors leaving the exhibition Against All Odds: Rescue at the Chilean Mine in the National Museum of Natural History, Smithsonian Institution, during the winter of 2011-2012. For the comparison of LCA and Hierarchical Clustering, we use data from 312 visitors leaving the exhibition Elvis at 21 in the National Portrait Gallery in January 2011. We are publishing this article here for two reasons: 1) it provides additional mathematical support for the four dimensions of experience preference in the IPOP theory presented in Pekarik et al. (2014) in this issue; and 2) it may encourage readers who are working on statistical methodologies to consider enlisting LCA to help understand the people who use our museums.
In social science research there is often a need to reduce a large number of initial variables to a smaller set of groupings. These groupings may be constructed composites derived directly from values found in the original variables, such as socioeconomic status. Alternatively, they can be latent constructs, derived indirectly from the original variables. Latent constructs point to an underlying characteristic, or set of characteristics, that is not directly measured but can be identified through a mathematical model. A number of different methods for identifying latent variables are currently in use. The most prominent are Clustering and Factor Analysis. Clustering methods divide the data into groupings based on measures of "distance" between data points; Factor Analysis is based on correlations among variables.
The purpose of this paper is to examine two common clustering techniques, K-means Clustering and Hierarchical Clustering, in comparison to Latent Class Analysis (LCA). Latent Class Analysis, a type of Structural Equation Modeling, is based on identifying structure within cases. LCA has existed for quite some time, but advances in mathematics and in software development have made it much more powerful than before, with flexibility not available in other methods.
K-means Clustering
K-means Clustering has been widely used in marketing, especially in market segmentation, and it has been a popular analysis technique in the social sciences for several decades. Krantz, Korn, and Menninger (2009) argued that K-means Clustering is a practical and useful tool for exploring differences among museum visitors.
The K-means method identifies relatively homogeneous groups of cases based on characteristics that interest the analyst. K-means Clustering uses data that is interval or ratio in nature. (With interval data, the distances between neighboring values are equal; ratio data is interval data with a true zero point.) The method clusters cases into a small number of groups, but it requires the analyst to specify the number of clusters in advance and to judge how well the clustering worked through a subjective interpretation of the results. A general type of museum research question suited to K-means is, "What are some identifiable groups of museums that attract similar visitors within each group?" You could cluster museums into k homogeneous groups based on visitor characteristics (where "k" stands for an integer representing the number of distinct groups).
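Outside of JMP, the mechanics of the method are easy to sketch. Below is a minimal K-means example in Python using SciPy rather than the JMP software used in this study; the "visitor" ratings are invented stand-ins with two planted profiles, and note that the analyst must still supply k up front.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
# Invented interval-scale responses for 190 "visitors" on 20 items,
# drawn around two distinct profiles so that clusters are recoverable.
profile_a = rng.normal(1.0, 0.3, size=20)
profile_b = rng.normal(3.0, 0.3, size=20)
visitors = np.vstack([
    rng.normal(profile_a, 0.4, size=(95, 20)),
    rng.normal(profile_b, 0.4, size=(95, 20)),
])

# k must be chosen by the analyst -- the core criticism discussed above.
centroids, labels = kmeans2(visitors, k=2, minit="++", seed=1)
print(centroids.shape)   # (2, 20): one centroid per cluster
```

With well-separated planted profiles the method recovers the two groups; the difficulties described below arise when the structure is less obvious.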
Krantz et al. mentioned criticisms of the K-means method:
[S]ome statisticians do not think K-means Cluster Analysis is rigorous enough. In particular, the random assignment of the cores of clusters is problematic to some, and the researchers' determination of natural clusters is problematic to others. Moreover, cluster results may not be robust. Adding cases to an existing data set or using an entirely new data set may yield a cluster solution that is quite different (2009, 297).
There are indeed many problems with K-means Clustering, even relative to clustering methods generally. The potential difficulties include sensitivity to outliers (extreme values that can skew the results); the need for interval or ratio data, which means that, in calculating distances, you have to know whether the numbers actually add up; and some concerns about the order in which the data is assembled.1 In some cases, data may simply not be appropriate for the K-means method. More fundamentally, the stability of clusters cannot be assumed, because traditionally there has been no objective set of criteria for judging the suitability of solutions. K-means will always produce a solution, and some of those solutions are likely to fit your expectations.
Hierarchical Clustering
Hierarchical Clustering is a clustering technique used when the data is dichotomous in nature, such as when people answer yes or no to survey questions. The technique starts with each point as its own cluster and then repeatedly measures distances and combines the closest clusters, continuing until all points are in one cluster. In the example below the points are visitors, though clustering of items (variables) is also possible. Hierarchical clustering is popular for small data sets, generally under one thousand observations.
Once the analysis is completed, a dendrogram, a branching tree-like diagram, provides a graphic representation of how the survey respondents cluster together into nested groups. The problematic aspect, as with K-means Clustering, comes in choosing the number of clusters to retain. This is similar to the problems in Exploratory Factor Analysis. The interpretation of a Factor Analysis is often misunderstood as the "discovery" of an underlying structure for a set of variables, but this interpretation is not warranted by the mathematics. There is a fundamental indeterminacy: any correlation matrix can be explained by an infinite number of factor structures (Mulaik 1976; Steiger 1990; Steiger and Schönemann 1978). One cannot uniquely infer a "correct" factor structure. Instead we must select, from the infinite number of possible structures, those that are parsimonious and meaningful. It is conceivable that two researchers analyzing the same data could select very different solutions.
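The merge-and-cut mechanics can be sketched outside of JMP as well. This is a minimal illustration using SciPy's Ward linkage on invented yes/no data; the two-group structure is planted, not from the study, and note that Ward linkage formally assumes Euclidean distances, one reason dichotomous data can be awkward here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# 40 invented "visitors" answering 8 yes/no items, with two planted
# response styles so that a two-cluster cut is recoverable.
answers = np.vstack([
    rng.random((20, 8)) < 0.9,   # group prone to answering "yes"
    rng.random((20, 8)) < 0.1,   # group prone to answering "no"
]).astype(float)

# Each visitor starts as its own cluster; Ward linkage repeatedly merges
# the pair of clusters whose union least increases within-cluster variance.
merges = linkage(answers, method="ward")

# Cutting the tree is the subjective step: here we simply request 2 groups.
labels = fcluster(merges, t=2, criterion="maxclust")
print(sorted(set(labels)))
```

The `merges` array is exactly what a dendrogram plots; choosing where to cut it is the judgment call the paper criticizes.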
Latent Class Analysis
Latent Class Analysis (LCA) was developed about fifty years ago as a way to characterize latent variables in the analysis of nominal and ordinal data, the kind more typically obtained in surveys (McCutcheon 1987). (Nominal data are labels, not quantities; ordinal data have order, but the distances between values are not known.) LCA quite easily overcomes all of the problems with K-means Clustering cited above, and the increase in computing power in the 1990s made it a very efficient technique. In the literature, LCA is referred to in different ways. It has been called:
 Latent Structure Analysis (Lazarsfeld and Henry 1968).
 Mixture Likelihood Clustering (McLachlan and Basford 1988; Everitt 1993).
 Model-based Clustering (Banfield and Raftery 1993; Bensmail et al. 1997; Fraley and Raftery 1998a; 1998b).
 Mixture-model Clustering (McLachlan et al. 1999).
 Bayesian classification (Cheeseman and Stutz 1995).
 Latent Class Cluster Analysis (Vermunt and Magidson 2000; 2002).
The best way to distinguish between LCA and cluster analysis is to note that LCA is model-based and cluster analysis is not. By "model-based," we mean that a statistical model is postulated for the population from which the data was gathered (Vermunt and Magidson 2002). Both K-means and LCA seek divisions that maximize the between-cluster differences and minimize the within-cluster differences, but in K-means the choice among solutions is arbitrary or subjective. In LCA, a statistical model allows competing solutions to be statistically tested, so that the decision to adopt a particular model is less subjective. In addition, the items used in the analysis do not need to have the same scale or equal variances. Finally, LCA allows for the examination of the residuals between items used in the analysis. In other words, LCA is useful in examining the data that does not fit the model, thus allowing the analyst to judge the overall quality of the model.
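To make the model-based idea concrete, here is a bare-bones latent class model for yes/no items fit by the EM algorithm. This is a didactic sketch, not the Latent Gold implementation used in this study; the function name and the test data are our own inventions.

```python
import numpy as np

def fit_lca(x, k, n_iter=200, seed=0):
    """x: (n, j) binary matrix; k: number of latent classes.
    Returns class sizes, item-response probabilities, and the log-likelihood."""
    rng = np.random.default_rng(seed)
    n, j = x.shape
    sizes = np.full(k, 1.0 / k)                   # P(class)
    theta = rng.uniform(0.25, 0.75, size=(k, j))  # P(item = yes | class)
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each case
        log_p = (np.log(sizes)
                 + x @ np.log(theta).T
                 + (1 - x) @ np.log(1 - theta).T)   # shape (n, k)
        log_p -= log_p.max(axis=1, keepdims=True)
        post = np.exp(log_p)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update class sizes and item-response probabilities
        sizes = post.mean(axis=0)
        theta = np.clip((post.T @ x) / post.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)
    # log-likelihood of the data under the fitted model
    dens = np.exp(x @ np.log(theta).T + (1 - x) @ np.log(1 - theta).T) @ sizes
    return sizes, theta, np.log(dens).sum()
```

Because the model assigns the data an explicit likelihood, fitting it for several values of k and comparing the resulting log-likelihoods (and BIC values) gives exactly the kind of formal comparison that K-means cannot offer.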
Magidson and Vermunt (2002) ran a simulation study comparing K-means analysis and LCA against discriminant function analysis, a method generally considered the "gold standard" for testing how well variables predict group membership. In the study, group membership was known in advance, and the authors applied the three methods to the data to see how well each recovered it. They argued that they used data that favored K-means analysis. Even so, K-means had an 8 percent misclassification rate versus 1.3 percent for LCA.
LCA versus K-means
This article compares LCA and K-means by using each method to examine the same dataset, collected from visitors to Against All Odds, an exhibition that tells the story of the 2010 Chilean mine rescue in which 33 miners were trapped underground for 69 days before being brought to the surface through an international effort. Fifty-two percent of visitors in the sample were coming to the museum for the first time, 30 percent were alone on the visit, and 82 percent were living in the United States. The average age was 38.75, with a range of 18 to 84, and 55 percent were male.
The data analyzed here come from a brief survey about general behavior preferences, activities that people like to do and identify with, outside of the museum. The survey uses the instrument described in the article "IPOP: A Theory of Experience Preference" in this issue (Pekarik et al. 2014), and is based on the IPOP theory of visitor preferences in four categories: Ideas, People, Objects, and Physical. In IPOP research, museum visitors are given a self-administered questionnaire that asks 38 questions (in the long form), or 20 or eight questions (in the two shorter forms). These instruments are printed in Appendix A of Pekarik et al. in this issue, and are also available from the authors.
The 20-question survey was used here. The items were of the form:
… bring people together (and so on).
For each item, respondents made a selection from a four-level scale: Not me at all, A little me, Me, Very much me.
How many clusters should we choose to start with in the K-means Clustering? In the analysis software JMP 10, the K-means Clustering function provides eigenvalues, a statistic derived from the covariance matrix of the variables. As the number of clusters increases, the eigenvalues decrease. The point at which the drop in eigenvalues markedly decreases (the "elbow point") suggests an approximate number of clusters. The eigenvalues, as shown in figure 1, support a solution of between 3 and 4 clusters.
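The eigenvalue diagnostic itself is straightforward to reproduce outside JMP. In this sketch the data are invented, with three underlying tendencies planted so that the eigenvalues of the item covariance matrix drop sharply after the third; the "largest ratio" elbow rule is one crude convention, not JMP's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
# Invented responses: 190 "visitors" on 20 items, driven by 3 latent tendencies.
latent = rng.normal(size=(190, 3))
loadings = rng.normal(size=(3, 20))
responses = latent @ loadings + rng.normal(scale=0.5, size=(190, 20))

# Eigenvalues of the item covariance matrix, largest first.
eigvals = np.sort(np.linalg.eigvalsh(np.cov(responses.T)))[::-1]

# One crude "elbow" rule: keep the components before the largest
# successive ratio, i.e. where the eigenvalues fall off a cliff.
n_keep = int(np.argmax(eigvals[:-1] / eigvals[1:])) + 1
print(n_keep)
```

Even with the elbow located, the rule only suggests an approximate number of clusters; it does not test one solution against another.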
We chose three clusters to start. JMP 10 also provides a statistic known as the Cubic Clustering Criterion (CCC), which compares the clusters created in K-means with what would be obtained from a uniformly distributed set of points (Sarle 1983). Like the eigenvalues, the CCC is used to compare changes in values across different numbers of clusters; you generally want a CCC between 2 and 3. The CCC fit value of -3.59 is uninterpretable alone, but it suggests that the three-cluster model is not the best fit, and the negative value indicates that there are outliers.
We also calculated a four-cluster K-means model. The four-cluster result has a CCC fit value of -4.94. The three-cluster model therefore fits better, because it has a higher (in this case, closer to zero) CCC value. The negative value indicates that it, too, has some outliers and misfits. As figure 2 illustrates, the clusters overlap very little, but there are also quite a few misfits, or residuals. Within K-means there is no stable way to examine why the residuals are there or to test the model further, such as through a bootstrap technique.
LCA Analysis and Results
The LCA analysis, calculated in the software program Latent Gold 4.5, produced quite a different model. We started by examining one- through six-cluster models simultaneously.
In table 1 we see diagnostic statistics for models with one to six clusters. In general, analysis of the models focuses first on three values: LL, the p-value, and BIC. LL is the Log-Likelihood, the logarithm of the likelihood of the data under the model; comparing the log-likelihoods of two models tests how much more likely the data are under one model than the other, and that comparison can be used to compute a p-value, a measure of statistical significance. BIC is the Bayesian Information Criterion, a statistic created to aid model selection by penalizing models for the number of parameters they use. Looking at these statistics, the four-cluster model might be the best fit because it has the lowest BIC value, but the six-cluster model has the best (highest) log-likelihood. This gives us two candidates for best fit, the four-cluster model and the six-cluster model.
Table 1. Fit Values for Six Cluster Models, Against All Odds Data.

             LL         BIC(LL)   Npar  L²        df   p-value    Class.Err.
  1-Cluster  -4147.289  8598.281   59   6523.841  113  2.5e-1296  0
  2-Cluster  -3961.822  8335.445   80   6152.907   92  6.4e-1236  0.0219
  3-Cluster  -3864.413  8248.723  101   5958.088   71  6.9e-1214  0.0295
  4-Cluster  -3786.108  8200.210  122   5801.477   50  3.4e-1201  0.0674
  5-Cluster  -3734.772  8205.636  143   5698.806   29  6.3e-1202  0.0406
  6-Cluster  -3704.647  8253.484  164   5638.556    8  1.5e-1215  0.0623
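A BIC column of this kind follows the standard formula BIC = -2*LL + Npar*ln(n), where n is the number of cases: the fit term rewards likelihood, and the penalty term grows with the parameter count. The sketch below uses made-up log-likelihoods (not the table's values) to show how the penalty can favor a middle-sized model.

```python
import math

def bic(log_likelihood, n_params, n_cases):
    """Bayesian Information Criterion: smaller is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_cases)

# Hypothetical (invented) log-likelihoods and parameter counts for
# 3-, 4-, and 5-cluster candidates, scored on 190 cases.
candidates = {3: (-3900.0, 101), 4: (-3800.0, 122), 5: (-3760.0, 143)}
scores = {k: bic(ll, p, n_cases=190) for k, (ll, p) in candidates.items()}
best = min(scores, key=scores.get)
print(best)   # 4: the extra fit of 5 clusters does not repay its penalty
```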
We begin by looking more closely at the six-cluster model. The first feature to consider is the parameter estimates (table 2). Four variables stand out: divide into categories, buy things, run, and ski. They are not helping to separate the clusters (that is, their p-values are not significant). They could be removed, or set to correlate, and the model rerun. That is one option to consider, but there are more diagnostic features to examine first.
Table 2. Parameter Estimates for the Six-Cluster Model.

  Models for Indicators                  Clusters
                          1     2     3     4     5     6     Wald   p-value   R²
  Identify Patterns       0.29  1.04  0.72  2.51  0.29  3.70  30.04  1.40E-05  0.39
  Divide into Categories  0.47  0.96  0.23  0.29  0.65  0.72  10.83  0.055     0.27
  Analyze Situations      1.14  0.51  0.18  3.18  2.86  5.59  32.46  4.80E-06  0.49
  Learn Philosophy        0.34  0.43  0.64  1.72  0.17  3.31  20.28  0.0011    0.19
  How Things Work         0.31  3.48  0.44  4.12  2.02  4.83  52.77  3.70E-10  0.60
  How Things are Made     0.66  3.25  1.30  3.43  1.84  2.88  56.98  5.10E-11  0.56
  Buy Things              0.23  0.30  0.52  0.01  0.10  0.10   9.18  0.1       0.07
  Construct Things        0.27  1.61  1.19  1.75  0.02  1.89  42.25  5.20E-08  0.44
  Help Others             0.52  0.44  0.56  3.69  2.11  2.07  32.24  5.30E-06  0.36
  Talk About Families     0.28  0.55  1.38  0.81  1.07  0.85  16.10  0.0066    0.29
  Leisure With Others     0.39  0.05  1.51  0.80  0.65  1.22  12.08  0.034     0.27
  Emotionally Connect     0.28  1.93  3.40  3.46  2.59  2.07  37.88  4.00E-07  0.55
  Bring People Together   0.08  0.78  1.85  2.06  1.41  1.64  32.62  4.50E-06  0.37
  Run                     0.09  0.06  0.23  0.15  0.22  0.11   2.71  0.75      0.02
  Ski                     0.04  0.10  0.13  0.16  0.97  0.54   2.90  0.72      0.03
  Camp                    0.18  0.37  0.36  0.94  1.87  0.74  19.04  0.0019    0.17
  Play Sports             0.04  0.28  0.41  0.34  0.29  0.04   9.32  0.097     0.07
Next, we examine the bivariate residuals. The bivariate residual (BVR) is a measure of the association between a pair of indicators that the model leaves unexplained, calculated by dividing the chi-square value for the pair by its degrees of freedom (df). In this case, we would expect that value to be around 1, because there is 1 df in the test. Values close to 1 indicate that there is no serious problem with the model; BVRs over 3.84 (with 1 df) indicate that the model is not explaining the relationship well.
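Latent Gold computes the BVR internally, but the underlying arithmetic is simple: a Pearson chi-square comparing the observed cross-tabulation of two items with the cross-tabulation the fitted model implies, divided by the degrees of freedom. Here is a sketch with invented observed and model-expected counts (not values from the study):

```python
import numpy as np

def bvr(observed, expected):
    """Pearson chi-square of a 2x2 table divided by its 1 degree of freedom."""
    chi_sq = ((observed - expected) ** 2 / expected).sum()
    return chi_sq / 1.0

# Invented yes/no cross-tab for a pair of items, and the counts a
# hypothetical latent class model predicts for the same pair.
observed = np.array([[120.0, 40.0], [35.0, 117.0]])
expected = np.array([[110.0, 50.0], [45.0, 107.0]])
print(round(bvr(observed, expected), 2))   # 6.07, above the 3.84 cutoff
```

A value this far above 3.84 would flag the pair as a relationship the model explains poorly.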
The BVR table is too large to reproduce here, but in the case of the four-cluster model there is one BVR over 20 and six over 10; it seems that some of these pairs might have similar meaning or a common underlying factor. For example, Talk About Families and Leisure Time With Others (BVR = 19.82) are both socially oriented activities. We could fix residual pairs like these by correlating them and reanalyzing the model. Doing this would provide a better-fitting model, but a wiser course of action is to consider an alternative model. In the six-cluster model, there is only one very problematic residual (between Talk About Families and Leisure Time With Others), which is over 10. Because it has fewer problematic residuals, we accept the six-cluster model as the best fit.
LCA also provides the ability to look at profiles of the clusters from multiple perspectives. One of these is the Profile Perspective. Table 3 presents the Profile Perspective for five items in the six-cluster model. These five items are the ones that we had intended to use to identify the People dimension in this version of the IPOP survey instrument.
Table 3. Profile Perspective of Five Items in the Six-Cluster LCA Model.

                     Cluster 1  Cluster 2  Cluster 3  Cluster 4  Cluster 5  Cluster 6
  Cluster Size       0.38       0.25       0.16       0.14       0.04       0.02

  Leisure With Others
    Not me at all    0.01       0.01       0.00       0.00       0.05       0.09
    A little me      0.26       0.26       0.01       0.05       0.46       0.56
    Me               0.56       0.56       0.23       0.42       0.44       0.33
    Very much me     0.17       0.17       0.76       0.54       0.06       0.03

  Help Others
    Not me at all    0.02       0.00       0.00       0.00       0.22       0.21
    A little me      0.28       0.10       0.09       0.00       0.55       0.55
    Me               0.56       0.54       0.52       0.06       0.22       0.23
    Very much me     0.14       0.36       0.39       0.94       0.01       0.01

  Talk About Families
    Not me at all    0.03       0.07       0.00       0.00       0.19       0.19
    A little me      0.35       0.51       0.04       0.13       0.60       0.60
    Me               0.50       0.37       0.39       0.51       0.20       0.20
    Very much me     0.13       0.05       0.57       0.35       0.01       0.01

  Emotionally Connect With Others
    Not me at all    0.01       0.07       0.00       0.00       0.16       0.09
    A little me      0.19       0.53       0.00       0.00       0.60       0.55
    Me               0.75       0.39       0.25       0.24       0.23       0.36
    Very much me     0.06       0.01       0.75       0.76       0.00       0.00

  Bring People Together
    Not me at all    0.06       0.15       0.00       0.00       0.31       0.38
    A little me      0.32       0.44       0.04       0.03       0.47       0.46
    Me               0.53       0.37       0.44       0.39       0.21       0.16
    Very much me     0.09       0.03       0.52       0.58       0.01       0.01
In this table, the values in each cluster's column for an item add up to one hundred percent (give or take a little rounding). Consider the item Leisure With Others for Cluster 3: the decimal values give the share of the cluster choosing each category, so 76 percent of Cluster 3 visitors chose "Very much me" for that item. You can also create a visual display showing how clusters compare on each item (figure 3).
Figure 3 illustrates items from the survey across two clusters, Cluster 3 and Cluster 6. (With multiple clusters, it is sometimes easier to examine them in pairs.) Cluster 3 indicates a group that appears to engage with other people while Cluster 6 indicates a group that seems more drawn to physical activities (ski, camp, and play competitive sports). The four rightmost items on the chart—run, ski, camp, and play competitive sports—were the items intended to identify the Physical dimension.
Similarly, one of the clusters indicates a group that appears to favor the Idea items, and another points to those more likely to identify strongly with the Object items. In other words, the LCA analysis confirms that the responses to this survey form four groups that support the four dimensions of the IPOP model (Idea, People, Object, Physical). The remaining two clusters contain cases that tended to mark many items high or many items in the low-to-middle range of the scale.
LCA software also allows you to add covariates and random and fixed effects to the model, and to run bootstraps to compare cluster models and test the model itself. Latent Gold, the software we used for this study, can also compare a dichotomous factor analysis to the cluster analysis. It is additionally possible to use the latent class results, in conjunction with the observed behaviors of the visitors, to create Bayesian prediction models.
There are further analyses that can be done with this approach, but the purpose here is to introduce and demonstrate how the latest analytical methods can be used to improve our understanding of museum visitors.
LCA versus Hierarchical Clustering
For this comparison we are using data from a survey evaluation of visitors leaving the exhibition Elvis at 21: Photographs by Alfred Wertheimer. This was an exhibition of photographs of Elvis Presley (1935-1977) taken by photojournalist Alfred Wertheimer in 1956, the moment the 21-year-old singer was about to enter the national stage. The exhibition was studied during its presentation at the National Portrait Gallery, Smithsonian Institution, Washington, D.C. In the middle of January 2011, 210 entering visitors and a separate sample of 312 exiting visitors were surveyed by the Smithsonian Institution's Office of Policy and Analysis. Forty-six percent of the exiting visitors were male, 45 percent indicated this was a first visit to the museum, 33 percent specifically came to see the exhibition, and the average age was 39.8.2
Visitors were asked the following question and were allowed to choose as many answers as they wished:
Which experiences did you find especially satisfying in Elvis at 21? (Mark one or more):
Being moved by beauty
Connecting with the emotional experiences of others
Enriching my understanding
Gaining information
Getting a sense of the everyday lives of others
Recalling memories
Reflecting on the meaning of what I saw
Seeing rare, valuable, or uncommon things
This list of response items derived from a long-standing effort by the Office of Policy and Analysis to understand the expectations and experiences of museum visitors.3 The selection of options for this particular study was influenced by the IPOP theory of experience preference. Two items were intended to address the Idea dimension (Gaining information; Enriching my understanding); two addressed the People dimension (Connecting with the emotional experiences of others; Getting a sense of the everyday lives of others); and two addressed the Object dimension (Being moved by beauty; Seeing rare, valuable, or uncommon things). We did not attempt to phrase experiences in the Physical dimension for this art exhibition.
Hierarchical Clustering and Results
For this analysis, we focused on clustering the exiting visitors according to their responses to the question about especially satisfying experiences. Using JMP 10, we applied the hierarchical clustering function with the Ward method. JMP 10 produces two graphic representations: the dendrogram and the scree plot showing where each visitor entered a cluster. Figure 4 shows the scree plot.
This plot suggests that the number of clusters to retain is four or five. There is a large distance between the four-cluster solution (just above 15 on the distance, or y, axis) and the five-cluster solution (just below 13). Beyond this, there are no diagnostics that can assist in making a decision about the ideal number of clusters. The automated analysis within JMP actually chose an unwieldy 17 clusters as the cutoff. Visual inspection of the dendrogram suggests that there could be anywhere from two to eight clusters, depending on where you think the cutoff should be (figure 5). The curved graph of blue stars at the bottom presents the same data as the scree plot above, but in a different format.
The crucial issue here is that we do not have a good idea where to draw the line, so to speak. Nor can we compare competing cluster solutions on technical grounds. We can choose what seems parsimonious to us, but we have no external basis for deciding.
LCA Analysis and Results
The first step in this LCA analysis was to run multiple cluster models simultaneously and examine the diagnostic statistics (table 4). We ran models with one to five clusters.
Table 4. Initial LCA Analysis of Elvis at 21 Satisfying Experience Data.

                        LL        BIC(LL)   Npar  L²        df   p-value
  Model 1: 1-Cluster    -1660.08  3367.444    8   350.9455  247  1.50E-05
  Model 2: 2-Cluster    -1617.08  3334.650   17   264.9545  238  0.11
  Model 3: 3-Cluster    -1596.62  3346.921   26   224.0284  229  0.58
  Model 4: 4-Cluster    -1584.69  3376.250   35   200.1600  220  0.83
  Model 5: 5-Cluster    -1571.42  3402.906   44   173.6189  211  0.97
As we described earlier in this paper, analysis of the models focuses first on three values: LL, the p-value, and BIC. The analyst is looking for a better log-likelihood, a lower BIC, and a p-value that is nonsignificant. It is common to seek a p-value less than .05, but in this case the p-value should be greater than .05, in essence accepting that the data fit the model. From all of the diagnostic data here, Model 2 has the lowest BIC and a nonsignificant p-value, so that is a good starting point. However, the two-cluster model has 11 bivariate residuals (BVRs) above 3.84, indicating that the model is not explaining the relationships between those variables well. On closer inspection this makes sense, because the two clusters consist of those who chose most of the eight items and those who did not. The three-cluster model has similar bivariate residual problems.
Therefore, the four- and five-cluster models look like better candidates for correct specification. There is one residual that is high for the four-cluster model, and although that could technically be adjusted for, the five-cluster model looks like the best fit because it does not have any BVR problems. Although the five-cluster model appears to fit the data best, it is advisable to run a bootstrap analysis to double-check the results against the four-cluster model. The bootstrap process creates thousands of new samples from the original sample by using random sampling with replacement; statistics calculated on each of these many samples are then examined against one another. The bootstrap results indicate that the five-cluster model is a better fit than the four-cluster model: the log-likelihood difference test gives a difference of 26.54 with a p-value of 0.04.
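The bootstrap comparison itself is performed inside Latent Gold, but the resampling idea is generic and easy to sketch. In the illustration below, a simple sample mean stands in for the log-likelihood difference statistic, and the data are invented; the point is only the resample-and-reexamine loop.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.4, scale=1.0, size=312)  # invented stand-in data

# Draw many resamples (with replacement) and recompute the statistic on each.
boot_stats = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(2000)
])

# Bootstrap p-value for the hypothesis that the true statistic is <= 0.
p_value = (boot_stats <= 0).mean()
print(p_value < 0.05)
```

In the real analysis the resampled statistic is the log-likelihood difference between the four- and five-cluster models, and its bootstrap distribution supplies the p-value of 0.04 reported above.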
Next, we look at the profile plot (table 5). The profile plot shows, within each cluster, the proportion of respondents who did not select and did select each item (0 = not selected; 1 = selected). Values over 0.5 for those who selected an item are highlighted.
Table 5. Profile Plot for the Five-Cluster Model.

                     Cluster 1  Cluster 2  Cluster 3  Cluster 4  Cluster 5
  Cluster Size       0.50       0.26       0.15       0.06       0.03

  Being moved by beauty (Beauty)
    0                0.9191     0.9212     0.7895     0.0704     0.1153
    1                0.0809     0.0788     0.2105     0.9296     0.8847

  Seeing rare, valuable or uncommon things (Rare)
    0                0.8287     0.7244     0.5680     0.1088     0.2978
    1                0.1713     0.2756     0.4320     0.8912     0.7022

  Enriching my understanding (Understanding)
    0                0.8966     0.6931     0.0287     0.6661     0.0142
    1                0.1034     0.3069     0.9713     0.3339     0.9858

  Gaining information (Information)
    0                0.8764     0.7339     0.3888     0.9956     0.0182
    1                0.1236     0.2661     0.6112     0.0044     0.9818

  Connecting with the emotional experience of others (Emotion)
    0                0.9922     0.0773     0.7521     0.5350     0.0220
    1                0.0078     0.9227     0.2479     0.4650     0.9780

  Getting a sense of the everyday lives of others (Everyday lives)
    0                0.7527     0.5781     0.6409     0.8555     0.0182
    1                0.2473     0.4219     0.3591     0.1445     0.9818

  Recalling memories (Memory)
    0                0.8137     0.6735     0.8944     0.9267     0.1080
    1                0.1863     0.3265     0.1056     0.0733     0.8920

  Reflecting on the meaning of what I saw (Meaning)
    0                0.8872     0.9115     0.6923     0.8302     0.2326
    1                0.1128     0.0885     0.3077     0.1698     0.7674
In the table results, Cluster 1 includes 50 percent of the visitors, Cluster 2 has 26 percent, Cluster 3 has 15 percent, Cluster 4 has 6 percent, and Cluster 5 has 3 percent. The rows in table 5 indicate, for example, that only 8 percent in Cluster 1 chose "Being moved by beauty" as an especially satisfying experience, as did 8 percent in Cluster 2, but 92 percent in Cluster 4 and 88 percent in Cluster 5 did so. The columns indicate that respondents in Cluster 5 tended to select all the items. Another way to look at this is to note that Cluster 4 participants were much more likely than participants in the other clusters to report that they found rare objects especially satisfying.
At this point, we can also entertain some general differences across the clusters: those in Cluster 1 did not find anything especially satisfying; those in Cluster 2 aligned with Emotion and Everyday lives (the two People items); those in Cluster 3 with Understanding and Information (the two Idea items); those in Cluster 4 with Beauty and Rare (the two Object items); and those in Cluster 5 with almost everything. This is easier to see in a plot of the data that illustrates the clusters and the proportion of people who chose each item (figure 6). In the plot, you can readily see the five clusters and their differences. Cluster 5 is the group that found most of the experiences satisfying overall, and Cluster 1 found the fewest.
Table 6 provides the bivariate residuals, which also support a five-cluster model, since the highest is 2.2, well below the 3.84 threshold.
Table 6. Bivariate Residuals for Selections in the Five-Cluster Model.

     Indicator       1       2       3       4       5       6       7
  1  Beauty          .
  2  Emotion         0.0416  .
  3  Understand      0.0387  0.0052  .
  4  Information     0.0582  0.0001  0.1654  .
  5  Everyday Lives  1.2205  0.2130  1.5119  0.3675  .
  6  Memory          1.3514  0.0010  0.1352  1.1216  0.5604  .
  7  Meaning         0.0745  0.0232  0.0180  0.1142  0.1700  2.2020  .
  8  Rare            0.1573  0.1142  0.3016  0.0473  1.7512  0.2523  1.6083
Once clusters have been identified, other differences can be examined. You can save each individual's cluster assignment and use those new variables to address further research questions. For example, we can ask whether these clusters differed in their rating of the overall experience.4 Results from a Kruskal-Wallis analysis of the overall experience rating (table 7) indicate a statistically significant difference among the clusters.5
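The Kruskal-Wallis test itself is widely available. A sketch with SciPy and invented 1-5 ratings (the real ratings are in the survey data and are not reproduced here) shows the shape of the comparison:

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)
# Invented overall-experience ratings for three hypothetical clusters,
# with the third cluster shifted upward.
c1 = rng.integers(1, 5, size=60)   # ratings 1-4
c2 = rng.integers(1, 5, size=60)   # ratings 1-4
c3 = rng.integers(3, 6, size=40)   # ratings 3-5

# Kruskal-Wallis compares the rank distributions across the groups
# without assuming the ratings are interval-scaled.
stat, p = kruskal(c1, c2, c3)
print(p < 0.05)
```

A significant result, as in table 7, would then be followed by pairwise post-hoc comparisons to see which clusters differ.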
Table 7. Kruskal-Wallis Test Result and Post-hoc Analysis.

  Hypothesis Test Summary
    Null Hypothesis: The distribution of Rating is the same across categories of Cluster modal.
    Test: Independent-Samples Kruskal-Wallis Test
    Sig.: .001
    Decision: Reject the null hypothesis.

  Pairwise Comparisons (each node shows the sample average rank of cluster modal)

  Sample 1-Sample 2  Test Statistic  Std. Error  Std. Test Statistic  Sig.   Adj. Sig.
  1.00-5.00          90.600          26.122      3.468                .001   .005
  3.00-5.00          83.173          28.443      2.924                .003   .035
  2.00-5.00          74.820          26.468      2.827                .005   .047
  1.00-4.00          48.764          18.302      2.664                .008   .077
  4.00-5.00          41.836          30.436      1.375                .169   1.000
  3.00-4.00          41.337          21.485      1.924                .054   .544
  2.00-4.00          32.984          18.793      1.755                .079   .792
  1.00-2.00          15.780          10.449      1.510                .131   1.000
  1.00-3.00          7.427           14.752      .503                 .615   1.000
  3.00-2.00          8.353           15.356      .544                 .586   1.000
The post-hoc analysis indicates that, not surprisingly, members of Cluster 5 had a more satisfying experience than members of Clusters 1, 2, and 3. The clusters of interest for us, Clusters 2, 3, and 4, did not differ statistically in their experience ratings, suggesting that each of them might have gotten what they wanted out of the exhibition.
There are further analyses that can be completed, but the purpose here is to introduce and demonstrate how the latest analytical methods can improve the understanding of museum visitor experience.