Abstract

This paper discusses the benefits of using Latent Class Analysis (LCA) rather than K-means Cluster Analysis or Hierarchical Clustering as a way to understand differences among visitors in museums, and is part of a larger research program directed toward improving the museum-visit experience. For our comparison of LCA and K-means Clustering, we use data collected from 190 visitors leaving the exhibition Against All Odds: Rescue at the Chilean Mine in the National Museum of Natural History, Smithsonian Institution, during the winter of 2011-2012. For the comparison of LCA and Hierarchical Clustering, we use data from 312 visitors leaving the exhibition Elvis at 21 in the National Portrait Gallery in January 2011. We are publishing this article here for two reasons: 1) it provides additional mathematical support for the four dimensions of experience preference in the IPOP theory presented in Pekarik, et al. (2014) in this issue; and 2) it may encourage readers who are working on statistical methodologies to consider enlisting LCA to help understand the people who use our museums.

Social science research often requires reducing a large number of initial variables into a smaller set of groupings. These groupings may be constructed composites derived directly from values found in the original variables, such as socioeconomic status. Alternatively, they can be latent constructs, derived indirectly from the original variables. Latent constructs point to an underlying characteristic, or set of characteristics, that is not directly measured; these values can be identified through a mathematical model. A number of different ways of identifying latent variables are currently in use. The most prominent of these are Clustering and Factor Analysis. Clustering methods divide the data into groupings based on measures of "distance" between data points; Factor Analysis is based on correlations among variables.

The purpose of this paper is to examine two common clustering techniques—K-means Clustering and Hierarchical Clustering—in comparison to Latent Class Analysis (LCA). Latent Class Analysis, a type of Structural Equation Modeling, is based on identifying structure within cases. LCA has existed for quite some time, but advances in mathematics and in software development have made it much more powerful than before, with flexibility not available in other methods.

K-means Clustering

K-means Clustering has been widely used in marketing, especially in market segmentation, and it has been a popular analysis technique in the social sciences for several decades. Krantz, Korn, and Menninger (2009) argued that K-means Clustering is a practical and useful tool for exploring differences among museum visitors.

The K-means method is used to identify relatively homogeneous groups of cases based on characteristics that interest you. K-means Clustering requires data that is interval or ratio in nature. (With interval data the distances between neighboring values are equal; ratio data is interval data with a true zero point.) It can be used either to reduce variables or to cluster cases into smaller groups. The method requires that the analyst specify the number of clusters in advance and then judge how well the clustering worked through a subjective interpretation of the results. A general type of museum research question suited to K-means is, "What are some identifiable groups of museums that attract similar visitors within each group?" You could cluster museums into k homogeneous groups based on visitor characteristics (where "k" stands for an integer representing the number of distinct groups).
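
To make the mechanics concrete, here is a minimal sketch of a K-means run in Python with scikit-learn. It is purely illustrative: the analyses in this paper were done in JMP 10, and the survey-like data below is randomly generated placeholder data, not the study data.

    # Minimal K-means sketch (illustrative; the analyses in this paper were run in JMP 10).
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    responses = rng.integers(1, 5, size=(190, 20)).astype(float)  # placeholder: 190 visitors x 20 items, scored 1-4

    k = 3  # the analyst must choose the number of clusters in advance
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(responses)

    print(model.labels_[:10])      # hard cluster assignment for the first ten visitors
    print(model.cluster_centers_)  # mean item scores per cluster

Note that this treats the 1-to-4 ratings as if they were interval data, which is exactly the assumption discussed above.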

Krantz, et al. mentioned criticisms of the K-means method:

[S]ome statisticians do not think K-means Cluster Analysis is rigorous enough. In particular, the random assignment of the cores of clusters is problematic to some, and the researchers' determination of natural clusters is problematic to others. Moreover, cluster results may not be robust. Adding cases to an existing data set or using an entirely new data set may yield a cluster solution that is quite different (2009, 297).

There are indeed many problems with K-means Clustering, even by the standards of clustering methods generally. The potential difficulties include sensitivity to outliers (extreme values that can skew the results), the need to use interval or ratio data—which means that, in calculating distances, you have to know whether the numbers actually add up—and some concerns about the order in which the data is assembled.1 In some cases, data may simply not be appropriate for the K-means method. More fundamentally, the stability of clusters cannot be assumed, because traditionally there has been no objective set of criteria for judging the suitability of solutions. K-means will always produce a solution, and some of those solutions are likely to fit your expectations.
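
The stability concern is easy to demonstrate: running K-means with a single random start from two different seeds can partition the same data differently, and K-means returns a k-cluster solution even when the data has no cluster structure at all. The sketch below uses random placeholder data.

    # K-means always returns a solution, and single random starts need not agree
    # (placeholder data with no built-in cluster structure).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import adjusted_rand_score

    rng = np.random.default_rng(1)
    X = rng.normal(size=(190, 20))

    labels_a = KMeans(n_clusters=3, n_init=1, random_state=1).fit_predict(X)
    labels_b = KMeans(n_clusters=3, n_init=1, random_state=2).fit_predict(X)

    # An adjusted Rand index of 1.0 would mean the two partitions are identical.
    print(adjusted_rand_score(labels_a, labels_b))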

Hierarchical Clustering

Hierarchical Clustering is a clustering technique that is used when the data is dichotomous in nature, such as when people answer yes or no to survey questions. The technique starts with each point as its own cluster and then proceeds to measure distances and combine the closest clusters; the process continues until all points are in one cluster. (In the example below, the points are visitors, though clustering of items, that is, variables, is also possible.) Hierarchical Clustering is popular for small data sets, generally those with fewer than one thousand observations.

Once the analysis is completed, a dendrogram, a branching tree-like diagram, provides a graphic representation of how the survey respondents cluster together into nested groups. The problematic aspect, as with K-means Clustering, comes in choosing the number of clusters to retain. This is similar to the problems in Exploratory Factor Analysis. The interpretation of a Factor Analysis is often misunderstood as the “discovery” of an underlying structure for a set of variables, but this interpretation is not warranted by the mathematics. There is a fundamental indeterminacy due to the fact that any correlation matrix can be explained by an infinite number of factor structures (Mulaik 1976; Steiger 1990; Steiger and Shoenemann 1978). One cannot uniquely infer a “correct” factor structure. Instead we must select, from the infinite number of possible structures, those that are parsimonious and meaningful. It is conceivable that two researchers analyzing the same data could select very different solutions.
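
For readers who want to see the steps, the sketch below runs an agglomerative (Ward) clustering in Python with scipy and cuts the tree at an arbitrary number of clusters. It is illustrative only; the analysis reported later in this paper was done in JMP 10, and the yes/no data here is a random placeholder.

    # Hierarchical clustering sketch: merge closest clusters until one remains,
    # then choose where to cut the tree (placeholder data, not the study data).
    import numpy as np
    from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

    rng = np.random.default_rng(0)
    answers = rng.integers(0, 2, size=(312, 8)).astype(float)  # 312 visitors x 8 yes/no items

    Z = linkage(answers, method="ward")   # starts from 312 single-visitor clusters
    tree = dendrogram(Z, no_plot=True)    # set no_plot=False (with matplotlib) to draw the tree

    # The analyst still has to choose the cut; here we arbitrarily keep four clusters.
    labels = fcluster(Z, t=4, criterion="maxclust")
    print(np.bincount(labels)[1:])        # cluster sizes under that arbitrary cut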

Latent Class Analysis

Latent Class Analysis (LCA) was developed about fifty years ago as a way to characterize latent variables in the analysis of nominal and ordinal data—the kind more typically obtained in surveys (McCutcheon 1987). (Nominal data are labels, not quantities; ordinal data have order, but the distances between values are not known.) LCA quite easily overcomes all of the problems with K-means clustering that were cited above. The increase of computing power in the 1990s made LCA a very efficient technique. In the literature, LCA is referred to in different ways. It has been called:

  • Latent Structure Analysis (Lazarsfeld and Henry 1968).
  • Mixture Likelihood Clustering (McLachlan and Basford 1988; Everitt 1993).
  • Model-based Clustering (Banfield and Raftery 1993; Bensmail, et al. 1997; Fraley and Raftery 1998a; 1998b).
  • Mixture-model Clustering (McLachlan, et al. 1999).
  • Bayesian classification (Cheeseman and Stutz 1995).
  • Latent Class Cluster Analysis (Vermunt and Magidson 2000; 2002).

The best way to distinguish between LCA and cluster analysis is to note that LCA is model-based and cluster analysis is not. By "model-based," we mean that a statistical model is postulated for the population from which the data was gathered (Vermunt and Magidson 2002). Both K-means and LCA seek divisions that maximize the between-cluster differences and minimize the within-cluster differences, but in K-means the decision about which solution to adopt is arbitrary or subjective. In LCA, the statistical model allows competing solutions to be tested statistically, so that the decision to adopt a particular model is less subjective. In addition, the items used in the analysis do not need to share the same scale or have equal variances. Finally, LCA allows for the examination of the residuals between items used in the analysis; in other words, LCA is useful in examining the data that does not fit the model, allowing the analyst to judge the overall quality of the model.
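
To make the idea of a model-based clustering concrete, the sketch below fits the simplest latent class model, for dichotomous items, with the EM algorithm. It is a bare-bones illustration written for this paper, not the Latent Gold implementation (which handles ordinal indicators, covariates, bootstraps, and much more), and the data it runs on is a random placeholder.

    # Bare-bones latent class model for binary items, estimated by EM (illustration only).
    import numpy as np

    def fit_lca(X, n_classes, n_iter=200, seed=0):
        """X: (n, J) array of 0/1 responses. Returns class sizes, item probabilities, log-likelihood."""
        rng = np.random.default_rng(seed)
        n, J = X.shape
        pi = np.full(n_classes, 1.0 / n_classes)              # latent class proportions
        theta = rng.uniform(0.25, 0.75, size=(n_classes, J))  # P(item = 1 | class)
        for _ in range(n_iter):
            # E-step: posterior probability of each class for each respondent
            logp = np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
            post = np.exp(logp - logp.max(axis=1, keepdims=True))
            post /= post.sum(axis=1, keepdims=True)
            # M-step: re-estimate class sizes and conditional item probabilities
            pi = post.mean(axis=0)
            theta = np.clip((post.T @ X) / post.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)
        logp = np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
        ll = np.log(np.exp(logp).sum(axis=1)).sum()
        return pi, theta, ll

    # Placeholder data: 312 respondents answering 8 yes/no items.
    X = (np.random.default_rng(1).random((312, 8)) < 0.4).astype(float)
    pi, theta, ll = fit_lca(X, n_classes=2)
    bic = -2 * ll + (2 * 8 + 1) * np.log(X.shape[0])  # Npar = K*J + (K - 1); 17 for K=2 classes, J=8 items
    print(pi.round(2), round(bic, 1))

Because the model is probabilistic, each respondent receives a posterior probability of belonging to each class rather than only a hard assignment, and the log-likelihood underlies the fit statistics (BIC, bootstrap tests, residuals) discussed below.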

Magidson and Vermunt (2002) ran a simulation study to compare K-means analysis and LCA against discriminant function analysis, a method generally considered to be the “gold standard” in testing how well variables predict group membership. In the study, group membership was known in advance and the authors applied the three methods to the data to see how well they did. They argued that they used data that favored K-means analysis. Even so, the results of the comparison showed that K-means had an 8 percent misclassification rate versus 1.3 percent for LCA.

LCA versus K-means

This article compares LCA and K-means by using each method to examine the same dataset collected from an exhibition, Against All Odds, which tells the story of the 2010 Chilean mine rescue in which 33 miners were trapped underground for 69 days before being brought to the surface through an international effort. Fifty-two percent of visitors in the sample were coming to the museum for the first time. Thirty percent were alone on the visit, and 82 percent were living in the United States. The average age was 38.75 with a range of 18 to 84, and 55 percent were males.

The data to be analyzed here is from a brief survey about general behavior preferences—activities that people like to do and identify with—outside of the museum. The survey uses the instrument described in the article “IPOP: A Theory of Experience Preference,” in this issue (Pekarik, et al. 2014), and is based on the IPOP theory of visitor preferences in four categories: Ideas, People, Objects, and Physical. In IPOP research, museum visitors are given a self-administered questionnaire that asks 38 questions (in the long form), or 20 or eight questions (in the two shorter forms). These instruments are printed in this issue in Pekarik, et al. Appendix A, and are also available from the authors.

The 20-question survey was used here. These items were of the form:

I like to…

… know how things work

… analyze situations

… bring people together (and so on).

For each item, respondents made a selection from a four-level scale: Not me at all, A little me, Me, Very much me.

How many clusters should we choose to start with in the K-means Clustering? In the analysis software JMP 10, the K-means Clustering function provides eigenvalues, a statistic derived from the covariance matrix of the variables. As the number of clusters increases, the eigenvalues decrease. The point at which the drop in eigenvalues markedly decreases (“the elbow point”) suggests an approximate number of clusters. The eigenvalues, as shown in figure 1, support a solution somewhere between 3 and 4 clusters.
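
JMP's eigenvalue output is specific to that package, but an analogous elbow diagnostic can be computed in open-source tools from the within-cluster sum of squares, as in this sketch on placeholder data.

    # Elbow-style diagnostic: watch where the within-cluster sum of squares stops
    # dropping sharply as k increases (placeholder data, not the study data).
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    responses = rng.integers(1, 5, size=(190, 20)).astype(float)

    for k in range(2, 7):
        inertia = KMeans(n_clusters=k, n_init=10, random_state=0).fit(responses).inertia_
        print(k, round(inertia, 1))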

Figure 1. Plot of K-means Eigenvalues.

We chose three clusters to start. JMP 10 also provides a statistic known as the Cubic Clustering Criterion (CCC), which compares the clusters created by K-means with what would be obtained from a uniformly distributed set of points (Sarle 1983). Like the eigenvalues, the CCC is used to compare changes in values across different numbers of clusters. You generally want a CCC value between 2 and 3. The CCC fit value of -3.59 is uninterpretable alone, but it suggests that the three-cluster model is not the best fit; the negative value indicates that there are outliers.

We also calculated a four-cluster K-means model. The four-cluster result has a CCC fit value of -4.94. Therefore, the three-cluster model fits better because it has a higher (in this case closer to zero) CCC value. The negative value indicates that it, too, has some outliers and misfits. As figure 2 illustrates, the clusters overlap very little, but there are also quite a few misfits, or residuals. Within K-means there is no stable way to examine why the residuals are there or to test the model further, such as through a bootstrap technique.

Figure 2. Graphical Display of Three-cluster K-means Model.

LCA Analysis and Results

The LCA analysis, calculated in the software program Latent Gold 4.5, produced quite a different model. We started by examining 1, 2, 3, 4, 5, and 6 cluster models simultaneously.

In table 1 we see diagnostic statistics for models with one to six clusters. In general, analysis of the models focuses first on three values: LL, p-value, and BIC. LL is the Log-Likelihood, the logarithm of the likelihood of the data under the model; comparing the log-likelihoods of two models shows how much more likely the data are under one model than the other, and this comparison can be converted into a p-value, a measure of statistical significance. BIC is the Bayes Information Criterion, a statistic created to aid model selection by penalizing a model for the number of parameters it estimates. Looking at these statistics, the four-cluster model might be the best fit because it has the lowest BIC value. But the six-cluster model has the best (least negative) log-likelihood. This gives us two candidates for best fit, the four-cluster model and the six-cluster model.

Table 1. Fit Values for Six Cluster Models—Against All Odds Data.

Model        LL          BIC(LL)     Npar   L2          df    p-value      Class.Err.
1-Cluster    -4147.289   8598.281     59    6523.841    113   2.5e-1296    0
2-Cluster    -3961.822   8335.445     80    6152.907     92   6.4e-1236    0.0219
3-Cluster    -3864.413   8248.723    101    5958.088     71   6.9e-1214    0.0295
4-Cluster    -3786.108   8200.21     122    5801.477     50   3.4e-1201    0.0674
5-Cluster    -3734.772   8205.636    143    5698.806     29   6.3e-1202    0.0406
6-Cluster    -3704.647   8253.484    164    5638.556      8   1.5e-1215    0.0623

Note: LL = Log-Likelihood; BIC = Bayes Information Criterion; Npar = number of parameters estimated; L2 = likelihood-ratio chi-square statistic; df = degrees of freedom; Class.Err. = classification error.
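
The BIC(LL) column follows the standard formula BIC(LL) = -2*LL + Npar*ln(n), where n is the number of cases used in estimation. The sketch below applies that formula to rounded values in the spirit of table 1; n is a placeholder, and the exact published BIC values depend on the number of complete cases actually used.

    # Model selection by BIC: BIC(LL) = -2*LL + Npar*ln(n); prefer the smallest value.
    # Log-likelihoods and parameter counts are rounded from table 1; n is a placeholder.
    import math

    n = 190
    candidates = {"3-Cluster": (-3864.4, 101),   # (log-likelihood, parameters estimated)
                  "4-Cluster": (-3786.1, 122),
                  "5-Cluster": (-3734.8, 143)}

    bic = {name: -2 * ll + npar * math.log(n) for name, (ll, npar) in candidates.items()}
    print(bic, min(bic, key=bic.get))            # the four-cluster model has the smallest BIC here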

We begin by looking more closely at the six-cluster model. The first feature to consider is the parameter estimates (table 2). Four variables stand out—divide into categories, buy things, run, and ski—because they are not helping separate the clusters (that is, their p values are not significant). They could be removed or set to correlate and the model re-run. That is one option to consider, but there are more diagnostic features to examine first.

Table 2. Parameter Estimates for the Six-cluster Model.

Models for Indicators    Cluster1  Cluster2  Cluster3  Cluster4  Cluster5  Cluster6    Wald   p-value    R2
Identify Patterns           -0.29      1.04      0.72      2.51     -0.29     -3.70    30.04  1.40E-05   0.39
Divide into Categories       0.47      0.96      0.23     -0.29     -0.65     -0.72    10.83  0.055      0.27
Analyze Situations          -1.14      0.51      0.18      3.18      2.86     -5.59    32.46  4.80E-06   0.49
Learn Philosophy             0.34      0.43      0.64      1.72      0.17     -3.31    20.28  0.0011     0.19
How Things Work             -0.31      3.48     -0.44      4.12     -2.02     -4.83    52.77  3.70E-10   0.60
How Things are Made         -0.66      3.25     -1.30      3.43     -1.84     -2.88    56.98  5.10E-11   0.56
Buy Things                  -0.23     -0.30      0.52      0.01      0.10     -0.10     9.18  0.1        0.07
Construct Things            -0.27      1.61     -1.19      1.75     -0.02     -1.89    42.25  5.20E-08   0.44
Help Others                 -0.52      0.44      0.56      3.69     -2.11     -2.07    32.24  5.30E-06   0.36
Talk About Families          0.28     -0.55      1.38      0.81     -1.07     -0.85    16.10  0.0066     0.29
Leisure With Others         -0.39     -0.05      1.51      0.80     -0.65     -1.22    12.08  0.034      0.27
Emotionally Connect         -0.28     -1.93      3.40      3.46     -2.59     -2.07    37.88  4.00E-07   0.55
Bring People Together       -0.08     -0.78      1.85      2.06     -1.41     -1.64    32.62  4.50E-06   0.37
Run                         -0.09      0.06     -0.23      0.15      0.22     -0.11     2.71  0.75       0.02
Ski                          0.04      0.10      0.13      0.16     -0.97      0.54     2.90  0.72       0.03
Camp                         0.18      0.37     -0.36      0.94     -1.87      0.74    19.04  0.0019     0.17
Play Sports                  0.04      0.28     -0.41      0.34     -0.29      0.04     9.32  0.097      0.07

Next, we examine the bivariate residuals. The bivariate residual (BVR) measures the association between a pair of indicators that remains unexplained by the model. It is computed by dividing a chi-square value by its degrees of freedom (df). In this case we would expect the value to be around 1, because there is 1 df in the test; values close to 1 indicate no serious problem with the model, while BVRs over 3.84 (the critical value for 1 df) indicate that the model is not explaining the relationship between that pair of items well.
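
The core of the computation can be sketched as follows: cross-tabulate two items as observed, cross-tabulate them as the fitted model expects, and divide the resulting Pearson chi-square by its degrees of freedom. The counts below are hypothetical, and Latent Gold's exact BVR computation has additional refinements.

    # Simplified bivariate residual: Pearson chi-square of observed vs. model-expected
    # counts for a pair of items, divided by the table's degrees of freedom.
    import numpy as np

    def bivariate_residual(observed, expected):
        observed = np.asarray(observed, dtype=float)
        expected = np.asarray(expected, dtype=float)
        chi_square = ((observed - expected) ** 2 / expected).sum()
        df = (observed.shape[0] - 1) * (observed.shape[1] - 1)
        return chi_square / df

    obs = [[150, 40], [30, 92]]          # hypothetical observed 2x2 table for two yes/no items
    exp = [[138.0, 52.0], [42.0, 80.0]]  # hypothetical counts expected under the fitted model
    print(round(bivariate_residual(obs, exp), 2))  # about 9.0 here, well above the 3.84 cutoff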

The BVR table is too large to reproduce here, but in the case of the four-cluster model there is one BVR over 20 and six over 10; it seems that some of these item pairs might have similar meanings or a common underlying factor. For example, Talk About Families and Leisure Time With Others (BVR=19.82) are both socially oriented activities. We could fix residual pairs like these by allowing them to correlate and re-analyzing the model. Doing this would provide a better-fitting model, but a wiser course of action is to consider an alternative model. In the six-cluster model, there is only one very problematic residual (between Talk About Families and Leisure Time With Others), which is over 10. Because it has fewer problematic residuals, we accept the six-cluster model as the best fit.

LCA also provides the ability to look at profiles of the clusters from multiple perspectives. One of these is a Profile Perspective. Table 3 presents the Profile Perspective for five items in the six-cluster model. These five items are the ones that we had intended to use in order to identify the People dimension in this version of the IPOP survey instrument.

Table 3. Profile Perspective of Five Items in the Six-cluster LCA Model.

                        Cluster1  Cluster2  Cluster3  Cluster4  Cluster5  Cluster6
Cluster Size                0.38      0.25      0.16      0.14      0.04      0.02

Leisure With Others
  Not me at all             0.01      0.01      0.00      0.00      0.05      0.09
  A little me               0.26      0.26      0.01      0.05      0.46      0.56
  Me                        0.56      0.56      0.23      0.42      0.44      0.33
  Very much me              0.17      0.17      0.76      0.54      0.06      0.03
Help Others
  Not me at all             0.02      0.00      0.00      0.00      0.22      0.21
  A little me               0.28      0.10      0.09      0.00      0.55      0.55
  Me                        0.56      0.54      0.52      0.06      0.22      0.23
  Very much me              0.14      0.36      0.39      0.94      0.01      0.01
Talk About Families
  Not me at all             0.03      0.07      0.00      0.00      0.19      0.19
  A little me               0.35      0.51      0.04      0.13      0.60      0.60
  Me                        0.50      0.37      0.39      0.51      0.20      0.20
  Very much me              0.13      0.05      0.57      0.35      0.01      0.01
Emotionally Connect With Others
  Not me at all             0.01      0.07      0.00      0.00      0.16      0.09
  A little me               0.19      0.53      0.00      0.00      0.60      0.55
  Me                        0.75      0.39      0.25      0.24      0.23      0.36
  Very much me              0.06      0.01      0.75      0.76      0.00      0.00
Bring People Together
  Not me at all             0.06      0.15      0.00      0.00      0.31      0.38
  A little me               0.32      0.44      0.04      0.03      0.47      0.46
  Me                        0.53      0.37      0.44      0.39      0.21      0.16
  Very much me              0.09      0.03      0.52      0.58      0.01      0.01

In this table, the four response-category proportions for each cluster within an item add up to one hundred percent (give or take a little rounding error, since values are rounded to keep the table small). Consider the item Leisure With Others for Cluster 3: the decimal values give the share of that cluster choosing each category, and 76 percent of Cluster 3 visitors chose "Very much me" for that item. You can also create a visual display showing how clusters compare on each item (figure 3).
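
A rough analogue of this profile can be produced from modal cluster assignments with a normalized cross-tabulation, as in the sketch below on made-up data; a true LCA profile weights the responses by posterior class probabilities, so this is only an approximation of what Latent Gold reports.

    # Rough profile table: per-cluster response proportions for one item,
    # using modal cluster assignments and made-up data.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    scale = ["Not me at all", "A little me", "Me", "Very much me"]
    df = pd.DataFrame({"cluster": rng.integers(1, 7, size=190),
                       "leisure_with_others": rng.choice(scale, size=190)})

    profile = pd.crosstab(df["leisure_with_others"], df["cluster"], normalize="columns")
    print(profile.round(2))  # each cluster's column sums to 1, as in table 3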

Figure 3. Cluster plot for Cluster 3 (green; topmost) and Cluster 6 (blue, lower).

Figure 3 illustrates items from the survey across two clusters, Cluster 3 and Cluster 6. (With multiple clusters, it is sometimes easier to examine them in pairs.) Cluster 3 indicates a group that appears to engage with other people while Cluster 6 indicates a group that seems more drawn to physical activities (ski, camp, and play competitive sports). The four rightmost items on the chart—run, ski, camp, and play competitive sports—were the items intended to identify the Physical dimension.

Similarly one of the clusters indicates a group that appears to favor the Idea items and another points to those more likely to identify strongly with the Object items. In other words, the LCA analysis confirms that the responses to this survey form four groups that support the four dimensions of the IPOP model (Idea, People, Object, Physical). The remaining two clusters contain cases that tended to mark many items high or many items in the low-middle range of the scale.

LCA software allows you to add covariates and random and fixed effects to the model, and to run bootstraps to compare cluster models and to check the model itself. Latent Gold, the software we used for this study, can also compare a dichotomous factor analysis to the cluster analysis. It is additionally possible to use the latent class results, in conjunction with the observed behaviors of visitors, to create Bayesian prediction models.

There are further analyses that can be done with this approach, but the purpose here is to introduce and demonstrate how the latest analytical methods can be used to improve our understanding of museum visitors.

LCA versus Hierarchical Clustering

For this comparison we are using data from a survey evaluation of visitors leaving the exhibition Elvis at 21: Photographs by Alfred Wertheimer. This was an exhibition of photographs of Elvis Presley (1935-1977) taken by photojournalist Alfred Wertheimer in 1956, the moment the 21-year-old singer was about to enter the national stage. The exhibition was studied during its presentation at the National Portrait Gallery, Smithsonian Institution, Washington, D.C. In the middle of January 2011, 210 entering visitors and a separate sample of 312 exiting visitors were surveyed by the Smithsonian Institution's Office of Policy and Analysis. Forty-six percent of the exiting visitors were male, 45 percent indicated this was a first visit to the museum, 33 percent specifically came to see the exhibition, and the average age was 39.8.2

Visitors were asked the following question and were allowed to choose as many answers as they wished:

Which experiences did you find especially satisfying in Elvis at 21? (Mark one or more):

Being moved by beauty

Connecting with the emotional experiences of others

Enriching my understanding

Gaining information

Getting a sense of the everyday lives of others

Recalling memories

Reflecting on the meaning of what I saw

Seeing rare, valuable, or uncommon things

This list of response items derives from a long-standing effort by the Office of Policy and Analysis to understand the expectations and experiences of museum visitors.3 The selection of options for this particular study was influenced by the IPOP theory of experience preference. Two items were intended to address the Idea dimension (Gaining information; Enriching my understanding); two addressed the People dimension (Connecting with the emotional experiences of others; Getting a sense of the everyday lives of others); and two addressed the Object dimension (Being moved by beauty; Seeing rare, valuable, or uncommon things). We did not attempt to phrase experiences in the Physical dimension for this art exhibition.

Hierarchical Clustering and Results

For this analysis, we focused on clustering the exiting visitors according to their responses to the question about especially satisfying experiences. Using JMP 10, we applied the hierarchical clustering function with the Ward method. In JMP 10, two graphic representations are produced: the dendrogram and the scree plot, which shows where each visitor entered a cluster. Figure 4 shows the scree plot.

Figure 4. Scree Plot of Hierarchical Clustering for Elvis at 21 Data.

This plot suggests that the number of clusters to retain is four or five. There is a large gap on the distance (y) axis between four clusters (just above 15) and five clusters (just below 13). Beyond this, there are no diagnostics that can assist in making a decision about the ideal number of clusters. The automated analysis within JMP actually chose an unwieldy 17 clusters as the cut-off. Visual inspection of the dendrogram suggests that there could be anywhere from two to eight clusters, depending on where you think the cut-off should be (figure 5). The blue curve marked with stars at the bottom presents the same data as the scree plot above, but in a different format.

Figure 5. Dendrogram with Scree Plot in Blue at the Bottom. [Color version on the Web.]

The crucial issue here is that we do not have a good idea where, so to speak, to draw the line. Nor can we compare competing cluster solutions on technical grounds. We can choose what seems parsimonious to us, but we have no external basis for the decision.

LCA Analysis and Results

The first step in this LCA analysis was to estimate several cluster models simultaneously and examine the diagnostic statistics (table 4). We ran models ranging from one to five clusters.

Table 4. Initial LCA Analysis of Elvis at 21 Satisfying Experience Data.

Model                  LL         BIC(LL)    Npar   L2         df    p-value
Model 1  1-Cluster     -1660.08   3367.444     8    350.9455   247   1.50E-05
Model 2  2-Cluster     -1617.08   3334.65     17    264.9545   238   0.11
Model 3  3-Cluster     -1596.62   3346.921    26    224.0284   229   0.58
Model 4  4-Cluster     -1584.69   3376.25     35    200.16     220   0.83
Model 5  5-Cluster     -1571.42   3402.906    44    173.6189   211   0.97

Note: LL = Log-Likelihood; BIC = Bayes Information Criterion; Npar = number of parameters estimated; df = degrees of freedom.

As we described earlier in this paper, analysis of the models focuses first on three values: LL, p-value, and BIC. The analyst is looking for a better (less negative) LL, a lower BIC, and a p-value that is non-significant. It is common to seek a p-value less than .05, but in this case the p-value should be greater than .05, in essence accepting that the data fit the model. From all of the diagnostic data here, Model 2 has the lowest BIC and a non-significant p-value, so that is a good starting point. The two-cluster model, however, has 11 bivariate residuals (BVRs) above 3.84, indicating that the model is not explaining the relationships between those variables well. On closer inspection this makes sense, because the two clusters consist of those who chose most of the eight items and those who did not. The 3-Cluster model has similar bivariate residual problems.

Therefore, the 4- and 5-Cluster models look like better candidates for correct specification. There is one residual that is high for the 4-Cluster model, and although that could technically be adjusted for, the 5-Cluster model looks like the best fit because it does not have any BVR problems. Although the 5-Cluster model appears to fit the data best, it is advisable to run a bootstrap analysis to double-check the results against the 4-Cluster model. The bootstrap process creates thousands of new samples from the original sample by using random sampling with replacement; statistics calculated on each of these many samples are then examined against one another. The bootstrap results indicate that the 5-Cluster model is a better fit than the 4-Cluster model: the log-likelihood difference test gives a difference of 26.54 with a p-value of 0.04.
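
The resampling mechanics are simple to sketch: draw respondents with replacement, recompute the statistic of interest on each resample, and examine the resulting distribution. In the comparison described above the resampled statistic is the log-likelihood difference between the two cluster models; the sketch below bootstraps a plain mean on placeholder data just to show the procedure.

    # Bootstrap mechanics: resample respondents with replacement and recompute a statistic.
    import numpy as np

    rng = np.random.default_rng(0)
    ratings = rng.integers(1, 6, size=312)            # placeholder ratings for 312 visitors

    boot_means = []
    for _ in range(2000):
        resample = rng.choice(ratings, size=ratings.size, replace=True)
        boot_means.append(resample.mean())

    low, high = np.percentile(boot_means, [2.5, 97.5])
    print(round(low, 2), round(high, 2))              # bootstrap 95% interval for the mean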

Next, we look at the profile (table 5). For each item, the profile shows the proportion of respondents in each cluster who did not select it (0) and who did select it (1). Values over 0.5 among those who selected an item are marked with an asterisk.

Table 5. Profile Plot for the Five Cluster Model.

                      Cluster1  Cluster2  Cluster3  Cluster4  Cluster5
Cluster Size              0.50      0.26      0.15      0.06      0.03

Being moved by beauty (Beauty)
  0                     0.9191    0.9212    0.7895    0.0704    0.1153
  1                     0.0809    0.0788    0.2105   *0.9296   *0.8847
Seeing rare, valuable or uncommon things (Rare)
  0                     0.8287    0.7244    0.5680    0.1088    0.2978
  1                     0.1713    0.2756    0.4320   *0.8912   *0.7022
Enriching my understanding (Understanding)
  0                     0.8966    0.6931    0.0287    0.6661    0.0142
  1                     0.1034    0.3069   *0.9713    0.3339   *0.9858
Gaining information (Information)
  0                     0.8764    0.7339    0.3888    0.9956    0.0182
  1                     0.1236    0.2661   *0.6112    0.0044   *0.9818
Connecting with the emotional experience of others (Emotion)
  0                     0.9922    0.0773    0.7521    0.5350    0.0220
  1                     0.0078   *0.9227    0.2479    0.4650   *0.9780
Getting a sense of the everyday lives of others (Everyday lives)
  0                     0.7527    0.5781    0.6409    0.8555    0.0182
  1                     0.2473    0.4219    0.3591    0.1445   *0.9818
Recalling memories (Memory)
  0                     0.8137    0.6735    0.8944    0.9267    0.1080
  1                     0.1863    0.3265    0.1056    0.0733   *0.8920
Reflecting on the meaning of what I saw (Meaning)
  0                     0.8872    0.9115    0.6923    0.8302    0.2326
  1                     0.1128    0.0885    0.3077    0.1698   *0.7674

Note: * marks selection proportions over 0.5. 0 = not selected; 1 = selected.

In the table, Cluster 1 includes 50 percent of the visitors, Cluster 2 has 26 percent, Cluster 3 has 15 percent, Cluster 4 has 6 percent, and Cluster 5 has 3 percent. The rows of table 5 indicate, for example, that only 8 percent in Cluster 1 chose "Being moved by beauty" as an especially satisfying experience, as did 8 percent in Cluster 2, but 92 percent in Cluster 4 and 88 percent in Cluster 5 did so. The columns indicate that respondents in Cluster 5 tended to select all the items. Another way to look at this is to note that, compared with participants in the other clusters, Cluster 4 participants are much more likely to report that seeing rare things was especially satisfying.

At this point, we can also entertain some general differences across the clusters, such as the fact that those in Cluster 1 did not find anything especially satisfying, those in Cluster 2 seemed to align to Emotion and Everyday lives (the two People items), those in Cluster 3 to Understanding and Information (the two Idea items), those in Cluster 4 to Beauty and Rare (the two Object items), and those in Cluster 5 to almost everything. This is easier to see in a plot of this data that illustrates clusters and the proportion of people who chose each item (figure 6). In the plot, you can readily see the five clusters and their differences. Cluster 5 is the group that found most of the experiences satisfying overall, and Cluster 1 found the fewest.

Figure 6. Cluster Plot for Satisfying Experiences in Elvis at 21.

Table 6 provides the bivariate residuals, which also support a 5-Cluster model, since the highest is 2.2, which is well below the 3.84 threshold.

Table 6. Bivariate Residuals for Selections in the Five-Cluster Model.

   Indicator          1        2        3        4        5        6        7
1  Beauty
2  Emotion         0.0416
3  Understand      0.0387   0.0052
4  Information     0.0582   0.0001   0.1654
5  Everyday Lives  1.2205   0.2130   1.5119   0.3675
6  Memory          1.3514   0.0010   0.1352   1.1216   0.5604
7  Meaning         0.0745   0.0232   0.0180   0.1142   0.1700   2.2020
8  Rare            0.1573   0.1142   0.3016   0.0473   1.7512   0.2523   1.6083

Once clusters have been identified, other differences can be examined. You can save each individual's cluster assignment and use those new variables to address further research questions. For example, we can ask whether these clusters differed in their rating of the overall experience.4 Results from a Kruskal-Wallis analysis of the overall experience rating (table 7) indicate a statistically significant difference among the clusters.5

Table 7. Kruskal-Wallis Test Result and Post-hoc Analysis.

Hypothesis Test Summary
  Null Hypothesis: The distribution of Rating is the same across categories of Cluster modal.
  Test: Independent-Samples Kruskal-Wallis Test
  Sig.: .001
  Decision: Reject the null hypothesis.

Pairwise Comparisons
Sample1-Sample2   Test Statistic   Std. Error   Std. Test Statistic    Sig.   Adj. Sig.
1.00-5.00              -90.600       26.122           -3.468           .001      .005
3.00-5.00              -83.173       28.443           -2.924           .003      .035
2.00-5.00              -74.820       26.468           -2.827           .005      .047
1.00-4.00              -48.764       18.302           -2.664           .008      .077
4.00-5.00              -41.836       30.436           -1.375           .169     1.000
3.00-4.00              -41.337       21.485           -1.924           .054      .544
2.00-4.00              -32.984       18.793           -1.755           .079      .792
1.00-2.00              -15.780       10.449           -1.510           .131     1.000
1.00-3.00               -7.427       14.752            -.503           .615     1.000
3.00-2.00                8.353       15.356             .544           .586     1.000

Notes: Samples are identified by modal cluster assignment. Each row tests the null hypothesis that the Sample 1 and Sample 2 distributions are the same. Asymptotic significances (2-sided tests) are displayed; the significance level is .05.

The post-hoc analysis indicates that, not surprisingly, members of Cluster 5 had a more satisfying experience than members of Clusters 1, 2, and 3. The clusters of interest for us, Clusters 2, 3, and 4, did not differ statistically in their experience rating, suggesting that each of them might have gotten what they wanted out of the exhibition.
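
For readers working in open-source tools, a comparable follow-up test can be sketched with scipy: a Kruskal-Wallis test across the clusters, followed here by pairwise Mann-Whitney tests with a Bonferroni correction as a simple post hoc. The ratings below are random placeholders, and this pairwise procedure is not the same one reported in table 7.

    # Kruskal-Wallis test of ordinal ratings across clusters, with a simple
    # Bonferroni-corrected pairwise post hoc (placeholder data).
    from itertools import combinations
    import numpy as np
    from scipy.stats import kruskal, mannwhitneyu

    rng = np.random.default_rng(0)
    clusters = {c: rng.integers(1, 6, size=60) for c in range(1, 6)}  # ratings 1-5 by cluster

    h_stat, p_value = kruskal(*clusters.values())
    print("Kruskal-Wallis:", round(h_stat, 2), round(p_value, 3))

    pairs = list(combinations(clusters, 2))
    for a, b in pairs:
        _, p = mannwhitneyu(clusters[a], clusters[b])
        print(a, b, "adjusted p =", round(min(p * len(pairs), 1.0), 3))  # Bonferroni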

There are further analyses that can be completed, but the purpose here is to introduce and demonstrate how the latest analytical methods can improve the understanding of museum visitor experience.

Conclusion

Latent Class Analysis provides a more rigorous statistical footing than K-means and Hierarchical Clustering for both exploratory work and theory testing, as we try to understand differences in what visitors are doing and how they are responding. In addition, it does not have the serious problems that K-means has had historically, and it provides a wealth of diagnostic information to the researcher. In the age of “big” data, expectations for the quality of analysis have increased. LCA (and the subsequent work on LCA) represents the next generation of tools, and supersedes the less objective K-means and Hierarchical techniques. Finally, the data analyzed in this paper supports the basic structure of the IPOP theory, corroborating other methods used in this research.

Notes

  1. See the SPSS technical note on this issue at http://www-01.ibm.com/support/docview.wss?uid=swg21476878.

  2. The data used here are taken from exiting visitors only.

  3. An overview of the development of the research and conclusions from it are reported in Pekarik and Schreiber (2012).

  4. The rating is in response to the question "Please rate your overall experience in this exhibition today." The response options are Poor, Fair, Good, Excellent, Superior. Overall, Elvis at 21 had a rating of 27 percent Superior, 56 percent Excellent, 40 percent Good, 3 percent Fair, and 0 percent Poor. This rating was well above the average for Smithsonian exhibitions.

  5. This non-parametric statistical test was selected because the rating variable is ordinal.

References

  • Banfield, J. D., and A. E. Raftery. 1993. Model-based Gaussian and non-Gaussian clustering. Biometrics 49: 803–821.
  • Bensmail, H., G. Celeux, A. E. Raftery, and C. P. Robert. 1997. Inference in model-based clustering. Statistics and Computing 7: 1–10.
  • Cheeseman, P., and J. Stutz. 1995. Bayesian classification (AutoClass): Theory and results. In Advances in Knowledge Discovery and Data Mining, U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, eds., 153–180. AAAI Press.
  • Everitt, B. S. 1993. Cluster Analysis. London: Edward Arnold.
  • Fraley, C., and A. E. Raftery. 1998a. MCLUST: Software for Model-based Cluster and Discriminant Analysis. Department of Statistics, University of Washington, Technical Report No. 342.
  • Krantz, A., R. Korn, and M. Menninger. 2009. Rethinking museum visitors: Using K-means Cluster Analysis to explore a museum's audience. Curator: The Museum Journal 52(4): 363–374.
  • Lazarsfeld, Paul F., and Neil W. Henry. 1968. Latent Structure Analysis. Boston: Houghton Mifflin.
  • Magidson, J., and J. K. Vermunt. 2002. Latent class models for clustering: A comparison with K-means. Canadian Journal of Marketing Research 20: 37–44.
  • McCutcheon, A. 1987. Latent Class Analysis. Quantitative Applications in the Social Sciences Series, Number 07-064. Newbury Park, London, and New Delhi: Sage Publications.
  • McLachlan, G. J., and K. E. Basford. 1988. Mixture Models: Inference and Application to Clustering. New York: Marcel Dekker.
  • McLachlan, G. J., D. Peel, K. E. Basford, and P. Adams. 1999. The EMMIX software for the fitting of mixtures of normal and t-components. Journal of Statistical Software 4(2): 1–15.
  • Pekarik, Andrew J., and James B. Schreiber. 2012. The power of expectation. Curator: The Museum Journal 55(4): 487–496.
  • Pekarik, Andrew J., James B. Schreiber, Nadine Hanemann, Kelly Richmond, and Barbara Mogel. 2014. IPOP: A theory of experience preference. Curator: The Museum Journal 57(1): 5–28.
  • Sarle, Warren S. 1983. Cubic Clustering Criterion. SAS Technical Report A-108. Cary, NC: SAS Institute Inc.
  • Schreiber, J. B., A. Pekarik, N. Hanemann, Z. Doering, and A-J. Lee. 2013. Understanding visitor behavior and engagement. The Journal of Educational Research. Accessed Oct. 2013 at http://dx.doi.org/10.1080/00220671.2013.833011.

Biographies

  • James B. Schreiber (jbschreiber@gmail.com), professor, Duquesne University and associated research fellow at the Smithsonian Institution.

  • Andrew J. Pekarik (pekarika@si.edu), senior research analyst, Office of Policy and Analysis, Smithsonian Institution, Washington, D.C.