Interindividual Variation in Source‐Specific Doses is a Determinant of Health Impacts of Combined Chemical Exposures

Abstract All individuals are exposed to multiple chemicals from multiple sources. These combined exposures are a concern because they may cause adverse effects that would not occur from an exposure received from any single source. Studies of combined chemical exposures, however, have found that the risks posed by such combined exposures are almost always driven by exposures from a few chemicals and sources and frequently by a single chemical from a single source. Here, a series of computer simulations of combined exposures is used to investigate when multiple sources of chemicals drive the largest risks in a population and when a single chemical from a single source is responsible for the largest risks. The analysis found that combined exposures drive the largest risks when the interindividual variation of source‐specific doses is small, moderate‐to‐high correlations occur between the source‐specific doses, and the number of sources affecting an individual varies across individuals. These findings can be used to identify sources with the greatest potential to cause combined exposures of concern.


INTRODUCTION
Exposures of individual human or ecological receptors to chemicals are complex (Escher, Stapleton, & Schymanski, 2020; Paulik & Anderson, 2018). All individuals are exposed to large numbers of chemicals, and an individual may have multiple sources of exposure to a single chemical. In the United States, an individual's exposures to a single chemical from multiple sources are referred to as aggregate exposures and exposures to multiple chemicals as cumulative exposures (Environmental Protection Agency, 2003); in the United States and Europe, both types have been referred to as combined exposures (Organisation for Economic Co-operation and Development, 2018). The concern raised by such exposures is that while an individual may be able to tolerate a dose of one chemical from a single source, the combined doses of one or more chemicals received from multiple sources may result in adverse effects. As a result, findings of safety for doses of individual chemicals received from a single source may not be protective of the health of humans or ecological receptors.
Risk assessors established screening methods for assessing risks from combined exposures that are based on dose-addition models (Environmental Protection Agency, 1986; Escher et al., 2020; Stokinger, 1964). Combined exposure assessments for single chemicals sum source-specific doses to arrive at estimates of total doses. Combined exposure assessments for multiple chemicals use chemical-specific indices to convert the doses of different chemicals into equivalent doses of a toxicity-weighted metric. These metrics are summed to give a measure of the combined toxicity of the chemicals to the individual.
As demonstrated in the Methods section, these additive risk models can be reduced to a single equation. Safety for an individual is assumed when the following is true:

$$\sum_{i=1}^{n} \sum_{k=1}^{m} TD_{ijk} < 1, \tag{1}$$

where $TD_{ijk}$ is the toxicity-weighted dose of the ith chemical for the jth individual from the kth source for n chemicals and m sources. A measure of combined toxicity for an individual ($CTD_j$) is given by the sum of the toxicity-weighted doses:

$$CTD_j = \sum_{i=1}^{n} \sum_{k=1}^{m} TD_{ijk}. \tag{2}$$

The contribution of the chemical with the largest $TD_{ijk}$ to $CTD_j$ is characterized using the maximum cumulative ratio (MCR). The MCR is a property of the individual and is defined as:

$$MCR_j = \frac{CTD_j}{\max(TD_{ijk})}. \tag{3}$$

Values of $MCR_j$ range from one to n. $MCR_j$ values approaching one indicate that the contribution of one chemical dominates $CTD_j$, and values approaching n indicate that the chemicals are present in equitoxic doses. For example, an individual having toxicity-weighted doses from three chemicals of 0.10, 0.10, and 0.11 would have an MCR value of 2.8 (0.31/0.11), and a second individual having 0.10, 0.10, and 2.0 would have an MCR value of 1.1 (2.2/2.0).
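The MCR arithmetic for the two example individuals can be checked with a short Python sketch. The function name `mcr` is illustrative and not part of the original analysis, which was carried out in a spreadsheet.

```python
def mcr(toxicity_weighted_doses):
    """Return (CTD, MCR) for one individual's toxicity-weighted doses.

    CTD is the sum of the doses; MCR is CTD divided by the largest dose.
    """
    doses = list(toxicity_weighted_doses)
    largest = max(doses)
    if largest <= 0:
        raise ValueError("at least one dose must be positive")
    ctd = sum(doses)
    return ctd, ctd / largest


# The two example individuals from the text.
ctd_a, mcr_a = mcr([0.10, 0.10, 0.11])  # near-equitoxic doses, MCR about 2.8
ctd_b, mcr_b = mcr([0.10, 0.10, 2.0])   # one dominant dose, MCR about 1.1
print(round(mcr_a, 2), round(mcr_b, 2))
```

The first individual's MCR is close to the number of chemicals (three), while the second individual's MCR is close to one because a single dose dominates the sum.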
Studies of interindividual variation in $MCR_j$ have been published for populations of humans and ecological receptors. Studies have been performed on exposures to sources such as mixtures of chemicals in ground and surface water (Holmes et al., 2018; Price et al., 2012; Silva & Cerejeira, 2015; Vallotton & Price, 2016), in indoor air (De Brouwere et al., 2014; Mishra, Ayoko, Salthammer, & Morawska, 2015), and cumulative exposures predicted from data collected in biomonitoring studies (Han & Price, 2013; Reyes & Price, 2018a, 2018b). The studies show a consistent pattern for $MCR_j$ in the study populations. Values of $MCR_j$ vary across individuals, but values are typically closer to one than to n. Even when the data include findings on large numbers of chemicals (n of 20 or more), values of $MCR_j$ are typically less than five and average less than two. In addition, $MCR_j$ is consistently found to be negatively correlated with $CTD_j$. This correlation is measured by taking the slope of log($MCR_j$ − 1) versus log($CTD_j$) and is hereafter referred to as the MCR slope (Reyes & Price, 2018b). Fig. 1 presents an example of such a slope for estimates of combined exposures to six phthalates in U.S. adults and children as measured by biomonitoring.
This pattern has significant implications for predicting risks posed by combined exposures. The negative values of the MCR slope result in the most exposed individuals having MCR values close to one. When this occurs, the upper bounds of the population's combined exposures are similar to the upper bounds of the largest of the individual source-specific doses. This implies that if the upper bounds of the doses from the individual sources of exposure do not pose a risk for these populations, then the sources are not likely to pose a risk in combination. In populations where the pattern occurs, a combined exposure assessment is unlikely to reveal risks that assessments of the individual sources would miss. If criteria are identified that predict when the pattern does and does not occur, then the criteria can be used to focus combined exposure assessments on the populations of greatest concern.
This article explores whether this pattern occurs as a result of the shapes of the distributions of source-specific values of TD across individuals in exposed populations. It is difficult to find an analytical solution for the relationship between the shapes of the source-specific distributions of TD and the distribution of MCR values across a population. The relationship is therefore investigated empirically using a simulation model of interindividual variation in combined exposure in a population exposed to chemicals from multiple sources.
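As a rough illustration of how an MCR slope could be computed, the sketch below regresses log(MCR − 1) on log(CTD). The use of base-10 logarithms, an ordinary least-squares fit, and the exclusion of individuals whose MCR equals one (where the logarithm is undefined) are assumptions made here; the function name `mcr_slope` is hypothetical.

```python
import numpy as np


def mcr_slope(ctd, mcr):
    """Least-squares slope of log10(MCR - 1) versus log10(CTD).

    Individuals with CTD <= 0 or MCR <= 1 are dropped because the
    logarithms are undefined for them (an assumption of this sketch).
    """
    ctd = np.asarray(ctd, dtype=float)
    mcr = np.asarray(mcr, dtype=float)
    keep = (ctd > 0) & (mcr > 1)
    x = np.log10(ctd[keep])
    y = np.log10(mcr[keep] - 1.0)
    slope, _intercept = np.polyfit(x, y, 1)
    return slope


# Synthetic check: if MCR - 1 falls as CTD**(-0.5), the slope is -0.5.
example_ctd = np.array([0.01, 0.1, 1.0, 10.0])
example_mcr = 1.0 + example_ctd ** -0.5
print(mcr_slope(example_ctd, example_mcr))
```

A negative slope means that individuals with larger combined doses tend to have MCR values closer to one.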

MATERIALS AND METHODS
This section is divided into three parts. The first part demonstrates that many of the existing approaches used to assess combined exposures involving one or more chemicals reduce to a common formula. This formula is used as the basis for the design of the simulation model. The second part describes the simulation model. The final part describes how the model is used to investigate the impacts that the characteristics of interindividual variation in the source-specific values of TD have on MCR values.

Derivation of a Common Model for Dose-Additive Aggregate and Cumulative Risk Assessments
Both aggregate and cumulative risk assessments address the issue that an individual's chemical exposures occur as a result of the individual's interactions with multiple sources of exposure that result in multiple doses of one chemical (for aggregate assessments) or two or more chemicals (for cumulative assessments). In both types of combined assessments, a dose metric is determined for each source of exposure. The source-specific dose metrics are summed across the sources for aggregate assessments and across both chemicals and sources for cumulative assessments. The resulting sums are evaluated against a criterion derived from toxicity data.
For aggregate exposures, the dose metric is the source-specific dose of a chemical. These are summed over all sources to give the aggregate dose of the chemical to the individual:

$$AD_{ij} = \sum_{k=1}^{m} D_{ijk}, \tag{4}$$

where $D_{ijk}$ is the dose of the ith chemical for the jth individual from the kth source, m is the number of sources, and $AD_{ij}$ is the aggregate dose of the ith chemical to the jth individual. The risk is judged to be acceptable when the aggregate dose is less than the permitted dose of the ith chemical ($PD_i$):

$$AD_{ij} < PD_i. \tag{5}$$

For cumulative risk assessments, the values of $D_{ijk}$ are determined separately for each chemical and source. The doses are then normalized to a common metric that allows their summation. Two commonly used methods for normalization are the hazard index (HI) (Environmental Protection Agency, 1986) and toxicity equivalents (TEqs) (Varshavsky, Morello-Frosch, Woodruff, & Zota, 2018).
The HI method normalizes $D_{ijk}$ by dividing it by the permitted dose of the ith chemical ($PD_i$) to give the hazard quotient metric ($HQ_{ijk}$):

$$HQ_{ijk} = \frac{D_{ijk}}{PD_i}. \tag{6}$$

An individual's $HQ_{ijk}$ values are summed over n chemicals and m sources to give the individual's HI ($HI_j$):

$$HI_j = \sum_{i=1}^{n} \sum_{k=1}^{m} HQ_{ijk}. \tag{7}$$

The risk from cumulative exposures is judged to be acceptable for the jth individual when the value of $HI_j$ is less than 1.
The TEq method normalizes $D_{ijk}$ by multiplying it by a chemical-specific toxicity equivalent factor ($TEF_i$) that converts the chemical-specific dose to an equivalent dose of an index chemical ($ID_{ijk}$):

$$ID_{ijk} = TEF_i \, D_{ijk}. \tag{8}$$

These equivalent doses are summed over n chemicals and m sources to produce an estimate of the total dose of the index chemical reaching the jth individual ($ID_j$):

$$ID_j = \sum_{i=1}^{n} \sum_{k=1}^{m} ID_{ijk}. \tag{9}$$

When the value of $ID_j$ is less than the permitted dose for the index chemical ($PD_{ID}$), the individual's risk from cumulative exposures is judged to be acceptable.
The finding of acceptability of an individual's combined exposures as measured by the aggregate and two cumulative methods can be fit to a single equation. An individual's combined exposures are acceptable when the following is true:

$$\sum_{i=1}^{n} \sum_{k=1}^{m} K_i D_{ijk} < 1. \tag{10}$$

The value of $K_i$ is $1/PD_i$ for aggregate assessments (for which n = 1) and for cumulative assessments that use the HI approach, and $TEF_i/PD_{ID}$ for cumulative assessments that use the TEq approach. If $K_i D_{ijk}$ is defined as the toxicity-adjusted dose of the ith chemical for the jth individual from the kth source ($TD_{ijk}$), then all three methods reduce to:

$$\sum_{i=1}^{n} \sum_{k=1}^{m} TD_{ijk} < 1. \tag{11}$$

The measure of combined toxicity-adjusted doses for the jth individual ($CTD_j$) is defined as:

$$CTD_j = \sum_{i=1}^{n} \sum_{k=1}^{m} TD_{ijk}. \tag{12}$$

Equations (11) and (12) do not address the effect of route of exposure on dose. In an actual cumulative or aggregate risk assessment, route-specific absorption and metabolism would need to be assessed to determine how route-specific doses contribute to the total systemic dose of the individual. In addition, the interindividual variation in $TD_{ijk}$ for aggregate assessments differs from the variation in cumulative assessments. In aggregate assessments, variation in $TD_{ijk}$ is only a function of variation in $D_{ijk}$ since $K_i$ is a constant. For the two cumulative models, the variation in $TD_{ijk}$ is a function of the variation in both $D_{ijk}$ and $K_i$.
The value of the MCR for the jth individual can be defined as:

$$MCR_j = \frac{CTD_j}{\max(TD_{ijk})}. \tag{13}$$

In this article, the term MCR is also used to refer to the corresponding ratio in aggregate exposure assessments (aggregate dose divided by the single largest source-specific dose):

$$MCR_{ij} = \frac{AD_{ij}}{\max_{k}(D_{ijk})}. \tag{14}$$

While such a ratio would be, strictly speaking, the maximum aggregate ratio rather than an MCR, it behaves in the same way as the MCR and can be investigated using the same approaches.
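The reduction of the HI and TEq methods to a single toxicity-adjusted dose can be illustrated with a minimal Python sketch. All function names and numerical values below are hypothetical and chosen only to show that the two normalizations differ solely in the factor $K_i$.

```python
import numpy as np


def k_hazard_index(permitted_doses):
    """K_i = 1 / PD_i (aggregate and hazard-index assessments)."""
    return 1.0 / np.asarray(permitted_doses, dtype=float)


def k_teq(tefs, permitted_dose_index):
    """K_i = TEF_i / PD_ID (toxicity-equivalent assessments)."""
    return np.asarray(tefs, dtype=float) / permitted_dose_index


def toxicity_adjusted_doses(doses, k):
    """TD_ik = K_i * D_ik for one individual.

    doses: array of shape (n chemicals, m sources); k: length-n array.
    """
    doses = np.asarray(doses, dtype=float)
    k = np.asarray(k, dtype=float)
    return doses * k[:, None]


# Hypothetical individual: 2 chemicals (rows) from 2 sources (columns).
doses = np.array([[0.20, 0.10],
                  [0.05, 0.00]])
permitted = np.array([1.0, 0.5])          # hypothetical PD_i values

td = toxicity_adjusted_doses(doses, k_hazard_index(permitted))
ctd = td.sum()                            # combined toxicity-adjusted dose
acceptable = ctd < 1.0                    # screening criterion
print(ctd, acceptable)
```

Once doses are expressed as TD values, the acceptability test and the MCR calculation no longer depend on which normalization produced them.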

Design of the Simulation Model
The design of the model is straightforward. A series of m sources of chemical exposure are characterized, and toxicity-adjusted doses received from each source by each individual in an exposed population are determined. For simplicity, the model assumes that each source releases a single unique chemical. As a result, the numbers of sources and chemicals are the same (m equals n), the maximum number of TD values for an individual is m, and the variation in TD values across sources reflects differences in both the exposure potentials of the sources and the chemicals' toxicities. No attempt is made to separate the contributions of the two sources of variation. Values of $CTD_j$ and $MCR_j$ are determined for each individual using the values of $TD_{jk}$. This process is repeated for a large number of populations in each model run, and the characteristics of the distributions of the source-specific doses are varied across the populations. Three metrics are tracked for each population: the median value of $MCR_j$ in the population (median MCR), the average value of $MCR_j$ for the individuals with the 10 largest values of $CTD_j$ in the population (high-risk MCR), and the MCR slope.
The model characterizes three types of variation. At the most basic level, the model needs to characterize variation in toxicity-weighted doses received from a source across the individuals in the simulated population (interindividual variation in $TD_{jk}$). Second, the model needs to characterize variation in the distributions of source-specific doses across the m sources of exposure of a population (intrapopulation variation in the distributions of $TD_{jk}$). Finally, the model must characterize the differences in the ranges of the m distributions in the different simulated populations (interpopulation variation in the intrapopulation variation of the distributions of $TD_{jk}$).
The values of $TD_{jk}$ differ across the individuals in a population as a function of the level of interaction each individual has with each source and the physical and chemical properties of the chemicals that influence intake and uptake of the substances. In the simulation modeling, the interindividual variation of $TD_{jk}$ is modeled using lognormal distributions. Distributions of source-specific doses are bounded at zero, frequently vary over several orders of magnitude, and tend to be right skewed. The distributions of such doses have been shown to be well approximated by lognormal distributions (Hattis & Burmaster, 1994; Limpert, Stahel, & Abbt, 2001; Ott, 1978). This occurs, in part, because concentrations of chemicals in environmental media tend to follow lognormal distributions (Ott, 1990; Rappaport & Kupper, 2008). Dose estimates, expressed in units of mass per body weight, are also the product of multiple factors (body weight, duration of exposure, intensity of exposure, and uptake rates) that are multiplied or divided by one another. Such multiplicative formulae also tend to produce lognormally distributed results (Limpert et al., 2001).
For combined assessments involving multiple chemicals, the value of $K_i$ also varies across chemicals. As a result, variation in $K_i$ increases the variation in the averages of the chemical-specific distributions of $TD_{jk}$ across the individuals in a population. Interchemical variation in toxicity values also follows right-skewed distributions, is bounded at zero, and spans more than six orders of magnitude. For example, Fig. 2 shows a Q-Q plot of the 381 reference doses (a type of permitted dose) reported on the U.S. Environmental Protection Agency's Integrated Risk Information System (see footnote 1). Except for the two most toxic compounds, the data are linear, indicating that the values follow a lognormal distribution.
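The three population metrics can be sketched in Python as follows. This is an illustrative reimplementation, not the authors' spreadsheet: the function name `population_metrics`, the base-10 least-squares slope, and the handling of unexposed individuals and of MCR values equal to one are assumptions.

```python
import numpy as np


def population_metrics(td, n_high=10):
    """Median MCR, high-risk MCR, and MCR slope for one population.

    td: array of shape (individuals, sources) of toxicity-adjusted doses,
        with zeros where an individual is not exposed to a source.
    """
    td = np.asarray(td, dtype=float)
    td = td[td.max(axis=1) > 0]           # drop individuals with no exposure
    ctd = td.sum(axis=1)                  # combined dose per individual
    mcr = ctd / td.max(axis=1)            # maximum cumulative ratio

    median_mcr = np.median(mcr)

    # Average MCR of the n_high individuals with the largest CTD.
    top = np.argsort(ctd)[-n_high:]
    high_risk_mcr = mcr[top].mean()

    # Slope of log10(MCR - 1) versus log10(CTD); MCR == 1 is excluded.
    keep = mcr > 1
    slope = np.polyfit(np.log10(ctd[keep]), np.log10(mcr[keep] - 1.0), 1)[0]
    return median_mcr, high_risk_mcr, slope


# Illustrative population: 1,000 individuals, 5 lognormal sources.
rng = np.random.default_rng(0)
example_td = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 5))
print(population_metrics(example_td))
```

Each metric is bounded below by one and above by the number of sources, which makes it easy to sanity-check simulated output.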
Populations exposed to multiple sources include individuals who are exposed to some but not all sources. For example, an exposed population could include hobbyists who use the chemical as part of their hobbies and nonhobbyists who do not. The distribution of $TD_{jk}$ should therefore assign values of zero for $TD_{jk}$ for some individuals and sources. This requires a mixed distribution where a parametric distribution describes variability in $TD_{jk}$ for the portion of the population exposed to the kth source ($\alpha_k$) and a dose of zero is assigned to the remaining portion ($1 - \alpha_k$). The final distribution from the kth source is described using three inputs, a geometric mean ($GM_k$), a geometric standard deviation ($GSD_k$), and the fraction of the population that is exposed ($\alpha_k$):

$$TD_{jk} \sim \begin{cases} \text{Lognormal}(GM_k, GSD_k) & \text{with probability } \alpha_k, \\ 0 & \text{with probability } 1 - \alpha_k. \end{cases} \tag{15}$$

Footnote 1: Downloaded July 17, 2019, from https://cfpub.epa.gov/ncea/iris/search/index.cfm

Different sources of exposure produce different distributions of doses. Because of these differences, the values of the three inputs vary across sources. The goal in the selection of the distributions of the values of the three inputs is to evenly sample the parameter space of the inputs. As a result, uniform or loguniform distributions are used to describe variation in the source-specific values of the three inputs. In addition, in order to investigate the impact of differences in the values of $GM_k$, $GSD_k$, and $\alpha_k$, the various populations need to reflect different ranges of values for these inputs. Therefore, the averages and ranges of the inputs are also varied in a uniform way across the populations. The intrapopulation and interpopulation variability are modeled separately for $GM_k$, $GSD_k$, and $\alpha_k$ using the following approaches.
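The mixed (zero-inflated lognormal) distribution for a single source can be sketched with NumPy as below. The function name `sample_source_doses` and the example parameter values are hypothetical; NumPy's `lognormal` is parameterized by the mean and standard deviation of the underlying normal distribution, so the geometric mean and geometric standard deviation are converted with natural logarithms.

```python
import numpy as np


def sample_source_doses(gm, gsd, alpha, n_individuals, rng):
    """Sample TD values for one source across a population.

    A fraction alpha of individuals receives a lognormal dose with
    geometric mean gm and geometric standard deviation gsd (gsd >= 1);
    the remaining individuals receive a dose of zero.
    """
    exposed = rng.random(n_individuals) < alpha
    doses = rng.lognormal(mean=np.log(gm), sigma=np.log(gsd),
                          size=n_individuals)
    return np.where(exposed, doses, 0.0)


rng = np.random.default_rng(0)
example_doses = sample_source_doses(gm=1.0, gsd=3.0, alpha=0.2,
                                    n_individuals=1000, rng=rng)
print((example_doses > 0).mean())  # fraction exposed, close to alpha
```

Setting `alpha` to 1 recovers a plain lognormal distribution in which every individual is exposed to the source.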
The values of the daily doses of a chemical that an individual receives from a source could range from a few molecules to several hundred grams. This suggests that in the simulation models, the values of $GM_k$ and $GSD_k$ should be allowed to result in a wide range of values for $TD_{jk}$. The average of $GM_k$ for the sources in a population is a scalar that equally affects all values of $GM_k$ and, by extension, all values of $TD_{jk}$ and $CTD_j$. Since $MCR_j$ is the ratio of $CTD_j$ to max($TD_{jk}$), the values of the three metrics (median MCR, high-risk MCR, and MCR slope) are all independent of the average of $GM_k$. While the average value of $GM_k$ has no impact on the three metrics, the range in the source-specific values of $GM_k$ for the m sources of a population does have an effect. The intrapopulation variation in the values of $GM_k$ is modeled by assuming that the values of $GM_k$ in a population's sources follow a $\log_{10}$-uniform distribution. The range can be described by assigning a value of one to the average value of $GM_k$ and using a spread parameter ($S_{GM}$) that is defined as follows:

$$S_{GM} = \log_{10}(GM_k)_{\max} - \log_{10}(GM_k)_{\min}. \tag{16}$$

The maximum and minimum values of the logs of the averages of the source-specific dose distributions for a population, $\log_{10}(GM_k)_{\max}$ and $\log_{10}(GM_k)_{\min}$, are then $0.5\,S_{GM}$ and $-0.5\,S_{GM}$, respectively. When $S_{GM}$ is equal to zero for a population, the $GM_k$ of all sources has a value of one. A value of six indicates that the values of $GM_k$ for the sources of a population could vary by a factor of 1 million. Interpopulation differences in the spread of $GM_k$ are modeled by varying the value of $S_{GM}$ across the populations. The interpopulation variation in $S_{GM}$ is modeled by sampling from a uniform distribution between zero and the maximum population spread ($PS_{GM}$) in the simulation run.
The intrapopulation variation in $GSD_k$ across sources can also be large. Values of $TD_{jk}$ from one source can vary over the population's individuals by many orders of magnitude due to differences in chemical concentrations in relevant media and variation in the rates of intake and uptake of the chemicals. For another source, where the concentrations are fixed and exposures occur in a similar way, the source-specific doses may be relatively constant across the population. $GSD_k$ is also limited to values greater than one. The average value of $GSD_k$, as well as its range, affects the values of the three population metrics. Therefore, both the inter- and intrapopulation variation of the average and the spread of $GSD_k$ need to be considered in the model.
Simulating the values of $GSD_k$ for a population's sources uses an approach similar to that used for $GM_k$. The intrapopulation variation in $GSD_k$ is again modeled using a $\log_{10}$-uniform distribution defined by $\log_{10}(GSD_k)_{\min}$ and $\log_{10}(GSD_k)_{\max}$. The values of $\log_{10}(GSD_k)_{\min}$ and $\log_{10}(GSD_k)_{\max}$ are defined using the average and spread of the values of $\log_{10}(GSD_k)$ in a population's sources ($M_{GSD}$ and $S_{GSD}$). $S_{GSD}$ is defined as:

$$S_{GSD} = \log_{10}(GSD_k)_{\max} - \log_{10}(GSD_k)_{\min}. \tag{17}$$

The values of $\log_{10}(GSD_k)_{\min}$ and $\log_{10}(GSD_k)_{\max}$ are then $M_{GSD}$ minus and plus $0.5 \times S_{GSD}$, respectively. The interpopulation variation in $M_{GSD}$ is determined by sampling values of $M_{GSD}$ from a uniform distribution between zero and a maximum value of $M_{GSD}$ for all populations in a simulation run ($PM_{GSD}$). Because values of $\log_{10}(GSD_k)_{\min}$ cannot be less than zero, the value of $S_{GSD}$ for a population must be less than or equal to $2 \times M_{GSD}$. The interpopulation variation in $S_{GSD}$ is therefore modeled by sampling from a uniform distribution between zero and $2 \times M_{GSD}$.
Values for $\alpha_k$ fall between 0 and 1. In the simulation model, the intrapopulation variation in $\alpha_k$ is modeled as a uniform distribution that ranges from $\alpha_{k,\min}$ to $\alpha_{k,\max}$. In this model, values of $\alpha_{k,\min}$ and $\alpha_{k,\max}$ are held constant across the populations in a model run (interpopulation variation is zero).
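The two-level sampling of $GM_k$ and $GSD_k$ described above can be sketched as follows. The function name `sample_population_parameters` is hypothetical, and the sketch assumes independent draws of the population-level spread and average before the source-level values are drawn.

```python
import numpy as np


def sample_population_parameters(m, ps_gm, pm_gsd, rng):
    """Draw GM_k and GSD_k for the m sources of one simulated population.

    Population level: S_GM ~ U(0, PS_GM), M_GSD ~ U(0, PM_GSD),
                      S_GSD ~ U(0, 2 * M_GSD).
    Source level:     log10(GM_k)  ~ U(-S_GM / 2, +S_GM / 2),
                      log10(GSD_k) ~ U(M_GSD - S_GSD / 2, M_GSD + S_GSD / 2).
    """
    s_gm = rng.uniform(0.0, ps_gm)
    m_gsd = rng.uniform(0.0, pm_gsd)
    s_gsd = rng.uniform(0.0, 2.0 * m_gsd)  # keeps log10(GSD_k) >= 0

    log_gm = rng.uniform(-0.5 * s_gm, 0.5 * s_gm, size=m)
    log_gsd = rng.uniform(m_gsd - 0.5 * s_gsd, m_gsd + 0.5 * s_gsd, size=m)
    return 10.0 ** log_gm, 10.0 ** log_gsd


rng = np.random.default_rng(0)
gm, gsd = sample_population_parameters(m=20, ps_gm=6.0, pm_gsd=2.5, rng=rng)
print(gm.min(), gm.max(), gsd.min(), gsd.max())
```

Capping the spread at twice the average guarantees that every sampled $GSD_k$ is at least one, as a geometric standard deviation must be.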
An individual's exposures to one source may be correlated to exposures of a second. The simulation model is used to investigate the impacts of correlation of TD between sources. In this analysis, correlation is modeled using rank correlation. All sources are assumed to be similarly correlated. This assumption exaggerates the impacts of correlation since in many instances different levels of correlation occur between different pairs of sources.
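One way to induce a common rank correlation between all pairs of sources is a Gaussian copula, sketched below. This is a substitute technique chosen for illustration; the method used by the @Risk software in the original analysis may differ. The sketch relies on the bivariate-normal relationship between the Pearson correlation $r$ and the Spearman correlation $\rho_s$, $r = 2\sin(\pi\rho_s/6)$, and on the fact that the monotone lognormal transform preserves ranks. Function names are hypothetical.

```python
import numpy as np


def correlated_normal_scores(n_individuals, m, spearman_rho, rng):
    """Standard-normal scores with a common pairwise rank correlation.

    Uses a one-factor construction: each score mixes a shared component
    with an independent component. Requires 0 <= spearman_rho < 1.
    """
    pearson = 2.0 * np.sin(np.pi * spearman_rho / 6.0)
    common = rng.standard_normal((n_individuals, 1))
    unique = rng.standard_normal((n_individuals, m))
    return np.sqrt(pearson) * common + np.sqrt(1.0 - pearson) * unique


def correlated_doses(gm, gsd, z):
    """Map normal scores to lognormal doses; ranks are preserved."""
    return np.exp(np.log(gm) + np.log(gsd) * z)


rng = np.random.default_rng(1)
z = correlated_normal_scores(2000, 3, spearman_rho=0.5, rng=rng)
doses = correlated_doses(gm=np.array([1.0, 10.0, 0.1]),
                         gsd=np.array([2.0, 3.0, 5.0]), z=z)
print(doses.shape)
```

Because every pair of sources shares the same correlation, this construction mirrors the simplifying assumption that all sources are similarly correlated.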
The analyses were performed in Excel™ using the Excel add-on software @Risk™ (version 7.6.0). A copy of the spreadsheet containing the model used in this analysis is provided in the Supporting Information.

Model Runs and Outputs
Six model runs of 5,000 populations, each containing 1,000 individuals, were performed (Table I). The values of m for the first and second runs were held constant at 20 and 100, respectively. Each individual was assumed to be exposed to all sources. The third run had an m of 100, but each individual had a 20% chance of exposure to each source. In this run, the average individual was exposed to 20 sources (as was the case for run one), but the number of sources varied across individuals. Runs one through three assumed no correlation between the doses from each source. Runs four through six assumed an m of 100 and that all individuals were exposed, but the individuals' values of $TD_{jk}$ from each source were rank correlated to varying degrees (Spearman correlation coefficients of 0.2, 0.5, and 0.8).
All the model runs varied $GM_k$ and $GSD_k$ using the approach described above. In all six runs, the values of $PS_{GM}$ and $PM_{GSD}$ were set at 6.0 and 2.5, respectively. A value of 6.0 for $PS_{GM}$ indicates that the values of $GM_k$ for the sources of some populations could be very similar in size and in other populations could vary by a factor of 1 million. A value of 2.5 for $PM_{GSD}$ indicates that the sources of some populations provide a uniform dose across all individuals ($M_{GSD}$ equal to 0). For populations where $M_{GSD}$ has a value of 2.5, the $\log_{10}(GSD_k)_{\max}$ of a source could be as large as five. A source with a $\log_{10}(GSD_k)$ of five would have $TD_{jk}$ values at plus and minus two geometric standard deviations of the geometric mean that differ by 20 orders of magnitude.
The results of each model run were used to generate a series of six scatter plots that show the effects of the range of $GM_k$ and the average value of $GSD_k$ in a population on the three population metrics. These two parameters were selected for the following reasons. First, the impacts of variation in values of the other model parameters (e.g., m and $\alpha$) are explored in the different model runs. Second, as discussed above, variation in the average of the GM for a population's sources has no impact on the population metrics. Finally, while both the average and the spread of GSD values across a population's sources affect the metrics, the populations' spreads of GSD values are tightly correlated with the populations' average GSD values. As a result, plotting the three metrics against the spread of GSD gives plots very similar to those against average GSD (results not shown).
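The relationship between $\log_{10}(GSD_k)$ and the spread of doses can be written out explicitly. The short derivation below uses only the definition of the geometric standard deviation; it is a worked illustration rather than part of the original analysis.

```latex
% Ratio of the doses at +2 and -2 geometric standard deviations
% around the geometric mean of a lognormal distribution:
\[
\frac{GM_k \cdot GSD_k^{\,2}}{GM_k \cdot GSD_k^{-2}}
  = GSD_k^{\,4}
  = 10^{\,4\log_{10}(GSD_k)} .
\]
% Examples:
%   log10(GSD_k) = 5   gives a ratio of 10^{20} (20 orders of magnitude)
%   log10(GSD_k) = 0.5 gives a ratio of 10^{2}  (a factor of 100)
```

The second example corresponds to the threshold of 0.5 used to split populations when the results are summarized.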

RESULTS
The results for the six model runs are presented as a series of scatter plots and a summary table. Fig. 3 shows the results for model runs one through three, and Fig. 4 shows the results for runs four through six. Each figure presents the six scatter plots of the three population metrics (median MCR, high-risk MCR, and MCR slope) plotted against the range of $GM_k$ and the average $GSD_k$ of the distributions of the populations' doses. Table II gives information on the values of the three population metrics for the 5,000 populations simulated in each of the six model runs. Results are given for all populations in a run and for populations with average GSD values above and below 0.5. To capture the range of values for the metrics across the 5,000 populations, three percentiles (2.5, 50, and 97.5) are reported. The range of values of the metrics can be viewed as the range of plausible values that could occur when certain criteria are defined for a population.
The size of the range of $GM_k$ values for a population's sources has a modest effect on $MCR_j$, with wider ranges of $GM_k$ suppressing $MCR_j$ values. The size of the range had little effect on the MCR slopes. When the ranges of $GM_k$ values for a population's sources are small, median MCR values approach m for a small number of populations.
Differences in the average $GSD_k$ of the source-specific doses have a larger effect on $MCR_j$ and the MCR slope than the range of $GM_k$. Values of $MCR_j$ only approach m when values of average $GSD_k$ are close to zero. Populations where the average $GSD_k$ value is greater than 0.5 in $\log_{10}$ units (individuals at two standard deviations above and below the GM differ by a factor of 100 or greater) have low values of $MCR_j$ and negative slopes, and their values of $MCR_j$ are largely independent of m and $\alpha_k$. In populations with average $GSD_k$ values of less than 0.5, the values of $MCR_j$ become dependent on m and $\alpha_k$. Positive MCR slopes in runs one and two increase with m, but positive slopes are limited to the small fraction of the populations where the average values of $GSD_k$ are less than 0.2.
In the first three runs, the majority of the populations display the pattern of small values (relative to m) for the median and high-risk MCR and negative MCR slopes (Fig. 3). The fivefold increase in m between runs one and two increases the various measures of $MCR_j$ values. The size of the increase varies for different percentiles of median and high-risk MCR (Table II), but all increases are smaller than the increase in m. The 2.5th and 50th percentiles of the median and high-risk MCR are largely unchanged, but the 97.5th percentiles of the two metrics increased by factors of 3.2 and 3.8, respectively. As stated above, the effects of an increase in m on MCR are more pronounced when $GSD_k$ values are less than 0.5. Increasing m made the MCR slopes more negative.
In run three, where the number of sources that reach an individual varies across individuals, the fraction of the MCR slopes that are positive increases compared to run one, where the number is constant (Fig. 3). The values of the three percentiles of median MCR are largely unchanged between runs one and three, but the 50th and 97.5th percentiles of the high-risk MCR increase by roughly a factor of 1.5 (Table II). Again, the differences between the two runs are most pronounced when $GSD_k$ values are less than 0.5.
Runs four through six investigate the impact of introducing correlations into an individual's $TD_{ijk}$ values (Fig. 4). While the pattern observed in the first three runs occurs in these runs as well, correlations in source-specific values reduce the variation among an individual's $TD_{ijk}$ values and, as a result, values of median and high-risk MCR increase and the MCR slopes become flatter with increasing correlation. The impacts are most significant for populations where all correlation coefficients are greater than 0.2 and average $GSD_k$ values are less than 0.5.

DISCUSSION
This article began with the observation that a "pattern" is frequently observed in studies of combined exposures. In this pattern, risks posed by such combined exposures are almost always driven by exposures from a few chemicals and sources, and the exposures of individuals with the highest combined exposures are frequently driven by a single chemical from a single source. This analysis used computer simulations of combined exposures to investigate when multiple sources of chemicals drive the largest risks in a population and when a chemical from a single source is responsible for the largest risks. The modeling demonstrated that the "pattern" consistently occurs when distributions of source-specific doses follow right-skewed distributions, with the following exceptions. As shown in Fig. 3 (runs one through three) and in Table II, the pattern does not occur in simulated populations where the interindividual variation of source-specific doses is small. Fig. 4 (runs four through six) shows that moderate-to-high correlations between source-specific doses increase the likelihood that the pattern would not occur. Finally, variation in the number of sources affecting different individuals in the population and similar measures of central tendency across source-specific doses further suppress the occurrence of the pattern when the interindividual variation of source-specific doses is small (Fig. 3 and Table II).
As discussed in the Introduction, the finding that this pattern widely occurs has implications for assessments of risks from combined exposures. Values of the high-risk MCR close to one, together with findings of safety for each source of exposure, imply that combined exposures will also be safe. In such cases, there may be no need to determine a population's combined exposures. This is a valuable finding since combined exposure assessments are often difficult to perform and can be resource intensive.
The problem with using high-risk MCR values in determining the need for assessing combined exposures is that the values of MCR for a population cannot be determined unless a combined exposure assessment has already been performed (Organisation for Economic Co-operation and Development, 2018). This limits the use of MCR to retroactive analyses. This article presents a solution to this problem by demonstrating relationships between measurable characteristics of source-specific doses in exposed populations that could be used to predict when the "pattern" occurs and high-risk MCR values will be close to one. Specifically, small values of high-risk MCR occur in populations where the distributions of source-specific doses have average GSDs that are greater than 0.5 (in $\log_{10}$ units) and GMs of different sizes, the number of source-specific doses that reach an individual is constant over the population, and there are low or no correlations between individuals' source-specific doses.
With additional research, it may be possible to develop criteria, or processes, that allow data on interindividual variation to quantitatively predict whether risks from combined exposures exceed the risks that occur from separate exposures to individual chemicals and sources. Under such approaches, data would be collected on the number of sources affecting a common population, correlations between the source-specific doses, and the size of the variation in the doses across individuals. Based on such data, estimates would be made of the upper bound of the likely range of values for the population's high-risk MCR. This upper bound value could be multiplied by the largest upper bound estimate of the source-specific doses for the population. If the resulting estimate of exposure is acceptable, then the assessor would have a basis for concluding that combined risks do not pose a problem for the population.
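The screening calculation proposed here amounts to a single multiplication, sketched below under the assumption that doses are expressed as toxicity-adjusted doses, so that a value below one is acceptable. The function name and all numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def screening_combined_dose(upper_bound_doses, mcr_upper_bound):
    """Screening estimate of the upper-bound combined dose.

    upper_bound_doses: upper-bound toxicity-adjusted dose for each source.
    mcr_upper_bound:   upper bound on the population's high-risk MCR.
    """
    return mcr_upper_bound * max(upper_bound_doses)


# Hypothetical inputs: three sources and a high-risk MCR bound of 2.
estimate = screening_combined_dose([0.10, 0.30, 0.05], mcr_upper_bound=2.0)
acceptable = estimate < 1.0  # criterion for toxicity-adjusted doses
print(estimate, acceptable)
```

In this hypothetical case the largest source alone accounts for 0.30, and the factor of two raises the screening estimate to 0.60, which remains below the criterion of one.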
The use of a multiplicative factor to account for combined exposures has been suggested in the past (KEMI, 2015;Martin, Martin, & Kortenkamp, 2013) but researchers have struggled to develop objective criteria to determine the size of such a factor. The concept of the high-risk MCR and the relationship between the high-risk MCR and the characteristics of distributions of source-specific doses established in this article may be helpful in such endeavors.
Finally, the initial part of the "pattern" (that the sum of separate values is often driven by a small number of the values) is not a novel finding. It has been found to occur in many fields (the majority of sales tend to come from a minority of clients, the majority of charitable donations come from a few contributors, and the majority of one's Facebook posts come from a small number of Facebook friends). The early observation of this behavior was attributed to Vilfredo Pareto and is frequently termed the Pareto principle (Backhaus, 1980;Juran, 2005). The second portion of the pattern (that the proportion of the sum from the largest value increases with the size of the sum for sums of values generated by sampling right-skewed distributions) is believed to be a novel finding.