#### Experimental

Two parameters from the model, namely the substrate and enzyme concentrations (S_{0} and E_{0}), can be readily varied in experiments, and we therefore first compared measurements and modeling in trials in which S_{0} and E_{0} were systematically changed. Figure 3A shows a family of calorimetric measurements in which Cel7A was titrated to different initial substrate concentrations (S_{0} in μM of reducing ends; this unit can be readily converted into a weight concentration using the molar mass of a glycosyl unit and the average chain length for the current substrate, DP = 220 glycosyl units). The concentration of Cel7A was 50 nM in these experiments and the experimental temperature was 25 °C. Figure 3B shows model results for the same values of E_{0} and S_{0}. Here, we used the model in Eqn (3) [Eqns (4) and (5)] and manually adjusted the kinetic constants and *n* by trial and error. The parameters in Fig. 3B are *k*_{1} = 0.0004 s^{−1}·μM^{−1}, *k*_{2} = 0.55 s^{−1}, *k*_{3} = 0.0034 s^{−1} and *n* = 150. Comparison of the two panels shows that the idealized description of processive hydrolysis in Eqn (3) cannot account for the overall course of the process, but some characteristics, both qualitative and quantitative, are captured by the model. For example, the model accounts well for the diminished burst (i.e. the disappearance of the maximum) at low S_{0} (below 5–10 μM). In these dilute samples, the rate of cellobiose production C′(*t*) increases slowly to a level which is essentially constant over the time considered in Fig. 3. At higher S_{0}, a clear maximum in C′(*t*) signifies a burst phase in both model and experiment. On a quantitative level, comparisons of the maximal rate at the peak of the burst (*t* = 150 s in Fig. 3C) and after the burst (*t* = 1400 s in Fig. 3C) showed reasonable agreement between experiments and model.
In addition, the substrate concentration that gives half the maximal rate (5–10 μM) is similar in model and experiment to within the experimental scatter (Fig. 3C). Conversely, two features of the experiments do not appear to be captured by Eqn (3). Firstly, the model predicts a sharp termination of the burst phase, which tends to produce a rectangular shape of the C′(*t*) function at high S_{0} (Fig. 3B). This is in contrast with the experiments, which all show a gradual decrease in C′(*t*) after the maximum. Secondly, the model suggests a constant C′(*t*) well within the time frame covered in Fig. 3, but no constancy was observed in the experiments. We return to this after discussing the effect of changing E_{0}.
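A minimal numerical sketch of the simple processive scheme may help to visualize these features. It is not the code used for Fig. 3B. It assumes one reading of Eqn (3): free enzyme binds with the pseudo-first-order rate *k*_{1}S_{0} (valid only for S_{0} >> E_{0}), each bound enzyme releases one cellobiose per step at rate *k*_{2}, any complex can dissociate at rate *k*_{3}, and an enzyme that has made *n* steps is stuck until it dissociates. The function name `simulate_burst`, the forward-Euler integrator and the step size are illustrative choices.

```python
def simulate_burst(e0, s0, k1, k2, k3, n, t_end, dt=0.2):
    """Forward-Euler integration of an assumed processive burst scheme.

    Concentrations in uM, time in s, k1 in s^-1 uM^-1, k2 and k3 in s^-1.
    bound[i] is the enzyme that has made i cuts; bound[n] is stuck at the obstacle.
    Returns (times, rates, free, bound), where rates is C'(t) in uM/s.
    """
    free = e0
    bound = [0.0] * (n + 1)
    times, rates = [], []
    n_steps = int(round(t_end / dt))
    for step in range(n_steps + 1):
        times.append(step * dt)
        # Only complexes that have not yet reached the obstacle produce cellobiose.
        rates.append(k2 * sum(bound[:n]))

        binding = k1 * s0 * free      # pseudo-first-order association
        release = k3 * sum(bound)     # dissociation from every bound state
        d_bound = [0.0] * (n + 1)
        d_bound[0] = binding - (k2 + k3) * bound[0]
        for i in range(1, n):
            d_bound[i] = k2 * bound[i - 1] - (k2 + k3) * bound[i]
        d_bound[n] = k2 * bound[n - 1] - k3 * bound[n]

        free += dt * (release - binding)
        for i in range(n + 1):
            bound[i] += dt * d_bound[i]
    return times, rates, free, bound


if __name__ == "__main__":
    # Parameters quoted for Fig. 3B; E0 = 50 nM = 0.05 uM, S0 = 40.8 uM.
    t, rate, free, bound = simulate_burst(0.05, 40.8, 0.0004, 0.55, 0.0034, 150, 1400.0)
    peak = max(rate)
    print("peak rate (uM/s):", peak, "at t =", t[rate.index(peak)], "s")
    print("rate at 1400 s (uM/s):", rate[-1])
```

With these parameters the sketch gives a maximum in C′(*t*) followed by a lower, slowly varying rate, which is the qualitative burst behavior described above. A stiff or adaptive ODE solver would be preferable for quantitative work.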

Figure 4 shows a comparison of the calorimetric measurements and model results for a series in which the enzyme load was varied and S_{0} was kept constant at 40.8 μM reducing ends. The model calculations were based on the same parameters as in Fig. 3 without any additional fitting, and it appears that C′(*t*) increases proportionally to E_{0}. This behavior, which was seen in both model and experiment, implies that the turnover number C′(*t*)/E_{0} is constant over the studied range of time and concentration, and this, in turn, suggests that the extent of the burst scales with E_{0}. To analyze this further, π_{processive} was estimated from the data in Fig. 4. For the model results (Fig. 4B), this is simply done by inserting the kinetic parameters in Eqn (6). For the experimental data, we first numerically integrated the rates in Fig. 4A to obtain the concentration of cellobiose C(*t*), and then extrapolated linear fits to the data between 1400 and 1600 s to the ordinate as illustrated in the inset of Fig. 5. In analogy with the procedure used for nonprocessive enzymes (Fig. 1A), this intercept between the extrapolation and the C(*t*) axis was taken as a measure of the experimental π_{processive}.
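The extrapolation procedure for the experimental data can be sketched as follows. This is an illustration under stated assumptions rather than the routine actually used: trapezoidal integration and an ordinary least-squares line are assumed, and the function names and the synthetic test rate are hypothetical.

```python
import math


def cumulative_trapezoid(times, rates):
    """Integrate C'(t) to C(t) with the trapezoidal rule, starting from C(0) = 0."""
    conc = [0.0]
    for i in range(1, len(times)):
        step = 0.5 * (rates[i] + rates[i - 1]) * (times[i] - times[i - 1])
        conc.append(conc[-1] + step)
    return conc


def burst_intercept(times, rates, fit_start=1400.0, fit_end=1600.0):
    """Fit a line to C(t) within [fit_start, fit_end] and extrapolate to t = 0.

    Returns (slope, intercept); the intercept is the estimate of the burst size.
    """
    conc = cumulative_trapezoid(times, rates)
    window = [(t, c) for t, c in zip(times, conc) if fit_start <= t <= fit_end]
    m = len(window)
    mean_t = sum(t for t, _ in window) / m
    mean_c = sum(c for _, c in window) / m
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in window)
    sxx = sum((t - mean_t) ** 2 for t, _ in window)
    slope = sxy / sxx
    return slope, mean_c - slope * mean_t


if __name__ == "__main__":
    # Synthetic rate: a constant level plus an exponentially decaying burst.
    times = [float(i) for i in range(1601)]
    rates = [0.01 + 0.02 * math.exp(-t / 100.0) for t in times]
    slope, intercept = burst_intercept(times, rates)
    print("late slope:", slope, "extrapolated intercept:", intercept)
```

For the synthetic rate the analytical intercept is the integrated burst, 0.02 × 100 = 2, which the numerical estimate should closely reproduce.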

The proportionality of the theoretical π_{processive} and E_{0} seen in Fig. 5 follows directly from Eqn (6). The slope of the theoretical curve is about 42, suggesting that each enzyme molecule completes 42 catalytic cycles (produces 42 cellobiose molecules) during the burst phase. This is about three times less than the obstacle-free path (*n*), which is 150 in these calculations, and this discrepancy simply reflects that *k*_{1}S_{0} is too small for the simple relationship π_{processive} = *n*E_{0} to be valid (see Theory section). Thus, low *k*_{1} and the concomitant slow ‘on rate’ tend to smear out the burst and, consequently, π_{processive}/E_{0} < *n*. This is a general weakness of the extrapolation procedure [17,18], also visible in Fig. 1, where the dotted line intersects the ordinate at a value slightly less than E_{0}. It occurs when the rate constants and S_{0} attain values that make the fractions on the right-hand side of Eqns (2) and (6) smaller than unity (this implies that the criteria for a simple π expression, *k*_{1}S_{0} >> *k*_{3} + *k*_{−1} and *k*_{2} >> *k*_{3}, discussed in the Theory section, are not met [17,18]). More importantly, the experimental data also show proportionality between π_{processive} and E_{0} with a comparable slope (about 65), and this supports the general validity of Eqn (3).

We now return to the two general shortcomings of Eqn (3) which were identified above: (a) the abrupt termination of the modeled burst phase (Fig. 3B), which is evident for high S_{0} and not seen in the experiments; and (b) the regime with constant C′(*t*) (see, for example, *t* > 500 s in Fig. 4B and inset in Fig. 6), which is also absent in the measurements. We suggest that, at least to some extent, (a) is a consequence of the ‘polydispersity’ in *n* in a real substrate and (b) depends on the random inactivation of the enzyme. As discussed in the Theory section, simplified descriptions of these properties may be included in the model, and these modifications considerably improve the concordance between theory and experiment. To illustrate this, we considered a substrate distribution with five subsets (each 20% of S_{0}) with *n* = 40, 70, 100, 130 and 160, respectively. We analyzed the initial 1700 s of all trials in Fig. 3 using Eqn (5) and the nonlinear regression routine in Mathematica 7.0. It was found that, above S_{0} ∼ 15 μM, the parameters derived from each calorimetric experiment were essentially equal, and we conclude that one set of parameters can describe the results in this concentration range. The parameters were *k*_{2} = 1.0 ± 0.2 s^{−1}, *k*_{3} = 0.0015 ± 0.0003 s^{−1} and *k*_{1}S_{0} = 0.0052 ± 0.001 s^{−1}, and some examples of the results are shown in Fig. 6. Parameter interdependence was evaluated partly by the confidence levels given by Mathematica and partly by ‘grid searches’, which provide an unambiguous measure of parameter dependence [28,29] and hence reveal possible overparameterization. In the latter procedure, the standard deviation of the fit was determined in sequential regressions, where two of the rate constants were allowed to change, whilst the third was inserted as a constant with values slightly above or below the maximum likelihood parameter [28,29].
These analyses showed moderate parameter dependence with 95% confidence intervals of about ±10% (slightly asymmetric with larger margins upwards). This limited parameter interdependence is also illustrated in the correlation matrix in Data S1, which shows that all correlation coefficients are below 0.7, and we conclude that it is realistic to extract three rate constants from the experimental data. The parameters from this regression analysis may be compared with recent work [30], which used an extensive analysis of reducing ends in both soluble and insoluble fractions to estimate apparent first-order rate constants for processive hydrolysis and enzyme–substrate disassociation, respectively. Values for the system investigated in Fig. 6 (i.e. *T. reesei* Cel7A and amorphous cellulose) were 1.8 ± 0.5 s^{−1} (hydrolysis) and 0.0032 ± 0.0006 s^{−1} (dissociation) at 30 °C [30]. The concordance of these values, which were derived by a completely different approach, and *k*_{2} and *k*_{3} from Fig. 6 provides strong support of the molecular picture in Eqn (3). With respect to the ‘on rate’, it is interesting to note that a constant value of *k*_{1} provided very poor concordance between theory and experiment (not shown), whereas constant *k*_{1}S_{0} gave satisfactory agreement (Fig. 6). This suggests that the initiation of hydrolysis (adsorption to the insoluble substrate and ‘threading’ of the cellulase) exhibits apparent first-order kinetics. This may reflect the reduced dimensionality or fractal kinetics, which has previously been proposed for cellulase activity on insoluble substrates [31,32], and it appears that the current approach holds some potential for systematic investigations of this phenomenon.

The model could not account for the measurements at the lowest S_{0}, and this may reflect the fact that the assumption S_{0} >> E_{0}, used in the derivation of the expression for C′(*t*), becomes unacceptable. In terms of the concentration of reducing ends, the ratio S_{0} : E_{0} ranges from 30 to 2200 in this work (for S_{0} = 15 μM, it is 300). If, however, we use instead the accessible area of amorphous cellulose, which is about 42 m^{2}·g^{−1} [33], and a footprint of 24 nm^{2} for Cel7A [34], we find an S_{0} : E_{0} area ratio (total available substrate area divided by monolayer coverage area of the whole enzyme population) which is an order of magnitude smaller (3–240). These latter numbers are rough approximations as the average area of randomly adsorbed enzymes will be larger than the footprint, and only a certain fraction of the enzyme will be adsorbed in the initial stages. Nevertheless, the analysis suggests that not all reducing ends are available in amorphous cellulose, and hence the deficiencies of the model at substrate concentrations below 15 μM could reflect the fact that the premise S_{0} >> E_{0} becomes increasingly unrealistic.
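The area-based ratio can be reproduced with a short calculation. The sketch below uses the values quoted in the text (DP = 220, 42 m^{2}·g^{−1}, 24 nm^{2}). The molar mass of an anhydroglucose unit (162.14 g·mol^{−1}) is an assumption introduced here for the conversion from reducing ends to weight concentration, and the function name `area_ratio` is hypothetical.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
GLYCOSYL_MOLAR_MASS = 162.14      # g/mol, anhydroglucose unit (assumed value)


def area_ratio(s0_uM, e0_nM, dp=220, area_m2_per_g=42.0, footprint_nm2=24.0):
    """Accessible substrate area divided by the monolayer area of all enzyme.

    s0_uM: reducing ends in uM; e0_nM: enzyme in nM. Both areas are per liter.
    """
    cellulose_g_per_l = s0_uM * 1e-6 * dp * GLYCOSYL_MOLAR_MASS
    substrate_area_m2 = cellulose_g_per_l * area_m2_per_g
    enzymes_per_l = e0_nM * 1e-9 * AVOGADRO
    monolayer_area_m2 = enzymes_per_l * footprint_nm2 * 1e-18  # nm^2 to m^2
    return substrate_area_m2 / monolayer_area_m2


if __name__ == "__main__":
    molar_ratio = 15.0e-6 / 50.0e-9
    print("molar ratio:", molar_ratio)
    print("area ratio:", area_ratio(15.0, 50.0))
```

For S_{0} = 15 μM and E_{0} = 50 nM, the molar ratio is 300, whereas this estimate of the area ratio is about 31, i.e. roughly an order of magnitude smaller, in line with the range given above.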

The results in Fig. 6 are for the fixed average and distribution of *n* mentioned above. We also tried wider or narrower distributions with five subsets, distributions with 10 subsets and distributions with a predominance of *n* values close to the average (e.g. 5%, 20%, 50%, 20%, 5%, instead of equal amounts of the five subsets). The regression analysis with these different interpretations of *n* polydispersity gave comparable fits and parameters. In addition, average *n* values of 100 ± 50 were found to account reasonably for the measurements, and we conclude that detailed information on the obstacle-free path *n* will require a broader experimental material, particularly investigations of different types of substrate.

We consistently found that the experimental C′(*t*) fell below the model towards the end of the 1-h experiments (see inset in Fig. 6). For a series of 4-h experiments (not shown), this tendency was even more pronounced. This was interpreted as protein inactivation, as discussed in the Theory section. Numeric analysis with respect to Eqn (7) showed that the inclusion of inactivation and the same polydispersity as in Fig. 6 enabled the model to fit the data reasonably over the studied time frame for S_{0} above approximately 15 μM. Some examples of this for different S_{0} are shown in Fig. 7.

The parameters from the analysis in Fig. 7 were *k*_{1}S_{0} = (5.2 ± 1.6) × 10^{−3} s^{−1}, *k*_{2} = 1 ± 0.3 s^{−1}, *k*_{3} =*k*_{−1} = (1.2 ± 0.6) × 10^{−3} s^{−1} and *k*_{4} = (2 ± 0.7) × 10^{−4} s^{−1}. The parameter dependence of these fits is illustrated in the correlation matrix in Data S1. It appears that *k*_{3} and *k*_{4} show some interdependence, with an average correlation coefficient of 0.88, whereas other correlation coefficients are low or very low. This result supports the validity of extracting four parameters from the analysis in Fig. 7. The parameters for *k*_{1}S_{0}, *k*_{2} and *k*_{3} are essentially equal to those from the simpler analysis in Fig. 6, and the inactivation constant *k*_{4} is about an order of magnitude lower than *k*_{3}. The rates in Fig. 7 were integrated to give the concentration C(*t*), and two examples are shown in Fig. 8. In this presentation, the accordance between model and experiment appears to be better, and this underscores the fact that the rate function C′(*t*) provides a more discriminatory parameter for modeling than does the concentration C(*t*). Figure 8 also shows that the percentage of cellulose converted during the experiment (right-hand ordinate) ranges from a fraction of a percent for the higher to a few percent for the lower S_{0} values.

The qualitative interpretation of Fig. 7 is that Cel7A produces a burst in hydrolysis when enzymes make their initial ‘rush’ down a cellulose strand towards the first encounter with a ‘check block’, and then enters a second phase with a slow, single-exponential decrease in C′(*t*) as the enzymes gradually become inactivated. In this latter stage, all enzymes have encountered a ‘check block’ and, in this sense, it corresponds to the constant rate regime in Fig. 2. Unlike in Fig. 2, however, C′(*t*) is not constant, but decreasing, as dictated by the rate constant of the inactivation process *k*_{4}. In this interpretation, the extent of inactivation scales with enzyme activity (number of catalytic steps) and not with time. Hence, for any enzyme–substrate complex EC_{n−i}, the probability of experiencing inactivation when it moves one step to the right in Eqn (7) is approximately *k*_{4}/*k*_{2} (as *k*_{4} << *k*_{2}). For the parameters in Fig. 7, this translates to about one inactivation for every 5000 hydrolytic steps, which is consistent with the frequency of inactivation (1 : 6000) suggested for a cellobiohydrolase working on soluble cello-oligosaccharides [35]. As the final C(*t*) is about 40 μM in Fig. 8, and we used E_{0} = 50 nM, each enzyme has performed about 800 hydrolytic steps in these experiments. With a probability of 2 × 10^{−4}, some inactivation can be observed within the experimental time frame used here, and this is further illustrated in Fig. 11. It is also interesting to note that the probability of hydrolysis of an EC_{n−i} complex (*k*_{2}) is about 800 times larger than the probability of disassociation (*k*_{3}), and hence a processivity of that magnitude would be expected for an ideal, ‘obstacle-free’ cellulose strand.
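The arithmetic behind these estimates can be checked with a few lines. The sketch assumes that inactivation simply competes with hydrolysis at each step, so that the per-step probability is *k*_{4}/(*k*_{2} + *k*_{4}), which is close to *k*_{4}/*k*_{2} when *k*_{4} << *k*_{2}. The estimate of the surviving fraction is an illustration added here and not a result from the fits, and the function names are hypothetical.

```python
def inactivation_probability(k2, k4):
    """Per-step probability of inactivation, assuming it competes only with hydrolysis."""
    return k4 / (k2 + k4)


def steps_per_enzyme(product_uM, e0_uM):
    """Average number of hydrolytic steps per enzyme molecule."""
    return product_uM / e0_uM


def surviving_fraction(p_step, n_steps):
    """Fraction of enzyme still active after n_steps independent steps."""
    return (1.0 - p_step) ** n_steps


if __name__ == "__main__":
    k2, k3, k4 = 1.0, 1.2e-3, 2.0e-4        # s^-1, values quoted for Fig. 7
    p = inactivation_probability(k2, k4)
    steps = steps_per_enzyme(40.0, 0.05)     # 40 uM product, 50 nM enzyme
    print("steps per inactivation:", 1.0 / p)
    print("steps per enzyme:", steps)
    print("fraction still active:", surviving_fraction(p, steps))
    print("k2/k3 (ideal processivity):", k2 / k3)
```

Under these assumptions, roughly 15% of the enzyme would be inactivated after 800 steps, which is consistent with inactivation being detectable, but not dominant, within the experimental time frame.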

The notion of two partially overlapping phases of the slowdown is interesting in the light of the experimental observations of a ‘double exponential decay’ reported for the rate of cellulolysis [6,36–38]. In these studies, hydrolysis rates for quite different systems were successfully fitted to empirical expressions of the type C′(*t*) = Ae^{−αt} + Be^{−βt}. This behavior has been associated with two-phase substrates (high and low reactivity) [37], but, in the current interpretation, it relies on the properties of the enzyme. The first (rapid) time constant *α* reflects the gradual termination of the burst as the enzymes encounter their first ‘check block’, and the second (slower) constant *β* represents inactivation and is related to *k*_{4} in Eqn (7). As the extent of the first phase will scale with the amount of protein, this interpretation is congruent with the proportional growth of π_{processive} with E_{0} shown in Fig. 5. This enzyme-based interpretation of the double exponential decay predicts that a second injection of enzyme to a reacting sample would generate a second burst (whereas a second burst in C′(*t*) would not be expected if the slowdown relied on the depletion of good substrate). Figure 9 shows that a second dosage of Cel7A after 1 h indeed gives a second burst, which is similar to the first, and this further supports the current explanation of the double exponential slowdown.

In the last section, we show two examples of how the analysis of the kinetic parameters may elucidate certain aspects of the activity of Cel7A. First, we consider changes in the ratio *k*_{1}S_{0}/*k*_{3}. This reflects the ratio of the ‘on rate’ and ‘off rate’. At a fixed *k*_{2}, a change in this ratio may be interpreted as a change in the affinity of the enzyme for the substrate. Hence, we can assess relationships of this ‘affinity parameter’ and the hydrolysis rate C′(*t*). The results of such an analysis using S_{0} = 25 μM and the simple model [Eqn (3)] are illustrated in Fig. 10. The black curve, which is the same in all three panels, represents the cellobiose production rate C′(*t*), calculated using the parameters from Fig. 3. Figure 10A illustrates the effects of increased ‘affinity’, inasmuch as *k*_{1}/*k*_{3} is enlarged by factors of two, three and five for the red, green and blue curves, respectively. This was performed by both multiplying the original *k*_{1} and dividing the original *k*_{3} by √2, √3 and √5, respectively. It appears that these changes strongly promote the initial burst, but also decrease the rate later in the process (the curves cross over around *t* = 300 s). This decrease in C′(*t*) is mainly a consequence of smaller *k*_{3} values (‘off rates’), which make the release of enzymes stuck in front of a ‘check block’ the rate-limiting step [the population of inactive EC_{x} in Eqn (3) increases]. Figure 10B shows the results when the *k*_{1}/*k*_{3} ratio is decreased in an analogous fashion. This reduces C′(*t*) over the whole time course, and this is mainly because the population of unbound (aqueous) enzyme becomes large when *k*_{1} (the ‘on rate’) is diminished. The blue curves in Fig. 10B, C also illustrate how a moderate increase in *k*_{3} tends to abolish the burst (maximum) in C′(*t*) altogether. This is because the inhibitory effect of the ‘check block’, as defined by the broken line in Fig. 2, becomes unimportant when the release rate is increased. Multiplying both *k*_{1} and *k*_{3} by a common factor will obviously not change the ratio (or ‘affinity’), but will speed up both adsorption and desorption, and hence increase the rate of hydrolysis (Fig. 10C).
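The parameter manipulations used for Fig. 10 can be written compactly. The sketch below assumes that a change of the ratio *k*_{1}/*k*_{3} by a factor *f* is obtained by multiplying *k*_{1} and dividing *k*_{3} by √*f*, so that the two constants are shifted symmetrically; the function names are hypothetical and the starting values are those quoted for Fig. 3.

```python
import math


def scale_affinity(k1, k3, factor):
    """Change k1/k3 by `factor`, splitting the change evenly between k1 and k3."""
    root = math.sqrt(factor)
    return k1 * root, k3 / root


def scale_both(k1, k3, factor):
    """Multiply k1 and k3 by the same factor; the ratio k1/k3 is unchanged."""
    return k1 * factor, k3 * factor


if __name__ == "__main__":
    k1, k3 = 0.0004, 0.0034   # s^-1 uM^-1 and s^-1
    for f in (2.0, 3.0, 5.0):
        up = scale_affinity(k1, k3, f)          # increased 'affinity'
        down = scale_affinity(k1, k3, 1.0 / f)  # decreased 'affinity'
        print(f, up, down)
    print(scale_both(k1, k3, math.sqrt(2.0)))
```

Splitting the change between the ‘on rate’ and the ‘off rate’ keeps the geometric mean of *k*_{1} and *k*_{3} fixed, so that only the ratio varies between curves.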

For the model in Eqn (7), the enzyme is distributed between four states: aqueous (E), catalytically active (EC_{n−i}), stuck at ‘check block’ (EC_{x}) or inactivated (IC_{n−i}). These enzyme concentrations can be numerically derived from the parameters found in Fig. 7. Figure 11 shows an example of such an analysis for E_{0} = 50 nM and S_{0} = 37.4 μM (i.e. corresponding to the middle panel in Fig. 6). It appears that the concentration of free enzyme (E) decreases for about 10 min and then reaches a near-constant (slowly decreasing) level which is about 20% of E_{0}. This calculated course of E(*t*) is in line with earlier experimental results on different types of substrate [39–43]. In addition, an 80% reduction in free enzyme after about 10 min matches our own adsorption measurements for a mixture of *T. reesei* cellulases on amorphous cellulose (L. Murphy, unpublished data). The population of catalytically active enzyme is highest (and about 25% of E_{0}) after a few minutes, but decreases at later stages, as a growing fraction of the enzyme becomes stuck in front of a ‘check block’. After about 12 min, this population is well over half of E_{0} and this transition from active EC_{n−i} to stuck EC_{x} is the origin of the burst in cellobiose production. As the inactivation of enzyme in Eqn (7) is modeled as an irreversible transition, the concentration of this species grows monotonically. This behavior also appears from Fig. 11, but further analysis of IC_{n−i} is postponed until calorimetric trials over extended time frames (and hence more precise values of *k*_{4}) become available.

In summary, we have proposed an explicit model that describes the initial burst and subsequent slowdown in the rate of cellobiose production for processive enzymes such as Cel7A. The focus is on the initial phase of the process, where inhibition from accumulated product and/or the depletion of good attack points on the substrate are of minor importance. We found that a burst and slowdown may indeed occur as a consequence of obstacles to processive movement, on the one hand, and the relative size of rate constants for adsorption, processive hydrolysis and desorption, on the other. This interpretation is analogous to that conventionally used for the description of burst phases in systems with soluble substrates and nonprocessive enzymes. The theory was tested against calorimetric measurements of the hydrolysis of amorphous cellulose by *T. reesei* Cel7A. No other enzymes or substrates were investigated, and the conclusions thus only pertain directly to this system. We note, however, that, if the origin of the slowdown is linked to low dissociation rates (low *k*_{3}), as suggested here, an analogous burst behavior should be expected on other substrates, and it appears relevant to conduct such measurements. We found that some experimental hallmarks were reproduced in a simple burst model, where the only cause of the slowdown was a protracted release of enzyme that had reached the obstacle on the cellulose chain. However, to account more precisely for the experimental data, it was necessary to consider enzyme inactivation as well as some heterogeneity in the obstacle-free path length. We implemented the former as an irreversible inactivation step that competed with the production of cellobiose in each hydrolytic cycle. The result was a more complex model which could explain the ‘double exponential decay’ in the rate of cellobiose production which has been reported in several earlier studies. 
Thus, in this interpretation, the fast component in the double exponential decay reflects the first sweep of each cellulase down a cellulose strand, whereas the slow component is ascribed to random inactivation which is unrelated to the stage of the process. It has recently been stated that ‘processivity is more about disassociation than about the rate of hydrolysis’ [44], and a pronounced improvement in activity has indeed been observed in an enzyme variant with diminished processivity [45]. We suggest that the models presented here may be useful in attempts to elucidate and rationalize such interrelationships of activity and processivity.