Estimating average annual per cent change in trend analysis
(www.interscience.wiley.com) DOI: 10.1002/sim.3733

Trends in incidence or mortality rates over a specified time interval are usually described by the conventional annual per cent change (cAPC), under the assumption of a constant rate of change. When this assumption does not hold over the entire time interval, the trend may be characterized using the annual per cent changes from segmented analysis (sAPCs). This approach assumes that the change in rates is constant over each time partition defined by the transition points, but varies among different time partitions. Different groups (e.g. racial subgroups), however, may have different transition points and thus different time partitions over which they have constant rates of change, making comparison of sAPCs problematic across groups over a common time interval of interest (e.g. the past 10 years). We propose a new measure, the average annual per cent change (AAPC), which uses sAPCs to summarize and compare trends for a specific time period. The advantage of the proposed AAPC is that it takes into account the trend transitions, whereas cAPC does not and can lead to erroneous conclusions. In addition, when the trend is constant over the entire time interval of interest, the AAPC has the advantage of reducing to both cAPC and sAPC. Moreover, because the estimated AAPC is based on the segmented analysis over the entire data series, any selected subinterval within a single time partition will yield the same AAPC estimate—that is it will be equal to the estimated sAPC for that time partition. The cAPC, however, is re-estimated using data only from that selected subinterval; thus, its estimate may be sensitive to the subinterval selected. The AAPC estimation has been incorporated into the segmented regression (free) software Joinpoint, which is used by many registries throughout the world for characterizing trends in cancer rates. Copyright © 2009 John Wiley & Sons, Ltd.


INTRODUCTION
Studies of disease incidence and mortality rates over time and across demographic subgroups play an important role in guiding national programs for disease prevention, control, and surveillance. For example, the Annual Report to the Nation on the Status of Cancer provides updated information on cancer incidence and mortality trends in the United States. It is published collaboratively by the American Cancer Society, the Centers for Disease Control and Prevention, the National Cancer Institute (NCI), and the North American Association of Central Cancer Registries. This annual report provides trends in age-adjusted incidence and mortality rates for the top 15 cancers, both long-term and short-term, by sex and race [1].
One popular method of trend analysis is to estimate the conventional annual per cent change (cAPC) for age-adjusted rates [2,3]. The cAPC is estimated by fitting a simple linear model: the logarithm of the yearly age-adjusted rates first is regressed on time, then a transformation of the slope is used to calculate the per cent change per year. The cAPC is easy to calculate and interpret. For long-term trend analysis, however, the linearity of rates on the logarithmic scale, implying a constant rate of change, may not apply over the entire time period of interest.
When the trend is not constant over the entire time period of interest, the nonlinearity of the trend may be characterized using the annual per cent change from segmented analysis (sAPC). This approach assumes that the change in age-adjusted rates is constant over each time partition defined by the transition points, but varies among different time partitions [1,2]. When comparing trends for different groups (such as racial subgroups), different groups may have different transition points and thus different time partitions over which they have constant rates of change; the comparison of group sAPCs is problematic over a common time interval of interest (e.g. the past 5 or 10 years).
For example, the segmented regression analysis for age-adjusted mortality rates for prostate cancer in the U.S. from 1975 to 2001 (Figure 1, based on data from the National Center for Health Statistics, NCHS 2004 [4]) consists of four line segments for whites (1975-1987, $1987^{+}$-1991, $1991^{+}$-1994, and $1994^{+}$-2001) and three line segments for blacks (1975-1988, $1988^{+}$-1993, and $1993^{+}$-2001). Note that we define $t^{+} = t + \Delta t$, with $\Delta t \to 0$ when $t$ is continuous and $\Delta t = 1$ when $t$ is discrete. Because of the difference in the last transition points for whites and blacks over the time period 1975-2001, the time period of the most current trend is from $1994^{+}$ to 2001 for whites, but from $1993^{+}$ to 2001 for blacks. Because the time periods of the most current trends are different, the estimated most current trends for whites (decreasing 4.2 per cent annually) and for blacks (decreasing 2.7 per cent annually) are not directly comparable. Moreover, the introduction of prostate-specific antigen (PSA) screening and new treatments (most notably the use of androgen-deprivation therapy in the adjuvant setting) over the last 10 years raises the question of possible racial disparity in the annual per cent decline over this period. The sAPCs from the segmented regression analysis, however, do not allow direct comparison between blacks and whites over the 10-year time period from 1992 to 2001 because the sAPC for whites was −0.8 per cent for $1991^{+}$-1994 and −4.2 per cent for $1994^{+}$-2001, while for blacks it was 3.3 per cent from $1988^{+}$ to 1993 and −2.7 per cent from $1993^{+}$ to 2001.
Hence, it is essential to develop a summary measure of trends that takes into account trend transitions over a common subtime period (for example, the 10 years from 1992 to 2001) for the trend comparison. In addition, a summary measure that applies over the entire time period also is needed so that the overall 1975-2001 trends for whites and blacks can be compared after accounting for different racial trend transitions.
[Figure 1. Axes: Year of Death (horizontal), Rate per 100,000 (vertical).]
Motivated by the need for a trend analysis summary measure to facilitate trend comparisons, we propose the average annual per cent change (AAPC) to summarize and compare rates of change that are not constant over a given time period. The AAPC reduces to both the cAPC and the sAPC if the rate of change is constant over the entire time period of interest. The remainder of this paper is organized as follows: Section 2 presents the proposed AAPC. In Section 3, we apply the AAPC estimator to data from the Surveillance, Epidemiology, and End Results (SEER) Program at the NCI (see [5] for more information on the SEER Program) and compare the proposed AAPC with the cAPC. The results and the proposed methodology are discussed further in Section 4.

Conventional annual per cent change (cAPC) and its estimator
Denote the observed rates at time $t_i$ as $r_i$ with the associated random variable $R_i$, and denote the corresponding expected rate as $\mu_i = E(R_i \mid t_i)$, where the $n$ observed ordered time points are $a = t_1 < \cdots < t_n = b$. Assume that $\log(\mu_i)$ is linear in time:
$$\log(\mu_i) = \alpha + \beta t_i, \quad i = 1, \ldots, n. \tag{1}$$
Then the annual rate of change is $\mu_{j+1}/\mu_j = \exp(\beta)$ and the cAPC is $\{\exp(\beta) - 1\}$. Denote $\hat{\beta}$ as the estimated slope of the regression line (1) based on the data $(t_i, r_i)$, $i = 1, \ldots, n$. Then, the estimated cAPC is $\{\exp(\hat{\beta}) - 1\}$. Note that $\hat{\beta}$ may be estimated by the approach of weighted least squares, with weights as the reciprocals of variances of age-adjusted rates, $R_i$, to take into account the variability in the rates $R_i$.
Note that the rates can take on only positive values, but their logarithms can have unrestricted range, and hence the normal approximation can be used to estimate the parameters. It is, therefore, appropriate to first calculate a range for log rates, and then to convert this back to a range for rates [6].
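As an illustration of the estimator described above, the following is a minimal sketch of a cAPC fit by weighted least squares on the log scale. The function name `estimate_capc`, its arguments, and the synthetic data are hypothetical and chosen only for this example; the weights follow the text (reciprocals of the rate variances), and time is centered purely for numerical stability, which leaves the slope unchanged.

```python
import numpy as np

def estimate_capc(t, rates, rate_variances=None):
    """Return exp(slope) - 1 from a (weighted) linear fit of log(rate) on time."""
    t = np.asarray(t, dtype=float)
    y = np.log(np.asarray(rates, dtype=float))
    if rate_variances is None:
        w = np.ones_like(t)                      # ordinary least squares
    else:
        w = 1.0 / np.asarray(rate_variances, dtype=float)
    tc = t - t.mean()                            # centering does not change the slope
    X = np.column_stack([np.ones_like(tc), tc])  # intercept and slope columns
    sw = np.sqrt(w)
    # Weighted least squares via least squares on sqrt(weight)-scaled rows.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return float(np.exp(coef[1]) - 1.0)

# Synthetic rates that grow exactly 2 per cent per year (illustrative data only).
years = np.arange(1975, 2003)
rates = 50.0 * 1.02 ** (years - 1975)
capc = estimate_capc(years, rates)
```

With these synthetic rates the fitted cAPC recovers the 2 per cent annual growth used to generate them.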

Segmented regression model
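Suppose that $\log(\mu_i)$ is nonlinear over the entire time interval $[a, b]$ and that it follows the segmented linear regression model, that is
$$\log(\mu_i) = \alpha_j + \beta_j t_i \quad \text{for } \tau_{j-1} < t_i \le \tau_j, \quad j = 1, \ldots, k+1. \tag{2}$$
In this model, the time interval $[a, b]$ is partitioned into $k+1$ segments by the $k$ transition points $\tau_j$, $j = 1, \ldots, k$, with
$$a = t_1 < \cdots < t_{n_1} \le \tau_1 < t_{N_1+1} < \cdots < t_{N_2} \le \tau_2 < \cdots \le \tau_k < t_{N_k+1} < \cdots < t_{N_{k+1}} = b,$$
where $N_j = \sum_{l=1}^{j} n_l$, and $n_j$ represents the number of observed data points between the transition points $\tau_{j-1}$ and $\tau_j$, that is in the time interval $(\tau_{j-1}, \tau_j]$, with $\sum_{j=1}^{k+1} n_j = N_{k+1} = n$. For the purpose of notational convenience, we define $\tau_0 = a^{-}$ and $\tau_{k+1} = t_{N_{k+1}} = t_n = b$, where $a^{-} = a - \Delta t$ with $\Delta t \to 0$ when $t_i$ is continuous and $\Delta t = 1$ when $t_i$ is discrete. The segmented regression model (2) indicates that the rate of change is constant within each of the $(k+1)$ partitions: $(a^{-}, \tau_1]$, $(\tau_1, \tau_2]$, $\ldots$, and $(\tau_k, t_n]$, with the constant rate of change corresponding to the partition $(\tau_{j-1}, \tau_j]$ being $\exp(\beta_j)$, and the annual per cent change from segmented analysis (sAPC) over $(\tau_{j-1}, \tau_j]$ being $\{\exp(\beta_j) - 1\}$, $j = 1, \ldots, k+1$. We can rewrite equation (2) as
$$\log(\mu_i) = \sum_{j=1}^{k+1} (\alpha_j + \beta_j t_i)\, I(\tau_{j-1} < t_i \le \tau_j)$$
with the constraints
$$\alpha_j + \beta_j \tau_j = \alpha_{j+1} + \beta_{j+1} \tau_j, \quad j = 1, \ldots, k.$$
These constraints guarantee the continuity of (2) at the transition points $\tau_j$.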
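To make the continuity requirement concrete, here is a small sketch that evaluates a continuous segmented log-linear trend. It assumes the model is parameterized by the first intercept, the segment slopes, and the transition points, with later intercepts derived from continuity at each transition point. The function name and the numerical values are hypothetical.

```python
import numpy as np

def segmented_log_rate(t, alpha1, betas, taus):
    """Evaluate a continuous piecewise-linear log rate at time(s) t.

    alpha1 : intercept of the first segment
    betas  : slopes of the k + 1 segments
    taus   : the k transition points, in increasing order
    """
    t = np.asarray(t, dtype=float)
    betas = np.asarray(betas, dtype=float)
    taus = np.asarray(taus, dtype=float)
    # Continuity at tau_j: alpha_j + beta_j * tau_j = alpha_{j+1} + beta_{j+1} * tau_j.
    alphas = [float(alpha1)]
    for j, tau in enumerate(taus):
        alphas.append(alphas[-1] + (betas[j] - betas[j + 1]) * tau)
    alphas = np.asarray(alphas)
    # side="left" places t in segment j when tau_{j-1} < t <= tau_j.
    seg = np.searchsorted(taus, t, side="left")
    return alphas[seg] + betas[seg] * t

# Illustrative values: slope 0.05 up to t = 10, slope -0.02 afterwards.
values = segmented_log_rate([8.0, 10.0, 12.0], alpha1=1.0, betas=[0.05, -0.02], taus=[10.0])
```

Deriving the later intercepts from the constraints, rather than passing them in, keeps the curve continuous by construction.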

Proposed summary measure AAPC
Assume that $\log(\mu_i)$ is nonlinear over the time interval $[a, b]$ and follows the segmented regression model (2). We propose to summarize the $(k+1)$ change rates in the entire time interval $[a, b]$ with the AAPC, conditional on the transition points $\tau_j$, $j = 1, \ldots, k$. The AAPC is defined as
$$\mathrm{AAPC} = \exp\Big(\sum_{j=1}^{k+1} w_j \beta_j\Big) - 1, \qquad w_j = \frac{\tau_j - \tau_{j-1}}{\tau_{k+1} - \tau_0}. \tag{3}$$
When the $t_i$ are equally spaced, the weights reduce to $w_j = n_j/n$, with $n_j$ representing the number of observed data points between the transition points $\tau_{j-1}$ and $\tau_j$, that is in $(\tau_{j-1}, \tau_j]$. Note that the weights $w_j$ are proportional to the corresponding lengths of the time partitions, and they are proportional to the numbers of data points within the partition $(\tau_{j-1}, \tau_j]$, $j = 1, \ldots, k+1$, when the $t_i$ are equally spaced. When the rate of change is constant over the entire time period $[a, b]$, it is clear that the AAPC in equation (3) reduces to both the cAPC that is commonly used and the sAPC from the segmented analysis; otherwise, the AAPC is the weighted geometric mean of the annual changes from all of the partitions. More specifically, from (3),
$$\mathrm{AAPC} + 1 = \prod_{j=1}^{k+1} \{\exp(\beta_j)\}^{w_j}.$$
When the $t_i$ are equally spaced, this reduces to
$$\mathrm{AAPC} + 1 = \Big[\prod_{j=1}^{k+1} \{\exp(\beta_j)\}^{n_j}\Big]^{1/n}.$$
The fact that $(\mathrm{AAPC} + 1)$ is the geometric mean of the annual changes $\exp(\beta_j)$ or, equivalently, $\log(\mathrm{AAPC} + 1)$ is the arithmetic mean of log(annual change), $\beta_j$, from all the partitions, motivates us to name the summary measure the AAPC. Thus, the AAPC is obtained easily if we know the sAPC in each of the $k+1$ partitions. For example, if we know that the three sAPC values in $[a, b]$ are 10 per cent, −3 per cent, and 2 per cent (i.e. $k = 2$) with equal weighting (say, each line segment has six data points, i.e. $n_j = 6$) and the $t_i$ are equally spaced, then the AAPC over the 18-year time period $[a, b]$ is given by $\sqrt[3]{(1.10)(0.97)(1.02)} - 1 = 0.029 = 2.9$ per cent. This means that the age-adjusted rates increased 2.9 per cent annually on average during the 18-year period of time.
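The worked example above (sAPCs of 10, −3, and 2 per cent with six data points each) can be checked with a short sketch. The helper name `aapc_from_sapcs` is hypothetical; it computes the weighted geometric mean of the annual changes by averaging on the log scale.

```python
import math

def aapc_from_sapcs(sapcs, weights):
    """Weighted geometric mean of (1 + sAPC), minus 1.

    sapcs   : segment annual changes as proportions (0.10 means 10 per cent)
    weights : non-negative weights, e.g. data points per segment; normalized here
    """
    total = float(sum(weights))
    # log(AAPC + 1) is the weighted arithmetic mean of log(1 + sAPC_j).
    log_mean = sum((w / total) * math.log1p(s) for s, w in zip(sapcs, weights))
    return math.expm1(log_mean)

# Three segments with six data points each, as in the text.
example = aapc_from_sapcs([0.10, -0.03, 0.02], [6, 6, 6])
print(f"AAPC = {100 * example:.1f} per cent")
```

Running the sketch prints an AAPC of 2.9 per cent, in agreement with the hand calculation.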
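Using the delta method, a general approach for variance estimation of functions of random variables [7,8], we estimate the standard error of the AAPC estimator
$$\widehat{\mathrm{AAPC}} = \exp\Big(\sum_{j=1}^{k+1} w_j \hat{\beta}_j\Big) - 1 \tag{4}$$
by
$$\widehat{\mathrm{se}}\big(\widehat{\mathrm{AAPC}}\big) = \exp\Big(\sum_{j=1}^{k+1} w_j \hat{\beta}_j\Big) \sqrt{\sum_{j=1}^{k+1} w_j^2 \hat{\sigma}_j^2}, \tag{5}$$
where $\hat{\sigma}_j^2$ is the variance estimator of $\hat{\beta}_j$. Note that, as mentioned earlier, the $\hat{\beta}_j$ may be taken to be the weighted least-squares estimators to take into account the variability in $R_i$.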
Statistical significance tests pertaining to the departure of the AAPC from zero or comparison of AAPCs between groups for a given time period can be performed easily on the log scale. To construct a confidence interval (CI) for the AAPC or to compare AAPCs from different groups, the distribution of $\widehat{\mathrm{AAPC}}$ is needed. Under the standard assumption that the counts for the age-adjusted rates are Poisson and the person-years in the denominator are fixed constants [6], for large counts the age-adjusted rates on the log scale, that is $\log(R_i)$, are asymptotically normally distributed. Consequently, $\hat{\beta}_j$ and $\log(\widehat{\mathrm{AAPC}} + 1)$, as well, are asymptotically normal. Hence, the CIs for the AAPC can be calculated first on the log scale, under the asymptotic normality of $\log(\widehat{\mathrm{AAPC}} + 1)$, and then transformed back through exponentiation. More specifically, the $100(1-\alpha)$ per cent lower ($\mathrm{AAPC}_L$) and upper ($\mathrm{AAPC}_U$) confidence limits for the AAPC are given by
$$\mathrm{AAPC}_L = \exp\Big\{\log(\widehat{\mathrm{AAPC}} + 1) - z_{1-\alpha/2} \sqrt{\sum_{j=1}^{k+1} w_j^2 \hat{\sigma}_j^2}\Big\} - 1, \qquad \mathrm{AAPC}_U = \exp\Big\{\log(\widehat{\mathrm{AAPC}} + 1) + z_{1-\alpha/2} \sqrt{\sum_{j=1}^{k+1} w_j^2 \hat{\sigma}_j^2}\Big\} - 1,$$
where $z_{1-\alpha/2}$ is the $(1-\alpha/2)$th quantile of the standard normal distribution, $\widehat{\mathrm{AAPC}}$ is defined as in equation (4), and $\hat{\sigma}_j^2$ is the variance estimator of $\hat{\beta}_j$ as in equation (5). If the CI contains zero, then there is no evidence to reject the null hypothesis $H_0: \mathrm{AAPC} = 0$ at the significance level of $\alpha$; otherwise, we reject $H_0$ in favor of $H_1: \mathrm{AAPC} \neq 0$ and conclude that the rate of change is increasing on average over the time interval $[a, b]$ if the lower confidence limit $\mathrm{AAPC}_L$ is positive, or that the rate of change is decreasing if the upper confidence limit $\mathrm{AAPC}_U$ is negative. If p-values are preferred, rather than the 95 per cent CIs, the computation of p-values is straightforward under the asymptotic normality.
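The log-scale interval described above might be computed as in the following sketch. It assumes that the weights sum to one and that the variance of the weighted sum of slope estimates is approximated by the sum of squared weights times the slope variances (that is, covariances between segment slopes are ignored). The function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist

def aapc_confidence_interval(betas, beta_ses, weights, level=0.95):
    """Return (estimate, lower, upper) for the AAPC as proportions.

    betas    : estimated segment slopes on the log scale
    beta_ses : standard errors of those slopes
    weights  : segment weights, assumed to sum to 1
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    log_est = sum(w * b for w, b in zip(weights, betas))
    # Assumption: slope estimators are treated as uncorrelated.
    log_se = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, beta_ses)))
    # Build the interval on the log scale, then back-transform.
    lower = math.expm1(log_est - z * log_se)
    upper = math.expm1(log_est + z * log_se)
    return math.expm1(log_est), lower, upper

# Illustrative (made-up) slopes and standard errors for three segments.
est, lo, hi = aapc_confidence_interval(
    betas=[0.02, -0.01, 0.005], beta_ses=[0.004, 0.006, 0.003], weights=[0.5, 0.3, 0.2]
)
significant = not (lo <= 0.0 <= hi)  # True if the interval excludes zero
```

Because the interval is built on the log scale and then exponentiated, it is generally not symmetric around the point estimate.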
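To compare the AAPCs of two independent groups (Group 1 and Group 2) over $[a, b]$, the $100(1-\alpha)$ per cent lower ($L$) and upper ($U$) confidence limits for the ratio $(\mathrm{AAPC}_1 + 1)/(\mathrm{AAPC}_2 + 1)$ are
$$\exp\Big\{\log\frac{\widehat{\mathrm{AAPC}}_1 + 1}{\widehat{\mathrm{AAPC}}_2 + 1} \mp z_{1-\alpha/2} \sqrt{\sum_{j} w_{1j}^2 \hat{\sigma}_{1j}^2 + \sum_{j} w_{2j}^2 \hat{\sigma}_{2j}^2}\Big\},$$
where $w_{gj}$ and $\hat{\sigma}_{gj}^2$ denote the weights and the slope variance estimators for Group $g$, and each sum runs over that group's segments. Note that, if the CI contains unity, there is no evidence to reject the null hypothesis $H_0: (\mathrm{AAPC}_1 + 1)/(\mathrm{AAPC}_2 + 1) = 1$ at the significance level of $\alpha$; otherwise, we reject $H_0$ in favor of $H_1: (\mathrm{AAPC}_1 + 1)/(\mathrm{AAPC}_2 + 1) \neq 1$ and conclude that, on average, the annual rate of change for Group 1 is more rapid than for Group 2 over the time interval $[a, b]$ if $L > 1$ or, alternatively, the rate of annual change for Group 1 is slower than for Group 2 if $U < 1$.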
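A two-group comparison on the log scale can be sketched as follows. The sketch assumes the two groups are independent, so that the variances of their log-scale estimates add. The function name, argument names, and input values are hypothetical.

```python
import math
from statistics import NormalDist

def aapc_ratio_ci(log_est_1, log_se_1, log_est_2, log_se_2, level=0.95):
    """CI for (AAPC_1 + 1) / (AAPC_2 + 1).

    log_est_g : log(AAPC_g + 1), i.e. the weighted sum of segment slopes for group g
    log_se_g  : standard error of log_est_g
    Assumption: the two groups are independent, so log-scale variances add.
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    log_ratio = log_est_1 - log_est_2
    se = math.hypot(log_se_1, log_se_2)  # sqrt(se_1^2 + se_2^2)
    return (
        math.exp(log_ratio),
        math.exp(log_ratio - z * se),
        math.exp(log_ratio + z * se),
    )

# Illustrative (made-up) inputs: group 1 declining faster than group 2.
ratio, lower, upper = aapc_ratio_ci(-0.032, 0.003, -0.015, 0.003)
groups_differ = upper < 1.0 or lower > 1.0  # interval excludes unity
```

An interval lying entirely below 1 corresponds to the case $U < 1$ in the text, and an interval entirely above 1 to the case $L > 1$.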

AAPC for subtime intervals of specified length
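In addition to summarizing and comparing the (long-term) trend for the entire time period of $[a, b]$ during which the data are observed, the AAPC also may be used to summarize any short-term trend for any subinterval $[s_1, s_2]$ within $[a, b]$. The AAPC over $[s_1, s_2]$ is
$$\mathrm{AAPC}(s_1, s_2) = \exp\Big\{\sum_{j=1}^{k+1} w_j(s_1, s_2)\, \beta_j\Big\} - 1,$$
and the $\mathrm{AAPC}(s_1, s_2)$ is estimated by
$$\widehat{\mathrm{AAPC}}(s_1, s_2) = \exp\Big\{\sum_{j=1}^{k+1} w_j(s_1, s_2)\, \hat{\beta}_j\Big\} - 1,$$
where $w_j(s_1, s_2)$ are the normalized weights over $[s_1, s_2]$, denoting the proportion of the length $(s_2 - s_1)$ that falls within the partitioning intervals $(\tau_{j-1}, \tau_j]$, $j = 1, \ldots, k+1$. Specifically,
$$w_j(s_1, s_2) = \frac{\max\{0,\ \min(s_2, \tau_j) - \max(s_1, \tau_{j-1})\}}{s_2 - s_1}, \quad j = 1, \ldots, k+1.$$
Note that we use the notation $\tau_0 = a^{-}$ and $\tau_{k+1} = b$ in $w_j(s_1, s_2)$.
When $t_i$ is discrete and equally spaced, the weights reduce to $w_j(s_1, s_2) = n_j(s_1, s_2)/n(s_1, s_2)$, where $n_j(s_1, s_2)$ is the number of observed time points in $[s_1, s_2]$ that fall within $(\tau_{j-1}, \tau_j]$ and $n(s_1, s_2) = \sum_{j=1}^{k+1} n_j(s_1, s_2)$.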
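The subinterval weights can be illustrated with a short sketch. It assumes that, in the discrete (yearly) case, the subinterval is treated as $(s_1 - 1, s_2]$ so that the weights count calendar years per partition, mirroring the convention $\tau_0 = a^{-} = a - 1$; in the continuous case the weights are proportions of the length $s_2 - s_1$. The function name and the `discrete` flag are hypothetical.

```python
def subinterval_weights(taus, a, b, s1, s2, discrete=False):
    """Share of [s1, s2] falling in each partition (tau_{j-1}, tau_j].

    taus     : the k transition points, in increasing order
    a, b     : first and last observed time points
    discrete : if True, count unit-spaced time points (assumed convention)
    """
    step = 1.0 if discrete else 0.0
    bounds = [a - step] + list(taus) + [b]  # tau_0 = a^-, ..., tau_{k+1} = b
    left = s1 - step                        # s1^- in the discrete case
    overlaps = [
        max(0.0, min(s2, hi) - max(left, lo))
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]
    total = s2 - left
    return [o / total for o in overlaps]

# Transition points reported for whites in the prostate cancer example
# (1987, 1991, 1994), with yearly data from 1975 to 2001.
w_whites = subinterval_weights([1987, 1991, 1994], 1975, 2001, 1992, 2001, discrete=True)
```

Under this convention the ten years 1992-2001 split into three years in the third partition and seven in the fourth, so the first two partitions receive zero weight.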

EXAMPLES
As an illustration, the sex-specific age-adjusted cancer incidence rates and trends, including the cAPC, sAPC, and AAPC, for the top 15 cancers, were calculated using 1975-2002 populationbased cancer registry data (SEER-9) collected by the SEER Program at the NCI. Since 1973, the SEER Program has collected data on all primary cancers occurring in residents of defined geographic regions. The Program currently collects and publishes cancer incidence and survival data covering approximately 26 per cent of the U.S. population (see www.seer.cancer.gov for detailed information).
These yearly age-adjusted cancer incidence rates also are adjusted for reporting delays. Table I presents sex-specific estimated sAPCs for each of the line segments from the segmented regression analyses, together with the estimated cAPCs and AAPCs for the entire time period (1975-2002) and for the last 10 years (1993-2002), and their 95 per cent CIs. The trends of these top 15 cancer sites have from 0 to 3 transition points.
When there are no transition points in the entire data series (e.g. leukemia for males and for females), the incidence trend is linear on the log scale. Thus, exactly the same estimates are obtained for the cAPC, sAPC, and AAPC from the entire data series. The estimated sAPCs and AAPCs for the last 10-year period are exactly the same as those for the entire 28-year period. The estimated cAPCs for the last 10-year period, however, generally differ from the estimated cAPCs for the entire 28-year period. Moreover, the cAPCs for the 10-year subtime period and for the entire time period may lead to different statistical conclusions at the significance level of 0.05, even though there are no transition points in the entire data series. For example, for the entire 28 years from 1975 to 2002, the estimated cAPC (also the sAPC and the AAPC) of male leukemia incidence rates is not statistically significantly different from 0; nevertheless, the estimated cAPC for the last 10 years (i.e. 1993-2002) indicates a statistically significant annual increase of 0.6 per cent. The estimated cAPC for 1993-2002 is not statistically significantly different from 0 for female stomach cancer incidence rates, in contrast to the statistically significant annual decrease of 1.7 per cent during the entire period of 1975-2002.
When there is at least one transition point between 1975 and 2002 (that is, the assumption of linearity of log incidence rates over the entire time period is not supported by the data), the cAPC tends to have narrower 95 per cent CIs than the AAPC, which indicates that the variance of the estimated cAPC tends to be underestimated when the linearity assumption does not hold. Furthermore, the estimated cAPCs are more likely to show a statistically significant difference from the null value than the AAPC estimates.
For example, the segmented regression analysis reveals three transition points over the entire time period of 1975-2002 for male urinary bladder cancer incidence rates, and both the estimated AAPC and the estimated cAPC indicate that the male urinary bladder cancer incidence rate increases at 0.3 per cent annually, but the cAPC estimate indicates that this upward trend is statistically significantly different from zero, while the AAPC estimate does not. The AAPC estimate also indicates that during the 1975-2002 time period, the overall incidence trend for male pancreas cancer is flat (estimated AAPC = −0.6 per cent with the 95 per cent CI (−1.3 per cent, 0.1 per cent)), but the cAPC estimate shows a statistically significant downward trend.
For the example of the U.S. age-adjusted mortality rates for prostate cancer cited in the Introduction, the 10-year estimated AAPC from 1992 to 2001 is −3.2 per cent for whites and −1.5 per cent for blacks, and both are statistically significantly different from zero. Thus, they suggest the possible benefits of PSA screening and treatment for both whites and blacks. The estimated ratio of annual mortality change for whites versus blacks, that is $(\widehat{\mathrm{AAPC}}_1 + 1)/(\widehat{\mathrm{AAPC}}_2 + 1)$, is 0.98, and we are 95 per cent confident that the true ratio, that is $(\mathrm{AAPC}_1 + 1)/(\mathrm{AAPC}_2 + 1)$, is somewhere between 0.975 and 0.990, where subscript 1 indicates whites and subscript 2 indicates blacks. Therefore, the downward mortality trend in the 10 years is faster for whites than for blacks on average, which suggests that whites have derived a larger benefit from the recent cancer control advances than blacks.
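The 1992-2001 point estimates can be reproduced, to the reported precision, from the segment sAPCs quoted in the Introduction. The sketch below assumes that each segment is weighted by the number of calendar years of 1992-2001 that fall within it, and it uses the most current trend for blacks stated in the Introduction (decreasing 2.7 per cent annually). The helper name is hypothetical.

```python
import math

def aapc_from_sapcs(sapcs, weights):
    """Weighted geometric mean of (1 + sAPC), minus 1; weights are normalized."""
    total = float(sum(weights))
    return math.expm1(sum((w / total) * math.log1p(s) for s, w in zip(sapcs, weights)))

# Whites: -0.8 per cent for 1992-1994 (3 years), -4.2 per cent for 1995-2001 (7 years).
whites = aapc_from_sapcs([-0.008, -0.042], [3, 7])
# Blacks: 3.3 per cent for 1992-1993 (2 years), -2.7 per cent for 1994-2001 (8 years).
blacks = aapc_from_sapcs([0.033, -0.027], [2, 8])
# Ratio of average annual change, whites versus blacks.
ratio = (1.0 + whites) / (1.0 + blacks)

print(f"whites: {100 * whites:.1f}%, blacks: {100 * blacks:.1f}%, ratio: {ratio:.2f}")
```

Under these assumptions the sketch prints −3.2 per cent for whites, −1.5 per cent for blacks, and a ratio of 0.98. The confidence limits are not reproduced here because they require the standard errors of the segment slopes.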
Quite often in the literature, the trends were presented, rather than the data points (i.e. rates) themselves. This is a situation where the AAPC can still be applied. For instance, the trend of high school students who were current users of cigarettes from 1991 to 2005 was presented as two segmented annual per cent changes (sAPC): increasing 5.29 per cent from 1991 to 1997, and then decreasing

DISCUSSION
We propose the AAPC for use in summarizing and comparing trends that may not be constant over a given time period. The proposed AAPC takes into account trend transitions, whereas the cAPC does not and thus one can be led to erroneous conclusions on statistical significance. In addition, when there are no changes in trends, (a) the AAPC reduces to the cAPC and sAPC and (b) the AAPC for any subtime intervals of specified length is exactly the same as the AAPC (and the sAPC) over the entire time interval, whereas the estimated cAPC may vary as the chosen subtime interval changes. More generally, because the estimated AAPC is based on the segmented analysis over the entire data series, any selected subinterval within a single time partition will yield the same AAPC estimate; that is it will be equal to the estimated sAPC for that time partition. There are several specific reasons for using the AAPC instead of the cAPC and sAPC, especially in important publications that summarize national trends in cancer incidence and mortality (e.g. Howe et al. [14]). First, a cAPC is re-estimated over the selected subinterval, and may be sensitive to the selected subinterval; it also assumes the linearity of the trend over the subinterval. For example, in Tables 4 and 5 of Howe et al. [14], cAPCs are compared across cancer sites, gender, and racial/ethnic groups for the period 1995-2003. A more robust analysis would be to substitute AAPCs for the cAPCs.
Second, in Tables 2 and 3 in Howe et al. [14], sAPCs are compared across cancer sites and gender. Interest often focuses on the statistical significance of the sAPC for the final segment, with the segment characterized as 'rising' or 'falling' if the final sAPC is statistically different from zero and as 'stable' if it is not. Comparison of these characterizations across groups may not be appropriate, however, because the standard error of an sAPC is related to the length of the segment, and final segment lengths can vary widely for different groups. For example, when 2002 was the most recent data point [15], the delay-adjusted incidence for thyroid cancer in males had an sAPC from 1980 to 2000 of 2.2 per cent, which was statistically significant, and an sAPC from 2000 to 2002 of 11.6 per cent, which was not statistically significant; the delay-adjusted incidence for thyroid cancer in females had an sAPC from 1993 to 2002 of 5.3 per cent, which was statistically significant. Thus, a characterization of the most recent segment trend for thyroid cancer in males would be 'stable', while the characterization of the most recent segment trend for thyroid cancer in females would be 'rising'. This comparison is not appropriate because of the radically different lengths of the final segment. To make more comparable assessments for males and females, it is useful to compute the AAPC over the same fixed interval for both series. For example, the AAPC for 1993-2002 is 4.2 per cent for males and 5.3 per cent for females (each characterized as rising because both are statistically significant). The AAPC for 1998-2002 is 6.8 per cent for males and 5.3 per cent for females (again each characterized as rising because both are statistically significant).
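The thyroid cancer figures quoted above can be checked from the reported sAPCs. This sketch assumes weights proportional to the length of calendar time that each segment contributes to the chosen interval (for example, 1993-2000 contributes 7 years and 2000-2002 contributes 2 years to 1993-2002); that weighting is an assumption about how the quoted values were obtained, and the helper name is hypothetical.

```python
import math

def aapc_from_sapcs(sapcs, weights):
    """Weighted geometric mean of (1 + sAPC), minus 1; weights are normalized."""
    total = float(sum(weights))
    return math.expm1(sum((w / total) * math.log1p(s) for s, w in zip(sapcs, weights)))

# Males: sAPC 2.2 per cent for 1980-2000 and 11.6 per cent for 2000-2002.
males_1993_2002 = aapc_from_sapcs([0.022, 0.116], [7, 2])  # 1993-2000, 2000-2002
males_1998_2002 = aapc_from_sapcs([0.022, 0.116], [2, 2])  # 1998-2000, 2000-2002
# Females: a single segment (5.3 per cent) covers 1993-2002, so any
# subinterval inside it returns the same value.
females_1993_2002 = aapc_from_sapcs([0.053], [9])

print(f"males 1993-2002: {100 * males_1993_2002:.1f}%")
print(f"males 1998-2002: {100 * males_1998_2002:.1f}%")
```

With length-based weights the sketch gives 4.2 and 6.8 per cent for males, and the female value stays at 5.3 per cent for either interval.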
As shown in the cigarette smoking example, one additional advantage of the AAPC method is that it can be applied when only the trends (i.e. the sAPCs), rather than the underlying data points, have been published, which is often the case in the literature.
For these reasons, we recommend using the AAPC instead of the cAPC for summarizing and comparing trends over a specified time interval. Use of the AAPC is not meant to replace sAPCs from a segmented regression analysis, because sAPCs provide a detailed picture of trends over time. When a summary trend is needed over a specified time interval, however, the AAPC provides an essential complement to the more detailed results.
We have incorporated the AAPC estimation into Joinpoint, the segmented regression analysis software program (available at http://srab.cancer.gov/joinpoint) that is used by many cancer registries throughout the world for characterizing trends in cancer rates. The software reports and compares AAPCs directly as an integral part of results from segmented regression analyses. The Annual Report to the Nation on the Status of Cancer and the NCI's annual publication Cancer Statistics Review started using the AAPC in 2008.
It is a common assumption that counts follow the Poisson distribution and that the person-years denominators are fixed constants [6,16]. It is important to investigate the performance of the AAPC, cAPC, and sAPC when these assumptions do not hold. This is an issue for further research.