Detecting Earnings Management: A New Approach

Authors

  • PATRICIA M. DECHOW,

    1. The Haas School of Business, University of California, Berkeley
  • AMY P. HUTTON,

    1. Carroll School of Management, Boston College
  • JUNG HOON KIM,

    1. School of Accounting, Florida International University. We are grateful for the comments of the referee, Phil Berger (the Editor), Frank Ecker, Jennifer Francis, Joseph Gerakos (the discussant), Maureen McNichols, Per Olsson, Katherine Schipper and workshop participants at the 2011 Journal of Accounting Research Conference, the 2011 American Accounting Association Annual Meetings, the University of Arizona, the University of California Los Angeles, Brigham Young University, the University of Houston, the University of Texas at Austin, and the University of Washington.
  • RICHARD G. SLOAN

    1. The Haas School of Business, University of California, Berkeley

ABSTRACT 

This paper provides a new approach to test for accrual-based earnings management. Our approach exploits the inherent property of accrual accounting that any accrual-based earnings management in one period must reverse in another period. If the researcher has priors concerning the timing of the reversal, incorporating these priors can significantly improve the power and specification of tests for earnings management. Our results indicate that tests incorporating reversals increase test power by around 40% and provide a robust solution for mitigating model misspecification arising from correlated omitted variables.

1. Introduction

Earnings management is an important accounting issue for academics and practitioners alike.1 A large body of academic research examines the causes and consequences of earnings management.2 A major limitation of this research is that existing techniques for measuring earnings management lack power and are often misspecified. The most common techniques for measuring earnings management attempt to isolate the “discretionary” portion of the accrual component of earnings. The limitations of such techniques are enumerated in Dechow, Sloan, and Sweeney [1995]. The techniques lack power for earnings management of plausible magnitudes because of the poor ability of the models to isolate discretionary accruals. Moreover, tests using these techniques are misspecified due to correlated omitted variables in samples with extreme financial performance, a situation that is not uncommon in tests for earnings management. Few improvements have been made since Dechow, Sloan, and Sweeney [1995] (DSS hereafter). Alternative techniques have been proposed for modeling accruals (e.g., Dechow and Dichev [2002]), but whether they provide improvements for detecting earnings management is questionable (e.g., Wysocki [2009]). Performance matching procedures have been adopted to mitigate misspecification (e.g., Kothari, Leone, and Wasley [2005]), but cause substantial reductions in test power and are only effective when the matching procedure employs the relevant omitted variable.

In this paper, we propose a new approach for the detection of earnings management that simultaneously improves test power and specification. Our approach exploits an inherent characteristic of accrual-based earnings management that has gone largely ignored in previous research. Specifically, we recognize that any accrual-based earnings management in one period must reverse in another period.3 If the researcher has reasonable priors concerning the period(s) in which the hypothesized earnings management is expected to reverse, the power and specification of tests for earnings management can be significantly improved by incorporating these reversals. For example, if the researcher correctly identifies the periods in which earnings management originates and reverses, incorporating reversals essentially doubles the amount of variation in discretionary accruals that is attributable to the hypothesized earnings management. Our calibrations suggest that doing so would increase test power by around 40% in typical earnings management studies.4

In addition to improving test power, incorporating reversals in tests of earnings management also mitigates misspecification arising from correlated omitted variables. For example, firm size has been identified as a potentially important correlated omitted variable in tests for earnings management (e.g., Ecker et al. [2011]). In particular, large firms tend to have lower nondiscretionary accruals because they have lower growth prospects. Thus, a researcher testing for evidence of downward earnings management in a sample of firms that are large relative to the control firms may incorrectly conclude that earnings are being managed downward. This problem is mitigated by incorporating reversals, since the researcher would also test for evidence of higher accruals arising from the reversal of the earnings management in an adjacent period. But, because firm size is a persistent economic characteristic, the related accruals would also be lower in adjacent periods. This would inform the researcher that the lower accruals are not attributable to earnings management. Note that the researcher does not have to identify the relevant omitted variables for tests incorporating reversals to mitigate the associated misspecification. As long as the omitted variables do not completely reverse in the same period as the earnings management reversal, test specification is improved.5 We show that incorporating accrual reversals provides a robust solution to mitigating misspecification across a variety of economic characteristics.

Our paper proceeds as follows. First, we develop an econometric framework to summarize common tests for earnings management and highlight their associated problems. Next, we introduce a flexible procedure for addressing these problems by incorporating researchers’ priors concerning the reversal of discretionary accruals in tests for earnings management. This procedure requires the researcher to identify the period(s) in which accruals are predicted to be managed and the period(s) in which this accrual management is predicted to reverse. A standard test for joint significance is then used to test for earnings management. Our procedure is readily adapted to all common models of nondiscretionary accruals.

We next evaluate the power and specification of tests incorporating accrual reversals relative to the traditional t-test on discretionary accruals and to the Kothari, Leone, and Wasley [2005] t-test on performance-matched discretionary accruals. Following DSS, we use four sets of analyses to evaluate the competing tests for earnings management. First, we evaluate the specification of the tests in samples of historical data using randomly assigned earnings management years. We show that all tests, including our new tests, are reasonably well specified in random samples. Second, we conduct simulations using archival data and seeded earnings management to examine how incorporating discretionary accrual reversals enhances the power of tests for earnings management. The simulations indicate that, if the researcher's priors about the reversal year are as accurate as the priors about the earnings management year, then incorporating reversals can increase test power by around 40%. These simulations also show that test power increases even in cases where the researcher is less than half as accurate in identifying the reversal year relative to the earnings management year. Third, we evaluate the power of the tests in a sample where the SEC alleges firms have overstated earnings. We find that incorporating accrual reversals in the period(s) following the alleged overstatements substantially increases the power of tests for earnings management in this sample. The gains in power dwarf any gains arising from choosing between different models of nondiscretionary accruals. Fourth, we evaluate the specification of tests in samples of historical data with extreme economic characteristics. We show that standard t-tests are highly misspecified and that performance matching only mitigates misspecification when the matching procedure employs the relevant omitted variable. For example, the commonly followed Kothari, Leone, and Wasley [2005] procedure of matching on return on assets (ROA hereafter) mitigates misspecification in samples with extreme ROA but exaggerates misspecification in samples with extreme firm size. In contrast, our new tests, which incorporate accrual reversals, are robust in mitigating misspecification across a broad set of economic characteristics.

Overall, our results suggest that our new approach for detecting earnings management leads to substantial improvements in both test power and test specification. We therefore encourage subsequent earnings management research to consider this approach. Our reversal framework should also be useful for practitioners interested in establishing the existence of earnings management in historical data. Examples include regulators tasked with enforcing accounting principles, investors evaluating the quality of managements’ past financial reports, and class action lawyers establishing cases of fraud on the market.

The remainder of our paper is organized as follows. Section 2 reviews and evaluates existing techniques for detecting earnings management and motivates our extension to incorporate accrual reversals. Section 3 describes the earnings management models and associated tests that we consider in this paper. Section 4 describes our research design, section 5 presents our results and section 6 concludes.

2. Review and Motivation

2.1 statistical framework

This section develops a statistical framework for summarizing common tests of earnings management and identifying potential misspecifications associated with these tests. It builds on the framework introduced by McNichols and Wilson [1988] and extended by DSS. Accrual-based tests of earnings management are based on the following linear model:

$$DA_{i,t} = a + b\,PART_{i,t} + \varepsilon_{i,t} \tag{1}$$

where:

  • DA = discretionary accruals;
  • PART = a dummy variable that is set to 1 in periods during which a hypothesized determinant of earnings management is present and 0 otherwise;
  • b = the magnitude of the hypothesized earnings management; and
  • a,ɛ = the impact of other determinants of discretionary accruals (i.e., other sources of earnings management).

Invoking the standard OLS assumptions, the OLS estimator of b, denoted b̂, is the best linear unbiased estimator of b, with a standard error of

$$SE(\hat{b}) = \frac{s_{\varepsilon}}{\sqrt{n-1}\; s_{PART}}$$

where:

  • n = total number of observations (including those where PART= 1 and PART= 0);
  • sɛ = standard error of the regression (the square root of the residual sum of squares divided by n − 2); and
  • sPART = sample standard deviation of PART.

The ratio of b̂ to SE(b̂) has a t-distribution with n− 2 degrees of freedom. The null hypothesis of no earnings management is rejected if b̂ has the hypothesized sign and the associated t-statistic is statistically significant at conventional levels. Consequently, the power of the t-test for earnings management is increasing in

  • b, the magnitude of the hypothesized earnings management;
  • n, the total number of observations; and
  • sPART, the standard deviation of PART. Note that if we let ρ denote the proportion of the total observations for which PART = 1, then sPART = √(ρ − ρ²), so sPART is greatest when ρ = 0.5 and gradually declines to 0 as ρ approaches either 0 or 1.

Conversely, test power is decreasing in sɛ, the standard error of the regression (the square root of the residual sum of squares divided by n − 2), which reflects the standard deviation of the combined impact of other determinants of earnings management.
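
The dependence of the standard error on ρ can be illustrated numerically. The following is a minimal sketch, assuming the large-sample expression sPART = √(ρ − ρ²) and hypothetical values for sɛ and n (neither is taken from the paper's data); the function names are ours.

```python
import math

def s_part(rho: float) -> float:
    """Standard deviation of a 0/1 dummy equal to 1 in a proportion rho of observations."""
    return math.sqrt(rho - rho ** 2)

def se_b_hat(s_eps: float, n: int, rho: float) -> float:
    """SE(b_hat) = s_eps / (sqrt(n - 1) * s_PART)."""
    return s_eps / (math.sqrt(n - 1) * s_part(rho))

# Hypothetical inputs: residual standard error of 0.10 and 10,001 observations.
for rho in (0.003, 0.05, 0.5):
    print(f"rho={rho}: s_PART={s_part(rho):.4f}, SE(b_hat)={se_b_hat(0.10, 10_001, rho):.5f}")
```

Holding sɛ and n fixed, the standard error shrinks as ρ moves toward 0.5, which is the sense in which test power is increasing in sPART.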

Unfortunately, the researcher does not directly observe discretionary accruals, and so must use a discretionary accrual proxy that measures discretionary accruals (DA) with error

$$DAP_{i,t} = (DA_{i,t} - \mu_{i,t}) + \eta_{i,t} \tag{2}$$

where:

  • μ = discretionary accruals that are unintentionally removed from DAP;
  • η = nondiscretionary accruals that are unintentionally left in DAP.

To understand the resulting misspecification, we first substitute DAP for DA in equation (1):

$$DAP_{i,t} = a + b\,PART_{i,t} + (-\mu_{i,t} + \eta_{i,t}) + \varepsilon_{i,t} \tag{1$'$}$$

The OLS estimator of b obtained from regressing DAP on PART, denoted b̃, is misspecified by the omission from the regression of (−μ+η). In particular, b̃ is a biased estimator of b, with bias given by6:

$$E(\tilde{b} - b) = \beta_{(-\mu+\eta)(PART)}$$

where:

  • β(-μ+η)(PART) = regression coefficient in a regression of (–μ+η) on PART, and
  • E(.) = the expectations operator.

Also, the OLS standard error for b̃, is given by

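$$SE(\tilde{b}) = SE(\hat{b})\,\frac{1 - r^{2}_{(-\mu+\eta)(PART)}}{1 - r^{2}_{(DAP)(-\mu+\eta)\cdot(PART)}}$$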

where:

  • r²(−μ+η)(PART) = the R² from a regression of (−μ+η) on PART; and
  • r²(DAP)(−μ+η)·(PART) = the R² from a regression of DAP on the component of (−μ+η) that is orthogonal to PART.

The above expressions highlight three distinct types of misspecification that can arise in the estimation of (1)′:

  • Problem 1. Bias and loss of power caused by the omission of μ from DAP. Recall that μ represents discretionary accruals that have been unintentionally removed from DAP. The presence of μ causes bias in b̃ that is bounded by −b ≤ β(−μ)(PART) ≤ 0 (see footnote 7). This means that b̃ is biased toward 0, with the limiting case where we unintentionally remove all discretionary accruals resulting in b̃ = 0. This bias reduces the likelihood of rejecting the null hypothesis of no earnings management when it is false (i.e., increased Type II error rate). Intuitively, removing some of the discretionary accruals results in a less powerful test (we have “thrown the baby out with the bathwater”).

  • Problem 2. Bias and misspecification caused by the inclusion of correlated η in DAP. Recall that η represents nondiscretionary accruals that have been unintentionally left in DAP. The presence of η biases b̃ so long as η is correlated with PART. In particular, b̃ will not equal 0 even when b = 0. This increases the likelihood of rejecting the null hypothesis of no earnings management even when it is true (i.e., excessive Type I error rate). Intuitively, we mistakenly infer the presence of earnings management because of nondiscretionary accruals that happen to be correlated with PART.

  • Problem 3. Inefficiency caused by the inclusion of uncorrelated η in DAP. If nondiscretionary accruals are left in DAP, but they are uncorrelated with PART, b̃ is unbiased. However, SE(b̃) = SE(b̂)/(1 − r²(DAP)(η)). So the standard error of the estimated coefficient is increasing in the proportion of the variation in DAP that is attributable to η. This reduces the likelihood of rejecting the null hypothesis of no earnings management when it is false (i.e., increased Type II error rate). Intuitively, failing to extract nondiscretionary accruals from DAP results in a less powerful test even when these nondiscretionary accruals are uncorrelated with PART.

Balancing these competing sources of misspecification presents a trade-off. Incorporating every conceivable determinant of nondiscretionary accruals is likely to exacerbate Problem 1. But incorporating too few determinants exacerbates Problems 2 and 3.
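
The direction of the bias in Problems 1 and 2 can be seen in a small noise-free example. This is a sketch under stated assumptions: the magnitudes (b = 0.02, half of the managed accruals removed, η = 0.01 when PART = 1) and the helper `dummy_slope` are hypothetical and chosen only for illustration.

```python
def dummy_slope(y, part):
    """OLS slope of y on a 0/1 dummy: mean(y | PART=1) minus mean(y | PART=0)."""
    ones = [v for v, p in zip(y, part) if p == 1]
    zeros = [v for v, p in zip(y, part) if p == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)

part = [1] * 10 + [0] * 90
b = 0.02                                  # hypothetical magnitude of earnings management
da = [b * p for p in part]                # true discretionary accruals (no noise)

# Problem 1: the accrual model mistakenly removes half of the managed accruals.
mu = [0.5 * v for v in da]
dap_p1 = [d - m for d, m in zip(da, mu)]

# Problem 2: nondiscretionary accruals of 0.01 remain in DAP only when PART = 1.
eta = [0.01 * p for p in part]
dap_p2 = [d + e for d, e in zip(da, eta)]

b_p1 = dummy_slope(dap_p1, part)          # biased toward zero
b_p2 = dummy_slope(dap_p2, part)          # biased away from b by the slope of eta on PART
print(b, b_p1, b_p2)
```

In both cases the estimate differs from b by exactly the slope from regressing the omitted term on PART, which is the bias expression given above.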

2.2 overview of discretionary accrual models

Following Healy [1985], most discretionary accrual models start with working capital accruals as their base measure of accruals. Early research then simply employs the levels (e.g., Healy [1985]) or changes (e.g., DeAngelo [1986]) in working capital accruals as discretionary accrual proxies, implicitly assuming that nondiscretionary accruals are constant. This assumption is unlikely to be empirically descriptive, because nondiscretionary accruals are expected to change with firms’ underlying business activities (e.g., Kaplan [1985], McNichols [2000]).

Subsequently, more sophisticated models that attempt to explicitly model nondiscretionary accruals have been developed, enabling total accruals to be decomposed into discretionary and nondiscretionary components. The most popular models are attributable to Jones [1991], DSS, Dechow and Dichev [2002], and McNichols [2002]. We will describe these models in more detail in the next section. Such models typically require at least one parameter to be estimated, and were originally implemented through the use of a firm-specific “estimation period,” during which no systematic earnings management was hypothesized. Starting with Defond and Jiambalvo [1994], researchers have generally employed cross-sectional and panel estimation of these models.

Concerns that these models fail to capture all nondiscretionary accruals have also led researchers to supplement the models with performance-matching procedures. Kothari, Leone, and Wasley [2005] propose a popular matching procedure that entails subtracting estimates of discretionary accruals from Jones-type models using control firms matched by industry and ROA in either the current or the previous period.

2.3 limitations of existing models

While various models described above have been used extensively in the literature to test for earnings management, their effectiveness is known to be limited. DSS provide the first comprehensive assessment of the specification and power of commonly used test statistics across the measures of discretionary accruals generated by several of these models. They conclude that: (1) all of the models generate well-specified test statistics when applied to random samples, (2) all models generate tests of low power for earnings management of economically plausible magnitudes (e.g., 1% to 5% of total assets), and (3) all models are misspecified when applied to samples of firms with extreme financial performance. McNichols [2000] reiterates point (3) and shows that all models are particularly misspecified for samples with extreme forecasts of long-term earnings growth.

Kothari, Leone, and Wasley [2005] propose the performance-matching procedure mentioned earlier to mitigate performance-related misspecification. Their results indicate that performance matching is no panacea. First, their performance-matching procedure rarely eliminates misspecification and sometimes exaggerates misspecification. For example, their results indicate that performance matching on ROA mitigates misspecification for samples with extreme earnings-to-price and book-to-market, but can exaggerate misspecification in samples with extreme size and operating cash flows. Second, their results highlight the low power of existing tests for earnings management and show that performance matching exacerbates this problem. For example, their simulations show that using random samples of 100 firm-years, seeded earnings management equal to 1% of total assets, and a 5% test level results in rejection rates of only 20% with no performance matching and a paltry 14% with performance matching.

These results highlight two key limitations of performance-matching procedures. First, performance matching is only effective in mitigating misspecification when the researcher matches on the relevant correlated omitted variable. Second, performance matching reduces test power by increasing the standard error of the test statistic. We can use the framework developed in section 2.1 to formalize these problems. Using the subscript j for the matched control firm, the resulting performance-matched discretionary accrual proxy is:

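$$DAP_{i,t} - DAP_{j,t} = (DA_{i,t} - \mu_{i,t} + \eta_{i,t}) - (DA_{j,t} - \mu_{j,t} + \eta_{j,t}) = (DA_{i,t} - DA_{j,t}) - (\mu_{i,t} - \mu_{j,t}) + (\eta_{i,t} - \eta_{j,t})$$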

If we have chosen a perfect match, then (ηi,t − ηj,t) = 0. But even in this case, we have introduced two new problems. First, it is possible that DAi,t and DAj,t will be positively correlated. This seems particularly likely when matching on ROA, because DA is a component of ROA. This will generate a special case of Problem 1, removing discretionary accruals and reducing the power of the test in the presence of earnings management. Second, assuming ɛ is independently and identically distributed, the new standard error of the regression will be √2·sɛ, causing the t-statistic to be reduced accordingly. This problem is similar to Problem 3, in that it introduces additional uncorrelated noise into the residual leading to a less powerful test.
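
The √2 inflation of the residual standard deviation under matching can be checked with a short simulation. This is only a sketch: the residuals are drawn as independent normal variates with a hypothetical standard deviation of 0.10, which is an assumption made for illustration rather than a property of actual accrual data.

```python
import math
import random
import statistics

rng = random.Random(0)
sigma = 0.10                                       # hypothetical residual standard deviation
eps_i = [rng.gauss(0.0, sigma) for _ in range(20_000)]   # sample firm residuals
eps_j = [rng.gauss(0.0, sigma) for _ in range(20_000)]   # matched control firm residuals

# Differencing against an independent control firm adds the control firm's noise.
diff = [a - b for a, b in zip(eps_i, eps_j)]
ratio = statistics.stdev(diff) / statistics.stdev(eps_i)
print(f"ratio = {ratio:.3f}, sqrt(2) = {math.sqrt(2):.3f}")
```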

2.4 incorporating discretionary accrual reversals

We introduce a new approach for detecting earnings management that has the potential to simultaneously improve test power and mitigate misspecification. Our approach exploits an inherent property of discretionary accruals. Discretionary accruals are made with the purpose of shifting earnings between reporting periods. The accrual accounting process requires misstatements in one period to reverse in another period. For example, if a firm overstates its receivables in one period, the overstatement must be reversed in the subsequent period during which it becomes clear to the firm's auditors that the associated cash flows will not be received.

Nondiscretionary accruals, in contrast, are tied to the operations of the underlying business (e.g., McNichols [2000]). At an aggregate level, they will tend to originate during periods when the business is either growing (e.g., purchasing inventory in anticipation of future sales growth) or making strategic changes to its operating and investing decisions (e.g., granting more generous credit terms). Since businesses operate as going concerns, their operating characteristics tend to persist. As such, the associated nondiscretionary accruals should also tend to persist. In other words, old reversing nondiscretionary accruals will tend to be offset by new originating nondiscretionary accruals (e.g., the replacement of inventory as it is sold) such that nondiscretionary accruals will tend to persist in the aggregate.

Because discretionary accruals should reverse while nondiscretionary accruals should persist, we can test for earnings management not only by testing for the presence of discretionary accruals in the earnings management period, but also by testing for the reversal of those accruals in an adjacent period. Incorporating reversals should both: (1) increase test power and (2) mitigate misspecification caused by the inclusion of correlated nondiscretionary accruals. We can use the framework developed in section 2.1 to understand the intuition behind these two improvements. To do so, we introduce two new earnings management partitioning variables:

  • PARTR = a dummy variable that equals 1 in periods during which the hypothesized earnings management reverses and 0 otherwise; and
  • PART′ = PART − PARTR (i.e., PART′ = 1 for earnings management years, −1 for reversal years, and 0 otherwise).

To see how incorporating accrual reversals increases test power, we assume that: (1) we measure DA without error (i.e., μi,t = ηi,t = 0 for all i and t), (2) we correctly identify the earnings management and accrual reversal periods, and (3) the hypothesized earnings management and reversal periods are mutually exclusive.8 Consider estimating equation (1) after replacing PART with PART′, denoting the corresponding regression estimates by b′ and SE(b′), respectively. Recall that sPART = √(ρ − ρ²), and we can readily determine that sPART′ = √(2ρ). Thus, as long as ρ is “small,” we obtain

$$E(b') = E(\hat{b}) = b, \qquad s_{PART'} = \sqrt{2\rho} \approx \sqrt{2}\, s_{PART}, \qquad SE(b') = \frac{s_{\varepsilon}}{\sqrt{n-1}\; s_{PART'}} \approx \frac{SE(\hat{b})}{\sqrt{2}}$$

The expected coefficient on PART′, E(b′), is unchanged, but the standard deviation of PART′, sPART′, exceeds sPART by a factor of √2. A higher sPART′ results in a correspondingly lower SE(b′), providing increased test power. In other words, PART′ identifies twice as much variation in DA that is attributable to the hypothesized source of earnings management, thus increasing the power of the associated t-test accordingly.
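
The √2 factor can be verified directly on a constructed partitioning variable. The sketch below assumes a hypothetical sample of 10,000 firm-years with 30 earnings management years and 30 non-overlapping reversal years (so ρ = 0.003), and uses population standard deviations to match the expressions above.

```python
import math
import statistics

n, k = 10_000, 30                       # hypothetical sample size and managed firm-years
part = [1] * k + [0] * (n - k)
partr = [0] * k + [1] * k + [0] * (n - 2 * k)
part_prime = [p - r for p, r in zip(part, partr)]   # 1, -1, or 0

s_part = statistics.pstdev(part)              # sqrt(rho - rho^2)
s_part_prime = statistics.pstdev(part_prime)  # sqrt(2 * rho)

# With s_eps and n held fixed, SE(b_hat) / SE(b') equals this ratio.
ratio = s_part_prime / s_part
print(f"s_PART = {s_part:.5f}, s_PART' = {s_part_prime:.5f}, ratio = {ratio:.4f}")
```

In this stylized setting the ratio is slightly above √2 ≈ 1.414, so the expected t-statistic rises by roughly 41% when b is unchanged.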

One can think of this increase in test power described above as being analogous to the case of doubling the number of observations where PART= 1, while simultaneously reducing the number of observations where PART= 0 (such that the overall sample size remains constant). This brings us to the issue of why we need to assume that ρ is “small.” Recall that ρ represents the proportion of the observations where earnings are managed, and since we assume that earnings management and reversal periods are mutually exclusive, ρ has a maximum possible value of 0.5. In the case where ρ= 0.5, every observation in the sample is either an earnings management period or a reversal period. Thus, modeling reversals does not improve test power, because we have already implicitly identified the reversal periods by identifying the earnings management periods. Formally, in the case where ρ= 0.5, we obtain:

$$E(b') = \tfrac{1}{2}\,E(\hat{b}), \qquad SE(b') = \tfrac{1}{2}\,SE(\hat{b}), \qquad \text{so that} \quad \frac{b'}{SE(b')} = \frac{\hat{b}}{SE(\hat{b})}$$

So in the case of ρ = 0.5, modeling reversals reduces the estimated coefficient on PART by a factor of 2, but also reduces the standard error of the coefficient estimate by a factor of 2, thus having no net impact on test power. In fact, because PART′ = −1 in every observation where PART = 0, PART′ will be a simple linear transformation of PART, with PART′ = 2(PART − 1/2). For example, if earnings are managed up by 5 when PART = 1 and therefore managed down by 5 for the remaining half of the observations, we have E[b′] = 5 and sPART′ = 1, while E[b̂] = 10 and sPART = 1/2. So in the case of ρ = 0.5, what reversals give in terms of a higher sPART′, they take in terms of a lower b′.9
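
The ρ = 0.5 case can be reproduced with a small regression. This is a sketch under assumptions: the sample of 100 alternating observations and the deterministic "noise" series are hypothetical stand-ins, and `ols_simple` is our own helper for a one-regressor OLS fit.

```python
import math

def ols_simple(y, x):
    """Slope and its OLS standard error from regressing y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)

part = [1, 0] * 50                                        # rho = 0.5
noise = [((i * 7) % 11 - 5) * 0.3 for i in range(100)]    # deterministic stand-in for epsilon
da = [(5 if p == 1 else -5) + e for p, e in zip(part, noise)]
part_prime = [2 * p - 1 for p in part]                    # PART' = 2(PART - 1/2)

b_hat, se_hat = ols_simple(da, part)
b_prime, se_prime = ols_simple(da, part_prime)
print(b_hat, b_prime, b_hat / se_hat, b_prime / se_prime)
```

Both the coefficient and its standard error halve when PART′ replaces PART, so the two t-statistics coincide.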

The preceding discussion raises the question of ρ's magnitude in typical earnings management studies. In many earnings management studies, earnings management is only hypothesized to occur in a small proportion of the available observations. For example, in our analysis of firms subject to enforcement actions by the SEC, to be presented later in the paper, there are 406 firm-years with earnings management allegations out of a total sample of 161,119 firm-years. Thus, ρ = 406/161,119 = 0.003. To give further examples from some highly cited earnings management studies, ρ = 0.020 in Defond and Jiambalvo's [1994] study of debt covenant violations, ρ = 0.012 in Defond and Subramanyam's [1998] study of auditor changes, and ρ = 0.047 in Ball and Shivakumar's [2008] study of IPOs. Thus, successfully incorporating reversals would have substantially increased test power in all of these studies. Nevertheless, researchers should be aware that the improvements in test power from incorporating reversals would disappear in settings where ρ approaches 0.5.

We next relax the assumption that we perfectly measure discretionary accruals and examine the impact of incorporating reversals on test specification in the presence of correlated omitted nondiscretionary accruals (Problem 2: Inclusion of correlated η in DAP). Recall from section 2.1 that the presence of correlated omitted nondiscretionary accruals, η, in DAP biases the estimate of earnings management b̃ as follows:

$$E(\tilde{b} - b) = \beta_{(\eta)(PART)}$$

where β(η)(PART) = regression coefficient in a regression of η on PART. The impact of modeling reversals on this bias therefore hinges on the impact of substituting PART′ for PART on β(η)(PART). First, note that β(η)(PART) = β(η)(PART′) only in the special case that η completely reverses in the reversal period.10 As discussed earlier, the economic characteristics driving nondiscretionary accruals tend to persist, leading us to expect that the associated nondiscretionary accruals should also persist. So a second special case of interest is when η completely persists into the reversal period. In this case, β(η)(PART′) = 0, because, for every observation where PART′ = 1, we now have another observation for which PART′ = −1 and η is the same. Thus, incorporating reversals in tests of earnings management completely eliminates Problem 2 when the nondiscretionary accruals completely persist into the reversal period. More generally, incorporating accrual reversals will mitigate Problem 2 to the extent that the associated nondiscretionary accruals do not completely reverse in the reversal period. Intuitively, by modeling discretionary accrual reversals, we reduce the likelihood of mistakenly attributing earnings management to nondiscretionary accruals that are correlated with PART but do not completely reverse.
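
The two special cases can be illustrated with constructed data in which there is no earnings management at all. The sketch below is hypothetical throughout: 1,000 observations, 20 event years each followed by one reversal year, and omitted nondiscretionary accruals of 0.01 that either persist into or fully reverse in the reversal year.

```python
def slope(y, x):
    """OLS slope from regressing y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

n, k = 1000, 20
part = [1] * k + [0] * (n - k)
partr = [0] * k + [1] * k + [0] * (n - 2 * k)
part_prime = [p - r for p, r in zip(part, partr)]

# eta persists: the same 0.01 appears in the event year and in the reversal year.
eta_persist = [0.01 if (p or r) else 0.0 for p, r in zip(part, partr)]
# eta reverses completely: +0.01 in the event year, -0.01 in the reversal year.
eta_reverse = [0.01 * pp for pp in part_prime]

bias_part = slope(eta_persist, part)                 # bias in the traditional test
bias_prime_persist = slope(eta_persist, part_prime)  # bias once reversals are modeled
bias_prime_reverse = slope(eta_reverse, part_prime)  # no improvement if eta fully reverses
print(bias_part, bias_prime_persist, bias_prime_reverse)
```

When η persists, the slope on PART′ is zero (up to rounding), so the spurious "earnings management" vanishes; when η reverses in lockstep with the hypothesized reversal, the bias remains.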

To summarize, incorporating reversals into tests of earnings management produces two potential benefits:

  • 1) So long as the hypothesized determinant of earnings management is present in less than half of the total available firm-years, incorporating reversals increases test power.
  • 2) So long as any correlated omitted nondiscretionary accruals do not happen to reverse in the same period that the earnings management is hypothesized to reverse, incorporating reversals mitigates correlated omitted variables bias.

3. Test Design

This section describes our framework for incorporating accrual reversals in tests of earnings management and summarizes key features of the nondiscretionary accrual models employed in our tests. Our primary objective is to examine the impact of incorporating accrual reversals on existing tests for earnings management. We therefore strive to keep other features of our testing framework consistent with prior research.

3.1 test procedure

We implement equation (1)′ as follows:

$$WC\_ACC_{i,t} = a + b\,PART_{i,t} + \sum_{k} f_{k} X_{k,i,t} + e_{i,t} \tag{3}$$

where:

  • WC_ACC = noncash working capital accruals;
  • PART = a dummy variable that is set to 1 in periods during which a hypothesized determinant of earnings management is present and 0 otherwise; and
  • Xk = controls for nondiscretionary accruals.

Note that, following DSS, we use working capital accruals as our base measure of accruals and directly include controls for nondiscretionary accruals as additional explanatory variables in the earnings management regression.

To incorporate reversals, we augment (3) through the inclusion of a second partitioning variable that identifies periods in which the earnings management is hypothesized to reverse (PARTR):

$$WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTR_{i,t} + \sum_{k} f_{k} X_{k,i,t} + e_{i,t} \tag{4}$$

We then test the linear restriction that b − c = 0 to test for earnings management.11 The alternative hypotheses for upward (downward) earnings management are b − c > (<) 0. While the assumption that earnings management reverses in one year is reasonable for working capital accruals, it is not the only possible assumption. For example, if earnings are hypothesized to be managed upward during equity offerings, one might reasonably hypothesize that such earnings management would not reverse until after sufficient time has passed that management and investment banker lock-up agreements have expired. Thus, PARTR need not always take on the value of 1 in the period immediately following that in which PART = 1.
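
As a rough illustration of the mechanics, the restriction b − c = 0 can be tested with a t-statistic built from the coefficient covariance matrix. This is a minimal sketch, not the paper's implementation: it omits the controls Xk, uses simulated accruals with seeded management of 0.02 that reverses fully in the next year, and relies on a hand-written `ols` helper; all of these are assumptions made for the example.

```python
import math
import random

def ols(y, X):
    """OLS via the normal equations; returns coefficients and their covariance matrix."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Gauss-Jordan inversion of X'X (adequate for a handful of regressors).
    aug = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(xtx)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    s2 = rss / (n - k)
    return beta, [[s2 * v for v in row] for row in inv]

rng = random.Random(1)
n, m, b_true = 2000, 100, 0.02          # hypothetical sample size, managed years, magnitude
part = [1] * m + [0] * (n - m)
partr = [0] * m + [1] * m + [0] * (n - 2 * m)
y = [b_true * p - b_true * r + rng.gauss(0, 0.05) for p, r in zip(part, partr)]
X = [[1.0, p, r] for p, r in zip(part, partr)]   # constant, PART, PARTR

beta, cov = ols(y, X)
_, b, c = beta
se_diff = math.sqrt(cov[1][1] + cov[2][2] - 2 * cov[1][2])
t_reversal = (b - c) / se_diff          # t-statistic for H0: b - c = 0
print(f"b = {b:.4f}, c = {c:.4f}, t(b - c) = {t_reversal:.2f}")
```

For upward earnings management the alternative is b − c > 0, so a large positive statistic is evidence against the null. In applied work a regression package's built-in linear-restriction test would normally replace the hand-rolled algebra.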

For the purpose of conducting our evaluation of model (4), we consider three scenarios regarding the timing of the reversal of earnings management. In the first scenario, we assume that the researcher has no priors regarding the reversal of the earnings management, thus excluding PARTR from the regression. This scenario essentially collapses to the traditional model in equation (3). In the second scenario, we assume that all earnings management reverses in the year immediately following the earnings management year. This seems to be a plausible assumption when considering working capital accruals, since most working capital accruals are expected to reverse within a year. However, since it is also possible that managers have the incentives and the ability to delay accrual reversals beyond one year, we also consider a third scenario in which we assume that all earnings management reverses over the two years following the earnings management year. Under these latter two scenarios, if earnings are hypothesized to be managed for two or more consecutive years, we assume that the reversal starts in the first year following the last of the consecutive earnings management years.

To facilitate interpretation of the results for the third scenario, we decompose PARTR into two new partitioning variables, PARTP1 and PARTP2, where PARTP1 equals 1 in the first year following an earnings management year and 0 otherwise and PARTP2 equals 1 in the second year following an earnings management year and 0 otherwise:

$$WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + \sum_{k} f_{k} X_{k,i,t} + e_{i,t} \tag{5}$$

We then conduct a test of the linear restriction that b − c − d = 0 to test for earnings management. While similar in spirit to including a single reversal variable PARTR, where PARTR = PARTP1 + PARTP2, this approach allows us to separately estimate the magnitude of the reversal in each of the subsequent two periods.

3.2 models of nondiscretionary accruals

A wide variety of nondiscretionary accrual models have been employed by previous research. We examine common variants of the most popular models, and our testing framework is easily extended to other models. The two key features of each model are:

  • 1) the measure of accruals and
  • 2) the determinants of nondiscretionary accruals, Xk.

We use noncash working capital accruals (WC_ACC) as the measure of accruals in all of our models, where:

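$$WC\_ACC_{i,t} = \frac{\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t}}{A_{i,t-1}}$$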

and

  • ΔCA = the change in current assets
  • ΔCL = the change in current liabilities
  • ΔCash = the change in cash
  • ΔSTD = the change in short-term debt
  • A = total assets.
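As an illustration, the accrual measure can be computed from two consecutive balance sheets. The sketch assumes the form (ΔCA − ΔCL − ΔCash + ΔSTD) deflated by lagged total assets, and uses the Compustat mnemonics act, lct, che, dlc, and at as dictionary keys; the function name `wc_acc` and the numbers are hypothetical.

```python
def wc_acc(cur, prev):
    """Noncash working capital accruals scaled by lagged total assets.

    cur and prev are dicts of balance-sheet levels for years t and t-1,
    keyed by Compustat-style mnemonics (act, lct, che, dlc, at).
    """
    def change(key):
        return cur[key] - prev[key]

    numerator = change("act") - change("lct") - change("che") + change("dlc")
    return numerator / prev["at"]


# Hypothetical firm: all figures in the same currency units.
prev_year = {"act": 100.0, "lct": 50.0, "che": 20.0, "dlc": 10.0, "at": 200.0}
this_year = {"act": 120.0, "lct": 60.0, "che": 25.0, "dlc": 12.0, "at": 210.0}
accrual = wc_acc(this_year, prev_year)  # (20 - 10 - 5 + 2) / 200
```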

Early research also subtracts depreciation expense in the definition of accruals (e.g., Healy [1985]), but this adjustment is often dropped in subsequent research on the grounds that it is related to long-term capital expenditure accruals rather than working capital accruals (e.g., Allen, Larson, and Sloan [2010]).

We consider five popular models of nondiscretionary accrual determinants as follows:

3.2.1. Healy

Healy [1985] does not incorporate any determinants of nondiscretionary accruals.

3.2.2. Jones

Jones [1991] includes the change in revenues and the level of gross property, plant, and equipment (PPE hereafter) as determinants of nondiscretionary accruals.

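$$WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$$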

3.2.3. Modified Jones

DSS show that the original Jones model has low power in cases where firms manipulate revenue through the misstatement of net accounts receivable. This is because the original Jones model includes the change in credit sales as a determinant of nondiscretionary accruals, resulting in the removal of discretionary accruals (Problem 2 from section 2.1). To mitigate this problem, DSS suggest that cash revenue be used in place of reported revenue.12

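$$WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$$

where ΔREV here is the change in cash revenue, that is, the change in reported revenue less the change in net accounts receivable, scaled by lagged total assets.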

3.2.4. DD

Dechow and Dichev [2002] note that, if the objective of accruals is to correct temporary matching problems with firms’ underlying cash flows, then nondiscretionary accruals should be negatively correlated with contemporaneous cash flows and positively correlated with adjacent cash flows. They therefore propose including past, present, and future cash flows (CF) as additional relevant variables in explaining nondiscretionary accruals.13

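$$WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,CF_{i,t-1} + f_2\,CF_{i,t} + f_3\,CF_{i,t+1} + e_{i,t}$$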

where:

$CF_{i,t} = \text{Earnings before Extraordinary Items}_{i,t} - DAP_{i,t}$.

Wysocki [2009] reasons this model will tend to classify discretionary accruals that are made with the intention of smoothing earnings as nondiscretionary. For example, a firm with deteriorating cash flows may try to manage accruals upward to avoid reporting deteriorating earnings. This model is therefore poorly suited to tests of earnings management where the hypothesis entails earnings smoothing.

3.2.5. McNichols

Finally, McNichols [2002] shows that combining the determinants from both the Jones and the DD models described above results in greater explanatory power with respect to working capital accruals. McNichols points out that, because these determinants most likely represent fundamentals to a greater extent than discretion, estimates of discretionary accruals based on either model alone likely contain a significant nondiscretionary component. This combined model has been embraced in subsequent research (e.g., Francis et al. [2005]).
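The five sets of determinants can be summarized in a small lookup table, together with a helper that builds the scaled regressors for one firm-year. This is a sketch under stated assumptions, not the authors' code: the names `MODEL_DETERMINANTS`, `scaled_determinants`, and `design_row` are hypothetical, cash flow is taken to be earnings before extraordinary items scaled by lagged total assets less working capital accruals, and the lagged and lead cash flows (`cf_lag`, `cf_lead`) are assumed to be filled in from the adjacent firm-years.

```python
MODEL_DETERMINANTS = {
    "Healy": [],
    "Jones": ["d_rev", "ppe"],
    "Modified Jones": ["d_cash_rev", "ppe"],
    "DD": ["cf_lag", "cf", "cf_lead"],
    "McNichols": ["d_rev", "ppe", "cf_lag", "cf", "cf_lead"],
}


def scaled_determinants(cur, prev, wc_acc):
    """Regressors for year t, each scaled by lagged total assets (at)."""
    lag_assets = prev["at"]
    d_sale = cur["sale"] - prev["sale"]
    d_rect = cur["rect"] - prev["rect"]
    return {
        "d_rev": d_sale / lag_assets,                  # change in revenue
        "d_cash_rev": (d_sale - d_rect) / lag_assets,  # change in cash revenue
        "ppe": cur["ppegt"] / lag_assets,              # gross PP&E
        "cf": cur["ib"] / lag_assets - wc_acc,         # earnings less accruals
    }


def design_row(model, obs):
    """[intercept, PART, PARTP1, PARTP2, determinants...] for one firm-year."""
    dummies = [obs["part"], obs["partp1"], obs["partp2"]]
    xs = [obs[name] for name in MODEL_DETERMINANTS[model]]
    return [1.0] + [float(v) for v in dummies + xs]


# Hypothetical firm-year.
prev = {"sale": 100.0, "rect": 20.0, "at": 200.0}
cur = {"sale": 150.0, "rect": 30.0, "ppegt": 80.0, "ib": 12.0}
det = scaled_determinants(cur, prev, wc_acc=0.035)
obs = dict(part=1, partp1=0, partp2=0, cf_lag=0.01, cf_lead=0.03, **det)
row = design_row("McNichols", obs)
```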

We make three additional research design choices that apply to all of the models. First, we estimate the models as a single panel that pools across all available firm-years in our sample. This approach is common in the existing literature. Another common approach is to estimate each of the models by industry and year and then conduct the earnings management tests by pooling across the model residuals (e.g., Defond and Jiambalvo [1994]). In unreported tests, we confirmed that this approach yields results that are qualitatively similar to those reported in the paper. An approach that was adopted by early research is to estimate each model at the firm level and then conduct statistical inference by aggregating t-statistics from the firm-specific regressions (e.g., Jones [1991]). This approach is not common in more recent research because it results in a considerable loss of power. The loss in power arises because firms with insufficient observations to conduct a firm-specific regression have to be dropped and because a separate set of model parameters has to be estimated for each firm. In unreported tests, we confirmed that this approach results in the loss of a substantial number of observations and a significant decline in test power.

The second choice we make in our research design is to conduct all statistical tests using the heteroskedasticity-consistent covariance matrix proposed in MacKinnon and White [1985] and commonly referred to as HC3. This approach to incorporating heteroskedasticity has been shown to be the best specified across a broad range of sample sizes (e.g., Long and Erwin [2000]). Note that, because tests using HC3 appeal to asymptotic theory, all linear restrictions are tested using a chi-square (χ2) statistic. We note that most previous earnings management research employs standard OLS regression analysis, implicitly assuming that the residuals in equation (3) are independently and identically distributed. We conducted a series of diagnostic tests to identify significant violations of this assumption. While we find little evidence of systematic time-series or cross-sectional (e.g., industry) dependence in model residuals, we do find significant evidence of heteroskedasticity when grouping firm-years by characteristics such as size, ROA, and presence of an SEC enforcement action. Hence, we recommend the use of heteroskedasticity-consistent covariance matrices in tests for earnings management.
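To make the testing step concrete, the following sketch estimates an OLS regression, forms the HC3 covariance matrix, and computes the chi-square statistic for a single linear restriction such as b − c = 0. It is a minimal pure-Python illustration and not the authors' implementation: the function names `inverse` and `hc3_wald` and the toy data are hypothetical, and the HC3 formula used is the common one in which each squared residual is divided by (1 − h)², where h is the observation's leverage.

```python
def inverse(a):
    """Invert a square matrix (list of lists) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def hc3_wald(X, y, r):
    """OLS of y on X, then a 1-df Wald test of r'beta = 0 using HC3.

    Returns (beta, estimate of r'beta, chi-square statistic).
    """
    k = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    inv = inverse(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]

    meat = [[0.0] * k for _ in range(k)]
    for row, yi in zip(X, y):
        resid = yi - sum(bj * xj for bj, xj in zip(beta, row))
        inv_x = [sum(inv[a][b] * row[b] for b in range(k)) for a in range(k)]
        leverage = sum(xj * vj for xj, vj in zip(row, inv_x))
        weight = (resid / (1.0 - leverage)) ** 2  # HC3 adjustment
        for a in range(k):
            for b in range(k):
                meat[a][b] += weight * row[a] * row[b]

    # r'Vr with V = inv * meat * inv; inv is symmetric.
    inv_r = [sum(inv[a][b] * r[b] for b in range(k)) for a in range(k)]
    var = sum(inv_r[a] * meat[a][b] * inv_r[b] for a in range(k) for b in range(k))
    est = sum(rj * bj for rj, bj in zip(r, beta))
    return beta, est, est * est / var


# Toy Healy-type regression: intercept, PART, PARTP1 (hypothetical data).
X = [[1, 0, 0]] * 4 + [[1, 1, 0]] * 2 + [[1, 0, 1]] * 2
y = [0.01, -0.01, 0.02, -0.02, 0.05, 0.03, -0.03, -0.01]
beta, est, chi2 = hc3_wald(X, y, r=[0, 1, -1])  # restriction b - c = 0
```

With dummy regressors only, the estimate of b − c is simply the difference between the mean accrual in the PART = 1 years and the mean accrual in the PARTP1 = 1 years. An observation with leverage equal to 1 (for example, a dummy that is 1 for a single observation) would make the HC3 weight undefined, so the sketch assumes no such observation exists.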

The third choice we make in our research design is to conduct tests using performance-matched discretionary accruals following the procedure described in Kothari, Leone, and Wasley [2005]. The matched pair is the firm-year in the same two-digit SIC code and fiscal year with the closest ROA. We follow Kothari et al. in conducting separate tests for matching on ROAt and ROAt-1, respectively. Performance-matched discretionary accruals are computed by taking the residuals from each of the models of nondiscretionary accruals (estimated excluding the earnings management partitioning variables) and subtracting the corresponding residual for the matched pair in the PART= 1 year. We also follow Kothari et al. in conducting inference using a standard t-test against a null of 0 on the resulting differenced residual. For comparative purposes, when we report these test statistics, we square the t-statistic to arrive at the corresponding F-statistic, which approximates a χ2-statistic for large sample sizes and hence is comparable to the χ2-statistics from our reversal models. We emphasize that we adopt the Kothari et al. approach for comparative purposes and because of its popularity in the existing literature. Our main purpose in doing so is to demonstrate the effectiveness of modeling accrual reversals as an alternative to Kothari et al.'s performance matching procedure in addressing misspecification due to correlated omitted variables.
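The matching and differencing steps can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the Kothari et al. procedure: the function names, the dictionary fields (`sic2`, `year`, `roa`, `resid`), and the data are hypothetical, and firm-years with no available match are simply skipped.

```python
import math


def performance_matched(treated, candidates):
    """Residual of each PART = 1 firm-year minus its matched pair's residual.

    The match is the candidate in the same two-digit SIC code and fiscal
    year with the closest ROA.
    """
    diffs = []
    for obs in treated:
        pool = [c for c in candidates
                if c["sic2"] == obs["sic2"] and c["year"] == obs["year"]
                and c is not obs]
        if not pool:
            continue  # assumption: drop firm-years with no match
        match = min(pool, key=lambda c: abs(c["roa"] - obs["roa"]))
        diffs.append(obs["resid"] - match["resid"])
    return diffs


def t_and_squared(diffs):
    """t-statistic against a null of 0 and its square."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    t = mean / math.sqrt(var / n)
    return t, t * t


treated = [
    {"sic2": "35", "year": 2001, "roa": 0.10, "resid": 0.05},
    {"sic2": "35", "year": 2002, "roa": 0.02, "resid": 0.01},
]
candidates = [
    {"sic2": "35", "year": 2001, "roa": 0.11, "resid": 0.02},
    {"sic2": "35", "year": 2001, "roa": 0.30, "resid": 0.09},
    {"sic2": "35", "year": 2002, "roa": 0.00, "resid": -0.01},
    {"sic2": "20", "year": 2002, "roa": 0.02, "resid": 0.50},
]
diffs = performance_matched(treated, candidates)
t_stat, t_squared = t_and_squared(diffs)
```

The squared t-statistic is the quantity compared with the χ2-statistics from the reversal models.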

4. Experimental Design

4.1 data

Our sample consists of available firm-years from the Compustat annual files for which we can calculate WC_ACC. We therefore require positive nonmissing values of the following variables (Compustat mnemonics in brackets): receivables (rect), current assets (act), current liabilities (lct), cash and equivalent (che), short-term debt (dlc), total assets (at), sales (sale), and PP&E (ppegt). We also require nonmissing values of earnings before extraordinary items (ib) so that we can derive cash flows, CF, for use in the Dechow and Dichev model and cash flow performance matching tests. Annual Compustat data are pulled using the DATAFMT=STD flag to ensure that we are using the original “as reported” and unrestated data.14 We exclude financial firms, since working capital is less meaningful for these firms, and we winsorize all financial variables at the 1% tails. Following Kothari, Leone, and Wasley [2005], we define ROA as earnings before extraordinary items divided by lagged total assets. Our final sample consists of 209,530 firm-year observations between 1950 and 2009.
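Winsorizing at the 1% tails can be illustrated with the standard library. The text does not state which percentile definition is used, so the interpolation method below (`method="inclusive"`) is an assumption, and the function name `winsorize` is hypothetical.

```python
import statistics


def winsorize(values, tail_pct=1):
    """Clamp values below the tail_pct-th and above the (100 - tail_pct)-th percentile."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[tail_pct - 1], cuts[100 - tail_pct - 1]
    return [min(max(v, low), high) for v in values]


# Hypothetical variable taking the values 0, 1, ..., 100.
raw = list(range(101))
clean = winsorize(raw)
```

In this toy series only the two extreme observations are changed: they are pulled in to the 1st and 99th percentile values.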

4.2 test procedure

We follow a similar procedure to DSS to examine the power and specification of each of the models. We first examine each model in its traditional form, and we then examine the impact of incorporating earnings management reversals. The models are evaluated in four different contexts. First, we examine model specification using randomly selected earnings management years. Second, we artificially seed earnings management and its associated reversal to evaluate the gains in test power resulting from incorporating reversals. Third, we examine the power of the models using a sample of firms identified by the SEC as having manipulated earnings. Finally, we examine model specification in situations where the earnings management years are correlated with various economic characteristics.

4.2.1. Tests Where the Earnings Management Year (i.e., PART = 1) Is Randomly Selected

To evaluate test specification in random samples of firm-years, we perform the following steps for each combination of models and tests:

  • 1) From among the 209,530 firm-years, we randomly select 100 firm-year observations.15 The 100 firm-years are designated as earnings management years (i.e., PART= 1). The remaining firm-years are designated as non–earnings management years (i.e., PART= 0).
  • 2) We then determine whether data are available in the two years immediately following each earnings management year. If they are, we set PARTP1 = 1 and PARTP2 = 1 for the first and second year, respectively, and equal to 0 otherwise.
  • 3) We conduct a pooled regression for each model as described in the previous section using all 209,530 firm-years.
  • 4) Steps 1 through 3 are repeated 1,000 times.
  • 5) We record the frequency with which the null hypothesis of no earnings management is rejected at the 5% level using one-tailed tests (for the χ2 tests we use a 10% level and condition on the direction in which the linear constraint is rejected, effectively conducting a one-tailed test at the 5% level).16
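The random selection and the one-tailed decision rule in the steps above can be sketched as follows. The helper names are hypothetical and the regression itself is left out; each simulated sample is represented only by its estimated linear combination (for example b − c) and the associated χ2-statistic. The constant 2.706 is the 10% critical value of a chi-square distribution with one degree of freedom.

```python
import random

CHI2_1DF_10PCT = 2.706  # 90th percentile of chi-square with 1 df


def draw_em_years(firm_years, k=100, seed=None):
    """Randomly designate k firm-years as earnings management years."""
    return set(random.Random(seed).sample(firm_years, k))


def one_tailed_reject(estimate, chi2, direction):
    """Reject at the one-tailed 5% level in the given direction (+1 or -1)."""
    return chi2 > CHI2_1DF_10PCT and estimate * direction > 0


def rejection_rates(results):
    """results: list of (estimate, chi2), one pair per simulated sample."""
    n = len(results)
    positive = sum(one_tailed_reject(e, c, +1) for e, c in results) / n
    negative = sum(one_tailed_reject(e, c, -1) for e, c in results) / n
    return positive, negative


# Hypothetical output of four simulated samples.
results = [(0.02, 3.1), (-0.01, 4.0), (0.03, 1.0), (0.01, 2.8)]
pos_rate, neg_rate = rejection_rates(results)
em_years = draw_em_years(list(range(1000)), k=100, seed=1)
```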

4.2.2. Simulation Tests with Induced Earnings Management

The purpose of these tests is to examine the power of the models to detect earnings management in settings where we know the magnitude and timing of the earnings management and associated reversal. Our tests differ from those in previous research, such as DSS, in that we also simulate the reversal of the earnings management. Our first set of simulations examines how changing the proportion of the reversal that is correctly modeled by the researcher impacts test power. These simulations are conducted through the following six steps:

  • 1) From among the 209,530 firm-years, we randomly select 100 firm-year observations. The 100 firm-years are designated as earnings management years (i.e., PART= 1). The remaining firm-years are designated as non–earnings management years (i.e., PART= 0).
  • 2) For the 100 earnings management years, we artificially induce earnings management by adding “discretionary accruals” equal to 1% of the beginning total assets.
  • 3) We then determine whether data are available in the year immediately following each earnings management year. If they are, we set PARTP1 equal to 1 for that year. We consider 11 scenarios in which the induced earnings management in step 2 is reversed in increments of 10%, from 0% (i.e., no reversal) to 100% (i.e., complete reversal) in this subsequent year.
  • 4) We estimate a pooled regression for each model using all 209,530 firm-years and conduct tests for earnings management.
  • 5) Steps 1 through 4 are repeated 1,000 times.
  • 6) We repeat steps 1 through 5 after substituting earnings management of 2% of beginning total assets at step 2.
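The seeding step can be sketched as below. The function name `seed_earnings_management` and the data layout are hypothetical. One simplification is labeled as an assumption: accruals in year t + 1 are scaled by total assets at the end of year t, so reversing the same dollar amount would not be exactly `magnitude` in scaled terms; the sketch ignores this difference.

```python
def seed_earnings_management(accruals, em_keys, magnitude=0.01, reversal_share=1.0):
    """Add induced discretionary accruals and their (partial) reversal.

    accruals maps (firm, year) -> WC_ACC, already scaled by lagged assets.
    magnitude is the induced amount as a fraction of beginning total assets.
    reversal_share is the fraction reversed in the following year.
    """
    seeded = dict(accruals)
    for firm, year in em_keys:
        seeded[(firm, year)] += magnitude
        following = (firm, year + 1)
        if following in seeded:
            # Assumption: same scaled amount, ignoring the change in deflator.
            seeded[following] -= magnitude * reversal_share
    return seeded


REVERSAL_SCENARIOS = [i / 10 for i in range(11)]  # 0%, 10%, ..., 100%

# Hypothetical accrual series.
base = {("A", 2000): 0.02, ("A", 2001): 0.01, ("B", 2000): 0.00}
seeded = seed_earnings_management(
    base, [("A", 2000), ("B", 2000)], magnitude=0.02, reversal_share=0.5
)
```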

Our second set of simulations examines test power as a function of sample size. In these simulations, we assume that the researcher correctly models the reversal of earnings management and examine how incorporating reversals impacts test power relative to traditional tests of earnings management that ignore reversals. These simulations are conducted through the following six steps:

  • 1) From among the 209,530 firm-years, we randomly select 100 firm-year observations. The 100 firm-years are designated as earnings management years (i.e., PART= 1). The remaining firm-years are designated as non–earnings management years (i.e., PART= 0).
  • 2) For the 100 earnings management years, we artificially induce earnings management by adding “discretionary accruals” equal to 2% of the beginning total assets.
  • 3) We then determine whether data are available in the year immediately following each earnings management year. If they are, we set PARTP1 equal to 1 for that year and we add a “discretionary accrual reversal” equal in magnitude but opposite in sign to the discretionary accruals in step 2 (i.e., 100% reversal).
  • 4) We estimate a pooled regression for each model using all 209,530 firm-years and conduct tests for earnings management.
  • 5) Steps 1 through 4 are repeated 1,000 times.
  • 6) We repeat steps 1 through 5 varying the number of earnings management firms selected in step 1 from 100 to 1,000 in increments of 100.
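A back-of-the-envelope calculation suggests why modeling a complete reversal should raise the test statistic. It is a simplified sketch, not the authors' derivation, and it assumes that the estimates of b and c are independent, that each has sampling variance σ²/n (n being the number of earnings management years), and that the induced amount δ reverses fully in the next year:

$$E[\hat b] = \delta, \qquad \operatorname{Var}(\hat b) \approx \frac{\sigma^2}{n}$$

$$E[\hat b - \hat c] = \delta - (-\delta) = 2\delta, \qquad \operatorname{Var}(\hat b - \hat c) \approx \frac{2\sigma^2}{n}$$

$$\frac{2\delta / \sqrt{2\sigma^2/n}}{\delta / \sqrt{\sigma^2/n}} = \sqrt{2} \approx 1.41$$

Under these assumptions the expected standardized statistic is about 41% larger when the reversal is included, and the gain shrinks if only part of the reversal is modeled correctly.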

4.2.3. SEC Accounting and Auditing Enforcement Release (AAER) Sample

We use the AAER sample to examine the power of the different tests and models to detect earnings management in a sample of firm-years where we have strong priors that earnings have been managed. The advantage of these tests is that we do not have to make assumptions about either the magnitude or timing of the earnings management and reversal. Instead, we employ a sample of firm-years examined by Dechow et al. [2011] in which the SEC alleges that upward earnings management has taken place. Dechow et al. [2011] identify the specific years in which the alleged earnings management takes place by reading the associated SEC accounting and auditing enforcement releases. We expect these cases of earnings management to be particularly egregious. Moreover, the fact that they were identified and targeted by the SEC makes it probable that any earnings management is subsequently reversed. So this sample provides an ideal setting to look for both evidence of earnings management and its associated reversal. If we are unable to document evidence in this sample, then it seems unlikely that our tests for earnings management could be effective in other less extreme settings.

There are 230 firms and 406 firm-years for which the SEC makes allegations of upwardly managed earnings. We evaluate the ability of the different models to detect earnings management through the following steps:

  • 1) We set PART= 1 in the 406 firm-years in which upward earnings management is alleged to have taken place and PART= 0 otherwise.
  • 2) We set PARTP1 = 1 in the first year following the final earnings management year and PARTP1 = 0 otherwise.
  • 3) We set PARTP2 = 1 for the second year following the last earnings management year and PARTP2 = 0 otherwise.
  • 4) We conduct a pooled regression for each model using all 161,119 firm-years during the 1973–2003 period spanned by the AAERs.
  • 5) We repeat the above steps 1 through 4 for a subset of 122 of the 406 AAER firm-years in which the SEC specifically alleges that a component of working capital accruals was manipulated.

4.2.4. Tests Where the Earnings Management Year (i.e., PART) Is Randomly Selected from Portfolios with Extreme Economic Characteristics

To determine the specification of the models for samples where the earnings management partitioning variable is correlated with common economic characteristics, we perform the following steps for each model:

  • 1) We rank the 209,530 firm-years into 10 portfolios based on the corresponding economic characteristic, where decile 10 consists of firms with the highest values of the characteristic. We then randomly select 100 firm-years from decile 10. The 100 firm-years are designated as earnings management years (i.e., PART= 1) with the subsequent two firm-years designated as reversal years (i.e., PARTP1 = 1 and PARTP2 = 1).
  • 2) We conduct a pooled regression for each model described in the previous section using all 209,530 firm-years.
  • 3) Steps 1 and 2 are repeated 1,000 times.
  • 4) We record the frequency with which the null hypothesis of no earnings management is rejected at the 5% level for each earnings management test.
  • 5) We repeat steps 1 through 4, but select 100 firm-years from decile 1 (lowest values of the characteristic).

We perform these tests for a variety of economic characteristics that are commonly encountered in earnings management studies. These characteristics include ROA, sales growth, size (market capitalization), operating cash flows, and the consensus analyst forecast of long-term earnings growth.17
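The decile ranking and the draw from an extreme decile can be sketched as follows. The function names and the toy characteristic are hypothetical, and ties are broken by sort order, which is an assumption.

```python
import random


def decile_ranks(values):
    """values maps a firm-year key to a characteristic; returns key -> 1..10.

    Decile 10 holds the highest values of the characteristic.
    """
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {key: (i * 10) // n + 1 for i, key in enumerate(ordered)}


def sample_from_decile(values, decile, k=100, seed=None):
    """Randomly pick k firm-years from the requested decile."""
    ranks = decile_ranks(values)
    pool = [key for key, d in ranks.items() if d == decile]
    return random.Random(seed).sample(pool, k)


# Hypothetical characteristic for 1,000 firm-years.
roa = {i: i / 1000 for i in range(1000)}
ranks = decile_ranks(roa)
picks = sample_from_decile(roa, decile=10, k=100, seed=0)
```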

5. Results

5.1 descriptive statistics

Table 1 reports descriptive statistics for working capital accruals. This table illustrates the intuition behind several of our subsequent results. Panel A indicates that working capital accruals have a positive mean, suggesting that the sample firms have grown in scale over the sample period. Panel B reports some pertinent correlations. First, the correlation between working capital accruals and earnings is 0.18, indicating that working capital accruals are an important driver of contemporaneous earnings. Second, the serial correlation in working capital accruals is weakly positive (0.047 at one lag and 0.040 at two lags). This result tells us that working capital accruals tend to neither immediately reverse nor strongly persist “on average.” It is consistent with the results in Allen, Larson, and Sloan [2010], who show that working capital accruals contain both strongly reversing and strongly persistent components that cancel each other “on average.”

Table 1.
Descriptive Statistics on Working Capital Accruals

Panel A: Distribution of working capital accruals

| | Mean | Standard Deviation | 5% | Lower Quartile | Median | Upper Quartile | 95% | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|---|
| WC_ACC | 0.021 | 0.124 | −0.145 | −0.025 | 0.010 | 0.058 | 0.218 | 0.698 | 7.928 |

Panel B: Pearson correlations

| | Earnings (t) | WC_ACC (t+1) | WC_ACC (t+2) |
|---|---|---|---|
| WC_ACC (t) | 0.180* | 0.047* | 0.040* |

Panel C: Descriptive statistics on working capital accruals for deciles formed on earnings performance

| Rank of Earnings | Mean of WC_ACC, Year t | Mean of WC_ACC, Year t+1 | Mean of WC_ACC, Year t+2 | Standard Deviation of WC_ACC, Year t | Standard Deviation of WC_ACC, Year t+1 | Standard Deviation of WC_ACC, Year t+2 |
|---|---|---|---|---|---|---|
| 1 | −0.041 | −0.005 | 0.005 | 0.213 | 0.217 | 0.220 |
| 2 | −0.012 | −0.010 | 0.003 | 0.120 | 0.132 | 0.132 |
| 3 | 0.004 | −0.006 | 0.003 | 0.094 | 0.098 | 0.105 |
| 4 | 0.013 | 0.005 | 0.009 | 0.084 | 0.087 | 0.093 |
| 5 | 0.018 | 0.013 | 0.012 | 0.077 | 0.081 | 0.083 |
| 6 | 0.023 | 0.017 | 0.016 | 0.081 | 0.078 | 0.081 |
| 7 | 0.031 | 0.026 | 0.022 | 0.087 | 0.087 | 0.086 |
| 8 | 0.039 | 0.036 | 0.028 | 0.096 | 0.093 | 0.092 |
| 9 | 0.051 | 0.045 | 0.034 | 0.108 | 0.103 | 0.097 |
| 10 | 0.081 | 0.057 | 0.036 | 0.166 | 0.135 | 0.119 |

Panel D: Descriptive statistics on working capital accruals for the AAER sample

| Mean of WC_ACC, Year t | Mean of WC_ACC, Year t+1 | Mean of WC_ACC, Year t+2 | Standard Deviation of WC_ACC, Year t | Standard Deviation of WC_ACC, Year t+1 | Standard Deviation of WC_ACC, Year t+2 |
|---|---|---|---|---|---|
| 0.077 | −0.047 | −0.035 | 0.192 | 0.158 | 0.140 |

*Significant at 1% level.

Variables are defined as follows (Compustat mnemonics in parentheses):

$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$, working capital accruals;

ΔCA, change in current assets (act);

ΔCL, change in current liabilities (lct);

ΔCash, change in cash (che);

ΔSTD, change in short-term debt (dlc);

A, total assets (at);

$Earnings_{i,t}$, earnings before extraordinary items$_{i,t}$ (ib)$/A_{i,t-1}$.

Panel A reports descriptive statistics on working capital accruals for 209,530 firm-year observations between 1950 and 2009. In selecting the 209,530 firm-year observations, positive nonmissing values are required for receivables (rect), current assets (act), current liabilities (lct), cash and equivalent (che), short-term debt (STD), sales (sale), and PP&E (ppegt), and income before extraordinary items (ib) is nonmissing. Panel B reports Pearson correlations between working capital accruals in year t and earnings in year t, working capital accruals in year t+1, and working capital accruals in year t+2. The number of firm-year observations used in this correlation analysis is 172,969. Panel C reports means and standard deviations of working capital accruals in year t, year t+1, and year t+2 for each earnings performance decile rank. Earnings performance is defined as earnings before extraordinary items (ib) divided by the beginning total assets (at). Earnings performance decile ranks are formed with 209,530 firm-year observations between 1950 and 2009. Rank 1 is the lowest earnings performance decile and rank 10 is the highest earnings performance decile. Panel D reports means and standard deviations of working capital accruals in year t, year t+1, and year t+2 for 406 SEC Accounting and Auditing Enforcement Release (AAER) sample firm-years that were subject to enforcement actions for upward annual earnings management between 1971 and 2003.

Panel C reports average accruals for years t, t+ 1, and t+ 2 by deciles formed on earnings performance (i.e., ROA) in year t. There is clear evidence of a positive correlation between working capital accruals and earnings performance in all three years. Thus, there is a strong positive correlation between earnings and accruals that persists over the next two years. These results are consistent with the existence of nondiscretionary accruals that capture persistent economic performance. The fact that the high (low) accruals in high (low) earnings deciles tend to persist helps rule out the possibility that these accruals are due to reversing earnings management. These results also highlight the intuition for why incorporating accrual reversals into tests of earnings management helps to mitigate misspecification due to correlated omitted determinants of nondiscretionary accruals. If we only infer that earnings management is present when we see clear evidence of a corresponding accrual reversal, we would not infer that the accruals in the extreme earnings performance deciles in panel C represent earnings management, because they do not reverse.

Panel D reports average accruals for years t, t + 1, and t + 2 for firm-years in the AAER sample that are alleged to have managed earnings upward in year t. Accruals are large and positive (0.077) in year t and similar in magnitude to the top earnings decile in panel C (0.081). But unlike the top decile panel C accruals that stay high in periods t + 1 and t + 2 (0.057 and 0.036, respectively), the accruals for the AAER sample exhibit a sharp reversal and are significantly negative in periods t + 1 and t + 2 (−0.047 and −0.035, respectively). Note also that the sum of the negative accruals in periods t + 1 and t + 2 is opposite in sign but approximately equal in magnitude to the positive accruals in period t. Thus, in a sample where we expect that earnings management is present, we see clear evidence of the predicted subsequent accrual reversal. This pattern illustrates how incorporating accrual reversals should increase the power of tests for earnings management.
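The contrast between panels C and D can be made concrete with the raw means reported in Table 1. The calculation below is only a raw-mean analogue of the restriction b − c − d = 0; the regression coefficients are measured relative to the intercept and controls, so the actual test statistics will differ.

$$\text{AAER sample (panel D):}\quad 0.077 - (-0.047) - (-0.035) = 0.159$$

$$\text{Top earnings decile (panel C):}\quad 0.081 - 0.057 - 0.036 = -0.012$$

The two samples have similar year-t accruals, but only the AAER sample produces a large positive contrast once the following two years are subtracted.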

A final feature to note from Table 1 is that the standard deviation of accruals varies widely across sample partitions. The overall sample standard deviation from panel A is 0.124. But the standard deviation ranges from a low of 0.077 for earnings decile 5 in period t to highs of 0.213 for the lowest earnings decile in period t and 0.192 for the AAER sample in period t. The standard deviation of accruals is clearly associated with firm-year characteristics such as earnings performance and the presence of SEC enforcement actions. These significant violations of the assumption of independently and identically distributed errors that underlie OLS regression analysis are what prompt us to control for heteroskedasticity in the estimation of our earnings management models (see section 3.2 for details).

5.2 specification of tests for earnings management in random samples

Table 2, panel A, reports the mean coefficients and t-statistics from the 1,000 simulations using randomly assigned earnings management years for each of the competing earnings management models and test statistics. As expected for the random assignments, the parameter estimates on the earnings management partitioning variables and associated reversal variables are all close to 0 for all models. Similarly, the mean performance-matched discretionary accruals are also close to 0. The various explanatory variables in the models also take on their predicted values, consistent with previous research. The coefficients on sales growth are significantly positive, the coefficients on contemporaneous cash flows are significantly negative, and the coefficients on lead/lag cash flows are significantly positive.

Table 2. 
Descriptive Statistics and Rejection Rates for Tests of Earnings Management When Earnings Management Years Are Randomly Selected
Panel A: Mean coefficients (t-statistics) for earnings management model parameters, and mean performance-matched discretionary accruals (t-statistic)
Model Parameter Estimates
 b onc ond on     Mean
 PARTPARTP1PARTP2ΔREVPPECF (t−1)CF (t)CF (t+ 1)Adjusted R2
Healy0.000−0.002−0.003     0.00%
 (−0.046)(−0.210)(−0.285)      
Jones0.0000.0000.0000.109−0.008   12.80%
 (−0.037)(−0.044)(−0.032)(80.524)(−11.527)    
Modified Jones0.000−0.001−0.0010.091−0.007   6.79%
 (−0.041)(−0.106)(−0.124)(58.682)(−9.516)    
DD0.000−0.002−0.003  0.079−0.2250.13317.33%
 (0.001)(−0.221)(−0.284)  (31.406)(−52.839)(40.883) 
McNichols0.000−0.001−0.0010.101−0.0140.089−0.2120.11328.29%
 (0.011)(−0.067)(−0.035)(81.523)(−20.317)(34.669)(−50.104)(35.381) 
Performance-Matched Discretionary Accruals
 Matched on ROA (t−1)Matched on ROA (t)
Healy0.0000.001
 (−0.009)(0.042)
Jones0.0000.000
 (−0.007)(0.007)
Modified Jones0.0000.000
 (−0.012)(0.022)
DD0.0000.001
 (−0.002)(0.039)
McNichols0.0000.000
 (−0.006)(0.001)
Panel B: Rejection rates
HealySignificantly PositiveSignificantly Negative
Test level: one-tailed 5%
b= 04.1%6.8%*
b − c= 06.3%3.9%
b − c − d= 08.8%**3.2%**
Performance matched on ROA(t−1)5.2%5.0%
Performance matched on ROA(t)5.5%4.8%
JonesSignificantly PositiveSignificantly Negative
Test level: one-tailed 5%
b= 04.3%6.1%
b − c= 05.6%6.1%
b − c − d= 06.0%5.7%
Performance matched on ROA(t−1)5.1%5.5%
Performance matched on ROA(t)5.2%5.3%
Modified JonesSignificantly PositiveSignificantly Negative
Test level: one-tailed 5%
b= 04.1%6.5%*
b − c= 06.0%5.0%
b − c − d= 06.6%*4.6%
Performance matched on ROA(t−1)5.2%5.4%
Performance matched on ROA(t)5.3%5.1%
DDSignificantly PositiveSignificantly Negative
Test level: one-tailed 5%
b= 04.7%5.9%
b − c= 05.1%2.2%**
b − c − d= 07.9%**2.3%**
Performance matched on ROA(t−1)4.7%4.7%
Performance matched on ROA(t)5.4%4.4%
McNicholsSignificantly PositiveSignificantly Negative
Test level: one-tailed 5%
b= 05.1%4.9%
b − c= 04.6%3.3%*
b − c − d= 05.1%4.3%
Performance matched on ROA(t−1)4.9%4.6%
Performance matched on ROA(t)5.0%5.0%
Panel C: Serial correlations in discretionary accruals and nondiscretionary accruals
 Mean Serial Correlations for All Available FirmsMean Serial Correlations for Available AAER FirmsSerial Correlations for Earnings Management and Reversal Years in AAER Sample
DiscretionaryNondiscretionaryDiscretionaryNondiscretionaryDiscretionaryNondiscretionary
  1. ** and * significantly different from the specified test level at 1% and 5% level, respectively, using a two-tailed binomial test.

  2. The models are defined as below (characters in parentheses are Compustat mnemonics):HealyWCACCi,t =a + bPARTi,t +cPARTP 1i,t +dPARTP 2i,t +ei,t,where:

  3. WC_ACCi,t= (ΔCAi,t−ΔCLi,t−ΔCashi,tSTDi,t)/Ai,t− 1, working capital accruals

  4. ΔCA= change in current assets (act)

  5. ΔCL= change in current liabilities (lct)

  6. ΔCash= change in cash (che)

  7. ΔSTD= change in short-term debt (dlc)

  8. A= total assets (at)

  9. PARTi,t= partitioning variable that is set to 1 for randomly selected 100 firm-years and 0 otherwise

  10. PARTP1i,t= partitioning variable that is set to 1 in the year following randomly selected 100 firm-years and 0 otherwise

  11. PARTP2i,t= partitioning variable that is set to 1 in the second year following randomly selected 100 firm-years and 0 otherwise.JonesWCACCi,t =a + bPARTi,t +cPARTP 1i,t +dPARTP 2i,t +f1 Δ REVi,t +f2 PPEi,t +ei,t,where:

  12. ΔREVi,t= (Revenuei,tRevenuei,t− 1) /Ai,t− 1

  13. PPEi,t=PP&Ei,t/Ai,t-1:

  14. Revenue (sale) and PP&E: Gross property, plant, and equipment (ppegt).

  15. inline image

  16. where:

  17. ΔREVi,t= ([Revenuei,t− Revenuei,t-1]−[Net Accounts Receivablei,t− Net Accounts Receivablei,t-1])/Ai,t-1,

  18. Net accounts receivable: (rect).inline imagewhere:

  19. CFi,t= earnings before extraordinary itemsi,t (ib) −WC_ACCi,t.inline imagewhere:

  20. ΔREVi,t= (Revenuei,t− Revenuei,t-1) /Ai,t-1.

  21. For each performance matching model, discretionary accruals are estimated based on the respective model below:

  22. Healy: Discretionary accruals =WC_ACCi,t.

  23. Jones: Discretionary accruals are residuals from WC_ACCi,t=a+f1ΔREVi,t+f2 PPEi,t+ei,t.

  24. Modified Jones: Discretionary accruals are residuals from WC_ACCi,t=a+f1ΔREVi,t+f2 PPEi,t+ei,t.

  25. DD: Discretionary accruals are residuals from WC_ACCi,t=a+f1CFi,t-1+f2CFi,t+f3CFi,t+1+ei,t.

  26. McNichols: Discretionary accruals are residuals from WC_ACCi,t=a+f1ΔREVi,t+f2 PPEi,t+f3CFi,t-1+f4CFi,t+f5CFi,t+1+ei,t.

  27. Panel A reports mean coefficients and t-statistics for each model based on 1,000 pooled regressions using 209,530 firm-year observations from 1950 through 2009. PART is set equal to 1 for 100 randomly selected earnings management firm-years (0 otherwise), PARTP1 is set equal to 1 for the first year following the 100 earnings management years (0 otherwise), and PARTP2 is set equal to 1 for the second year following the 100 earnings management years (0 otherwise). This procedure is repeated 1,000 times. Panel A reports mean performance-matched discretionary accruals and associated t-statistics for each model, based on 1,000 repetitions of randomly selecting 100 observations for the models. For performance matching on ROA(t), observations are matched with another observation from the same two-digit SIC code with the closest return on assets in the current year. Performance-matched discretionary accruals are then computed by subtracting the matched firm's discretionary accruals. Performance matching on ROA(t−1) is done in a similar manner. Panel B reports the rejection rates of no earnings management based on PART (b= 0), PART and PARTP1 (bc= 0), PART, PARTP1, and PARTP2 (bcd= 0), and performance-matched discretionary accruals. For reversal models, after each pooled regression is run, χ2-tests are conducted based on heteroskedasticity-consistent standard errors. This procedure is repeated 1,000 times and the percent of the 1,000 regressions is determined where the χ2-statistic is significant at the 5% level (one-tailed test). A binomial test is performed to determine whether the percentage is significantly different from the specified 5% test level. Note that b is the coefficient on PART, c is the coefficient on PARTP1, and d is coefficient on PARTP2. For performance matching models, t-tests are conducted to examine whether the mean performance-matched discretionary accruals are significant at 5% level. 
This procedure is repeated 1,000 times and the percent of the 1,000 repetitions is determined where the t-statistic is significant at the 5% level. A binomial test is performed to determine whether the percentage is significantly different from the specified 5% test level. The first two columns of panel C report mean firm-specific serial correlations of discretionary accruals and nondiscretionary accruals estimated from each model for all available firms and AAER firms. The serial correlations of discretionary accruals and nondiscretionary accruals are Pearson correlations between year t and year t+ 1 for each firm. To estimate firm-specific discretionary accruals and nondiscretionary accruals, only firms with six or more observations are selected. For the first two columns of panel C, 11,342 firms (177,814 firm-years) are used in the case of all available firms, and 194 firms (2,789 firm-years) are used in the case of AAER firms. The last column of panel C documents serial correlations of discretionary accruals and nondiscretionary accruals between the last earning management year (when earnings are alleged to be managed for several consecutive years) and the first reversal years available for AAER firms.

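                  All firms                AAER firms               AAER firms: last EM year and first reversal year
Model             Disc.       Nondisc.     Disc.       Nondisc.     Disc.       Nondisc.
Healy             −0.094      N/A          −0.017      N/A          −0.197      N/A
Jones             −0.206      0.200        −0.165      0.258        −0.334      0.192
Modified Jones    −0.180      0.177        −0.137      0.244        −0.413      0.190
DD                0.077       −0.199       0.065       −0.125       −0.197      −0.112
McNichols         −0.080      −0.127       −0.112      −0.052       −0.344      −0.087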

Panel B reports the rejection frequencies for each of the discretionary accrual models using each of the competing tests for earnings management. There are five models (Healy, Jones, Modified Jones, DD, and McNichols models) and five different tests (b = 0, b − c = 0, b − c − d = 0, performance matching on ROA(t), and performance matching on ROA(t−1)). We also report one-tailed tests for both positive and negative earnings management, so panel B reports 50 sets of rejection frequencies in total (5 models × 5 tests × 2 directions). All tests are conducted using a 5% test level, so the observed rejection frequencies should be 5% for well-specified tests.

The results indicate that all models are relatively well specified, in that their rejection frequencies are close to the specified test level. The only notable exceptions are for tests of the form b − c − d = 0 using the Healy and DD models, where the rejection frequencies look somewhat high for positive earnings management and somewhat low for negative earnings management. We note that the average level of accruals has declined slightly over time, providing a potential explanation for these rejection rates. The fact that models controlling for sales growth do not exhibit this problem suggests that the decline in accruals reflects a corresponding decline in nondiscretionary accruals that are correlated with sales growth.
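The binomial check behind "close to the specified test level" can be sketched as follows. This is a minimal illustration, assuming an exact two-sided binomial test over 1,000 independent simulations; the function names and the example rejection counts are hypothetical and are not taken from Table 2.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed in logs for numerical stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def two_sided_binomial_p(rejections, n_sims=1000, level=0.05):
    """Exact two-sided p-value: total probability of all outcomes that are
    no more likely than the observed number of rejections."""
    observed = binom_pmf(rejections, n_sims, level)
    total = 0.0
    for k in range(n_sims + 1):
        pk = binom_pmf(k, n_sims, level)
        if pk <= observed * (1 + 1e-9):  # small tolerance for float ties
            total += pk
    return min(total, 1.0)

# Hypothetical rejection counts out of 1,000 simulations (illustrative only).
for count in (50, 64, 80):
    print(count, round(two_sided_binomial_p(count), 4))
```

A small p-value would indicate that the observed rejection frequency differs from the nominal 5% level by more than simulation noise alone would explain.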

Finally, panel C of Table 2 reports serial correlations for estimated discretionary accruals (i.e., the residuals) and nondiscretionary accruals (i.e., the fitted values) from each of the nondiscretionary accrual models. Recall from Table 1 that working capital accruals display weak positive serial correlation in the pooled sample. Allen, Larson, and Sloan [2010] predict that nondiscretionary accruals should display positive serial correlations, since they reflect persistent economic performance, while discretionary accruals should display negative serial correlations, since they reflect reversing earnings management. The first two columns of panel C report the mean firm-specific serial correlations for discretionary and nondiscretionary accruals using each of the models. The results for the Healy, Jones, and Modified Jones models are all consistent with the predictions of Allen et al. The discretionary accruals are negatively serially correlated while the nondiscretionary accruals are positively serially correlated. The DD and McNichols models, in contrast, display negative serial correlations for nondiscretionary accruals. These results are consistent with Wysocki's [2009] conclusion that, by controlling for contemporaneous cash flows, these models classify some discretionary accruals as nondiscretionary.

To gain further insights into these results, we focus on the serial correlations for firms that are alleged by the SEC to have managed earnings. The middle two columns of panel C report the mean serial correlations for firms that belong to the AAER sample. The serial correlations for these firms generally look similar to those for the full sample. But remember that these firm-specific results use all available observations for each AAER firm in our 59-year sample period, whereas earnings management is typically only alleged in 1 or 2 years. We therefore report, in the final columns of panel C, pooled serial correlations using just the firm-years during which earnings management is alleged by the SEC (i.e., PART = 1) and the immediately following year (i.e., PARTP1 = 1), since this is where we would expect strong evidence of reversals. Consistent with this prediction, we see much stronger negative serial correlation in discretionary accruals for all five models.
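As an illustration of the firm-specific serial correlations described above, the sketch below computes the Pearson correlation between year t and year t+1 values for each firm and averages across firms with at least six observations. The data layout (a dictionary of per-firm lists of consecutive annual values) and the function names are assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def firm_serial_correlation(series, min_obs=6):
    """Lag-1 correlation for one firm's consecutive annual values."""
    if len(series) < min_obs:
        return None  # mirror the six-observation requirement
    return pearson(series[:-1], series[1:])

def mean_serial_correlation(panel, min_obs=6):
    """Average the firm-specific correlations over all eligible firms."""
    corrs = []
    for series in panel.values():
        c = firm_serial_correlation(series, min_obs)
        if c is not None:
            corrs.append(c)
    return sum(corrs) / len(corrs) if corrs else None

# Hypothetical residuals: firm A reverses each year, firm B is persistent.
panel = {"A": [1, -1, 1, -1, 1, -1], "B": [1, 2, 3, 4, 5, 6], "C": [1, 2, 3]}
print(mean_serial_correlation(panel))
```

Reversing discretionary accruals would show up as negative firm-level correlations, while persistent nondiscretionary accruals would show up as positive ones.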

5.3 Power of tests for earnings management using simulations with seeded earnings management

Our objectives in inducing earnings management into random samples are twofold. First, as in DSS, these tests illustrate the effectiveness of particular models in detecting earnings management of known magnitudes. Second, we can use them to illustrate the gains in power from modeling accrual reversals, including settings where the researcher has relatively weak priors concerning the timing of earnings management reversals. For example, assume that the researcher hypothesizes that earnings management will occur in year t and reverse in year t+ 1, and while the hypothesized earnings management all occurs in period t, only 50% of it actually reverses in year t+ 1. Is the power of the test still improved by modeling reversals in year t+ 1?

Figure 1 provides the results for tests where we successively induce earnings management in the magnitude of 1%, and then 2% of total assets. Recall that these tests are based on 1,000 simulations in which 100 observations out of 209,530 have induced earnings management. For each case, we report the frequency with which the null hypothesis of no earnings management is rejected in favor of the alternative of positive earnings management using a 5% test level. All tests use the Healy model and incorporate reversals in the next period (i.e., b − c= 0).18 For comparative purposes, we also report rejection rates using: (1) the Healy model of discretionary accruals using the standard t-test for earnings management (i.e., b= 0) and (2) performance-matched discretionary accruals using the Healy model and matching on ROAt. Note that the rejection rates for (1) and (2) are not expected to change with the magnitude of the reversal, since these tests do not incorporate reversals. We use the horizontal axis to vary the proportion of the induced earnings management in period t that reverses in period t+ 1. On the far left, we simulate no reversal (i.e., 0%) and on the far right we simulate a complete reversal (i.e., 100%).

Figure 1.

—The power of tests to detect earnings management using the Healy model with seeded upward earnings management that is partially reversed in the subsequent period. All tests employ 100 seeded earnings management observations and a 5% one-tailed test. This figure is based on the Healy model:
$WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + e_{i,t}$
where:
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1} + \text{Induced earnings}_{i,t}$ for the 100 randomly selected firm-years
or
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$ for the remaining observations
ΔCA = change in current assets (act)
ΔCL = change in current liabilities (lct)
ΔCash = change in cash (che)
ΔSTD = change in short-term debt (dlc)
Induced earnings = 1% (Figure 1A) or 2% (Figure 1B) of beginning total assets
$PART_{i,t}$ = partitioning variable that is set to 1 for the 100 randomly selected firm-years and 0 otherwise
$PARTP1_{i,t}$ = partitioning variable that is set to 1 in the year following the 100 randomly selected firm-years and 0 otherwise.
The simulations employ 209,530 firm-year observations between 1950 and 2009. We randomly select 100 firm-years as earnings management years and we add 1% (2%) of total assets to working capital accruals in Figure 1A (Figure 1B) for the 100 randomly selected firm-years. PART is set equal to 1 for the 100 randomly selected firm-years (0 otherwise) and PARTP1 is set equal to 1 for the year following the 100 earnings management years (0 otherwise). We reverse the seeded earnings management in the following year in increments of 10% ranging from 0% (no reversal) up to 100% (complete reversal) as shown on the x-axis. A pooled regression is run using all 209,530 observations for each increment. This procedure is repeated 1,000 times for each increment. We report the frequency that b is positive and significant (i.e., Freq. b > 0) at the 5% level (one-tailed test) among 1,000 pooled regressions. We also report the frequency with which the test of b − c = 0 is significant and b − c is positive among 1,000 pooled regressions (i.e., Freq. b − c > 0 at the 5% level, one-tailed test). For the performance matching model, 1% (Figure 1A) or 2% (Figure 1B) of beginning total assets are added to existing working capital accruals for the 100 randomly selected earnings management years and each of the 100 randomly selected observations is matched with another observation from the same two-digit SIC code with the closest return on assets in the current year (t). Performance-matched discretionary accruals are computed by subtracting the matched firm's discretionary accruals from the seeded earnings management firm's discretionary accruals. This procedure is repeated 1,000 times and we report the frequency that the mean performance-matched discretionary accrual is greater than 0 (i.e., Performance-matched discretionary accruals > 0) at the 5% level.

Figure 1(A) provides results where the induced earnings management is 1% of assets. The benchmark rejection frequencies are 20.7% for the standard test of b = 0 and 15.6% for performance matching. Note that, consistent with the results in Kothari, Leone, and Wasley [2005], performance matching reduces test power by approximately 25% (15.6% versus 20.7%). These rejection frequencies indicate that standard tests have low power for earnings management under these conditions. If we model a reversal in period t+1 when there is not one, the rejection frequency drops to 14.2%. Thus, incorporating reversals that do not occur is detrimental to test power. On the other hand, if we model a reversal in period t+1 and a 100% reversal actually occurs, then the rejection frequency increases to 27.8%. Relative to the standard t-test, the breakeven point from which modeling a reversal starts to improve test power occurs when just over 50% of the induced earnings management reverses in the following period. Relative to the performance-matched test, the breakeven point is just over 10%.
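The logic of the seeded simulation can be sketched in a few lines. This is a simplified illustration rather than a replication: it assumes normally distributed accruals with a standard deviation of 0.12 (a value chosen only for the example), a small comparison sample instead of the 209,530 firm-years, and 300 repetitions instead of 1,000. Because PART and PARTP1 are non-overlapping dummies, the OLS estimates of b and c reduce to differences in group means, and the heteroskedasticity-consistent (HC0) variance of each group mean is the sum of squared residuals divided by the squared group size.

```python
import math
import random

def group_mean_and_var(values):
    """Group mean and the HC0 variance of that mean."""
    n = len(values)
    mean = sum(values) / n
    var_of_mean = sum((v - mean) ** 2 for v in values) / n ** 2
    return mean, var_of_mean

def simulate_power(reversal, induced=0.01, n_part=100, n_base=2000,
                   sd=0.12, n_sims=300, seed=0, z_crit=1.645):
    """Return (freq. b > 0, freq. b - c > 0) at the 5% one-tailed level."""
    rng = random.Random(seed)
    reject_b = reject_bc = 0
    for _ in range(n_sims):
        base = [rng.gauss(0.0, sd) for _ in range(n_base)]
        part = [rng.gauss(0.0, sd) + induced for _ in range(n_part)]
        partp1 = [rng.gauss(0.0, sd) - reversal * induced for _ in range(n_part)]
        m0, v0 = group_mean_and_var(base)
        m1, v1 = group_mean_and_var(part)
        m2, v2 = group_mean_and_var(partp1)
        b, c = m1 - m0, m2 - m0
        if b / math.sqrt(v1 + v0) > z_crit:        # standard test, b = 0
            reject_b += 1
        if (b - c) / math.sqrt(v1 + v2) > z_crit:  # reversal test, b - c = 0
            reject_bc += 1
    return reject_b / n_sims, reject_bc / n_sims

for rho in (0.0, 0.5, 1.0):
    print(rho, simulate_power(rho))
```

With a fixed seed the same noise is drawn for every reversal fraction, so the rejection frequency of the reversal test can only rise as the reversal fraction rises, while the standard test is unaffected.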

Figure 1(B) reports corresponding results where the induced earnings management is increased from 1% of assets to 2% of assets. The rejection frequency for the standard t-test increases to 46.7% and the rejection frequency from modeling 100% reversals increases to 67.0%. The breakeven point when modeling the reversal improves test power relative to the standard t-test is when approximately 40% of the accrual reverses in the next period. The rejection frequency for performance-matched discretionary accruals is 35.0% and modeling accrual reversals results in greater test power when there is at least a 20% reversal. These tests illustrate that modeling reversals has the potential to increase test power by almost 50% relative to the standard tests (67.0%/46.7%) and almost 100% compared to performance matching (67.0%/35.0%). Moreover, note that we have assumed the researcher is 100% sure about the earnings management period and has only considered uncertainty about the reversal period. It is also possible that the researcher faces uncertainty concerning the earnings management period. For example, consider earnings management around equity offerings. Management may manage earnings for several quarters prior to the offering, and then take a “big bath” write-down in the fourth quarter following the offering (when manager and underwriter lock-ups typically expire). If the researcher has relatively weak priors about the timing of the originating earnings management and strong priors about its reversal, then the gains to modeling reversals can be even greater.19
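A stylized calculation may help explain where these breakeven points come from. It is a sketch under assumptions that are not part of the simulations: the errors in the PART and PARTP1 firm-years are independent with a common variance $\sigma^2$, each group contains $n$ observations, the sampling error of the intercept is negligible, $\delta$ denotes the induced earnings management, and $\rho$ the fraction of it that reverses in the next period.

```latex
\begin{align*}
E[\hat b] &= \delta, & \operatorname{Var}(\hat b) &\approx \sigma^2/n, \\
E[\hat b - \hat c] &= (1+\rho)\,\delta, & \operatorname{Var}(\hat b - \hat c) &\approx 2\sigma^2/n, \\
\frac{E[\hat b - \hat c]}{\operatorname{sd}(\hat b - \hat c)}
  &= \frac{1+\rho}{\sqrt{2}} \cdot \frac{E[\hat b]}{\operatorname{sd}(\hat b)}.
\end{align*}
```

Under these assumptions, modeling the reversal scales the expected test statistic by $(1+\rho)/\sqrt{2}$: about 0.71 with no reversal, 1 at $\rho = \sqrt{2} - 1 \approx 0.41$, and about 1.41 with a complete reversal. This is broadly in line with the simulated breakeven points of roughly 40% to 50%, although the mapping from expected test statistics to rejection frequencies is not linear.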

The results in Figure 1 illustrate that all tests of earnings management have relatively low power in settings where 100 observations experience earnings management of 1% to 2% of total assets. A common rule of thumb in hypothesis testing is that the ex ante power of a test should be at least 80% in order to have the precision to provide reliable inferences (e.g., Aberson [2010], p. 15). None of the tests in Figure 1 meet this rule of thumb, since the highest reported rejection rate is 67.0%. Figure 2 summarizes power analysis simulations that allow us to determine the number of observations with PART= 1 necessary to generate an 80% rejection rate. For these simulations, we induce earnings management equal to 2% of assets and incorporate a 100% reversal in the next period. The test incorporating reversals reaches 80% power between 100 and 200 observations. In contrast, it takes over 300 observations for the standard t-test to reach 80% power. Performance matching further increases the required sample size to over 500 observations. These results highlight how incorporating reversals can provide more reliable tests of earnings management when large sample sizes are not available.
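A rough way to see how sample size drives these results is to extrapolate from the rejection rates at 100 observations using a normal approximation. The sketch below is only an approximation under assumed normality, with the test statistic growing with the square root of the number of PART = 1 observations; it is not the simulation behind Figure 2 and will not reproduce the simulated sample sizes exactly. The function names are hypothetical; the input rejection rates (67.0%, 46.7%, and 35.0%) are those reported for Figure 1(B).

```python
from statistics import NormalDist

_norm = NormalDist()

def implied_noncentrality(power, alpha=0.05):
    """Expected z-statistic implied by a one-tailed rejection rate."""
    return _norm.inv_cdf(1 - alpha) + _norm.inv_cdf(power)

def required_n(power_at_n0, n0=100, target=0.80, alpha=0.05):
    """Observations needed for the target power, assuming the expected
    z-statistic scales with the square root of the sample size."""
    ncp0 = implied_noncentrality(power_at_n0, alpha)
    ncp_target = implied_noncentrality(target, alpha)
    return n0 * (ncp_target / ncp0) ** 2

tests = {"reversal (b - c = 0)": 0.670,
         "standard (b = 0)": 0.467,
         "performance matched": 0.350}
for name, power in tests.items():
    print(name, round(required_n(power)))
```

The ordering of the three tests is the point of the exercise: the test with the highest power at 100 observations needs the fewest observations to reach the 80% rule of thumb.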

Figure 2.

—The power of tests for earnings management using the Healy model and earnings management sample sizes ranging from 100 to 1,000. All tests have seeded upward earnings management of 2% of assets that completely reverses in the subsequent year. We use a 5% one-tailed test as the cut-off rejection rate.
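This figure is based on the Healy model:
$WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + e_{i,t}$
where:
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1} + \text{Induced earnings}_{i,t}$ for the 100 randomly selected firm-years
or
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$ for the remaining observations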
ΔCA= Change in current assets (act)
ΔCL= Change in current liabilities (lct)
ΔCash= Change in cash (che)
ΔSTD= Change in short-term debt (dlc)
Induced earnings = 2% of beginning total assets
PARTi,t= Partitioning variable that is set to 1 for randomly selected 100 firm-years and 0 otherwise
PARTP1i,t= Partitioning variable that is set to 1 in the year following randomly selected 100 firm-years and 0 otherwise.
The simulations employ 209,530 firm-year observations between 1950 and 2009. One hundred firm-years are randomly selected as earnings management years and 2% of beginning total assets are added to existing working capital accruals for those 100 randomly selected firm-years and the same amount is then subtracted from the following year's working capital accruals, assuming 100% reversal in the following year. PART is set to 1 for the 100 firm-years (0 otherwise) and PARTP1 is set to 1 for the year following the 100 earnings management years (0 otherwise). A pooled regression is estimated using all 209,530 observations. This procedure is repeated 1,000 times. The frequency reported corresponds to the number of cases where b is significant at the 5% level (one-tailed test) among 1,000 pooled regressions (i.e., Freq. b > 0) or where the test of b − c = 0 is significant at the 10% level and the difference (b − c) is positive among 1,000 pooled regressions (i.e., Freq. b − c > 0). The same procedure is repeated for 200 through 1,000 randomly selected firm-years in increments of 100 as shown on the x-axis. For the performance-matching model, 2% of beginning total assets are also added to existing working capital accruals for the 100 randomly selected earnings management years and each of the 100 randomly selected observations is matched with another observation from the same two-digit SIC code with the closest return on assets in the current year (i.e., year t). Performance-matched discretionary accruals for each of the 100 randomly selected observations are then computed by subtracting the matched firm's discretionary accruals and testing whether the mean performance-matched discretionary accruals for the 100 observations is significantly greater than 0 at the 5% level. This procedure is repeated 1,000 times and the rejection frequency is reported (i.e., Performance-matched discretionary accruals > 0).
The same procedure is repeated for 200 through 1,000 randomly selected firm-years in increments of 100 as shown on the x-axis.

To summarize, the results in this section indicate that tests incorporating earnings management reversals do not unambiguously increase test power. If the researcher models a reversal in a period when no reversal actually occurs, test power is reduced. But so long as at least half of the earnings management reverses in the period in which it is modeled, incorporating reversals increases test power. In cases where the researcher's priors about the timing of the reversal of the earnings management are as strong as those about its origination, test power increases by almost 50% relative to the standard t-test and almost doubles compared to performance matching. If the researcher has stronger priors about the timing of the reversal than the origination of the earnings management, increases in power can be even more substantial. Given the low power of tests for earnings management in typical research settings, such gains in power can significantly increase test reliability.

5.4 Power of tests for earnings management using the AAER sample

Our next set of tests examines the importance of considering accrual reversals in the AAER sample, where we have strong priors that upward earnings management has taken place. Figure 3 plots mean accruals for our AAER sample centered on the alleged year of the earnings management, comparing it to a sample where the earnings management year is randomly selected. The figure indicates that accruals are unusually high for AAER firms in year 0. Interestingly, they are also high in the years immediately prior to year 0. This could indicate that these firms were engaging in less egregious earnings management in the preceding years. Perhaps the most striking feature of Figure 3 is the strong accrual reversals that take place in years t+ 1 and t+ 2. It is perhaps these reversals that were catalysts for the SEC investigations. It seems intuitive that tests incorporating these reversals should provide more power in tests for earnings management.

Figure 3.

—Time series of working capital accruals for the Accounting and Auditing Enforcement Release (AAER) sample compared to a sample where PART is randomly assigned. Year 0 includes all years that the SEC alleges firms engage in manipulation. Working capital accruals are defined as follows (Compustat mnemonics in brackets):
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$, where:
ΔCA = change in current assets (act)
ΔCL = change in current liabilities (lct)
ΔCash = change in cash (che)
ΔSTD = change in short-term debt (dlc)
A = total assets (at).
This figure compares the time-series trend of the mean working capital accruals (WC_ACC) for the AAER sample to that for the sample where earnings management years are randomly selected. For the AAER sample, year 0 represents the 406 AAER firm-years between 1971 and 2003 and the mean working capital accruals are computed for year –5 through year 5. If earnings are managed for several consecutive years, “year 1” is the first year following the final year of earnings management. The rest of the relative years are defined in the same manner. For the sample where earnings management years are randomly selected, 100 firm-years are randomly selected from 209,530 firm-year observations between 1950 and 2009 and designated as year 0 and we calculate the mean working capital accruals. This procedure is repeated 1,000 times. The figure reports the mean working capital accruals based on the 1,000 iterations for each relative year.
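The working capital accrual measure used throughout can be computed from two consecutive years of Compustat items. The sketch below assumes each firm-year is a dictionary keyed by the Compustat mnemonics listed above, and it adds back the change in short-term debt, which is the usual sign convention for this measure; the function name and the example figures are hypothetical.

```python
def working_capital_accruals(cur, prev):
    """WC_ACC = (dCA - dCL - dCash + dSTD) / lagged total assets.

    cur and prev are dicts for years t and t-1 with Compustat items
    act, lct, che, dlc; prev must also contain at (total assets).
    """
    d_ca = cur["act"] - prev["act"]    # change in current assets
    d_cl = cur["lct"] - prev["lct"]    # change in current liabilities
    d_cash = cur["che"] - prev["che"]  # change in cash
    d_std = cur["dlc"] - prev["dlc"]   # change in short-term debt
    return (d_ca - d_cl - d_cash + d_std) / prev["at"]

# Hypothetical firm: two consecutive fiscal years (amounts in $ millions).
prev_year = {"act": 100.0, "lct": 50.0, "che": 15.0, "dlc": 8.0, "at": 200.0}
cur_year = {"act": 120.0, "lct": 60.0, "che": 20.0, "dlc": 10.0}
print(working_capital_accruals(cur_year, prev_year))
```

Scaling by beginning-of-year total assets makes the measure comparable across firms and across the relative years plotted in the figure.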

Table 3 presents the results of formal tests for earnings management in the AAER sample. Panel A reports the coefficient estimates from the five models. Note that the underlying AAER sample is significantly enlarged and updated relative to the original sample used in DSS (we have 406 firm-years vs. 56 in their sample). The point estimates for b, the coefficient in the earnings management year, range from 2.8% for the McNichols model to 5.7% for the Healy model. Note that, relative to the Healy model, all of the other models that contain additional controls for nondiscretionary accruals have lower point estimates for b. This finding is consistent with the contention in DSS that controls for nondiscretionary accruals unintentionally eliminate discretionary accruals (Problem 1 from section 2.1). The associated t-statistics are all significant at conventional levels and range from 3.714 for the McNichols model to 6.046 for the DD model.

Table 3. 
The Power of Tests for Earnings Management in the AAER Sample
 Healy   Jones   Modified Jones   DD   McNichols
  1. Tests of earnings management are conducted using the models below (characters in brackets are Compustat mnemonics): Healy: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + e_{i,t}$, where:

  2. $WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$, working capital accruals

  3. ΔCA= change in current assets (act)

  4. ΔCL= change in current liabilities (lct)

  5. ΔCash= change in cash (che)

  6. ΔSTD= change in short-term debt (dlc)

  7. A= total assets (at)

  8. PARTi,t= partitioning variable that is set to 1 for 406 AAER firm-years in which upward earnings management is alleged to have taken place

  9. PARTP1i,t= partitioning variable that is set to 1 for the first year following the final earnings management year and 0 otherwise

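  10. PARTP2i,t= partitioning variable that is set to 1 for the second year following the final earnings management year and 0 otherwise. Jones: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + e_{i,t}$, where:

  11. $\Delta REV_{i,t} = (Revenue_{i,t} - Revenue_{i,t-1})/A_{i,t-1}$

  12. $PPE_{i,t} = PP\&E_{i,t}/A_{i,t-1}$

  13. Revenue (sale) and PP&E: gross property, plant, and equipment (ppegt). Modified Jones: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + e_{i,t}$, where:

  14. $\Delta REV_{i,t} = ([Revenue_{i,t} - Revenue_{i,t-1}] - [\text{Net Accounts Receivable}_{i,t} - \text{Net Accounts Receivable}_{i,t-1}])/A_{i,t-1}$. DD: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1 CF_{i,t-1} + f_2 CF_{i,t} + f_3 CF_{i,t+1} + e_{i,t}$, where:

  15. $CF_{i,t}$ = earnings before extraordinary items$_{i,t}$ (ib) $- WC\_ACC_{i,t}$.

  16. McNichols: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + f_3 CF_{i,t-1} + f_4 CF_{i,t} + f_5 CF_{i,t+1} + e_{i,t}$,

  17. where

  18. $\Delta REV_{i,t} = (Revenue_{i,t} - Revenue_{i,t-1})/A_{i,t-1}$.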

  19. For each performance matching model, discretionary accruals are estimated based on the respective model below:

  20. Healy: Discretionary accruals = $WC\_ACC_{i,t}$.

  21. Jones: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + e_{i,t}$.

  22. Modified Jones: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + e_{i,t}$.

  23. DD: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1 CF_{i,t-1} + f_2 CF_{i,t} + f_3 CF_{i,t+1} + e_{i,t}$.

  24. McNichols: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + f_3 CF_{i,t-1} + f_4 CF_{i,t} + f_5 CF_{i,t+1} + e_{i,t}$.

  25. Panel A reports the coefficients and t-statistics for each model based on a pooled regression using 161,119 observations between 1971 and 2003. PART is set equal to 1 for 406 AAER firm-years between 1971 and 2003 (0 otherwise), PARTP1 (PARTP2) is set equal to 1 for the year (for the second year) following the year in which PART equals 1. If earnings are managed for several consecutive years, PARTP1 (PARTP2) is set equal to 1 for the first (second) year following the final year of earnings management. Panel B reports the mean performance-matched discretionary accruals and associated t-statistics for 406 AAER firm-years. For performance matching on ROA(t), observations are matched with another observation from the same two-digit SIC code with the closest return on assets in the current year (t). Performance-matched discretionary accruals are then computed by subtracting the matched firm's discretionary accruals. Performance matching on ROA(t−1) is done in a similar manner except that ROA in year t−1 is used. Panel C reports results of testing earnings management for 406 AAER firm-years based on PART (b = 0), PART and PARTP1 (b − c = 0), PART, PARTP1, and PARTP2 (b − c − d = 0), and performance-matched discretionary accruals. For reversal models, a pooled cross-sectional regression is run using all 161,119 observations and χ2-tests are conducted based on heteroskedasticity-consistent standard errors with χ2-statistics and associated p-values being reported. Note that b is the coefficient on PART, c is the coefficient on PARTP1, and d is the coefficient on PARTP2. For performance matching models, t-tests are conducted for performance-matched discretionary accruals of 406 AAER firm-years, and F-statistics and associated p-values are reported by squaring t-statistics to maintain comparability with χ2-statistics used for reversal models.

Panel A: Coefficients (t-statistics) for each model
                                          Healy      Jones      Modified Jones   DD         McNichols
b, coefficient on PART                    0.057      0.033      0.044            0.050      0.028
                                          (5.987)    (3.858)    (4.736)          (6.046)    (3.714)
c, coefficient on PARTP1                  −0.066     −0.064     −0.068           −0.067     −0.065
                                          (−5.666)   (−6.146)   (−6.163)         (−6.356)   (−6.760)
d, coefficient on PARTP2                  −0.055     −0.044     −0.050           −0.050     −0.043
                                          (−4.863)   (−4.245)   (−4.552)         (−4.849)   (−4.347)
ΔRevenue                                             0.109      0.089                       0.101
                                                     (73.708)   (52.962)                    (74.903)
PPE                                                  −0.009     −0.008                      −0.014
                                                     (−10.997)  (−9.224)                    (−18.133)
CFO (t−1)                                                                        0.085      0.096
                                                                                 (31.271)   (34.238)
CFO (t)                                                                          −0.264     −0.250
                                                                                 (−54.733)  (−52.388)
CFO (t+1)                                                                        0.141      0.122
                                                                                 (39.660)   (34.745)
Adjusted R2                               0.10%      13.20%     6.79%            19.99%     31.08%

Panel B: Performance-matched discretionary accruals (t-statistic) for each model
                                          Healy      Jones      Modified Jones   DD         McNichols
Mean discretionary accruals               0.054      0.036      0.045            0.053      0.037
  performance matched on ROA(t−1)         (4.710)    (3.500)    (4.060)          (5.380)    (4.090)
Mean discretionary accruals               0.057      0.038      0.047            0.056      0.039
  performance matched on ROA(t)           (5.100)    (3.740)    (4.290)          (5.790)    (4.320)

Panel C: Significance levels of test statistics (p-values) for each model
                                          Healy      Jones      Modified Jones   DD         McNichols
χ2-statistic for b = 0                    35.849     14.887     22.428           36.556     13.793
                                          (0.000)    (0.000)    (0.000)          (0.000)    (0.000)
χ2-statistic for b − c = 0                66.883     51.869     60.380           76.508     57.976
                                          (0.000)    (0.000)    (0.000)          (0.000)    (0.000)
χ2-statistic for b − c − d = 0            89.524     68.748     79.861           97.626     74.961
                                          (0.000)    (0.000)    (0.000)          (0.000)    (0.000)
F-statistic for 0 discretionary accruals  22.184     12.250     16.484           28.944     16.728
  performance matched on ROA(t−1)         (0.000)    (0.000)    (0.000)          (0.000)    (0.000)
F-statistic for 0 discretionary accruals  26.010     13.988     18.404           33.524     18.662
  performance matched on ROA(t)           (0.000)    (0.000)    (0.000)          (0.000)    (0.000)

The point estimates for c, the coefficient in the first potential “reversal” year, range from –6.4% for the Jones model to –6.8% for the Modified Jones model. All of these estimates are consistent with significant accrual reversals in the first post–earnings management year. Moreover, the reversals are all at least as large as the associated earnings management. The point estimates for d, the coefficient in the second year after the earnings management year, range from –4.3% for the McNichols model to –5.5% for the Healy model. These estimates are consistent with continued reversals in the second post–earnings management year. Note that the summed magnitude of the reversals exceeds the magnitude of the original earnings management. For example, the estimate of earnings management in the Healy model is 5.7%, while the associated reversals over the next two years sum to –6.6% – 5.5% = –12.1%. There are several possible explanations for this result. First, visual inspection of Figure 3 indicates that discretionary accruals in the AAER sample are unusually high in the three years leading up to the earnings management year. While year 0 is the year in which the SEC alleges that egregious earnings management has occurred, it is quite likely that these firms have been engaging in less egregious earnings management in the preceding years (e.g., Schrand and Zechman [2011]). So the reversals in years 1 and 2 may relate to the sum of the positive accruals in several prior years. Second, the SEC may be more likely to identify and litigate earnings manipulations characterized by large accrual reversals. Third, firms may be forced to liquidate operations in the immediate post-AAER period due to lack of financing stemming from investor mistrust and a higher cost of capital (e.g., McNichols and Stubben [2008]).20

Panel B of Table 3 reports the associated point estimates and test statistics on earnings management using performance-matched discretionary accruals. The point estimates are very similar to those reported in panel A. The associated t-statistics, however, are generally lower using performance matching. Recall that performance matching results in less powerful tests because of the increased standard errors resulting from the use of a matched pair.
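The matching step can be sketched as follows. This is a minimal illustration, assuming each firm-year is a dictionary with a four-digit integer SIC code, ROA for the matching year, and an estimated discretionary accrual; the field names, the function name, and the example data are hypothetical. Because the matched difference combines the noise in two firms' discretionary accruals, its standard error is larger than that of the unmatched estimate, which is the source of the power loss noted above.

```python
def performance_matched_da(treated, pool):
    """Subtract the discretionary accrual of the closest-ROA firm-year in
    the same two-digit SIC code; return None if no match is available."""
    same_industry = [f for f in pool
                     if f["sic"] // 100 == treated["sic"] // 100
                     and f["id"] != treated["id"]]
    if not same_industry:
        return None
    match = min(same_industry, key=lambda f: abs(f["roa"] - treated["roa"]))
    return treated["da"] - match["da"]

# Hypothetical firm-years: id 1 is the alleged earnings management firm-year.
firm = {"id": 1, "sic": 3571, "roa": 0.10, "da": 0.06}
pool = [
    firm,
    {"id": 2, "sic": 3577, "roa": 0.12, "da": 0.01},  # same industry, closest ROA
    {"id": 3, "sic": 3599, "roa": 0.30, "da": 0.05},  # same industry, distant ROA
    {"id": 4, "sic": 2834, "roa": 0.10, "da": 0.09},  # different industry
]
print(performance_matched_da(firm, pool))
```

Matching on ROA in year t−1 would follow the same steps with lagged ROA supplied in the `roa` field.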

Finally, panel C of Table 3 compares the power of the five different tests for earnings management using each of the five models. We report χ2-statistics for tests of linear constraints in the reversal models and F-statistics (which are just the square of the associated t-statistics) for the performance-matched tests. The F distribution approximates the χ2 distribution for large sample sizes, so the relative power of the tests can be directly evaluated from the magnitude of the associated test statistics. The tests incorporating accrual reversals have the largest test statistics for all five models. The performance-matched tests, meanwhile, generally have the lowest test statistics. For example, using the Healy model, the test statistics range from 89.524 for tests of the linear constraint that b − c − d = 0 to 22.184 for tests based on performance matching on ROA in period t−1. Modeling accrual reversals dramatically improves the power of tests for earnings management in the AAER sample.
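The relation between the panel B t-statistics and the panel C F-statistics, and the ranking of the tests, can be checked directly from the reported values. The short script below simply re-enters the numbers from Table 3; the variable and function names are hypothetical.

```python
MODELS = ["Healy", "Jones", "Modified Jones", "DD", "McNichols"]

# Panel B t-statistics and panel C F-statistics for the matched tests.
t_stats = {"ROA(t-1)": [4.710, 3.500, 4.060, 5.380, 4.090],
           "ROA(t)": [5.100, 3.740, 4.290, 5.790, 4.320]}
f_stats = {"ROA(t-1)": [22.184, 12.250, 16.484, 28.944, 16.728],
           "ROA(t)": [26.010, 13.988, 18.404, 33.524, 18.662]}

# Panel C chi-square statistics for the regression-based tests.
chi2_stats = {"b = 0": [35.849, 14.887, 22.428, 36.556, 13.793],
              "b - c = 0": [66.883, 51.869, 60.380, 76.508, 57.976],
              "b - c - d = 0": [89.524, 68.748, 79.861, 97.626, 74.961]}

def max_squared_t_gap():
    """Largest absolute gap between t squared and the reported F-statistic."""
    return max(abs(t ** 2 - f)
               for key in t_stats
               for t, f in zip(t_stats[key], f_stats[key]))

def strongest_test(model_index):
    """Label of the test with the largest statistic for one model."""
    candidates = {**chi2_stats, **{"matched " + k: v for k, v in f_stats.items()}}
    return max(candidates, key=lambda name: candidates[name][model_index])

print("max |t^2 - F|:", max_squared_t_gap())
for i, model in enumerate(MODELS):
    print(model, "->", strongest_test(i))
```

Any gap between the squared t-statistics and the F-statistics should reflect only rounding in the reported values.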

A potential shortcoming of the tests in Table 3 is that only a subset of the AAERs examined involves explicit allegations of upward earnings management via working capital accruals. Many of the AAERs involve other types of earnings management such as management of noncurrent assets and liabilities. Of course, the fact that there is no explicit allegation of working capital accrual management does not rule out the possibility that such accruals are managed. Managers will likely use within-GAAP earnings management in multiple accrual categories before resorting to the type of egregious GAAP violations targeted by the SEC. Nevertheless, we would expect to see greater evidence of working capital accrual management in cases where the AAER has explicitly targeted working capital accruals. Confirming that this is the case would also help to corroborate that our tests are picking up earnings management.

Table 4 replicates the tests in Table 3 for a subsample of 122 of the original 406 AAER firm-years in which the SEC explicitly alleges that a working capital account (e.g., accounts receivable, inventory, or accounts payable) was managed. In this sample, we set PART = 1 for the 122 firm-years and PART = 0 for all other firm-years (including the 284 other AAER firm-years). The results indicate that the point estimates of the amount of earnings management and associated reversals are all greater than in Table 3.21 For example, using the Healy model, the estimated coefficients b, c, and d are 8.4%, –8.6%, and –8.9% in Table 4 versus 5.7%, –6.6%, and –5.5% in Table 3. Note, however, that the statistical significance is lower in all tests, due to the vastly reduced sample size.

Table 4. 
The Power of Models to Detect Earnings Management in a Subset of the AAER Sample Involving Allegations of Working Capital Accruals Manipulation
 Healy   Jones   Modified Jones   DD   McNichols
  1. This table repeats the tests reported in Table 3 for 122 AAER firm-years that are alleged to have manipulated earnings through working capital accruals (i.e., accounts receivable, inventory, or accounts payable). Tests of earnings management are conducted using the models below (characters in parentheses are Compustat mnemonics): Healy: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + e_{i,t}$, where:

  2. $WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$ = working capital accruals

  3. ΔCA= change in current assets (act)

  4. ΔCL= change in current liabilities (lct)

  5. ΔCash= change in cash (che)

  6. ΔSTD= change in short-term debt (dlc)

  7. A= total assets (at)

  8. PARTi,t= partitioning variable that is set to 1 for 122 AAER firm-years that are alleged to have manipulated earnings through working capital accruals (i.e., accounts receivable, inventory, or accounts payable)

  9. PARTP1i,t= partitioning variable that is set to 1 for the first year following the final earnings management year and 0 otherwise

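  10. PARTP2i,t= partitioning variable that is set to 1 for the second year following the final earnings management year and 0 otherwise. Jones: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + e_{i,t}$, where:

  11. $\Delta REV_{i,t} = (Revenue_{i,t} - Revenue_{i,t-1})/A_{i,t-1}$

  12. $PPE_{i,t} = PP\&E_{i,t}/A_{i,t-1}$

  13. Revenue (sale) and PP&E = gross property, plant, and equipment (ppegt). Modified Jones: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + e_{i,t}$, where:

  14. $\Delta REV_{i,t} = ([Revenue_{i,t} - Revenue_{i,t-1}] - [\text{Net Accounts Receivable}_{i,t} - \text{Net Accounts Receivable}_{i,t-1}])/A_{i,t-1}$. DD: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1 CF_{i,t-1} + f_2 CF_{i,t} + f_3 CF_{i,t+1} + e_{i,t}$, where:

  15. $CF_{i,t}$ = earnings before extraordinary items$_{i,t}$ (ib) $- WC\_ACC_{i,t}$.
    McNichols: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1 \Delta REV_{i,t} + f_2 PPE_{i,t} + f_3 CF_{i,t-1} + f_4 CF_{i,t} + f_5 CF_{i,t+1} + e_{i,t}$, where:

  16. $\Delta REV_{i,t} = (Revenue_{i,t} - Revenue_{i,t-1})/A_{i,t-1}$.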

  17. For each performance matching model, discretionary accruals are estimated based on the respective model below:

  18. Healy: Discretionary accruals =WC_ACCi,t.

  19. Jones: Discretionary accruals are residuals from WC_ACCi,t=a+f1ΔREVi,t+f2PPEi,t+ei,t.

  20. Modified Jones: Discretionary accruals are residuals from WC_ACCi,t= a +f1ΔREVi,t+f2PPEi,t+ei,t.

  21. DD: Discretionary accruals are residuals from WC_ACCi,t=a+f1CFi,t-1+f2CFi,t+f3CFi,t+1+ei,t.

  22. McNichols: Discretionary accruals are residuals from WC_ACCi,t=a+f1ΔREVi,t+f2PPEi,t+f3CFi,t-1+f4CFi,t+f5CFi,t+ 1+ei,t.

  23. Panel A reports the coefficients and t-statistics for each model based on a pooled regression using 161,119 observations between 1971 and 2003. PART is set equal to 1 for 122 AAER firm-years that are alleged to have manipulated earnings through working capital accruals (i.e., accounts receivable, inventory, or accounts payable), (0 otherwise), PARTP1 is set equal to 1 for the first year following those 122 AAER firm-years (0 otherwise), and PARTP2 is set equal to 1 for the second year following the 122 AAER firm-years (0 otherwise). If earnings are managed for several consecutive years, PARTP1 (PARTP2) is set equal to 1 for the first (second) year following the final year of earnings management. Panel B reports the mean performance-matched discretionary accruals and associated t-statistics for 122 AAER firm-years that are alleged to have manipulated earnings through working capital for each model. For performance matching on ROA(t), observations are matched with another observation from the same two-digit SIC code with the closest return on assets in the current year (t). Performance-matched discretionary accruals are then computed by subtracting the matched firm's discretionary accruals. Performance matching on ROA(t− 1) is done in a similar manner except that ROA in year t− 1 are used. Panel C reports results of testing earnings management for 122 AAER firm-years that are alleged to have manipulated earnings through working capital accruals based on PART (b= 0), PART and PARTP1 (bc= 0), PART, PARTP1, and PARTP2 (bcd= 0), and performance-matched discretionary accruals. For reversal models, a pooled cross-sectional regression is run using all 161,119 observations and χ2-tests are conducted based on heteroskedasticity-consistent standard errors with χ2-statistics and associated p-values being reported. Note that b is the coefficient on PART, c is the coefficient on PARTP1, and d is coefficient on PARTP2. 
For performance matching models, t-tests are conducted for performance-matched discretionary accruals for 122 AAER firm-years that are alleged to have manipulated earnings through working capital accruals and F-statistics and associated p-values are reported by squaring t-statistics to maintain comparability with χ2-statistics used for reversal models.

Panel A: Coefficients (t-statistics) for each model
b, coefficient on PART0.0840.0590.0690.0740.051
 (4.774)(3.648)(4.023)(4.743)(3.642)
c, coefficient on PARTP1−0.086−0.077−0.084−0.085−0.076
 (−4.438)(−3.995)(−4.207)(−4.745)(−4.221)
d, coefficient on PARTP2−0.089−0.079−0.086−0.078−0.070
 (−4.930)(−4.343)(−4.605)(−5.320)(−4.745)
ΔRevenue 0.1090.089 0.101
  (73.743)(53.004) (74.946)
PPE −0.009−0.008 −0.014
  (−10.930)(−9.166) (−18.051)
CFO (t− 1)   0.0850.096
    (31.261)(34.234)
CFO (t)   −0.264−0.250
    (−54.733)(−52.386)
CFO (t+ 1)   0.1410.122
    (39.651)(34.739)
Adjusted R20.07%13.18%6.77%19.96%31.06%
Panel B: Performance-matched discretionary accruals (t-statistic) for each model
Mean discretionary accruals0.0670.0470.0550.0650.046
 performance matched on ROA(t− 1)(3.260)(2.400)(2.700)(3.510)(2.580)
Mean discretionary accruals0.0830.0640.0710.0760.058
 performance matched on ROA(t)(3.950)(3.280)(3.440)(4.150)(3.380)
Panel C: Significance levels of test statistics (p-values) for each model
χ2-statistic for b= 022.79213.30916.18522.49413.263
 (0.000)(0.000)(0.000)(0.000)(0.000)
χ2-statistic for bc= 042.19529.21933.80044.84931.020
 (0.000)(0.000)(0.000)(0.000)(0.000)
χ2-statistic for b − c − d= 066.36347.93854.86472.14452.651
 (0.000)(0.000)(0.000)(0.000)(0.000)
F-statistic for 0 discretionary10.6285.7607.29012.3206.656
 accruals performance matched on ROA(t− 1)(0.002)(0.018)(0.008)(0.001)(0.011)
F-statistic for 0 discretionary15.60310.75811.83417.22311.424
 accruals performance matched on ROA(t)(0.001)(0.001)(0.001)(0.000)(0.001)

To summarize, the results in this section indicate that tests incorporating reversals of earnings management display substantially higher power in the AAER sample. The biggest increases come from modeling the reversal in the first subsequent year, with additional improvements from modeling reversals in the second year. The results also highlight that the choice of nondiscretionary accrual model has relatively little impact on test power. In addition, it appears that the controls for nondiscretionary accruals can actually extract some discretionary accruals, further reducing test power. Remember, however, that these models are also designed to mitigate misspecification stemming from correlated nondiscretionary accruals. Thus, the choice between competing models should also consider model specification in the absence of earnings management. Our next set of results examines misspecification in more detail.
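The reversal test summarized above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it covers only the Healy specification (an intercept plus the three mutually exclusive dummies), it uses the White (HC0) estimator as one possible heteroskedasticity-consistent covariance, and the function name `healy_reversal_test` is hypothetical. With dummy-only regressors, the OLS coefficients are differences in group means and the HC0 variance of each group mean is the sum of squared residuals divided by the squared group size, so no matrix algebra is needed.

```python
import math


def healy_reversal_test(wc_acc, part, partp1, partp2):
    """Wald chi-square test of b - c - d = 0 in
    WC_ACC = a + b*PART + c*PARTP1 + d*PARTP2 + e,
    assuming the three dummies are mutually exclusive."""
    groups = {0: [], 1: [], 2: [], 3: []}  # 0 = all other firm-years
    for y, p, p1, p2 in zip(wc_acc, part, partp1, partp2):
        key = 1 if p else 2 if p1 else 3 if p2 else 0
        groups[key].append(float(y))
    if any(len(values) == 0 for values in groups.values()):
        raise ValueError("PART, PARTP1, PARTP2 and the baseline all need observations")

    mean = {g: sum(v) / len(v) for g, v in groups.items()}
    # HC0 variance of a group mean: sum of squared OLS residuals / n_g^2.
    var = {g: sum((y - mean[g]) ** 2 for y in v) / len(v) ** 2
           for g, v in groups.items()}

    b, c, d = (mean[g] - mean[0] for g in (1, 2, 3))
    estimate = b - c - d  # equals mean1 - mean2 - mean3 + mean0
    variance = var[1] + var[2] + var[3] + var[0]
    chi2 = estimate ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square(1)
    return {"b": b, "c": c, "d": d, "estimate": estimate,
            "chi2": chi2, "p_value": p_value}
```

Extending the sketch to the Jones, DD, or McNichols specifications would require a general OLS routine with a sandwich covariance matrix, because the additional regressors are continuous.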

5.5 Specification of Tests for Earnings Management in Samples with Extreme Economic Characteristics

As described earlier, a common problem that arises in tests of earnings management is the omission of determinants of nondiscretionary accruals. In particular, the hypothesized determinants of earnings management are often correlated with economic characteristics that may influence nondiscretionary accruals. Earnings management tests that do not control for these correlated determinants of nondiscretionary accruals will be misspecified (Problem 2 in section 2.1). DSS illustrate this problem by showing that samples of firms with extreme earnings and cash flow performance have rejection rates that deviate substantially from the specified test levels. Commonly proposed solutions to this problem include adding the omitted variables to the nondiscretionary accruals models (e.g., Jones, Modified Jones, DD, and McNichols models) and the performance matching procedures suggested by Kothari, Leone, and Wasley [2005]. These procedures have several limitations. Most importantly, they require the researcher to know the relevant omitted variables. Yet, existing research only provides a rudimentary understanding of the determinants of nondiscretionary accruals. Also, controlling for these determinants can reduce test power both through the incorrect modeling of discretionary accruals as nondiscretionary, and by increasing standard errors through the inclusion of irrelevant variables.

Incorporating accrual reversals in tests of earnings management provides an alternative approach for mitigating model misspecification. Most firms are going concerns, and so their economic characteristics tend to persist. As such, any associated nondiscretionary accruals should also persist. Earnings management, on the other hand, must ultimately reverse. If the researcher has reasonable priors concerning the timing of reversals, tests incorporating reversals should mitigate correlated omitted variable bias.
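A stylized calculation may help make this argument concrete. The notation is introduced here for illustration only and does not appear in the paper: let $\theta$ denote discretionary accruals in the earnings management year, $\rho_1\theta$ and $\rho_2\theta$ the amounts that reverse in the following two years, and $m_0$, $m_1$, $m_2$ the omitted nondiscretionary accruals in years 0, 1, and 2 that are correlated with PART.

```latex
\begin{aligned}
E[\hat b] &= \theta + m_0, \qquad
E[\hat c] = -\rho_1\theta + m_1, \qquad
E[\hat d] = -\rho_2\theta + m_2, \\
E[\hat b - \hat c] &= (1 + \rho_1)\,\theta + (m_0 - m_1), \\
E[\hat b - \hat c - \hat d] &= (1 + \rho_1 + \rho_2)\,\theta + (m_0 - m_1 - m_2).
\end{aligned}
```

Under the null ($\theta = 0$), the standard test is biased by $m_0$, while the reversal tests are biased by $m_0 - m_1$ and $m_0 - m_1 - m_2$. If the omitted component persists, so that $m_1$ is close to $m_0$, the first difference is close to zero. Under the alternative, the reversal terms add to the signal. In this stylized setting the second-year term can overcorrect when $m_1 + m_2$ exceeds $m_0$.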

Figure 4 illustrates the intuition behind the above argument. DSS document that accruals are correlated with contemporaneous earnings performance. Figure 4 plots mean working capital accruals for firms in the highest and lowest deciles of earnings performance (i.e., ROA) in event year 0. The firms with high earnings performance have positive accruals in year 0 and the firms with low earnings performance have negative accruals in year 0. But, we also see that these patterns in accruals tend to persist in the surrounding years. This pattern can be contrasted to the accruals for the AAER earnings management sample in Figure 3. Accruals are also high in year 0 for this sample, but they exhibit a strong reversal in the next two periods, which is the trademark of earnings management. As long as the omitted determinants of nondiscretionary accruals do not reverse, incorporating accrual reversals provides a general solution for mitigating associated misspecification in tests of earnings management. The two key advantages of this approach are: (1) the underlying determinants of nondiscretionary accruals do not need to be identified and (2) the power of the tests for earnings management increases (recall that the performance matching approach, in contrast, reduces test power).

Figure 4. Time series of mean working capital accruals for the highest and lowest earnings performance deciles.
Working capital accruals are defined as follows (Compustat mnemonics in parentheses):
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$,
where:
ΔCA = change in current assets (act)
ΔCL = change in current liabilities (lct)
ΔCash = change in cash (che)
ΔSTD = change in short-term debt (dlc)
A = total assets (at).
This figure compares the time-series trend of the mean working capital accruals (WC_ACC) for the highest earnings performance decile to that of the lowest earnings performance decile. Earnings performance is defined as earnings before extraordinary items (ib) divided by beginning total assets (at). We first rank 209,530 observations between 1950 and 2009 into decile portfolios based on earnings performance. We randomly select 100 firm-years from the highest earnings performance decile and designate them as year 0 and calculate the mean level of working capital accruals. We then determine the mean level of working capital accruals for each relative year. This procedure is repeated 1,000 times and the figure reports the mean level of working capital accruals across the 1,000 iterations for each relative year. The same procedure is performed for the lowest earnings decile. For the time-series trend of working capital accruals based on the total observations, all 209,530 firm-years are designated as year 0 and the mean working capital accruals are computed for year –5 through year 5.
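The working capital accruals definition used throughout can be written as a small helper. This is a sketch under the assumption that each firm-year is held as a dictionary keyed by the Compustat mnemonics listed in the caption; the function name and the example figures are hypothetical.

```python
def working_capital_accruals(current, prior):
    """WC_ACC = (dCA - dCL - dCash + dSTD) / lagged total assets.

    `current` and `prior` are dicts for years t and t-1 with the
    Compustat items act, lct, che, dlc, and at.
    """
    if prior["at"] == 0:
        raise ValueError("lagged total assets must be nonzero")
    delta = {item: current[item] - prior[item]
             for item in ("act", "lct", "che", "dlc")}
    numerator = delta["act"] - delta["lct"] - delta["che"] + delta["dlc"]
    return numerator / prior["at"]


# Hypothetical firm: current assets rise by 60, current liabilities by 20,
# cash by 10, and short-term debt by 5, on lagged total assets of 1,000.
prior_year = {"act": 400.0, "lct": 200.0, "che": 50.0, "dlc": 30.0, "at": 1000.0}
current_year = {"act": 460.0, "lct": 220.0, "che": 60.0, "dlc": 35.0, "at": 1100.0}
example_wc_acc = working_capital_accruals(current_year, prior_year)  # (60 - 20 - 10 + 5) / 1000
```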

Table 5 formally evaluates the ability of tests incorporating accrual reversals to mitigate misspecification using simulations on samples with extreme earnings performance. Note that we have selected the sample based on the same variable that Kothari, Leone, and Wasley [2005] use for performance-matching (i.e., earnings performance). Thus, we have simulated the exact situation where their performance-matching technique should be most effective by construction. The more interesting question we address here is the effectiveness of modeling accrual reversals in this situation.

Table 5. Rejection Rates When Earnings Management Years Are Selected from the Highest and Lowest Deciles of Earnings Performance

Healy (test level: one-tailed 5%) | Significantly Positive, Highest | Significantly Positive, Lowest | Significantly Negative, Highest | Significantly Negative, Lowest
b = 0 | 98.8%** | 0.0%** | 0.0%** | 86.9%**
b − c = 0 | 27.8%** | 0.0%** | 0.0%** | 30.3%**
b − c − d = 0 | 7.2%** | 1.8%** | 1.5%** | 12.1%**
Performance matched on ROA(t−1) | 7.8%** | 4.0% | 3.4%* | 3.7%
Performance matched on ROA(t) | 6.9%* | 5.5% | 3.1%* | 4.1%

Jones (test level: one-tailed 5%) | Significantly Positive, Highest | Significantly Positive, Lowest | Significantly Negative, Highest | Significantly Negative, Lowest
b = 0 | 48.7%** | 0.0%** | 0.0%** | 76.9%**
b − c = 0 | 4.6% | 0.2%** | 3.6%* | 27.2%**
b − c − d = 0 | 1.2%** | 1.4%** | 9.2%** | 13.4%**
Performance matched on ROA(t−1) | 7.1%** | 5.2% | 3.3%* | 3.6%*
Performance matched on ROA(t) | 5.5% | 5.4% | 3.5%* | 4.5%

Modified Jones (test level: one-tailed 5%) | Significantly Positive, Highest | Significantly Positive, Lowest | Significantly Negative, Highest | Significantly Negative, Lowest
b = 0 | 69.9%** | 0.0%** | 0.0%** | 73.5%**
b − c = 0 | 12.1%** | 0.2%** | 1.0%** | 26.1%**
b − c − d = 0 | 4.1% | 1.8%** | 4.1% | 12.5%**
Performance matched on ROA(t−1) | 7.3%** | 5.0% | 3.3%* | 3.9%
Performance matched on ROA(t) | 5.8% | 5.7% | 2.9%** | 4.3%

DD (test level: one-tailed 5%) | Significantly Positive, Highest | Significantly Positive, Lowest | Significantly Negative, Highest | Significantly Negative, Lowest
b = 0 | 100.0%** | 0.0%** | 0.0%** | 99.3%**
b − c = 0 | 62.0%** | 0.0%** | 0.0%** | 57.9%**
b − c − d = 0 | 20.5%** | 0.4%** | 0.2%** | 25.3%**
Performance matched on ROA(t−1) | 9.2%** | 3.3%* | 2.3%** | 5.4%
Performance matched on ROA(t) | 7.9%** | 3.6%* | 3.0%** | 5.3%

McNichols (test level: one-tailed 5%) | Significantly Positive, Highest | Significantly Positive, Lowest | Significantly Negative, Highest | Significantly Negative, Lowest
b = 0 | 89.3%** | 0.0%** | 0.0%** | 96.8%**
b − c = 0 | 20.7%** | 0.0%** | 0.1%** | 54.1%**
b − c − d = 0 | 6.8%* | 0.4%** | 2.2%** | 27.5%**
Performance matched on ROA(t−1) | 7.2%** | 4.3% | 3.3%* | 4.3%
Performance matched on ROA(t) | 5.5% | 4.2% | 3.8% | 5.6%

** and * significantly different from the specified test level at 1% and 5% level, respectively, using a two-tailed binomial test.

This table reports rejection rates for the null hypothesis of no earnings management for observations in the highest and lowest deciles of earnings performance (i.e., earnings before extraordinary items (ib) divided by beginning total assets (at)). Tests of earnings management are conducted using the following models (characters in parentheses are Compustat mnemonics):

Healy: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + e_{i,t}$, where:
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$ = working capital accruals
ΔCA = change in current assets (act)
ΔCL = change in current liabilities (lct)
ΔCash = change in cash (che)
ΔSTD = change in short-term debt (dlc)
A = total assets (at)
PART = partitioning variable that is set to 1 for randomly selected 100 firm-years and 0 otherwise
PARTP1 = partitioning variable that is set to 1 in the year following randomly selected 100 firm-years and 0 otherwise
PARTP2 = partitioning variable that is set to 1 in the second year following randomly selected 100 firm-years and 0 otherwise.

Jones: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$, where:
$\Delta REV_{i,t} = (Revenue_{i,t} - Revenue_{i,t-1})/A_{i,t-1}$
$PPE_{i,t} = PP\&E_{i,t}/A_{i,t-1}$
Revenue = sales (sale) and PP&E = gross property, plant, and equipment (ppegt).

Modified Jones: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$, where:
$\Delta REV_{i,t} = ([Revenue_{i,t} - Revenue_{i,t-1}] - [Net\ Accounts\ Receivable_{i,t} - Net\ Accounts\ Receivable_{i,t-1}])/A_{i,t-1}$.

DD: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,CF_{i,t-1} + f_2\,CF_{i,t} + f_3\,CF_{i,t+1} + e_{i,t}$, where:
$CF_{i,t}$ = earnings before extraordinary items (ib) − $WC\_ACC_{i,t}$.

McNichols: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + f_3\,CF_{i,t-1} + f_4\,CF_{i,t} + f_5\,CF_{i,t+1} + e_{i,t}$, where:
$\Delta REV_{i,t} = (Revenue_{i,t} - Revenue_{i,t-1})/A_{i,t-1}$.

For each performance matching model, discretionary accruals are estimated based on the respective model below:
Healy: Discretionary accruals = $WC\_ACC_{i,t}$.
Jones: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$.
Modified Jones: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$, with ΔREV defined net of the change in net accounts receivable as above.
DD: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1\,CF_{i,t-1} + f_2\,CF_{i,t} + f_3\,CF_{i,t+1} + e_{i,t}$.
McNichols: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + f_3\,CF_{i,t-1} + f_4\,CF_{i,t} + f_5\,CF_{i,t+1} + e_{i,t}$.

All 209,530 firm-year observations between 1950 and 2009 are first grouped into decile portfolios based on earnings performance. Rejection rates for the null hypothesis of no earnings management are reported based on PART (b = 0), PART and PARTP1 (b − c = 0), and PART, PARTP1, and PARTP2 (b − c − d = 0). We randomly select 100 firm-years from the highest earnings performance decile as earnings management years, and PART is set equal to 1 for those 100 firm-years (0 otherwise). PARTP1 (PARTP2) is set equal to 1 for the first (second) year following the 100 earnings management years (0 otherwise). A pooled cross-sectional regression is run using all 209,530 observations and χ2-tests are conducted based on heteroskedasticity-consistent standard errors. This procedure is repeated 1,000 times and the table reports the percent of the 1,000 regressions where the χ2-statistic is significantly positive (or significantly negative) at the 5% level using a one-tailed test. A binomial test is performed to determine whether the percentage of rejections is significantly different from the specified 5% test level. The same procedure is performed for the lowest earnings performance decile. For performance matching on ROA(t), each of 100 randomly selected observations is matched with another observation from the same two-digit SIC code with the closest return on assets in the current year (t). Performance-matched discretionary accruals are then computed by subtracting the matched firm's discretionary accruals. Performance matching on ROA(t−1) is done in a similar manner except that ROA in year t−1 is used. This procedure is repeated 1,000 times and the table reports the percent of the 1,000 repetitions where the t-statistic is significantly positive (or significantly negative) at the 5% level using a one-tailed test. A binomial test determines whether the percentage is significantly different from the specified 5% test level. The same procedure is performed for the lowest earnings performance decile.

The left-hand side of Table 5 reports rejection frequencies using a 5% test level for the alternative hypothesis of positive earnings management. Consistent with previous research, we see that the standard test for b = 0 has excessively high rejection frequencies in the highest earnings performance decile for all five models. For example, the Healy model rejects 98.8% of the time using a 5% test level. We also see that, as expected, performance matching substantially mitigates this problem. For example, the rejection frequencies for the Healy model drop to 7.8% matching on ROA(t−1) and 6.9% matching on ROA(t). More importantly, we also see that modeling reversals substantially alleviates the excessive rejection frequencies. For example, using the Healy model, the rejection frequencies drop to 27.8% when testing b − c = 0 and 7.2% when testing b − c − d = 0.
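Whether a simulated rejection frequency differs from the specified 5% test level can be checked with a two-tailed binomial test on the 1,000 repetitions. The sketch below is an illustration only: the paper does not state how the two tails are defined, so it assumes the common convention of summing the probabilities of all outcomes that are no more likely than the observed count, and the function names are hypothetical.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k rejections in n trials, computed in logs
    so that large n does not overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def two_tailed_binomial_p(rejections, trials=1000, level=0.05):
    """Exact two-tailed p-value for H0: true rejection rate equals `level`.

    Assumed convention: sum the probabilities of every outcome whose
    probability does not exceed that of the observed count.
    """
    observed = binomial_pmf(rejections, trials, level)
    threshold = observed * (1.0 + 1e-7)  # tolerance for floating-point ties
    total = 0.0
    for k in range(trials + 1):
        prob = binomial_pmf(k, trials, level)
        if prob <= threshold:
            total += prob
    return min(1.0, total)
```

For instance, `two_tailed_binomial_p(72)` would evaluate a 7.2% rejection frequency (72 rejections in 1,000 repetitions) against the 5% level.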

The right-hand side of Table 5 reports rejection frequencies for the alternative hypothesis of negative earnings management. Consistent with previous research, the standard test for b = 0 has excessively high rejection frequencies in the lowest earnings performance sample for all five models. For example, the Healy model rejects 86.9% of the time using a 5% test level. We also see that, as expected, performance matching substantially mitigates this problem. More importantly, we again see that modeling reversals substantially alleviates the excessive rejection frequencies. For example, using the Healy model, the rejection frequencies drop to 30.3% when testing b − c = 0 and 12.1% when testing b − c − d = 0. Note that, in this particular context, performance matching on ROA is more effective at mitigating misspecification. But remember that we have simulated the exact situation where performance matching on ROA should work. We now move to tests where we select samples based on other economic characteristics to evaluate the extent to which these results generalize.

We select four alternative economic characteristics for our additional specification tests. We select three characteristics that are previously shown to result in misspecified tests in Kothari, Leone, and Wasley [2005]. These three characteristics are sales growth, size (i.e., market capitalization at the end of the period), and cash flows (i.e., earnings before extraordinary items less working capital accruals, deflated by beginning of period total assets). The fourth characteristic is the consensus analyst forecast of long-term earnings growth. McNichols [2000] identifies this characteristic as an important correlated omitted variable in tests for earnings management.

Table 6 reports the simulated rejection frequencies using a 5% test level for samples of 100 firm-years randomly selected from the extreme deciles of each of these characteristics. It is useful to interpret these rejection frequencies in conjunction with the graphs in Figure 5, which show event-time plots of mean working capital accruals for the extreme decile portfolios of each of these characteristics. Panel A (B) of Table 6 reports the frequency of significantly positive (negative) rejections of the null hypothesis of no earnings management. The first column of results is for sales growth. We can see from Figure 5(A) that high sales growth firms tend to have positive accruals in year 0. Thus, we should expect to see excessive rejection frequencies for tests of positive earnings management in samples of firms with high sales growth. The potential reduction in misspecification from modeling a reversal depends on the persistence of the associated accruals. Figure 5(A) indicates that high accruals in year 0 remain high (i.e., persist) over the next two years. Our reversal test that subtracts the level of accruals in year 1 from the level of accruals in year 0 should therefore reduce the likelihood that the Healy model will reject in favor of positive discretionary accruals. This intuition is confirmed for the Healy model in panel A of Table 6, where the standard test for b = 0 is rejected 88.8% of the time in the highest sales growth sample. The rejection frequencies drop to 50.4% when testing b − c = 0 and 37.4% when testing b − c − d = 0. We also see that performance matching does little to mitigate this misspecification, with performance matching on ROA(t) only lowering rejection frequencies from 88.8% to 78.8%. Incorporating reversals, however, significantly reduces misspecification. Thus, while none of the tests eliminates the misspecification, incorporating reversals helps more than performance matching.
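For comparison, the performance matching procedure referred to above can be sketched as follows. This is a minimal illustration based only on the description in the table notes (same two-digit SIC code, closest ROA, subtract the matched firm's discretionary accruals). The record layout, field names, and function names are hypothetical, and the sketch does not restrict matches to the same fiscal year because the notes do not state such a restriction.

```python
import math


def performance_matched_accruals(sample, pool):
    """Subtract from each sample firm-year's discretionary accruals ('da')
    those of the pool observation in the same two-digit SIC code with the
    closest ROA. Records are dicts with keys 'id', 'sic', 'roa', 'da'."""
    matched = []
    for obs in sample:
        peers = [cand for cand in pool
                 if cand["id"] != obs["id"]
                 and cand["sic"] // 100 == obs["sic"] // 100]
        if not peers:
            continue  # no industry peer available; observation is dropped
        peer = min(peers, key=lambda cand: abs(cand["roa"] - obs["roa"]))
        matched.append(obs["da"] - peer["da"])
    return matched


def mean_and_t_stat(values):
    """Mean and t-statistic for the null of a zero mean (needs len >= 2)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, mean / math.sqrt(variance / n)
```

Squaring the resulting t-statistic gives the F-statistic that the tables report for comparability with the χ2-statistics of the reversal models.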

Table 6. Rejection Rates When Earnings Management Years Are Selected from the Highest and Lowest Deciles of Various Economic Characteristics

Panel A: Rejection rates in the direction of positive earnings management

Healy (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 88.8%** | 0.0%** | 0.0%** | 0.0%** | 0.0%** | 75.1%** | 56.6%** | 0.0%**
b − c = 0 | 50.4%** | 0.1%** | 6.4% | 1.2%** | 0.0%** | 76.3%** | 24.5%** | 3.4%*
b − c − d = 0 | 37.4%** | 0.3%** | 41.8%** | 1.5%** | 0.0%** | 70.6%** | 21.1%** | 45.8%**
Performance matched on ROA(t−1) | 76.2%** | 0.1%** | 0.2%** | 1.6%** | 0.0%** | 96.2%** | 32.7%** | 0.8%**
Performance matched on ROA(t) | 78.8%** | 0.0%** | 0.2%** | 1.1%** | 0.0%** | 96.0%** | 34.0%** | 0.9%**

Jones (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 5.0% | 1.9%** | 0.0%** | 0.5%** | 0.0%** | 72.4%** | 8.7%** | 3.9%
b − c = 0 | 3.7% | 8.2%** | 3.6%* | 2.0%** | 0.0%** | 75.0%** | 12.7%** | 5.7%
b − c − d = 0 | 3.4%* | 11.2%** | 13.4%** | 2.4%** | 0.0%** | 68.8%** | 18.7%** | 3.6%*
Performance matched on ROA(t−1) | 5.6% | 13.0%** | 0.4%** | 3.4%* | 0.0%** | 92.3%** | 10.5%** | 3.8%
Performance matched on ROA(t) | 6.5%* | 13.0%** | 0.3%** | 2.8%** | 0.0%** | 92.8%** | 7.3%** | 3.3%*

Modified Jones (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 14.2%** | 1.1%** | 0.5%** | 0.5%** | 0.0%** | 79.9%** | 21.0%** | 4.9%
b − c = 0 | 9.4%** | 4.8% | 6.6%* | 1.9%** | 0.0%** | 79.4%** | 21.9%** | 6.6%*
b − c − d = 0 | 8.3%** | 8.0%** | 18.0%** | 2.4%** | 0.0%** | 73.6%** | 27.4%** | 4.3%
Performance matched on ROA(t−1) | 18.4%** | 4.1% | 0.4%** | 2.3%** | 0.0%** | 94.4%** | 15.1%** | 2.5%**
Performance matched on ROA(t) | 23.3%** | 4.2% | 0.2%** | 1.9%** | 0.0%** | 94.8%** | 14.1%** | 1.9%**

DD (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 89.5%** | 0.0%** | 0.0%** | 0.0%** | 0.0%** | 21.5%** | 39.3%** | 0.0%**
b − c = 0 | 40.7%** | 0.0%** | 6.4% | 1.2%** | 0.0%** | 18.6%** | 16.3%** | 1.5%**
b − c − d = 0 | 31.1%** | 0.2%** | 55.6%** | 1.3%** | 0.0%** | 20.9%** | 19.0%** | 49.0%**
Performance matched on ROA(t−1) | 82.8%** | 0.1%** | 0.0%** | 2.3%** | 0.0%** | 87.5%** | 30.0%** | 0.2%**
Performance matched on ROA(t) | 84.2%** | 0.0%** | 0.0%** | 2.7%** | 0.0%** | 87.4%** | 33.2%** | 0.3%**

McNichols (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 5.9% | 0.5%** | 0.0%** | 0.4%** | 0.0%** | 20.0%** | 3.9% | 2.5%**
b − c = 0 | 2.0%** | 7.5%** | 3.6%* | 1.8%** | 0.0%** | 15.9%** | 9.0%** | 2.7%**
b − c − d = 0 | 2.1%** | 11.5%** | 19.8%** | 2.1%** | 0.0%** | 16.6%** | 20.1%** | 2.1%**
Performance matched on ROA(t−1) | 7.6%** | 15.5%** | 0.0%** | 5.1% | 0.0%** | 79.8%** | 9.4%** | 1.5%**
Performance matched on ROA(t) | 7.5%** | 15.3%** | 0.1%** | 5.0% | 0.0%** | 78.8%** | 7.8%** | 3.1%**

Panel B: Rejection rates in the direction of negative earnings management

Healy (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 0.0%** | 98.6%** | 52.9%** | 54.7%** | 100.0%** | 0.0%** | 0.0%** | 93.9%**
b − c = 0 | 0.0%** | 46.8%** | 4.0% | 14.4%** | 100.0%** | 0.0%** | 0.2%** | 8.4%**
b − c − d = 0 | 0.2%** | 17.6%** | 0.1%** | 10.6%** | 100.0%** | 0.0%** | 0.4%** | 0.1%**
Performance matched on ROA(t−1) | 0.0%** | 47.1%** | 37.3%** | 13.8%** | 100.0%** | 0.0%** | 0.3%** | 18.2%**
Performance matched on ROA(t) | 0.0%** | 41.3%** | 34.4%** | 13.2%** | 100.0%** | 0.0%** | 0.2%** | 13.4%**

Jones (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 6.5% | 12.9%** | 35.9%** | 24.4%** | 100.0%** | 0.0%** | 2.7%** | 5.6%
b − c = 0 | 7.8%** | 2.6%** | 7.0%** | 10.3%** | 100.0%** | 0.0%** | 0.9%** | 6.4%
b − c − d = 0 | 6.2% | 1.9%** | 1.4%** | 7.4%** | 100.0%** | 0.0%** | 0.4%** | 8.0%**
Performance matched on ROA(t−1) | 5.9% | 1.0%** | 25.3%** | 7.0%** | 100.0%** | 0.0%** | 2.6%** | 8.8%**
Performance matched on ROA(t) | 3.6%* | 1.1%** | 23.8%** | 6.7%* | 100.0%** | 0.0%** | 3.4%* | 6.5%*

Modified Jones (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 1.9%** | 19.8%** | 22.7%** | 23.3%** | 100.0%** | 0.0%** | 0.8%** | 4.6%
b − c = 0 | 2.1%** | 5.2% | 3.4%* | 10.2%** | 100.0%** | 0.0%** | 0.2%** | 5.5%
b − c − d = 0 | 2.1%** | 2.5%** | 0.8%** | 7.3%** | 100.0%** | 0.0%** | 0.1%** | 7.3%**
Performance matched on ROA(t−1) | 1.5%** | 4.9% | 29.0%** | 8.6%** | 100.0%** | 0.0%** | 1.3%** | 12.2%**
Performance matched on ROA(t) | 0.5%** | 4.9% | 25.4%** | 9.2%** | 100.0%** | 0.0%** | 1.6%** | 7.6%**

DD (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 0.0%** | 99.7%** | 61.1%** | 57.8%** | 99.3%** | 0.4%** | 0.3%** | 99.0%**
b − c = 0 | 0.0%** | 49.4%** | 2.3%** | 15.2%** | 97.2%** | 1.0%** | 0.6%** | 11.7%**
b − c − d = 0 | 0.2%** | 19.7%** | 0.1%** | 11.4%** | 93.1%** | 0.3%** | 0.8%** | 0.0%**
Performance matched on ROA(t−1) | 0.0%** | 45.1%** | 59.5%** | 10.9%** | 100.0%** | 0.0%** | 0.2%** | 30.7%**
Performance matched on ROA(t) | 0.0%** | 38.0%** | 57.2%** | 9.3%** | 100.0%** | 0.0%** | 0.0%** | 21.0%**

McNichols (test level: one-tailed 5%) | Sales Growth, Highest | Sales Growth, Lowest | Size, Highest | Size, Lowest | Cash Flows, Highest | Cash Flows, Lowest | Earnings Growth, Highest | Earnings Growth, Lowest
b = 0 | 4.7% | 19.3%** | 40.4%** | 27.5%** | 100.0%** | 0.6%** | 7.5%** | 8.0%**
b − c = 0 | 9.5%** | 3.1%** | 4.2% | 10.6%** | 98.4%** | 1.4%** | 2.6%** | 7.2%**
b − c − d = 0 | 8.5%** | 2.4%** | 0.6%** | 8.8%** | 94.1%** | 0.7%** | 0.8%** | 12.1%**
Performance matched on ROA(t−1) | 3.4%* | 0.9%** | 42.9%** | 4.8% | 100.0%** | 0.0%** | 2.9%** | 11.8%**
Performance matched on ROA(t) | 2.7%** | 0.5%** | 43.3%** | 4.7% | 99.8%** | 0.0%** | 3.9% | 8.7%**

** and * significantly different from the specified test level at 1% and 5% level, respectively, using a two-tailed binomial test.

This table reports rejection rates for tests of the null hypothesis of no earnings management for observations in the highest and lowest deciles of various economic characteristics. Sales growth is the current year sales divided by the sales in the prior year. Size is the market value at fiscal year end. Cash flows are earnings before extraordinary items (ib) less working capital accruals (WC_ACC). Earnings growth is the median of analysts' long-term earnings growth forecast from I/B/E/S for the last month of the fiscal year. Tests of earnings management are conducted using the following models (characters in parentheses are Compustat mnemonics, and the same definitions apply to all models):

Healy: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + e_{i,t}$, where:
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$ = working capital accruals
ΔCA = change in current assets (act)
ΔCL = change in current liabilities (lct)
ΔCash = change in cash (che)
ΔSTD = change in short-term debt (dlc)
A = total assets (at)
PART = partitioning variable that is set to 1 for randomly selected 100 firm-years and 0 otherwise
PARTP1 = partitioning variable that is set to 1 in the year following randomly selected 100 firm-years and 0 otherwise
PARTP2 = partitioning variable that is set to 1 in the second year following randomly selected 100 firm-years and 0 otherwise.

Jones: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$, where:
$\Delta REV_{i,t} = (Revenue_{i,t} - Revenue_{i,t-1})/A_{i,t-1}$
$PPE_{i,t} = PP\&E_{i,t}/A_{i,t-1}$
Revenue = sales (sale) and PP&E = gross property, plant, and equipment (ppegt).

Modified Jones: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$, where:
$\Delta REV_{i,t} = ([Revenue_{i,t} - Revenue_{i,t-1}] - [Net\ Accounts\ Receivable_{i,t} - Net\ Accounts\ Receivable_{i,t-1}])/A_{i,t-1}$, with net accounts receivable (rect).

DD: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,CF_{i,t-1} + f_2\,CF_{i,t} + f_3\,CF_{i,t+1} + e_{i,t}$, where:
$CF_{i,t}$ = earnings before extraordinary items (ib) − $WC\_ACC_{i,t}$.

McNichols: $WC\_ACC_{i,t} = a + b\,PART_{i,t} + c\,PARTP1_{i,t} + d\,PARTP2_{i,t} + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + f_3\,CF_{i,t-1} + f_4\,CF_{i,t} + f_5\,CF_{i,t+1} + e_{i,t}$, where:
$\Delta REV_{i,t} = (Revenue_{i,t} - Revenue_{i,t-1})/A_{i,t-1}$.

For each performance matching model, discretionary accruals are estimated based on the respective model below:
Healy: Discretionary accruals = $WC\_ACC_{i,t}$.
Jones: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$.
Modified Jones: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + e_{i,t}$, with ΔREV defined net of the change in net accounts receivable as above.
DD: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1\,CF_{i,t-1} + f_2\,CF_{i,t} + f_3\,CF_{i,t+1} + e_{i,t}$.
McNichols: Discretionary accruals are residuals from $WC\_ACC_{i,t} = a + f_1\,\Delta REV_{i,t} + f_2\,PPE_{i,t} + f_3\,CF_{i,t-1} + f_4\,CF_{i,t} + f_5\,CF_{i,t+1} + e_{i,t}$.

All 209,530 firm-year observations (53,025 firm-year observations for earnings growth) between 1950 and 2009 are first grouped into decile portfolios based on various economic characteristics. Rejection rates for the null hypothesis of no earnings management are reported based on PART (b = 0), PART and PARTP1 (b − c = 0), and PART, PARTP1, and PARTP2 (b − c − d = 0). We randomly select 100 firm-years from the highest decile as earnings management years, and PART is set equal to 1 for those 100 firm-years, 0 otherwise. PARTP1 (PARTP2) is set equal to 1 for the first (second) year following the 100 earnings management years, 0 otherwise. A pooled cross-sectional regression is run using all available observations and χ2-tests are conducted based on heteroskedasticity-consistent standard errors. This procedure is repeated 1,000 times and the table reports the percent of the 1,000 regressions where the χ2-statistic is significantly positive (or significantly negative) at the 5% level using a one-tailed test. A binomial test is performed to determine whether the percentage of rejections is significantly different from the specified 5% test level. The same procedure is performed for the lowest decile. For performance matching models, performance-matched discretionary accruals are computed for 100 randomly selected earnings management firm-years and t-tests are conducted to determine whether the mean performance-matched discretionary accruals are significant at the 5% level (one-tailed test). For performance matching on ROA(t), each of 100 randomly selected observations is matched with another observation from the same two-digit SIC code with the closest return on assets in the current year. Performance-matched discretionary accruals are then computed by subtracting the matched firm's discretionary accruals. Performance matching on ROA(t−1) is done in a similar manner except that ROA in year t−1 is used. This procedure is repeated 1,000 times and we report the percent of the 1,000 repetitions where the t-statistic is significantly positive (or significantly negative) at the 5% level using a one-tailed test. A binomial test determines whether the percentage is significantly different from the specified 5% test level. The same procedure is performed for the respective lowest decile.

Figure 5. Time series of mean working capital accruals for highest and lowest deciles of various economic characteristics.
Working capital accruals are defined as follows (Compustat mnemonics in parentheses):
$WC\_ACC_{i,t} = (\Delta CA_{i,t} - \Delta CL_{i,t} - \Delta Cash_{i,t} + \Delta STD_{i,t})/A_{i,t-1}$,
where:
ΔCA = change in current assets (act)
ΔCL = change in current liabilities (lct)
ΔCash = change in cash (che)
ΔSTD = change in short-term debt (dlc)
A = total assets (at).
This figure compares the time-series trend of the mean working capital accruals (WC_ACC) for the highest decile to that of the lowest decile for various economic characteristics. Sales growth is the current year sales divided by the sales in the prior year. Size is the market value of common equity at fiscal year end. Cash flows are earnings before extraordinary (ib) items less working capital accruals (WC_ACC). Earnings growth is the median of analysts’ long-term earnings growth forecast from I/B/E/S for the last month of the fiscal year. For this figure, there are 209,530 observations between 1950 and 2009 (53,025 firm-year observations for earnings growth). We first rank observations between 1950 and 2009 into decile portfolios based on the economic characteristic. We randomly select 100 firm-years from the highest decile and designate them as year 0 and calculate the mean level of working capital accruals. We then determine the mean level of working capital accruals for each relative year. This procedure is repeated 1,000 times and the figure reports the mean level of working capital accruals across the 1,000 iterations for each relative year. The same procedure is performed for the lowest decile. For the time-series trend of working capital accruals based on the total observations, all 209,530 firm-years are designated as year 0 and the mean working capital accruals are computed for year –5 through year 5.

The next set of results in panel A of Table 6 is for the Jones model. Not surprisingly, the reported rejection frequencies are much closer to the specified test level of 5% using the Jones model. This is because the Jones model explicitly incorporates sales growth as a determinant of nondiscretionary accruals. Results are somewhat similar for the Modified Jones and McNichols models. Finally, the rejection frequencies for tests using the DD model are similar to those for the Healy model, with the tests incorporating reversals being the least misspecified.
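
A small simulation may help show why controlling for sales growth matters. The sketch below is a deliberately simplified stand-in for the models discussed here, not an implementation of them. Accruals are generated with a hypothetical loading of 0.05 on growth and no earnings management, PART flags the top growth decile, and the coefficient on PART is estimated with and without a growth control. All parameter values and names are assumptions.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(42)
n = 5000
growth = [rng.gauss(0.0, 1.0) for _ in range(n)]
cutoff = sorted(growth)[int(0.9 * n)]            # top-decile threshold
part = [1.0 if g >= cutoff else 0.0 for g in growth]
# Hypothetical data-generating process: accruals load on growth,
# and there is no earnings management at all.
accruals = [0.05 * g + rng.gauss(0.0, 0.05) for g in growth]

# Without a growth control, PART absorbs the effect of the omitted variable.
b_no_control = ols([[1.0, p] for p in part], accruals)[1]
# With growth included as a regressor, the coefficient on PART shrinks toward 0.
b_with_control = ols([[1.0, p, g] for p, g in zip(part, growth)], accruals)[1]

print(f"b without growth control: {b_no_control:.4f}")
print(f"b with growth control:    {b_with_control:.4f}")
```

In this toy setting the spurious coefficient on PART comes entirely from the omitted growth term, which is the mechanism the paragraph above attributes to the Jones model's better specification in high sales growth samples.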

The next column of results in Table 6 is for size. We can see from figure 5(B) that large firms tend to have lower than average accruals in year 0 and that these accruals are strongly persistent. Thus, we should expect to see rejection frequencies that are higher than the specified test levels for tests of negative earnings management in samples of large firms (reported in panel B of Table 6). Moreover, because there is strong persistence in size, modeling accrual reversals in tests of earnings management should mitigate the high rejection frequencies. This intuition is confirmed for the Healy model in panel B of Table 6, where the standard test for b= 0 is rejected 52.9% of the time in the highest size sample. Incorporating a one-period reversal significantly reduces misspecification, with the rejection frequency falling to 4.0% for b − c= 0. However, incorporating a two-period reversal results in rejection frequencies falling to an excessively low rate of 0.1% for b − c − d= 0. This perverse result arises because size is a strongly persistent characteristic. Thus, modeling reversals over the next two periods creates the impression of positive earnings management (i.e., the sum of –c and –d is almost twice the magnitude of b). Performance matching, in contrast, does little to mitigate size-related misspecification, with performance matching on ROAt only reducing the rejection frequency to 34.3%. The rejection frequencies for the other models display a similar pattern, with the tests incorporating a one-period reversal consistently proving to be the least misspecified.
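
The overcorrection can be seen in a stylized calculation. As a sketch under assumptions that are not in the text, suppose there is no earnings management, the omitted characteristic shifts mean accruals by $\delta$ in year 0, and that shift carries over to each subsequent year at a hypothetical rate $\phi$. The estimated coefficients on the earnings management and reversal indicators then approximately satisfy:

```latex
\begin{align*}
b &\approx \delta, \qquad c \approx \phi\,\delta, \qquad d \approx \phi^{2}\,\delta \\
b - c &\approx \delta\,(1 - \phi) \;\xrightarrow{\;\phi \to 1\;}\; 0 \\
b - c - d &\approx \delta\,(1 - \phi - \phi^{2}) \;\xrightarrow{\;\phi \to 1\;}\; -\delta
\end{align*}
```

For large firms $\delta < 0$, so with strong persistence the one-period statistic is close to zero while the two-period statistic is roughly $-\delta > 0$. The two-period statistic then looks like positive earnings management, which is why tests for negative earnings management reject far less often than the nominal level. The same expressions show that a negative $\phi$ makes $b - c$ larger in magnitude than $\delta$, so reversals cannot help in that case.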

The next column of results in Table 6 is for cash flows. We can see from figure 5(C) that, in contrast to earnings and sales growth, where both have a positive correlation with accruals, cash flows exhibit a negative correlation with accruals. Note that, from our discussion in section 2.4, incorporating reversals is the least helpful in improving test specification when the correlated omitted variable is negatively serially correlated. Consistent with this expectation, the reversal results in Table 6 show the least improvement for cash flows. Because there is strong mean reversion in accruals associated with extreme cash flows (accruals for low cash flow firms are 0.07 in year 0 and revert to –0.01 in year 1), modeling accrual reversals has limited effectiveness in mitigating misspecification. This intuition is confirmed for the Healy model in panel A of Table 6, where the standard test for b= 0 is rejected 75.1% of the time in the lowest cash flow sample. Incorporating earnings management reversals has little impact on the rejection rates. Consistent with the results in Kothari, Leone, and Wasley [2005], performance matching on ROA actually makes the problem worse, increasing the rejection frequencies from 75.1% to 96.0%. These results are broadly consistent across the other accrual models. The McNichols model is the best specified of all models, because it incorporates a comprehensive set of explanatory variables. The rejection frequency for b= 0 is 20.0% and this declines to 15.9% for the test b − c= 0. Performance matching results in more misspecified tests, increasing rejection rates from 20.0% to 78.8% when matching on ROAt. Thus, while incorporating accrual reversals results in modest improvements, performance matching on ROA exaggerates misspecification.

The final column of results in Table 6 is for long-term earnings growth. We can see from figure 5(D) that high-growth firms tend to have higher than average accruals in year 0, and that these accruals are weakly persistent over the next three periods. Thus, we should expect to see excessive rejection frequencies for tests of positive earnings management in samples of high-growth firms that are mitigated in tests incorporating accrual reversals. This intuition is confirmed for the Healy model in panel A of Table 6, where the standard test for b= 0 is rejected 56.6% of the time in the high-growth sample. Incorporating reversals reduces the rejection frequencies to 24.5% for b − c= 0 and 21.1% for b − c − d= 0. Performance matching is less effective at mitigating misspecification, with performance matching on ROAt reducing the rejection frequencies to 34.0%. These results are broadly consistent using the DD model. The Jones, Modified Jones, and McNichols models, in contrast, result in less excessive rejection frequencies for all tests. This is because these models control for sales growth, alleviating the correlated omitted variable problem.

We close this subsection by summarizing the key findings:

  • 1) Misspecification due to correlated omitted variables associated with economic characteristics is a pervasive problem in tests for earnings management. There is no panacea for this problem. Performance matching only mitigates the problem when the matching procedure happens to identify the appropriate omitted variable(s). When it does not, it can make the problem worse.
  • 2) Tests for earnings management that incorporate accrual reversals mitigate misspecification due to correlated omitted variables for a range of economic characteristics. This approach does not require that the correlated economic characteristics be known, but just that they do not reverse in the same period as the earnings management.
  • 3) In the case of highly persistent correlated omitted economic characteristics, incorporating reversals over two or more subsequent periods can lead to an overcorrection problem. Modeling reversals over only one period avoids this problem.
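
The one-period reversal test referred to in these findings can be sketched as follows. This is a minimal illustration, assuming homoskedastic OLS errors and a simulated sample; it is not the authors' code. Discretionary accruals are regressed on PART and PARTR, and the restriction b − c = 0 is tested with a Wald (χ², one degree of freedom) statistic, with the one-sided conversion following footnote 16. Function names, sample sizes, and the 5% earnings management magnitude (chosen large so the example is stable) are assumptions.

```python
import math
import random

def ols_with_cov(X, y):
    """Return OLS coefficients and their covariance matrix s^2 (X'X)^-1."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on [X'X | I].
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    s2 = rss / (n - k)
    cov = [[s2 * v for v in row] for row in inv]
    return beta, cov

def reversal_test(da, part, partr):
    """Test b - c = 0 in DA = a + b*PART + c*PARTR + e.

    Returns (b - c, Wald statistic, one-sided 5% rejection of b - c <= 0).
    """
    X = [[1.0, p, q] for p, q in zip(part, partr)]
    beta, cov = ols_with_cov(X, da)
    diff = beta[1] - beta[2]
    var = cov[1][1] + cov[2][2] - 2.0 * cov[1][2]
    wald = diff ** 2 / var
    # Upper tail of chi-square(1): P(chi2 > w) = erfc(sqrt(w / 2)).
    p_two_sided = math.erfc(math.sqrt(wald / 2.0))
    # Two-sided 10% rejection with the predicted sign gives a one-sided 5% test.
    reject_positive = p_two_sided < 0.10 and diff > 0
    return diff, wald, reject_positive

# Simulated example: 100 managed firm-years that fully reverse in 100 others.
rng = random.Random(7)
n, m, em = 2000, 100, 0.05  # hypothetical sample sizes and magnitude
part = [1.0 if i < m else 0.0 for i in range(n)]
partr = [1.0 if m <= i < 2 * m else 0.0 for i in range(n)]
da = [em * p - em * q + rng.gauss(0.0, 0.10) for p, q in zip(part, partr)]
diff, wald, reject = reversal_test(da, part, partr)
print(f"b - c = {diff:.4f}, Wald = {wald:.2f}, reject at 5%: {reject}")
```

Because the reversal enters with the opposite sign, b − c is about twice the managed amount, which is the source of the power gain. A shift that is common to both periods cancels in the difference.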

6. Conclusions and Implications

A key feature of the accrual accounting process is that accrual distortions in one period must reverse in another period. In the case of working capital accruals that typically span less than a year, such reversals generally occur within a year or so of the original distortion. We provide a flexible framework for incorporating these reversals in tests for earnings management and show that existing accrual-based tests for earnings management can be significantly improved by incorporating accrual reversals.

We close by providing some caveats and guidelines for earnings management research. Our caveats are twofold:

  • 1) The power of accrual-based tests for earnings management is low for earnings management of plausible magnitudes. For example, our simulations indicate that even the most powerful tests that incorporate reversals reject the null hypothesis of no earnings management less than 30% of the time with earnings management equal to 1% of total assets, a sample size of 100 firms and a test level of 5%.
  • 2) Tests for earnings management are susceptible to misspecification due to the omission of correlated determinants of nondiscretionary accruals. There is no panacea for this problem, because models of nondiscretionary accruals are crude and determinants of earnings management are often correlated with economic characteristics that influence nondiscretionary accruals.

In terms of guidelines for future research, we emphasize that the appropriate choice of test for earnings management will depend on the specific research setting being considered. With this in mind, we offer the following broad guidelines:

  • 1) If the researcher has reasonable priors concerning the timing of the reversal of earnings management, then tests incorporating these priors will generally have improved power and specification. In the absence of specific priors, a reasonable assumption is that working capital accruals will reverse in the subsequent year (e.g., Allen, Larson, and Sloan [2010]).
  • 2) In terms of selecting an appropriate model of nondiscretionary accruals, the researcher should consider economic characteristics that are likely to be correlated with the hypothesized earnings management. For example, if the hypothesized earnings management is correlated with firm growth, use of the Jones or Modified Jones models should alleviate omitted variable bias. At the same time, the researcher should also consider the nature of the hypothesized earnings management so as to avoid reducing test power by unintentionally classifying discretionary accruals as nondiscretionary. For example, if earnings are managed through deferred revenues, both the Jones and Modified Jones models will tend to classify the earnings management as nondiscretionary. Similarly, if earnings are managed in order to smooth the underlying cash flows, the DD model will tend to classify the earnings management as nondiscretionary.
  • 3) We suggest caution in the use of performance matching. Performance matching is generally only effective when the correlated omitted variables are known and can therefore be used to identify an appropriate matched pair. Performance matching also entails a significant reduction in test power and so is undesirable in cases where the researcher's loss function places a relatively high weight on Type II errors.

Finally, while we follow prior research in restricting our analysis to aggregate working capital accruals, the reversal framework can be extended to individual working capital accrual accounts and to long-term accruals (e.g., PP&E and purchased goodwill). An advantage of focusing on specific accrual accounts is that researchers could directly observe and measure write-downs of previous accrual overstatements (e.g., inventory write-downs, goodwill impairments).

Footnotes

  • 1

    Dechow and Skinner [2000] review the earnings management literature, discuss the prevalence of earnings management, and provide both academic and practitioner perspectives on earnings management.

  • 2

    Dechow, Ge, and Schrand [2010] provide a recent review of this research.

  • 3

    A growing body of research examines the properties and pricing of accrual reversals (e.g., Defond and Park [2001], Allen, Larson, and Sloan [2010], Baber, Kang, and Li [2011], Fedyk, Singer, and Sougiannis [2011]). To our knowledge, ours is the first study to formally develop and evaluate techniques for measuring earnings management that incorporate accrual reversals.

  • 4

    Intuitively speaking, incorporating reversals in tests of earnings management is equivalent to doubling the number of observations for which earnings management occurs and correspondingly reducing the number of observations for which earnings management does not occur.

  • 5

    Note that, if the correlated omitted variable is highly persistent, then incorporating reversals completely eliminates the associated correlated omitted variable problem. On the other hand, if the correlated omitted variable is completely transitory, incorporating reversals only partially eliminates the associated correlated omitted variable problem. It is only in cases where the correlated omitted variable is highly negatively serially correlated that incorporating reversals would not improve test specification. For further details, see section 2.4.

  • 6

    See chapter 4 of Maddala [2001] for an analysis of the consequences of omitted variables for OLS estimation.

  • 7

    The inequality assumes b > 0. For b < 0, $0 \le \beta_{(-\mu)(PART)} \le -b$.

  • 8

    Assumptions (1), (2), and (3) are made to simplify the analysis and clarify the underlying intuition. Assumption (1) is also required in deriving the test statistics for equation (1) that we use as a benchmark. Assumption (2) serves to clarify that the test statistics assume we correctly identify both the earnings management and reversal periods. The test statistics for equation (1) only require that we correctly identify the earnings management period. We examine the sensitivity of the results to violations of assumption (1) in later simulation tests. Assumption (3) is also required in deriving the test statistics for equation (1). This assumption is required because overlapping earnings management and reversal periods will offset each other, resulting in no net discretionary accruals, thus reducing the power of all tests for earnings management.

  • 9

    Another way to think about the underlying intuition is that, for small ρ, the correlation between PART and PARTR is close to 0. Under such circumstances, PART and PARTR each explain distinct variation in DA, and so each adds to the power of the combined test statistic. But as ρ approaches 0.5, the correlation between PART and PARTR approaches –1. Under such circumstances, PART and PARTR both explain the same variation in DA, and so incorporating PARTR does not increase the power of the combined test statistic.

  • 10

    Note that, with respect to Problem 1, we expect a complete reversal of –μ, because these are discretionary accruals that we missed when they originated, and so we also expect to miss their reversal. Thus, incorporating reversals does not mitigate Problem 1.

  • 11

    Note that this testing procedure differs slightly from the one described in section 2.4, where we create a single new earnings management partitioning variable, PART′ = PART − PARTR. Using PART′ simplifies the analytics in section 2.4 and yields the same result when reversals are symmetric. Incorporating both PART and PARTR allows us to separately observe the estimated magnitude of the earnings management and the associated reversal to evaluate whether the magnitudes are economically plausible.

  • 12

    DSS suggest that this adjustment only be made in years that earnings management is hypothesized. We make the adjustment in all years for two reasons. First, the change in accounts receivable has a positive sample mean, and so only adjusting earnings management years causes the change in sales to be downward biased in earnings management years and discretionary accruals to be upward biased in earnings management years, leading to excessive rejections of the null. We confirmed this fact in unreported tests. Second, when modeling reversals, an adjustment would also be required in reversal years, making the selective adjustment of earnings management–related years cumbersome.

  • 13

    We note that Dechow and Dichev [2002] do not specifically propose that their model be used in tests of earnings management, but subsequent research has adopted it in this context. See McNichols [2002] and Dechow, Ge, and Schrand [2010] for further details.

  • 14

    We examined a number of earnings management cases identified by the SEC in Accounting and Auditing Enforcement Releases to verify that Compustat's unrestated data picks up accrual reversals associated with earnings management. In cases where there is no prior period restatement, the managed accrual reverses naturally (i.e., the managed accrual is removed from the balance sheet and charged to net income in the period that it is discovered). In cases where the discovery of earnings management leads to a prior period restatement, the previous years’ financial statements are restated to remove the managed accrual from the financial statements and the subsequent financial statements are presented as if there never was a managed accrual (i.e., the managed accrual does not appear on the balance sheet and is not charged to net income in the period of discovery). Because we compute accruals using adjacent annual balance sheets, we pick up accrual reversals regardless of whether or not there is a prior period restatement (i.e., in either case, the balance sheet reflects the managed accrual before it is discovered, and does not reflect it after it is discovered). Note that the cash flow statement approach to measuring accruals (see Hribar and Collins [2002]) would not pick up reversals in the case of prior period restatements, because the managed accrual is not charged to earnings in the year of discovery.

  • 15

    The SAS code that we use is: proc surveyselect data=data1 method=seq n=100 out=data2 reps=1000 seed=1347865; run;

  • 16

    For example, if a χ² test rejects the null hypothesis that b − c = 0 at the 10% level and b − c > 0, we register a rejection of the null hypothesis that b − c ≤ 0 at the 5% level.

  • 17

    Consensus analyst forecasts of long-term earnings growth are obtained from I/B/E/S and are only available for a subset of 53,025 firm-years.

  • 18

    The key takeaways are similar using the other models and tests, and so we omit them for brevity.

  • 19

    Other examples where the researcher may have stronger priors concerning the timing of the reversal relative to the origination of the earnings management include “big bath” write-downs by an incoming CEO to reverse earnings management by a predecessor and the reversal of “cookie jar” reserves, whereby earnings are managed slightly downward over an extended time period in order to provide flexibility to manage earnings upward in future challenging periods.

  • 20

    In unreported tests, we confirmed that the accrual reversals for the AAER sample in years t+ 1 and t+ 2 are attributable to the reversal of past asset accruals rather than the origination of new liability accruals. We also confirmed that cash flows for the AAER sample are unusually low in years t+ 1 and t+ 2. These results suggest that the year t+ 1 accrual reversals are due to the reversal of past upward management in asset accruals rather than the origination of new liability accruals or the liquidation of assets.

  • 21

    In unreported tests, we estimate a single panel regression incorporating separate earnings management partitioning variables for the 122 working capital–related AAERs and the 384 other AAERs. Using F-tests for equality of coefficients across the two samples, we find that the coefficients on the partitioning variables are generally significantly higher for the working capital AAERs.
