Institutional embeddedness and the language of accountability: evidence from 20 years of Canadian public audit reports

Due to the expansion of the mandate assigned to public auditors over the past decades, audit reports have become more prominent indicators of the quality of government. Accordingly, it is important to investigate the factors that shape the communication of audit findings. We suggest that while internal and legislative auditors belong to the same community of practice, they are also embedded in distinct institutional environments that incentivize them to report their findings in different ways. In particular, we hypothesize that to draw attention and mobilize support for their work, legislative auditors are encouraged to use a language that is more negative and emotive than internal auditors. Applying methods of computational text analysis to a corpus of 3245 audit reports produced in the Government of Canada between 2000 and 2019, we present empirical evidence in favor of these hypotheses. Among other things, our findings provide large-sample evidence that despite comparable professional norms and guidance, public auditors are sensitive to their institutional context and, in response to their environment, resort to rhetorical strategies to either amplify or mitigate the reputational risks associated with their reports.


INTRODUCTION
Over the last 50 years, the importance and scope of auditing as a tool of governance have increased substantially.
Beginning in the 1970s under the impetus of the New Public Management (NPM) movement, the mandate of auditors expanded from relatively straightforward financial auditing focused on fraud detection and attesting to the reliability of financial statements to a much broader but loosely defined audit universe, where more contestable notions of value-for-money (VFM) and performance are prominent (Johnsen, 2019; Power, 2003; Posner & Shahan, 2014). Auditors are now expected to contribute to the delivery of public accountability and organizational improvement by judging the systems and practices used to manage performance, risks, human resources, procurement, privacy, IT, sustainability, ethics, and more (Funnell, 2015; Sher-Hadar, 2020). As a result, audit reports have become important indicators of the quality of government.
Auditing is typically conceptualized as an independent and objective function, increasingly shaped by common professional norms and standards. However, in practice, the context within which it exists matters, and like other accounting technologies, it is not immune to institutional pressures (Broadbent & Guthrie, 2008). As with other professionals, auditors are bound to respond to the incentives of their work environment, including when reporting their findings. Such pressures should certainly be expected in the public sector, where political polarization, public distrust, and easier access to information mean that audit findings pose significant reputational risks for politicians and public managers (Hay & Cordery, 2018; OECD, 2011; Rika & Jacobs, 2019).
Within the body of contextualized studies on public accounting, the potential influence of institutional pressures on the practices of public auditors has been noted (e.g., Heald, 2018; Johnsen, 2019; Palmer, 2008; Schillemans & Busuioc, 2015). However, empirical evidence remains limited, and studies mostly focus on a small number of cases, cover short periods of time, and rely on surveys and interviews (Arena & Jeppesen, 2016; Funnell, 2015; Rika & Jacobs, 2019; Skaerbaek, 2009; Thomasson, 2018). As Johnsen (2019) notes, audit reports remain surprisingly underexploited as a source of evidence given how central they are to the practice of auditing.
This article addresses some of these gaps through a comparative analysis of legislative and internal audit reports produced in the Canadian government over the last two decades. While both internal and legislative auditors are embedded in a common community of practice, they also operate in significantly different institutional contexts. Legislative auditors, those employed by Canada's Supreme Audit Institution (SAI), report their findings to parliamentarians and do not work in the organizations they audit. In contrast, internal auditors act as independent watchdogs within their organizations and report their findings to senior executives, not politicians.
We argue that despite common professional guidance on reporting, these variations in institutional environments generate incentives for public auditors to use different communication styles when presenting audit findings. Following Rutherford (2005, p. 350), we reason that "the context within which accounting narratives are produced provides both opportunities for, and constraints on, communication" and that the strategic response to these incentives will influence "the tone as well as the content of narrative descriptions." In keeping with this insight, our study does not analyze the themes and substantive arguments of audit narratives but rather focuses on broader rhetorical characteristics of audit reporting language, such as its degree of positivity/negativity (tone), its accessibility (and hence potential for obfuscation), and its emotive qualities (conveying the surprise or displeasure generated by audit findings, for example). We rely on indicators already widely used in computational linguistics and, increasingly so, in accounting, finance, and public administration (Hollibaugh, 2019).
Our research strategy is as follows. First, working from an institutional embeddedness perspective (Granovetter, 1985), we identify the institutional incentives associated with the working environment of public auditors. We posit that legislative auditors can further their influence and status by using language that is simpler, more negative, and relatively more emotive than internal auditors, while internal auditors will tend to use less accessible, more positive, and more clinical language to blunt their criticism and ensure a better reception of their reports by senior executives.
We then test these conjectures by analyzing the language of 3245 audit reports prepared within the Government of Canada between 2000 and 2019. We use natural language processing (NLP) techniques to produce readability, tone, and emotion measures for each report and use statistical approaches to identify differences between types of audits and changes over time. As expected, we find that legislative auditors use language that is more negative and emotive than that of internal auditors, although both legislative and internal audit reports became more positive over time.
However, in contrast to our expectations, we find that both types of reports used more complex language and became less readable over the period.
By empirically demonstrating that legislative auditors use a language that is more negative and emotional than their internal counterparts, our findings give credence to suggestions that legislative auditors strategically invest in a reputation for "toughness" and "hunt for headlines" as a strategy to demonstrate their value and maximize their impact. Conversely, our findings also support those who claim that internal auditors tend to blunt their criticisms to minimize reputational risks for their organizations, either to avoid personal repercussions or to increase the likelihood of a favorable reception by senior executives. More generally, they show that audit communications are context sensitive, thereby contributing to the wider literature emphasizing the importance of context when investigating how accounting techniques are constructed and used in the public sector (Broadbent & Guthrie, 2008).
The article has six sections. Section 2 explains the role of public auditors and the concept of institutional embeddedness, which is used to make predictions about how the language of internal and legislative audit reports is likely to differ due to their distinct contexts. Section 3 explains how the corpus of reports was developed and analyzed. Results are presented and discussed in Sections 4 and 5. Section 6 provides brief conclusions.

THE INSTITUTIONAL CONTEXT OF PUBLIC AUDITING
Several studies have shown that the practice of public sector accounting, which includes auditing, is largely "contextual." Accounting techniques themselves can change in response to the environment, as managers and auditors respond to organizational and paradigmatic shifts in the public sector, such as the rise of NPM (see Bruns et al. (2020) for a recent review). Techniques and related practices can also be harnessed by strategic actors, sometimes in unanticipated ways, to change their environment (Broadbent & Guthrie, 2008). This reflexive relationship between accounting and its context means that to better understand how accounting is used in the public sector, we must carefully consider its institutional environment.
In particular, studies have shown how audit communications are affected by contextual factors and institutional pressures. Triantafillou (2020), for instance, showed that Norway's SAI publishes performance audit reports that are overly critical in tone to bolster its independence and legitimacy. In her study of Sweden's local governments, Thomasson (2018) uncovered strategies used by auditors to quell or amplify public debates surrounding their reports that may harm or improve their own standing and that of their auditees. Svärdsten (2019) found that Sweden's SAI auditors neutralize potential criticism of their performance audits by referring to external sources of authority. More generally, a number of authors contend that public auditors can be insufficiently independent from auditees (Lonsdale, 1999; Power, 2003) and, as a result, tend to "conceal, soften, sugar-coat, or under-report adverse findings" (Kells, 2011, p. 387). In contrast, others criticize SAI auditors for using dramatic language to attract attention to their work (Morin, 2008).
Such responses to institutional norms and incentives are to be expected in most social settings since human agency is partly conditioned by the networks of relations in which agents are embedded (Granovetter, 1985; Lim et al., 2016).
To consider the potential impact of institutional norms and incentives on public auditors, we apply an analytical framework that emphasizes the importance of institutional embeddedness (Lounsbury, 2008;Modell, 2009). From this perspective, individuals pursue personal goals (e.g., self-preservation, social status, and power), but the context in which they live affects how these goals are identified and achieved.
The institutional arrangements and social networks that create norms and incentives affecting behaviors often emanate from the workplace, but they can also derive from the legal environment and communities to which individuals belong. An institutional embeddedness perspective also recognizes that individuals internalize norms to different extents, sometimes reacting strategically to further their personal goals (Oliver, 1991). In other words, our framework reflects the combined role of instrumental rationality and contextual institutional factors in shaping the behavior of auditors (Broadbent & Guthrie, 2008; Granovetter, 1985; Lounsbury, 2008; Schillemans & Busuioc, 2015).
Focusing on their institutional context can provide insights into how public auditors communicate their findings and allow us to predict variations in the language of legislative and internal audit reports. As rational agents, auditors will want to advance their careers and enhance their social status while ensuring their self-preservation. In doing so, they will face a tradeoff between maximizing the impact of their work and avoiding behaviors that can undermine relationships with colleagues, superiors, and other stakeholders. However, legislative and internal auditors experience this tradeoff in different ways due to differences in their institutional environment, and this could be reflected in how they communicate their findings in audit reports.
In this light, we identify key differences in their institutional contexts and use these to predict potential rhetorical strategies observed in audit reports (Lim et al., 2016). Informed by the literature on computational text analysis, we looked for variations in language along three dimensions: the extent to which reports use plain language (readability), the extent to which they sound positive or negative (tone), and the degree to which they convey emotions, such as anger, as opposed to being more clinical and bureaucratic. Our intuition is that despite sharing a community of practice and auditing the same organizations, differences in institutional incentives may lead internal and legislative auditors to adopt different rhetorical strategies. In particular, they may write more plainly or adopt a more critical tone to further their interests.
These language metrics are increasingly applied in accounting, finance, and public administration research (Hollibaugh, 2019). For instance, in finance, variations in readability measures have been associated with strategic behaviors by executives, including information hoarding and deception (Ertugrul et al., 2017). In accounting, measures of tone have been used to find positive biases in corporate reports that are expected to be neutral (Rutherford, 2005) and to explore the impact of more negative oversight hearings on agency morale in the American government (Marvel & McGrath, 2016). More recently, measures of emotions have been used to study the impact of economic conditions on the communication practices of central bankers (Buechel et al., 2019).
Formulating expectations about how audit communications may vary requires paying careful attention to the incentives auditors face since these will vary across countries depending on culture and the design of political and audit institutions. For example, political regimes based on the Westminster model, such as Canada, tend to be more adversarial, creating a more politicized environment for the reception of audit reports than other regimes. Internal auditors are also given different levels of resources and independence across countries. Here, we offer a stylized description of the institutional context faced by Canadian public auditors, although most of its features should also be relevant to the audit systems of other democratic countries.
In thinking about their institutional context, we first recognize that, with few exceptions, public auditors cannot force governments and public executives to do anything (Posner & Shahan, 2014, p. 502). Their influence stems from a combination of relevance, expertise, and legitimacy. This is particularly true of internal auditors who lack the visibility and clout of their legislative counterparts. More concretely, internal auditors shape managerial behaviors by writing reports that are seen as timely, relevant, and persuasive by the senior executives who receive them. Otherwise, their reports are simply dismissed or ignored. Hence, to improve the impact of their work, internal auditors have an incentive to present their findings in a constructive and noncritical manner.
Moreover, internal auditors must also be sensitive to the reputational risks their reports entail, particularly when they expose gaps between stakeholders' expectations and organizational performance. While internal audits are essentially tools of management as opposed to instruments of democratic accountability, embarrassing audit reports can harm the organization if they are proactively disclosed, accessed through freedom of information (FOI) legislation, or leaked to the press (Power, 2003; Rika & Jacobs, 2019). While public releases through FOI requests and leaks are likely across most jurisdictions, the Canadian government has a distinctive policy of proactive disclosure of audit reports, forcing internal auditors to be sensitive to their public reception (Liston-Heyes & Juillet, 2020).
In any case, across all jurisdictions, the possibility that audit reports will be used by outside stakeholders to criticize the organization creates an incentive for internal auditors to tone down the presentation of audit findings. To be useful, internal auditors must report instances of mismanagement, poor performance, and outright fraud, but doing so using a more objective and positive tone (e.g., "improvements are possible") can dampen potential reactions from critics and reduce reputational risks to the organization (Wise & Freitag, 2002). For the same reasons, they are also likely to favor more complex and bureaucratic language that will be understood by senior executives but remain more obscure to noninitiates, further reducing the risk that a given report will be used to embarrass the organization. Finally, internal auditors occupy an uncomfortable position within their organization. As "internal watchdogs," they are tasked with assessing their colleagues' work and reporting their findings to senior executives. Fear of reprisals and ostracization within their own organization further incentivizes the toning down of reports (Liston-Heyes & Juillet, 2019).
Legislative auditors operate in a different context. Their influence also rests on their capacity to deliver reports that are seen as competent and relevant by senior executives responsible for the audited organizations (Reichborn-Kjennerud & Johnsen, 2015). However, unlike internal auditors, legislative auditors can also gain influence through political pressure (Yesilkagit & Van Thiel, 2012). Ministers and civil servants are more likely to adopt legislative audit recommendations if they feel that parliamentary and public reactions are such that they cannot be ignored with impunity (Posner & Shahan, 2014, pp. 502-503; Reichborn-Kjennerud & Johnsen, 2015; Schillemans & Busuioc, 2015).
As a result, SAIs are increasingly "using the media and civil society to enhance the follow-up and impact of their reports" (Shand, 2013, p. 835). On this score, Canada's SAI, the Office of the Auditor General (OAG), may be among the more active communicators. Unlike other jurisdictions, its communication strategies target not only parliamentarians but also the broader public, including through the use of social media (Hazgui et al., 2022). Due to its direct engagement with citizens and high media profile, the OAG has even been criticized for placing itself "above the House of Commons," its formal master (Sutherland, 2003, p. 212). Among Westminster systems, we note that Australia's auditor-general's office has also been found to cultivate a "strategic partnership" with the media as a way of putting pressure on noncooperative auditees, including by adapting their reports' format, style, and vocabulary (Parker et al., 2021, p. 157).
Hence, to maximize the likelihood of impact by mobilizing this broader audience, legislative auditors have an incentive to use as plain a language as possible when communicating their findings to parliamentarians, journalists, and the broader public. More readable reports, clearly written and containing nontechnical terms, will typically have a greater impact (Shand, 2013, p. 834; Stone & Lodhia, 2019). This reality explains why some SAIs, including those of Denmark and Canada, have made greater use of plain language an official objective of their office (Good, 2014; Hazgui et al., 2022).
Legislative auditors also have an incentive to use a more negative tone and appeal to emotions, such as anger, to maximize the likelihood that their reports will be noticed by their stakeholders and mobilize them to pressure ministers and civil servants to adopt their recommendations (Kells, 2011; Thomasson, 2018). Presenting findings using a more negative tone can also enhance perceptions of SAI independence from the government and win it favor with opposition parties (Hong, 2019; Nielsen & Moynihan, 2017). The extent to which a more "combative" style of communication is adopted will vary based on differences in national political and administrative culture. However, there is little doubt of its presence in Canada, where periodic overt conflicts with government have been found to be an "inherent and necessary ingredient" of the work of the OAG (Good, 2014, p. 124). The senior executive of the Privy Council Office (i.e., the cabinet office) who described the OAG as a "vicious beast" in a landmark study of Canadian financial management may overstate the case but illustrates how the senior public service can find the OAG's criticism cutting and inflammatory (Good, 2014, p. 118).
Finally, while excessively critical and emotive language could eventually undermine the credibility of legislative auditors (Cordery & Hay, 2019), such language will rarely expose them to retaliation from their work colleagues.
Unlike those of internal auditors, the career prospects and professional standing of legislative auditors are typically enhanced by drawing public attention to audit reports and strengthening the SAI's reputation as an independent watchdog (Schillemans & Busuioc, 2015).
In summary, legislative and internal auditors are embedded in different institutional environments. These variations create incentives for auditors to report their findings differently, as context mediates how audit technologies are deployed (Broadbent & Guthrie, 2008). In particular, we posit that internal auditors will produce less readable reports that target a narrower range of readers, adopt a more positive tone, and refrain from using emotive language that can heighten reputational risks to their organizations. In contrast, we expect legislative auditors to use more accessible, negative, and emotive language in their reports to attract attention from parliamentarians and the general public and gain more influence.
Despite these differences, we acknowledge that all public auditors still share common professional norms and standards, audit the same universe of organizations, and report on similar issues. These commonalities result in significant overlap in vocabulary. 1 In this regard, our contention is not that the language of internal and legislative auditors will be entirely different but rather that, despite a largely shared vocabulary, they will use the rhetorical flexibility available to them to communicate audit findings differently in response to the distinctive incentives presented by their institutional context. Such leeway is clearly available in modern auditing, especially since auditors have long moved beyond a strict focus on narrow financial audits to include more diversified and malleable reporting on compliance, risk, and performance issues. In Canada, the growing focus on risk and compliance audits has been particularly notable over the last 20 years (Liston-Heyes & Juillet, 2021).

METHODS AND DATA
This project entailed the construction of a database of audit reports and their conversion to a format suitable for computational text analysis. Measures of readability, tone, and emotions were then computed for each report and analyzed statistically to explore differences in the language strategies used in internal and legislative audits and how these evolved through time.

Constructing and preparing the corpus of data
The construction of the database took approximately 6 months. First, we identified and located audit reports produced by the federal government over the past two decades. To do so, we combined an inventory of reports from the Treasury Board Secretariat, departmental audit plans and lists of audit reports posted on government websites, as well as searches of Library and Archives Canada, the government's official publication repository. This process led to a master list of 3521 reports, 2822 of which were collected directly from departmental websites and online archives and through special requests to departments.
The reports were then converted from PDF to text files and preprocessed to reduce noise. The final dataset includes 3245 audit reports (2520 internal and 725 legislative) from 64 organizations published between 2000 and 2019. Overall, we successfully located and converted 92% of the reports on our master list. The corpus of data comprises over 23 million words, with reports averaging 7160 words (24 pages).

Measures of readability
We rely on a number of readability indices to assess the complexity of the language used in audit reports. These indices typically combine measures of language complexity such as average word length, syllable density, and sentence length.
Scores are then compared with scales reflecting reading grade level. Readability scores are valuable in themselves and when used as benchmarks to track differences in narrative styles between groups of documents (Feldman et al., 2010).
For example, Stone and Lodhia (2019) document how the International Integrated Reporting Council's efforts failed to improve the readability of integrated corporate reports, hindering their accessibility to stakeholders. A growing number of studies of public administration are also using readability measures to assess the clarity of communications with stakeholders (Feeney, 2012; Hollibaugh, 2019).
We recognize that readability measures are only a partial proxy for the accessibility of audit reports. By focusing on the syntactical complexity of text, they ignore other potential components of accessibility, such as the extent of disclosure and dissemination. They also disregard reader characteristics that can make the reports easier to understand, such as level of interest and experience with the subject matter (Stone & Parker, 2013). While availability, understandability, and readability can be considered different components of a report's accessibility, readability is the one most firmly under the control of auditors and hence potentially subject to strategic manipulation. Accordingly, readability indices are used as indicators of accessibility in this study.
Since their computation can lead to significant differences in readability scores, we followed Ertugrul et al. (2017) and used several indices, including the Flesch-Kincaid Grade Level, Flesch Reading Ease, SMOG, Gunning Fog, Coleman-Liau, Automated Readability Index, and the Lix index. 1 Loughran and McDonald (2011) also argued that many complex words in finance and accounting are part of an established vocabulary that is understood by readers. For this reason, they advocate the use of simpler measures of complexity, such as document and sentence lengths (Feeney, 2012). Hence, we also computed document length (words, pages), sentence length, and three measures of word complexity: the average number of syllables per word, the proportion of polysyllabic words, and the proportion of long words.
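To illustrate, the main indices can be computed directly from word, sentence, and syllable counts. The sketch below is a simplified stand-in for our actual pipeline: it uses naive regex-based sentence splitting and a vowel-group syllable heuristic rather than the parser-based segmentation described next, and the sample sentence is invented.

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count groups of consecutive vowels (min. one syllable).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)  # average words per sentence
    spw = syllables / len(words)       # average syllables per word
    return {
        # Flesch Reading Ease: higher scores indicate easier text
        "flesch_reading_ease": 206.835 - 1.015 * wps - 84.6 * spw,
        # Flesch-Kincaid Grade Level: approximate U.S. school grade
        "flesch_kincaid_grade": 0.39 * wps + 11.8 * spw - 15.59,
        # Simpler complexity measures in the spirit of Loughran and McDonald (2011)
        "avg_sentence_length": wps,
        "pct_polysyllabic": sum(count_syllables(w) >= 3 for w in words) / len(words),
        "pct_long_words": sum(len(w) >= 7 for w in words) / len(words),
    }

scores = readability(
    "The audit found gaps. Management agreed to act on the recommendations."
)
```

Production code would normally rely on a tested library rather than hand-rolled heuristics, since syllable counting and sentence splitting materially affect the scores, which is precisely why we adopted the tooling described below.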
After testing a number of options, we used "textacy," a Python library built on spaCy that performs a variety of NLP tasks. 2 Textacy uses punctuation, capitalization, and a syntactic parse tree to determine sentence boundaries. It thus produces more consistent and representative measures of sentence length, enhancing the relevance of the readability indices. We also checked for outliers that could significantly influence and potentially bias the results. We found that 21 internal audits had unusually high readability measures. The statistical procedures were conducted with and without these outliers.

Sentiment analysis
We applied sentiment analysis techniques to assess the tone and emotive quality of audit reports. Tone is widely used in the finance and accounting literature (Loughran & McDonald, 2016) and refers to the extent to which a text is written using positive or negative language. Emotive language refers to the use of words that express intense feelings or have the ability to arouse emotions. It can be measured along a number of dimensions, including polarity (the use of highly positive/negative language rather than more neutral and descriptive terms), subjectivity (words typically associated with opinions and judgments rather than objective descriptions), and their association with a range of archetypal emotions. We argue that the use of a more positive/negative tone or more emotive language suggests rhetorical efforts to influence the reception of audit reports.
We used two different techniques to assess tone and emotiveness. First, we used "textblob.sentiments," a sentiment function trained on the Brown corpus and Penn Treebank. 3 This tool focuses on adjectives to measure the polarity and subjectivity of texts. Its polarity indicator ranges from −1 to 1, where 1 and −1 refer to documents that contain many positive/negative adjectives, respectively, while documents with scores fluctuating around 0 are considered neutral in tone. Subjectivity is assessed on a narrower range [0, 1], with 1 indicating documents that are opinionated, emotive, and/or judgmental, whereas 0 indicates more neutral and factual documents (Rajsingh et al., 2018). Textblob.sentiments uses a lexicon of over 94,000 words.
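The underlying logic of this adjective-based approach can be approximated with a miniature lexicon: score the adjectives a document contains and average their polarity and subjectivity values. The entries below are invented for illustration and are not drawn from TextBlob's actual 94,000-word lexicon.

```python
# Illustrative adjective lexicon: word -> (polarity, subjectivity),
# mimicking TextBlob's scales of (-1..1) and (0..1). Entries are hypothetical.
LEXICON = {
    "adequate":   ( 0.3, 0.5),
    "effective":  ( 0.6, 0.6),
    "weak":       (-0.5, 0.6),
    "inadequate": (-0.6, 0.7),
    "serious":    (-0.4, 0.8),
}

def sentiment(text: str) -> tuple:
    """Average polarity/subjectivity over matched words; neutral if none match."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    if not hits:
        return 0.0, 0.0
    n = len(hits)
    return (sum(p for p, _ in hits) / n, sum(s for _, s in hits) / n)

polarity, subjectivity = sentiment("controls were inadequate and oversight weak")
# polarity is negative here, since both matched adjectives carry negative scores
```

The real tool also handles part-of-speech tagging, intensifiers, and multiple senses per word; this sketch only conveys the averaging principle.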
Next, we used a "bag-of-words" approach that relies on lists of words associated with each category of emotions. The count of words related to a given emotion, scaled by the total number of words in the document, is used as the index of that particular emotion. Since there are no dictionaries designed specifically for the analysis of public administration or auditing, we used the Word-Emotion Association Lexicon, a generic dictionary developed by the National Research Council of Canada (NRC). 4 The NRC lexicon contains over 14,000 words and was created by manual annotation on a crowdsourcing platform. 5 It targets tones (positive/negative) and eight basic and prototypical "paired" emotions (anticipation-surprise, trust-disgust, anger-fear, and joy-sadness) (Mohammad & Turney, 2013). According to psychologists, these "primary" emotions underpin all human responses to stimuli and, among other effects, can influence learning and motivation in message recipients (Plutchik, 1980).
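The proportional-count procedure can be sketched as follows. The toy dictionary stands in for the NRC lexicon; its word-emotion associations are illustrative only, not taken from the actual lexicon.

```python
import re
from collections import Counter

# Toy stand-in for the NRC Word-Emotion Association Lexicon; the real lexicon
# maps 14,000+ words to eight emotions plus two tones (positive/negative).
NRC_TOY = {
    "failure": {"sadness", "fear", "negative"},
    "risk":    {"fear", "anticipation", "negative"},
    "improve": {"anticipation", "positive"},
    "fraud":   {"anger", "disgust", "negative"},
}

def emotion_scores(text: str) -> dict:
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for w in words:
        for emo in NRC_TOY.get(w, ()):
            counts[emo] += 1
    # Scale each emotion count by the total number of words in the document.
    return {emo: c / len(words) for emo, c in counts.items()}

scores = emotion_scores("The audit identified a risk of fraud and a risk of failure.")
```

For instance, in the 12-word sample above, three words associated with "fear" yield a fear index of 0.25; in real reports these indices are far smaller, which is why we compare them across groups rather than interpret absolute levels.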
Linguistic studies also show that these emotions are detectable in various forms of text discussing any subject, albeit to varying degrees. As administrative documents, audit reports are expected to convey fewer emotions than literary or political texts. However, while they should score low on these metrics in absolute terms, we are interested in differences between the scores of internal and legislative audits. In this regard, we are only interested in variations within what Rutherford (2005, p. 349) might refer to as a "subgenre" of accounting narratives.

Statistical analyses
Indices of readability and sentiments were computed for each of the reports and added to other report identifiers such as year of publication and a control identifying the report as an internal or legislative audit. 6 Simple independent t-tests and descriptive statistics were carried out on internal and legislative audits. Correlation coefficients and graphs tracking the evolution of readability indices were also produced for each type of audit (Figures 1-5; Tables 1-6).
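The group comparison amounts to an independent-samples t-test on each index. A minimal sketch using Welch's unequal-variances statistic (the scores below are made-up grade-level values, not our data; in practice one would use a statistics package that also reports degrees of freedom and p-values):

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a: list, b: list) -> float:
    """Welch's t-statistic for two independent samples with unequal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    return (mean(a) - mean(b)) / sqrt(va + vb)

# Hypothetical readability (grade-level) scores for two groups of reports.
internal    = [11.2, 12.0, 11.5, 12.4, 11.8]
legislative = [12.6, 13.1, 12.9, 13.4, 12.8]

t = welch_t(internal, legislative)  # negative: internal scores are lower on average
```

A large-magnitude t (relative to the relevant t-distribution) indicates that the difference in group means is unlikely to be due to chance, which is the basis of the significance levels reported below.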

Caveats
We note five methodological caveats to this study. First, as administrative documents, audit reports are not ideally suited for sentiment analysis. Auditors are expected to adopt writing styles that are objective and unemotional when presenting their findings (Vanstraelen et al., 2012). However, this problem is mitigated by the fact that, in contrast to much NLP analysis, our focus is not on absolute measures of emotions and tone but rather on differences between two groups of auditors and potential changes over time (Feldman et al., 2010).
Second, according to Loughran and McDonald (2011, 2016), caution should be exercised when interpreting indicators of positivity in texts when negation is not taken into account. In formal communications, authors will often prefer to negate positive words rather than use a negative word (e.g., "the outcome was not as good" as opposed to "the outcome was bad"). Since accounting for negation surrounding positive words is complex, positivity indices tend to be overstated. More generally, language is context specific, and some words may be perceived to be more or less negative in a public auditing setting than in a general setting. Unfortunately, a lexicon distinctive to public auditing was not available at the time of this study.
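A common partial remedy for the negation problem is to discount positive words preceded by a negator within a small window, as this sketch illustrates (the word lists and window size are illustrative choices, not a standard specification):

```python
NEGATORS = {"not", "no", "never", "without"}
POSITIVE = {"good", "effective", "adequate", "timely"}  # illustrative word list

def positive_count(text: str, window: int = 3) -> int:
    """Count positive words, skipping any negated within `window` preceding words."""
    words = text.lower().replace(".", " ").split()
    count = 0
    for i, w in enumerate(words):
        if w in POSITIVE:
            preceding = words[max(0, i - window):i]
            if not any(n in NEGATORS for n in preceding):
                count += 1
    return count

# A naive count scores "not as good" as positive; the windowed count does not.
naive = sum(w in POSITIVE for w in "the outcome was not as good".split())
adjusted = positive_count("the outcome was not as good")
```

Even this fix is crude, since negation scope depends on syntax rather than a fixed window, which is one reason positivity indices should be interpreted comparatively rather than in absolute terms.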
Third, due to the challenges of parsing documents with varied structures and headings, our analyses are conducted on the entire reports as opposed to shorter key extracts, potentially reducing variations across the indicators. In other words, our findings are based on average tone, emotion, and readability per report, but there may be particular sections that are more or less neutral in tone or easier to read (Silge & Robinson, 2020). Focusing on differences between reports, as opposed to absolute levels, helps mitigate this problem.

Figure 1. Readability measures
Fourth, while we conducted a number of robustness tests assessing the reliability of our findings across different years, department sizes, audit activity levels, and subsections of the dataset (Tables A.2-A.5 in the Supporting Information Appendix), additional controls not included here could help explain some of the observed differences.
Finally, our study does not capture the full range of audit communications. While audit reports are considered the most important method of communicating audit findings, auditors communicate with auditees throughout the audit process. It is also an accepted practice for auditees to provide feedback on drafts of the audit report (Shand, 2013).

Figure 2. Sentence length and word complexity
The findings of legislative audits are also communicated through press conferences, testimony before parliamentary committees, and media debriefs (Gonzalez-Diaz et al., 2013). The accessibility, tone, and emotiveness of the language used by auditors in these forums may differ. Our findings are specific to the content of audit reports and may not be representative of the language used by auditors in all venues.

RESULTS
Table 1 provides the mean scores of the readability indices and the results of independent-samples t-tests of the differences in readability scores between internal and legislative audit reports. For six of the seven indices, these differences are statistically significant at better than the 1% level (p < 0.001). However, while the mean readability scores systematically indicate that legislative reports are more difficult to read, the differences are relatively small. The simpler measures of language complexity advocated by Loughran and McDonald (2011) yield the same general results: while legislative audit reports are 6.5 pages longer on average, the differences in sentence length and word complexity are marginal.

Figure 3. Tone measures
On the whole, we conclude that the readability of legislative and internal audit reports is comparable. We also note that the mean scores are relatively high across the board, meaning that audit reports are considered difficult to read for members of the general public. While there are some variations, most indices suggest that some college education is required to understand an audit report on a first reading. 7 Moreover, these readability scores are similar to those of integrated corporate reports, which are considered insufficiently readable to effectively support accountability to stakeholders and investors (Stone & Lodhia, 2019).
To gain insights into audit readability through time, we plotted annual averages of the indicators in Table 1 for internal and legislative audits (Figures 1 and 2) and computed correlation coefficients between each index and time (Tables 2 and 3). All but two of the correlation coefficients are statistically significant, indicating an increase in report complexity over time. For both types of audits, word complexity increases in line with this trend. While the coefficient on sentence length is positive and statistically significant for legislative audits, it is negative for internal audits. Nonetheless, on the whole, both legislative and internal audit reports have become more complex and less readable over the last 20 years.
Next, we investigated the tone and emotiveness of audit language. As shown in Table 4 and Figures 3-5, there are statistically significant differences in how internal and legislative audit reports are written, with legislative auditors using a more critical tone to report their findings and making greater use of words evoking negative emotions, such as anger, disgust, fear, and sadness. They also use more polarized and subjective language. Meanwhile, internal auditors adopt a more positive tone and make greater use of words that evoke trust. On this basis, we conclude that the institutional environment of legislative auditors leads them to adopt a more critical discourse and present their findings in ways that are more likely to arouse emotions in readers.
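Measures of this kind are typically computed as lexicon-based proportions per report. The sketch below shows the general form of such measures under that assumption, with toy word lists standing in for the TextBlob lexicon and the NRC Emotion Lexicon actually used in the study:

```python
from collections import Counter

# Toy word lists; the study used TextBlob's sentiment lexicon and the
# NRC Emotion Lexicon, not these illustrative sets.
POSITIVE_WORDS = {"effective", "sound", "improved", "adequate"}
NEGATIVE_WORDS = {"deficient", "failure", "weak", "inadequate"}
TRUST_WORDS = {"assurance", "compliance", "reliable"}  # hypothetical NRC subset

def tone(tokens: list) -> float:
    """Net tone of a report: (positive hits - negative hits) / total tokens."""
    counts = Counter(tokens)
    pos = sum(counts[w] for w in POSITIVE_WORDS)
    neg = sum(counts[w] for w in NEGATIVE_WORDS)
    return (pos - neg) / len(tokens)

def emotion_share(tokens: list, lexicon: set) -> float:
    """Share of a report's tokens matching one emotion word list."""
    return sum(1 for t in tokens if t in lexicon) / len(tokens)
```

On this kind of measure, a report whose tokens include more negative than positive lexicon hits receives a tone below zero, and each emotion (anger, trust, etc.) is scored as the share of tokens matching that emotion's word list.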
Focusing on the correlation coefficients presented in Tables 5 and 6, we examined changes in tone and emotiveness over time. We found that, despite their differences, the language of both legislative and internal auditors has become more positive, less subjective, and increasingly evocative of trust over the last 20 years. We also note that trust-related words and words indicating a positive tone are highly correlated (r = 0.685 and r = 0.734 for internal and legislative reports, respectively).

Figure 4. Emotion measures
We tested the robustness of these findings over time, budget size, auditing activity, and across samples randomly selected from the corpus. More specifically, we grouped the reports according to the 5-year period in which they were published.
We then separated the departments according to the size of their annual budgets and tested small, medium, large, and very large departments separately. Similarly, we classified departments according to total auditing activity, that is, the number of reports produced over the study period, and tested these categories independently. Finally, we selected nine random subsamples of the data (three each using 25%, 50%, and 75% of the data). Overall, our results are robust across these subsamples (Tables A.2-A.5 in the Supporting Information online Appendix).
Overall, our findings confirm that the language of legislative auditors is more negative and emotive than the language used by internal auditors. However, differences in levels of readability are marginal, which is somewhat surprising since legislative reports are aimed at a much broader constituency and meant to inform the public. Moreover, the readability of audit reports deteriorated significantly through time across both functions, although less so for internal audits.

DISCUSSION
Our results provide evidence that, as with other aspects of public accounting (Broadbent & Guthrie, 2008), public auditing is sensitive to its institutional context. We add to the extant literature by showing that the communication of audit findings is context sensitive and by using an institutional embeddedness perspective to think through the main incentives provided by the public sector environment in this regard.

Table 4. Tone and emotions: internal (n = 2520) versus legislative audit reports (n = 750). *p < 0.05, **p < 0.01, and ***p < 0.001 indicate levels of statistical significance. The polarity and subjectivity indices, computed with TextBlob, range from −1 to 1.

Writing audit reports involves more than the straightforward description of audits' objectives, methods, and findings. Despite professional standards, auditors have some freedom in how they present and discuss their findings, especially when reporting on nonfinancial audits, which now consume a large portion of their efforts. Our study shows that this flexibility results in significant variations in the readability, tone, and emotive qualities of the language used by public auditors.
These findings may appear counterintuitive if one thinks of public auditors as technicians deploying acontextual and standardized methods to financial data, but less so if they are considered rational, strategic, and context-sensitive agents engaged in less standardized evaluative work. Contemporary auditors undoubtedly understand that audits can become enmeshed in the blame games played in the public sector, involving a broad set of stakeholders who may alternatively feel threatened, discomforted, or strengthened by their content (Justesen & Skaerbaek, 2010). They read their political context and adapt their communication style to amplify or mitigate the reputational risks associated with the reports. To our knowledge, there are no published studies examining differences between the language of audit reports written by SAI and internal auditors. As such, we contribute to a new line of comparative inquiry into public auditing and accountability. Our findings show that the language used to communicate audit findings varies systematically between SAI and internal auditors, with SAI auditors using more negative and emotive language. These results corroborate findings from previous studies on the behavior of SAIs, including Triantafillou's (2020) conclusion that Danish SAI auditors prioritize criticism over constructive, more conciliatory advice to visibly demonstrate independence from the government. They also align with the notion that SAI auditors "hunt for headlines" to attract attention and mobilize stakeholders (Kells, 2011, p. 387) or display a negativity bias (Hong, 2019; Nielsen & Moynihan, 2017), cultivating perceptions of independence over the approval of auditees.
We note, however, that these findings differ from those reported by Svärdsten (2019), who highlights the reluctance of Swedish SAI auditors to make clear negative statements about government performance for fear of being challenged, unless they can refer to external authoritative sources to support their findings. Although the two studies use different approaches, ours suggests that Canadian legislative auditors do not fear expressing more negative views, at least compared to internal auditors. This apparent difference between Canada and Sweden may be related to the level of political legitimacy enjoyed by legislative auditors. In Canada, the SAI benefits from strong public trust and parliamentary support. As a general rule, public criticism of its expertise in response to adverse findings is unlikely to be perceived favorably by parliamentarians or the public. Cross-national computational text analyses could help ascertain whether the tone of audits is more negative in Canada than in other jurisdictions.
With regard to internal auditors, our results indicate that they use more neutral and positive language in their reports. It is a limitation of our method, however, that we cannot distinguish between substantive findings and the language used to describe them. More negative language could simply reflect the straightforward description of more negative findings; conversely, the same negative results might have been framed in more positive language. In fact, the tendency to use positive language to embellish unfavorable facts is precisely why measures of positivity are considered less reliable in the finance literature (Loughran & McDonald, 2016). Nonetheless, our comparative research design demonstrates that although they investigate similar issues, internal auditors are either reporting more positive findings or describing similar adverse findings in more positive terms than their legislative counterparts.
These results also indicate more subservience from internal auditors, suggesting a relative lack of independence or a more general fear of being isolated within their organization as a result of criticizing colleagues and exposing them to greater reputational risks (Liston-Heyes & Juillet, 2020). These findings give credence to the perception of internal auditors as "lapdogs," mostly providing the illusion of assurance and oversight (Lonsdale, 1999, p. 175; Power, 2003).
However, it is also possible that these linguistic choices reflect a conscious strategic decision to adopt a more constructive approach to feedback, one that focuses on building trust and persuading, rather than confronting, executives into improvement or compliance.
In any case, to the extent that institutional incentives discourage the clear and frank communication of negative findings, a practical implication of our study is that governments and public executives should work to provide auditors with institutional environments more conducive to expressing, receiving, and learning from negative audit findings. Such conditions may include stronger independence from line managers and senior executives, achieved by locating auditors outside the audited organizations, and the involvement of more powerful audit committees composed of external members. However, fostering an organizational culture that is less risk averse and more open to the honest discussion of reported shortcomings may be equally important.
Our results also indicate that both types of audit reports became more positive over time. This may reflect growing skepticism among auditors about the effectiveness of blame-and-shame strategies. Faced with the limited attention capacity of parliamentarians and the media, legislative auditors may find, like internal auditors, that more positive and constructive language facilitates the voluntary adoption of audit recommendations. Indeed, Morin (2008) found that extensive and embarrassing press coverage does not necessarily lead to lasting impacts on audited organizations.
Finally, we found that legislative and internal audit reports exhibit similar levels of readability, and that the readability of both has deteriorated over the last 20 years. Contrary to our expectations, differences in institutional environments do not lead legislative auditors to strive for more clarity or internal auditors to strategically cultivate opacity when communicating their findings. In fact, readability scores indicate that both types of reports are difficult for the average citizen to read (Stone & Lodhia, 2019). In the Canadian context, this result is particularly surprising for legislative audits since the OAG, like SAIs in other jurisdictions, has explicitly sought to increase its use of plain language since the early 2000s (Good, 2014, p. 130; Hazgui et al., 2022).
A potential explanation is that the growing complexity of the public sector encourages auditors to adopt a more opaque technical-professional language (Schillemans & Van Twist, 2016). It may also be that there are few incentives to improve the readability of reports despite official policy, since audit reports remain primarily read by highly educated readers with good knowledge of government (e.g., politicians, journalists), who in turn relay audit findings to the general public. Hence, if the SAI wishes to appeal directly to the public, it can do so through media interviews and social media postings accompanying the release of audits. On this score, Canada's OAG may have a more active communication strategy than other SAIs (Hazgui et al., 2022), potentially fulfilling its institutional incentive (and democratic imperative) to better communicate with citizens.
Lastly, while we believe that the institutional incentives discussed above would be found in many countries (e.g., internal auditors fearing the creation of reputational risks, the importance of public perceptions for SAIs), careful consideration of national specificities is paramount in any study emphasizing the importance of institutional context.
In this regard, it is possible that some of our results are distinctively Canadian. For example, the proactive disclosure of reports may represent an unusual level of transparency for internal auditing. Moreover, the media standing of Canada's OAG, accrued through a history of high-profile clashes with governments, may be unusually high among SAIs. In the period under study, hard-hitting audits and highly mediatized interventions by the OAG played important roles in several political controversies, including one that contributed to the downfall of the government (Good, 2014, p. 123).
Given this history, it may be that the Canadian OAG finds it easier to adopt a more negative tone in its reporting when seeking to draw attention to important findings. That being said, combative relationships between governments and legislative auditors appear rather common, at least among Westminster systems. For example, Australia's auditors general fought "many protracted battles with government" (Funnell, 2015, p. 94), while the tensions between the Audit Office and the Treasury during New Zealand's NPM revolution have been well documented (Pallot, 2003). Cross-national studies are needed to uncover the extent to which the rhetorical styles of auditors vary across jurisdictions and whether any eventual differences are attributable to differences in institutional context.

CONCLUSION
Our study confirms the value of recognizing public auditors as institutionally embedded social agents. Like ministers, executives, and parliamentarians, auditors are sensitive to the role that adverse audit findings can play in the blame games of the public sector. In light of these stakes, they respond to the incentives presented by their institutional environment through variations in the type of language used to present audit findings. By empirically tracking these linguistic variations, we contribute to the literature on public management, accountability, and auditing by providing further evidence that, despite common professional norms and standards, auditing practices are influenced by their organizational and social context.
Our study also represents a unique investigation of the differences in how legislative and internal auditors perform their roles. It demonstrates that legislative auditors use more negative and emotive language in writing their reports than internal auditors and posits that such linguistic patterns are consistent with the institutional embeddedness of these two accountability functions. In particular, legislative auditors write reports that are more negative in tone and likely to grab the attention of parliamentarians, the media, and the public. In contrast, internal auditors minimize the reputational risks to their organization and facilitate cooperation by using relatively more positive language. Somewhat surprisingly, we find that both functions are writing reports that average citizens would find increasingly difficult to read, despite government-wide mandates to enhance transparency.
Finally, our study also provides a first application of computational text analysis to a corpus of public sector audits, demonstrating how valuable information can be extracted from very large volumes of bureaucratic documents. As the drive toward digital and open government incentivizes public administrations to release more documents, the methods used here could facilitate the uncovering of trends and patterns in practices, discourses, and culture, generating fresh insights into new and old problems of public accountability.

ACKNOWLEDGMENT
We gratefully acknowledge funding received from the Social Sciences and Humanities Research Council of Canada (SSHRC)-grant #435-2019-0102. We are very grateful for research assistance received from Matt Beach, Sakshi Gupta, Jessica Lim, and Somen Dutta as well as helpful comments from David Zussman, Terrence Liston, and Saif Mohammad.

ORCID
Catherine Liston-Heyes https://orcid.org/0000-0002-7322-8562

NOTES
1 For details of the measures, see https://readable.com/features/readability-formulas/.
2 See https://github.com/chartbeat-labs/textacy.
3 TextBlob uses a dictionary from: https://github.com/sloria/TextBlob/blob/61e8441d5a58bbbc0de8d2a66262451baf08fc2d/textblob/en/en-sentiment.xml.
4 We also used a sentiment lexicon specialized for financial disclosures (Loughran & McDonald, 2011). However, we found that over 88% of reports had zero counts in each sentiment category, highlighting the evolution of audit reports away from purely financial concerns. For these reasons, we omit the L&M indicators in the remainder of the analyses.
5 For additional details, see https://nrc.canada.ca/en/research-development/products-services/technical-advisory-services/sentiment-emotion-lexicons.
6 In Canada, legislative audits are mainly conducted by the Office of the Auditor General (Canada's SAI), although a small number of audits are conducted by the Office of the Commissioner of Official Languages and the Public Service Commission. The latter two conduct external independent audits of public organizations' compliance with official languages and staffing laws, respectively. All these audits are tabled before parliamentary committees.
7 The indices are calibrated in years of education, except for the Flesch_reading_ease, which is scored on a 1-100 scale that decreases with the difficulty of a text, and the Lix indicator, which simply adds the average length of words to the average sentence length.