Ignoring non‐English‐language studies may bias ecological meta‐analyses

Abstract Meta-analysis plays a crucial role in syntheses of quantitative evidence in ecology and biodiversity conservation. The reliability of estimates in meta-analyses strongly depends on unbiased sampling of primary studies. Although earlier studies have explored potential biases in ecological meta-analyses, biases in reported statistical results and associated study characteristics published in different languages have never been tested in environmental sciences. We address this knowledge gap by systematically searching published meta-analyses and comparing effect-size estimates between English- and Japanese-language studies included in existing meta-analyses. Of the 40 published ecological meta-analysis articles authored by researchers affiliated with Japanese institutions, we find that three meta-analysis articles searched for studies in the two languages and involved sufficient numbers of English- and Japanese-language studies, resulting in four eligible meta-analyses (i.e., four meta-analyses conducted in the three meta-analysis articles). In two of the four, effect sizes differ significantly between the English- and Japanese-language studies included in the meta-analyses, causing considerable changes in overall mean effect sizes and even their direction when Japanese-language studies are excluded. The observed differences in effect sizes are likely attributable to systematic differences in reported statistical results and associated study characteristics, particularly taxa and ecosystems, between English- and Japanese-language studies. Despite being based on a small sample size, our findings suggest that ignoring non-English-language studies may bias outcomes of ecological meta-analyses, due to systematic differences in study characteristics and effect-size estimates between English- and non-English-language studies. We provide a list of actions that meta-analysts could take in the future to reduce the risk of language bias.


| INTRODUCTION
Global environmental change threatens ecosystems and biodiversity around the world (Ceballos et al., 2015;Díaz et al., 2019;WWF, 2018). A sound understanding of ecosystem responses to environmental drivers and human activities is therefore urgently required to inform policy and practice to mitigate against adverse ecological change (Sutherland, Pullin, Dolman, & Knight, 2004). It is increasingly demanded that this understanding draws on rigorous scientific evidence bases, best formed through the unbiased and systematic collation, appraisal, and meta-analysis of primary empirical research (Pullin, 2012;Sutherland et al., 2004). Meta-analysis can provide a powerful set of tools for summarizing the results of multiple studies, quantifying the variation in results among studies, and evaluating whether hypotheses are supported by the assemblage of existing studies (Gurevitch, Koricheva, Nakagawa, & Stewart, 2018;Koricheva, Gurevitch, & Mengersen, 2013).
A common criticism of meta-analysis is the ignorance of potential biases during the search and selection of studies to be quantitatively synthesized (Borenstein, Hedges, Higgins, & Rothstein, 2009). Indeed, for a meta-analysis to obtain a robust estimate of an overall true effect size, a random subset of all relevant primary studies should be included in the analysis. For example, because the nature and direction of a study's results can affect its likelihood of publication (publication bias: Bayliss & Beyer, 2015;CEE, 2018;Higgins & Green, 2011), the omission of unpublished data and gray literature may result in a biased sample of primary studies that give rise to an overestimated overall effect size (McAuley, Pham, Tugwell, & Moher, 2000;Turner, Matthews, Linardatos, Tell, & Rosenthal, 2008). Publication bias and its consequences for meta-analysis are widely recognized, and mitigation measures to minimize this bias exist (Bayliss & Beyer, 2015;CEE, 2018). A much more overlooked bias in evidence synthesis is language bias, wherein the nature and direction of a study's results can affect the chosen language of its publication (Egger et al., 1997;Grégoire, Derderian, & Le Lorier, 1995;Higgins & Green, 2011;Juni et al., 2002).
Omitting studies published in languages other than English, a common practice in meta-analysis, could therefore also lead to a biased sample of primary studies. However, the prevalence and importance of language bias in ecological meta-analyses have never been assessed to date (Livoreil et al., 2017). This is concerning, given that conclusions derived from biased meta-analyses could lead to wasted resources if management actions are ineffectively prescribed or, worse still, may lead to unexpected or even perverse outcomes.
Previous studies of language bias in medical science have revealed differences in statistical results between publication languages (Egger et al., 1997;Grégoire et al., 1995;Juni et al., 2002).
Referred to as "English-language bias" (Egger et al., 1997) or "Tower of Babel" bias (Grégoire et al., 1995), it has been shown that positive or statistically significant results are more likely to be published in English than in other languages (language bias in statistical results in Figure 1). This focus on statistical results is presumably because medical meta-analysts are typically concerned with estimating the overall effects of treatments on a single species (i.e., Homo sapiens) under controlled conditions (e.g., the effectiveness of a drug at reducing symptoms of a disease). In contrast, ecological meta-analyses are typically interested in variation among effect sizes and attributing this variation to meaningful covariates that vary among studies, such as species biogeographical contexts and intervention intensity (i.e., effect modification). Ecological meta-analyses thus typically combine heterogeneous studies on a wide range of organisms and ecosystems (Gurevitch et al., 2018; Koricheva et al., 2013). Doing so may give rise to another type of language bias, if studies with particular characteristics are more likely to be published in non-English languages because, for instance, they are deemed unsuitable for international journals. For example, studies conducted on particular ecosystems, at particular intervention intensities, or conducted by local practitioners who do not speak English, could be systematically omitted from meta-analyses (language bias in study characteristics in Figure 1). Considering that up to 36% of scientific studies on biodiversity conservation are published in languages other than English (Amano, González-Varo, & Sutherland, 2016), and that non-English studies are typically omitted from ecological meta-analyses, an assessment of the impacts of language bias on ecological inferences drawn from meta-analyses is urgently needed.
Here, we address this knowledge gap and assess the risk of language bias on outcomes of meta-analyses. We searched for published peer-reviewed meta-analysis articles that analyzed sufficient numbers of both English-and Japanese-language peer-reviewed studies (10 or more effect-size estimates in each language). The Japanese language was chosen not only for a practical reason (most of the authors are native Japanese-language speakers) but also because the Japanese language is one of the major non-English languages for scientific documentation in biodiversity conservation (Amano et al., 2016). We first tested the differences in effect-size estimates between English-and Japanese-language studies included in eligible meta-analyses, to quantify the impacts of excluding the Japanese-language studies on the overall mean effect sizes. To investigate the possible causes of differences in effect-size estimates between the languages, we then tested for language bias in study characteristics by investigating between-language differences in study characteristics that were deemed important in modifying the effects in the original meta-analyses (i.e., potential effect modifiers).
Next, we quantified differences in effect-size estimates between the languages after controlling for the revealed differences in the study characteristics. Finally, we discuss the processes through which language bias may arise and propose guidelines for incorporating non-English-language studies in ecological meta-analyses.

| Systematic literature search
We searched for ecological meta-analysis articles including both English-language and Japanese-language studies. We performed searches in Web of Science Core Collection (https://webofknowledge.com/), CAB Direct (https://www.cabdirect.org/), and Wiley Online Library (https://onlinelibrary.wiley.com/) using English-language search strings via Bangor University institutional access, and in CiNii (https://ci.nii.ac.jp/), which is the largest and most comprehensive database in Japan, using a Japanese-language search string with no subscription (Appendix S1). We also performed Web-based searches on Google Scholar (in English: https://scholar.google.com/; in Japanese: https://scholar.google.co.jp/) using an English-language search string and a Japanese-language search string (Appendix S1).
Each search string contains meta-analy* OR "meta analy*." We tailored a search string for each bibliographic platform, database, and Web-based search engine (Appendix S1). We used translated versions of the terms for the searches in Japanese (Appendix S1). All searches were conducted in Gwynedd, Wales, United Kingdom, on 19, 24, and

| Eligibility screening
Our search strings retrieved 1,504 unique articles. Duplicates were removed using Mendeley's "Check for Duplicates" tool and manually.
We then screened articles according to titles and abstracts, followed by full texts. We screened these articles to obtain meta-analysis articles on ecological or evolutionary topics, suitable for assessing possible language biases. We included meta-analysis articles that expressed the outcome of multiple studies on a common scale, through the calculation of an "effect size" for each study, which represents the magnitude of a difference between control and treatment means (e.g., log response ratio, standardized mean difference). We included meta-analysis articles that aimed to quantitatively combine effect sizes to yield an overall estimate, or attribute variation in effect sizes to meaningful covariates using meta-regression. Note that a meta-analysis article can include multiple meta-analyses. To identify meta-analysis articles authored by individuals capable of searching literature in both English and Japanese, we included only meta-analysis articles conducted by research teams with at least one author affiliated with a Japanese institution. Finally, the meta-analyses had to include 10 or more effect-size estimates published in both English and Japanese languages (Appendix S4). Articles meeting all of these eligibility criteria were included in our analysis. We used a modified version of the ROSES Flow Diagram for Systematic Review for reporting the number of articles retrieved at each screening stage (Haddaway, Macura, Whaley, & Pullin, 2017) (Appendix S5). We do not report a critical appraisal component of the diagram, because we did not conduct critical appraisal of the meta-analysis articles.
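As an illustration, the two effect-size metrics named above (log response ratio and standardized mean difference) can be computed from treatment and control summary statistics. This is a minimal Python sketch using hypothetical numbers, not data from any of the included studies:

```python
import math

def log_response_ratio(mean_t, mean_c):
    """Log response ratio: ln(treatment mean / control mean)."""
    return math.log(mean_t / mean_c)

def standardized_mean_difference(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference (Cohen's d): difference in
    group means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

# Hypothetical species-abundance summaries in treatment vs. control plots
lrr = log_response_ratio(12.0, 8.0)                                 # ln(1.5)
smd = standardized_mean_difference(12.0, 8.0, 4.0, 4.0, 10, 10)
```

Both metrics place studies measured on different raw scales onto a common scale, which is what makes quantitative combination across studies possible.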

| Data selection
We extracted effect-size estimates provided by four eligible meta-analyses published in three meta-analysis articles (Koshida & Katayama, 2018; Osada et al., 2013; Spake et al., 2019).

FIGURE 1 The fate of ignoring studies published in relevant language(s). Studies providing certain information (e.g., local-scale studies focusing on specific hypotheses on a single species) may be more likely to be published in non-English languages (language bias in study characteristics) because, for example, those studies tend to be conducted by local practitioners or they are often not of great interest from an international perspective. After the analysis, statistically significant or positive results may be more likely to be published in higher-impact, English-language journals (language bias in statistical results)

As a result, we used 58 effect-size estimates: 11 from six English-language studies and 47 from 25 Japanese-language studies from Koshida and Katayama (2018) ("rice-field meta-analysis" from hereon; Table 1 and Appendix S6). We also used 134 effect-size estimates: 100 from 13 English-language studies and 34 from two Japanese-language studies from Osada et al. (2013) ("leaf life span meta-analysis" from hereon; Table 1 and Appendix S7).
From Spake et al. (2019), we used 65 effect-size estimates: 41 from six English-language studies and 24 from three Japanese-language studies on the effect of thinning on abundance of ground-layer plants ("plant forestry meta-analysis" from hereon; Table 1 and Appendix S8), and 41 effect-size estimates: 26 from six English-language studies and 15 from four Japanese-language studies on the effect of thinning on abundance of saplings and seedlings ("sapling forestry meta-analysis" from hereon; Table 1 and Appendix S9).

| Effect-size difference between languages
We first tested homogeneity of variance and normality of effect-size estimates (log response ratio and life span ratio) using Levene's test and the two-sample Kolmogorov-Smirnov test, respectively. We then tested differences in effect-size estimates between English-language and Japanese-language studies, using a two-sample t test, or Welch's two-sample t test (Ruxton, 2006) where the assumption of homogeneity of variance was not met. While these analyses assume independence between effect-size estimates, some studies included in each meta-analysis have multiple effect-size estimates. Although some of those effect-size estimates (e.g., those estimated at different locations) can be treated as independent comparisons, others may not, for example, because they share common control groups. We could not, however, include study as a random effect (a common solution in meta-analyses, e.g., see Spake et al., 2019), because (a) an unbalanced number of effect-size estimates per study (ranging from 1 to 36 in our analyses) can lead to unstable parameter estimates (Harrison et al., 2018), and (b) language is a study-level variable and can therefore be confounded with the random effect of each study.
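Welch's test, used above when group variances differ, replaces the pooled variance with per-group variances and the Welch–Satterthwaite degrees of freedom. A minimal Python sketch, with hypothetical log response ratios rather than values from the eligible meta-analyses:

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's two-sample t statistic and Welch-Satterthwaite df,
    which do not assume equal variances between the two groups."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)   # sample variances (n - 1 denominator)
    se2 = vx / nx + vy / ny             # squared standard error of the difference
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    df = se2**2 / ((vx / nx)**2 / (nx - 1) + (vy / ny)**2 / (ny - 1))
    return t, df

# Hypothetical effect sizes from English- vs. Japanese-language studies
english = [0.10, 0.25, -0.05, 0.30, 0.15]
japanese = [-0.40, -0.10, -0.55, -0.20, -0.35, -0.50]
t, df = welch_t(english, japanese)
```

The resulting t statistic is then compared against a t distribution with the (generally non-integer) Welch degrees of freedom.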
Therefore, our statistical tests may be vulnerable to increased type I errors. However, even if effect-size estimates are not independent of each other, the estimation of a mean effect size is not affected (Borenstein et al., 2009), and thus, the mean effect sizes in each language shown in Figures 2 and 6 should be reliable.
Meta-analyses typically estimate precision-weighted mean effect sizes, with each effect size weighted by the inverse of its variance (in addition to between-study variance). Studies that omit the required information to estimate effect-size variance may thus be excluded (Borenstein et al., 2009). Therefore, as a sensitivity analysis, we repeated the above analyses but on a subset of effect-size estimates from studies that reported corresponding variance measures.

TABLE 1 Details of eligible meta-analyses, the number of studies, and the number of effect-size estimates (total, in English, and in Japanese) included in each meta-analysis. Note: The two meta-analyses in Spake et al. (2019) used some studies in common, and thus, the total number of studies does not equate to the sum of studies used in each meta-analysis. Meta-analysis article: a published article (paper) that conducted at least one relevant meta-analysis. Meta-analysis: a statistical analysis of multiple effect-size estimates measuring the effect of an intervention on a distinct group of subjects. Study: a paper included in a meta-analysis, providing at least one effect-size estimate. Effect-size estimate: effect sizes estimated from data published in original primary studies and used in a meta-analysis.
We did not perform this sensitivity analysis for Osada et al. (2013) because the original meta-analysis did not provide standard deviations of effect-size estimates.
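The precision weighting mentioned above can be sketched as follows. This shows fixed-effect inverse-variance weighting only (the between-study variance component of a random-effects model is omitted for brevity), with hypothetical effect sizes and variances:

```python
def weighted_mean_effect(effects, variances):
    """Fixed-effect precision-weighted mean: each effect size is
    weighted by the inverse of its sampling variance, so precise
    estimates contribute more to the pooled mean."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    var = 1.0 / sum(weights)   # variance of the pooled mean
    return mean, var

# Hypothetical effect-size estimates with their sampling variances
effects = [0.4, -0.2, 0.1]
variances = [0.04, 0.01, 0.05]
m, v = weighted_mean_effect(effects, variances)
```

This is why studies that do not report variances (or the standard deviations needed to compute them) are often dropped from weighted meta-analyses, motivating the sensitivity analysis described above.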

| Language bias in study characteristics
We tested associations between the languages and study characteristics using two-way chi-squared tests and Mann-Whitney U tests.
For Koshida and Katayama (2018), taxa, management types, landscape types, soil types, and outcomes measured were analyzed. For Osada et al. (2013), measurement conditions, study countries, and plant families were analyzed. Study countries were tested because Japanese-language studies are not necessarily conducted in Japan.
In the case of Spake et al. (2019), intervention intensity and stand age were analyzed. We chose these variables because the original meta-analyses treated the variables as potential effect modifiers and they were available for analyses. Note that although strictly speaking these are the characteristics of each effect-size estimate, many of those characteristics are usually determined at the study level; hence, we used the term "study characteristics" instead of "characteristics of effect-size estimates." The statistical tests were conducted in R version 3.5.0 (R Core Team, 2018).
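The association tests for categorical study characteristics can be illustrated with Pearson's chi-squared statistic on a language-by-characteristic contingency table. The counts below are hypothetical, not taken from the analyzed meta-analyses:

```python
def chi_squared_stat(table):
    """Pearson's chi-squared statistic and degrees of freedom for a
    two-way contingency table (rows: publication language;
    columns: levels of a study characteristic)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical counts of effect-size estimates per taxon, by language
#            birds  insects  amphibians  plants
english  = [   8,     5,        1,         0]
japanese = [   4,     6,       12,        10]
stat, df = chi_squared_stat([english, japanese])
```

A large statistic relative to the chi-squared distribution with `df` degrees of freedom indicates that the distribution of the characteristic differs between languages (the Mann-Whitney U tests play the analogous role for ordinal or continuous characteristics such as thinning intensity).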

| Language bias in statistical results
We used two approaches to assess whether effect-size estimates differed between languages after controlling for differences in study characteristics. First, for each meta-analysis, we fitted linear models with effect-size estimates as the response variable and, as fixed factors, publication language in addition to the factors that had a significant association with language (see above); we then compared the two models with and without the fixed factor language. Second, we fitted linear mixed models with effect-size estimates as the response variable, publication language as the fixed factor, and the factors that had a significant association with language as random factors, and tested the significance of the fixed factor using likelihood-ratio tests against the reduced models (Quinn & Keough, 2002). We ran the linear mixed models and performed likelihood-ratio tests using the lme4 package (Bates, Mächler, Bolker, & Walker, 2015) in R version 3.5.0 (R Core Team, 2018).

| Searches and screening
Of the 1,504 unique articles retrieved by our search strings, 40 articles met our inclusion criteria as ecological meta-analyses conducted by at least one author affiliated with a Japanese institution.
These comprised meta-analyses from a wide range of subdisciplines, including forestry, phenology, agriculture, and ecosystem services.

FIGURE 2 Differences in mean effect sizes between English- (blue) and Japanese-language studies (red). (a) Koshida and Katayama (2018) showing the effects of rice-field abandonment on biodiversity (rice-field meta-analysis). (b) Osada et al. (2013) showing the effect of light on plants' leaf life span (leaf life span meta-analysis). (c) Spake et al. (2019) showing the effect of thinning on ground-layer plant abundance (plant forestry meta-analysis). (d) Spake et al. (2019) showing the effect of thinning on sapling and seedling abundance (sapling forestry meta-analysis). The number of effect-size estimates in each language is also shown. The error bars show 95% confidence intervals. Diamonds and dotted lines show mean effect sizes estimated from all studies pooled (i.e., English- + Japanese-language studies)
Only three published meta-analysis articles searched for studies published in both English and Japanese. Thirty-six English-language meta-analysis articles included studies published only in English, while one Japanese-language meta-analysis article synthesized only Japanese-language studies (listed as "Evidence base" for the reason for exclusion in Appendix S10). From the three articles meeting our criteria, four separate meta-analyses had sufficient data to examine potential language bias effects (Table 1; also see Section 2.3 Data Selection in Methods). Japanese-language studies constituted 81% of the effect-size estimates from the rice-field meta-analysis, 25% from the leaf life span meta-analysis, and 37% each from the two forestry meta-analyses (Table 1).

| Effect-size differences between languages
Effect sizes representing the effect of rice-field abandonment on biodiversity differed between English- and Japanese-language studies from the rice-field meta-analysis (Table 2). In contrast, effect sizes did not differ significantly between English-language and Japanese-language studies from the plant forestry and sapling forestry meta-analyses (Table 2). However, mean effect sizes from English-language-only studies in the sapling forestry meta-analysis were 52% and 29% smaller than those estimated from Japanese-language-only studies and from all studies, respectively (Figure 2d). Results differed for comparisons based on the subset of effect-size estimates that were associated with standard deviations (i.e., those that allow weighted meta-analysis). Effect sizes differed significantly between languages in the plant forestry meta-analysis. The mean effect size estimated from Japanese-language-only studies was 318% more positive than the mean effect from English-language-only studies in the plant forestry meta-analysis (t = −2.85; df = 29; p = .008; Appendix S12). The difference remained nonsignificant in the sapling forestry meta-analysis (t = −0.98; df = 27; p = .33).

| Language bias in study characteristics
For the rice-field meta-analysis, although all of the included studies were conducted in Japan, the proportion of Japanese-language studies varied significantly among taxa (χ2 = 25.07; df = 3; p < .001) and landscape types (χ2 = 14.38; df = 1; p < .001; Figure 3). Almost all studies on amphibians, fish, and plants and in complex landscapes were those published in Japanese (Figure 3).
In the forestry meta-analyses, the moderating influences of forest thinning intensity (volume removed, %) and stand age on effect sizes were analyzed (Spake et al., 2019). Despite the fact that all studies were conducted in Japan, thinning intensities were higher for studies published in Japanese than in English for both the plant forestry meta-analysis (U lower = 350; n1 = 41; n2 = 24; p = .049) and the sapling forestry meta-analysis (U lower = 109; n1 = 26; n2 = 15; p = .02; Figure 5a). Stand age was younger in Japanese-language studies for the plant forestry meta-analysis (U lower = 646; n1 = 41; n2 = 24; p = .04), but it did not differ significantly between the languages for the sapling forestry meta-analysis (U lower = 168; n1 = 26; n2 = 15; p = .47; Figure 5b). This difference remained when reanalyzed for the subset of effect-size estimates associated with sample sizes and standard deviations for the plant forestry meta-analysis (U lower = 54.5; n1 = 13; n2 = 18; p = .01); however, stand age did not differ significantly between the languages (U lower = 131; n1 = 13; n2 = 18; p = .58).

TABLE 2 Results of statistical tests for homogeneity of variance, normality, and differences in effect sizes between English- and Japanese-language studies

| Language bias in statistical results
We next tested whether differences in effect sizes among languages remained after controlling for differences in study characteristics between the English-language and Japanese-language studies. In the rice-field meta-analysis, effect sizes did not differ significantly between the languages when the effects of taxa and landscape types were controlled for (Table 3, Figure 6a,b). Thus, the differences in effect sizes between the languages shown in Figure 2a seem to be largely attributable to the differences in study characteristics, together with the differences in effect sizes between taxa and landscape types (Figure 6a,b).
In contrast, effect sizes from the leaf life span meta-analysis still differed among languages after controlling for the effects of measurement conditions, plant families, and study countries (Table 3, Figure 6c). English-language studies tended to report larger effect sizes than the Japanese-language studies (Figure 6c), even for studies conducted only in Japan or those with field measurements, indicating a systematic difference in reported statistical results.
Effect-size estimates associated with standard deviations from the plant forestry meta-analysis differed among languages after controlling for the difference in forest thinning intensity (Table 3, Figure 6d). This again suggests a systematic difference in reported statistical results between languages, but in this case, the Japanese-language studies tended to report larger effect sizes than the English-language studies (Figure 6d).

| DISCUSSION
Our reanalysis of four published meta-analyses has revealed differences in mean effect sizes from studies published in English and Japanese. Specifically, two out of the four multiple-language ecological meta-analyses (and in another, when focusing only on effect-size estimates associated with standard deviations) had effect sizes that differed by language. Although most of the ecological meta-analyses identified in our initial screening neither searched for nor included studies published in multiple languages, Japanese-language studies constituted a high proportion of effect-size estimates included in the four eligible meta-analyses (Table 1). These results suggest that English-language studies do not necessarily comprise a random subset of the global literature, and thus, ignoring non-English-language studies in ecological meta-analyses may lead to biased estimates of overall mean effect sizes, and biased inferences about ecological effects. This is a serious concern, given that meta-analyses are often used to inform decision-making in conservation policy and practice across a range of contexts (Gurevitch et al., 2018).

FIGURE 3 Differences in study characteristics between English- (blue) and Japanese-language studies (orange) for the rice-field meta-analysis (Koshida & Katayama, 2018). Characteristics with a significant difference between the languages are in bold. The number of effect-size estimates in each language is also shown in each bar
Our results also illustrate that language bias in study characteristics could lead to biased estimates of overall mean effect sizes, if non-English-language studies are ignored. This is a potential threat to ecological meta-analyses, which typically synthesize heterogeneous studies on multiple species, conducted at multiple study scales (temporal, spatial) and with different methods (e.g., observational, experimental) (Spake & Doncaster, 2017). It is especially hard, or even impossible, to address the effect of language bias in study characteristics on overall mean effect sizes, when none or only a few effect-size estimates are available in English for some study characteristics (e.g., effect-size estimates on amphibians, fish, and plants in Koshida & Katayama, 2018). As a substantial amount of scientific literature on specific ecosystems and endemic species seems to be published in non-English languages (Amano et al., 2016), omitting non-English-language studies might result in biased samples of study characteristics, potentially leading to invalid inferences. The general risk of language bias in study characteristics might thus be higher for ecology than for medical sciences.

FIGURE 4 Differences in study characteristics between English- (blue) and Japanese-language studies (orange) for the leaf life span meta-analysis (Osada et al., 2013). Characteristics with a significant difference between the languages are in bold. The number of effect-size estimates in each language is also shown in each bar

FIGURE 5 Differences in two study characteristics, (a) thinning intensity and (b) stand age, between English- (blue) and Japanese-language studies (orange) for the plant and sapling forestry meta-analyses (Spake et al., 2019). Black circles show medians, squares show interquartile ranges, and outer lines show ranges. The number of effect-size estimates in each language is also shown below each panel
We discuss four principal reasons that might give rise to language bias in study characteristics for Japan. First, there are well-established Japanese-language journals covering particular taxa (Kidera et al., 2018; Koshida & Katayama, 2018). Second, international journals often demand "generality" or "transferability" of study findings, and may judge studies on local topics in Japan as too system-specific. Therefore, studies on plants, amphibians, and fish might be more likely to have been submitted to, and published in, Japanese-language journals. Third, conservation scientists and applied ecologists may choose to publish in Japanese-language journals in order to target a Japanese audience of policymakers and practitioners, who may not read English-language articles. Finally, available resources (time, human, financial) might influence the choice of language for publication. In the leaf life span meta-analysis, most Japanese-language studies measured leaf life span in the field, while most English-language studies employed experimental designs (Figure 4). As measuring leaf life span in the field is less laborious than conducting experiments, this might suggest that studies conducted by more highly funded research groups are more likely to afford English-language proofing, and to be published in English-language journals. These potential processes through which language bias in study characteristics arises will need to be fully investigated in the future.
We found that the differences in effect sizes between the languages remained after controlling for the influence of covariates investigated in the leaf life span and plant forestry meta-analyses.
In the leaf life span meta-analysis, between-language differences in effect sizes were apparent even for studies conducted only in Japan, or those from field measurements (Figure 6c). This finding might be explained by a general tendency for authors to report smaller effects in non-English languages, as has been observed in medical sciences (Egger et al., 1997), due to the anticipation that their results are not strong or interesting enough to be accepted by international English-language journals. Such a tendency is only possible given the availability of local-language journals, where authors can submit their works for publication. Hence, the existence of non-English journals could in effect mitigate publication bias by providing non-English-language speakers with opportunities to report statistically insignificant results, but only if these journals are included in meta-analyses.

TABLE 3 Comparisons of models for testing effect-size differences between languages, with and without language as a fixed factor. Note: Statistically significant results (in bold) indicate that effect sizes differ between English- and Japanese-language studies even after controlling for the relevant fixed or random factor(s).
Interestingly, we found the opposite effect of language bias in statistical results in the plant forestry meta-analysis; Japanese-language studies had larger mean effect sizes compared to English-language studies (Appendix S12). One possible explanation for this is the issue of pseudoreplication (Davies & Gray, 2015; Spake & Doncaster, 2017), as pseudoreplicated study designs were prevalent in the contributing studies from Japan. While this finding still indicates that ignoring non-English-language studies could bias the estimation of mean effect sizes in ecological meta-analyses, it also suggests that we should not always assume biases toward a certain direction (i.e., providing larger effects) unless relevant non-English-language studies are identified and incorporated appropriately.
Our study has several important caveats. First, our findings are drawn from only four meta-analyses that met our inclusion criteria.
This may partly be due to our criterion that meta-analysis articles have at least one author affiliated with a Japanese institution. Relaxing this criterion might have increased the sample size. However, even with this restriction, we screened over 1,500 papers and identified 40 potentially relevant meta-analysis articles, of which only three actively searched for and included studies published in both languages. Our small sample size therefore reflects the current common practice of ignoring non-English-language studies in ecological meta-analyses. Second, our findings might depend on how the authors of the meta-analysis articles collected, collated, and appraised the studies for their meta-analyses. However, it was not possible to investigate whether these decisions influenced our results. Third, within these meta-analyses, we divided studies into two smaller groups according to language, which may have increased the rate of type II errors, that is, failing to detect significant differences (e.g., in Figure 2c,d).

[FIGURE 6. Tests for language bias in statistical results after controlling for differences in study characteristics. Differences in effect sizes between English- (blue) and Japanese-language (red) studies for (a) each taxon and (b) each landscape type in Koshida and Katayama (2018), for (c) studies conducted in Japan and each measurement condition in Osada et al. (2013), and (d) when the effect of thinning intensity is controlled for in the plant forestry meta-analysis (Spake et al., 2019). The number of effect-size estimates in each language is also shown. Error bars show standard errors. In (d), the black line shows the regression line based on all effect-size estimates associated with standard deviations and sample sizes, while red and blue dotted lines show the regression lines based only on effect-size estimates in English- and Japanese-language studies reporting standard deviations and sample sizes, respectively.]
Fourth, we examined meta-analyses synthesizing studies published only in English and Japanese; biases involving other languages therefore remain to be explored. Finally, it is possible that other study characteristics (e.g., geographical differences within Japan), not considered by the original meta-analyses, might have better explained differences in effect sizes between the languages.
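The type II error concern raised above can be made concrete with a rough power calculation. The sketch below is hypothetical throughout (study counts, average sampling variance, and the true between-language difference are all invented) and assumes a simple two-sided z-test for the difference between two fixed-effect subgroup means.

```python
import numpy as np
from scipy import stats

def power_subgroup_test(k_en, k_ja, v_bar, delta, alpha=0.05):
    """Approximate power of a two-sided z-test for a true difference
    `delta` between two subgroup mean effect sizes, assuming each of
    the k studies per group has average sampling variance `v_bar`
    (fixed-effect model)."""
    se_diff = np.sqrt(v_bar / k_en + v_bar / k_ja)  # SE of the difference
    z_crit = stats.norm.ppf(1 - alpha / 2)
    ncp = delta / se_diff                           # standardized true difference
    # P(reject) = P(Z > z_crit - ncp) + P(Z < -z_crit - ncp)
    return stats.norm.sf(z_crit - ncp) + stats.norm.cdf(-z_crit - ncp)

# Hypothetical numbers: a small Japanese-language subgroup sharply limits power
p_unbalanced = power_subgroup_test(k_en=20, k_ja=4, v_bar=0.05, delta=0.2)
p_balanced = power_subgroup_test(k_en=20, k_ja=20, v_bar=0.05, delta=0.2)
print(f"power (20 vs 4 studies):  {p_unbalanced:.2f}")
print(f"power (20 vs 20 studies): {p_balanced:.2f}")
```

Under these assumptions, splitting an already modest pool of studies by language leaves the smaller subgroup with little power, so non-significant between-language comparisons should not be read as evidence of no difference.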
Despite being based on a small sample size, our findings have a broad, yet simple, implication for meta-analyses in ecology and conservation science. Future meta-analyses, particularly those conducted at global extents or in regions where English is not widely spoken, should actively search for relevant non-English-language studies and, if appropriate, include them. Searches for non-English-language studies could be implemented through collaboration with native speakers of the relevant non-English languages (Walpole, 2019) or with the aid of emerging technologies (e.g., the litsearchr package in R translates search strings into multiple languages: https://elizagrames.github.io/litsearchr/). The flip side is also true: ignoring English-language studies could likewise introduce language bias, and therefore even national-level decision-making in non-English-speaking regions requires rigorous assessment of studies in English as well as in relevant non-English languages. Currently, the use of non-English-language studies in science is not common practice (Baethge, 2013; Neimann Rasmussen & Montgomery, 2018) and is even discouraged (Lazarev & Nazarovets, 2018). However, our findings highlight the importance of re-evaluating the role of non-English scientific knowledge in science and the potential risk of ignoring it in meta-analyses. Understanding the generality of our findings across languages, and quantifying any bias incurred by ignoring non-English-language studies, requires further research on other non-English languages (including Spanish, Portuguese, and Chinese; Amano et al., 2016), in which large volumes of scientific literature are published.