A large number of randomised trials authored by Yoshitaka Fujii have been retracted, in part as a consequence of a previous analysis finding a very low probability of random sampling. Dr Yuhji Saitoh co-authored 34 of those trials and he was corresponding author for eight of them. We found a number of additional randomised, controlled trials that included baseline data, with Saitoh as corresponding author, that Fujii did not co-author. We used Monte Carlo simulations to analyse the baseline data from 32 relevant trials in total as well as an outcome (muscle twitch recovery ratios) reported in several. We also compared a series of muscle twitch recovery graphs appearing in a number of Saitoh's publications. The baseline data in 14/32 randomised, controlled trials had p < 0.01, of which seven p values were < 0.001. Eight trials reported four ratios of the time for the return of muscle activity after neuromuscular blockade, the distributions of which were homogeneous: the p values for the observed Q statistics were 0.0055, 0.031, 0.016 and 0.0071. Comparison of graphs revealed multiple coincident or near-coincident curves across a large number of publications, a finding also inconsistent with random sampling. Combining the continuous and categorical probabilities of the 32 included trials, we found a very low likelihood of random sampling: p = 1.27 × 10⁻⁸ (1 in 100,000,000). The high probability of non-random sampling and the repetition of lines in multiple graphs suggest that further scrutiny of Saitoh's work is warranted.
In 2006, an analysis of homogeneity in meta-analyses identified a very extreme degree of between-study homogeneity in five studies published by Joachim Boldt. Suspicions raised by readers of a 2009 publication subsequently led to an institutional investigation and ultimately the retraction of more than 90 of Boldt's published studies for lack of ethics approval and fabrication of data.
In 2012, a similar analysis of the baseline variables in a large number of studies published by Yoshitaka Fujii found a very low probability of random sampling. This evidence formed an important part of the request that prompted a multi-institutional investigation of Fujii's publications, ultimately leading to the recommendation that over 180 papers should be retracted, again for lack of ethics approval and fabrication. The methods used for the analysis have subsequently been refined. One of Fujii's co-authors on 34 of those retracted papers was Dr Yuhji Saitoh, who was first and corresponding author on eight of these trials.
Following concerns raised over a new submission to the journal Anaesthesia and Intensive Care, we undertook a more focused analysis of data in randomised, controlled trials with Dr Yuhji Saitoh as an author.
In 2013, randomised, controlled trials published in six anaesthesia journals (2002–2012) were surveyed (unpublished). The distributions of mean (SD) for baseline variables were analysed using a published method. Additional studies were retrieved for authors of at least two trials for which p < 0.05. The analyses were then repeated using Monte Carlo simulations, which are more reliable than the method used for Fujii (as described in the June 2015 issue of Anaesthesia). The method used to analyse baseline continuous variables has been described in detail [3, 5]. In summary, Monte Carlo simulations were used instead of an independent t-test or ANOVA to generate a p value for differences between means. The aspect of interest is the probability that the difference in means would be less than reported (the left-hand tail of the distribution), which is equal to (1 − p), where ‘p’ is the p value generated by a two-sided t-test or ANOVA. However, parametric tests of summary data generate p = 1 when the reported means are the same, as if they were identical to an infinite number of decimal places. Monte Carlo simulations are therefore needed when the precision of the reported means is insufficient to discriminate their differences. Monte Carlo simulations were also used for baseline categorical variables, and Stouffer's method was used to combine p values for continuous variables, for categorical variables and for all baseline variables. The Kolmogorov–Smirnov test was used to compare the p values of variables, and of randomised, controlled trials, against a uniform distribution. The homogeneity of the standardised mean differences in the twitch recovery times (in a train-of-four) in the relevant studies was also analysed using Monte Carlo simulations for the Q statistic, as well as for the tau statistic and effect size probability. This type of analysis was used to identify the unusual homogeneity in the results of Boldt et al. in 2006.
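The left-hand tail calculation for rounded means can be illustrated with a minimal Python sketch. This is not the authors' code (their analyses were run in R and their code is in Appendix S1). The function name `left_tail_p`, the use of pooled estimates for the population mean and SD, and the counting of ties as one half are all assumptions made for illustration, as are the input values in the usage note.

```python
import math
import random


def left_tail_p(mean1, sd1, n1, mean2, sd2, n2, decimals, sims=20000, seed=1):
    """Estimate the probability that two randomly sampled groups would have
    rounded means at least as similar as those reported (hypothetical helper)."""
    rng = random.Random(seed)
    scale = 10 ** decimals  # work in integer units of the reported precision

    # Assumption: pooled estimates stand in for the unknown population values.
    pooled_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )

    observed = abs(round(mean1 * scale) - round(mean2 * scale))
    less = equal = 0
    for _ in range(sims):
        # Simulate each group mean, then round it as a paper would report it.
        m1 = round(rng.gauss(pooled_mean, pooled_sd / math.sqrt(n1)) * scale)
        m2 = round(rng.gauss(pooled_mean, pooled_sd / math.sqrt(n2)) * scale)
        diff = abs(m1 - m2)
        if diff < observed:
            less += 1
        elif diff == observed:
            equal += 1
    # Assumption: ties count as one half, so identical means do not give 0.
    return (less + 0.5 * equal) / sims
```

For example, `left_tail_p(50.0, 10.0, 20, 50.0, 10.0, 20, decimals=1)` gives a small but non-zero probability for two groups of 20 with identical reported means, whereas a t-test on the same summary data would return p = 1.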
The code used to program the Monte Carlo simulations is available as an online appendix (Appendix S1).
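The appendix code is not reproduced here. As a rough illustration only, the two summary steps named in the methods (combining p values with Stouffer's method and comparing a set of p values with a uniform distribution) could be sketched in Python as follows. The function names are hypothetical, the Stouffer combination is unweighted, and the Kolmogorov–Smirnov p value uses the asymptotic series rather than an exact calculation; none of these choices is confirmed by the source.

```python
import math
from statistics import NormalDist


def stouffer(p_values):
    """Unweighted Stouffer combination; small p values are treated as extreme.
    Each p value must lie strictly between 0 and 1."""
    nd = NormalDist()
    z = sum(-nd.inv_cdf(p) for p in p_values) / math.sqrt(len(p_values))
    return nd.cdf(-z)


def ks_uniform(p_values):
    """One-sample Kolmogorov-Smirnov test against Uniform(0, 1).
    Returns (D, approximate p value from the asymptotic Kolmogorov series)."""
    xs = sorted(p_values)
    n = len(xs)
    # Largest gap between the empirical CDF and the uniform CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    lam = math.sqrt(n) * d
    tail = 2 * sum(
        (-1) ** (k - 1) * math.exp(-2 * (k * lam) ** 2) for k in range(1, 101)
    )
    # Clamp, because the truncated series can stray slightly outside [0, 1].
    return d, min(1.0, max(0.0, tail))
```

If the p values from a set of trials were consistent with random sampling they would be expected to be roughly uniform, so a small Kolmogorov–Smirnov p value, or a small combined Stouffer p value, points away from that expectation.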
By December 2013, 11 trials with baseline data published by Saitoh and co-authors had been detected in this way and analysed: 6/11 had unlikely distributions of baseline data. It was also noticed that at least two other publications shared graph lines in common, which seemed unlikely, so a wider comparison of graphs in Saitoh's publications was undertaken. Graphs were copied and pasted, with transparency, on top of other graphs.
An additional six trials with baseline data that Saitoh had co-authored with Fujii had previously been analysed, but the association was not recognised at the time. After the submission to Anaesthesia and Intensive Care raised concerns, the Y Saitohs were identified as the same individual. A number of additional published trials and one unpublished trial (the paper submitted to Anaesthesia and Intensive Care) could then be added to the analysis. All analyses were conducted in R.
In addition to the unpublished trial submitted to Anaesthesia and Intensive Care, we retrieved 40 studies with Yuhji Saitoh as an author and for which Yoshitaka Fujii was not corresponding author (Appendix S2 [1–40]); in all we analysed baseline continuous data in 32 randomised, controlled trials (Appendix S2 [1–32]). Dr Saitoh was corresponding author for 26 of these trials (Appendix S2 [1–17, 19–21, 23–25, 30–32]), six of which have been retracted (Appendix S2 [12–17]) and one rejected before publication (Appendix S2 ). A further two randomised, controlled trials with Dr Saitoh as corresponding author that did not present baseline data have been retracted (Appendix S2 [33, 34]).
The baseline variables of 14/32 trials had combined p < 0.01 (one right-hand p value), of which seven were < 0.001 (Table 1 and Fig. 1). These p values are for the distribution of baseline means and rates and are less extreme than those calculated for 158 randomised, controlled trials with Yoshitaka Fujii as author (Fig. 2). The probability for distributions of standard deviations and their associated means can also be calculated. For example, both means and standard deviations are proximate in Fig. 3 of reference (Appendix S2 ), reproduced (with permission) in Fig. 3. The probability that a similar table would contain mean (SD) combinations as or more similar than reported was 0.0000089, determined in 100 million Monte Carlo simulations.
Table 1. The probabilities that simple random sampling would result in groups as similar as reported for: means (continuous variables); rates (categorical variables); continuous and categorical probabilities combined. Reference numbers are as listed in online Appendix S2
Columns: Reference (Appendix S2); p value for baseline variables.
AA, Anesthesia and Analgesia; AAS, Acta Anaesthesiologica Scandinavica; AIC, Anaesthesia and Intensive Care; An, Anaesthesia; BJA, British Journal of Anaesthesia; CJA, Canadian Journal of Anesthesia; EJA, European Journal of Anaesthesiology; FJMS, Fukushima Journal of Medical Sciences; JA, Journal of Anesthesia; JCA, Journal of Clinical Anesthesia.
Eight papers (Appendix S2 [19, 21–23, 25, 27, 28, 30]) reported mean (SD) times for train-of-four twitches at four time points (T1, T2, T3, T4) in two (or three) groups. The ratio of means for two groups at each of the time points T1 to T4 varied little across several randomised, controlled trials, ranging from 0.75 to 0.77 for all four time points (Fig. 4). The Monte Carlo p values for the homogeneity (Q statistic) of these results were 0.0055 (T1), 0.031 (T2), 0.016 (T3) and 0.0071 (T4). These and other ratios of muscular function and post-tetanic count after neuromuscular blockade were presented graphically in 14 papers (Appendix S2 [5, 11, 16, 19, 23–25, 27, 30–32, 36, 37, 40]). The lines of some of these graphs were coincident, or nearly so, and are presented in Fig. 5 (all graphs reproduced with permission).
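The homogeneity test behind the Q statistic p values can be sketched in Python under simplifying assumptions. This is not the authors' code. The sketch assumes a common-effect model in which each trial's standardised mean difference is drawn from a normal distribution centred on the inverse-variance pooled effect, and it reports the left-hand tail (results as homogeneous as, or more homogeneous than, those observed). The function names are hypothetical and the numbers in the usage note are invented.

```python
import random


def cochran_q(effects, variances):
    """Cochran's Q: inverse-variance weighted squared deviations from the pooled effect."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def homogeneity_left_tail(effects, variances, sims=20000, seed=1):
    """Monte Carlo probability that Q would be as small as, or smaller than, observed."""
    rng = random.Random(seed)
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q_obs = cochran_q(effects, variances)
    hits = 0
    for _ in range(sims):
        # Assumption: every trial estimates the same true effect, with its own variance.
        simulated = [rng.gauss(pooled, v ** 0.5) for v in variances]
        if cochran_q(simulated, variances) <= q_obs:
            hits += 1
    return q_obs, hits / sims
```

With invented inputs such as `homogeneity_left_tail([0.5] * 8, [0.1] * 8)`, where eight trials report the same effect, almost no simulated set is as homogeneous as the observed one, so the left-hand tail probability is close to zero.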
We have found improbable distributions of baseline data (1 in 100,000,000 combined) and improbable homogeneity of results across a substantial number of studies published by Dr Yuhji Saitoh, mirroring similar findings in the previous analysis of the work of Yoshitaka Fujii.
Saitoh has co-authored 36 papers with Fujii; Saitoh was corresponding author of 11, of which eight have already been retracted. The investigation into Fujii concluded that three trials authored by Saitoh were probably conducted and reported honestly (Appendix S2 [4, 10, 11]). Analyses of baseline data indicate that it is unlikely that two of these (Appendix S2 [4, 11]) reported the results of simple random allocation of participants into groups.
The possibility of a more widespread problem within a research network suggests that such institutional investigations should not be restricted to single authors. In the case of Boldt, for example, his co-authors published a paper without him and this paper was also subsequently retracted.
The findings of this analysis support further institutional investigations into research published by Dr Yuhji Saitoh. Until these results can be explained, we think it is important, as was also recommended in the case of Fujii, that Dr Saitoh's data are excluded from meta-analyses or other reviews of the relevant subjects.
The authors would like to acknowledge the help of Dr Neville Gibbs and Dr Steve Yentis in the preparation of this paper.
No external funding or competing interests declared. JC is an editor of Anaesthesia and this manuscript has undergone additional external review as a result.