Corpus Use in Language Learning: A Meta‐Analysis
We would like to thank the participants at the Teaching and Language Corpora conferences where earlier versions of this paper were presented, and especially Luke Plonsky, Lourdes Ortega, and John Norris for their invitation to a symposium on meta‐analysis at the International Association for Applied Linguistics in 2014 in Brisbane, kindly sponsored by Language Learning. We thank Luke Plonsky again for his input on an earlier draft of this paper, as well as the anonymous reviewers. We are also grateful to the authors and coauthors who responded to our e‐mails and, in some cases, provided papers or further information on their studies, whether or not their papers could finally be included: Kiyomi Chujo, Susan Conrad, Averil Coxhead, Ewa Donesch‐Jezo, Laura Gavioli, Zeping Huang, Ali Akbar Jafarpour, Betsy Kerr, Hsien‐Chin Liou, Gillian Mansfield, Daehyeon Nam, Yasunori Nishina, Kathryn Oghigian, Simon Smith, and Serge Verlinde.
Abstract
This study applied systematic meta‐analytic procedures to summarize findings from experimental and quasi‐experimental investigations into the effectiveness of using the tools and techniques of corpus linguistics for second language learning or use, here referred to as data‐driven learning (DDL). Analysis of 64 separate studies representing 88 unique samples reporting sufficient data indicated that DDL approaches result in large overall effects for both control/experimental group comparisons (d = 0.95) and for pre/posttest designs (d = 1.50). Further investigation of moderator variables revealed that small effect sizes were generally tied to small sample sizes. Research has barely begun in some key areas, and durability/transfer of learning through delayed posttesting remains an area in need of further investigation. Although DDL research demonstrably improved over the period investigated, further changes in practice and reporting are recommended.
Open Practices

This article has been awarded Open Materials and Open Data badges. All materials and data are publicly accessible via the Open Science Framework at https://osf.io/jkktw. Learn more about the Open Practices badges from the Center for Open Science: https://osf.io/tvyxz/wiki.