Corpus Analysis of the World Wide Web

William H. Fletcher

Published Online: 5 NOV 2012

DOI: 10.1002/9781405198431.wbeal0254

The Encyclopedia of Applied Linguistics

How to Cite

Fletcher, W. H. 2012. Corpus Analysis of the World Wide Web. The Encyclopedia of Applied Linguistics.


The World Wide Web has become a primary meeting place for information and recreation, for communication and commerce, for a quarter of the world's population. Millions of Web authors have created billions of Web pages, unknowingly providing texts to be mined for their linguistic and cultural content. The Web has evolved into the resource of first resort for lexicographers and linguists, for translators, teachers, and other language professionals. As a source of machine-readable texts for corpus linguists and researchers in complementary fields like natural language processing (NLP), information retrieval, and text mining, the Web offers extraordinary accessibility, quantity, variety, and cost-effectiveness. Investigators in these disciplines have developed scores of tools and products from Web content for both researchers and end users, and authored hundreds of scholarly papers on their projects.


Keywords

  • educational linguistics;
  • natural language processing;
  • second language acquisition;
  • corpus;
  • language for specific purposes;
  • language learning technology