Chapter 3. Statistical Language Modeling

Edited by:

  1. Alexander Clark PhD, Lecturer [2],
  2. Dr Chris Fox, Reader [3] and
  3. Shalom Lappin, Professor [4]

Author:

  1. Ciprian Chelba, Diploma Engineer, PhD, Research Scientist

Published Online: 29 JUN 2010

DOI: 10.1002/9781444324044.ch3

The Handbook of Computational Linguistics and Natural Language Processing

How to Cite

Chelba, C. (2010) Statistical Language Modeling, in The Handbook of Computational Linguistics and Natural Language Processing (eds A. Clark, C. Fox and S. Lappin), Wiley-Blackwell, Oxford, UK. doi: 10.1002/9781444324044.ch3

Editor Information

  1. [2] Royal Holloway, University of London, UK

  2. [3] University of Essex, UK

  3. [4] King's College London, UK

Author Information

  1. Speech Technology Group at Microsoft Research, USA

Publication History

  1. Published Online: 29 JUN 2010
  2. Published Print: 16 JUL 2010

ISBN Information

Print ISBN: 9781405155816

Online ISBN: 9781444324044

Keywords:

  • statistical language modeling;
  • statistical language model, prior probability values P(W) for strings of words W in a vocabulary V;
  • measures of language model quality;
  • n-gram language model, finite state machine (FSM) - driving decoding (search) process;
  • language model, assigning non-zero probability - to unseen strings of words;
  • structured language model;
  • formal language, and class of context-free grammars (CFGs);
  • separate left-to-right word predictor - in language model;
  • UPenn Treebank corpus - subset of WSJ (Wall Street Journal) corpus

Summary

This chapter contains sections titled:

  • Introduction to Statistical Language Modeling

  • Structured Language Model

  • Speech Recognition Lattice Rescoring Using the Structured Language Model

  • Richer Syntactic Dependencies

  • Comparison with Other Approaches

  • Conclusion

  • Notes