Research Article
A text pattern-matching tool based on Parsing Expression Grammars
Article first published online: 17 JUL 2008
DOI: 10.1002/spe.892
Copyright © 2008 John Wiley & Sons, Ltd.
Additional Information
How to Cite
Ierusalimschy, R. (2009), A text pattern-matching tool based on Parsing Expression Grammars. Software: Practice and Experience, 39: 221–258. doi: 10.1002/spe.892
Publication History
- Issue published online: 30 JAN 2009
- Article first published online: 17 JUL 2008
- Manuscript Accepted: 21 MAY 2008
- Manuscript Revised: 20 MAY 2008
- Manuscript Received: 1 OCT 2007
Funded by
- Brazilian Research Council (CNPq). Grant Number: 300993/2005-6
Keywords:
- pattern matching
- Parsing Expression Grammars
- scripting languages
Abstract
Current text pattern-matching tools are based on regular expressions. However, pure regular expressions have proven too weak a formalism for the task: many interesting patterns either are difficult to describe or cannot be described by regular expressions. Moreover, the inherent non-determinism of regular expressions does not fit the need to capture specific parts of a match. Motivated by these reasons, most scripting languages nowadays use pattern-matching tools that extend the original regular-expression formalism with a set of ad hoc features, such as greedy repetitions, lazy repetitions, possessive repetitions, ‘longest-match rule,’ lookahead, etc. These ad hoc extensions bring their own set of problems, such as lack of a formal foundation and complex implementations. In this paper, we propose the use of Parsing Expression Grammars (PEGs) as a basis for pattern matching. Following this proposal, we present LPEG, a pattern-matching tool based on PEGs for the Lua scripting language. LPEG unifies the ease of use of pattern-matching tools with the full expressive power of PEGs. Because of this expressive power, it can avoid the myriad of ad hoc constructions present in several current pattern-matching tools. We also present a Parsing Machine that allows a small and efficient implementation of PEGs for pattern matching.
