• search hit 5 of 23
Back to Result List

Fast packrat parsing in a live programming environment

  • Language developers who design domain-specific languages or new language features need a way to make fast changes to language definitions. Those fast changes require immediate feedback. Also, it should be possible to parse the developed languages quickly to handle extensive sets of code. Parsing expression grammars provides an easy to understand method for language definitions. Packrat parsing is a method to parse grammars of this kind, but this method is unable to handle left-recursion properly. Existing solutions either partially rewrite left-recursive rules and partly forbid them, or use complex extensions to packrat parsing that are hard to understand and cost-intensive. We investigated methods to make parsing as fast as possible, using easy to follow algorithms while not losing the ability to make fast changes to grammars. We focused our efforts on two approaches. One is to start from an existing technique for limited left-recursion rewriting and enhance it to work for general left-recursive grammars. The second approachLanguage developers who design domain-specific languages or new language features need a way to make fast changes to language definitions. Those fast changes require immediate feedback. Also, it should be possible to parse the developed languages quickly to handle extensive sets of code. Parsing expression grammars provides an easy to understand method for language definitions. Packrat parsing is a method to parse grammars of this kind, but this method is unable to handle left-recursion properly. Existing solutions either partially rewrite left-recursive rules and partly forbid them, or use complex extensions to packrat parsing that are hard to understand and cost-intensive. We investigated methods to make parsing as fast as possible, using easy to follow algorithms while not losing the ability to make fast changes to grammars. We focused our efforts on two approaches. One is to start from an existing technique for limited left-recursion rewriting and enhance it to work for general left-recursive grammars. The second approach is to design a grammar compilation process to find left-recursion before parsing, and in this way, reduce computational costs wherever possible and generate ready to use parser classes. Rewriting parsing expression grammars is a task that, if done in a general way, unveils a large number of cases such that any rewriting algorithm surpasses the complexity of other left-recursive parsing algorithms. Lookahead operators introduce this complexity. However, most languages have only little portions that are left-recursive and in virtually all cases, have no indirect or hidden left-recursion. This means that the distinction of left-recursive parts of grammars from components that are non-left-recursive holds great improvement potential for existing parsers. In this report, we list all the required steps for grammar rewriting to handle left-recursion, including grammar analysis, grammar rewriting itself, and syntax tree restructuring. Also, we describe the implementation of a parsing expression grammar framework in Squeak/Smalltalk and the possible interactions with the already existing parser Ohm/S. We quantitatively benchmarked this framework directing our focus on parsing time and the ability to use it in a live programming context. Compared with Ohm, we achieved massive parsing time improvements while preserving the ability to use our parser it as a live programming tool. The work is essential because, for one, we outlined the difficulties and complexity that come with grammar rewriting. Also, we removed the existing limitations that came with left-recursion by eliminating them before parsing.show moreshow less

Download full text files

  • tbhpi135.pdfeng
    (1916KB)

    SHA-512:4bddce3044419211de2a1ee2d02a25c82e47752d1c8542b72f46f92818ab52701409edce850d3a149ff6bb900464f704c69f601f58deb27bd8d27c2c217c802d

Export metadata

Additional Services

Search Google Scholar Statistics
Metadaten
Author details:Friedrich Eichenroth, Patrick ReinORCiD, Robert HirschfeldORCiDGND
URN:urn:nbn:de:kobv:517-opus4-491242
DOI:https://doi.org/10.25932/publishup-49124
ISBN:978-3-86956-503-3
ISSN:1613-5652
ISSN:2191-1665
Title of parent work (German):Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam
Subtitle (English):improving left-recursion in parsing expression grammars
Publication series (Volume number):Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam (135)
Publisher:Universitätsverlag Potsdam
Place of publishing:Potsdam
Publication type:Monograph/Edited Volume
Language:English
Year of first publication:2022
Publication year:2022
Publishing institution:Universität Potsdam
Publishing institution:Universitätsverlag Potsdam
Release date:2022/05/16
Tag:left recursion; packrat parsing; parsing expression grammars
Issue:135
Number of pages:79
RVK - Regensburg classification:ST 230
Organizational units:Digital Engineering Fakultät / Hasso-Plattner-Institut für Digital Engineering GmbH
DDC classification:0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 004 Datenverarbeitung; Informatik
Publishing method:Universitätsverlag Potsdam
Open Access / Gold Open-Access
Peer review:Nicht ermittelbar
License (German):License LogoKeine öffentliche Lizenz: Unter Urheberrechtsschutz
Accept ✔
This website uses technically necessary session cookies. By continuing to use the website, you agree to this. You can find our privacy policy here.