I am in the process of revamping some old code, part of which contains the GLR algorithm. The GLR algorithm is rather involved, so I figured this was the perfect opportunity to explain it once and for all, if for no other reason than to have a copy of this on hand.
Like many parsers, the GLR algorithm is an LR based parser will build a parse tree from a series of tokens. Unlike most though, it is able to handle ambiguous grammars. Of practical interest, this means that a GLR parser can handle the ambiguous grammar of C++. It accomplishes this, in concept, keeping a multitude of stacks, each representing one of the many possible parses of the input. Naturally, the more ambiguous any particular input, the greater the performance hit for using this approach.
To explain what goes on inside a GLR parser. We’ll consider what happens when processing a single input token. As a prerequisite, you need an LR table, produced from a suitable grammar, such as the wormhole parser generator.
(More …)