Resolving Grammar Errors

Common Prefix Ambiguities

Imagine the following grammar:

myRule:
  "A" "B" |
  "A" "B" "C"

The first alternative is a prefix of the second alternative. Now lets consider the input ["A", "B"]. For this input the first alternative would be matched as expected.

However for the input ["A", "B", "C"] the first alternative would still be matched but this time incorrectly as alternation matches are attempted in order.

There are two ways to resolve this:

  • Reorder the alternatives so that shorter common prefix lookahead paths appears after the longer ones.

    myRule:
      "A" "B" "C" |
      "A" "B"
    
  • Refactor the grammar to extract common prefixes.

      myRule:
        "A" "B" ("C")?
    

Ambiguous Alternatives Detected

An Ambiguous Alternatives Error occurs when Chevrotain cannot decide between two alternatives in an alternation (OR DSL method).

Chevrotain "looks ahead" at most K (3 by default)open in new window tokens to determine which alternative to pick. An Ambiguous Alternatives Error indicates that more than K tokens lookahead is needed.

Lets consider a more concrete example:

fiveTokensLookahead:
  "A" "B" "C" "D" "1" |
  "A" "B" "C" "D" "2"

In order to decide between these two alternatives, Chevrotain must "look ahead" five tokens as the disambiguating tokens are "1" and "2". Five is a larger than the default maxLookaheadopen in new window of four, so an error will be raised.

We could solve this case by increasing the global maxLookaheadopen in new window to 5, however this is not recommended due to performance and grammar complexity reasons. From a performance perspective this is particularly problematic as some analysis done on the grammar (during initialization) may become exponentially more complex as the maxLookahead grows.

We could also specify the MAX_LOOKAHEADopen in new window config on the specific DSL method invocation where the problem occurs, This is still not the optimal solution in this case.

The recommended solution in this case would be to refactor the grammar to require a smaller lookahead. In our trivial example the grammar can be refactored to be LL(1), meaning only one token of lookahead is needed. The needed change is a simple extraction of the common prefix before the alternation.

oneTokenLookahead:
  "A" "B" "C" "D"
  (
    "1" |
    "2"
  )

Note that the number of lookahead tokens needed to choose between alternatives may in fact be infinite, for example:

infiniteTokensLookahead:
  ("A")* "1"  |
  ("A")* "2"

No matter how large a maxLookahead we choose, the sequence of "A"s could always (potentially) be longer... The solution in this case is the same as before, extraction of the common prefix before the alternation, for example:

oneTokenLookahead:
  ("A")*
  (
    "1" |
    "2"
  )

In some rare cases refactoring the grammar is not possible, in those cases it is still possible to resolve the ambiguity using the backtracking feature Although this is strongly discouraged due to performance and complexity reasons...

Terminal Token Name Not Found

This problem can no longer occur in versions of Chevrotain after (and including) 6.0.0. See V5 of these Docsopen in new window if you have not yet upgraded.

Infinite Loop Detected

  • Note This error is only relevant in versions prior to 4.4.0 See: https://github.com/chevrotain/chevrotain/issues/958

A repetition must consume at least one token in each iteration. Entering an iteration while failing to do so would cause an infinite loop because the condition to entering the next iteration would still be true while the parser state has not been changed. essentially this creates a flow that looks like:

// iteration lookahead condition (always true)
while (true) {
  // single iteration grammar
}

Lets look at a few real examples that can cause this error

$.MANY(() => {
  return;
  // unreachable code
  $.CONSUME(Plus);
});

By returning early in the iteration grammar we prevent the parser from consuming The plus token and thus the next time the parser checks if it should enter the iteration The condition (nextToken === Plus) would still be true.

$.MANY(() => {
  // Never wrap Chevrotain grammar in JavaScript control flow constructs.
  if (condition) {
    $.CONSUME(Plus);
  }
});

This is similar to the previous example as if the condition is false, once again the parser will consume nothing in the iteration. Modeling conditional grammar paths must be done using Chevrotain grammar constructs such as OPTION and/or GATEopen in new window.

For example the above example should be written as:

$.MANY(() => {
  $.OPTION(() => {
    $.CONSUME(Plus);
  });
});

Ignoring Ambiguities

In some rare cases the Parser may detect ambiguities that are not actually possible or are perhaps implicitly resolved, e.g:

  • by the order of alternatives (an alternation alternative is attempted in the order listed).

In such cases the ambiguities may be ignored explicitly by using the IGNORE_AMBIGUITIESopen in new window property on the relevant DSL method.

For example:

  • Ignoring all ambiguities of an alternation.

    $.OR({
      IGNORE_AMBIGUITIES: true,
      DEF: [
        { ALT: () => $.SUBRULE($.myRule) },
        { ALT: () => $.SUBRULE($.myOtherRule) },
      ],
    });
    
  • Ignoring ambiguities related to a specific alternative of an alternation:

    $.OR([
      { ALT: () => $.SUBRULE($.myRule), IGNORE_AMBIGUITIES: true },
      { ALT: () => $.SUBRULE($.myOtherRule) },
    ]);