ITokenConfig | @chevrotain/types

Hierarchy

ITokenConfig

Properties

Optional categories

categories: TokenType | TokenType[]

Categories enable polymorphism on Token Types. A TokenType X with categories C1, C2, ..., Cn can be matched by the parser against any of those categories. In practice, this means that CONSUME(C1) can match a Token of type X.
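
The matching rule described above can be sketched in a few lines. This is a simplified model and not Chevrotain's implementation: SketchTokenType and matchesType are hypothetical names introduced for illustration, and only directly listed categories are considered.

```typescript
// Hypothetical, simplified stand-in for a Token Type; not Chevrotain's own TokenType.
interface SketchTokenType {
  name: string;
  categories?: SketchTokenType[];
}

// True when `actual` is `expected` itself or lists `expected` among its categories.
function matchesType(actual: SketchTokenType, expected: SketchTokenType): boolean {
  if (actual === expected) {
    return true;
  }
  return (actual.categories || []).indexOf(expected) !== -1;
}

const Keyword: SketchTokenType = { name: "Keyword" }; // plays the role of C1
const While: SketchTokenType = { name: "While", categories: [Keyword] }; // plays the role of X
const Identifier: SketchTokenType = { name: "Identifier" };

console.log(matchesType(While, Keyword)); // true
console.log(matchesType(Identifier, Keyword)); // false
```

Here a request for Keyword accepts a While token, while an Identifier token is rejected because it does not declare that category.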

Optional group

group: string

The group property will cause the lexer to collect Tokens of this type separately from the other Tokens.

For example, this could be used to collect comments for post-processing.

See: https://github.com/chevrotain/chevrotain/tree/master/examples/lexer/token_groups
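
As a rough illustration of what "collecting separately" means, the sketch below routes matched tokens either to a main list or to a named group. The types and the collect function are hypothetical and only model the behaviour described above; they are not Chevrotain's own code.

```typescript
// Hypothetical, simplified token shapes; not Chevrotain's own types.
interface SketchTokenType {
  name: string;
  group?: string;
}
interface SketchToken {
  image: string;
  tokenType: SketchTokenType;
}
interface SketchLexResult {
  tokens: SketchToken[];
  groups: { [groupName: string]: SketchToken[] };
}

// Routes each matched token to the main list or to the group named by its type.
function collect(matched: SketchToken[]): SketchLexResult {
  const result: SketchLexResult = { tokens: [], groups: {} };
  for (const tok of matched) {
    const group = tok.tokenType.group;
    if (group === undefined) {
      result.tokens.push(tok);
    } else {
      if (!result.groups[group]) {
        result.groups[group] = [];
      }
      result.groups[group].push(tok);
    }
  }
  return result;
}

const Comment: SketchTokenType = { name: "Comment", group: "comments" };
const Integer: SketchTokenType = { name: "Integer" };

const lexed = collect([
  { image: "1", tokenType: Integer },
  { image: "// note", tokenType: Comment },
  { image: "2", tokenType: Integer },
]);

console.log(lexed.tokens.length); // 2
console.log(lexed.groups["comments"][0].image); // "// note"
```

The comment never reaches the main token list, so a parser consuming that list does not need a rule for it, yet it remains available for later processing.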

Optional label

label: string

The label is a human-readable name to be used in error messages and syntax diagrams.

For example, a TokenType may be called LCurly, which is short for "left curly brace". A much easier to understand label would simply be "{".

Optional line_breaks

line_breaks: boolean

Can a string matching this Token Type's pattern possibly contain a line terminator? If it can and the line_breaks property is not set to true, the Lexer's line / column tracking will be inaccurate.
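
The effect on line tracking can be seen in a small model. This sketch assumes a tracker that only looks for "\n" inside tokens flagged with line_breaks; SketchToken and startLines are hypothetical names, and the real Lexer's bookkeeping is more involved.

```typescript
// Hypothetical, simplified token shape; not Chevrotain's own IToken.
interface SketchToken {
  image: string;
  line_breaks?: boolean;
}

// Returns the 1-based start line of each token.
// Line terminators are only counted inside tokens flagged with line_breaks.
function startLines(tokens: SketchToken[]): number[] {
  let line = 1;
  const lines: number[] = [];
  for (const tok of tokens) {
    lines.push(line);
    if (tok.line_breaks === true) {
      line += tok.image.split("\n").length - 1;
    }
  }
  return lines;
}

const comment = "/* a\n b */";

// Flag missing: the token after the comment is reported on line 1.
console.log(startLines([{ image: comment }, { image: "x" }])); // [1, 1]

// Flag set: the token after the comment is correctly reported on line 2.
console.log(startLines([{ image: comment, line_breaks: true }, { image: "x" }])); // [1, 2]
```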

Optional longer_alt

longer_alt: TokenType

The "longer_alt" property will cause the Lexer to attempt matching against another Token Type every time this Token Type has been matched.

This feature can be useful when two Token Types have common prefixes which cannot be resolved (only) by the ordering of the Tokens in the lexer definition.

Note that the longer_alt capability cannot be chained; only a single longer_alt will be checked for a specific Token.

For an example that resolves the keywords vs. identifiers ambiguity, see: https://github.com/chevrotain/chevrotain/tree/master/examples/lexer/keywords_vs_identifiers
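
The keyword versus identifier case can be modelled with a short sketch. It is a simplified stand-in, not the Lexer's actual algorithm: SketchTokenType, matchAtStart and nextToken are hypothetical names, and the sketch assumes the alternative wins only when its match is strictly longer.

```typescript
// Hypothetical, simplified stand-in for a Token Type; not Chevrotain's own TokenType.
interface SketchTokenType {
  name: string;
  pattern: RegExp;
  longer_alt?: SketchTokenType;
}

// Tries a type at the start of `text`; returns the matched string or null.
function matchAtStart(tokType: SketchTokenType, text: string): string | null {
  const anchored = new RegExp("^(?:" + tokType.pattern.source + ")");
  const m = anchored.exec(text);
  return m ? m[0] : null;
}

// Picks the first matching type in definition order, then checks its single longer_alt.
function nextToken(
  types: SketchTokenType[],
  text: string
): { name: string; image: string } | null {
  for (const tokType of types) {
    const image = matchAtStart(tokType, text);
    if (image === null) {
      continue;
    }
    const alt = tokType.longer_alt;
    if (alt) {
      const altImage = matchAtStart(alt, text);
      // Only one level is checked: alt.longer_alt is deliberately ignored.
      if (altImage !== null && altImage.length > image.length) {
        return { name: alt.name, image: altImage };
      }
    }
    return { name: tokType.name, image: image };
  }
  return null;
}

const Identifier: SketchTokenType = { name: "Identifier", pattern: /[a-zA-Z]\w*/ };
const While: SketchTokenType = { name: "While", pattern: /while/, longer_alt: Identifier };
const types = [While, Identifier];

console.log(nextToken(types, "while (x)")); // { name: 'While', image: 'while' }
console.log(nextToken(types, "whileLoop = 1")); // { name: 'Identifier', image: 'whileLoop' }
```

Ordering alone cannot solve this: placing While first would split "whileLoop" into a keyword followed by an identifier, and placing Identifier first would never produce a While token.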

name

name: string

Optional pattern

pattern: TokenPattern

This defines which sequence of characters will be matched to this TokenType when lexing.

For Custom Patterns see: http://chevrotain.io/docs/guide/custom_token_patterns.html

Optional pop_mode

pop_mode: boolean

If "pop_mode" is true, the Lexer will pop the last mode off the modes stack and continue lexing using the new mode at the top of the stack.

See: https://github.com/chevrotain/chevrotain/tree/master/examples/lexer/multi_mode_lexer

Optional push_mode

push_mode: string

The name of a Lexer mode to "enter" once this Token Type has been matched. Lexer modes can be used to support different sets of possible Token Types.

Lexer Modes work as a stack of Lexers, so "entering" a mode means pushing it to the top of the stack.

See: https://github.com/chevrotain/chevrotain/tree/master/examples/lexer/multi_mode_lexer
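
The stack behaviour of push_mode and pop_mode can be modelled as follows. This is a sketch under stated assumptions, not the Lexer's implementation: SketchTokenType and applyModes are hypothetical, the mode names are made up, the sketch assumes a pop is applied before a push when both are set, and it simply refuses to pop the last remaining mode.

```typescript
// Hypothetical, simplified stand-in for a Token Type; not Chevrotain's own TokenType.
interface SketchTokenType {
  name: string;
  push_mode?: string;
  pop_mode?: boolean;
}

// Applies a matched token's mode changes to the stack and returns the active mode.
function applyModes(stack: string[], matched: SketchTokenType): string {
  // Assumption: never pop the last remaining mode.
  if (matched.pop_mode === true && stack.length > 1) {
    stack.pop();
  }
  // Assumption: a push is applied after any pop.
  if (matched.push_mode !== undefined) {
    stack.push(matched.push_mode);
  }
  return stack[stack.length - 1];
}

const OpenTag: SketchTokenType = { name: "OpenTag", push_mode: "insideTag" };
const CloseTag: SketchTokenType = { name: "CloseTag", pop_mode: true };

const modeStack = ["text"]; // the initial mode sits at the bottom of the stack
console.log(applyModes(modeStack, OpenTag)); // insideTag
console.log(applyModes(modeStack, CloseTag)); // text
```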

Optional start_chars_hint

start_chars_hint: (string | number)[]

Possible starting characters or charCodes of the pattern. These will be used to optimize the Lexer's performance.

These are normally computed automatically; however, specifying them explicitly can enable optimizations even when the automatic analysis fails.

For example:

String hints should be one character long.
```
 { start_chars_hint: ["a", "b"] }
```
Number hints are the result of running ".charCodeAt(0)" on the strings.
```
 { start_chars_hint: [97, 98] }
```
For Unicode characters outside the BMP, use the first code unit of their surrogate pair. For example, the '💩' character is represented by the surrogate pair '\uD83D\uDCA9', and D83D is 55357 in decimal.
Note that "💩".charCodeAt(0) === 55357
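
The relationship between string hints and number hints can be checked with plain TypeScript. The helper name toCharCodeHints is hypothetical and is not part of the Chevrotain API; it only applies ".charCodeAt(0)" as described above.

```typescript
// Converts one-character string hints into their numeric charCode equivalents.
function toCharCodeHints(hints: (string | number)[]): number[] {
  return hints.map((hint) => (typeof hint === "string" ? hint.charCodeAt(0) : hint));
}

console.log(toCharCodeHints(["a", "b"])); // [ 97, 98 ]

// For a character outside the BMP, charCodeAt(0) yields the first surrogate.
console.log(toCharCodeHints(["\uD83D\uDCA9"])); // [ 55357 ]
```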