Custom Token Patterns

Chevrotain is not limited to only using JavaScript regular expressions to define Tokens. Tokens can also be defined using arbitrary JavaScript code, for example:

// our custom matcher
function matchInteger(text, startOffset) {
  let endOffset = startOffset
  let charCode = text.charCodeAt(endOffset)
  // 0-9 digits
  while (charCode >= 48 && charCode <= 57) {
    endOffset++
    charCode = text.charCodeAt(endOffset)
  }

  // No match, must return null to conform with the RegExp.prototype.exec signature
  if (endOffset === startOffset) {
    return null
  } else {
    let matchedString = text.substring(startOffset, endOffset)
    // according to the RegExp.prototype.exec API the first item in the returned array must be the whole matched string.
    return [matchedString]
  }
}

const IntegerToken = createToken({
  name: "IntegerToken",
  pattern: matchInteger
})

This feature is often used to implement complex lexing logic, such as python indentationopen in new window.

See in depth guide for further details.