Chevrotain
Home
Features
Tutorial
Guide
FAQ
Changes
APIs
Playground
Benchmark
Discussions
GitHub
Home
Features
Tutorial
Guide
FAQ
Changes
APIs
Playground
Benchmark
Discussions
GitHub
  • Tutorial

    • Introduction
    • Lexer
    • Parser
    • Semantics
    • Semantics - CST Visitor
    • Semantics - Embedded
    • Fault Tolerance

Semantics - CST Visitor

Run and Debug the source code.

Introduction

In the previous tutorial step we have implemented a parser for a "mini" SQL Select grammar. The current problem is that our parser only validates that the input conforms to the grammar -- in other words, it is just a recognizer. But in most real-world use cases the parser will also have to output some result/data structure/value.

This can be accomplished using a CST (Concrete Syntax Tree) Visitor defined outside our grammar:

  • See the in-depth documentation of Chevrotain's CST capabilities

Enabling CST

This feature is automatically enabled when a Parser extends the Chevrotain CstParser class.

The invocation of any grammar rule will now automatically create a CST.

function parseInput(text) {
  const lexingResult = SelectLexer.tokenize(text);
  const parser = new SelectParser(lexingResult.tokens);

  // CST automatically created.
  const cstOutput = parser.selectStatement();
}

The CST Visitor

Creating a CST is not enough; we also need to traverse this structure and execute our actions (semantics).

Each Chevrotain parser instance exposes two BaseVisitor classes which can be extended to create custom user visitors.

// BaseVisitor constructors are accessed via a parser instance.
const parserInstance = new SelectParser();

const BaseSQLVisitor = parserInstance.getBaseCstVisitorConstructor();

// This BaseVisitor includes default visit methods that simply traverse the CST.
const BaseSQLVisitorWithDefaults =
  parserInstance.getBaseCstVisitorConstructorWithDefaults();

class myCustomVisitor extends BaseSQLVisitor {
  constructor() {
    super();
    // The "validateVisitor" method is a helper utility which performs static analysis
    // to detect missing or redundant visitor methods
    this.validateVisitor();
  }

  /* Visit methods go here */
}

class myCustomVisitorWithDefaults extends BaseSQLVisitorWithDefaults {
  constructor() {
    super();
    this.validateVisitor();
  }

  /* Visit methods go here */
}

const myVisitorInstance = new myCustomVisitor();
const myVisitorInstanceWithDefaults = new myCustomVisitorWithDefaults();

In our example we will use the BaseVisitor constructor (without defaults).

Visitor Methods

So we now know how to create a CST visitor. But how do we actually make it perform the actions (semantics) we wish? For that we must create a visit method for each grammar rule.

Recall the selectClause grammar from the previous step:

selectClause:
  "SELECT" Identifier ("," Identifier)*;

Let's create a visitor method for the selectClause rule.

class SQLToAstVisitor extends BaseSQLVisitor {
  constructor() {
    super();
    this.validateVisitor();
  }

  // The Ctx argument is the current CSTNode's children.
  selectClause(ctx) {
    // Each Terminal or Non-Terminal in a grammar rule is collected into
    // an array with the same name (key) in the ctx object.
    let columns = ctx.Identifier.map((identToken) => identToken.image);

    return {
      type: "SELECT_CLAUSE",
      columns: columns,
    };
  }
}

So far pretty simple. Now let's add another visit method for "selectStatement". First, let's recall its grammar.

selectStatement
   : selectClause fromClause (whereClause)?

And now to the code:

class SQLToAstVisitor extends BaseSQLVisitor {
  constructor() {
    super();
    this.validateVisitor();
  }

  // The Ctx argument is the current CSTNode's children.
  selectClause(ctx) {
    /* as above... */
  }

  selectStatement(ctx) {
    // "this.visit" can be used to visit non-terminals and will invoke the correct visit method for the CstNode passed.
    let select = this.visit(ctx.selectClause);

    //  "this.visit" can work on either a CstNode or an Array of CstNodes.
    //  If an array is passed (ctx.fromClause is an array) it is equivalent
    //  to passing the first element of that array
    let from = this.visit(ctx.fromClause);

    // "whereClause" is optional, "this.visit" will ignore empty arrays (optional)
    let where = this.visit(ctx.whereClause);

    return {
      type: "SELECT_STMT",
      selectClause: select,
      fromClause: from,
      whereClause: where,
    };
  }
}

Full Visitor

We still have a few grammar rules we need to build visitors for.

fromClause
   : "FROM" Identifier

whereClause
   : "WHERE" expression

expression
   : atomicExpression relationalOperator atomicExpression

atomicExpression
   : Integer | Identifier

relationalOperator
   : ">" | "<"

Let's implement those as well.

class SQLToAstVisitor extends BaseSQLVisitor {
  constructor() {
    super();
    this.validateVisitor();
  }

  selectStatement(ctx) {
    /* as above... */
  }

  selectClause(ctx) {
    /* as above... */
  }

  fromClause(ctx) {
    const tableName = ctx.Identifier[0].image;

    return {
      type: "FROM_CLAUSE",
      table: tableName,
    };
  }

  whereClause(ctx) {
    const condition = this.visit(ctx.expression);

    return {
      type: "WHERE_CLAUSE",
      condition: condition,
    };
  }

  expression(ctx) {
    // Note the usage of the "rhs" and "lhs" labels defined in step 2 in the expression rule.
    const lhs = this.visit(ctx.lhs[0]);
    const operator = this.visit(ctx.relationalOperator);
    const rhs = this.visit(ctx.rhs[0]);

    return {
      type: "EXPRESSION",
      lhs: lhs,
      operator: operator,
      rhs: rhs,
    };
  }

  // these two visitor methods will return a string.
  atomicExpression(ctx) {
    if (ctx.Integer) {
      return ctx.Integer[0].image;
    } else {
      return ctx.Identifier[0].image;
    }
  }

  relationalOperator(ctx) {
    if (ctx.GreaterThan) {
      return ctx.GreaterThan[0].image;
    } else {
      return ctx.LessThan[0].image;
    }
  }
}

Usage

So we know how to create a CST Visitor, but how do we actually use it?

// A new parser instance with CST output enabled.
const parserInstance = new SelectParser([], { outputCst: true });
// Our visitor has no state, so a single instance is sufficient.
const toAstVisitorInstance = new SQLToAstVisitor();

function toAst(inputText) {
  // Lex
  const lexResult = selectLexer.tokenize(inputText);
  parserInstance.input = lexResult.tokens;

  // Automatic CST created when parsing
  const cst = parserInstance.selectStatement();
  if (parserInstance.errors.length > 0) {
    throw Error(
      "Sad sad panda, parsing errors detected!\n" +
        parserInstance.errors[0].message,
    );
  }

  // Visit
  const ast = toAstVisitorInstance.visit(cst);
  return ast;
}
Edit this page on GitHub
Last Updated: 2/27/26, 9:21 PM
Contributors: Shahar Soel, Soel, Tom Haggie, bd82, i060847, Hocdoc, Ben Fisher, I060847, Gaurav
Prev
Semantics
Next
Semantics - Embedded