JTB

Java Tree Builder (JTB) - How to use JTB

Environment

JTB 1.5.0+ is meant to be used as a front end for JavaCC 5.0+ with a JDK 1.8+, and is by nature a command line tool.
Previous versions required a JDK 1.7+.

It is also integrated in the JavaCC Eclipse Plugin, which embeds the latest JTB version each time the plugin is released, so the tool does not have to be run on the command line.

It also exists as a Maven artifact in Maven Central (edu.purdue.cs.jtb).

Run from the command line

To view all available options, use the -h option.

To parse System.in as the input file, use the -si option.

To parse input-file as the input file, give its name on the command line.

Command line options

See the table below. Since 1.5.0, command line options can still take the same form as before (like -chm), but can also take the file form, as in JavaCC (like -JTB_CHM=true, without spaces), or even a mixed form (like -chm=true).

Input file options section

options {(JTB_BOOL_OPT=(true|false); | JTB_STR_OPT="str";)*}

Note that in the generated annotated .jj file these JTB options are removed but the JavaCC options are kept unchanged.

Note also that, since 1.5.0, command line options override input file options (the opposite of the previous behavior), to follow JavaCC behavior. In other words, input file options are applied only when the corresponding command line options are not explicitly set. Therefore, if an option setting with the default value is mandatory for proper functioning of the generated parser (e.g. STATIC=false;), it is recommended to set it explicitly in the input file and not to set it on the command line (whose settings can come from the JavaCC Eclipse plugin, an Ant or Maven script…).
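The precedence rule above can be pictured with a minimal Java sketch. It is not JTB's own code: the class name, the method name and the use of plain string maps are assumptions made for the illustration. Only the ordering (defaults, then input file options, then command line options) comes from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OptionPrecedenceSketch {

  /**
   * Merges option settings: defaults first, then input file options,
   * then command line options, so that later sources override earlier ones.
   */
  static Map<String, String> effectiveOptions(Map<String, String> defaults,
                                              Map<String, String> fileOptions,
                                              Map<String, String> commandLineOptions) {
    Map<String, String> effective = new LinkedHashMap<>(defaults);
    effective.putAll(fileOptions);        // file options replace the defaults
    effective.putAll(commandLineOptions); // explicitly set command line options win
    return effective;
  }

  public static void main(String[] args) {
    Map<String, String> defaults = new LinkedHashMap<>();
    defaults.put("JTB_CHM", "false");
    defaults.put("JTB_PP", "false");

    Map<String, String> fileOptions = new LinkedHashMap<>();
    fileOptions.put("JTB_CHM", "true");   // set in the grammar's options section
    fileOptions.put("JTB_PP", "true");

    Map<String, String> commandLine = new LinkedHashMap<>();
    commandLine.put("JTB_CHM", "false");  // e.g. -chm=false on the command line

    // prints {JTB_CHM=false, JTB_PP=true}
    System.out.println(effectiveOptions(defaults, fileOptions, commandLine));
  }
}
```

Here JTB_PP keeps its input file value because nothing on the command line mentions it, while JTB_CHM takes the command line value.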

The options are the following:

Command line | File and command line | Description | Since | Default
-chm | JTB_CHM=(false|true) | Generate nodes children methods | 1.5.0 | false
-cl | JTB_CL=(false|true) | Print a list of the classes generated to standard out | | false
-d dir | JTB_D="dir" | Short for (and overwrites) "-nd dir/syntaxtree -vd dir/visitor -hkd dir/hook" | | none
-dl | JTB_DL=(false|true) | Generate depth level info | | false
-do | JTB_DO=(false|true) | Print a list of resulting (file and command line) options to standard out | 1.5.0 | false
-e | JTB_E=(false|true) | Suppress JTB semantic error checking | | false
-eg class | JTB_EG="class" | Calls class (to be found in the classpath) to run a user supplied generator | 1.5.0 | none
-f | JTB_F=(false|true) | Use descriptive node class field names | | false
-h | N/A | Display this help message and quit | |
-hk | JTB_HK=(false|true) | Generate node scope hook interface, default implementation and method calls | 1.5.0 | false
-hkd dir | JTB_HKD="dir" | Use dir as the directory for the node scope hook interface and class | 1.5.0 | hook
-hkp pkg | JTB_HKP="pkg" | Use pkg as the package for the node scope hook interface and class | 1.5.0 | hook
-ia | JTB_IA=(false|true) | Inline visitors accept methods in syntax tree classes | | false
-jd | JTB_JD=(false|true) | Generate JavaDoc-friendly comments in the generated files | | false
-nd dir | JTB_ND="dir" | Use dir as the directory for the syntax tree nodes | | syntaxtree
-noplg | JTB_NOPLG=(false|true) | Do not parallelize user nodes classes generation | 1.5.0 | false
-nosig | JTB_NOSIG=(false|true) | Do not generate signature annotations and classes | 1.5.0 | false
-novis | JTB_NOVIS=(false|true) | Do not generate visitors interfaces and classes | 1.5.0 | false
-np pkg | JTB_NP="pkg" | Use pkg as the package for the syntax tree nodes | | syntaxtree
-npfx str | JTB_NPFX="str" | Use str as prefix for the syntax tree nodes | | none
-ns class | JTB_NS="class" | Use class as the class which all node classes will extend | | none
-nsfx str | JTB_NSFX="str" | Use str as suffix for the syntax tree nodes | | none
-o file | JTB_O="file" | Use file as the filename for the annotated output grammar | | jtb.out.jj
-p pkg | JTB_P="pkg" | Short for (and overwrites) "-np pkg.syntaxtree -vp pkg.visitor -hkp pkg.hook" | | none
-pp | JTB_PP=(false|true) | Generate parent pointers in all node classes | | false
-printer | JTB_PRINTER=(false|true) | Generate syntax tree dumping and formatter visitors | | false
-si | N/A | Read from standard input rather than a file | |
-tk | JTB_TK=(false|true) | Generate (store) special tokens in the tree's NodeTokens | | false
-tkjj | JTB_TKJJ=(false|true) | Generate (print) special tokens in the annotated grammar (implies -tk) | | false
-vd dir | JTB_VD="dir" | Use dir as the directory for the default visitor classes | | visitor
-vis str | JTB_VIS="str" | Use str as the specification string for the visitor class(es) | 1.5.0 | Void,void,None;VoidArgu,void,A;Ret,R,None;RetArgu,R,A
-vp pkg | JTB_VP="pkg" | Use pkg as the package for the default visitor classes | | visitor
-w | JTB_W=(false|true) | Do not overwrite existing files | | false

Examples

The examples referenced below are part of the JTB test suite:

  1. a small example named eg1.jtb, and its generated files under ex1jtb, similar to a JJTree example eg1.jjt

Syntax tree nodes

Two types of syntax tree nodes are generated, in the syntaxtree directory:

  1. one, called user node, for each
    • BNFProduction, unless it is tagged by the ! indicator
    • JavaCodeProduction, only if it is tagged by the % indicator

Base nodes are invariant (i.e. they do not depend on the input grammar; they may vary across JTB versions).

Each user node is generated with one (class) field, of a base or user node type, per child of the node. See the 2 examples.

These fields are by default named f0, f1, … fn, but can be named like their type (class name) through the -f option. See the SmallGrammar.jtb example.

The syntax tree default sub directory and sub package (relative to the grammar / parser directory / package) are both syntaxtree and can be changed by the -nd and -np options. See the SmallGrammar.jtb example.

The node class names can optionally be prefixed (-npfx str) or suffixed (-nsfx str) with a string (usually to distinguish them better), and the nodes can optionally inherit from a super class (-ns class). See the SmallGrammar.jtb example.

By default JTB generates the syntax tree user nodes in parallel (through the JDK 1.8 stream mechanism); the -noplg option tells JTB to generate them serially. See the SmallGrammar.jtb example.

For examples using the ! and % indicators on BNFProductions and JavaCodeProductions, see the FullGrammar.jtb example.
Note that using the ! and % indicators breaks the grammar's compatibility with a pure JavaCC grammar.
These features have been added to mimic the corresponding JJTree features but should rarely be used.

Visitors

Before 1.5.0 by default 4 visitor interfaces and 4 default implementations were generated. They covered the combination of with or without argument and with or without (void) return type.

Since 1.5.0 the -vis / JTB_VIS option allows customizing the generated visitor(s): one can set:

These types can be:

Syntax is: "Namepart,(Ret|void),(Arg(,Arg)*)?(;Namepart,(Ret|void),(Arg(,Arg)*)?)*", meaning it is:

The class names will be: **Visitor**, where **Prefix** will be:

Ex: "Void,void,None;Vis2,R,A,a.b.MyClass[],short..." will create:

Visitors generation can be suppressed by the -novis option.
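To make the -vis specification syntax more concrete, here is a minimal Java sketch that splits a specification string into its parts. It is not JTB's parser: the class and method names are hypothetical, it does not interpret the tokens in any way, and it naively splits on commas, so it would not handle argument types that themselves contain commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class VisitorSpecSketch {

  /** One visitor specification: name part, return type, argument types. */
  static final class Spec {
    final String namePart;
    final String returnType;
    final List<String> argTypes;

    Spec(String namePart, String returnType, List<String> argTypes) {
      this.namePart = namePart;
      this.returnType = returnType;
      this.argTypes = argTypes;
    }

    @Override
    public String toString() {
      return namePart + " returns " + returnType + " args " + argTypes;
    }
  }

  /** Splits "Namepart,(Ret|void),Arg,...;Namepart,..." into Spec objects. */
  static List<Spec> parse(String visSpec) {
    List<Spec> specs = new ArrayList<>();
    for (String one : visSpec.split(";")) {        // one entry per visitor
      String[] parts = one.trim().split(",");      // naive: no commas inside types
      if (parts.length < 2) {
        throw new IllegalArgumentException("Expected Namepart,(Ret|void)[,Arg...] in: " + one);
      }
      List<String> args = new ArrayList<>(Arrays.asList(parts).subList(2, parts.length));
      specs.add(new Spec(parts[0], parts[1], args));
    }
    return specs;
  }

  public static void main(String[] args) {
    // The default value of the -vis option, taken from the options table.
    for (Spec s : parse("Void,void,None;VoidArgu,void,A;Ret,R,None;RetArgu,R,A")) {
      System.out.println(s);
    }
  }
}
```

With the default value, the sketch yields four specifications, one per generated visitor.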

The visitors' default sub directory and sub package (relative to the grammar / parser directory / package) are both visitor and can be changed by the -vd and -vp options. See the SmallGrammar.jtb example.

The default implementations are depth-first (or child-first), which means that a node’s children are visited before the node’s siblings; they just walk the whole tree and do nothing else.

Writing a new visitor is easy: one just has to extend a default visitor and override the appropriate methods.
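As an illustration of extending a default visitor, here is a minimal, self-contained Java sketch. All classes in it are hypothetical stand-ins written for this example; they are not the classes JTB generates. Only the f0 / f1 field naming and the depth-first walk follow the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class VisitorSketch {

  /** Hypothetical stand-in for a visitor interface (void return, no argument). */
  interface IVoidVisitor {
    void visit(Leaf n);
    void visit(Pair n);
  }

  /** Hypothetical stand-in for the common node interface. */
  interface INode {
    void accept(IVoidVisitor v);
  }

  /** Hypothetical leaf node holding a token image. */
  static final class Leaf implements INode {
    final String image;
    Leaf(String image) { this.image = image; }
    public void accept(IVoidVisitor v) { v.visit(this); }
  }

  /** Hypothetical user node with two children, named f0 and f1 as in the default naming. */
  static final class Pair implements INode {
    final INode f0;
    final INode f1;
    Pair(INode f0, INode f1) { this.f0 = f0; this.f1 = f1; }
    public void accept(IVoidVisitor v) { v.visit(this); }
  }

  /** Depth-first default visitor: walks every child and does nothing else. */
  static class DepthFirstVoidVisitor implements IVoidVisitor {
    public void visit(Leaf n) { /* nothing to do on a leaf */ }
    public void visit(Pair n) {
      n.f0.accept(this); // children are visited before any sibling of n
      n.f1.accept(this);
    }
  }

  /** User visitor: overrides only the method it cares about. */
  static final class LeafCollector extends DepthFirstVoidVisitor {
    final List<String> images = new ArrayList<>();
    @Override
    public void visit(Leaf n) { images.add(n.image); }
  }

  /** Walks the tree with a LeafCollector and returns the leaf images in visit order. */
  static List<String> collect(INode root) {
    LeafCollector collector = new LeafCollector();
    root.accept(collector);
    return collector.images;
  }

  public static void main(String[] args) {
    INode tree = new Pair(new Pair(new Leaf("a"), new Leaf("b")), new Leaf("c"));
    System.out.println(collect(tree)); // prints [a, b, c]
  }
}
```

The user visitor inherits the tree walk from the default visitor and only adds behavior on the leaves.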

The -dl option enables generating a field reflecting the depth level. See the SmallGrammar.jtb example.

The -ia option enables generating visit methods with inlined code, that is, the accept() call of each field of a production is replaced by a complete piece of code that walks one level deeper in the grammar on the corresponding field. See the FullGrammar.jtb example.

Control signature

Each user node has its own children, derived from the grammar, and the generated default visitors walk through these children in a non-generic way (that is, by referencing fields 0, 1, … n);
user visitors are usually coded by first copying the default visitors' code and then modifying it, so they also access the children in a non-generic way.

Therefore, if one starts coding a visitor and then modifies some production definitions in the grammar, some user visitor code is likely to no longer work on the new children: most of the time a ClassCastException will occur.

So a mechanism inspired from the Java serialization version control has been added:

To use this mechanism, compile with the ControlSignatureProcessor.java annotation processor, in the JDK 6+ way:
you will get compile errors on methods for which the old signature value in the old user visitor no longer matches the newly generated signature value in the regenerated NodeConstants.java class.

See targets compile_ap_pkg, make_ap_jar and compile_visitors_with_ap in build.xml in JTB’s source distribution for an example of how to use the annotation processor within javac compilation.

Do not forget to create / update the META-INF/services/javax.annotation.processing.Processor file for your project, with your <pkg>.visitor.signature.ControlSignatureProcessor class name, or use the alternative javac options.

The signature values are always generated when visitors are generated.

Signature files and annotations generation can be suppressed by the -nosig option; once they are generated, they are not overwritten, so users have to delete them if they need JTB to regenerate them.

Special tokens and comments

In a JavaCC grammar some tokens are called special tokens: they are recognized by the TokenManager and passed to the Parser, which just links them to the next non-special token; this feature is usually used to handle comments.

In JTB these special tokens can be stored in the parse tree’s NodeToken nodes through the -tk option. See the eg1.jtb example.
These special tokens can also be printed in the annotated .jj file through the -tkjj option.

By default, JTB prints, in the syntaxtree nodes and in the visitors, user-friendly javadoc comments recalling the grammatical structure of the productions; this can be turned off through the -jd option. See the SmallGrammar.jtb example.

Parent pointer

The -pp option enables generating in the syntax tree nodes (see the SmallGrammar.jtb example):

Children methods

The -chm option enables generating, in the syntax tree nodes, methods for accessing the children generically (through lists and numbers, and for all children, base children only or user children only). See the SmallGrammar.jtb example.

Node hooks

The -hk option enables generating (see the SmallGrammar.jtb example):

The directory and package for these files are by default hook and can be changed by the -hkd and -hkp options.

External generator method call

Setting a class name in the -eg class option and setting the classpath (for the directory or jar file of the external classes and tools) enables JTB to call, through the Java Reflection API, this class’s method int generate(Map<String, Object>), passing it a data model map holding some JTB global information (see JTB#callExternalGenerator(final List<UserClassInfo> aClasses)).

This makes it possible, for example, to use a template processor like Freemarker to generate different visitors…
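The following minimal Java sketch shows what such an external generator class and its reflective invocation could look like. Only the int generate(Map<String, Object>) signature comes from the text above. The class names, the no-argument constructor, the fact that the method is an instance method, the meaning of the return value and the map key used in main are all assumptions made for the illustration.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

class ExternalGeneratorSketch {

  /** Loads the named class, instantiates it and calls its generate(Map) method by reflection. */
  static int callGenerator(String className, Map<String, Object> dataModel)
      throws ReflectiveOperationException {
    Class<?> cls = Class.forName(className);
    Object generator = cls.getDeclaredConstructor().newInstance(); // assumes a no-arg constructor
    Method generate = cls.getMethod("generate", Map.class);
    return (Integer) generate.invoke(generator, dataModel);
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    Map<String, Object> dataModel = new LinkedHashMap<>();
    dataModel.put("hypotheticalKey", "hypotheticalValue"); // not an actual JTB data model key
    int rc = callGenerator(SampleGenerator.class.getName(), dataModel);
    System.out.println("generator returned " + rc);
  }
}

/** Hypothetical user supplied generator: it only dumps the data model it receives. */
class SampleGenerator {

  public int generate(Map<String, Object> dataModel) {
    for (Map.Entry<String, Object> entry : dataModel.entrySet()) {
      System.out.println(entry.getKey() + " -> " + entry.getValue());
    }
    return 0; // assumption: 0 stands for success
  }
}
```

A real generator would replace the dump with its own processing of the data model, for instance feeding it to a template engine.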

A small test class EDU.purdue.jtb.freemarker.egt.ExternalGeneratorTester.java dumping the data model and trying to configure a Freemarker environment is embedded in the JTB jar and can be used to test the generation, mainly the classpath setting. Note that the Freemarker jar file is not provided with JTB.

Example: for a custom class my.gen.generator.java compiled into <path-to-classes-directory>:

See the EGTGrammar.jtb example.

Try ExpansionChoices / catch / finally

In JavaCC

The JavaCC grammar allows a special ExpansionUnit called here ExpansionUnitTCF which includes an ExpansionChoices within a try {...}, followed by an optional list of catch(VarDecl()) {Block()} and an optional finally {Block()} (named here TC*F?).
In the generated parser JavaCC will output these lines of code unchanged around the code generated for the ExpansionChoices; usually they are used to take specific actions on a lexical or parse error.
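The effect can be pictured with a small hand-written Java sketch. It is not code generated by JavaCC: the token list, the ParseException stand-in and the log are hypothetical. It only mimics a parser method in which the user's try / catch / finally lines surround the code matching a choice between "a" and "b".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TcfSketch {

  /** Hypothetical stand-in for the parser's ParseException. */
  static class ParseException extends Exception {
    ParseException(String message) { super(message); }
  }

  private final List<String> tokens;
  private int pos = 0;
  final List<String> log = new ArrayList<>();

  TcfSketch(List<String> tokens) { this.tokens = tokens; }

  /** Mimics a production whose body is: try { "a" | "b" } catch (...) {...} finally {...} */
  void tcf() {
    try {
      // code standing for the ExpansionChoices ( "a" | "b" )
      String t = pos < tokens.size() ? tokens.get(pos) : "<EOF>";
      if (t.equals("a") || t.equals("b")) {
        pos++;
        log.add("matched " + t);
      } else {
        throw new ParseException("unexpected " + t);
      }
    } catch (ParseException e) {
      log.add("recovered: " + e.getMessage()); // user action on a parse error
    } finally {
      log.add("done");                         // user action in all cases
    }
  }

  /** Runs the production on the given tokens and returns the log. */
  static List<String> run(String... input) {
    TcfSketch parser = new TcfSketch(Arrays.asList(input));
    parser.tcf();
    return parser.log;
  }

  public static void main(String[] args) {
    System.out.println(run("a")); // prints [matched a, done]
    System.out.println(run("x")); // prints [recovered: unexpected x, done]
  }
}
```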

In the JTB test suite there are 2 examples of this syntax (in a production named Tcf(), in grammars named TcfGrammar), one in JTB and its equivalent in JJTree:

  1. TcfGrammar.jtb, and its generated files under jtb-tcf, and under jj-tcf

In JJTree & JTB

You can use these ExpansionUnitTCF constructs in JJTree and JTB: they are kept in the annotated .jj file, which parses the input and builds the corresponding abstract syntax tree (the AST) along the way.

There is a significant difference between JJTree and JTB here: during parsing:

The key aspects of JJTree are its concept of node scope and its bottom-up building of the tree; those of JTB are its concept of user and base nodes and its top-down building of the tree.

Note that, if you want to change the default behaviors:

In Visitors

But what about the visitors (the default generated ones or the user implementations)?
Should they contain the try / catch* / finally or not?

JJTree does not generate them in the generated default visitor.

JTB 1.4.x generated them in the generated default visitors, considering that users could need to catch exceptions while navigating the tree.
This has been removed in 1.5.0, considering that this need would arise very rarely, that users would only use a TC*F? for parse exception handling purposes and not for tree building or tree walking purposes, and that they would extend the default visitor with a custom one if they have specific needs.