View this PageEdit this PageAttachments to this PageHistory of this PageHomeRecent ChangesSearch the SwikiHelp Guide
Hotspots: Admin Pages | Turn-in Site |
Current Links: Cases Final Project Summer 2007

Team8's third milestone -OR- How to use TGen

Team8 consists of:
Thomas Annandale of
Michael Tiffany
Jason Faulkner
Carl Cox

If you want to just get straight to downloading our project and copying our code, our entire project is here:
Also, here's our CRC/UML stuff in Boost format: team8.ver

This Case page is being written in hopes of helping someone who's never used TGen before. In coding the third milestone (see Summer 2000 Milestones), figuring out exactly how to use TGen was my biggest stumbling block. I will try to go through how I used TGen for the milestone in hopes of helping you understand how to use it.

The first thing to do is to download and unzip it. Copy all of the files to the same directory as your image and file-in the file ending with .cs (changeset). I would strongly recommend reading through the inside the tgen folder.

Type "TGenUI open" in your Workspace and Do-It to start up TGen.

The easiest way to learn TGen is through example. Option-click on the top right window and select "load specs". Type in "example3" and press Enter. If you copied all of the files from the tgen directory into your image directory, you should be looking at this:
Uploaded Image: team8_pic1.gif

The top-left windows contains the specifications for the scanner. The bottom-left window has the specs for the grammar. The top-right window gives TGen related output, and the bottom-right window contains the input that you want to scan and parse. Option-click on the top-right window and click "result" to have TGen show you the results of the parse. What it's showing you is the inspect of the head node of the parse tree that is created for the input in the bottom right window. Right now, since you're currently in "derivation" mode (see the button on the very top right), the tree is being created out of DerivationTreeNodes, which are a standard node that doesn't do anything, but show you what the tree looks like in the inspect window. When you start specifying your own nodes to use, you'll have to change the mode to "AST" by clicking the button in the top-right corner, but for now, we'll stick with derivation node.

The top-left window contains the specifications for the scanner. You define your tokens much the same way that you did for lex in CS2330. The only difference is that pretty much anything that isn't alphanumeric has to be escaped with a backslash (\), and if your grammar isn't compiling and you aren't getting any messages, the first thing you should check is that all your non-alphanumerics are preceded with a backslash. To compile your code, make a change in your grammar and accepted it with Alt-S. In the example scanner above, "space" is being defined as a token that contains any number of spaces, tabs, or return carriages. You can get the entire list of special characters (tabs, spaces, etc) inside the file. The "ignoreDelimeter" directive next to the definition of "space" means that no space token is ever actually passed to the parser. Instead, all instances of space are ignored. We will see more complex scanner definitions later. It is not necessary to define a token for every possible input. The scanner will pass to the parser any input that isn't explicitly defined as a token.

The bottom-left window contains the specifications for the parser. As you can see, the format is very similar to the yacc format used in CS2330. The only interesting things are the directives in brackets to the right of the rule. Most of the time, these directives specify the name of the class that the parser should try to create the node out of, but there are two special directives that can be used to tell the parser to do special things. These special directives are "liftRightChild" and "liftLeftChild". Unfortunately, I really have no idea what these special directives do, and neither did my team. Fortunately, we got a 100 anyway, so my advice is to just forget that there's any such thing as a special directive.

As I mentioned, the other thing that is specified in the brackets are the name of the class to construct the node out of. At the moment, since we're in "derivation" mode, these class names are being ignored and DerivationTreeNodes are being used instead, which is ok since none of those classes are defined anyway (Unless you happen to have a "Plus", "Times", "A", "B", and "C" class defined in your image).

Now that you've (hopefully) warmed up a bit, let's move on to the scanner and parser that we wrote for our project. If you've already downloaded and unzipped our project ( into your image directory, you can open this by option-clicking on the top-right window and selecting "load specs" and typing in "team8". Make sure that the button on the top-right of the screen says "AST" (so that it uses the nodes that we've defined), and that the button on the top-left says "SLR(1), LALR(1), LR(1) parser" (because our grammar doesn't work as LL(1) for some reason).
Uploaded Image: team8_pic2.gif

If you've filed in our stuff, you can also see this grammar in action. Type in some LaTeX in the bottom-right window (or just use the LaTeX that's already there) and option-select "result" from the top-right window. That will bring up the head node of our parse tree. Type "self asMathObject openInWorld" and Do-It to open a cute little ImageMorph of the equation.
Uploaded Image: team8_pic4.gif

The key to understanding TGen is to understand how TGen interacts with it's nodes. There are four methods that TGen calls inside the nodes that it uses. These methods are:


The addChildrenFirst: and addChildrenLast: have something to do with those crazy leftLeftChild and liftRightChild special directives. I would just pretend they don't exist.

Let me give you an example of how and when these functions are called. For any grammer rule that consists of only a signal terminal, for example:
Rule1 : Terminal {NodeType} ;
TGen will create a new instance of the class "NodeType" (which you must define!), and call setAttribute: on that node with the text of the terminal as the parameter. You can use this to store the specific text that the user input.

For any rule that contains nonterminals, for example, of the form:
Rule2 : NonTerm1 NonTerm2 Nonterm3 ... NonTermN {NodeType} ;
TGen will, same as before, create a new instance of the class "NodeType", but this time it will call the method setChildrenInitial: inside the node, with a Collection of the NonTerminals as it's parameter (remember, TGen is a bottom-up parser. Note that terminals can be interspersed within the nonterminals, and are ignored. Every non-terminal that is used in a grammer rule will already have a node associated with it by the time the rule is matched). This is how the tree is constructed.

Ok, time for an example. In our program, we had a node called "DefaultNode" that was the parent of every other node type we used. It had two children, "children" and "symbol" for storing what was passed to it in addChildrenInitial: and setAttributes:, respectively.
Uploaded Image: team8_pic6.gif

The addChildrenInitial: method simply saved a copy of the Collection that it was passed, so that it remembered it's children.
Uploaded Image: team8_pic7.gif

The setAttributes: method saved a copy of the string that it was passed.
Uploaded Image: team8_pic8.gif

This is all of the code that you need to build your tree. Now you just have to figure out how to do something meaningful with it. For our project we were supposed to be turning our trees into displayable equations. We already had our code for a "MathObject" written, which is a class that takes two other MathObjects, a left and a right, and an operation, and makes a graphical representation of the operation between the left and right. Sounds a lot like a tree, right? Perfectly suited for using the parse tree! All we have to do now is translate our parse tree into a MathObject tree. The way we did this was by giving all of our nodes an "asMathObject" method which turned the node into its corresponding MathObject. Then we just make a node (all of them subclassing DefaultNode) for handling all of the different operations of MathObject, assign these nodes to their proper grammar rules, and we're set. Let's take the Add rule as an example:

Add : Add '+' Mul {AddNode} ;
Add : Mul {DefaultNode} ;

This rule creates a tree of Add nodes that looks like this:

/ \
/ \
/ \
Add + Mul
/ \
/ \
/ \
Add + Mul
/ \
/ \
/ \
Add + Mul

So now our task is to turn this tree of AddNodes into a tree of MathObjects with add as their operation (the MathAdd class). So the method for the AddNode class is:

^MathAdd left: (children first asMathObject)
right: (children second asMathObject).

Remember that "children" is the instance variable that we defined in DefaultNode that contains the Collection of NonTerminals that was passed in when the node was created. Since there are two nonterminals in the Add rule, children first will get the node of the first nonterminal, and children second will get the node of the second nonterminal. In our case, we called asMathObject on those nodes as well, thus recursively creating our MathObject tree out of the parse tree.

There is one other case that needs to be considered. What about rules with only one terminal, for example:

Add : Mul {DefaultNode} ;

In this case, you're simply passing information up the tree, and no special combination of nodes is necessary (there is only one). I had this case handled in DefaultNode's asMathObject method:

^children first asMathObject.

which just passes the information from the child node up to the parent node with no changes made.

That's all, Folks. Now give me my extra credit.

This tutorial was made by Thomas Annandale. If you like it, you should send him money.

Links to this Page