Linguist comprises a core of classes that handle the basic features of any language, plus a large set of classes you write yourself, each of which deals with a single aspect of your target language. We supply a comprehensive set for you to use or modify as you prefer.
As a simple example, let's take a few lines that would be recognized by the standard package:
    variable Count
    put 5 into Count
    prompt Count
    exit
Part of the job of the core classes is to read in the script and break it up into lines and tokens. Linguist then works its way through the tokens. In this example, the first thing it finds is the word variable. Because nothing in Linguist is specific to English, a keyword handler for this word must be provided somewhere. Linguist is organized into packages (see the Reference) and all the handlers in this example can be found in the basic package.
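As a rough illustration of the "lines and tokens" step, here is a minimal sketch that assumes tokens are simply separated by whitespace. The class name TokenizerSketch is hypothetical, and Linguist's own reader may well treat quoted strings, comments and blank lines differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Linguist: splits a script into lines,
// then splits each line into whitespace-separated tokens.
final class TokenizerSketch {

    static List<List<String>> tokenize(String script) {
        List<List<String>> lines = new ArrayList<>();
        for (String line : script.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines carry no tokens and are skipped
            }
            lines.add(Arrays.asList(trimmed.split("\\s+")));
        }
        return lines;
    }
}
```

Applied to the four-line example above, this would give four token lists, the first being "variable" followed by "Count". A real reader would also keep the original line number with each list so that errors can be reported against the script.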
Variables of various kinds are the key to Linguist, and are where most of the work is done. They are consequently the largest classes. Whenever you put a value into a variable, move an object to a position on the screen, display a graphic, play a movie or set an attribute of something, the command handler simply calls a method in the appropriate variable and lets it do the job. This is object orientation at its best, where only the object itself knows how to do something; the other classes merely give it an instruction to do it now.
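The delegation described here might look something like the following sketch for a command such as put 5 into Count. Every name in it (ValueHolderSketch, IntVariableSketch, PutCommandSketch) is invented for illustration and should not be read as Linguist's actual class layout.

```java
// Hypothetical interface: something a command can store a value in.
interface ValueHolderSketch {
    void setValue(int value);
    int getValue();
}

// Hypothetical variable class: only the variable knows how it stores its value.
class IntVariableSketch implements ValueHolderSketch {
    private final String name;
    private int value;

    IntVariableSketch(String name) {
        this.name = name;
    }

    public void setValue(int value) {
        this.value = value;
    }

    public int getValue() {
        return value;
    }

    String getName() {
        return name;
    }
}

// Hypothetical command handler: it does no work itself, it just tells
// the variable to take the value, then reports which handler runs next.
class PutCommandSketch {
    private final int value;
    private final ValueHolderSketch target;
    private final int next;

    PutCommandSketch(int value, ValueHolderSketch target, int next) {
        this.value = value;
        this.target = target;
        this.next = next;
    }

    int execute() {
        target.setValue(value);
        return next;
    }
}
```

The command handler stays tiny because the storage logic lives in the variable. A graphic or movie variable could implement its own methods in the same spirit without the command classes needing to change.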
Keyword handlers
To locate a handler, Linguist searches the packages one by one. For each package it takes the keyword (here the word variable), makes its first character upper-case, then prefixes the result with the capitalized package name and the identifier K (for Keyword). For the basic package the result is BasicKVariable. It then looks for a class of that name in the basic.keyword package. If the class is not present, the compiler moves on to the next package; if this were graphics, it would look for GraphicsKVariable in graphics.keyword. This process continues until either a handler is found or there are no more packages. In the latter case an error is reported and compilation is aborted.
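The naming rule and package search can be sketched in a few lines of Java. This is only an illustration: HandlerLookupSketch and its methods are hypothetical names, and the use of Class.forName is an assumption about how such a lookup could be done, not a statement of how Linguist does it.

```java
import java.util.List;

// Hypothetical helper, not part of Linguist.
final class HandlerLookupSketch {

    // "variable" becomes "Variable"
    static String capitalize(String word) {
        return Character.toUpperCase(word.charAt(0)) + word.substring(1);
    }

    // ("basic", "variable") becomes "basic.keyword.BasicKVariable"
    static String handlerClassName(String pkg, String keyword) {
        return pkg + ".keyword." + capitalize(pkg) + "K" + capitalize(keyword);
    }

    // Tries each package in order and returns the first handler class
    // that can be loaded, or null if no package supplies one.
    static Class<?> findHandler(List<String> packages, String keyword) {
        for (String pkg : packages) {
            try {
                return Class.forName(handlerClassName(pkg, keyword));
            } catch (ClassNotFoundException e) {
                // Not in this package; move on to the next one.
            }
        }
        return null; // the caller would report an error and abort compilation
    }
}
```

Because the class name is built from the keyword at compile time, adding a new script word under this scheme needs no registration step: dropping a suitably named class into a package's keyword directory is enough for the search to find it.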
Once a handler is found, Linguist calls its handleKeyword() method. Every keyword handler has one, and it's where the real compilation is done. The handler (written by you) takes over at this point to complete the job of identifying what the script is asking for. Once it finishes, it returns a runtime handler containing all necessary data. The main compiler class adds this to a Vector that ends up having one element per command in your script.
In the case of a variable declaration, the keyword handler will ask for the next token (word). It's expecting the name of a variable, so the token is checked against the symbol table to rule out duplicates. It then instantiates and preinitializes a runtime handler - in this case called BasicHVariable - containing the current script line number (for reporting errors), the name of the variable, its location in the compiled script and a flag that tells if it's an array. The handler is returned to Linguist, which places it into the Vector that is to become the runtime program.
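To make the steps concrete, here is a simplified sketch of what a variable-declaration handler might do. The real handleKeyword() signature is not shown in this article, so the parameters below are assumptions, and VariableRecordSketch is a hypothetical stand-in for BasicHVariable.

```java
import java.io.Serializable;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for the runtime handler; it holds the four
// pieces of data listed above.
class VariableRecordSketch implements Serializable {
    final int line;        // script line number, for error reports
    final String name;     // variable name
    final int location;    // position in the compiled script
    final boolean isArray; // array flag

    VariableRecordSketch(int line, String name, int location, boolean isArray) {
        this.line = line;
        this.name = name;
        this.location = location;
        this.isArray = isArray;
    }
}

// Hypothetical keyword handler; the parameter list is an assumption.
final class VariableKeywordSketch {

    static VariableRecordSketch handleKeyword(Iterator<String> tokens,
                                              Map<String, Integer> symbols,
                                              int line, int location) {
        if (!tokens.hasNext()) {
            throw new IllegalStateException("Line " + line + ": variable name expected");
        }
        String name = tokens.next();
        if (symbols.containsKey(name)) {
            throw new IllegalStateException("Line " + line + ": duplicate variable " + name);
        }
        symbols.put(name, location);
        // Array syntax is not covered in this sketch, so the flag is always false.
        return new VariableRecordSketch(line, name, location, false);
    }
}
```

The record implements Serializable so that it can be written out with the rest of the compiled program.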
Other keyword handlers work in much the same way, some more complex than others. I'll provide more detail in the tutorial.
Runtime handlers
Once the script has been compiled it can be saved to disk or run directly. (Saving is done by serialization, so you need to make sure all of your classes are Serializable.) The runtime process is very simple; Linguist starts with the first handler in the program Vector and calls its execute() method. The value returned is the index of the next handler to be executed, or zero if this script thread has ended. The overhead is very low.
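The run loop described above is small enough to sketch in full. The interface and class names here are hypothetical, and the sketch takes the rule literally: execution starts at the first handler and stops as soon as a handler returns zero.

```java
import java.util.Vector;

// Hypothetical interface for a compiled command.
interface RuntimeHandlerSketch {
    // Returns the index of the next handler, or zero to end the thread.
    int execute();
}

// Hypothetical runner, not part of Linguist.
final class RunLoopSketch {

    // Runs the program and returns how many handlers were executed.
    static int run(Vector<RuntimeHandlerSketch> program) {
        if (program.isEmpty()) {
            return 0;
        }
        int executed = 0;
        int pc = 0; // start with the first handler in the Vector
        do {
            pc = program.get(pc).execute();
            executed++;
        } while (pc != 0);
        return executed;
    }
}
```

A handler for a straight-line command would return its own index plus one, while a jump or conditional would return whatever index the script asks for. The loop itself does nothing but an index lookup and a method call per command.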
The execute() method contains the real meat of what you have embedded inside the script language. It can be as simple or as complex as you like, from simple assignments to complex mathematical functions or even whole embedded Java programs. You can also use it as a jumping-off point for native methods.
Linguist forces you to break your project down into blocks that can be represented by script words. This is an essential discipline in large projects and the extra effort involved in setting up the structures is amply repaid. There's little temptation to hack a temporary patch; it's easier to create a new command or syntax variant and work inside a new runtime handler. Mistakes made in one place are contained and don't propagate through the rest of your application.