; ** README.stax-0.5.0 ******************************* ; A bried description of the STAX programming language ; ; John Haskins, Jr. ; l'Universite de Virginie (University of Virginia) ; jwh6q@virginia.edu ; ---------------------------------------------------- STAX, The Programming Language... --------------------------------- The STAX programming langauge is similar in style to assembly langauge. The language contains directives, instructions and labels. Note, the example STAX program fragment below. ; Comment: ; STAX program to accept { 0^n1^n2^n | n>=0 } .input_alphabet "012" ; input alphabet declaration .stack_alphabet "ab" ; stack alphabet declaration ; First state in STAX source code is assumed to be the start state, ; here: 'qinit0'. qinit0: ; label ('qinit0') declaring the state's name accept ; 'accept' keyword... an accepting state $, -, nop, qdone ; an instruction -, -, push[0]:$, qinit1 ; a second instruction ... q1: deny ; 'deny' keyword... a non-accepting state '0', -, push[0]:'a', q1 ; an instruction '1', 'a'[0], pop[0], q2 ; a second instruction ... qdone: accept ; 'accept' keyword... an accepting state halt ; 'halt' keyword instruction... halt machine First and foremost, in any STAX program, the valid input alphabet and stack alphabet sets must be declared. In STAX, alphabet symbols are 8-byte char- acters: A through Z, a through z, 0 through 9, etc. (e.g., basically, any of the printable characters except '$' and '-'). [NOTE: In this first ver- sion of STAX there is no shorthand notation for declaring a set of character symbols as there is in, say, 'lex' (e.g., [A-Z]). This will probably be added in the next version.] Each symbol of the alphabet must be separately declared after the directive in double quotes; whether '.input_alphabet' or '.stack_alphabet' is declared first is irrelavent. The symbols '$' and '-' are reserved in STAX. The dollar sign ($) marks the end of input and should be appended to the end of the input string fed to the 'stax' executable (e.g., stax p7.stax 0000$); '$' is also used to mark the bottom of a stack. The dash (-) is used to denote epsilon. Once the alphabet symbols are declared, all that remains is to list states of the machine. Three states are shown above: 'qinit0', 'q1' and 'qdone'. 'qinit0', the first state declared in the program is assumed by the STAX simu- lator to be the starting state of the machine. Immediately after the label, declaring the state, the keyword 'accept' or 'deny' must appear; this brands the state to be either an accepting state or a non-accepting state. If the simulation jumps to a state branded 'accept' and no more input symbols exist, the simulation halts and returns 1 (accepting); if the simulation jumps to a state branded 'deny' and no more input symbols exist, the simulation halts and returns 0 (non-accepting). After a state is declared [a label (i.e., state name) followed by a colon [:] character)] and branded either 'accept' or 'deny', instructions describing appropriate actions to take while in the state must be given. The simplest instruction is the 'halt' keyword instruction. If the simulator jumps to a state whose instruction is 'halt' the machine will immediately halt and return 0 or 1 for accepting or non-accepting, respectively. No other instruc- tions may be listed along-side a halt instruction. More complex instructions follow the pattern, A, B[C], D[E]:F, G; if A is the current input symbol and the stack symbol B appears on top of stack C, execute D on stack E using sym- bol F and jump to state G, where C, E and F are optional. The instruction '1', 'a'[0], pop[0], q2 in state 'q1', for example, reads: if '1' is the current input symbol and stack symbol 'a' appears on top of stack 0, pop the topmost symbol from stack 0 and jump to state 'q2'. The mappings are as follows: A -> '1' B -> 'a' C -> [0] D -> pop E -> [0] F (no match) G -> q2 The instruction $, -, nop, qdone from state 'qinit0' reads: if $ is the current input symbol, do nothing and jump to state 'qdone'. The mappings for this instruction are: A -> $ B -> - C (no match) D -> nop E (no match) F (no match) G -> qdone Note that neither the dollar sign ($) nor the dash (-) are in quotes; this is because both are reserved symbols in the STAX language. The dollar sign (as stated above) denotes the end of the input string. The dash denotes an epsilon; in other words, execute the 'nop' and jump to 'qdone' regardless of the topmost symbol of any stack. When the dash is the input symbol as in the instruction -, -, push[0]:$, qinit1 it denotes that an epsilon transition should occur. The machine will fork a new copy of itself and execute non-deterministically alongside the original in the simulator. Legal mappings from the instruction template: A, B[C], D[E]:F, G are given below: A -> -, $ or symbol from '.input_alphabet' B -> -, $ or symbol from '.stack_alphabet' C -> number of a stack D -> nop, pop or push E -> number of a stack F -> $ or symbol from '.stack_alphabet' G -> state name (e.g., 'qdone') Note that the F member of the template is only valid when used with the 'push' operation. Do not be mislead by the authors use of 'q' in the names of states. State names need not begin with 'q'; this is mearly a convention that the author became familiar with in the computational theory book written by M. Sipser at MIT. State names are strikingly similar to variable names in C; they must begin with a letter (A-Z, a-z), may contain any mix of letters and digits (0-9) thereafter and are case SenSiTIve. Finally, comments may appears virutally anywhere in the program and are handled similarly to C++ comments. Anything typed after a semicolon (;) until the end of that line will be ignored by the compiler. Happy coding!