Tutorial 1

Tutorial 1 | Tutorial 1 Solution | PL0 Concrete Syntax | PL0 Data Structures

Question 1

Given the following PL0 program:

var
	x: int; y: int; max: int;
begin
	read x; read y;
	if x < y then
		max := y
	else
		max := x;
	write max
end

Part A: Describe the series of tokens supplied by the scanner (lexical analyser). Note that int is a predefined identifier, not a keyword in the language. See Figure 1 in the PL0 Concrete Syntax handout for the definitions of PL0 Tokens. Do this for the if/else block.


KW_IF, IDENTIFIER("x"), LESS, IDENTIFIER("y") KW_THEN,
IDENTIFIER("max"), ASSIGN, IDENTIFIER("y"), KW_ELSE,
IDENTIFIER("max"), ASSIGN, IDENTIFIER("x"), SEMICOLON

Part B: Give the parse tree that results from parsing just the if statement in the above program (i.e. ignore everything up to and including the reads and everything from the write onwards). The top node in the parse tree should be for the non-terminal IfStatement and the tree should be consistent with the PL0 Concrete Syntax.

Part C: Give the abstract syntax tree for the if statement within the program. The tree should be consistent with the tree nodes used in the compiler (See PL0 Compiler Data Structures).

Question 2

In which compiler module would you expect the following to occur:

  1. An undeclared identifier

    Conceptually, this would occur in the static semantics analyser (using the symbol table), but in the PL0 compiler, this is done as the symbol table (and abstract syntax tree) is being set up in the abstract syntax tree

  2. An illegal character in the source is detected

    Lexical analyser (or scanner)

  3. The use of “]” instead of “)”

    Parser

  4. Unreachable code is eliminated

    Source code optimiser - Here we distinguish between optimisations that are based only on the source code and optimisations that depend on the machine (but it isn’t expected in the answers).

  5. Run time checks for range overflow are inserted

    Intermediate code generation

  6. Run time checks for range overflow are eliminated where it can be shown from the context that they are redundant

    Source code optimiser

  7. The use of a ‘;’ instead of a ‘,’ in a parameter list is detected

    Parser

  8. The insertion appropriate for a C language #include file macro command (which results in textual substitution of the contents of the file at this point) is performed. (Warning: this is something of a trick question!)

    This is actually done in a separate pre-processor, which is run before the compiler. If not done separately, it would have to be done early, in the scanner

  9. A type conflict in an expression is detected.

    Static semantics analyser.

  10. A constant too large for the target machine is detected.

    If the programming language defines the maximum size for an integer value then it makes sense to do the check early on in the lexical analyser (scanner) but for some languages (like C) the maximum size of an integer may be dependent on target machine (e.g. 16-bit versus 32-bit versus 64-bit machines) and then there is a strong argument for deferring it to target machine code generation. Note that a compiler for a language may have options to generate code for different target machines.

  11. A branch (jump) instruction that has as its destination an unconditional branch instruction is changed to go directly to the destination of the second branch.

    Source code optimiser (or peephole optimiser).

  12. Whether the language has name equivalence or structural equivalence of types is an influence.

    Static semantic analyser.

  13. Whether ’;’s are used as separators or terminators for statements.

    Parser.

  14. Code for coercing an integer to a real where this is required in a mixed mode expression (an expression with both integers and reals) is created.

    This is handled in the static semantic analyser which transforms the expression to include a coercion node, the code generation for which will convert the integer to a real.

  15. The appropriate instance of an overloaded method required for a particular call is identified. Overloading of methods is said to occur when the same name is used for more than one method. The appropriate instance is identified using the types of the parameters.

    Static semantic analysis