Lecture 8
- About the Java PL0 Interpreter
- Interpreter Stack Frames
- PL0 Interpreter Implementation in Java
- Rewrite Rules for Grammars
1.0 - Implementation of PL0 Interpreter in Java
The Tutorial 4 Compiler isn’t actually a compiler - it doesn’t have a code generation phase. In place of code generation, we have implemented an interpreter that executes the nodes in the Abstract Syntax Tree (AST).
- The Tutorial 4 PL0 Compiler has a:
- Parser (a Recursive Descent Parser)
- Static Checker (as described in the previous lecture)
- Interpreter
1.1 - The PL0 Interpreter
- The PL0 Interpreter runs the given program by working on the abstract syntax tree directly:
- It evaluates the expression nodes, and
- It executes the statement nodes
2.0 - Interpreter Stack Frames
2.1 - Nested Procedures in PL0
- Remember that PL0 is a procedural programming language, in which we can define procedures and nested procedures
// main program defined at static level 1
var x : int;
y : int; // Main program variables, static level 1
procedure p() = // defined at static level 2
var y : int; // Procedure p variable, static level 2
procedure q() =
var z : int; // Procedure q variables, static level 3
begin z := y; x := z end;
begin ... end // procedure p's block
begin ... end // main program block
- The static level of a procedure is the procedure’s depth of scope:
- The static level of the main program is static level 1
- Any procedure defined within the main program has static level 2
- Any other procedure has static level x+1, where x is the static level of its parent procedure
- Variables have the same static level as the procedure that they’re defined in
An inner procedure can access its local variables, and the variables of all procedures in which it is nested
| Procedure Level | Variables Accessible |
| --- | --- |
| Main program | x : int; y : int |
| Procedure p | x : int; y : int (the local variable y of p, not the variable y defined in main) |
| Procedure q | x : int; y : int (the variable y of p, not the variable y defined in main); z : int |
When we’re writing our compilers and interpreters, we need to keep track of the value of these local variables for each procedure invocation
2.2 - Interpreter Stack Frames
- A stack of frames (or activation records) is used to keep track of the values of variables for each active procedure invocation (including the main method)
- A frame for the activation of a procedure p contains:
- Static link to the frame of the most recent invocation of the procedure in which p was defined (so that we can access the variables of that procedure, and of any procedures in which it is nested)
- Dynamic link to the frame of the procedure activation that called p (so that we can return to the procedure activation that called p after executing the instructions)
- The static level of the procedure (its depth of scope; we need the static links together with the static levels to reach all of the variables that this procedure can access in nested scopes)
- An array of entries containing the values of the local variables for this activation of p
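The four parts of a frame listed above can be sketched as a small Java class. This is only an illustration: the class name `FrameSketch`, its fields, and the `Integer[]` entries array are assumptions made for this example, and the real `Frame` class of the tutorial interpreter is not shown in these notes.

```java
import java.util.Arrays;

/** Hypothetical activation record; names are illustrative, not the tutorial's Frame class. */
class FrameSketch {
    final FrameSketch staticLink;   // frame of the procedure in which this one was defined
    final FrameSketch dynamicLink;  // frame of the activation that called this one
    final int level;                // static level (depth of scope)
    final Integer[] entries;        // local variable values; null means "not yet assigned"

    FrameSketch(FrameSketch staticLink, FrameSketch dynamicLink, int level, int numLocals) {
        this.staticLink = staticLink;
        this.dynamicLink = dynamicLink;
        this.level = level;
        this.entries = new Integer[numLocals];
    }

    /** Follow static links until a frame at the requested static level is reached. */
    FrameSketch lookupFrame(int targetLevel) {
        FrameSketch frame = this;
        while (frame != null && frame.level != targetLevel) {
            frame = frame.staticLink;
        }
        if (frame == null) {
            throw new IllegalStateException("no frame at static level " + targetLevel);
        }
        return frame;
    }

    public static void main(String[] args) {
        FrameSketch mainFrame = new FrameSketch(null, null, 1, 2);    // main: two locals
        FrameSketch f = new FrameSketch(mainFrame, mainFrame, 2, 1);  // a procedure called from main
        f.lookupFrame(1).entries[0] = 2;                              // write a level-1 variable
        f.lookupFrame(2).entries[0] = f.lookupFrame(1).entries[0];    // copy it into a local
        System.out.println(Arrays.toString(mainFrame.entries) + " " + Arrays.toString(f.entries));
    }
}
```

A variable is addressed by a (level, offset) pair: `lookupFrame` uses the level to pick the frame, and the offset indexes into that frame's `entries`.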
2.2.1 - Example of Interpreter Stack Frames
- Consider the following code:
var x : int; y : int; procedure f() = var y : int; begin y := x; if x > 0 then begin x := x - 1; call f() end else y := x; write y end; begin x := 2; call f() end
To start running this code, we first create an activation record for the main procedure:
- Static Link = null, since the main procedure is at the outermost static level and has no enclosing procedure
- Dynamic Link = null, since the main procedure’s activation record is the first activation record on the stack
- We note that our entries array is [_, _] - in our symbol table we designate that each variable will be stored in the entries array at a given offset.
- When implementing the entries array, we initialise it to an array of NULLs.
- We can specify in the dynamic semantics that trying to access a variable that hasn’t been assigned a value (i.e. is NULL) is an error.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [_, _]
We then start running the statement list in the main procedure’s body. The first statement to execute is an assignment statement:
x := 2
- The expression 2 is a constant expression, with no additional complications.
- We now want to evaluate the left value, x.
- It is a variable at a static level of 1, with an offset of 0.
- The current frame (i.e. the one created in the previous step) has a level of 1, so the value of the expression will be stored in entries[0] of that frame.
- We then update our stack frame:
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [2, _]
We then execute the second statement in the statement list:
call f()
- When we call the procedure f(), we need to create an activation record for it.
- Note that for this activation record, both the Static Link and the Dynamic Link point to the same activation record (the activation record for the main procedure):
- The Dynamic Link points to the activation record that called this procedure.
- The Static Link points to the most recent activation record of the procedure in which f() is defined.
- Also note that the entries array only has one element - this is for the re-declared integer variable y, which has an offset of 0 (stored in entries[0]).
- Additionally, note that the static level is one greater than that of the main procedure, because f() is declared inside it.
- We now create the new activation record, growing the stack downward.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [2, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [_]
We have now entered the procedure, and execute the first statement in its body:
y := x;
- First, we evaluate the expression x - we need to know what static level and offset it has.
- We know that x is defined at a static level of 1, so we follow the static links until we reach an activation record with a static level of 1.
- In this case, this is the previous frame.
- We know that x is defined at a level of 1 with offset 0, i.e. it is stored at entries[0] of that frame.
- From there, we obtain the current value of x (which is 2), and now we work on assigning it to y.
- We know that y is at a static level of 2 and an offset of 0 - stored at entries[0] at level 2.
- We therefore assign y, i.e. entries[0] of the current frame, the value of 2.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [2, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
We then evaluate the following conditional:
if x > 0 then begin x := x - 1; call f() end else
- Using the same lookup process as for the previous statement, we determine that x = 2.
- Since x > 0, we execute the code in the then part of the conditional (as opposed to the else part of the conditional).
In executing the then part, we first want to execute the following statement:
x := x - 1;
- We know from before that x = 2, and that the activation record that contains x is the activation record for the main procedure.
- We want to decrement the value of x, so we update that activation record:
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [1, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
We then execute the other statement in the then branch of the conditional - calling procedure f() again:
- Since we have called a procedure, we create a new activation record.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [1, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
procedure f
Static Link = main Dynamic Link = procedure f level = 2 entries = [_]
- The dynamic link is procedure f (the first invocation), as that is the activation record that called the procedure.
- The static link is main, as that is the most recent activation record of the procedure in which f() is defined.
- The level is 2, as that is the depth of scope of the procedure (note that it doesn’t increment on a recursive call).
- So, the new variable y will be stored at a static level of 2 and an offset of 0, i.e. entries[0] of the new frame.
We then execute the first statement in the body of f():
y := x;
- From before, we know that x is stored at a static level of 1, with an offset of 0.
- We look at our current activation record, and see that it’s at a static level of 2.
- Therefore, x is not stored here.
- We need to follow the chain of static links back until we get to the correct static level.
- Following the static link, we reach the activation record for the main procedure, where we see that x = 1.
- Looking at y, we see that it is at a static level of 2, with an offset of 0.
- The current activation record is at a static level of 2, and therefore we’re looking at the right activation record.
- We assign y the value of x by setting entries[0] to 1.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [1, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
procedure f
Static Link = main Dynamic Link = procedure f level = 2 entries = [1]
We then evaluate the conditional statement - since x > 0 (x is currently 1), we execute the then branch of the conditional.
- We execute the first statement in the branch:
x := x - 1;
- From before, we know that x is stored in the activation record for the main procedure, at an offset of 0.
- We decrement the entry by 1.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [0, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
procedure f
Static Link = main Dynamic Link = procedure f level = 2 entries = [1]
We then execute the second statement in the branch, another invocation of f():
call f()
- To represent this procedure invocation, we create a new activation record. Its dynamic link is the second invocation of f (its caller), and its static link is again main.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [0, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
procedure f
Static Link = main Dynamic Link = procedure f level = 2 entries = [1]
procedure f
Static Link = main Dynamic Link = procedure f level = 2 entries = [_]
We then execute the first statement in the procedure f:
y := x;
- We know that x is stored at a static level of 1, and offset of 0 (in the activation record for the main procedure).
- From this, we know that x = 0.
- We know that y is stored at a static level of 2, and offset of 0 (in the current activation record), and from this we set the value of y to 0.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [0, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
procedure f
Static Link = main Dynamic Link = procedure f level = 2 entries = [1]
procedure f
Static Link = main Dynamic Link = procedure f level = 2 entries = [0]
We then evaluate the conditional:
- Since x = 0, the condition x > 0 is false, so we execute the else branch of the conditional (y := x), which leaves y at 0.
- We then execute the last statement of the body, write y, which outputs 0 (the value of y in this third invocation).
- We’re now done with this procedure invocation, so we pop its activation record and return to the previous activation record (using the dynamic link).
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [0, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
procedure f
Static Link = main Dynamic Link = procedure f level = 2 entries = [1]
In the second invocation of the procedure f, we continue executing the next statement after the procedure call (the call whose activation record we have just popped from the stack). In this frame y holds 1, so this writes 1 after the 0 already written by the innermost invocation.
write y
**// Output (To STDOUT): 01**
Following this same pattern, we finish executing the second invocation of procedure f and pop from the stack once again after evaluating the write y statement.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [0, _]
procedure f
Static Link = main Dynamic Link = main level = 2 entries = [2]
**// Output (To STDOUT): 01**
For the last time, we evaluate the write y statement (which writes 2, the value of y in the first invocation) and pop the activation record corresponding to the first invocation of the procedure f.
- At this point, the static level is 1, and we have returned to the activation record corresponding to the main procedure.
▶️ main
Static Link = null Dynamic Link = null level = 1 entries = [0, _]
**// Output (To STDOUT): 012**
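The whole trace can be replayed in a few lines of Java. This is a sketch under stated assumptions: the `Frame` class, the `atLevel` helper and the hand-translated body of `f` are made up for this illustration and are not the tutorial interpreter's code. It records each `write y` in a list rather than printing it, so the order of the outputs is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-translated replay of the example program; all names are illustrative. */
class RecursiveTrace {
    static final class Frame {
        final Frame staticLink;   // frame of the defining procedure
        final Frame dynamicLink;  // frame of the caller
        final int level;
        final Integer[] entries;

        Frame(Frame staticLink, Frame dynamicLink, int level, int size) {
            this.staticLink = staticLink;
            this.dynamicLink = dynamicLink;
            this.level = level;
            this.entries = new Integer[size];
        }

        /** Follow static links to the frame at the given static level. */
        Frame atLevel(int target) {
            Frame frame = this;
            while (frame.level != target) {
                frame = frame.staticLink;
            }
            return frame;
        }
    }

    static Frame current;
    static final List<Integer> output = new ArrayList<>();

    static void callF() {
        // f is declared in main, so its static link is always the level-1 frame;
        // its dynamic link is whichever frame made the call.
        current = new Frame(current.atLevel(1), current, 2, 1);
        Frame globals = current.atLevel(1);
        current.entries[0] = globals.entries[0];             // y := x
        if (globals.entries[0] > 0) {
            globals.entries[0] = globals.entries[0] - 1;     // x := x - 1
            callF();                                         // call f()
        } else {
            current.entries[0] = globals.entries[0];         // y := x
        }
        output.add(current.entries[0]);                      // write y
        current = current.dynamicLink;                       // pop: return to the caller
    }

    static List<Integer> run() {
        output.clear();
        current = new Frame(null, null, 1, 2);               // main: x at offset 0, y at offset 1
        current.entries[0] = 2;                              // x := 2
        callF();
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [0, 1, 2]
    }
}
```

The innermost invocation writes first, which is why the values appear in the order 0, 1, 2 even though the first invocation holds y = 2.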
3.0 - PL0 Interpreter Implementation in Java
The Interpreter class is contained in the interpreter package.
- The interpreter is implemented using the Visitor pattern:
- StatementVisitor: visitor interface for statements, which declares a method for visiting each different type of Statement node in the AST
- ExpTransform<Value>: visitor interface for expressions, which declares a method for evaluating each type of expression node in the AST
3.1 - The Value Abstract Class & Subclasses
- When we evaluate an expression, we get back a Value (the type argument given to ExpTransform, in the sense of Java generics).
- The Value class is an abstract class whose subclasses cover the kinds of values we could have within our implementation of PL0.
- The Value class has three implementing subclasses. IntegerValue and AddressValue are shown here; ErrorValue is the return type of visitErrorExpNode.
- We evaluate both Integers and Booleans to have the IntegerValue type.
public static class IntegerValue extends Value {
    private int value;
    public IntegerValue(int value) { this.value = value; }
    public int getInteger() { return this.value; }
    public String toString() { return Integer.toString(value); }
}
public static class AddressValue extends Value {
    private int level;
    private int offset;
    public AddressValue(int level, int offset) {
        this.level = level;
        this.offset = offset;
    }
    public int getAddressLevel() { return level; }
    public int getAddressOffset() { return offset; }
    public String toString() { return "address(" + level + "," + offset + ")"; }
}
- Note that here we use the level and offset to keep track of where each address value is.
- We just need these two pieces of information to retrieve the value of these variables.
3.2 - The Interpreter Class
In our Interpreter class, we have the following variables for keeping track of the state of the interpreter:
public class Interpreter implements StatementVisitor, ExpTransform<Value> {
    private final BufferedReader in;      // For reading from STDIN
    private final PrintStream outStream;  // Program output stream
    private final Errors errors;          // Error handler; for error information
    private final VisitorDebugger debug;  // Visitor debugger; for debug info
    private Frame currentFrame;           // Top of stack, current activation record
}
We then execute code in the AST using the executeCode method:
public void executeCode(DeclNode.ProcedureNode node) {
    beginExec("Program");
    SymEntry.ProcedureEntry procEntry = node.getProcEntry();
    // Set up the main activation frame
    // Frame(dynamicLink, staticLink, procEntry);
    currentFrame = new Frame(null, null, procEntry);
    visitBlockNode(procEntry.getBlock());
    endExec("Program");
}
3.3 - Evaluating Expressions in the Java PL0 Interpreter
3.3.1 - Evaluating an Error Expression Node
What happens when we evaluate an ErrorExpNode?
In the Java PL0 interpreter, we have the following phases:
- Parsing
- Static Analysis
- Interpretation (iff there were no errors in the Parsing and Static Analysis phases)
Therefore, it should be impossible to evaluate an ErrorExpNode if our interpreter has been implemented correctly.
// Defensive programming, as this code should never be executed.
public ErrorValue visitErrorExpNode(ExpNode.ErrorNode node) {
    // Error when an error node is evaluated, as evaluation should not occur if
    // there are errors in the AST
    errors.fatal("PL0 Internal error: attempt to evaluate ErrorExpNode",
            node.getLocation());
    return null; // Never reached.
}
3.3.2 - Evaluating a Constant Node
A constant node has a value, and we need to return a Value type.
- We essentially just wrap up the value in an IntegerValue and return that:
/**
 * Expression evaluation for a constant - resolve to the constant's value
 */
public IntegerValue visitConstNode(ExpNode.ConstNode node) {
    beginExec("ConstNode");
    IntegerValue result = new IntegerValue(node.getValue());
    endExec("ConstNode");
    return result;
}
⚠️ Note here that we have the beginExec(...) and endExec(...) function calls - these just print out a message when we run the interpreter in debug mode.
3.3.3 - Evaluating an Identifier Node
⚠️ Identifier nodes are introduced into our AST during the parsing phase, but were all eliminated during the static analysis phase. Therefore, we program defensively again:
/**
* Expression evaluation for an identifier node - should never be reached
*/
public Value visitIdentifierNode(ExpNode.IdentifierNode node) {
/* Error when identifier node is evaluated, identifier nodes should
* be eliminated by the semantic syntax process
*/
errors.fatal("PL0 Internal error: attempt to evaluate IdentifierNode",
node.getLocation());
return null; // Never reached
}
3.3.4 - Evaluating a Variable Node
Instead of returning the value of a variable, we return the address of the variable.
- If we wanted the value of the variable, we could dereference it.
public Value visitVariable(ExpNode.VariableNode node) {
    beginExec("Variable");
    // Get variable from symbol table.
    SymEntry.VarEntry entry = node.getVariable();
    // Use the symbol table entry to construct a new AddressValue
    Value lValue = new AddressValue(entry.getLevel(), entry.getOffset());
    endExec("Variable");
    return lValue;
}
3.3.5 - Evaluating a Binary Node
/**
* Expression evaluation for a binary operator expression
**/
public Value visitBinaryNode(ExpNode.BinaryNode node) {
beginExec("Binary");
int result = -1;
/* Evaluate the left and right sides of the operator expression */
int left = node.getLeft().evaluate(this).getInteger();
int right = node.getRight().evaluate(this).getInteger();
/* Perform the operation on the left and right side of the expression */
switch (node.getOp()) {
/* Mathematical operations */
case ADD_OP:
result = left + right;
break;
case SUB_OP:
result = left - right;
break;
case MUL_OP:
result = left * right;
break;
case DIV_OP:
/* Error when division by zero occurs */
if (right == 0) {
runtime("Division by zero", node.getRight().getLocation(),
currentFrame);
}
result = left / right;
break;
/* Logical operations - resulting in 1 for true and 0 for false */
case EQUALS_OP:
result = (left == right ? Type.TRUE_VALUE : Type.FALSE_VALUE);
break;
case NEQUALS_OP:
result = (left != right ? Type.TRUE_VALUE : Type.FALSE_VALUE);
break;
case GREATER_OP:
result = (left > right ? Type.TRUE_VALUE : Type.FALSE_VALUE);
break;
case LESS_OP:
result = (left < right ? Type.TRUE_VALUE : Type.FALSE_VALUE);
break;
case LEQUALS_OP:
result = (left <= right ? Type.TRUE_VALUE : Type.FALSE_VALUE);
break;
case GEQUALS_OP:
result = (left >= right ? Type.TRUE_VALUE : Type.FALSE_VALUE);
break;
case INVALID_OP:
default:
errors.fatal("PL0 Internal error: Unknown operator",
node.getLocation());
}
endExec("Binary");
return new IntegerValue(result);
}
A binary node has two expressions, so we have to evaluate both of them.
- Since the AST has already been type checked (static checking has been completed), we know that each operand will evaluate to an integer.
int left = node.getLeft().evaluate(this).getInteger();
int right = node.getRight().evaluate(this).getInteger();
The next steps are then operator-dependent, so we use a switch statement on the binary operator to determine what to do:
switch (node.getOp()) {
    /* Mathematical operations */
    case ADD_OP:
        result = left + right;
        break;
    case SUB_OP:
        result = left - right;
        break;
    case MUL_OP:
        result = left * right;
        break;
    ...
}
- In the case of division, we have to guard against division by zero:
case DIV_OP:
    /* Error when division by zero occurs */
    if (right == 0) {
        runtime("Division by zero", node.getRight().getLocation(),
                currentFrame);
    }
    result = left / right;
    break;
⚠️ Types of Errors:
- Syntax Errors - Errors reported during Parsing
- Static Errors - Errors reported during Type Checking Phase
- Runtime Errors - Errors reported when the code is running.
- Typically, runtime errors are for things that are impractical or impossible to check during the syntax analysis or static analysis phase.
- We still want to return integers (even for the relational operators, which produce Boolean values).
- We define Type.TRUE_VALUE = 1 and Type.FALSE_VALUE = 0 so that we can be stylistically consistent whilst also indicating that we’re returning an integer that represents a Boolean value.
At the end, we want to return an instance of Value, so we wrap the integer using the IntegerValue class:
endExec("Binary");
return new IntegerValue(result);
3.3.6 - Evaluating a Unary node
The concept of evaluating a Unary Node is the same as evaluating a Binary Node (except, simpler)
- Evaluate the argument
- Use the switch case to perform operator-dependent things
/**
* Expression evaluation for a unary operator expression
**/
public Value visitUnaryNode(ExpNode.UnaryNode node) {
beginExec("Unary");
/* Handle unary operators */
int result = node.getArg().evaluate(this).getInteger();
//noinspection SwitchStatementWithTooFewBranches
switch (node.getOp()) {
case NEG_OP:
result = -result;
break;
default:
// Never reached
errors.fatal("PL0 Internal error: Unknown operator",
node.getLocation());
}
endExec("Unary");
return new IntegerValue(result);
}
3.3.7 - Evaluating a Dereference Node
We use Dereference nodes to get access to the value of a variable.
/**
* Expression evaluation for dereference - evaluate sub-expression
*/
public Value visitDereferenceNode(ExpNode.DereferenceNode node) {
beginExec("Dereference");
Value lValue = node.getLeftValue().evaluate(this);
/* Resolve the frame containing the variable node */
Frame frame = currentFrame.lookupFrame(lValue.getAddressLevel());
/* Retrieve the variables value from the frame */
Value result = frame.lookup(lValue.getAddressOffset());
if (result == null) {
runtime("variable accessed before assignment", node.getLocation(),
currentFrame);
return null; // Never reached
}
endExec("Dereference");
return result;
}
We evaluate the left value of the dereference node, which will return an address.
Value lValue = node.getLeftValue().evaluate(this);
We then need to determine the value of the variable stored at this address.
- We start by looking up the activation record that is at the static level of the lValue’s address.
- The currentFrame.lookupFrame method follows the static links in the activation records until it reaches a frame with the correct static level.
- Once we have obtained the correct frame, we look up the value in the entries array (using the address offset).
Value lValue = node.getLeftValue().evaluate(this);
/* Resolve the frame containing the variable node */
Frame frame = currentFrame.lookupFrame(lValue.getAddressLevel());
/* Retrieve the variables value from the frame */
Value result = frame.lookup(lValue.getAddressOffset());
We then perform some validation, reporting a runtime error if a variable has been accessed before it has been assigned.
- Note that this is an implementation decision (a choice of runtime semantics) that we have made for this particular implementation.
if (result == null) {
    runtime("variable accessed before assignment", node.getLocation(),
            currentFrame);
    return null; // Never reached
}
We then return the result:
endExec("Dereference");
return result;
3.3.8 - Evaluating a Narrow Subrange Node
- In evaluating a Narrow Subrange Node, we need to perform a few checks / operations:
- Evaluate the expression to get the value.
beginExec("NarrowSubrange");
Value val = node.getExp().evaluate(this);
- Get the subrange type.
Type.SubrangeType subrange = node.getSubrangeType();
- Check that the subrange type can actually hold an integer of that value.
- If the subrange can’t hold that integer, we report a runtime error.
if (!subrange.containsElement(subrange.getBaseType(), val.getInteger())) {
    runtime("bounds check failed at line " + node.getLocation().getLine()
            + ": " + val + " not in " + subrange,
            node.getLocation(), currentFrame);
}
- Return the value.
endExec("NarrowSubrange");
return val;
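The bounds check itself is small. As a rough standalone sketch, assuming a subrange is described by an inclusive lower and upper bound (the class `SubrangeCheck` and the method `narrow` are hypothetical names, not part of the tutorial code, and the real check goes through `subrange.containsElement`):

```java
/** Hypothetical standalone version of the narrowing check; inclusive bounds are assumed. */
class SubrangeCheck {
    /** Returns value unchanged if it lies in lower..upper, otherwise fails at runtime. */
    static int narrow(int lower, int upper, int value) {
        if (value < lower || value > upper) {
            throw new IllegalStateException(
                    "bounds check failed: " + value + " not in " + lower + ".." + upper);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(narrow(1, 10, 5)); // 5
    }
}
```

Note that narrowing never changes the value: it either passes the value through or stops the program, which matches the "Return the value" step above.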
3.4 - Executing Statements
3.4.1 - Executing a StatementList
- Executing a StatementList is essentially just executing each statement one by one:
public void visitStatementListNode(StatementNode.ListNode node) {
beginExec("StatementList");
for (StatementNode statement : node.getStatements()) {
statement.accept(this);
}
endExec("StatementList");
}
3.4.2 - Executing an Assignment Node
public void visitAssignmentNode(StatementNode.AssignmentNode node) {
beginExec("Assignment");
/* Evaluate the code to be assigned */
Value value = node.getExp().evaluate(this);
/* Assign the value to the variables offset */
Value lValue = node.getVariable().evaluate(this);
assignValue(lValue, value);
endExec("Assignment");
}
- An assignment node has an expression (i.e. the value that we want to assign) and a left value (i.e. the variable to be assigned).
- To get the value of the expression, we evaluate it:
beginExec("Assignment");
Value value = node.getExp().evaluate(this);
- We then evaluate the “value” of the lValue (that is, its address):
Value lValue = node.getVariable().evaluate(this);
- We then use the assignValue function to assign the value:
private void assignValue(Value lValue, Value value) {
    /* Resolve the frame containing the variable node */
    Frame frame = currentFrame.lookupFrame(lValue.getAddressLevel());
    /* Assign the variables value to the offset in the frame */
    frame.assign(lValue.getAddressOffset(), value);
}
assignValue(lValue, value);
- We then use endExec to signify that we’re done and finish the method:
endExec("Assignment");
3.4.3 - Visiting a Read Node
- In visiting a read node, we:
- Read a value from standard input (reporting a runtime error if it’s not an integer)
- Assign the value to a variable.
public void visitReadNode(StatementNode.ReadNode node) {
beginExec("Read");
/* Read next int from standard input */
IntegerValue result = null;
try {
result = new IntegerValue(Integer.parseInt(in.readLine()));
} catch (Exception e) {
runtime("invalid value read - must be an integer",
node.getLocation(), currentFrame);
// Never reached
}
Value lValue = node.getLValue().evaluate(this);
assignValue(lValue, result); // Assign the result to the address of the left value.
endExec("Read");
}
3.4.4 - Visiting a Write Node
- In visiting a write node, we:
- Evaluate the expression to be written
- Write the value to the stream.
public void visitWriteNode(StatementNode.WriteNode node) {
beginExec("Write");
/* Evaluate the write expression */
int result = node.getExp().evaluate(this).getInteger();
/* Print the result to the outStream */
outStream.println(result);
endExec("Write");
}
3.4.5 - Visiting a Procedure Call
A procedure call is a little more complicated - we have to add another activation record to our stack for the procedure that we’re calling.
public void visitCallNode(StatementNode.CallNode node) {
beginExec("Call");
/* Descend into the frame of the procedure being called */
currentFrame = currentFrame.enterFrame(node.getEntry());
/* Resolve the code block to call and execute the block */
node.getEntry().getBlock().accept(this);
/* Return to the parent frame */
currentFrame = currentFrame.exitFrame();
endExec("Call");
}
- First, we enter a new frame for the procedure being called, using its SymEntry.ProcedureEntry.
- Note that node.getEntry() returns the symbol table procedure entry of the procedure being called.
- Once we have that entry, enterFrame gives us the frame for the new procedure call, and we set it as the top of the stack:
currentFrame = currentFrame.enterFrame(node.getEntry());
- We then execute the block:
- node.getEntry() returns the symbol table procedure entry
- node.getEntry().getBlock() gets the block of code to be executed
- node.getEntry().getBlock().accept(this) executes the block of code
node.getEntry().getBlock().accept(this);
- After we complete the execution of the code, we want to return to the previous stack frame (using the dynamic link in the stack frame):
currentFrame = currentFrame.exitFrame();
- Run endExec for debugging purposes:
endExec("Call");
3.4.6 - Executing an If Statement / If Node
- Obtain the condition and evaluate it
- If the condition is true, then execute the THEN component of the conditional
- The condition is true when evaluating it returns the value that represents boolean true (1)
- Else execute the ELSE component of the conditional
public void visitIfNode(StatementNode.IfNode node) {
beginExec("If");
ExpNode condition = node.getCondition();
if (condition.evaluate(this).getInteger() == Type.TRUE_VALUE) {
/* Execute then statement if condition evaluates to true */
node.getThenStmt().accept(this);
} else {
/* Execute else statement if condition evaluates to false */
node.getElseStmt().accept(this);
}
endExec("If");
}
3.4.7 - Executing a While Statement / While Node
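- We first obtain the condition
- We execute the body of the loop, node.getLoopStmt(), for as long as the condition evaluates to true; the condition is re-evaluated before each iteration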
public void visitWhileNode(StatementNode.WhileNode node) {
beginExec("While");
/* Execute loop statement while the condition is true */
ExpNode condition = node.getCondition();
while (condition.evaluate(this).getInteger() == Type.TRUE_VALUE) {
node.getLoopStmt().accept(this);
}
endExec("While");
}
4.0 - Rewrite Rules for Grammars
- We have a systematic process for writing a recursive-descent parser for an EBNF grammar
- However, not all EBNF grammars are suitable for Recursive Descent Parsing
- Sometimes we can re-write them into a form that is suitable
- Recursive Descent Parsing is a parsing technique that uses a single-symbol lookahead.
- Single-symbol lookahead must be used when we want to determine which of several alternatives is present.
- Single-symbol lookahead is also used when recognising optional constructs.
4.1 - Left Factoring Grammar Productions
Productions such as:
$\text{IfStmt} \rightarrow \text{if ( Cond ) Stmt} \mid \text{if ( Cond ) Stmt else Stmt}$
are not suitable for recursive descent parsing because the two alternatives share a common prefix, so a single symbol of lookahead cannot choose between them.
To get around this, one can re-write the production as follows, to get rid of the common prefix:
$\text{IfStmt} \rightarrow \text{if ( Cond ) Stmt ElsePart}$
$\text{ElsePart} \rightarrow \epsilon \mid \text{else Stmt}$
- Note that this grammar still has the dangling else problem - it is still ambiguous
This is equivalent to the following production in EBNF:
$\text{IfStmt} \rightarrow \text{if ( Cond ) Stmt [ else Stmt ]}$
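As a rough illustration of why the left-factored form suits single-symbol lookahead, here is a small recognizer sketch. It is not the tutorial parser: the class `IfParser`, the string tokens, and the simplifications that a condition is the single token `c` and a non-if statement is the single token `s` are all assumptions made for this example.

```java
import java.util.List;

/** Hypothetical recognizer for the left-factored IfStmt grammar over string tokens. */
class IfParser {
    private final List<String> tokens;
    private int pos = 0;

    IfParser(List<String> tokens) {
        this.tokens = tokens;
    }

    /** The single symbol of lookahead. */
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<eof>";
    }

    private void match(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException("expected " + expected + " but found " + peek());
        }
        pos++;
    }

    /** Stmt -> IfStmt | s   (simplified for the sketch) */
    void parseStmt() {
        if (peek().equals("if")) {
            parseIfStmt();
        } else {
            match("s");
        }
    }

    /** IfStmt -> if ( Cond ) Stmt ElsePart   (Cond simplified to the token c) */
    void parseIfStmt() {
        match("if");
        match("(");
        match("c");
        match(")");
        parseStmt();
        parseElsePart();
    }

    /** ElsePart -> epsilon | else Stmt : one token of lookahead decides. */
    void parseElsePart() {
        if (peek().equals("else")) {
            match("else");
            parseStmt();
        }
        // otherwise choose the empty alternative and consume nothing
    }

    static boolean accepts(List<String> tokens) {
        IfParser parser = new IfParser(tokens);
        try {
            parser.parseStmt();
            return parser.pos == tokens.size();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(List.of("if", "(", "c", ")", "s", "else", "s"))); // true
    }
}
```

Because `parseElsePart` takes an `else` whenever it sees one, a nested `if` claims the nearest `else`. That is the usual way a recursive descent parser resolves the dangling else, even though the grammar itself remains ambiguous.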
4.1.1 - Formal Definition of Left Factor Rewriting Rule
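To remove the left factor from:
$A \rightarrow \alpha\,\beta \mid \alpha\,\gamma$
one can rewrite the production using an additional new non-terminal $A'$ as follows:
$A \rightarrow \alpha\,A'$
$A' \rightarrow \beta \mid \gamma$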
4.2 - Removing Left Recursion from Grammars
A production of the following form is not suitable for recursive descent parsing, because the left recursion in the grammar leads to an infinite recursion in the parsing program:
$E \rightarrow E + T \mid T$
Let’s try to write the recursive descent parser for it:
void parseE() {
    if (tokens.isMatch(First(E + T))) {
        parseE();
        tokens.match(Tokens.PLUS);
        parseT();
    } else if (tokens.isMatch(First(T))) {
        parseT();
    } else {
        // Assuming that the alternatives are not nullable
        errors.error("Syntax error");
    }
}
- Suppose our current token is one that can start E, so we call parseE()
- Then, within the parseE() method, we call parseE() again without having consumed any input, so the current token is still one that can start E
- Therefore, we’re stuck in an infinite recursion.
Let’s try to re-write this production.
The production $E \rightarrow E + T \mid T$ matches sequences of the form: $T + T + \cdots + T$
- That is, a $T$ followed by zero or more occurrences of $+\,T$
With this knowledge, we can re-write our grammar as follows:
$E\rightarrow{\color{pink}T}\,E'$
$E'\rightarrow\epsilon\mid{\color{lightblue}+\,T}\,E'$
- Note that this rewrite has removed the direct left recursion
- This may work perfectly fine (depending on what $T$ is)
Note that this can be re-written in EBNF as $E \rightarrow T\ \{\, +\ T \,\}$
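The EBNF form maps directly onto a loop. The following sketch assumes, purely for illustration, that a $T$ is a single integer literal and that tokens are plain strings; the class `SumParser` and its methods are hypothetical and not part of the tutorial parser.

```java
import java.util.List;

/** Hypothetical parser for E -> T { + T }, where T is assumed to be an integer literal. */
class SumParser {
    private final List<String> tokens;
    private int pos = 0;

    SumParser(List<String> tokens) {
        this.tokens = tokens;
    }

    private boolean atPlus() {
        return pos < tokens.size() && tokens.get(pos).equals("+");
    }

    /** E -> T { + T } : parse one T, then loop while the lookahead is '+'. */
    int parseE() {
        int value = parseT();
        while (atPlus()) {
            pos++;              // consume '+'
            value += parseT();
        }
        return value;
    }

    /** T -> integer literal (an assumption made to keep the sketch short). */
    int parseT() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("unexpected end of input");
        }
        return Integer.parseInt(tokens.get(pos++));
    }

    static int evaluate(List<String> tokens) {
        SumParser parser = new SumParser(tokens);
        int value = parser.parseE();
        if (parser.pos != tokens.size()) {
            throw new IllegalStateException("unexpected trailing input");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(List.of("1", "+", "2", "+", "3"))); // 6
    }
}
```

Every call to `parseT` consumes a token before `parseE` loops again, so the parser always makes progress, unlike the left-recursive version.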
4.2.1 - Formal Definition of Immediate Left Recursion Rule (Simple Case)
To remove the left recursion from the production:
$A\rightarrow A\,{\color{pink}\alpha}\mid{\color{lightblue}\beta}$
which matches a $\beta$ followed by zero or more occurrences of $\alpha$, one can re-write the production as follows:
$A\rightarrow{\color{lightblue}\beta}\,A'$
$A'\rightarrow\epsilon\mid{\color{pink}\alpha}\,A'$
4.2.2 - Formal Definition of Immediate Left Recursion Rule (General Case)
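The general case for direct left recursion has the form:
$A\rightarrow A\,{\color{pink}\alpha_1}\mid A\,{\color{pink}\alpha_2}\mid\cdots\mid A\,{\color{pink}\alpha_n}\mid{\color{lightblue}\beta_1}\mid{\color{lightblue}\beta_2}\mid\cdots\mid{\color{lightblue}\beta_m}$
We use grouping to see the structure in the same form as the simple case:
$A\rightarrow A\,({\color{pink}\alpha_1}\mid{\color{pink}\alpha_2}\mid\cdots\mid{\color{pink}\alpha_n})\mid({\color{lightblue}\beta_1}\mid{\color{lightblue}\beta_2}\mid\cdots\mid{\color{lightblue}\beta_m})$
And then re-write the production as follows, using the previous rule:
$A\rightarrow ({\color{lightblue}\beta_1}\mid{\color{lightblue}\beta_2}\mid\cdots\mid{\color{lightblue}\beta_m})\,A'$
$A'\rightarrow\epsilon\mid({\color{pink}\alpha_1}\mid{\color{pink}\alpha_2}\mid\cdots\mid{\color{pink}\alpha_n})\,A'$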
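As a small made-up instance of the general rule (the terminals $a$, $b$, $c$, $d$ are chosen only for illustration and do not come from the lecture), take $n = 2$ and $m = 2$:

$A\rightarrow A\,a\mid A\,b\mid c\mid d$

Grouping gives $A\rightarrow A\,(a\mid b)\mid(c\mid d)$, and applying the rule yields:

$A\rightarrow (c\mid d)\,A'$
$A'\rightarrow\epsilon\mid(a\mid b)\,A'$

Both grammars describe a $c$ or $d$ followed by any number of $a$s and $b$s, for example $c\,a\,b\,a$.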
4.2.3 - Immediate Left Recursion Example
This is of the same form as
Using the Immediate Left Recursion rule, we have:
Which instantiated for the example gives:
Note: In EBNF, this is equivalent to
4.2.4 - Indirect Left-Recursion Rewriting Rule
-
The production for A has both a direct left-recursion (first alternative) and an indirect left recursion through B and C
- We have direct left-recursion in both the first production and the second production in that
and - We also have indirect left-recursion between the second and third production in that
-
The rewriting for indirect left recursion takes multiple steps
-
Firstly, we remove the direct left recursion from
-
We then remove the direct left recursion from B
-
We re-combine our grammar and notice that we still have some indirect left-recursion
- We notice that in the first production
, and that we have a production . - We observe that we can eliminate the indirect left recursion through
by first replacing the occurrence of in the first production by its definition and remove the no longer used production for
-
However, notice that we still have some indirect left-recursion in that
and -
To eliminate the indirect recursion through C, we replace the occurrence of C by its definition and remove C
-
Removing the grouping and distributing the symbols outside of the parentheses into the alternative operator
exposes a direct left recursion: -
This direct left recursion is removed using the direct left recursion rewriting rule (which introduces a new non-terminal,
) -
Now, we’ve removed all the direct and indirect left recursion, and this grammar is now suitable to be parsed by recursive descent parsing.