Lecture 18

  1. Regular Expressions
  2. Finite Automata (State Machines)

1.0 - Regular Expressions

1.1 - Defining Lexical Tokens using Regular Expressions

1.2 - Regular Expressions Syntax

1.3 - Language of Regular Expressions

1.3.1 - Concatenation of Languages

1.3.2 - Iteration of Languages

2.0 - Finite Automata (State Machines)

2.1 - Deterministic Finite Automata (Formal)


The language of a DFA is the set of strings that it can recognise.

2.1.1 - DFA Example


What language (set of strings) does this DFA match?

$\mathcal{L}(DFA) = \{\text{'ab'}, \text{'c'}\}$
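
As a rough illustration of what "recognising" a string means, here is a minimal Python sketch of a DFA whose language is {'ab', 'c'}. The state numbers, the `DELTA` table and the function name `dfa_accepts` are assumptions made for this sketch; they are not taken from the diagram in the lecture.

```python
# Hypothetical DFA accepting exactly {"ab", "c"}; state numbers are illustrative.
START = 0
ACCEPTING = {2, 3}

# DELTA maps (state, symbol) -> next state.
# A missing entry means there is no transition, so the string is rejected.
DELTA = {
    (0, "a"): 1,
    (1, "b"): 2,
    (0, "c"): 3,
}


def dfa_accepts(text: str) -> bool:
    """Return True if the DFA finishes in an accepting state after reading text."""
    state = START
    for symbol in text:
        key = (state, symbol)
        if key not in DELTA:
            return False  # stuck: no transition on this symbol
        state = DELTA[key]
    return state in ACCEPTING


if __name__ == "__main__":
    for s in ["ab", "c", "a", "abc", ""]:
        print(repr(s), dfa_accepts(s))
```

Because each (state, symbol) pair has at most one successor, the simulation only ever tracks a single current state.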

2.2 - Nondeterministic Finite Automata (NFA)

The formal definitions of a DFA and an NFA are very similar - the differences are highlighted here in green.


The language of an NFA is the set of strings that it can recognise. This is similar to the definition of the language of a DFA. The differences are highlighted here in green.

2.2.1 - NFA Example

2.3 - Converting a Regular Expression to NFA

  1. Single Symbol / Empty Transition
  2. Concatenation
  3. Alternatives
  4. Repetition

2.3.1 - NFA for Single Symbol or Empty Transition

This is the NFA for matching a single symbol $a$ or an empty transition $\varepsilon$

2.3.2 - NFA for Concatenation

This is the NFA template for matching the concatenation of the expressions $e\ f$.

  1. Firstly, we transform the expression $e$ into its NFA

  2. We next obtain the NFA for the expression $f$

  3. Joining the final state of expression $e$’s NFA to the start state of the NFA for expression $f$ gives the NFA for the concatenation of the expressions $e\ f$.

2.3.3 - NFA for Alternatives

This is the NFA template for matching the alternative expressions $e\ |\ f$

  1. Firstly, we obtain the NFA for the expression $e$

  2. We similarly obtain the NFA for the expression $f$

  3. Joining the two NFAs in the following configuration gives the larger NFA for alternation

2.3.4 - NFA for Repetition

This is the NFA template for matching the repeated expression $e^*$

  1. Firstly, we obtain the NFA for the expression $e$

  2. We then introduce a new start and end state:

    • The new start state has the option to move to the old start state (to match the expression $e$), or jump to the new end state (to match zero repetitions)
    • The old end state may loop back to the old start state (to match the expression $e$ again), or jump to the new end state:
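
The four templates above can be sketched in Python as functions that build larger NFAs from smaller ones. This is only a sketch under stated assumptions: the `NFA` class, the helper names (`symbol`, `empty`, `concat`, `alt`, `star`, `accepts`) and the choice to join sub-NFAs with empty transitions (rather than merging states) are ours, not the lecture's.

```python
from itertools import count

_ids = count(1)   # source of fresh state numbers (assumed numbering scheme)
EPS = None        # label used for empty transitions


class NFA:
    """An NFA fragment with one start state, one end state and a list of edges."""

    def __init__(self, start, end, edges):
        self.start, self.end, self.edges = start, end, edges  # edges: (src, label, dst)


def symbol(a):
    """NFA matching the single symbol a."""
    s, e = next(_ids), next(_ids)
    return NFA(s, e, [(s, a, e)])


def empty():
    """NFA matching the empty string via one empty transition."""
    s, e = next(_ids), next(_ids)
    return NFA(s, e, [(s, EPS, e)])


def concat(e, f):
    """Join e's end state to f's start state (here with an empty transition)."""
    return NFA(e.start, f.end, e.edges + f.edges + [(e.end, EPS, f.start)])


def alt(e, f):
    """New start state branches to e or f; both end states lead to a new end state."""
    s, t = next(_ids), next(_ids)
    extra = [(s, EPS, e.start), (s, EPS, f.start), (e.end, EPS, t), (f.end, EPS, t)]
    return NFA(s, t, e.edges + f.edges + extra)


def star(e):
    """New start may enter e or skip it; old end may loop back or leave."""
    s, t = next(_ids), next(_ids)
    extra = [(s, EPS, e.start), (s, EPS, t), (e.end, EPS, e.start), (e.end, EPS, t)]
    return NFA(s, t, e.edges + extra)


def closure(nfa, states):
    """All states reachable from `states` using only empty transitions."""
    todo, seen = list(states), set(states)
    while todo:
        x = todo.pop()
        for src, label, dst in nfa.edges:
            if src == x and label is EPS and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


def accepts(nfa, text):
    """Simulate the NFA by tracking the set of states it could be in."""
    current = closure(nfa, {nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges if src in current and label == ch}
        current = closure(nfa, moved)
    return nfa.end in current


if __name__ == "__main__":
    ab_or_c = alt(concat(symbol("a"), symbol("b")), symbol("c"))
    print(accepts(ab_or_c, "ab"), accepts(ab_or_c, "c"), accepts(ab_or_c, "a"))
```

Composing the functions mirrors the structure of the regular expression: the outermost construct is the last function applied, and each sub-expression is built before the template that contains it.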

2.3.5 - NFA Example - Alternatives and Concatenation

Construct the NFA for the Regular Expression $a\ b\ |\ c$

We want to construct the NFA for the regular expression $a\ b\ |\ c$. We recognise that the outermost construct is the alternative construct. To construct this NFA, we first need to create the NFAs for the sub-expressions $a\ b$ and $c$

  1. We first construct the NFA for the expression $a\ b$ - we recognise that this is a concatenation, so we first need to construct the NFAs for each of the sub-expressions

    1. We first construct the NFA for the expression $a$
    2. We then construct the NFA for the expression $b$
    3. Using the NFA concatenation rules, we join the two sub-NFAs into the NFA for matching the concatenation of expressions $a\ b$
  2. We then construct the NFA for the expression $c$ - this is straightforward as we are only creating an NFA that matches a single character.

Joining these sub-expression NFAs using the alternation rule, we get:

Finally, numbering the states:

2.3.6 - Example for Repetition of Alternatives

Construct the NFA for the Regular Expression $(a\ |\ b)^*$

  1. We first construct the NFA for the expression $a$

  2. We then construct the NFA for the expression $b$

  3. We then place the NFAs for expressions $a$ and $b$ into the NFA for alternatives

  4. We then place the NFA for the expression $a\ |\ b$ into the NFA for repetition

2.4 - From Regular Expression to NFA

In the translation from a regular expression $r$ to an NFA, the generated NFA has a few properties that do not necessarily hold for an arbitrary NFA (i.e. one not generated from a regular expression)

The translation rules preserve these properties.

2.5 - From NFA to DFA

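A DFA cannot have empty ($\varepsilon$) transitions, or more than one transition out of a state on the same symbol.

An NFA $N$ can be translated to an equivalent DFA $D$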

How is this done?

2.5.1 - Empty Closure

Empty Closure of a State

Empty Closure of a Set of States


Empty Closure Example

What is the empty closure of each state in the NFA shown?

| State $x$ | $\varepsilon\text{-closure}(x, NFA)$ |
|---|---|
| 1 | $\{1, 2, 6\}$ |
| 2 | $\{2\}$ |
| 3 | $\{3\}$ |
| 4 | $\{4, 5\}$ |
| 5 | $\{5\}$ |
| 6 | $\{6\}$ |
| 7 | $\{7, 5\}$ |

  1. From state 1, we can reach states 2 and 6 using only $\varepsilon$ transitions. Therefore, $\varepsilon\text{-closure}(1, NFA) = \{1, 2, 6\}$
  2. From state 2, we cannot reach any other state using only $\varepsilon$ transitions. Therefore, $\varepsilon\text{-closure}(2, NFA) = \{2\}$
  3. From state 3, we cannot reach any other state using only $\varepsilon$ transitions. Therefore, $\varepsilon\text{-closure}(3, NFA) = \{3\}$
  4. From state 4, we can reach state 5 using only $\varepsilon$ transitions. Therefore, $\varepsilon\text{-closure}(4, NFA) = \{4, 5\}$
  5. From state 5, we cannot reach any other state using only $\varepsilon$ transitions. Therefore, $\varepsilon\text{-closure}(5, NFA) = \{5\}$
  6. From state 6, we cannot reach any other state using only $\varepsilon$ transitions. Therefore, $\varepsilon\text{-closure}(6, NFA) = \{6\}$
  7. From state 7, we can reach state 5 using only $\varepsilon$ transitions. Therefore, $\varepsilon\text{-closure}(7, NFA) = \{7, 5\}$
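
The closures listed above can be reproduced with a short Python sketch. The empty transitions in `EPSILON_EDGES` (1→2, 1→6, 4→5, 7→5) are our reading of the example, since the NFA diagram itself is not reproduced here; the function names are likewise assumptions.

```python
# Empty transitions assumed from the worked example: 1->2, 1->6, 4->5, 7->5.
EPSILON_EDGES = {1: {2, 6}, 4: {5}, 7: {5}}


def epsilon_closure(state, eps_edges):
    """States reachable from `state` using zero or more empty transitions."""
    closure = {state}          # a state is always in its own closure
    stack = [state]
    while stack:
        current = stack.pop()
        for nxt in eps_edges.get(current, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def epsilon_closure_of_set(states, eps_edges):
    """Empty closure of a set of states: the union of the individual closures."""
    result = set()
    for s in states:
        result |= epsilon_closure(s, eps_edges)
    return result


if __name__ == "__main__":
    for x in range(1, 8):
        print(x, sorted(epsilon_closure(x, EPSILON_EDGES)))
```

The search follows chains of empty transitions, so a state reachable only through two or more consecutive empty transitions would also be included, even though this example has no such chain.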

2.6 - Constructing the DFA from the NFA

The first step in constructing the DFA from an NFA is determining the label of the start state.

The following process is repeated until there are no unmarked DFA states left.

2.6.1 - NFA to DFA Process - Simple Example

Consider the following NFA and follow the (above) process to turn it into a DFA

  1. The start state of the DFA is going to consist of the initial start state, and all of the states that can be reached from state 1 by empty transitions. That is, our new initial state merges the old states $\{1, 2, 6\}$
  2. We now need to determine what states can be reached from this new state:
    • We can go from 2→3 on “a”
    • We can go from 6→7 on “c”
  3. On that transition to state 3, we transition to a new state that is state 3 plus any states that we can reach from state 3 via empty transitions.
    • Since there are no such states, we just leave it as is
    • We can transition out of this state on “b”
  4. On that transition to state 7, we transition to a new state that is state 7 plus any states that we can reach from state 7 via empty transitions.
    • We can transition to state 5, therefore, the new state is $\{7, 5\}$
  5. Revisiting the transition out of state 3 on “b”, we can get to state 4
    • State 4 also has an empty transition to state 5, so the new state is $\{4, 5\}$
  6. The states that include 5 are final or accepting states as 5 was the original accepting state.
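
The steps above can be sketched as a small Python program that repeats the process until no unmarked DFA states remain. The edge tables encode our reading of the example NFA (the diagram is not reproduced here), and all names such as `subset_construction` are assumptions made for this sketch.

```python
from collections import deque

# Assumed encoding of the example NFA.
EPS_EDGES = {1: {2, 6}, 4: {5}, 7: {5}}                      # empty transitions
SYMBOL_EDGES = {(2, "a"): {3}, (3, "b"): {4}, (6, "c"): {7}}  # labelled transitions
NFA_START, NFA_ACCEPTING = 1, {5}


def closure(states):
    """Empty closure of a set of NFA states, as a frozenset (usable as a DFA label)."""
    result, stack = set(states), list(states)
    while stack:
        for nxt in EPS_EDGES.get(stack.pop(), ()):
            if nxt not in result:
                result.add(nxt)
                stack.append(nxt)
    return frozenset(result)


def subset_construction():
    """Build the DFA: returns (start state, transition table, accepting states)."""
    start = closure({NFA_START})
    transitions = {}
    unmarked = deque([start])   # DFA states whose outgoing transitions are unknown
    seen = {start}
    while unmarked:
        current = unmarked.popleft()                 # mark this DFA state
        symbols = {sym for (src, sym) in SYMBOL_EDGES if src in current}
        for sym in sorted(symbols):
            moved = set()
            for src in current:
                moved |= SYMBOL_EDGES.get((src, sym), set())
            target = closure(moved)                  # label of the destination
            transitions[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                unmarked.append(target)
    # A DFA state is accepting if it contains an accepting NFA state.
    accepting = {s for s in seen if s & NFA_ACCEPTING}
    return start, transitions, accepting


if __name__ == "__main__":
    start, transitions, accepting = subset_construction()
    print("start:", sorted(start))
    for (src, sym), dst in transitions.items():
        print(sorted(src), sym, "->", sorted(dst))
    print("accepting:", [sorted(s) for s in accepting])
```

Using `frozenset` for DFA states makes each merged set of NFA states hashable, so it can serve directly as the label of the DFA state.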

2.6.2 - NFA to DFA Process - Example 2

Consider the following NFA for the regular expression $(b\ |\ c)^*\ a^*$, and follow the above process to turn it into a DFA

  1. We start from the initial state of the DFA.
    • This state contains the NFA’s start state $\{1\}$, and any state that we can get to via empty transitions $\{2, 3, 5, 8, 9, 11\}$
    • Therefore, the empty closure of the start state is $\{1, 2, 3, 5, 8, 9, 11\}$
      • This means that the start state of the DFA merges these states from the NFA
    • We now determine the transitions out of this state
      • We have a transition out on $a$ as $9\overset{a}{\rightarrow}10$
      • We have a transition out on $b$ as $3\overset{b}{\rightarrow}4$
      • We have a transition out on $c$ as $5\overset{c}{\rightarrow}6$
    • We now mark the initial state as seen or visited, and move to other states.
  2. We now consider the transition out of the initial state on $a$:
    • From NFA state 9, we can transition to state 10 on $a$
    • From state 10, we can get to state 11 and back to state 9 on empty transitions
    • Therefore, the empty closure of this state is $\{10, 9, 11\}$
  3. We now consider the transition out of the initial state on $b$:
    • From NFA state 3, we can transition to state 4 on $b$
    • From state 4, we can get to the following states on empty transitions: (this is state 4’s empty closure) $\{4, 7, 2, 3, 5, 8, 9, 11\}$
    • From this state, we can transition out on $a$, via $9\overset{a}{\rightarrow}10$, which takes us to the state containing the empty closure of 10 (which is the state above)
    • From this state, we can transition out on $b$, via $3\overset{b}{\rightarrow}4$, which takes us to the state containing the empty closure of 4 (which is this state)
    • From this state, we can transition out on $c$, via $5\overset{c}{\rightarrow}6$, which takes us to the state containing the empty closure of 6 (which is the state below)
  4. We now consider the transition out of the initial state on $c$:
    • From NFA state 5, we can transition to state 6 on $c$
    • From state 6, we can get to the following states on empty transitions: (this is state 6’s empty closure) $\{6, 7, 2, 3, 5, 8, 9, 11\}$
    • From this state, we can transition out on $a$, via $9\overset{a}{\rightarrow}10$, which takes us to the state containing the empty closure of 10
    • From this state, we can transition out on $b$, via $3\overset{b}{\rightarrow}4$, which takes us to the state containing the empty closure of 4
    • From this state, we can transition out on $c$, via $5\overset{c}{\rightarrow}6$, which takes us to the state containing the empty closure of 6 (which is this state)
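
One way to sanity-check the resulting DFA is to compare it against Python's `re` module on every short string over {a, b, c}. The sketch below makes several assumptions: the names `S0`, `A`, `B`, `C` are ours (start state, and the closures of 10, 4 and 6 respectively); NFA state 11 is taken to be the accepting state; the self-loop on `A` follows from $9\overset{a}{\rightarrow}10$ with state 9 inside that closure; and the pattern `(b|c)*a*` is the expression implied by the transitions on b, c and a in the steps above.

```python
import re
from itertools import product

# Hand-encoded DFA from the walk-through (state names are hypothetical).
DFA = {
    ("S0", "a"): "A", ("S0", "b"): "B", ("S0", "c"): "C",
    ("A", "a"): "A",
    ("B", "a"): "A", ("B", "b"): "B", ("B", "c"): "C",
    ("C", "a"): "A", ("C", "b"): "B", ("C", "c"): "C",
}
# Every merged state contains NFA state 11, assumed here to be accepting.
ACCEPTING = {"S0", "A", "B", "C"}


def dfa_accepts(text):
    """Run the DFA; a missing transition means the string is rejected."""
    state = "S0"
    for ch in text:
        state = DFA.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING


def cross_check(max_len=5):
    """Return the first string on which the DFA and the regex disagree, else None."""
    pattern = re.compile(r"(b|c)*a*")
    for n in range(max_len + 1):
        for chars in product("abc", repeat=n):
            s = "".join(chars)
            if dfa_accepts(s) != bool(pattern.fullmatch(s)):
                return s
    return None


if __name__ == "__main__":
    print("first disagreement:", cross_check())
```

An exhaustive check over short strings is not a proof of equivalence, but it catches most mistakes made when merging states by hand.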

2.7 - Minimising a DFA

2.7.1 - Process for Minimising a DFA

3.0 - Regular Expressions for Lexical Analyser