Formal Languages, Automata and Compilers
Lecure 4
2020-21
Curs 4
1 Grammars of type 3 and finite automata
2 Closure properties for type 3 languages
3 Regular Expressions
4 The automaton equivalent to a regular expression Algorithm
Grammars of type 3 and finite automata
Lecture 4
1 Grammars of type 3 and finite automata
2 Closure properties for type 3 languages
3 Regular Expressions
4 The automaton equivalent to a regular expression Algorithm
Grammars of type 3 and finite automata
From regular grammars to finite automata
For any regular grammar G (in normal form) there exists a nondeterministic finite automaton A such that L(A) =L(G):
In grammar G In automaton A
T Σ =T
N Q=N∪ {f},F ={f}
S q0=S
q →ap p∈δ(q,a) q→a f ∈δ(q,a) if S→ǫ add S to F
Grammars of type 3 and finite automata
From finite automata to regular grammars
For any deterministic finite automaton there exists a regular grammar G such that L(A) =L(G):
In automaton A In grammar G
Σ T = Σ
Q N =Q
q0 S=q0
δ(q,a) =p q→ap
δ(q,a)∈F q →a
if q0∈F add rule q0→ǫ
Closure properties for type 3 languages
Lecture 4
1 Grammars of type 3 and finite automata
2 Closure properties for type 3 languages
3 Regular Expressions
4 The automaton equivalent to a regular expression Algorithm
Closure properties for type 3 languages
Closure under intersection
If L1,L2∈ L3, then L1∩L2∈ L3
Let A1= (Q1,Σ1, δ1,q01,F1)and A2= (Q2,Σ2, δ2,q02,F2)be deterministic automata such that L1=L(A1)and L2=L(A2).
The deterministic automaton A that recognizes L1∩L2: A= (Q1×Q2,Σ1∩Σ2, δ,(q01,q02),F1×F2) δ((q1,q2),a)) = (q′1,q2′)iff:
δ1(q1,a) =q1′ δ2(q2,a) =q2′
Closure properties for type 3 languages
Closure under complement and set difference operations
If L∈ L3then L= (Σ∗\L)∈ L3
Let A= (Q,Σ, δ,q0,F)be a finite automaton with L(A) =L. The automaton A′ which accepts L=L(A):
A′ = (Q,Σ, δ,q0,Q\F)
Closure properties for type 3 languages
Closure under complement and set difference operations
If L∈ L3then L= (Σ∗\L)∈ L3
Let A= (Q,Σ, δ,q0,F)be a finite automaton with L(A) =L. The automaton A′ which accepts L=L(A):
A′ = (Q,Σ, δ,q0,Q\F)
If L1,L2∈ L3then L1\L2∈ L3:L1\L2=L1∩L2
Closure properties for type 3 languages
Closure under product
Let A1= (Q1,Σ, δ1,q01,{f1})and A2= (Q2,Σ, δ2,q02,{f2}) automata with a single final state such that L1=L(A1)and L2=L(A2).
Automaton A (withǫ-transitions) which accepts L1·L2: A= (Q1∪Q2,Σ, δ,q01,{f2})
Closure properties for type 3 languages
Closure under union
Let A1= (Q1,Σ1, δ1,q01,{f1})and A2= (Q2,Σ2, δ2,q02,{f2}) automata with a single final state such that L1=L(A1)and L2=L(A2).
Automaton A (withǫ-transitions) which accepts L1∪L2: A= (Q1∪Q2∪ {q0,f},Σ1∪Σ2, δ,q0,{f})
Closure properties for type 3 languages
Closure under iteration
Let A= (Q,Σ, δ,q01,{qf})automaton with a single final state such that L(A) =L.
Automaton A (withǫ-transitions) which accepts L∗(=L(A)∗):
A= (Q∪ {q0,f},Σ, δ′,q0,{f})
Regular Expressions
Lecture 4
1 Grammars of type 3 and finite automata
2 Closure properties for type 3 languages
3 Regular Expressions
4 The automaton equivalent to a regular expression Algorithm
Regular Expressions
Regular expressions - definition
Definition 1
IfΣis an alphabet, then a regular expression overΣcan be defined by induction:
∅,ǫ, a (a∈Σ) are regular expressions which describe the languages: ∅,{ǫ},{a}.
If E , E1, E2are regular expressions, then:
(E1|E2)is a regular expression which describes the language L(E1)∪L(E 2)
(E1·E2)is a regular expression which describes the language L(E1)L(E2)
(E∗)is a regular expression which describes the language L(E)∗ The priority order for the operators is: ∗,·,|
Regular Expressions
Examples
(a|b)|(c|d)−→ {a,b,c,d}
(0|1)·(0|1)−→ {00,01,10,11}
a∗b∗ −→ {anbk|n,k ≥0} (a|b)∗ −→ {a,b}∗
(0|1|2|...|9)(0|1|2...|9)∗ describes the set of positive integers (a|b|c|...|z)(a|b|c|...|z|0|1|2...|9)∗describes the set of identifiers Two regular expressions E1,E2are equivalent, written E1=E2, if L(E1) =L(E2)
Regular Expressions
Properties
(p|q)|r =p|(q|r) (pq)r =p(qr) p|q =q|p p·ǫ=ǫ·p=p p|∅=p|p=p
∅ ·p=p· ∅=∅ p(q|r) =pq|pr (p|q)r =pr|qr ǫ|pp∗ =p∗ ǫ|p∗p=p∗
Regular Expressions
From a regular expression to a finite automaton
Theorem 1
For any regular expression E over the alphabetΣ, there existis a finite automaton (withǫ- transitions) A, such that L(A) =L(E).
Proof : structural induction
if E ∈ {∅, ǫ,a}(a∈Σ)then the corresponding automata:
Regular Expressions
Proof
E =E1|E2
E =E1E2
E =E1∗
Regular Expressions
Representing regular expressions as trees
Input: the regular expression E =e0e1. . .en−1 Precedence of operators:
prec(|) = 1, prec(·) = 2, prec(∗) = 3 (prec(()= 0).
Output: The corresponding tree: t.
Method: Two stacks:
STACK1 operators stack
STACK2 trees stack (contains trees for sub-expressions of the initial regular expression)
Method tree(r,tL,tR), whre r is the root node, tL the left sub-tree tR
Regular Expressions
Algorithm
i = 0;
while(i < n){ c=ei; switch(c) {
case′(′: {STACK1.push(c); break;}
casesymbol (from alphabet): { STACK2.push(tree(c,NULL,NULL)); break;} caseoperator: {
while (prec(STACK1.top())>=prec(c)) build tree();
STACK1.push(c); break;
}
case′)′: {
do{build tree();}while(STACK1.top()!=′(′);
STACK1.pop(); break;
} } i++;
}
while(STACK1.not empty()) build tree();
t = STACK2.pop();
Regular Expressions
Algorithm
build tree()
op = STACK1.pop();
t1 = STACK2.pop();
switch (op) { case′∗′: {
new tree = tree(op, t1, NULL);
STACK2.push(new tree); break;
}
case′|′, ′·′: { t2 = STACK2.pop();
new tree = tree(op, t2, t1);
STACK2.push(new tree); break;
} }
Regular Expressions
Example
a∗·b|a·(b|c)∗
The automaton equivalent to a regular expression
Lecture 4
1 Grammars of type 3 and finite automata
2 Closure properties for type 3 languages
3 Regular Expressions
4 The automaton equivalent to a regular expression Algorithm
The automaton equivalent to a regular expression Algorithm
The automaton equivalent to a regular expression
E =E1|E2
E =E1E2
E =E1∗
The automaton equivalent to a regular expression Algorithm
Remarks
for any occurrence of a symbol fromΣorǫin E , 2 states must be added in the equivalent automaton.
for every occurrence of|and∗in a regular expression E , two more states must be added
for operator·no additional states must be added
if n is the number of symbols from E and m is the number of brackets and operators·, then the number of states of the automaton equivalent to E is p=2(n−m).
The automaton equivalent to a regular expression Algorithm
Algorithm
Input: The regular expression E with n symbols, from which m are brackets and product operators;
Output: The automaton withǫ- transitions equivalent to E ;
Let generateState()be a method that generates a new state at every call (states: positive numbers greater or equal to 1)
1. Build the tree corresponding to expression E ; 2. Traverse the tree in postorder;
For every node N, two states are generated (if necessary) and assigned to N.i, N.f ; N.i, N.f , represent the initial and final states of automaton ANequivalent to expression ENwhich corresponds to the sub-tree with root N
ANwill be build from previously build automata, using Theorem 1
3. The initial state of the automaton is N.i, the final state is N.f , where N is the root node of the tree
The automaton equivalent to a regular expression Algorithm
Example
E =a|b∗·c
The automaton equivalent to a regular expression Algorithm
2. Traverse the tree in post-order.
If N is the current node and L and R its children, then:
The automaton equivalent to a regular expression Algorithm
2. Traverse the tree in post-order.
If N is the current node and L and R its children, then:
If N is labelled with a (is a leaf): N.i=generateState(), N.f=generateState(), δ(N.i,a) =N.f
The automaton equivalent to a regular expression Algorithm
2. Traverse the tree in post-order.
If N is the current node and L and R its children, then:
If N is labelled with a (is a leaf): N.i=generateState(), N.f=generateState(), δ(N.i,a) =N.f
If N is labelled with|:
N.i=generateState(), N.f =generateState(), δ(N.i, ǫ) ={L.i,R.i}, δ(L.f, ǫ) =N.f, δ(R.f, ǫ) =N.f
The automaton equivalent to a regular expression Algorithm
2. Traverse the tree in post-order.
If N is the current node and L and R its children, then:
If N is labelled with a (is a leaf): N.i=generateState(), N.f=generateState(), δ(N.i,a) =N.f
If N is labelled with|:
N.i=generateState(), N.f =generateState(),
δ(N.i, ǫ) ={L.i,R.i}, δ(L.f, ǫ) =N.f, δ(R.f, ǫ) =N.f If N is labelled with·:
N.i=L.i, N.f =R.f ,
δ(L.f, ǫ) =R.i
The automaton equivalent to a regular expression Algorithm
2. Traverse the tree in post-order.
If N is the current node and L and R its children, then:
If N is labelled with a (is a leaf): N.i=generateState(), N.f=generateState(), δ(N.i,a) =N.f
If N is labelled with|:
N.i=generateState(), N.f =generateState(),
δ(N.i, ǫ) ={L.i,R.i}, δ(L.f, ǫ) =N.f, δ(R.f, ǫ) =N.f If N is labelled with·:
N.i=L.i, N.f =R.f ,
δ(L.f, ǫ) =R.i If N is labelled with∗(R does not exist in this case):
N.i=generateState(), N.f =generateState(), δ(N.i, ǫ) ={L.i,N.f},
δ(L.f, ǫ) ={L.i,N.f}
The automaton equivalent to a regular expression Algorithm
2. Traverse the tree in post-order.
If N is the current node and L and R its children, then:
If N is labelled with a (is a leaf): N.i=generateState(), N.f=generateState(), δ(N.i,a) =N.f
If N is labelled with|:
N.i=generateState(), N.f =generateState(),
δ(N.i, ǫ) ={L.i,R.i}, δ(L.f, ǫ) =N.f, δ(R.f, ǫ) =N.f If N is labelled with·:
N.i=L.i, N.f =R.f ,
δ(L.f, ǫ) =R.i If N is labelled with∗(R does not exist in this case):
N.i=generateState(), N.f =generateState(), δ(N.i, ǫ) ={L.i,N.f},
δ(L.f, ǫ) ={L.i,N.f}
The automaton equivalent to a regular expression Algorithm
Example
E =a|b∗·c
The automaton equivalent to a regular expression Algorithm
Example
δ a b c ǫ
1 {2} ∅ ∅ ∅
2 ∅ ∅ ∅ {10}
3 ∅ {4} ∅ ∅
4 ∅ ∅ ∅ {3,6}
5 ∅ ∅ ∅ {3,6}
6 ∅ ∅ ∅ {7}
7 ∅ ∅ {8} ∅
8 ∅ ∅ ∅ {10}
9 ∅ ∅ ∅ {1,5}
10 ∅ ∅ ∅ ∅
The automaton equivalent to a regular expression Algorithm
The correctness of the algorithm
Theorem 2
The automaton withǫ- transitions built by the algorithm is equivalent to the expression E .
Proof:
The transitions defined in the algorithm correspond to the constructions in Theorem 1.