Skip to main content

Context-Free Grammars and Push-Down Automata

Amu_Ke_Fundye


Context-Free Grammars and Push-Down Automata


Context Free Language

  • The languages which are generated by context-free grammars are called Context-Free Languages (CFLs). 
  • CFLs are accepted by Push down Automata.
  • CFLs are also called as non-deterministic CFL.
Definition: If v is one-step derivable from u, written u  v. If v is derivable from u, written u  if there is a chain of one derivations of the form u  u1  u2  …  v
Example: Consider the context free grammar G = ({s}, {0, 1}, P, S) where Productions are:
(i) S  0S1
(ii) S Îµ
Derivations are:
image003
Derivations:
A derivation of a string is a sequence of rule applications. The language defined by a context-free grammar is the set of strings derivable from the start symbol S (for sentence).
A string can be derived with any of the following derivations.
Left Most Derivation:
  • In each sentential form, left most non-terminal substituted first to derive a string from the starting symbol.
  • A derivation is left most, if at each step in the derivation a production is applied to the left most non-terminal in the sentential form.
Right Most Derivation: 
  • In each sentential form, right most non-terminal substituted first to derive a string from the starting symbol.
  • A derivation is left most, if at each step in the derivation a production is applied to the left most non-terminal in the sentential form.
Example:
  • Every derivation corresponds to one derivation tree.
 AB
 aAAB
 aaAB
 aaaB
 aaab
  • Every derivation tree corresponds to one or more derivations.
 AB S  AB S  AB
 aAAB  Ab  AB
 aaAB  aAAb  aAAb
 aaaB  aAab  aaAb
 aaab  aaab  aaab

Derivation Tree (Parse Tree)

  • A derivation tree (or parse tree) can be defined with any non-terminal as the root, internal nodes are non-terminals and leaf nodes are terminals.
  • Every derivation corresponds to one derivation tree.
  • If a vertex A has k children with labels A1, A2, A3,…Ak, then A → A1 A2 A3…Ak will be a production in context-free grammar G.
Example:  
S → AB, A → aAA, A → aA, B → bB, B → b
image004

Ambiguous Grammar

A context-free grammar G is ambiguous if there is atleast one string in L(G) having two or more distinct derivation trees (or equivalently, two or more distinct left most derivations or two or more distinct right most derivations).
e.g., consider the context-free grammar G having productions E  E + E/a. The string a + a + a has two left most derivations.
Let’s see the derivations
 E + E  E + E + E  a + E + E  a + a + E  a + a + a
 E + E  a + E  a  E + E  a + a + E  a + a + a
and the derivation trees are
image006

CFG Simplification

The four main steps will be followed in CFG simplification
  • Eliminate ambiguity.
  • Eliminate useless symbols productions.
  • Eliminate ∧ productions: A  ∧
  • Eliminate unit productions: A  B

Eliminate the Ambiguity

We can remove the ambiguity by removing the left recursing and left factoring.

Left Recursion

A production of the context free grammar G = (VN, E, P, S) is said to be left recursive if it is of the form
 Aα
Where, A is a non-terminal and Î±(VN  E)*

Removal of Left Recursion

Let the variable A has left recursive productions as follows
(i) A  Aα1 |Aα2|Aα3|…|Aαn|β1|β2|β3|…|βn
Where β1, β2 ….. βn do not begin with A. Then we replace A production in the form of
(ii) A  Î²1A1 |β2A1|…|βmA1 where
A1  Î±1A1|α2A1|α3A1|…,|αnA1|∧

Left Factoring

Two or more productions of a variable A of the grammar G = (VN, E, S, P) are said to have left factoring, if the productions are of the form
 Î±Î²1|αβ2|…αβn where β1,…βn­(VN Î£)

Removal of Left Factoring

Let the variable A has left factoring productions as follows
 Î±Î²1|αβ2|…|αβn|y1|y2|y3|…|ym
where, β1, β2…..,βn have a common factor
α and y1, y2,….ym does not contain a as a prefix, then we replace the production into the form as follows
 Î±A1|Y1Y2|…..|YM, where
A1  Î²1|β2|…..|βn

Eliminate the Useless Productions/Symbols

The symbols that cannot be used in any productions due to their unavailability in the productions or inability in deriving the terminals, are known as useless symbols.
e,g., consider the grammar G with the following production rules
 aS |A| C
 a
 aa
 ab
Step 1 Generate the list of variables those produce terminal symbols
U = {A, B, S}
Because C does not produce terminal symbols so this production will be deleted. Now the modified productions are
 aS |A
 a
 aa
Step 2 Identity the variables dependency graph
 AB
In this graph, B variable is not reachable from S so it will be deleted also. Now the productions are
 aS |A
 a

Eliminate Null Productions

If any variable goes to ∧ then that is called as nullable variable.
e.g., A  ∧, then variable A is said to be nullable variable
Step 1 Scan the nullable variables in the given production list.
Step 2 Find all productions which does not include null productions.
e.g., consider the CFG has following productions
 ABaC
 BC
 b|∧
 D|∧
 d
solve step find the nulable variables firstly the set is empty
N = {}
N = {B, C}
N = {A, B, C}
Due to B, C variables, A will also be a nullable variable.
Step 3  {Null productions}
 BaC | AaC | ABa | aC | Ba | Aa | a
 B | C
 b
 D
 d
The above grammar is the every possible combination except ∧ Now put this new grammar with original grammar with null.
 ABaC | BaC | AaC | ABa | aC | Ba | Aa | a

Eliminate the Unit-Productions

A production of the type A  B, where A, B are variables is called unit productions.
Step 1 Using productions, we create dependency graph
image008
 B
∵ S image001 B & B image001 A
 S image001 A
Step 2 Now the production without unit productions
 Aa S  bb | a | bc
 bb + A  bb
 a | bc B  a | bc
Now the final grammar is
 Aa | bb | a | bc
 bb | a | bc
 a | bc | bb

Normal Forms of CFGs

Ambiguity is the undesirable property of a context-free grammar that we might wish to eliminate. To convert a context-free grammar into normal form, we start by trying to eliminate null productions of the form A  ∧ and the unit productions of the form B  C.
There are two normal forms
  1. Chomsky Normal Form (CNF)
  2. Greibach Normal Form (GNF)

Chomsky Normal Form (CNF)

A context-free grammar G is said to be in Chomsky Normal Form, if every production is of the form either A  a, (exactly a single terminal in the right hand side of the production) or A  BC (exactly two variables in the right hand side of the production).
e.g., the context-free grammar G with productions S  AB, A  a, B  b is in Chomsky normal form.

Chomsky Normal Form Properties

  • The number of steps in derivation of any string ω of length n is 2n – 1, where the grammar should be in CNF.
  • The minimum height of derivation tree of any ω of length n is [log2 n] + 1.
  • The maximum height of derivation tree of any ω of length n = n.

Greibach Normal Form (GNF)

A context-free grammar is said to be in Greibach Normal Form, if every production is of the form
 aα
where, aΣ, AVN and αimage010

Deterministic Context-Free Language (DCFL)

The set of deterministic context-free languages is a proper subset of the set of context-free languages that possess an unambiguous context-free grammar.
image011

Key Points

  • The problem of whether a given context-free language is deterministic is undividable.
  • Deterministic context-free languages can be recognized by a deterministic turning machine in polynomial time and O (log2 n) space.
  • The language of this class have great practical importance in computer science as they can be passed much more efficiently then non deterministic context-free languages.

Pushdown Automata (PDA)

A Pushdown Automata (PDA) is essentially an NFA with a stack. A PDA is inherently non-deterministic. To handle a language like {an bn |n ≥ 0}, the machine needs to remember the number of a’s and b’s. To do this, we use a stack. So, a PDA is a finite automaton with a stack. A stack is a data structure that can contain an number of elements but for which only the top element may be accessed.

Definition of PDA

A Pushdown Automaton (PDA) is defined as 7-tuple.
M = (Q, Σ, Γ, δ, q0, Z,F)
where, Q is a finite set of sates
Σ is the input alphabet
Γ is the stack alphabet
δ is the transition function which maps
(Q × (Σ  {ε}) × (Γ {ε}) = (Q × (Γ  {ε}))
q0 is the start state and ε denotes the empty string.
q0Q is start state
Γ is the initial stack symbol
F ⊆ Q is the set of final or accepting states.

Acceptance of PDA

Tape is divided into finitely many cells. Each cell contains a symbol in an alphabet L. The stack head always scans the top symbol of the stack as shown in figure.
It performs two basic operations
Push add a new symbol at the top.
Pop read and remove the top symbol.
δ (q, a, v) = (p, u)
It means that if the tape head reads input a, the stack head read v and the finite control is in state q, then one of the possible moves is that the next state is p, v is replaced by u at stack and the tape head moves one cell to the right.
image012
δ (q, ε, v) = (p, u)
It means that this is a ε -move (null move)
δ (q, a, ε) = (p, u)
It means that a push operation performs on stack.
δ (q, a, v) = (p, ε)
It means that a pop operation performs on stack.
PDA can accept a string in three ways:
  • PDA acceptance by Empty Stack: If the stack is empty after reading the entire input string then PDA accepted the given string, otherwise rejected.
  • PDA acceptance by Final State: If the stack reaches final state after reading the input string then PDA accepted the given string, otherwise rejected.
  • PDA acceptance by Final State and Empty Stack: If the stack reaches final state and also stack is empty after reading the entire input string then PDA accepted the given string, otherwise rejected.
Non-deterministic PDA: Like NFA, Non-deterministic PDA (NPDA) has number of choices for its inputs. An NPDA accepts an input, if sequence of choices leads to some final state or causes PDA to empty its stack.

Deterministic PDA

Deterministic PDA (DPDA) is a pushdown automata whose action is an situation is fully determined rather than facing a choice between multiple alternative actions. DPDAs cannot handle languages or grammars with ambiguity. A deterministic context-free language is a language recognised by some deterministic pushdown automata.
Following languages are DCFLs
  • L = {anbn : n ≥ 0}
  • L = {ancb2n : n ≥ 0}
  • L = {ωcωR : ω(a + b) * but not L = {ωωR : (a + b)*}
  • For every regular set, there exists a CFG e such that L = L (G).
  • Every regular language is a CFL.
  • Let G1 and G2 be context-free grammars. Then, G1 and Gare equivalent if and only if L (G1) = L (G2).
image013
  • The intersection of a context-free language and a regular language is a context-free language.
  • The reverse of a context-free language is context-free.
  • A DFA can remember only a finite amount of information whereas a PDA can remember an infinite amount of information.
  • For every PDA, there is a context-free grammar and for every context-free grammar, there is a PDA.
  • If L1 is a DCFL and L2 is regular then, L1  L2 is also DCFL.
  • If L1 is a DCFL and L2 is a regular language, then L1  L2 is also DCFL.
  • Every regular language is DCFL.
  • The power of non-deterministic pushdown automata and deterministic pushdown automata is not same. But the power of non-deterministic pushdown automata and deterministic pushdown is same.
  • A FSM (Finite State Machine) with one stack is more powerful than FSM without stack.
  • If left recursion or left factoring is present, it is not sure that the grammar is ambiguous but there may be chance of ambiguity.

Closure Properties of CFLs

CFLs are closed under following properties:
  • Union
  • Concatenation
  • Kleene closure
  • Positive closure
  • Substitution
  • Homomorphism
  • Inverse homomorphism
  • Reversal
  • Intersection with regular
  • Union with regular
  • Difference with regular
CFLs are not closed under following properties :
  • Intersection
  • Complementation
  • Difference

Decision Properties of CFL’s

Decidable Problems:

  • Emptyness Problem: Is a given L(G) accepts empty?
  • Membership problem: Is a string w accepted by given PDA?
  • Finiteness Problem: Is a given L(G) accepts finite language?

Undecidable Problems:

  • Equivalence problem: Are two CFLs the same?
  • Ambiguity problem: Is a given CFG ambiguous?
  • Is a given CFL inherently ambiguous?
  • Is the intersection of two CFLs empty?
  • Totality problem: Is a given L(G) equal to ∑* ?
Regards 
Amrut Jagdish Gupta 

Comments

Popular posts from this blog

Undecidability

Amu_Ke_Fundye Undecidability Decidable Problem If there is a Turing machine that decides the problem, called as Decidable problem. A decision problem that can be solved by an algorithm that halts on all inputs in a finite number of steps. A problem is decidable, if there is an algorithm that can answer either yes or no. A language for which membership can be decided by an algorithm that halts on all inputs in a finite number of steps. Decidable problem is also called as totally decidable problem, algorithmically solvable, recursively solvable. Undecidable Problem A problem that cannot be solved for all cases by any algorithm whatsoever. Equivalent Language cannot be recognized by a Turing machine that halts for all inputs. The following problems are undecidable problems: Halting Problem:  A halting problem is undecidable problem. There is no general method or algorithm which can solve the halting problem for all possible inputs. Emptyness Problem:...

ALU and Data Path

Amu_Ke_Fundye ALU and Data Path The CPU can be divided into a data section and a control section. The   data section , which is also called the   datapath. Datapath The registers, the ALU, and the interconnecting bus are collectively referred to as the datapath.  Each bit in datapath is functionally identical.  The datapath is capable of performing certain operations on data items. The  control section  is basically the  control unit ,  which issues control signals to the datapath . Bus : A Bus is a collection of wires or distinct lines meant to carry data, address and control information. Data Bus : it is used for transmission of data. The number of data lines corresponds to the number of bits in a word. Address Bus : it carries the address of the main memory location from where the data can be accessed. Control Bus : it is used to indicate the direction of data transfer and to coordinate the timing of ...

Funny Shortcut to Remember Periodic Table

Amu_Ke_Fundye Funny Shortcut to Remember Periodic Table THE PERIODIC TABLE The periodic table is a tabular display of the chemical elements, organized on the basis of their atomic number's , electron configurations, and chemical properties.In order to appreciate the general trends in chemistry and to explain the deviations of some from these general trends we need to to have good knowledge about the position of those elements in periodic table. This knowledge is also extremely important for competitive exams. So, here I present a way to memorize this using some funny easy shortcut sentences. THE FIRST 18 ELEMENTS (H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar)  H ie  He   Li es  Be cause  B oron  C an N ot  O xide  Fl orine.. N e w  Na tions  M ight  Al so  Si ng  P eaceful  S ong  Cl early  A gain  THE ...