hamburger

Study Notes on Context-Free Grammars and Push-Down Automata

By BYJU'S Exam Prep

Updated on: September 25th, 2023

Context-Free Language

  • The languages which are generated by context-free grammars are called Context-Free Languages (CFLs). 
  • CFLs are accepted by Pushdown Automata.
  • CFLs are also called non-deterministic CFL.

Definition

If v in one-step derivable from u, then written as  u ⇒ v. If v is derivable from u, written u  if there is a chain of one derivation of the form u ⇒ u1 ⇒ u2 ⇒ … ⇒ v

Example: Consider the context free grammar G = ({s}, {0, 1}, P, S) where Productions are:

(i) S → 0S1

(ii) S →ε

Derivations are:

image003

 

Derivations:

A derivation of a string is a sequence of rule applications. The language known a context-free grammar is the set of strings derivable from the start symbol S (for sentence).

A string can be derived with any of the given below derivations.

Left Most Derivation:

  • In each sentential form, the left-most non-terminal is substituted first to derive a string from the starting symbol.
  • A derivation is left most if at each step in the derivation a production is applied to the leftmost non-terminal in the sentential form.

Right Most Derivation: 

  • In each sentential form, the rightmost non-terminal is substituted first to derive a string from the starting symbol.
  • A derivation is left most if at each step in the derivation a production is applied to the leftmost non-terminal in the sentential form.

Example:

  • Every derivation corresponds to one derivation tree.

S ⇒ AB

⇒ aAAB

⇒ aaAB

⇒ aaaB

⇒ aaab

  • Every derivation tree corresponds to one or more derivations.

S ⇒ AB S ⇒ AB S ⇒ AB

⇒ aAAB ⇒ Ab ⇒ AB

⇒ aaAB ⇒ aAAb ⇒ aAAb

⇒ aaaB ⇒ aAab ⇒ aaAb

⇒ aaab ⇒ aaab ⇒ aaab

Derivation Tree (Parse Tree)

  • A derivation tree (or parse tree) can be defined with any non-terminal as the root, internal nodes are non-terminals and leaf nodes are terminals.
  • Every derivation corresponds to one derivation tree.
  • If a vertex A has k children with labels A1, A2, A3,…Ak, then A → A1 A2 A3…Ak will be a production in context-free grammar G.

Example:  

S → AB, A → aAA, A → aA, B → bB, B → b

image004

  

Ambiguous Grammar

A context-free grammar G is ambiguous if there is at least one string in L(G) having two or more distinct derivation trees (or equivalently, two or more distinct left-most derivations or two or more distinct right-most derivations).

e.g., consider the context-free grammar G having productions E E + E/a. The string a + a + a has two leftmost derivations.

Let’s see the derivations

E E + E E + E + E a + E + E a + a + E a + a + a

E E + E a + E a E + E a + a + E a + a + a

and the derivation trees are

image006

CFG Simplification

The four main steps will be followed in CFG simplification

  • Eliminate ambiguity.
  • Eliminate useless symbols productions.
  • Eliminate ∧ productions: A  ∧
  • • Eliminate unit productions: A B

Eliminate the Ambiguity

We can remove the ambiguity by removing the left recursing and left factoring.

Left Recursion

A production of the context free grammar G = (VN, E, P, S) is said to be left recursive if it is of the form

A

Where A is a nonterminal and

α(VN E)*

Removal of Left Recursion

Let the variable A has left recursive productions as follows

(i) A 1 |Aα2|Aα3|…|Aαn123|…|βn

Where β1, β2 ….. βn do not begin with A. Then we replace A production in the form of

(ii) A β1A12A1|…|βmA1 where

A1 α1A12A13A1|…,|αnA1|∧

Left Factoring

Two or more productions of a variable A of the grammar G = (VN, E, S, P) are said to have left factoring if the productions are of the form

A αβ1|αβ2|…αβn where β1,…βn­(VN Σ)

Removal of Left Factoring

Let the variable A has left factoring productions as follows

A αβ1|αβ2|…|αβn|y1|y2|y3|…|ym

where, β1, β2…..,βn have a common factor

α and y1, y2,….ym does not contain a as a prefix, then we replace the production into the form as follows

A αA1|Y1Y2|…..|YM, where

A1 β12|…..|βn

Eliminate the Useless Productions/Symbols

The symbols that cannot be used in any productions due to their unavailability in the productions or inability in deriving the terminals are known as useless symbols.

e,g., consider the grammar G with the following production rules

S aS |A| C

A a

B aa

C ab

Step 1 Generate the list of variables that produce terminal symbols

U = {A, B, S}

Because C does not produce terminal symbols so this product will be deleted. Now the modified productions are

S aS |A

A a

B aa

Step 2 Identify the variables dependency graph

S AB

In this graph, B variable is unreachable from S so it will be deleted too and  Now the productions are as-

S aS |A

A a

Eliminate Null Productions

If any variable goes to ∧ then that is called a nullable variable.

e.g., A ∧, then variable A is called a nullable variable

Step 1 -Firstly, Scan the nullable variables in the given production list.

Step 2 – Then find all productions which do not include null productions.

image007(Null productions)

e.g., consider the CFG has the following productions

S ABaC

A BC

B b|∧

C D|∧

D d

solve step find the nullable variables Firstly the set is empty

N = {}

N = {B, C}

N = {A, B, C}

Due to B, C variables, A will also be a nullable variable.

Step 3 image007 {Null productions}

S BaC | AaC | ABa | aC | Ba | Aa | a

A B | C

B b

C D

D d

The above grammar is every possible combination except ∧ Now put this new grammar with original grammar with null.

S ABaC | BaC | AaC | ABa | aC | Ba | Aa | a

Eliminate the Unit-Productions

Production of type A B, where A, B are variables is called unit productions.

Step 1 Using productions, we create a dependency graph

image008

S B

∵ S image001 B & B image001 A

S image001 A

Step 2 Now the production without unit productions

S Aa S bb | a | bc

B bb + A bb

A a | bc B a | bc

Now the final grammar is

S Aa | bb | a | bc

B bb | a | bc

A a | bc | bb

Normal Forms of CFGs

Ambiguity is the undesirable property of context-free grammar that we may wish to eliminate. To convert a context-free grammar into normal form, we start by trying to eliminate null productions of form A ∧ and the unit productions of form B C.

There are two normal forms

  1. Chomsky Normal Form (CNF)
  2. Greibach Normal Form (GNF)

Chomsky Normal Form (CNF)

A context-free grammar G is said to be in Chomsky Normal Form if every product is of the form either A a, (exactly a single terminal on the right-hand side of the production) or A BC (exactly two variables on the right-hand side of the production).

e.g., context-free grammar G with productions like S AB, A a, B b is in Chomsky normal form(CnF)form.

Chomsky Normal Form Properties

  • The number of steps in the derivation of any string ω of length n is 2n – 1, where the grammar should be in CNF.
  • The minimum height of the derivation tree of any ω of length n is [log2 n] + 1.
  • The maximum height of the derivation tree of any ω of length n = n.

Greibach Normal Form (GNF)

A context-free grammar is said to be in Greibach Normal Form if every product is of the form

A

where, aΣ, AVN and αimage010

Deterministic Context-Free Language (DCFL)

The set of deterministic context-free languages(DCFL) is a proper subset of the set of context-free languages that possess/bears an unambiguous context-free grammar.

image011

Key Points

  • The problem of whether a given context-free language is deterministic is undividable.
  • Deterministic context-free languages can be recognized by a deterministic turning machine in polynomial time and O (log2 n) space.
  • The language of this class has great practical importance in computer science as it can be passed much more efficiently than non-deterministic context-free languages.

Pushdown Automata (PDA)

A Pushdown Automata (PDA) is generally an NFA with a stack. A PDA is inherently non-deterministic. To handle a language like {an bn |n ≥ 0}, the machine needs to remember the number of a’s and b’s. To do this, we use a stack. So, a PDA is said a finite automaton with a stack. A stack is a data structure that can contain a number of elements but for which only the top element may be accessed.

Definition of PDA

A Pushdown Automaton (PDA) is defined as 7-tuple.

M = (Q, Σ, Γ, δ, q0, Z,F)

where Q is a finite set of states

Σ is the input alphabet

Γ is the stack alphabet

δ is the transition function that maps

(Q × (Σ {ε}) × (Γ {ε}) = (Q × (Γ {ε}))

q0 is the start state and ε denotes the empty string.

q0Q is the start state

Z Γ is the initial stack symbol

F ⊆ Q is the set of final or accepting states.

Acceptance of PDA

Tape is divided into finitely many cells. Each cell contains a symbol in an alphabet L. The stack head always scans the top symbol of the stack as shown in the figure.

It performs two basic operations

Push add a new symbol at the top.

Pop read and remove the top symbol.

δ (q, a, v) = (p, u)

It means that if the tape head reads specific input a, the stack head read v and the finite control is in state q, then one of the possible moves might be that the next state is p, v is replaced by u at the stack, Thus tape head moves one cell to the right.

image012

δ (q, ε, v) = (p, u)

It thus means that this is a ε -move (null move)

δ (q, a, ε) = (p, u)

It means that a push operation performs on a stack.

δ (q, a, v) = (p, ε)

It means that a pop operation performs on a stack.

PDA can accept a string in three ways:

  • PDA acceptance by Empty Stack: If the stack is empty after reading the entire input string then PDA accepted the given string, otherwise rejected.
  • PDA acceptance by Final State: If the stack reaches the final state after reading the input string then PDA accepted the given string, otherwise rejected.
  • PDA acceptance by Final State and Empty Stack: If the stack reaches the final state and also stack is empty after reading the entire input string then PDA accepted the given string, otherwise rejected.

Non-deterministic PDA: Like NFA, Non-deterministic PDA (NPDA) has a number of choices for its inputs. An NPDA accepts input if the sequence of choices leads to some final state or causes PDA to empty its stack.

Deterministic PDA

Deterministic PDA (DPDA) is a pushdown automaton whose action is a situation is fully determined rather than facing a choice between multiple alternative actions. DPDAs cannot handle languages or grammar with ambiguity. A deterministic context-free language is a language recognised by some deterministic pushdown automata.

Following languages are DCFLs

  • L = {anbn : n ≥ 0}
  • L = {ancb2n : n ≥ 0}
  • L = {ωcωR : ω(a + b) * but not L = {ωωR : (a + b)*}
  • For every regular set, there exists a CFG e such that L = L (G).
  • Every regular language is a CFL.
  • Let G1 and G2 be context-free grammars. Then, G1 and G2 are equivalent if and only if L (G1) = L (G2).

image013

  • The intersection of a context-free language and a regular language is a context-free language.
  • The reverse of a context-free language is context-free.
  • A DFA can remember only a finite amount of information whereas a PDA can remember an infinite amount of information.
  • For every PDA, there is context-free grammar and for every context-free grammar, there is a PDA.
  • If L1 is a DCFL and L2 is regular then, L1 L2 is also DCFL.
  • If L1 is a DCFL and L2 is a regular language, then L1 L2 is also DCFL.
  • Every regular language is DCFL.
  • The power of nondeterministic pushdown automata and deterministic pushdown automata is not the same. But the power of nondeterministic pushdown automata and deterministic pushdown is the same.
  • An FSM (Finite State Machine) with one stack is more powerful than FSM without a stack.
  • If left recursion or left factoring is present, it is not sure that the grammar is ambiguous but there may be a chance of ambiguity.

Closure Properties of CFLs

CFLs are closed under the following properties:

  • Union
  • Concatenation
  • Kleene closure
  • Positive closure
  • Substitution
  • Homomorphism
  • Inverse homomorphism
  • Reversal
  • Intersection with regular
  • Union with regular
  • Difference with regular

CFLs are not closed under the following properties :

  • Intersection
  • Complementation
  • Difference

Decision Properties of CFL’s

Decidable Problems:

  • Emptiness Problem: Is a given L(G) accepts empty?
  • Membership problem: Is a string w accepted by a given PDA?
  • Finiteness Problem: Is a given L(G) accepts finite language?

Undecidable Problems:

  • Equivalence problem: Are two CFLs the same?
  • Ambiguity problem: Is a given CFG ambiguous?
  • Is a given CFL inherently ambiguous?
  • Is the intersection of two CFLs empty?
  • Totality problem: Is a given L(G) equal to ∑*?

You can follow the detailed champion study plan for GATE CS 2021 from the following link:

Detailed GATE CSE 2021 Champion Study Plan

Candidates can also practice 110+ Mock tests for exams like GATE, NIELIT with BYJU’S Exam Prep Test Series check the following link:

Click Here to Avail GATE CSE Test Series! (100+ Mock Tests)

Get unlimited access to 21+ structured Live Courses all 112+ mock tests with Online Classroom Program for GATE CS & PSU Exams:

Click here to avail Online Classroom Program for Computer Science Engineering

Thanks

#DreamStriveSucceed

Download BYJU’S Exam Prep app for GATE CS Preparation here
Our Apps Playstore
POPULAR EXAMS
SSC and Bank
Other Exams
GradeStack Learning Pvt. Ltd.Windsor IT Park, Tower - A, 2nd Floor, Sector 125, Noida, Uttar Pradesh 201303 help@byjusexamprep.com
Home Practice Test Series Premium