Skip to main content

Regular Expressions and Finite Automata

Amu_Ke_Fundye


Regular Expressions and Finite Automata


Finite Automaton

Finite State Machine (FSM) or finite state automaton is an abstract machine used in the study of computation and language that has only a finite, constant amount of memory.
image001
Types of Finite Automaton:
  • Deterministic Finite Automata (DFA)
  • Non-Deterministic Finite Automata (NFA or NDFA)
  • NFA with Epsilon moves (epsilon-NFA)

Description of Finite Automaton

A finite automaton is defined as 5-tuples (Q, Σ, δ, q0, F). where, Q is finite non-empty set of states, Σ is finite non-empty set of input d alphabets, δ is transition function which maps Q × Σ into Q, q0 is initial state and q0 ∈ Q, F is set of final states and F  Q.
  • For DFA: Q × Σ → Q
  • For NFA: Q × Σ  2Q

Transition Diagrams

A transition diagram is a finite directed labelled graph in which each vertex represents a state and directed edges indicate the transition from one state to another. Edges are labelled with input/output. In this, the initial state is represented by a circle with an arrow towards it, the final state is represented by two concentric circles and intermediate states are represented by just a circle.
Example: Consider the following DFA.
image002
In the given transition diagram, vertex A is the initial state of the finite automata and C is the final state.

Transition Table

Transition table is the tabular representation of transition system. In this representation, initial (start) state is represented by an arrow towards it and a final state is represented by a circle or prefixed with star symbol. Following transition table is equivalent to the above given transition diagram.
image003
(Notations: → : initial state; *: final state)

Acceptability of a String by Finite Automata

A string ω is accepted by a finite automaton, M = (Q, Σ, δ, q0, F), if δ(q0, ω) = F, where q0 is initial (start) state and F is the final state. For every input symbol of string it takes one transition and starts with initial state and reaches final state by reading the complete string.

Transition Function

It takes two arguments i.e., a state and an input symbol. δ(q, a) is the transition function for the DFA (Deterministic Finite Automata) which is at the state q and when it receives the input a, DFA will move to the next state.

Extended Transition Function

It takes two arguments a state and an input string.
δ*(q, ω) = δ(δ(q, x), a)
where, ω is a string i.e., ω = xa, in which a is a single word and x is remaining string except the last symbol.

Equivalence of DFA and NDFA

In contrast to the NFA (NDFA), the Deterministic Finite Automata (DFA) has
  • No ε-transition (null transition).
  • For every (q, a) with q  Q and a  Σ at most one successor state.
  • Deterministic finite automata can simulate the behaviour of NFA by increasing the number of states.
  • In Deterministic Finite Automata (DFA), for each state, there is at most one transition for each possible input.
  • In non-deterministic finite automata, there can be more than one transition from a given state for a given possible input.
  • If a language L is accepted by an NFA, then there is a DFA that accepts L.
  • All DFA are NDFA, but not all NFA are DFA.
  • All NDFA and DFA have the same power.
  • Processing of an input string is more time consuming when NFA is used and less time consuming when DFA is used.

Finite Automata with Output Capabilities

There are two types of machines: Moore machine and Mealy machine.

Moore Machine
It is a finite automata in which output is associated with each state.
Moore machine is defined as 6-tuples (Q, Σ, Δ, δ, λ, q0)
where, Q is finite non-empty set of states.
  • Σ is set of input symbols.
  • Δ is output alphabet.
  • δ is transition function which maps δ(Σ × Q)  Q.
  • λ is output function which maps Q into Δ.
  • q0 is initial state.

Mealy Machine
It is a finite automata in which the output depends upon both the present input and present state. Mealy machine is also a 6-tuple (Q, Σ, Δ, δ, λ, q0), where all symbols except λ have the same meaning as in Moore machine. λ is the output function mapping Σ × Q into Δ.

Regular Expression

Regular expressions mean to represent certain sets of strings in some algebraic fashion. A regular expression over the alphabet Σ is defined as follows
  • ϕ is a regular expression corresponding to the empty language ϕ.
  • ε is a regular expression corresponding to the language {ε}.
  • For each symbol a ∈Σ a is a regular expression corresponding to the language {a}.

Regular Language

The languages accepted by FA are regular languages and these languages are easily described by simple expressions called regular expressions.
For any regular expression r and s over Σ corresponding to the languages Lr and Lsrespectively, each of the following is a regular expression corresponding to the language indicated.
  • (rs) corresponding to the language LrLs
  • (r + s) corresponding to the language Lr  Ls
  • r* corresponding to the language Lr.Some examples of regular expression are
    1. L (01) = {0, 1}
    2. L (01 + 0) = {01, 0}
    3. L (0 (1+ 0)) = {01, 00}
    4. L (0*) = {ε, 0, 00, 000, …}
    5. L ((0 + 10)* (ε + 1)) = all strings of 0’s and 1’s without two consecutive 1’s.
  • If L1 and L2 are regular languages in Σ*, then L1  L2, L1  L2, L1 – L2 and L1(complement of L1), are all regular languages.
  • Pumping lemma is a useful tool to prove that a certain language is not regular.

Regular Set

A set represented by a regular expression is called regular set e.g., If Σ = {a, b} is an alphabet, then
Different Regular Sets and their Expressions
image004
Identities for Regular Expressions 

The following points are the some identities for regular expressions.
  • ϕ + R = R + ϕ = R
  • ε R = R ε = R
  • R + R = R, where R is the regular expression.
  • (R*)* = R* ϕR = Rϕ = ϕ
  • ε * = ε and ϕ* = ε
  • RR* = R*R = R+
  • R*R* = R*
  • (P + Q)* = (P*Q*)* = (P* + Q*)*, where P and Q are regular expressions.
  • R (P + Q) = RP + RQ and (P + Q)R = PR + QR
  • P(QP)* = (PQ)*P 

Arden’s Theorem

  • If P and Q are two expressions over an alphabet Σ such that P does not contain ε, then the following equation R = Q + RP.
  • The above equation has a unique solution i.e., R = QP*. Arden’s theorem is used to determine the regular expression represented by a transition diagram.
  • The following points are assumed regarding transition diagrams
  • The transition system does not have any
  • It has only one initial (starting) stage.

Properties of Regular Language

Regular languages are closed under following properties
  1. Union
  2. Concatenation
  3. Kleene closure
  4. Complementation
  5. Transpose
  6. Intersection

Regards
Amrut Jagdish Gupta

Comments

Popular posts from this blog

Undecidability

Amu_Ke_Fundye Undecidability Decidable Problem If there is a Turing machine that decides the problem, called as Decidable problem. A decision problem that can be solved by an algorithm that halts on all inputs in a finite number of steps. A problem is decidable, if there is an algorithm that can answer either yes or no. A language for which membership can be decided by an algorithm that halts on all inputs in a finite number of steps. Decidable problem is also called as totally decidable problem, algorithmically solvable, recursively solvable. Undecidable Problem A problem that cannot be solved for all cases by any algorithm whatsoever. Equivalent Language cannot be recognized by a Turing machine that halts for all inputs. The following problems are undecidable problems: Halting Problem:  A halting problem is undecidable problem. There is no general method or algorithm which can solve the halting problem for all possible inputs. Emptyness Problem:...

ALU and Data Path

Amu_Ke_Fundye ALU and Data Path The CPU can be divided into a data section and a control section. The   data section , which is also called the   datapath. Datapath The registers, the ALU, and the interconnecting bus are collectively referred to as the datapath.  Each bit in datapath is functionally identical.  The datapath is capable of performing certain operations on data items. The  control section  is basically the  control unit ,  which issues control signals to the datapath . Bus : A Bus is a collection of wires or distinct lines meant to carry data, address and control information. Data Bus : it is used for transmission of data. The number of data lines corresponds to the number of bits in a word. Address Bus : it carries the address of the main memory location from where the data can be accessed. Control Bus : it is used to indicate the direction of data transfer and to coordinate the timing of ...

Funny Shortcut to Remember Periodic Table

Amu_Ke_Fundye Funny Shortcut to Remember Periodic Table THE PERIODIC TABLE The periodic table is a tabular display of the chemical elements, organized on the basis of their atomic number's , electron configurations, and chemical properties.In order to appreciate the general trends in chemistry and to explain the deviations of some from these general trends we need to to have good knowledge about the position of those elements in periodic table. This knowledge is also extremely important for competitive exams. So, here I present a way to memorize this using some funny easy shortcut sentences. THE FIRST 18 ELEMENTS (H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar)  H ie  He   Li es  Be cause  B oron  C an N ot  O xide  Fl orine.. N e w  Na tions  M ight  Al so  Si ng  P eaceful  S ong  Cl early  A gain  THE ...