M.J.Espino: The difference between Deterministic finite Automata and Non-Deterministic finita Automata

Deterministic Finite Automata

A DFA represents a finite state machine that recognizes a RE. For example, the following DFA:

recognizes (abc⁺)⁺. A finite automaton consists of a finite set of states, a set of transitions (moves), one start state, and a set of final states (accepting states). In addition, a DFA has a unique transition for every state-character combination. For example, the previous figure has 4 states, state 1 is the start state, and state 4 is the only final state.
A DFA accepts a string if starting from the start state and moving from state to state, each time following the arrow that corresponds the current input character, it reaches a final state when the entire input string is consumed. Otherwise, it rejects the string.
The previous figure represents a DFA even though it is not complete (ie, not all state-character transitions have been drawn). The complete DFA is:

but it is very common to ignore state 0 (called the error state) since it is implied. (The arrows with two or more characters indicate transitions in case of any of these characters.) The error state serves as a black hole, which doesn't let you escape.
A DFA is represented by a transition table T, which gives the next state T[s, c] for a state s and a character c. For example, the T for the DFA above is:

	a	b	c
0	0	0	0
1	2	0	0
2	0	3	0
3	0	0	4
4	2	0	4

Suppose that we want to build a scanner for the REs:

for - keyword	=	for
identifier	=	[a - z][a - z0 - 9]^*

The corresponding DFA has 4 final states: one to accept the for-keyword and 3 to accept an identifier:

(the error state is omitted again). Notice that for each state and for each character, there is a single transition.
A scanner based on a DFA uses the DFA's transition table as follows:

state = initial_state;
current_character = get_next_character();
while ( true )
{   next_state = T[state,current_character];
    if (next_state == ERROR)
       break;
    state = next_state;
    current_character = get_next_character();
    if ( current_character == EOF )
       break;
};
if ( is_final_state(state) )
   `we have a valid token'
else `report an error'

This program does not explicitly take into account the longest match disambiguation rule since it ends at EOF. The following program is more general since it does not expect EOF at the end of token but still uses the longest match disambiguation rule.

state = initial_state;
final_state = ERROR;
current_character = get_next_character();
while ( true )
{   next_state = T[state,current_character];
    if (next_state == ERROR)
       break;
    state = next_state;
    if ( is_final_state(state) )
       final_state = state;
    current_character = get_next_character();
    if (current_character == EOF)
       break;
};
if ( final_state == ERROR )
   `report an error'
else if ( state != final_state )
   `we have a valid token but we need to backtrack
         (to put characters back into the input stream)'
else `we have a valid token'

Is there any better (more efficient) way to build a scanner out of a DFA? Yes! We can hardwire the state transition table into a program (with lots of gotos). You've learned in your programming language course never to use gotos. But here we are talking about a program generated automatically, which no one needs to look at. The idea is the following. Suppose that you have a transition from state s₁ to s₂ when the current character is c. Then you generate the program:

s1: current_character = get_next_character();
    ...
    if ( current_character == 'c' )
       goto s2;
    ...
s2: current_character = get_next_character();
    ...

Non-Deterministic Finite Automata

An NFA is typically described using a directed graph as shown in Figure 11.8b, and is considered as a special kind of finite state machine. Each vertex of the graph represents a state, and edges represent possible transitions. An input string of finite length is read by the machine. Typically, the input string is a binary sequence of 0's and

's. The initial state is designated by an inward arrow that has no source vertex, as shown pointing into state

in Figure 11.8b. The machine starts in this state and reads the first symbol of the input string.

Based on its value, it makes appropriate transitions. For a DFA, the next state must be specified for each of the two inputs 0 and

from each state. From a state in an NFA, there may be any number of outgoing edges (including zero) that represent the response to a single symbol. For example, there are two outgoing edges if 0 is read from state

(the arrow from

actually corresponds to two directed edges, one for 0 and the other for

). There are also edges designated with a special $\epsilon$ symbol. If a state has an outgoing $\epsilon$ , the state may immediately transition along the edge without reading another symbol. This may be iterated any number of times, for any outgoing $\epsilon$ edges that may be encountered, without reading the next input symbol. The nondeterminism arises from the fact that there are multiple choices for possible next states due to multiple edges for the same input and $\epsilon$ transitions. There is no sensor that indicates which state is actually chosen.

The interpretation often given in the theory of computation is that when there are multiple choices, the machine clones itself and one copy runs each choice. It is like having multiple universes in which each different possible action of nature is occurring simultaneously. If there are no outgoing edges for a certain combination of state and input, then the clone dies. Any states that are depicted with a double boundary, such as state

in Figure 11.8, are called accept states. When the input string ends, the NFA is said to accept the input string if there exists at least one alternate universe in which the final machine state is an accept state.
The formulation usually given for NFAs seems very close to Formulation 2.1 for discrete feasible planning. Here is a typical NFA formulation [891], which formalizes the ideas depicted in Figure 11.8:

Formulation 11..2 (Nondeterministic Finite Automaton)

A finite state space .
A finite alphabet $\Sigma$ which represents the possible input symbols. Let $\Sigma_\epsilon = \Sigma \cup \{\epsilon\}$ .
A transition function, . For each state and symbol, a set of outgoing edges is specified by indicating the states that are reached.
A start state $x_0 \in X$ .
A set $A \subseteq X$ of accept states.

An NFA, similar to a DFA, consumes a string of input symbols. For each input symbol it transitions to a new state until all input symbols have been consumed.
Unlike a DFA, it is non-deterministic in that, for any input symbol, its next state may be any one of several possible states. Thus, in the formal definition, the next state is an element of the power set of states. This element, itself a set, represents some subset of all possible states to be considered at once.

An extension of the NFA is the NFA-lambda (also known as NFA-epsilon or the NFA with epsilon moves), which allows a transformation to a new state without consuming any input symbols. For example, if it is in state 1, with the next input symbol an a, it can move to state 2 without consuming any input symbols, and thus there is an ambiguity: is the system in state 1, or state 2, before consuming the letter a? Because of this ambiguity, it is more convenient to talk of the set of possible states the system may be in. Thus, before consuming letter a, the NFA-epsilon may be in any one of the states out of the set {1,2}. Equivalently, one may imagine that the NFA is in state 1 and 2 'at the same time': and this gives an informal hint of the powerset construction: the DFA equivalent to an NFA is defined as the one that is in the state q={1,2}. Transformations to new states without consuming an input symbol are called lambda transitions or epsilon transitions. They are usually labeled with the Greek letter λ or ε.

The notion of accepting an input is similar to that for the DFA. When the last input symbol is consumed, the NFA accepts if and only if there is some set of transitions that will take it to an accepting state. Equivalently, it rejects, if, no matter what transitions are applied, it would not end in an accepting state.

Formal definition

Two similar types of NFAs are commonly defined: the NFA and the NFA with ε-moves. The ordinary is defined as a 5-tuple, (Q, Σ, T, q₀, F), consisting of

a finite set of states Q
a finite set of input symbols Σ
a transition function T : Q × Σ → P(Q).
an initial (or start) state q₀ ∈ Q
a set of states F distinguished as accepting (or final) states F ⊆ Q.

Here, P(Q) denotes the power set of Q. The NFA with ε-moves (also sometimes called NFA-epsilon or NFA-lambda) replaces the transition function with one that allows the empty string ε as a possible input, so that one has instead

T : Q × (Σ ∪{ε}) → P(Q).

It can be shown that ordinary NFA and NFA with epsilon moves are equivalent, in that, given either one, one can construct the other, which recognizes the same language.

M.J.Espino

Lunes, Abril 25, 2011

The difference between Deterministic finite Automata and Non-Deterministic finita Automata

Formal definition

Walang komento:

Mag-post ng isang Komento