M.J.Espino: Marso 2011

Huwebes, Marso 31, 2011

Finite Automata

Another way to describe sets of character strings is with finite automata. Finite automata are also known as state machines, and we will use “automaton” and “machine” interchangeably.

As a simple example, here is a machine recognizing the set of strings matched by the regular expression a(bb)+a:

A finite automaton is always in one of its states, represented in the diagram by circles. (The numbers inside the circles are labels to make this discussion easier; they are not part of the machine's operation.) As it reads the string, it switches from state to state. This machine has two special states: the start state s₀ and the matching state s₄. Start states are depicted with lone arrowheads pointing at them, and matching states are drawn as a double circle.

The machine reads an input string one character at a time, following arrows corresponding to the input to move from state to state. Suppose the input string is abbbba. When the machine reads the first letter of the string, the a, it is in the start state s₀. It follows the a arrow to state s₁. This process repeats as the machine reads the rest of the string: b to s₂, b to s₃, b to s₂, b to s₃, and finally a to s₄.

The machine ends in s₄, a matching state, so it matches the string. If the machine ends in a non-matching state, it does not match the string. If, at any point during the machine's execution, there is no arrow for it to follow corresponding to the current input character, the machine stops executing early.

The machine we have been considering is called a deterministic finite automaton (DFA), because in any state, each possible input letter leads to at most one new state. We can also create machines that must choose between multiple possible next states. For example, this machine is equivalent to the previous one but is not deterministic:

The machine is not deterministic because if it reads a b in state s₂, it has multiple choices for the next state: it can go back to s₁ in hopes of seeing another bb, or it can go on to s₃ in hopes of seeing the final a. Since the machine cannot peek ahead to see the rest of the string, it has no way to know which is the correct decision. In this situation, it turns out to be interesting to let the machine always guess correctly. Such machines are called non-deterministic finite automata (NFAs or NDFAs). An NFA matches an input string if there is some way it can read the string and follow arrows to a matching state.

Sometimes it is convenient to let NFAs have arrows with no corresponding input character. We will leave these arrows unlabeled. An NFA can, at any time, choose to follow an unlabeled arrow without reading any input. This NFA is equivalent to the previous two, but the unlabeled arrow makes the correspondence with a(bb)+a clearest:

Types of Proofs

Proof of Contruction

Proof by construction, or proof by example, is the construction of a concrete example with a property to show that something having that property exists. Joseph Liouville, for instance, proved the existence of transcendental numbers by constructing an explicit example.

Proof by Contradiction

In proof by contradiction (also known as reductio ad absurdum, Latin for "by reduction toward the absurd"), it is shown that if some statement were so, a logical contradiction occurs, hence the statement must be not so. This method is perhaps the most prevalent of mathematical proofs. A famous example of proof by contradiction shows that $\sqrt{2}$ is an irrational number:
Suppose that $\sqrt{2}$ were a rational number, so by definition $\sqrt{2} = {a\over b}$ where a and b are non-zero integers with no common factor. Thus, $b\sqrt{2} = a$ . Squaring both sides yields 2b² = a². Since 2 divides the left hand side, 2 must also divide the right hand side (as they are equal and both integers). So a² is even, which implies that a must also be even. So we can write a = 2c, where c is also an integer. Substitution into the original equation yields 2b² = (2c)² = 4c². Dividing both sides by 2 yields b² = 2c². But then, by the same argument as before, 2 divides b², so b must be even. However, if a and b are both even, they share a factor, namely 2.

Proof by Induction

A proof by induction is just like an ordinary proof in which every step must be justified. However it employs a neat trick which allows you to prove a statement about an arbitrary number n by first proving it is true when n is 1 and then assuming it is true for n=k and showing it is true for n=k+1. The idea is that if you want to show that someone can climb to the nth floor of a fire escape, you need only show that you can climb the ladder up to the fire escape (n=1) and then show that you know how to climb the stairs from any level of the fire escape (n=k) to the next level (n=k+1).

If you've done proof by induction before you may have been asked to assume the n-1 case and show the n case, or assume the n case and show the n+1 case. This is the same as what I'm explaining here but I will use a different letter because I think it avoids some confusion when trying to figure out what each case is.

Example 1: Prove 1+2+...+n=n(n+1)/2 using a proof by induction.
n=1: 1=1(2)/2=1 checks.

Miyerkules, Marso 30, 2011

Mathematical Notations and Terminology

A set is a collection of distinct objects, considered as an object in its own right. Sets are one of the most fundamental concepts in mathematics. Developed at the end of the 19th century, set theory is now a ubiquitous part of mathematics, and can be used as a foundation from which nearly all of mathematics can be derived. In mathematics education, elementary topics such as Venn diagrams are taught at a young age, while more advanced concepts are taught as part of a university degree.

A set is a finite or infinite collection of objects in which order has no significance, and multiplicity is generally also ignored (unlike a list or multiset). Members of a set are often referred to as elements and the notation is used to denote that is an element of a set. The study of sets and their properties is the object of set theory.

Older words for set include aggregate and set class. Russell also uses the unfortunate term manifold to refer to a set.

Historically, a single horizontal overbar was used to denote a set stripped of any structure besides order, and hence to represent the order type of the set. A double overbar indicated stripping the order from the set and hence represented the cardinal number of the set. This practice was begun by set theory founder Georg Cantor.

Symbols used to operate on sets include (which means "and" or intersection), and (which means "or" or union). The symbol is used to denote the set containing no elements, called the empty set.

The notation , where and are arbitrary sets, is used to denote the set of maps from to , For example, an element of would be a map from the natural numbers to the set . Call such a function , then , , etc., are elements of , so call them , , etc. This now looks like a sequence of elements of , so sequences are really just functions from to.This notation is standard in mathematics and is frequently used in symbolic dynamics to denote sequence spaces.

Sequences and Tuples

A sequence is an ordered list of objects (or events). Like a set, it contains members (also called elements or terms), and the number of terms (possibly infinite) is called the length of the sequence. Unlike a set, order matters, and exactly the same elements can appear multiple times at different positions in the sequence. A sequence is a discrete function.

For example, (C, R, Y) is a sequence of letters that differs from (Y, C, R), as the ordering matters. Sequences can be finite, as in this example, or infinite, such as the sequence of all even positive integers (2, 4, 6,...). Finite sequences are sometimes known as strings or words and infinite sequences as streams. The empty sequence ( ) is included in most notions of sequence, but may be excluded depending on the context.

Sequences are to calculus what at calculator is to a scientist. There are many ways to introduce sequences. Here we will follow a somewhat unorthodox way. Indeed, consider a scientist doing an experiment; he is collecting data, let us say, every day. So, put

to be the data collected the first day,

be the data collected the second day, and so on.... and

is the data collected after n days. Clearly, we are generating a set of numbers with a very special characteristic: there is an order on the number, that is, we naturally have the first number, the second number, and so on.... A sequence is by definition a set of real numbers with this natural order. We wil use the notation

,
to describe the sequence of numbers where

is the nth number.

A tuple is an ordered list of elements. In set theory, an (ordered) $n$ -tuple is a sequence (or ordered list) of

n

elements, where

n

is a positive integer. There is also one

0

-tuple, an empty sequence. An

n

-tuple is defined inductively using the construction of an ordered pair. Tuples are usually written by listing the elements within parentheses "

( )

" and separated by commas; for example,

(2,7,4,1,7)

denotes a

5

-tuple. Sometimes other delimiters are used, such as square brackets "

[ ]

" or angle brackets " $\langle\text{ }\rangle$ ". Braces "

{}

" are almost never used for tuples, as they are the standard notation for sets.

Tuples are often used to describe other mathematical objects. In algebra, for example, a ring is commonly defined as a 3-tuple $(E,+,\times)$ , where

E

is some set, and "

+

", and " $\times$ " are functions mapping the Cartesian product $E \times E$ to

E

with specific properties. In computer science, tuples are directly implemented as product types in most functional programming languages. More commonly, they are implemented as record types, where the components are labeled instead of being identified by position alone. This approach is also used in relational algebra.

Functions and Relations

A function, in a mathematical sense, expresses the idea that one quantity (the argument of the function, also known as the input) completely determines another quantity (the value, or the output). A function assigns exactly one value to each input of a specified type. The argument and the value may be real numbers, but they can also be elements from any given sets: the domain and the codomain of the function. An example of a function with the real numbers as both its domain and codomain is the function f(x) = 2x, which assigns to every real number the real number with twice its value. In this case, it is written that f(5) = 10.

In addition to elementary functions on numbers, functions

include maps between algebraic structures like groups and maps between geometric objects like manifolds. In the abstract set-theoretic approach, a function is a relation between the domain and the codomain that associates each element in the domain with exactly one element in the codomain. An example of a function with domain {A,B,C} and codomain {1,2,3} associates A with 1, B with 2, and C with 3.

There are many ways to describe or represent functions: by a formula, by an algorithm that computes it, by a plot or a graph. A table of values is a common way to specify a function in statistics, physics, chemistry, and other sciences. A function may also be described through its relationship to other functions, for example, as the inverse function or a solution of a differential equation. There are uncountably many different functions from the set of natural numbers to itself, most of which cannot be expressed with a formula or an algorithm.

Functions with numerical outputs may be added and multiplied, yielding new functions. Collections of functions with certain properties, such as continuous functions and differentiable functions, usually required to be closed under certain operations, are called function spaces and are studied as objects in their own right, in such disciplines as real analysis and complex analysis. An important operation on functions, which distinguishes them from numbers, is the composition of functions.

A relation is any subset of a Cartesian product. For instance, a subset of

, called a "binary relation from

," is a collection of ordered pairs

with first components from

and second components from

, and, in particular, a subset of

is called a "relation on

." For a binary relation

, one often writes

to mean that

is in

Graphs

A graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges. Typically, a graph is depicted in diagrammatic form as a set of dots for the vertices, joined by lines or curves for the edges. Graphs are one of the objects of study in discrete mathematics.

The edges may be directed (asymmetric) or undirected (symmetric). For example, if the vertices represent people at a party, and there is an edge between two people if they shake hands, then this is an undirected graph, because if person A shook hands with person B, then person B also shook hands with person A. On the other hand, if the vertices represent people at a party, and there is an edge from person A to person B when person A knows of person B, then this graph is directed, because knowing of someone is not necessarily a symmetric relation (that is, one person knowing of another person does not necessarily imply the reverse; for example, many fans may know of a celebrity, but the celebrity is unlikely to know of all their fans). This latter type of graph is called a directed graph and the edges are called directed edges or arcs; in contrast, a graph where the edges are not directed is called undirected.

Vertices are also called nodes or points, and edges are also called lines. Graphs are the basic subject studied by graph theory. The word "graph" was first used in this sense by James Joseph Sylvester in 1878.

Strings and Language

A string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and/or the length changed, or it may be fixed (after creation). A string is generally understood as a data type and is often implemented as a byte (or word) array that stores a sequence of elements, typically characters, using some character encoding. A string may also denote more general array data types and/or other sequential data types and structures; terms such as byte string, or more general, string of datatype, or datatype-string, are sometimes used to denote strings in which the stored data does not (necessarily) represent text.

Depending on programming language and/or precise datatype used, a variable declared to be a string may either cause storage in memory to be statically allocated for a predetermined max length, or it may employ dynamic allocation to allow it to hold chronologically variable number of elements. When a string appears literally in source code, it is known as a string literal and has a representation that denotes it as such.

Language may refer either to the specifically human capacity for acquiring and using complex systems of communication, or to a specific instance of such a system of complex communication. The scientific study of language in any of its senses is called linguistics.

The approximately 3000–6000 languages that are spoken by humans today are the most salient examples, but natural languages can also be based on visual rather than auditive stimuli, for example in sign languages and written language. Codes and other kinds of artificially constructed communication systems such as those used for computer programming can also be called languages. A language in this sense is a system of signs for encoding and decoding information. The English word derives from Latin lingua, "language, tongue." This metaphoric relation between language and the tongue exists in many languages and testifies to the historical prominence of spoken languages.^[1] When used as a general concept, "language" refers to the cognitive faculty that enables humans to learn and use systems of complex communication.

The word "language" has two meanings: language as a general concept, and "a language" (a specific linguistic system, e.g. "French"). Languages other than English often have two separate words for these distinct concepts. French for example uses the word langage for language as a concept and langue as the specific instance of language.

Boolean Logic

Boolean logic is a logical calculus of truth values, developed by George Boole in the 1840s. It resembles the algebra of real numbers, but with the numeric operations of multiplication xy, addition x + y, and negation −x replaced by the respective logical operations of conjunction x∧y, disjunction x∨y, and negation ¬x. The Boolean operations are these and all other operations that can be built from these, such as x∧(y∨z).

The laws of Boolean algebra can be defined axiomatically as certain equations called axioms together with their logical consequences called theorems, or semantically as those equations that are true for every possible assignment of 0 or 1 to their variables. The axiomatic approach is sound and complete in the sense that it proves respectively neither more nor fewer laws than the semantic approach.

Martes, Marso 29, 2011

Reasons of studying automata theory

The reasons studying automata theory is study of abstract machines or more approriately, abstract mathematical machines or system and the computational problems that can be solved using machines.These abstract machine are called automata.

The figure at right illustrates a finite state machine, which is one well-known variety of automaton. This automaton consists of states (represented in the figure by circles), and transitions (represented by arrows). As the automaton sees a symbol of input, it makes a transition (or jump) to another state, according to its transition function (which takes the current state and the recent symbol as its inputs).

Automata theory is also closely related to formal language theory, as the automata are often classified by the class of formal languages they are able to recognize. An automaton can be a finite representation of a formal language that may be an infinite set.

Computability Theory

Computability theory, also called recursion theory, is a branch of mathematical logic that originated in the 1930s with the study of computable functions and Turing degrees. The field has grown to include the study of generalized computability and definability. In these areas, recursion theory overlaps with proof theory and effective descriptive set theory.

The basic questions addressed by recursion theory are "What does it mean for a function from the natural numbers to themselves to be computable?" and "How can noncomputable functions be classified into a hierarchy based on their level of noncomputability?". The answers to these questions have led to a rich theory that is still being actively researched.

The field is also closely related to computer science. Recursion theorists in mathematical logic often study the theory of relative computability, reducibility notions and degree structures described in this article. This contrasts with the theory of subrecursive hierarchies, formal methods and formal languages that is common in the study of computability theory in computer science.

Lunes, Marso 28, 2011

Automata Theory

.
Automata are abstract mathematical models of machines that perform computations on an input by moving through a series of states or configurations. If the computation of an automaton reaches an accepting configuration it accepts that input. At each stage of the computation, a transition function determines the next configuration on the basis of a finite portion of the present configuration.

Turing machines are the most general automata. They consist of a finite set of states and an infinite tape which contains the input and is used to read and write symbols during the computation. Since Turing machines can leave symbols on their tape at the end of the computation, they can be viewed as computing functions: the partial recursive functions. Despite the simplicity of these automata, any algorithm that can be implemented on a computer can be modeled by some Turing machine.

Turing machines are used in the characterization of the complexity of problems. The complexity of a problem is determined by the efficiency of the best algorithm that solves it. Measures of an algorithm's efficiency are the amount of time or space that a Turing machine requires to implement the algorithm. A computation's time is the number of configurations involved in that computation, and its space corresponds to the number of positions on its tape that were used.

Automata theory has close ties to formal language theory, since there is a correspondence between certain families of automata and classes of languages generated by grammar formalisms. A language is accepted by an automaton when it accepts all of the strings in the language and no others. The family of Turing machines accept exactly the class of languages produced by Type 0 grammars. Turing machines whose space cannot be more than that occupied by the input (linear-bounded) accept the class of languages generated by context-sensitive (Type 1) grammars. The most restricted family of automata are finite automata consisting of only a finite number of states and a "read-only" tape containing the input to be read in one direction. Finite automata recognize the class of languages generated by regular (Type 3) grammars. These automata can be given a limited amount of extra power with the addition of certain forms of storage. For example, pushdown automata involve a pushdown store: a sequence in which symbols can only be added and removed from one end, with the effect that the first symbols in, are the last ones out. Pushdown automata accept the languages generated by context-free (Type 2) grammars.

Automata theory gave rise to the notion of deterministic computation, hence deterministic languages. In a deterministic computation each configuration of the machine has only one possible successor. For some families of automata (e.g., finite automata and Turing machines) deterministic and nondeterministic automata are equivalent. For others (e.g., pushdown automata) there are languages that can be accepted by a non-deterministic automata of that family but cannot be accepted by any deterministic automata.