Chapter 7 2 min read
Save

Formal Languages and Automata

Mathematical Foundation of Computer Science · BCA · Updated Apr 23, 2026

Table of Contents

Formal Languages and Automata

Automata theory studies abstract machines and the problems they can solve. It defines what computation is and what its limits are. Formal languages are sets of strings defined by grammars or recognised by automata.

Alphabet, Strings, and Languages

An alphabet Σ is a finite set of symbols. A string is a finite sequence of symbols from Σ. The empty string is ε. A language is a set of strings over Σ. Operations include concatenation, union, intersection, complement, and Kleene star (L* = {ε} ∪ L ∪ LL ∪ ...).

Finite Automata

A Deterministic Finite Automaton (DFA) is (Q, Σ, δ, q₀, F) — states, alphabet, transition function, start state, and accepting states. It reads input one symbol at a time and transitions deterministically. A Nondeterministic FA (NFA) allows multiple transitions from a state on the same input. DFA and NFA recognise the same class of languages.

Regular Expressions

Regular expressions define regular languages using union (|), concatenation, and Kleene star (*). Example: (a|b)*abb matches strings over {a,b} ending in "abb". Every regular expression has an equivalent NFA and vice versa. Regular expressions are used in text processing, lexical analysis, and pattern matching.

Context-Free Grammars

A Context-Free Grammar (CFG) G = (V, T, P, S) has variables, terminals, productions, and a start symbol. Productions have the form A → α. CFGs generate context-free languages (CFLs), which include programming languages. Parse trees represent derivations. Ambiguous grammars have multiple parse trees for a string.

Pushdown Automata

A Pushdown Automaton (PDA) is a finite automaton augmented with a stack. It recognises context-free languages. The stack provides memory for matching nested structures (parentheses, HTML tags). Deterministic PDAs recognise a subset of CFLs used in most programming language parsers.

Turing Machines

A Turing Machine (TM) has an infinite tape, a head that reads/writes/moves, and states. It recognises recursively enumerable languages and decides recursive languages. The Church-Turing thesis states that anything intuitively computable is Turing-computable.

Chomsky Hierarchy

The Chomsky hierarchy classifies languages: Type 3 (regular, FA), Type 2 (context-free, PDA), Type 1 (context-sensitive, linear-bounded automaton), Type 0 (recursively enumerable, TM). Each level strictly contains the ones below it.

Summary

Automata and formal languages form the theoretical backbone of computer science. They define the capabilities and limitations of computation, underpin compiler design, and guide algorithm development.

Related Notes

Discussion

0 comments

Join the discussion

Log in to share your thoughts and help fellow students.

Log in to comment

No comments yet. Be the first to share your thoughts!