LANGUAGES

Our concept of language is an abstraction of the concept of a natural language.


LANGUAGE CONCEPTS AND NOTATION

alphabet, denoted by languages1.jpg
a nonempty set of symbols

By convention, we use lowercase letters a, b, c, ... to represent elements of languages2.jpg .

For example, languages3.jpg

strings
finite sequences of symbols from the alphabet

By convention, we use lowercase letters ..., u, v, w, x, y, z to represent strings.

For example, languages4.jpg is a string from the above alphabet.

concatenation of strings u and v
appending the symbols of v to the right end (i.e., after) the symbols of u, denoted by uv

If languages5.jpg and languages6.jpg , then languages7.jpg

associativity of string concatenation
languages8.jpg

Thus normally just write uvw without parentheses.

string concatenation is not commutative.
That is, languages9.jpg .

reverse of a string w, denoted by languages10.jpg .
string with same symbols, but with the order reversed

If languages11.jpg , then languages12.jpg .

length of a string w, denoted by languages13.jpg
the number of symbols in string w

empty string, denoted by languages14.jpg
the string with no symbols, languages15.jpg

The empty string is the identity element for concatenation, languages16.jpg


FORMAL INTERLUDE: INDUCTIVE DEFINITIONS AND INDUCTION

Inductive Definition of Length
Note: differs from textbook -- here begin with the empty string

  1. languages17.jpg
  2. languages18.jpg

Using the fact that languages19.jpg is the identity element and the above definition:
languages20.jpg

Prove: languages21.jpg
Noting the definition of length above, we choose to do an induction over string v (or, if you prefer, over the length of languages22.jpg , basing induction at 0):

Base case languages23.jpg (that is, length is 0)

languages24.jpg
= { identity for concatenation }
languages25.jpg
= { identity for + }
languages26.jpg
= { definition of length }
languages27.jpg

Inductive case languages28.jpg (that is, length is greater than 0)
induction hypothesis: languages29.jpg

languages30.jpg
= { associativity of concatenation }
languages31.jpg
= { definition of length }
languages32.jpg
= { induction hypothesis }
languages33.jpg
= { associativity of + }
languages34.jpg
= { definition of length (right to left)}
languages35.jpg

Thus we have proved languages36.jpg


LANGUAGE CONCEPTS AND NOTATION (CONTINUED)

substring of a string w
any string of consecutive symbols in w

If languages37.jpg , then the substrings are languages38.jpg .

If w = vu, then v is a prefix of w; u is a suffix.

If languages39.jpg , then the prefixes are languages40.jpg .

languages41.jpg , for any string w and languages42.jpg
denotes n repetitions of string (or symbol) w

languages43.jpg , for alphabet languages44.jpg
denotes the set of all strings obtained by concatenating zero or more symbols from the alphabet

languages45.jpg

language
a subset of languages46.jpg , for some alphabet languages47.jpg

sentence of some language L
any string from L (i.e., from languages48.jpg )

Example Languages

Let languages49.jpg .


OPERATIONS ON LANGUAGES

Languages are represented as sets. Operations on languages can be defined in terms of set operations.

union, intersection, and difference
directly as matching operations on sets

complementation with respect to languages55.jpg
languages56.jpg

reverse
languages57.jpg -- reverse all strings

concatenation
languages58.jpg

languages59.jpg
L concatenated with itself n times

languages60.jpg and languages61.jpg

star-closure
languages62.jpg

positive closure
languages63.jpg

Language Operation Examples

Let languages64.jpg .

How would we express in languages68.jpg and languages69.jpg ?

Although set notation is useful, it is not a convenient notation for expressing complicated languages.


[Next]


Copyright © 1999, H. Conrad Cunningham
Last modified: Sun Aug 22 18:40:35 CDT 1999