ContextFree Grammars
In this section we will review some "wellknown" results about the relationship
between contextfree grammars and algebras. These results underly
the algebraic approach to the semantics of programming languages.
It is "wellknown" that every contextfree grammar, G,

corresponds to a many sorted signature, (G),
with a sort for each nonterminal and an operator for each production in
G,

that the free algebra for (G)
corresponds to the abstract syntax of G

that the concrete syntax is a (G)algebra..
We will show how the programs given in the section Signatures,
Algebras and Homomorphisms can be used to explore these results.
Outline:
Contextfree grammars where originally developed by Noam Chomsky
as a means for describing the underlying syntax (grammar) of natural languages.
It was first used in computer science to describe the grammar of the programming
language ALGOL 60. For a while computer scientists used the term
"ALGOLlike languages" rather than Chomsky's term, "contextfree language".
However it was eventually realized that ALGOL60 was actually not an "ALGOLlike
language" and so the term "ALGOLlikelanguage" ceased to be used.
We will later give an example of a programming language, the language LANG1,
that does have a contextfree grammar. However, most programming
languages (and all natural languages) do not have contextfree grammars.
Despite this fact, contextfree grammars are still very useful and
reflect the underlying algebraic structure of both kinds of languages.
Later we will look further into the question of the relationship between
the syntax of programming languages and contextfree grammars.
Definition:
A contextfree grammar G consists of

T, a set (of terminal symbols)

N, a set (of nonterminal symbols)
where N
T =.

PNx(N
T)* (a set of productions)

S
N (the start symbol)
We can write a production as a pair p
= <A_{0}, a_{0}A_{1}a_{1}...A_{n}a_{n}>
where 0n,
each A_{i}N,
and each a_{i}T*.
Sometimes a production is written in the form
p A_{0 }
a_{0}A_{1}a_{1}...A_{n}a_{n}
A contextfree grammar G
= <T, N, P, S> defines a subset L(G)
of T* calledthe
context free language generated by G.
The set L(G)
is defined using the relation _{G}:(NT)* (N
T)* where w_{G}u
iff there exists a production p = <L,R>P
and a,b(N
T)* such that w=aLb
and u = aRb.
Let _{G}
be the transitive closure of _{G}.
Then L(G) is defined to be
{ u T*  S_{G}
u }.
end
Examples:
Example: Let the grammar G
be such that T = {0, 1}, let N
= {A}, let P = {<A, 0>,
<A, 1A1>}, and let A
also be the start symbol. Then L(G) =
{0, 101, 11011, ... 1^{n}01^{n}...}
end
Example: Logical Expressions:

T = {true, false, and, or, not, (, ) },

N = {E},

P = {<E, true>, <E, false>, <E, (E and E)>,
<E, (E or E)>, <E, not(E) > },

and the start symbol is E.
The resulting contextfree language, L(G), will be the set of all wellformed
logical expressions without variables. Examples of elements of L(G)
include

true

(true or false)

(true and (false or not(false)))
end.
Example: Arithmetic Expressions
I:

T = { 0,1,2,3,4,5,6,7,8,9,+,*,,/,(,) }

N = { Digit, Pint, Int, Exp }

P = { <Digit, 0>, <Digit, 1>, <Digit, 2>, <Digit,
3>, <Digit, 4>, <Digit, 5>, <Digit, 6>, <Digit, 7>, <Digit,
8>, <Digit, 9>,
<PInt, 1>, <PInt, 2>, <PInt, 3>, <PInt, 4>, <PInt, 5>, <PInt,
6>, <PInt, 7>, <PInt, 8>, <PInt, 9>, <PInt, PInt.Digit>,
<Int, PInt>, <Int, 0>, <Int, PInt>,
<Exp, Int>, <Exp, (Exp + Exp)>,<Exp, (Exp * Exp)>,<Exp,
(Exp  Exp)>,<Exp, (Exp / Exp)>}
end.
Example: A simple subgrammar of English.
yet to come
Example: LANG1  click
here to see the contextfree grammar of a simple programming language.
end.
The Signature of a Contextfree Language
Given a contextfree grammar G = <T, N, P, S>
define Sig(G), the signature of G,
to be the signature with

a sort n for each nonterminal nN.

an operator for each production in P
Specifically, if p = <A_{0},
a_{0}A_{1}a_{1}...A_{n}a_{n}>
is a production in P then the corresponding
operator is p:A_{1}x...x.A_{n}
A_{0}.
Example: Let the grammar G
be such that T = {0, 1}, let N
= {A}, let P = {<A, 0>,
<A, 1A1>}, and let A
also be the start symbol. Then the corresponding signature, Gram1,
is:
The Signature Gram1 in mathematical form:
name: Gram1
sorts: {A}
operators:
Gram1(A) = {p0}
Gram1(A,A) = {p1}
End of signature Gram1.
Note that, with this signature the only Gram1terms
that we can generate are of the form p1^{n}(p0(
)) (where p1^{0}(p0(
)) = p0( ) and p1^{n+1}(p0(
)) = p1( p1^{n}(p0(
))) ).
end.
Example:
To see the signature corresponding to the example of a contextfree
grammar for arithmetic expressions click here.
end.
The Algebras of a contextfree Grammar
Information is lost in the process of going from a grammar to its signature
 that is, you can not reconstruct the complete grammar from the signature
because the operators lack the strings of terminal symbols found in the
productions and the signature does not contain any information about the
start symbol. The missing information given by the strings of terminal
symbols corresponds to a Sig(G)algebra, the
choice of the start symbol is just the designation of one of its sorts.
The grammarasawhole then corresponds to the unique homomorphism h_{G}:T_{Sig(G)}
Alg(G) from the initial Sig(G)algebra
to the algebra Alg(G). Our claim
is that, if sS
is
the start symbol for G then the image of (T_{Sig(G)})_{s}_{ }
under the homomorphism
h_{G }is exactly
the set,
L(G), i.e., the context free language
generated by G.
In terms more familiar to computer scientists, a Sig(G)term,
t,
is the abstract syntax of the string h_{G
}(t)
which, in turn, is a concrete syntax for t.
If there are two terms t,t' (T_{Sig(G)})_{s}_{ }
such that tt but h_{G
}(t)
= h_{G
}(t') then we say
that the grammar G is ambiguous.
Since the real interest is in the image of (T_{Sig(G)})_{s}_{ }
under the of h_{G} , we have some
freedom in the definition of Alg(G):

there is no harm in adding "extraneous elements" to the carriers of Alg(G).
Indeed
there is real benefit if the "extraneous elements" simplify the presentation
of the algebra.

it doesn't matter what the operations do when the "extraneous elements"
appear as arguments
This being the case we can give the following definition for Alg(G)

For each nN
let Alg(G)_{n} = T*,
the set of all strings on the set T of terminal
symbols from G.

For each pP,
if p = <A_{0}, a_{0}A_{1}a_{1}...A_{n}a_{n}>,
then let p_{Alg(G)}:(T*)^{n}
T* such that, for all <t_{1},...,t_{n}> (T*)^{n}
, we have p_{Alg(G)}
(t_{1},...,t_{n} )
= a_{0}t_{1}a_{1}...t_{n}a_{n}
the concatenation of the strings in the indicated order.
This definition for p_{Alg(G)}
can be directly translated into a JavaScript expression suitable
for use with our JavaScript representation of algebras as
p_{Alg(G)} (t_{1},...,t_{n})
= a_{0 }+ arg[0] + a_{1} + ...+ arg[n] + a_{n}
.
Example: Let the grammar G
be such that T = {0, 1}, let N
= {A}, let P = {<A, 0>,
<A, 1A1>}, and let A
also be the start symbol. Then the corresponding algebra, alg_Gram1,
is the Galgebra with

carrier {0, 1}*

operations
p0_{alg_Gram1}: {0,
1}*, p0_{alg_Gram1} (
) = 0
p1_{alg_Gram1}: {0,1}* {0,
1}*, p1_{alg_Gram1} (
t ) = 1t1
The corresponding algebra in JavaScript is given
by alg_Gram1
The algebra alg_Gram1
The signature of alg_Gram1 is: alg_Gram1
The carrier of alg_Gram1 is the set , {1, 0}*
. of all strings of zeros and ones. (but is not actually specified in the
UACalculator version).
The operations of the algebra alg_Gram1 are:
["p0", "'0'"]
["p1", "'1'+arg[1]+'1'"]
End of algebra alg_Gram1
end