Sponsored By: Infinity Foundation

The Panini-Backus Form in Syntax of Formal Languages

The Panini-Backus Form in Syntax of Formal Languages
by T.R.N. Rao, PhD

Copyright 1998, T.R.N. Rao and Subhash Kak, reprinted with permission

Vyaas Houston (1991), in one of his writings, mentions his discovery of the world’s oldest living language: Sanskrit, the language of ancient India and Vedic civilization. He states thus:

“It was perfectly clear to me that I had come upon a perfect language, a language that invokes the spirit, an inexhaustible wellspring of spiritual inspiration. The ancients called it devavani, the language of gods. Where did it come from? – A language infinitely more sophisticated than any of our modern tongues.”

The sophistication Houston refers to is about the formalism and structure of the language. For computer scientists, in the theory of formal languages, the word “formal” refers to the fact that all the rules for the language are explicitly stated in terms of what strings of symbols could occur, without any ambiguity and the need for interpretations based on mental skills. Sanskrit not only has a very rich inflectional structure but this fact was recognized early by grammarians and it has contributed to the mystique of the language.

A famous grammar of Sanskrit was compiled by Panini, who flourished around 500 B.C., and his work Astadhyayi has been studied in India for centuries, inspiring many commentaries. The prestige of Panini’s grammar is so great that the earlier grammars of the language were lost. Panini’s grammar uses a variety of formal techniques including recursion, transformations, and metarules. Here we examine one specific feature of his structure that has been used also in the representation of high-level languages.

The formal structure of computer programming languages was introduced in the 1958-60 period by eminent scientists John Backus (1958), and Peter Naur (1963). They headed UNESCO conferences on International algorithmic language ALGOL 60, a language “suitable for expressing a large class of numerical processes in a form sufficiently concise for direct automatic translation into the language of programmable automatic computers.”

What is BNF notation?
BNF is an acronym originally for “Backus Normal Form” that was later changed to Backus-Naur Form. BNF notation can be found in any book on programming languages.

The following, taken from Marcotty and Ledgard (1986), explains the meta-symbols of BNF.

The meta-symbols of BNF are:

::= meaning “is defined as”
| meaning “or”
< > angle brackets used to surround category names.

The angle brackets distinguish syntax rules names (also called non-terminal symbols) from terminal symbols which are written exactly as they are to be represented. A BNF rule defining a nonterminal has the form:

nonterminal ::= sequence_of_alternatives consisting of strings of
terminals or nonterminals separated by the meta-symbol |

For example, the BNF production for a mini-language is:

<program> ::= program
<declaration_sequence>
begin
<statements_sequence>
end ;

This shows that a mini-language program consists of the keyword “program” followed by the declaration sequence, then the keyword “begin” and the statements sequence, finally the keyword “end” and a semicolon. Several authors have used slight extensions of BNF for clarification or ease of use which we will not go into.

A point of interest here is a correspondence in ACM Communications from Donald Knuth (1964) arguing on behalf of the acronym BNF to represent Backus-Naur form rather than Backus Normal Form and gives three reasons for that:

It gives proper credit to both Backus and Naur for their contributions;
It preserves the often used abbreviation “BNF”;
BNF is not really a “normal form” in any conventional sense or (a special or canonical form) and hence it is just a “form”;

Knuth’s suggestion prevailed and BNF has been taken to stand for Backus-Naur Form.

Another major point of interest for us is another correspondence in ACM Communications titled “Panini-Backus Form” by P.Z. Ingerman (1967), which we reproduce here verbatim.

Knuth (1964), in a Letter to the Editor of CACM, makes the point that the metasyntactic notation used in, e.g., the ALGOL 60 report (Naur 1963) should be renamed. In particular, he observes the well-acceded fact that the so-called Backus Normal Form is, indeed, not a normal form in any sense. The purpose of this letter is to observe that Backus was not the first to use the form with which his name has become associated, although he did, indeed, discover it independently.

Dr. Alexander Wilhelmy has called to my attention a work by Panini. , Panini was a scholar who flourished between 400 B.C. and 200 B.C.; perhaps his most significant work was the compilation of a grammar of Sanskrit. In order to describe the (rather complicated) rules of grammar, he invented a notation which is equivalent in its power to that of Backus, and has many similar properties: given the use to which the notation was put, it is possible to identify structures equivalent to the Backus “|” and to the use of the meta-brackets “<” and “>” enclosing suggestive names. Panini avoided the necessity for the character “::=” by writing the meta-result on the right rather than the left [see, or Ingerman (1996) for a similar notation].

Since it is traditional in professional circles to give credit where credit is due, and since there is clear evidence that Panini was the earlier independent inventor of the notation, may I suggest the name “Panini-Backus Form” as being a more desirable one? Not only does it give due credit, but it also avoids the misuse of the word “Normal”.

Summary
The above makes the powerful plea that Backus-Naur Form (BNF) should be truly called Panini-Backus Form (PBF), as “we must give credit where credit is due.” Paninian grammars, which consisted of over 4,000 algebraic rules and metarules have been studied by a number of scholars. Kak (1987), reviews the Paninian approach to natural language processing (NLP) and compares it with the current knowledge representation systems of Artificial Intelligence, and argues that Paninian-style generative rules and metarules could assist in further advances in NLP. Another article by Staal (included in this book) discusses the consistency of the system of rules of Panini, as tested by Fowler’s Automaton. These are among the marvelous contributions of ancient India to computing sciences.

References

Backus, John (1959). The syntax and semantics of the proposed international algebraic language of the Zürich ACM-GAMM conference. Proc. Internat. Conf. Inf. Proc., UNESCO, Paris

Houston, Vyaas (1991). Foreword to “Gods, Sages and Kings” by David Frawley, Passage Press, Salt Lake City, Utah.

Ingerman, P.Z. (1966). A Syntax-Oriented Translator, Academic Press, New York.

—-1967 “Panini-Backus Form Suggested,” Comm. ACM 10, 3, p. 137

Kak, S.C. (1987). The Paninian Approach to Natural Language Processing. International Journal of Approximate Reasoning; 1:117-130.

Knuth, Donald (1964) Backus normal form vs. Backus Naur form. Comm. ACM 7, 12, 735-736

Marcotty, M. & Ledgard, H. (1986). The World of Programming Languages, Springer-Verlag, Berlin., p. 41.

Naur, Peter (Ed.) (1963). Revised report on the algorithmic language ALGOL 60, Comm. ACM 6, 1, 1-17

Vasu, S.C. (1962). The Astadhyayi. Motilal Banarsiddass, Delhi, India

T.R.N. Rao, PhD
Center for Advanced Computer Studies, University of Southwestern Louisiana
Lafayette, Louisiana

Copyright 1998, T.R.N. Rao and Subhash Kak, reprinted with permission