chomsky normal form in computer science, a formal grammar is in chomsky normal form if and only if all production rules are of the form: a ? bc or a ? a or s ? e where a, b and c are nonterminal symbols, a is a terminal symbol (a symbol that represents a constant value), s is the start symbol, and e is the empty string. also, neither b nor c may be the start symbol. every grammar in chomsky normal form is context-free, and conversely, every context-free grammar can be efficiently transformed into an equivalent one which is in chomsky normal form. with the exception of the optional rule s ? e (included when the grammar may generate the empty string), all rules of a grammar in chomsky normal form are expansive; thus, throughout the derivation of a string, each string of terminals and nonterminals is always either the same length or one element longer than the previous such string. the derivation of a string of length n is always exactly 2n1 steps long. furthermore, since all rules deriving nonterminals transform one nonterminal to exactly two nonterminals, a parse tree based on a grammar in chomsky normal form is a binary tree, and the height of this tree is limited to at most the length of the string. because of these properties, many proofs in the field of languages and computability make use of the chomsky normal form. these properties also yield various efficient algorithms based on grammars in chomsky normal form; for example, the cyk algorithm that decides whether a given string can be generated by a given grammar uses the chomsky normal form. the chomsky normal form is named after noam chomsky, the us linguist who invented the chomsky hierarchy. alternative definition some sources define chomsky normal form in the following slightly different way: a formal grammar is in chomsky normal form if and only if all production rules are of the form: a ? bc or a ? a where a, b and c are nonterminal symbols, and a is a terminal symbol. when using this definition, b or c may be the start symbol. this definition differs from the previous one in that it precludes the possibility that the grammar will generate the empty string, e. it remains true that any context-free accepting a language l can be efficiently transformed into a grammar in chomsky normal form that accepts l - {e}. the principle advantage of this later definition is that proofs are generally marginally simpler, due to the fact that each step in a derivation never decreases the length of the resulting string. its disadvantage, of course, is that special consideration is needed if the original grammar generated e.
-------------------------------backus normal form (bnf) the syntax definitions on this site use a variant of backus normal form (bnf) that includes the following: [ ] brackets enclose optional items. { } items of which only one is required. | a vertical pipe separates alternatives (or) underlined the default option ... the preceding item can be repeated (as many times as needed) upper case command that should be entered as shown lowercase&italic variable that should be replaced with an appropriate value where a command is too long to fit on a single line it is wrapped with each subsequent line indented: command option option option option option option option option option option option option option option option option option option option option where there are several optional items and you can choose any,several or all these are shown in a list, all indented to the same degree: command option option option option if further command options follow they will be indented: command option option option option option option option if the options have sub options they are individually indented: command option option option option option option option option option option option option where there are several optional items and you can choose only one from the list, they are indented and separated by a blank line command option option option option this is the same as command option {option | option | option} where the layout makes it possibe, braces and the pipe symbol are the preferred method to show alternative options.
this syntax is not always followed rigourously e.g. on a page where almost everything is a variable, to improve readability some variables may be left as lower case but not italic. similarly option [,option] [,...] may be abbreviated to just option ,... this is to avoid having a surfeit of brackets around the more complex commands. bold text is used to highlight some key items but does not have any syntactic importance. in some older texts bnf is also referred to as backus-naur form. backus�naur form from wikipedia, the free encyclopedia jump to: navigation, search the backus�naur form (also known as bnf, the backus�naur formalism, backus normal form, or panini�backus form) is a metasyntax used to express context-free grammars: that is, a formal way to describe formal languages. bnf is widely used as a notation for the grammars of computer programming languages, instruction sets and communication protocols, as well as a notation for representing parts of natural language grammars (for example, meter in sanskrit poetry.) most textbooks for programming language theory and/or semantics document the programming language in bnf. there are many extensions of and variants on bnf. history john backus created the notation in order to express the grammar of algol. at the first world computer congress, which took place in paris in 1959, backus presented "the syntax and semantics of the proposed international algebraic language of the zurich acm-gamm conference", a formal description of the ial which was later called algol 58. the formal language he presented was based on emil post's production system. generative grammars were an active subject of mathematical study, e.g. by noam chomsky, who was applying them to the grammar of natural language.[1] [2] peter naur later simplified backus's notation to minimize the character set used, and, at the suggestion of donald knuth[3], his name was added in recognition of his contribution. the backus�naur form or bnf grammars have significant similarities to pa?ini's grammar rules, and the notation is sometimes also referred to as panini�backus form. introduction a bnf specification is a set of derivation rules, written as <symbol> ::= <expression with symbols> where <symbol> is a nonterminal, and the expression consists of sequences of symbols and/or sequences separated by the vertical bar, '|', indicating a choice, the whole being a possible substitution for the symbol on the left. symbols that never appear on a left side are terminals. as an example, consider this possible bnf for a u.s. postal address:
<postal-address> ::=
<street-address> ::= <eol> | <eol> ::= | "." <street-address> ::= <street-name> <eol> ::= "," <state-code> <eol> this translates into english as: * a postal address consists of a name-part, followed by a street-address part, followed by a zip-code part. * a name-part consists of either: a personal-part followed by a last name followed by an optional "jr-part" (jr., sr., or dynastic number) and end-of-line, or a personal part followed by a name part (this rule illustrates the use of recursion in bnfs, covering the case of people who use multiple first and middle names and/or initials). * a personal-part consists of either a first name or an initial followed by a dot. * a street address consists of an optional apartment specifier, followed by a house number, followed by a street name, followed by an end-of-line. * a zip-part consists of a town-name, followed by a comma, followed by a state code, followed by a zip-code followed by an end-of-line. note that many things (such as the format of a first-name, apartment specifier, or zip-code) are left unspecified here. if necessary, they may be described using additional bnf rules. bnf's syntax may be represented with a bnf like the following: <syntax> ::= | <syntax> ::= "<" ">" "::=" <expression> ::= " " | "" <expression> ::= <list> | <list> "|" <expression> ::= <eol> | <list> ::= | <list> ::= | "<" ">" ::= '"' '"' | "'" "'" this assumes that rule. <eol> to be and/or line-feed, to be substituted
no whitespace is necessary for proper interpretation of the the appropriate line-end specifier (in ascii, carriage-return depending on the operating system). and are with a declared rule's name/label or literal text, respectively.
variants there are many variants and extensions of bnf, generally either for the sake of simplicity and succinctness, or to adapt it to a specific application. one common feature of many variants is the use of regexp repetition operators such as * and +. the extended backus-naur form (ebnf) is a common one. in fact the example above is not the pure form invented for the algol 60 report. the bracket notation "[]" was introduced a few years later in ibm's pl/i definition but is now universally recognised. abnf is another extension commonly used to describe ietf protocols.
parsing expression grammars build on the bnf and regular expression notations to form an alternative class of formal grammar, which is essentially analytic rather than generative in character. many bnf specifications found online today are intended to be human readable and are non-formal. these often include many of the following syntax rules and extensions: * optional items enclosed in square brackets. e.g. [] * items repeating 0 or more times are enclosed in curly brackets. e.g. <word> ::= { } * items repeating 1 or more times are followed by a '+' * terminals may appear in bold and nonterminals in plain text rather than using italics and angle brackets * repetition of an item is signified by an asterisk placed after that item: �*� * alternative choices in a production are separated by the �|� symbol. e.g., | * where items need to be grouped they are enclosed in simple parenthesis --------------------------------------------greibach normal form from wikipedia, the free encyclopedia jump to: navigation, search in computer science, to say that a context-free grammar is in greibach normal form (gnf) means that all production rules are of the form: a \to \alpha x or s \to \epsilon where a is a nonterminal symbol, a is a terminal symbol, x is a (possibly empty) sequence of nonterminal symbols not including the start symbol, s is the start symbol, and e is the null string. observe that the grammar must be without left recursions. every context-free grammar can be transformed into an equivalent grammar in greibach normal form. (some definitions do not consider the second form of rule to be permitted, in which case a context-free grammar that can generate the null string cannot be so transformed.) this can be used to prove that every contextfree language can be accepted by a non-deterministic pushdown automaton. given a grammar in gnf and a derivable string in the grammar with length n, any top-down parser will halt at depth n. greibach normal form is named after sheila greibach.
---------------------------------