Grammars for programming languages

阿新 • Published: 2018-12-28

A Boolean grammar was constructed to specify the syntax and static semantics (including scoping rules) of a programming language. This was apparently the first such specification by an efficiently parsable grammatical model. Because the conjunction and negation operators apply to entire strings, rather than merely serving as a lookahead mechanism, that grammar is quite knotty.

A new life for cross-references and scoping?

Boolean predicates & and ! in parsing expression grammars are, in fact, positive and negative lookahead predicates, respectively. In the rules for bullet lists in Markdown, predicate &BulletSymbol succeeds if the next symbol of the input is a bullet, and predicate !BulletSymbol succeeds if the next symbol is not a bullet.

Neither predicate consumes any input: they only check the lookahead symbols in the input, and those lookahead symbols can be regarded as the right context of a string.
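The behaviour of the two predicates can be sketched in a few lines of Python. This is only an illustration: the function names and the choice of "*" as the bullet symbol are assumptions made here, not part of any particular PEG library.

```python
def and_predicate(symbol, text, pos):
    """Positive lookahead (&symbol): succeeds if symbol comes next; consumes nothing."""
    return text.startswith(symbol, pos)


def not_predicate(symbol, text, pos):
    """Negative lookahead (!symbol): succeeds if symbol does not come next; consumes nothing."""
    return not text.startswith(symbol, pos)


BULLET = "*"  # assumed bullet symbol, for illustration only
line = "* item"
pos = 0

is_bullet = and_predicate(BULLET, line, pos)
is_not_bullet = not_predicate(BULLET, line, pos)
# pos is never advanced: the predicates only inspect the right context of position pos
```

Both functions return a plain success flag and leave the position untouched, which is exactly what separates a lookahead predicate from an ordinary parsing step.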

Drawing upon both parsing expression grammars and Boolean grammars, grammars with contexts provide a built-in mechanism to specify what left and right contexts should be.

ValidIdentifier  :  ident  &  << it was declared before
ValidIdentifier  :  ident  &  >> it will be declared later

These two informal rules state that whenever an identifier is used in a program, its declaration should appear either to its left (<<) or to its right (>>).
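To make the two informal rules concrete, here is a toy Python model in which the input is a list of tokens and the left and right contexts of a position are simply the slices before and after it. The token format, the helper names, and the simplification that every declaration looks like "int a, b;" are all assumptions of this sketch.

```python
def declared_names(tokens):
    """Collect names declared as 'int a, b, c;' in a token list (toy model)."""
    names = set()
    i = 0
    while i < len(tokens):
        if tokens[i] == "int":
            j = i + 1
            # walk over the comma-separated identifier list up to ';'
            while j < len(tokens) and tokens[j] != ";":
                if tokens[j] != ",":
                    names.add(tokens[j])
                j += 1
            i = j
        i += 1
    return names


def declared_before(tokens, i):
    """ident & << ... : the declaration occurs in the left context."""
    return tokens[i] in declared_names(tokens[:i])


def declared_later(tokens, i):
    """ident & >> ... : the declaration occurs in the right context."""
    return tokens[i] in declared_names(tokens[i + 1:])


def valid_identifier(tokens, i):
    """Either of the two rules may apply."""
    return declared_before(tokens, i) or declared_later(tokens, i)


# x is declared before its use, y only after its use
tokens = ["int", "x", ";", "x", "=", "y", ";", "int", "y", ";"]
# z is never declared
tokens_bad = ["int", "x", ";", "z", "=", "x", ";"]
```

Here valid_identifier(tokens, 3) holds because of the left context, valid_identifier(tokens, 5) holds because of the right context, and valid_identifier(tokens_bad, 3) fails because neither context declares z.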

Consider the following fragment of a program in an assumed C-like language.

int f() {
   int ms, sec, min;
   ...
   return 60 * min;
}

Let’s write it once again, this time horizontally.

int f() { int ms, sec, min; ... return 60 * min; }

To ensure that the identifier min used in the return statement is declared, one can verify that its left context contains a function header (int f() {), the keyword int, other identifiers (ms, sec) separated by commas, and the declaration of min itself, followed by any other constructs ( ; ... return 60 * ), all the way up to the use of min.

This can be expressed in a grammar with contexts almost verbatim.

ValidIdentifier  :  ident & <<== Functions FuncHeader "int" Identifiers CopiedString

This rule finds the substring between two positions in the input: just before the declaration of an identifier and just after its use. To include the use of the identifier in this substring, a so-called extended left context <<== is used (the extended left context of an identifier is its left context concatenated with that very identifier). Once the rule has found the desired substring, it remains to check that the substring belongs to the copy language wcw, that is, that it begins and ends with the same identifier w. The copy language can be defined by a conjunctive grammar, so everything works.
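The idea behind this rule can be imitated procedurally. The sketch below is not a parser for grammars with contexts; it only mimics the check on a token list, under the assumptions that identifiers are single tokens (so the w c w test reduces to comparing the first and last token of the candidate substring) and that "..." stands for the omitted statements. All function names are made up for this illustration.

```python
def extended_left_context(tokens, use_index):
    """Left context of tokens[use_index] concatenated with that token itself."""
    return tokens[: use_index + 1]


def declared_in_extended_left_context(tokens, use_index, keyword="int"):
    """Toy version of: ident & <<== ... keyword Identifiers CopiedString."""
    context = extended_left_context(tokens, use_index)
    for k, tok in enumerate(context):
        if tok != keyword:
            continue
        # scan the identifier list that follows the keyword
        j = k + 1
        while j < len(context) - 1 and (context[j] == "," or context[j].isidentifier()):
            copied = context[j:]          # candidate CopiedString
            if copied[0] == copied[-1]:   # token-level "w c w": same name at both ends
                return True
            j += 1
    return False


# the fragment from the text, written horizontally and split into tokens
tokens = ["int", "f", "(", ")", "{", "int", "ms", ",", "sec", ",", "min", ";",
          "...", "return", "60", "*", "min", ";", "}"]
use_of_min = 16

# same fragment, but the function returns an identifier that was never declared
tokens_bad = tokens[:16] + ["hour"] + tokens[17:]
```

For the original fragment the check succeeds at position 16, because the substring min ; ... return 60 * min starts and ends with the same name. With hour in place of min, no candidate substring has that form and the check fails.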

The standard restriction that forbids redeclaration of identifiers can now be expressed by the following rules:

IntegerDeclaration  :  "int" InvalidIdentifier ";"
InvalidIdentifier   :  ident & ¬ ValidIdentifier
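Conjunction and negation apply to the same piece of input here, which can be mimicked with predicate combinators. The following Python sketch is a toy model only: the combinators conj and neg, the simplified valid_identifier (which accepts only declarations of the form "int name"), and the token format are assumptions of this example.

```python
KEYWORDS = {"int"}  # assumed keyword set of the toy language


def ident(tokens, i):
    """The token at position i is an identifier."""
    return tokens[i].isidentifier() and tokens[i] not in KEYWORDS


def valid_identifier(tokens, i):
    """Toy stand-in: the same name follows an 'int' somewhere in the left context."""
    return any(tokens[k] == "int" and tokens[k + 1] == tokens[i] for k in range(i - 1))


def conj(*preds):
    """Conjunction (&): every predicate must hold for the same token."""
    return lambda tokens, i: all(p(tokens, i) for p in preds)


def neg(pred):
    """Negation (¬): the predicate must not hold for the token."""
    return lambda tokens, i: not pred(tokens, i)


# InvalidIdentifier : ident & ¬ ValidIdentifier
invalid_identifier = conj(ident, neg(valid_identifier))


def integer_declaration(tokens, i):
    """IntegerDeclaration : "int" InvalidIdentifier ";" starting at position i."""
    return (i + 2 < len(tokens)
            and tokens[i] == "int"
            and invalid_identifier(tokens, i + 1)
            and tokens[i + 2] == ";")


# the third declaration redeclares x
tokens = ["int", "x", ";", "int", "y", ";", "int", "x", ";"]
```

The declarations at positions 0 and 3 are accepted, while the one at position 6 is rejected because x is already a valid identifier at that point.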

It also becomes possible to distinguish between types of identifiers: the rule for a valid identifier breaks up into several rules, one for each type in the language.

ValidIntegerIdentifier  :  ident & <<== Functions FuncHeader "int" Identifiers CopiedString
ValidBooleanIdentifier  :  ident & <<== Functions FuncHeader "bool" Identifiers CopiedString

The only difference between these rules is the keyword that must occur in the left context of an identifier use.

Because identifiers are now distinguished according to their type, it makes sense to embed type checking directly into a grammar with contexts.

Assignment  :  ValidIntegerIdentifier "=" IntegerExpression
            |  ValidBooleanIdentifier "=" BooleanExpression

These rules state that a variable of a certain type can only be assigned an expression of the same type.
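The effect of these rules can be imitated by looking up the declaring keyword in the left context of each identifier. The Python sketch below is a toy model under several assumptions: declarations have the form "int name" or "bool name", expressions consist of a single token, and true and false are the Boolean literals. None of these details come from the grammar itself.

```python
def declared_type(tokens, use_index):
    """Keyword ('int' or 'bool') that declares tokens[use_index] in its left context, else None."""
    name = tokens[use_index]
    for k in range(use_index - 1):
        if tokens[k] in ("int", "bool") and tokens[k + 1] == name:
            return tokens[k]
    return None


def expression_type(tokens, i):
    """Type of a one-token expression (toy model)."""
    tok = tokens[i]
    if tok.isdigit():
        return "int"
    if tok in ("true", "false"):  # assumed Boolean literals
        return "bool"
    return declared_type(tokens, i)


def valid_assignment(tokens, i):
    """Both sides of the '=' at position i + 1 must have the same declared type."""
    if i + 2 >= len(tokens) or tokens[i + 1] != "=":
        return False
    left = declared_type(tokens, i)
    return left is not None and left == expression_type(tokens, i + 2)


tokens = ["int", "n", ";", "bool", "b", ";",
          "n", "=", "60", ";",
          "b", "=", "n", ";"]
```

The assignment n = 60 at position 6 is accepted, whereas b = n at position 10 is rejected because a Boolean variable is assigned an integer expression.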
