BLD_doc.grm
Contents
Generalities
This is a Jacc grammar for the so-called Presentation Syntax (PS) of the RIF Basic Logic Dialect (BLD) developed as a result of the activities of the RIF Working Group. This HTML file is the root of a hyperlinked documentation allowing one to explore the Jacc grammar for BLD via navigation through its elements - rules, terminal symbols, and non-terminal symbols. The comments accompanying some rules come from the original documents. It also contains the pure Yacc rules (i.e., without semantic actions), and the XML serialization mappings. This documentation is generated by Jacc from the Jacc grammar specified in file BLD.grm (i.e., with the command "jacc -doc BLD"). Along with this Jacc grammar file, there are other supporting source files.
This Jacc grammar is a transcription of the EBNF for the canonical syntax of the RIF BLD. This syntax is canonical in that its EBNF defines the kernel constructs used for the BLD-to-XML transformation rules. In addition to the canonical BLD PS language, it has been proposed to allow a simpler syntax for writing RIF use cases. This simpler syntax, the so-called Abridged PS, extends the canonical syntax with various shorthands for RIF constants and for common expressions such as arithmetic. This additional syntax is not canonical PS: it is just syntactic sugar that is desugared into the canonical form.
Implementation conventions
The Jacc grammar specification given here is a literal transcription of the BNF rules given in the above references, adapted to the needs of the Jacc format. There are two sets of grammar rules:
Tokenizing
(N.B.: See the source file of the tokenizer, Tokenizer.java.)
Recognized tokens
The terminal symbols are:
Important Notes
Important notes regarding lexical analysis
In the RIF specification of the EBNF of the Rule Language, it is specified that:
IRIMETA ::= '(*' IRICONST? (Frame | 'And' '(' Frame* ')')? '*)'
Frame ::= TERM '[' (TERM '->' TERM)* ']'
TERM ::= IRIMETA? (Const | Var | Expr | 'External' '(' Expr ')')
Const ::= '"' UNICODESTRING '"^^' SYMSPACE | CONSTSHORT
SYMSPACE ::= ANGLEBRACKIRI | CURIE
where CONSTSHORT, ANGLEBRACKIRI, and CURIE are defined (in the DTB shorthand notation for RIF constants) by:
CURIE ::= PNAME_LN | PNAME_NS
CONSTSHORT ::= ANGLEBRACKIRI // shorthand for "..."^^rif:iri
| CURIE // shorthand for "..."^^rif:iri
| '"' UNICODESTRING '"' // shorthand for "..."^^xs:string
| NumericLiteral // shorthand for "..."^^xs:integer,xs:decimal,xs:double
| '_' LocalName // shorthand for "..."^^rif:local
where:
ANGLEBRACKIRI ::= '<' ([^<>"{}|^`]-[#x00-#x20])* '>'
PNAME_LN ::= PNAME_NS PN_LOCAL
PNAME_NS ::= PN_PREFIX? ':'
PN_LOCAL ::= (PN_CHARS_U | [0-9]) ((PN_CHARS|'.')* PN_CHARS)?
PN_PREFIX ::= PN_CHARS_BASE ((PN_CHARS|'.')* PN_CHARS)?
PN_CHARS_U ::= PN_CHARS_BASE | '_'
PN_CHARS ::= PN_CHARS_U
| '-'
| [0-9]
| #x00B7
| [#x0300-#x036F]
| [#x203F-#x2040]
PN_CHARS_BASE ::= [A-Z]
| [a-z]
| [#x00C0-#x00D6]
| [#x00D8-#x00F6]
| [#x00F8-#x02FF]
| [#x0370-#x037D]
| [#x037F-#x1FFF]
| [#x200C-#x200D]
| [#x2070-#x218F]
| [#x2C00-#x2FEF]
| [#x3001-#xD7FF]
| [#xF900-#xFDCF]
| [#xFDF0-#xFFFD]
| [#x10000-#xEFFFF]
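The CURIE productions above translate fairly directly into a regular expression. The following is a minimal sketch, not taken from Tokenizer.java: the class name `CurieSketch` and the method `isCurie` are hypothetical, and the inner `(PN_CHARS|'.')` group is assumed to be repeatable, as in the SPARQL grammar from which these productions derive.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; it is not part of the Jacc sources.
class CurieSketch {
    // Character ranges of PN_CHARS_BASE, written as the inside of a regex character class.
    static final String PN_CHARS_BASE =
        "A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF\\u0370-\\u037D"
      + "\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF"
      + "\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD\\x{10000}-\\x{EFFFF}";

    // PN_CHARS_U ::= PN_CHARS_BASE | '_'
    static final String PN_CHARS_U = PN_CHARS_BASE + "_";

    // PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
    static final String PN_CHARS =
        PN_CHARS_U + "\\-0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040";

    // Optional tail shared by PN_PREFIX and PN_LOCAL (assumed repetition on the inner group).
    static final String TAIL = "(?:[" + PN_CHARS + ".]*[" + PN_CHARS + "])?";

    static final String PN_PREFIX = "[" + PN_CHARS_BASE + "]" + TAIL;
    static final String PN_LOCAL = "[" + PN_CHARS_U + "0-9]" + TAIL;

    // PNAME_NS ::= PN_PREFIX? ':'
    static final String PNAME_NS = "(?:" + PN_PREFIX + ")?:";

    // CURIE ::= PNAME_LN | PNAME_NS, i.e. PNAME_NS followed by an optional PN_LOCAL.
    static final Pattern CURIE = Pattern.compile(PNAME_NS + "(?:" + PN_LOCAL + ")?");

    static boolean isCurie(String s) {
        return CURIE.matcher(s).matches();
    }
}
```

With these definitions, `xs:string`, `rif:iri`, and a bare prefix such as `ex:` are accepted, while a local part ending in a dot (`ex:a.`) is rejected because PN_LOCAL must end in a PN_CHARS character.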
Tokenizing the PS grammar is complicated by the fact that the IRIs
given as arguments of the pragmas Prefix and Base, which declare
shorthands for IRIs, are not enclosed in double quotes. Handling them
as written would require parsing IRIs - which is beyond our
prototype's goal, besides being unnecessary in this case. The
canonical PS does not have this problem: all IRIs there are
double-quoted strings - which greatly simplifies the tokenizing. It is
just as simple to require double quotes around the arguments of the
Prefix and Base pragmas - which is what our prototype does.
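To illustrate why quoting the IRI keeps the lexer simple, here is a minimal sketch of a scanner that treats a double-quoted IRI as one opaque string token. It is not the code of Tokenizer.java: the class `PragmaLexerSketch`, the method `tokenize`, and the concrete pragma form `Prefix(ex "http://example.org/#")` are assumptions made for illustration, and escape sequences inside strings are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner, for illustration only.
class PragmaLexerSketch {
    /** Splits the input into names, parentheses, and double-quoted strings (quotes kept). */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                                   // skip blanks between tokens
            } else if (c == '"') {
                // A quoted IRI is consumed up to the closing quote without looking inside it.
                int end = input.indexOf('"', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated string at " + i);
                }
                tokens.add(input.substring(i, end + 1));
                i = end + 1;
            } else if (c == '(' || c == ')') {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                // A name runs until whitespace, a parenthesis, or a quote.
                int start = i;
                while (i < input.length()
                        && !Character.isWhitespace(input.charAt(i))
                        && "()\"".indexOf(input.charAt(i)) < 0) {
                    i++;
                }
                tokens.add(input.substring(start, i));
            }
        }
        return tokens;
    }
}
```

Because characters such as `:`, `/`, and `#` only ever occur inside the quotes, the scanner never has to decide whether they belong to an IRI or to another PS token.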
Parsing
Recognized PS constructs
Important notes regarding syntax analysis
The raw BNF's
The two grammars for the BLD (Condition and Rule) languages expressed in Yacc form are given below.
BLD Rule Language
The original EBNF is accessible in the specification of the BLD
Rule Language. It is reproduced here for convenience:
BLD Condition Language
The original EBNF is accessible in the specification of the BLD Condition Language. It is reproduced here for convenience:
FORMULA ::= ATOMIC
| IRIMETA? 'And' '(' FORMULA* ')'
| IRIMETA? 'Or' '(' FORMULA* ')'
| IRIMETA? 'Exists' Var+ '(' FORMULA ')'
| IRIMETA? 'External' '(' Atom | Frame ')'
ATOMIC ::= IRIMETA? (Atom | Equal | Member | Subclass | Frame)
Atom ::= UNITERM
UNITERM ::= Const '(' (TERM* | (Name '->' TERM)*) ')'
Equal ::= TERM '=' TERM
Member ::= TERM '#' TERM
Subclass ::= TERM '##' TERM
Frame ::= TERM '[' (TERM '->' TERM)* ']'
TERM ::= IRIMETA? (Const | Var | Expr | 'External' '(' Expr ')')
Expr ::= UNITERM
Const ::= '"' UNICODESTRING '"^^' SYMSPACE | CONSTSHORT
Name ::= UNICODESTRING
Var ::= '?' UNICODESTRING
SYMSPACE ::= ANGLEBRACKIRI | CURIE
IRIMETA ::= '(*' IRICONST? (Frame | 'And' '(' Frame* ')')? '*)'
The Jacc rules corresponding to this EBNF are given in BLC.grm.
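One lexical detail of the Equal, Member, and Subclass productions is that '#' is a prefix of '##', so a hand-written scanner has to try the longer operator first. The sketch below shows this longest-match choice; the class `AtomicOpSketch` and the method `operatorAt` are hypothetical names and do not come from the Jacc sources.

```java
// Hypothetical helper, for illustration only.
class AtomicOpSketch {
    /**
     * Returns the name of the atomic-formula operator starting at position pos,
     * or null if none of '=', '#', '##' starts there.
     */
    static String operatorAt(String s, int pos) {
        if (s.startsWith("##", pos)) return "Subclass"; // must be tested before '#'
        if (s.startsWith("#", pos)) return "Member";
        if (s.startsWith("=", pos)) return "Equal";
        return null;
    }
}
```

Testing '#' before '##' would classify every Subclass formula as a Member formula followed by a stray '#'.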
Additional ad hoc rules
Some Jacc rules corresponding to temporary ad hoc implementation decisions for the sake of prototyping are given in AdHoc.grm.
XML serialization annotations
This version of the BLD grammar is annotated for simple XML serialization as per the scheme specified in the current BLD document. Each XML serialization annotation generates an HTML documentation file accessible by navigating through the grammar (e.g., that of the rule for Group). The effects of such annotations are summarized in the table of XML serialization mappings.
Essentially, the format of a Jacc grammar is that of a Yacc grammar. As in Yacc, Jacc rules may be annotated with semantic actions in the form of Java code involving the rule's RHS constituents (denoted by $1, $2, ..., $n - the so-called pseudo-variables, where the index n in $n refers to the position of the constituent in the RHS). Such actions appear between curly braces ('{' and '}') wherever a symbol may appear in a rule's RHS. Jacc also allows an additional form of annotation in the RHS of a rule to indicate the XML serialization pattern of the abstract syntax tree (AST) node corresponding to a derivation with this rule. This XML serialization meta-annotation comes between square brackets ('[' and ']') and is of the form described in a simple XML serialization annotation language.
For example, the annotated rule:
For example, see the two test files examples/Test1.bld and examples/Test2.bld. Running the command examples/bld on them produces the XML trees shown in examples/Test1.xml and examples/Test2.xml.
This file was generated on Mon Nov 17 15:35:41 PST 2008 from file BLD_doc.grm
by the ilog.language.tools.Hilite Java tool written by Hassan Aït-Kaci