8. Pragmas

It’s possible to influence the behavior of the processor by placing pragmas in your grammar.

☝︎
Experimental

Pragmas are a separate feature; they are not part of Invisible XML 1.0. As of 4 September 2022, the pragma syntax accepted by CoffeeFilter (and CoffeePot) follows the grammar described in “Designing for change: Pragmas in Invisible XML as an extensibility mechanism”, presented at Balisage 2022.

If you run CoffeePot with the --pedantic option, you cannot use pragmas.

A pragma begins with “{[”, continues with a pragma name and pragma data (which may be empty), and closes with “]}”. The pragma name is a shortcut for a URI, which provides the “real” identity of the pragma. Using URIs for identity means that anyone can define new pragmas without coordinating names centrally.

The mapping from names to URIs is done with the “pragma” pragma at the top of your grammar. This, for example, declares the name “nineml” as the pragma identified by the URI “https://nineml.org/ns/pragma/”:

{[+pragma nineml "https://nineml.org/ns/pragma/"]}

CoffeePot ignores any pragmas it does not recognize. The rest of this document assumes that you have declared the pragma name “nineml” as shown above. You must do this in every grammar file where you use pragmas.
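To make the name-to-URI mechanism concrete, here is a minimal Python sketch. It is not how CoffeePot is implemented; it only illustrates the idea described above. The function names `scan_pragmas` and `declared_uris` are hypothetical, and the regular expression is a simplification that assumes pragmas are not nested and that “]}” never occurs inside pragma data.

```python
import re

# Matches "{[", an optional "+" mark, a name, optional data, then "]}".
# Simplification (assumption): no nesting, and "]}" never appears in the data.
PRAGMA = re.compile(r"\{\[(\+?)([^\s\]]+)(?:\s+(.*?))?\s*\]\}", re.DOTALL)

def scan_pragmas(grammar):
    """Return (name, data) pairs for every pragma found in the grammar text."""
    return [(m.group(2), (m.group(3) or "").strip())
            for m in PRAGMA.finditer(grammar)]

def declared_uris(pragmas):
    """Build the name -> URI table from the "pragma" declarations."""
    table = {}
    for name, data in pragmas:
        if name == "pragma":
            short, _, uri = data.partition(" ")
            table[short] = uri.strip().strip('"')
    return table

grammar = '''
{[+pragma nineml "https://nineml.org/ns/pragma/"]}
{[nineml rename newname]}
rule: A, 'b'.
'''

pragmas = scan_pragmas(grammar)
uris = declared_uris(pragmas)
for name, data in pragmas:
    if name != "pragma":
        # A name with no declaration has no identity, so it would be ignored.
        print(uris.get(name, "(undeclared, ignored)"), "->", data)
```

The sketch shows why the declaration matters: the short name “nineml” only has meaning once it has been bound to a URI, and a processor decides whether it recognizes a pragma by looking at that URI, not at the short name.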

Pragmas can be associated with the entire grammar or with a rule, a nonterminal symbol, or a terminal symbol:

  1. A pragma placed before a symbol applies to the symbol that follows it:

    rule: {[pragma applies to “A”]} A,
          {[pragma applies to “b”]} 'b'.
    
  2. A pragma placed before a rule applies to the rule that follows it:

    {[pragma applies to “rule”]}
    rule: {[pragma applies to “A”]} A,
          {[pragma applies to “b”]} 'b'.
    
  3. A pragma that applies to the entire grammar must precede the first rule and must be followed by a full stop:

    {[pragma applies to whole grammar]} .
     
    {[pragma applies to “rule”]}
    rule: {[pragma applies to “A”]} A,
          {[pragma applies to “b”]} 'b'.
    

More than one pragma can appear at any of those locations:

{[pragma applies to whole grammar]} .
{[second pragma applies to whole grammar]} .
 
{[pragma applies to “rule”]}
{[second pragma applies to “rule”]}
rule:
   {[pragma applies to “A”]}
   {[second pragma applies to “A”]} A,
   {[pragma applies to “b”]}
   {[second pragma applies to “b”]} 'b'.

If a pragma is not recognized, or does not apply, it is ignored. CoffeePot will generate debug-level log messages to alert you to pragmas that it is ignoring.

8.1 Grammar pragmas

There are five pragmas that apply to a grammar as a whole.

8.1.1 csv-columns

Identifies the columns to be output when CSV output is selected.

Usage:

{[nineml csv-columns list,of,names]}

Ordinarily, CSV formatted output includes all the columns in (roughly) the order they occur in the XML. This pragma allows you to list the columns you want output and the order in which you want them output.

If a requested column does not exist in the document, it is ignored; no empty column is produced for it.
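The selection behavior described above can be sketched in Python. This is an illustration only, not CoffeePot's code: the function `select_columns` and the sample rows are hypothetical, with each dictionary standing in for one row extracted from the XML.

```python
import csv
import io

def select_columns(rows, requested):
    """Write CSV with only the requested columns that occur in the data,
    in the order they were requested."""
    available = {key for row in rows for key in row}
    # Requested names that never occur are dropped, so no empty column appears.
    columns = [name for name in requested if name in available]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Hypothetical data: "missing" is requested but does not exist in any row.
rows = [
    {"name": "Ada", "year": "1815", "city": "London"},
    {"name": "Alan", "year": "1912", "city": "London"},
]
print(select_columns(rows, ["year", "missing", "name"]))
```

Here the request corresponds to a pragma of the form {[nineml csv-columns year,missing,name]}: the output contains the “year” and “name” columns in that order, “city” is omitted because it was not requested, and “missing” produces nothing.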

8.1.2 import

Usage:

{[nineml import "grammar-uri"]}

8.1.3 ns

Usage:

{[nineml ns "namespace-uri"]}

8.1.4 record-end

The record-end pragma enables record-oriented processing by default. Its value is the regular expression that marks record ends. Unlike the other pragmas, this one uses a different URI binding: it is a grammar option and must be placed at the beginning of your grammar.

Usage:

{[+pragma opt "https://nineml.org/ns/pragma/options/"]}
{[+opt record-end "\n([^ ])"]}

8.1.5 record-start

The record-start pragma enables record-oriented processing by default. Its value is the regular expression that marks record starts. Unlike the other pragmas, this one uses a different URI binding: it is a grammar option and must be placed at the beginning of your grammar.

Usage:

{[+pragma opt "https://nineml.org/ns/pragma/options/"]}
{[+opt record-start "([^\\])\n"]}

8.2 Rule pragmas

There are four pragmas that apply to rules.

8.2.1 csv-heading

Usage:

{[nineml csv-heading "Heading Title"]}

8.2.2 discard-empty

Usage:

{[nineml discard empty]}

8.2.3 combine

Usage:

{[nineml combine]}

8.2.4 regex

Usage:

{[nineml regex "regular expression"]}

8.3 Symbol pragmas

There are two pragmas that apply to symbols.

8.3.1 rename

Usage:

{[nineml rename newname]}

8.3.2 rewrite

Usage:

{[nineml rewrite "new literal"]}