5. Ambiguity

Ambiguity is not an error. The fact that Invisible XML grammars allow ambiguity is a feature. Ambiguity also generally arises from the combination of a grammar and a particular input, rather than from the grammar alone. Consider this grammar:

product: dessert-topping ; floor-wax .
dessert-topping: "custard" ; "cream" ; "Shimmer" .
floor-wax: "paste-wax"; "quick-shine"; "Shimmer" .
Example 5.1 SNL™ Shimmer Sketch

(If you aren’t familiar with the Saturday Night Live “Shimmer Floor Wax” sketch, now would be the time to go search the web.)

Parsed against the input “custard”, it says dessert topping:

$ coffeepot -pp -g:examples/ambig01.ixml custard
<product>
   <dessert-topping>custard</dessert-topping>
</product>

Parsed against the input “paste-wax”, it says floor wax:

$ coffeepot -pp -g:examples/ambig01.ixml paste-wax
<product>
   <floor-wax>paste-wax</floor-wax>
</product>

Parsed against “Shimmer”, it says:

$ coffeepot -pp -g:examples/ambig01.ixml Shimmer
There are 2 possible parses.
<product xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
   <floor-wax>Shimmer</floor-wax>
</product>

This is an example of an essential ambiguity; you can’t “fix” this grammar. But let’s dig a little deeper anyway. For a small number of parses, one way to investigate the ambiguity is to simply list them all with --parse-count:all:

$ coffeepot -pp -g:examples/ambig01.ixml --parse-count:all Shimmer
<ixml parses='2' totalParses='2'>
<product xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
   <floor-wax>Shimmer</floor-wax>
</product><product xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
   <dessert-topping>Shimmer</dessert-topping>
</product></ixml>

Alternatively, we can ask the parser to describe the ambiguity with --describe-ambiguity:

$ coffeepot -pp -g:examples/ambig01.ixml --describe-ambiguity --no-output Shimmer
Found 2 possible parses.
Ambiguity:
At 
	X dessert-topping «0-7» => 'S', 'h', 'i', 'm', 'm', 'e', 'r'
	  floor-wax «0-7» => 'S', 'h', 'i', 'm', 'm', 'e', 'r'

This indicates that the characters from 0 to 7 in the input can be matched as “dessert-topping” or “floor-wax”. This is another case where the forest graph can be useful.

Figure 5.1 The parse forest for “Shimmer”

It is possible for grammars to be infinitely ambiguous. Consider this trivial grammar:

expr: expr ; 'a' .
Example 5.2 An infinitely ambiguous grammar

There’s no practical way for coffeepot to enumerate infinitely many parses, so it essentially ignores the edges in the parse forest graph that form loops. Parsing “a” yields:

$ coffeepot -pp -g:examples/ambig02.ixml a
There are 2 possible parses.
<expr xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
   <expr>a</expr>
</expr>

There are 2 parses because that’s what can be rendered. A description of the ambiguity reveals that there are infinitely many parses:

$ coffeepot -pp -g:examples/ambig02.ixml --describe-ambiguity --no-output a
Found 2 possible parses (of infinitely many).
Infinite ambiguity:
At
	X 'a', 0, 1
	  expr «0-1» => 'a'

This is also evident in the graph:

Figure 5.2 The forest for an infinitely ambiguous parse

That loop is the source of infinite ambiguity. The parses that coffeepot will enumerate are:

$ coffeepot -pp -g:examples/ambig02.ixml --parse-count:all a
<ixml parses='2' totalParses='2'>
<expr xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
   <expr>a</expr>
</expr>
<expr xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">a</expr>
</ixml>
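
Loops are rarely written as directly as “expr: expr”. A more common source is a repetition over something that can itself match the empty string. The following is a minimal sketch using a hypothetical grammar (it is not one of the coffeepot examples, and the names “list” and “entry” are illustrative only):

```ixml
{ Infinitely ambiguous: entry can match nothing, so entry* can match
  any number of empty entries at any position without consuming input. }
list: entry* .
entry: "x"? .
```

For the input “x”, the repetition can be satisfied by one entry, by an empty entry followed by “x”, by “x” followed by an empty entry, and so on without limit. Making the entry non-optional (entry: "x" .) removes the loop in this sketch.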

If you’re trying to eliminate ambiguity from a grammar that you think should be unambiguous, look for multiple ways to match “nothing”. For example, if you have a nonterminal that matches zero or more whitespace characters, make sure it isn’t possible for it to match in two different places.
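
As a minimal sketch of that whitespace problem, consider the following hypothetical grammar (it is not one of the coffeepot examples, and the names “list”, “item”, and “s” are illustrative only). Both “item” and the repetition in “list” are willing to consume the whitespace around a comma, so an input such as “a ,b” can be parsed in more than one way:

```ixml
{ Ambiguous: a space before a comma can belong to the trailing s in item
  or to the leading s inside the repetition. The same is true of a space
  after a comma. }
list: item, (s, ",", s, item)* .
item: s, ["a"-"z"]+, s .
-s: " "* .
```

One possible repair, under the same assumptions, is to make sure whitespace is consumed in exactly one place: once at the start of the list and then only after each token.

```ixml
{ Whitespace is matched once at the start and then only after a token,
  so every space in the input has exactly one possible owner. }
list: s, item, (",", s, item)* .
item: ["a"-"z"]+, s .
-s: " "* .
```

With this arrangement the space in “a ,b” can only be matched by the “s” at the end of the first “item”.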