OMeta is a parsing engine by Ian Piumarta and Alessandro Warth. It combines PEG rules with a syntax for naming matched components and executable actions that can refer to the named parts. A simple OMeta rule looks like this:
$' (~$' <ochar>)*:s $' => [(coerce s 'string)]
Here the rule matches a string that starts with a single quote, continues with zero or more characters that are not single quotes, and ends with a single quote. The characters between the quotes are bound to the name s.
The action for this rule turns the list of matched characters into a Lisp string.
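To make the behaviour of that rule concrete, here is a rough hand-written equivalent in plain Common Lisp. It is only a sketch: match-quoted-string is a hypothetical name that is not part of the OMeta code, and it assumes that <ochar> accepts any single character, which may not be what the real rule does.

```lisp
;; Hand-written equivalent of the quoted-string rule, for illustration only.
;; MATCH-QUOTED-STRING is a hypothetical name, not part of the OMeta code.
;; Assumption: <ochar> accepts any single character.
(defun match-quoted-string (chars)
  "Match a single-quoted string at the front of CHARS, a list of characters.
Returns two values: the contents as a string and the unconsumed input.
On failure returns NIL and the original input."
  (if (and chars (char= (first chars) #\'))
      (loop with collected = '()
            for tail on (rest chars)
            for c = (first tail)
            do (if (char= c #\')
                   ;; Closing quote: build the string, like (coerce s 'string).
                   (return (values (coerce (nreverse collected) 'string)
                                   (rest tail)))
                   ;; Any other character is part of the contents.
                   (push c collected))
            ;; Input ran out before a closing quote was seen.
            finally (return (values nil chars)))
      ;; No opening quote at the front of the input.
      (values nil chars)))
```

Calling (match-quoted-string (coerce "'abc' rest" 'list)) returns the string "abc" as its first value and the remaining characters as its second.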
I created an OMeta implementation in Common Lisp by bootstrapping from the Squeak implementation of OMeta.
To do this I first modified the OMeta parser so that it could read Lisp forms in the action bodies. Next I modified the OMeta compiler to produce Lisp forms instead of Smalltalk code.
Using these generated forms in combination with hand-coded primitive rules (ometa-prim.lisp), I was able to use two new grammars, ometa-parser.g and ometa-compiler.g, to fully bootstrap the system.
My code is available from this mercurial repository:
hg clone http://subvert-the-dominant-paradigm.net/repos/hgwebdir.cgi/ometa/
To run it:
Then you can parse a grammar file into its AST:
To create an executable parser, first declare a subclass of ometa:
(defclass dnaparser (ometa) ())
Next write the production rules in a grammar file (the two rule bodies below belong to the rules base and dsequence):
$A | $T | $G | $C^L
<base>*:s => [ (coerce s 'string) ]^L
and generate the parser from the grammar:
(generate-parser 'dnaparser "example.g" "example.lisp")
This reads the grammar file “example.g” and produces Lisp defmethod forms for the class ‘dnaparser’. These forms are written to “example.lisp”.
After loading the parser, you can run the productions like this:
CL-USER> (let ((*ometa-class-name* 'dnaparser))
(run-production 'dsequence (coerce "GGCCGGGC" 'list)))
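As a rough illustration of what the two rules compute, here is a plain Common Lisp sketch that does not use OMeta at all. The names base-char-p and match-dsequence are hypothetical and exist only for this example. The sketch assumes the grammar behaves as a PEG, so the star consumes as many bases as it can and stops at the first character that is not a base.

```lisp
;; Plain Common Lisp sketch of what the two grammar rules compute.
;; BASE-CHAR-P and MATCH-DSEQUENCE are hypothetical names for illustration.
(defun base-char-p (c)
  "True when C is one of the four characters accepted by the base rule."
  (and (member c '(#\A #\T #\G #\C) :test #'char=) t))

(defun match-dsequence (chars)
  "Consume the longest prefix of CHARS that consists only of bases.
Returns the prefix as a string and the unconsumed input as a second value."
  (let ((end (or (position-if-not #'base-char-p chars)
                 (length chars))))
    (values (coerce (subseq chars 0 end) 'string)
            (nthcdr end chars))))
```

With this sketch, (match-dsequence (coerce "GGCCGGGC" 'list)) returns the string "GGCCGGGC" and an empty remainder.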
As an alternative to generate-parser, you can use install-grammar to load the rules without generating an intermediate file.
The Squeak implementation of OMeta supports several extensions that I have not implemented:
– Memoization of previous parse results
– Support for left-recursive rules
– Ability to apply a rule in the super class
– Support for rules that call out to a “foreign” parser during the parse
– An optimizer to remove redundant AND and OR forms produced by the parser
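To give an idea of what the last item involves, here is a minimal sketch of such an optimizer. It assumes that the parser output is a tree of lists headed by the symbols and and or, which is my guess for illustration and not necessarily the real AST format. The function name simplify-form is hypothetical as well.

```lisp
;; Sketch of an AND/OR simplifier over an assumed AST shape:
;; (and form...) for sequences and (or form...) for ordered choices.
;; SIMPLIFY-FORM is a hypothetical name, not part of the OMeta code.
(defun simplify-form (form)
  "Flatten directly nested AND/OR forms and unwrap single-argument ones.
Forms with any other head are returned unchanged and are not descended into,
so that Lisp action bodies are left alone."
  (if (and (consp form) (member (first form) '(and or)))
      (let* ((op (first form))
             (args (loop for arg in (mapcar #'simplify-form (rest form))
                         ;; Splice in children that use the same operator.
                         if (and (consp arg) (eq (first arg) op))
                           append (rest arg)
                         else
                           collect arg)))
        ;; A form with exactly one argument is just that argument.
        (if (and args (null (rest args)))
            (first args)
            (cons op args)))
      form))
```

For example, (simplify-form '(and (and a b) (or (or c) d))) returns (and a b (or c d)). Flattening is safe under the usual PEG reading because sequencing and ordered choice are both associative.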