Parsing

In January I added some optimizations to my OMeta implementation to compute FIRST and FOLLOW sets for certain (hard-coded) grammar rules to try and predict which parsing rule to apply when faced with many options. This needs to be extended to automatically select rules that will benefit from this optimization.

I was recently looking at the GLL parser described by Elizabeth Scott and Adrian Johnstone. Does anyone know how easy it is to implement a parser like this? How does it compare (in implementation effort) with something like OMeta?

That reminded me of Matthew Might’s post on parsing with derivatives. I had a look at it again and I am very much compelled by the elegance of the implementation. However, looking at it closely reminds me of how much I suffer from Scheme Envy. The 500 line Racket implementation uses streams (lazy lists) and makes heavy use of hygienic macros. It uses dynamic binding (make-parameter / parameterize), delay/force, the pattern matching facility and weak hash tables. How many potential implementation targets provide facilities like these?

A parsing issue that worries me a lot is how to parse languages that rely on indentation (eg python, haskell) and still maintain some kind of composability for user defined grammars.