Climbing trees

August 9th, 2010

Here is a solution to an ACM programming problem requiring one to analyse trees describing parental relationships.

I wrote this in Maude even though the classic tool for this kind of problem is Prolog.

The solution
The output

We start by importing integers and quoted identifiers (symbols):

mod CLIMBING is
protecting INT .
protecting QID .

Next we define the sorts used in the program. Rel means any kind of relationship, NamePair is a just pair of names. The sorts Parent, Sibling and Cousing are all relationship types.


sort Rel .

sorts NamePair .
sorts Parent Sibling Cousin RelType .
subsorts Parent Sibling Cousin < RelType .

The following operators (or constructors) are used to construct various relationship facts.


op Parent : Nat -> Parent .
op Cousin : Nat Nat -> Cousin .
op Sibling : -> Sibling .

op pair : Qid Qid -> NamePair .
op rel : NamePair RelType -> Rel .
op parent : Qid Qid -> Rel .

op Empty : -> Rel [ ctor ] .
op __ : Rel Rel -> Rel [ ctor assoc comm id: Empty ] .

The problem set is initialized with the following set of parent-child facts

op init : -> Rel .
eq init =
parent('alonzo.church, 'oswald.veblen)
parent('stephen.kleene, 'alonzo.church)
parent('dana.scott, 'alonzo.church)
parent('martin.davis, 'alonzo.church)
parent('pat.fischer, 'hartley.rogers)
parent('mike.paterson, 'david.park)
parent('dennis.ritchie, 'pat.fischer)
parent('hartley.rogers, 'alonzo.church)
parent('les.valiant, 'mike.paterson)
parent('bob.constable, 'stephen.kleene)
parent('david.park, 'hartley.rogers) .

We use the following variables to name various objects in our rules:


vars A B : Qid .
vars X Y Z : Qid .

vars N M O P : Nat .
vars R : RelType .
vars REST : Rel .

These four rules describe the different relationship types:


---- Note that the input is in (child, parent) format, but we want to output (parent, child)

rl [ parent ] : parent(A, B) => rel( pair(B, A), Parent(0)) .

rl [ grandparent ] :
rel( pair(X, Y), Parent(0))
rel( pair(Y, Z), Parent(N)) => rel( pair(X, Z), Parent(N + 1)) .

---- sibling is reflexive

rl [ sibling ] :
rel( pair(X, Y), Parent(0))
rel( pair(X, Z), Parent(0)) => rel( pair(Y, Z), Sibling) rel( pair(Z, Y), Sibling) .

---- check least ancestor
rl [ cousin ] :
rel( pair(X, Y), Parent(N))
rel( pair(X, Z), Parent(M)) => rel( pair(Y, Z), Cousin(min(N,M), abs(N - M))) .
endm

These search expressions describe the input queries (we could make an effort to parse these properly instead)


search [1] in CLIMBING : init =>+ rel (pair ('stephen.kleene, 'bob.constable), R) REST .
search [1] in CLIMBING : init =>+ rel (pair ('hartley.rogers, 'stephen.kleene), R) REST .

----- search [1] in CLIMBING : init =>+ rel (pair ('les.valiant, 'alonzo.church), R) REST .
----- swap this query to search for father instead of child

search [1] in CLIMBING : init =>+ rel (pair ('alonzo.church, 'les.valiant), R) REST .

search [1] in CLIMBING : init =>+ rel (pair ('les.valiant, 'dennis.ritchie), R) REST .
search [1] in CLIMBING : init =>+ rel (pair ('dennis.ritchie, 'les.valiant), R) REST .
search [1] in CLIMBING : init =>+ rel (pair ('pat.fischer, 'michael.rabin), R) REST .

quit

Loop macros

June 4th, 2010

I have released the latest version of Church-State here:
http://subvert-the-dominant-paradigm.net/~jewel/church/church-alpha-4.tar.gz

You can also get the source code via mercurial:

hg clone http://subvert-the-dominant-paradigm.net/repos/bootstrap

or browse it.

In the previous version of Church loop constructs were expanded in the Church compiler directly to the State “tagbody” form. I have removed this code and implemented it as a Church macro which expands to a new Church “tagbody” form.

The aim behind this is to make it easier to do optimization and flow analysis on Church code in the future.

User extensible parsing

March 29th, 2010

I have been experimenting with user-extensible parsers in Church, even though I don’t have a use for them yet.

I added a new language construct called “extend-grammar”, ie:


extend-grammar ometa testg {
test-rule = "test" ws+ cname:a -> << 42 >>
}

This test-rule will match the string “test”, followed by a name. The rule will ignore its input and return the value 42.

This grammar is processed during parse time, converted to church code and dynamically compiled and linked into the running process.

At present this happens after the whole file is parsed, so it’s currently not possible to add a new grammar rule and use it in the same source file.

To activate the rule I added another construct, “eval-when”, ie:


eval-when compile load
      church-add-parser-extension 'test-rule
                                                                                    

What this does is to execute this code when this file is compiled (load doesn’t work yet).

In this example we add ‘test-rule to the list of ometa functions to be called by a special new grammar rule called ‘user-form.

The new version of ‘user-form is compiled and linked into the process, immediately making it available to the parser.

For this test case I then load the following file:


dotest
        print "in dotest"
        print (test notanumber)

which prints 42 because the parser has intercepted what would normally be a method call.

Rude repl

March 14th, 2010

I have added eval and a repl (read eval print loop) to Church.

> + 3 4
7
> length "foobar"
6
> eval "* 19 3"
57
> load "church/test/hello.church"
true
> (main)
"hello"
true

Up to now I have only been using Church as a command-line compiler which produces executable files. Yet I have always preferred interactive language environments (lisp, smalltalk, python etc) to stop-compile-run languages.

All the machinery for writing a repl has been in Church for a while, including

  • Church parser available as a library
  • Church compiler available at runtime
  • Machine code generator and dynamic linker available at runtime
  • The ability to modify the dispatch table at runtime

This makes the repl easy to implement:

repl
        loop
                do
                        rstr = (read-line)
                        if (null? rstr)
                                return-from repl nil
                        else
                                print (eval rstr)

Eval is a little bit more messy because the compiler is designed to compile a whole method at a time and doesn’t know what to do with variables that are not either a global or local variable.

To implement eval I wrapped the eval string in a lambda that I return from a method which gets run after compilation.

set-eval-compiled-function
   eval-compiled-function = (fn -- eval_str)

This works for certain language expressions, but does not presently provide the ability to assign to “repl” variables.

Another drawback is that multiline statements are currently not possible (unlike python and lisp).

In the future I hope to experiment with a SLIME-like extension to emacs which communicates to Church across a socket, allowing interactive evaluation and compilation of source code from within the editor.

Prolog in Church

November 24th, 2009

As a step towards implementing an experimental type inferencing or type checking system I have built a prolog interpreter in Church. The interpreter is based on the example in Peter Norvig’s book, “Paradigms of Artificial Intelligence Programming”.

As a first step I wrote the grammar for parsing prolog programs. Here is a sample ‘parent’ program describing parent and grandparent relationships between Debian derivatives:


parent(ubuntu,kubuntu).
parent(ubuntu,edubuntu).

parent(debian, ubuntu).

grandparent(X,Z) :- parent(X,Y), parent(Y,Z).

and here is the OMeta grammar for parsing prolog programs:


ometa prolog <: ometa {

comment = "/*" (~cnewline anything)* cnewline -> << '_comment >>,

ws = $  | $	,
wsnl = ws | cnewline,

prolog-top-levels = (comment | prolog-rule)*,

prolog-rule = wsnl* prolog-clause:rhead ws* (":-" ws* prolog-or:body -> << body >>)* ws* $. wsnl*  -> <<`[rule ,rhead ,body]>>,

prolog-or = listof("prolog-impl", ";"):e  wsnl* -> << `[or ,e] >>,
prolog-impl = wsnl* prolog-clause:a wsnl* "->" wsnl* prolog-or:b wsnl* -> << `[implies ,a ,b] >> | prolog-and,
prolog-and = wsnl* listof("prolog-expr", ","):e wsnl* -> << `[and ,e] >>,

prolog-clause = wsnl* prolog-name:name wsnl* prolog-arg-list:args wsnl* -> <<`[,name | ,args]>>,
prolog-arg-list = $( listof("prolog-expr", ","):args wsnl* $) wsnl* -> << args >>,

prolog-expr = wsnl* (
	         $[ wsnl* $] wsnl* -> << `[]>> |
	      	 $[ wsnl* listof("prolog-expr", ","):list-head ws* ($| wsnl* prolog-expr)*:list-tail wsnl* $] wsnl* -> << (append! list-head (if list-tail (first list-tail) nil)) >> |
	      	 $( wsnl* prolog-or:e wsnl* $) wsnl* -> << e >> |
		 prolog-clause:c wsnl* -> << c >> |
                 prolog-number |
                 prolog-variable:v wsnl* -> << v >>),

prolog-number = "-"*:sign digit+:d -> << (convert-number sign d)>>,

prolog-variable = (letter | digit | $_)+:l -> << (intern (coerce l 'string)) >>,

prolog-name = (letter | digit | $_)+:l -> << (intern (coerce l 'string)) >>  }

The core of prolog lies in the unification and backtracking algorithms.

Unification will try to match its two inputs or else bind a variable to the corresponding value:


unify x y bindings
	cond
		(eq? bindings prolog-fail) prolog-fail
		(eq? x y) bindings
		(eq? x '_) bindings
		(eq? y '_) bindings
		(variable-symbol? x) (unify-variable x y bindings)
		(variable-symbol? y) (unify-variable y x bindings)
		(and (cons? x) (cons? y)) (unify (rest x) (rest y) (unify (first x) (first y) bindings))
		true prolog-fail

unify-variable var x bindings
	cond
		(get-binding var bindings) (unify (lookup var bindings) x bindings)
		(and (variable-symbol? x) (get-binding x bindings)) (unify var (lookup x bindings) bindings)
		(and occurs-check? (occurs-check var x bindings)) prolog-fail
		true (extend-bindings var x bindings)

Backtracking is achieved by allowing each clause to provide multiple solutions and trying all of these possibilities:


prove-all pi:prolog-interpreter goals bindings
	cond
		(eq? bindings prolog-fail) nil
		(null? goals) (list bindings)
		true
			mapcan (fn goal1-solution
				prove-all pi (rest goals) goal1-solution
) (prove pi (first goals) bindings)

As a test I used this map coloring program from a prolog tutorial:


member(X,[X|_]).
member(X,[_|List]) :- member(X,List).

adjacent(X,Y,Map) :-  member([X,Y],Map) ; member([Y,X],Map). 

find_regions([],R,R).
find_regions([[X,Y]|S], R,A) :-
 (member(X,R) ->
  (member(Y,R) -> find_regions(S,R,A)     ; find_regions(S,[Y|R],A))  ;
  (member(Y,R) -> find_regions(S,[X|R],A) ; find_regions(S,[X,Y|R],A) )). 

color(Map,Colors,Coloring) :-
        find_regions(Map,[],Regions),
        color_all(Regions,Colors,Coloring),
        not(conflict(Map,Coloring)).
color_all([R|Rs],Colors,[[R,C]|A]) :-
        member(C,Colors),
        color_all(Rs,Colors,A).
color_all([],_,[]). 

conflict(Map,Coloring) :- member([R1,C],Coloring),
        member([R2,C],Coloring),
        adjacent(R1,R2,Map). 

map1([[1,2],[1,3],[1,4],[1,5],[2,3],[2,4],[3,4],[4,5]]).

which yields the following colourings:


"query"
[[map1 M] [color M [red green blue yellow] Coloring]]

M = [[1 2] [1 3] [1 4] [1 5] [2 3] [2 4] [3 4] [4 5]]
Coloring = [[5 red] [4 green] [3 red] [1 blue] [2 yellow]]
;
M = [[1 2] [1 3] [1 4] [1 5] [2 3] [2 4] [3 4] [4 5]]
Coloring = [[5 red] [4 green] [3 red] [1 yellow] [2 blue]]
...

You can browse the source files for the prolog interpreter here.

jonesforth64

July 20th, 2009

I have ported jonesforth to 64-bit x86 code.

jonesforth is a tutorial-style implementation of forth which explains in detail how the compiler and runtime is implemented. Porting the code to a slightly different assembly language helped me to think carefully about what each primitive does and about how they are used in the runtime code.

As noted in the jonesforth comments, the original advantage of using direct-threaded code on a 16-bit machine is that calling each word can be encoded in two bytes instead of three. That’s a savings of 33%. On 32-bit x86, it’s four bytes versus five, saving 20%. In my 64-bit implementation I chose to extend the 4-byte addresses to 8-byte words. This actually results in wasting space rather than saving it because on x86-64 calls and branches are usually encoded with a 5-byte instruction using relative displacement.

The port was fairly straightforward, I mostly just replaced 32-bit registers (eax, esp, esi etc) with the 64-bit equivalents (rax, rsp, rsi etc) and changed every reference to the word-size from 4 to 8.

The biggest difference is that syscalls use different registers on 64-bit linux and these registers can be clobbered during the call.

You can get the code from the mercurial repository:

hg clone http://subvert-the-dominant-paradigm.net/repos/jonesforth64/

To compile it:


gcc -m64 -nostdlib -static -Wl,-Ttext,0 -Wl,--build-id=none -o jonesforth64 jonesforth64.S

To run it:


cat jonesforth64.f - | ./jonesforth64
JONESFORTH VERSION 45
6393 CELLS REMAINING
OK
1 2 3 4
.S
4 3 2 1
ROT
.S
3 2 4 1

I’ve tested most of the code in the .f file, but I haven’t yet implemented C strings, file-io or the built-in assembler.

I’ve tried to keep the comments intact, but haven’t updated them to reflect different word sizes or registers etc.

Hullabaloo in the Guava Orchard

July 1st, 2009

Hullabaloo in the Guava Orchard

This book and “The Inheritance of Loss” make Kiran Desai one of my favourite authors.

ICFP ‘09

June 30th, 2009

I had the pleasure of participating in the ICFP contest again this year. I think I joined a haskell team in 2002 (which is funny because I’m not proficient in haskell at all!). But since then I’ve only been part of a Lisp team in 2007. That time I failed to implement a ropes-like library in enough time to get us enough momentum.

So this year it was great to try again, our team ended up being just myself and another Lisp hacker from New Zealand. Since I’m in South Africa there was quite a time zone difference between us and we both stayed up till early hours during the last two days.

As always it didn’t take long to write up the initial code for the interpreter, but we spent several hours tracking down bugs, parsing the inputs and figuring out how to approach the problem.

My partner then took over, he wrote all the physics and modeling code while I tried to provide moral support and starting hacking a visualizer.

Initially we used a lisp library called cgn, which was ok for the initial runs but proved too slow later on. cgn writes out data points to a text file (and we had to patch it to write double-floats correctly) and then feeds this to gnuplot.

The writing and reading of these ascii files was too slow for the kinds of scenarios we were modeling (tens of thousands of data points), so I started over, trimming the dataset by only using every 100th data point and writing data files in binary format.

Since I had never really used gnuplot before I spent several hours reading documentation and poking around until I was able to generate scripts like the following to render our data:


set key bottom right
plot [-100000000:100000000] 'plots/ROCKET.dat' binary record=11951X11951 format="%float64%float64" title 'plots/ROCKET', 'plots/TARGET.dat' binary record=11951X11951 format="%float64%float64" title 'plots/TARGET'
pause -1

I was quite pleased with the results, but there are obviously better ways to display the trace if you have the time and create the right tools.

Satellite visualization

(Note the start and end labels are swapped in this picture)

In the end we never really got a solution for the 4th task, but managed to score about 2000 points with our solutions.

Church alpha 3

March 20th, 2009

This release fixes some bugs:

http://subvert-the-dominant-paradigm.net/~jewel/church/church-alpha-3.tar.gz

  • When I first wrote some of the church compiler passes and the sexp parser I used global variables to store state specific to that pass. I knew that only one thread would be running and that these passes didn’t have to be reentrant. Later I added functionality to compile macros on the fly and to compile dispatch matchers. When I did this I tried to rewrite all the modules that used global state, but I missed the sexp parser and this caused some strange bugs later on.
  • When allocating memory for dynamically allocated code, I was storing the address of the memory block in a fixnum. This worked fine until malloc started returning addresses from higher memory which used the most significant two bits in a word. These would get shifted out when tagging a fixnum and yield an invalid address. To work around this I now box the address by storing it in an array

Church alpha 2

March 11th, 2009

I have made a new release incorporating the performance work described in previous posts.

http://subvert-the-dominant-paradigm.net/~jewel/church/church-alpha-2.tar.gz