07.363 Logic Programming: Lecture 12
Writing Parsers in Prolog
Prolog syntax is flexible, but there are times when something more
sophisticated is needed.
Easy to write recursive descent parser in Prolog. Even easier when
using grammar rules (DCGs).
Anyone not taking 07.330?
Here is a crash course in lexical analysis and parsing.
A DCG for Evaluating Arithmetic Expressions
expr(Z) --> term(X), "+", expr(Y), { Z is X + Y }.
expr(Z) --> term(X), "-", expr(Y), { Z is X - Y }.
expr(X) --> term(X).
term(Z) --> number(X), "*", term(Y), { Z is X * Y }.
term(Z) --> number(X), "/", term(Y), { Z is X / Y }.
term(Z) --> number(Z).
number(C) --> "+", number(C).
number(C) --> "-", number(X), { C is -X }.
number(X) --> [C], { "0"=<C, C=<"9", X is C - "0" }.
DCG Expansion
?- expr(Z, "-2+3*5+1", [])
Z = 14
DCG translation does this:
p(X) --> q(X).
becomes
p(X, S0, S) :- q(X, S0, S).
More than one goal
p(X, Y) -->
q(X),
r(X, Y),
s(Y).
becomes
p(X, Y, S0, S) -->
q(X, S0, S1),
r(X, Y, S1, S2),
s(Y, S2, S).
Terminals
Inside a grammar rule body, a list indicates a terminal (``thing that
must literally appear'')
Recall: "thing" is a list of ASCII codes.
p(X) --> [go,to], q(X), [stop].
Is translated into 'C'/3
p(X, S0, S) :-
'C'(S0, go, S1),
'C'(S1, to, S2),
q(X, S2, S3),
'C'(S3, stop, S).
'C'/3 is predefined by the unit clause "'C'([XS],X,S)".
An Example
parse( program(Decls,Stmt) ) -->
decls(Decls),
stmt_seq(Stmt).
decls( [D|Decls] ) --> decl(D), decls(Decls).
decls( [] ) --> [].
decl( const(Ide,Val) ) --> [const, ide(Ide), =], expr(Val).
decl( var(Ide,Type) ) --> [var, ide(Ide), :], type(Type).
type( integer ) --> [integer].
type( boolean ) --> [boolean].
stmt_seq( [S|Stmts] ) --> stmt(S), opt_stmt_seq(Stmts).
opt_stmt_seq( [S|Stmts] ) --> [;], stmt(S), opt_stmt_seq(Stmts).
opt_stmt_seq( [] ) --> [].
stmt( assign(Ide,Expr) ) --> [ide(Ide), :=], expr(Expr).
stmt( write(Expr) ) --> [write], expr(Expr).
stmt( read(Ide) ) --> [read, '(', ide(Ide), ')'].
stmt( while(Cond,S) ) --> [while], expr(Cond),
[do], stmt(S).
stmt( if(Cond,S1,S2) ) --> [if], expr(Cond),
[then], stmt(S1),
[else], stmt(S2).
stmt( block(Stmt) ) --> [begin], stmt_seq(Stmt), [end].
expr( E ) --> lexpr( L ), expr_(L, E).
expr_( L, and(L,E) ) --> [and], expr(E).
expr_( L, or(L,E) ) --> [or], expr(E).
expr_( L, L ) --> [].
lexpr( not(L) ) --> [not], lexpr(L).
lexpr( B ) --> bexpr(B).
bexpr( B ) --> aexpr(A), bexpr_(A, B).
bexpr_( A, A>B ) --> [>], aexpr( B ).
bexpr_( A, A<B ) --> [<], aexpr( B ).
bexpr_( A, A=B ) --> [=], aexpr( B ).
bexpr_( A, A ) --> [].
aexpr( E ) --> term(T), aexpr_(T, E).
aexpr_( T, T+E ) --> [+], aexpr(E).
aexpr_( T1, E ) --> [-], term(T2), aexpr_( T1-T2, E ).
aexpr_( T, T ) --> [].
term( T ) --> factor(F), term_(F, T).
term_( F, F*T ) --> [*], term(T).
term_( F, F/T ) --> [/], term(T).
term_( F, F ) --> [].
factor( ide(Ide) ) --> [ide(Ide)].
factor( int(N) ) --> [int(N)].
factor( true ) --> [true].
factor( false ) --> [false].
factor( -F ) --> [-], factor(F).
factor( E ) --> ['('], expr(E), [')'].