Prolog for Software Engineering

A tutorial by Peter B. Reintjes for the 1994 International Conference on the Practical Applications of Prolog

Originally presented at the Royal Society of Arts, London on April 27, 1994.

The Prolog programming language offers several opportunities to tackle the fundamental problems of software engineering. By properly exploiting certain characteristics of Prolog, we can improve the clarity, robustness, and reliability of programs as well as improving communication between programmers.
But clear, reliable, and maintainable programs do not write themselves, even in Prolog. Therefore, this tutorial will focus on systematic development of small Prolog programs, the role of specifications, how to build interfaces to other languages and systems, standardized programming techniques and styles, approaches to performance measurement to increase efficiency, how to effectively rewrite for efficiency, and how to produce documentation in a literate programming style.
Finally, there will be a description of a performance monitor and a tool for handling formal languages in Prolog.

Outline

Software Engineering Problems
Rewrite! Rewrite! REWRITE!
The Gospel According to O'Keefe
No Silver Bullet . . . but
Using Prolog
The Program Equation
Efficient Data Structures: How Many Bits?
Arguments: Boon or Bane?
Literate Programming
The Elements of Prolog Style
Software Engineering Case Studies
Blackbox and Whitebox reusability
Case Study #1: L'Express
Case Study #2: Multi/Plex
Application Composition and Decomposition
Multi-Lingual Logic Synthesis
Performance Monitoring
The Programmer Who Knew Too Much.

Software Engineering Problems

Some Major Problems

Too much code. Either tens of thousands, or hundreds of thousands, or millions of lines of code for an ``Application''.
No one really understands exactly what the code is doing.
There is insufficient documentation. In fact, it is so unusual to have documentation, it is ignored when it does exist.
We have little chance of ever understanding all of the interactions between different parts of a large system.
The people who do have a chance of understanding the code, very often don't understand the application.
People do not reuse existing software either because they don't trust it or because it is not general enough.

All of these problems can be related to the fact that too many programs are written in languages which describe operations at too low a level. In other words, the expressibility of the programming language is far below the level of a natural description of the problem.

Programs dealing with human-scale problems should be written in some unambiguous subset of natural language such as predicate calculus, or at the very least, Prolog rather than C or C++.

The paralyzing dictum that ``We cannot afford to re-write the code'' is likely to prove false. There are many indications that we will soon be unable to afford the maintainence of old code.

Rewrite, Rewrite, REWRITE!

Traditional Model of Software Development

Specify => Build => Test => Tune

This model of software development assumes that you know exactly what you want to build.

In this model the entropy (amount of disorder in the software) only increases over the lifetime of the program.

When an artifact is too large and complex for a person to understand, adding features and fixing bugs consists of adding code and new pathways through existing code. The integrity of the program always suffers.

Uncertainty about the complete function of a piece of software forces the maintenance worker to frequently add, but almost never remove code.

The Evolutionary Model of Software Development

Steps (in no particular order) are:

Data-structure Analysis
Prototyping
Refinement
Adding Capability
Tuning

Organisms resist entropy by constantly re-building. Any organism that repaired itself by building new structures, without removing existing structure would die of self-poisoning. However, constant rewriting is often infeasable, since programs would more often be broken more often than working in development.

Punctuated Equillibrium: A Model

Stephen Jay Gould, the paleontologist, has proposed the theory of Puncutated Equllibrium which says that evolution does not always proceed smoothly with gradual changes, but that periods of gradual change are interrupted by very short-term upheavals which result in massive die-offs of large numbers of species.

For our purposes we might use the considerably less apocalyptic metaphor of ``Spring Cleaning''. We must be willing to periodically rip up and make major changes to our software, but without concise, high-level languages such as Prolog, such upheavals are nearly impossible.

The Gospel According to O'Keefe

By far the best book on Prolog Programming style is The Craft of Prolog by Richard A. O'Keefe.

This book is an absolute requirement for any serious Prolog programmer. There are many important things here that no other Prolog books even mention.

Using Prolog

The Program Equation

When we consider Kowalski's concise equation:

Algorithm = Logic + Control

It is easy to forget about the original equation from which it derives. Namely:

Programs = Algorithms + Data Structures

With the result that our programs seem to be dominated by list-traversal predicates of the form:

map([], []).
map([Item|Items], [X|Xs]) :-
       predicate(Item, X),
       map(Items, Xs).

To refocus our attention on real programs rather than just algorithms, we must substitute for Algorithm in the second equation to remind us that:

Programs = Logic + Control + Data Structures

Features of (Pure) Prolog

Unification
Backtracking
Memory Management
Pattern matching (Indexing)
Meta-programming,
Low procedure-call overhead.

Temptations of (Impure) Prolog

Meta-logical tests (var/1, nonvar/1)
Database operations (assert/1, retract/1)
Cuts!
If-then-else structures.

Extensive use of the impure features is a strong indication that something is wrong with your program.

Arguments: Boon or Bane?

Many people complain about the lack of global variables in Prolog because they have to write predicates with ten or twenty arguments. If you have this problem, either This relationship between twenty things is an important aspect of your application (unlikely).

You have not defined your data structures and relations properly (very likely).

In the rare event of case #1, you must keep, cherish, and document the twenty arguments since they represent an important and probably extremely subtle characteristic of your application.

If you are complaining because you find yourself adding two arguments one minute, and later deleting another argument or moving the fifteenth argument to the third place, the problem is not that of a large arity relationship, but poor program organization. You must appreciate an argument when it means something, and recognize when they are the result of poor design. Very rarely will you fruitfully replace an argument with a global assertion, though the temptation will be strong.

Turn away from the Dark Side, Luke - Obi Wan Kenobe

Efficient Data Structures: How Many Bits Do You Use?

Bruce Smith, who taught me Prolog programming in 1986 once commented on the following fragment of a program I had written.

display_list([]).
display_list([Item|Items]) :-
    display_item(Item),
    display_list(Items).

display(graphic(Object, Color)) :-
	set_color(Color),
	draw_object(Object).

display(text(X, Y, Text)) :-
	set_color(black),
	name(Text, Chars),
	write_chars(X, Y, Chars).

He pointed out that I was only using one bit of the atoms graphic and text. In fact, if the display/1 predicate had eight clauses, for the eight different objects that ultimately had to display (transistors, wires, etc.), at least I would be using three bits.

If we want to manipulate arrays of zeros, ones, and x (don't care) values, the obvious data-structure might be a list-of-lists (we will use this later in the circuit minimization example).

	   [[1,0,x,1,0],
	    [0,x,x,0,x],
	    [1,x,x,0,x],
	    [x,1,1,1,x],
	    [x,1,x,x,1]]

But while each of the twenty-five elements represent an information content of two-bits, Prolog uses 85 words (2720 bits on a 32-bit machine), rather than 50 bits.

If the row width is fixed (and less than 256) we can use row(1,0,x,1,0) to represent a rows and reduce this to 40 words, or 1280 bits. If the columns are fixed we can reduce the outer list overhead to 6 rather than 10, for a total of 1152 bits.

In the limit as the array gets larger our structure will use 1/3 of the space required by the naive list-of-lists.

But we have not even looked at the amount of space required for each two-bit data element. We need a way to use more of the bits in each integer. Now, in this particular application, the matrices are quite sparse, with many X or ``don't care'' elements. If we want to take advantage of the sparsity, we must have a way for each element to carry its column number (since all columns will not appear in the representation), and this directly suggests how to use more of the bits in the 32-bit integers that hold our data elements.

Operationally, we encode an element by left-shifting a variable's column index sufficiently to store its value into the lower bits. For boolean values we only shift the index one bit position. Thus a 4 represents a 0 at position 2 ( (2>>1) OR 0 => 4).

          X1 X2 X3 X4 -> X1 X2 X3 X4

           2 0 1 2           4  7
           0 0 2 2        2  4
           1 2 2 1        3       13
           2 2 2 2

Original Matrix and Encoding

The lesson is: Don't be afraid to twiddle bits in Prolog. (But of course, neither should you do it unnecessarily.)

Literate Programming

We really must take advantage of the conciseness of Prolog by writing readable programs. ``Self-documenting'' programs are a myth. There is no substitute for having clear, concise prose accompany an algorithm.

This is particularly worthwhile because well-written Prolog programs can serve as formal definitions of our applications and not simply as implementions. A logically cohesive description of an application will transcend a particular implementation and will ultimately be of much greater value.

Good mathematical descriptions have lifetimes measured in centuries. As programs become more like formal specifications of the problems they address, and less like collections of imperative code to handle special cases, their potential lifetimes may approach those of mathematical theorems.

Integrated Documentation

I use the following scheme in which documents are built directly from Prolog source code, since the need to edit and consult the program is more frequent than the need to produce the documentation.

/**
Incisive text in LaTex
(or another suitable markup language).
**/
meaningful_predicate(WellNamed, Arguments) :-
      well_chosen_word(Arguments, WellNamed).
/**
Riveting revelation about the above code...
**/

This is run through a filter (src2tex) to become:

Incisive text in LaTex (or another suitable markup language).

\begin{verbatim}
meaningful_predicate(WellNamed, Arguments) :-
      well_chosen_word(Arguments, WellNamed).
\end{verbatim}

Riveting revelation about the above code...

The Elements of Prolog Style

Well-constructed Prolog programs can have declarative and procedural readings which are very close to a natural language specification. Combined with the reduction in the size of source code, we have an opportunity to radically change the presentation of programs.

All programming languages have an interpretation in natural language, but if we examine the rules for good prose style, they can be considered as rules for good Prolog programming. This is true to an extent that is not true in other languages. These rules are taken from The Elements of Style with ``code'', ``clause'' and ``predicate'' substituted for ``words'', ``sentence'' and ``paragraph''.

Revise and Rewrite
Omit needless code
Keep related code together
Put statements in positive form
Choose a suitable design and hold to it
Make a predicate the unit of composition
Express co-ordinate ideas in similar form
Work with nouns and verbs
Use orthodox spelling
Be clear
Prefer the standard to the offbeat
Do not take shortcuts at the cost of clarity

One reason for this closeness between the values of good writing and good Prolog programming is the correspondence between clarity and efficiency that Prolog exhibits. It is more difficult to forsee improved software engineering in other languages where clarity and efficiency are more often at odds.

Software Engineering Case Studies

Blackbox and Whitebox Reusability

After maintainability, re-usability of code is an important goal of good Software Engineering. In the case studies which follow, we will pay particular attention to the two kinds of software reusability known as BLACKBOX and WHITEBOX reusability.

BLACKBOX reusability is the kind of reusability you get from a good tool. You can use it without knowing or caring about how it is implemented.

WHITEBOX reusability is the the sort that you can get from a source-code library, where a particular algorithm might be picked up in source code form.

Clearly BLACKBOX reusability comes about when the software has implemented a function of very general applicability. Good examples are, the C library, the standard SmallTalk methods, Prolog's term-expansion feature, and the UNIX LEX and YACC utilities.

WHITEBOX reusability depends crucially on the readability of code, which we believe is a particular strength of Prolog.

Case Study #1 has WHITEBOX and BLACKBOX characteristics.

L'Express: A Logical version of Espresso

A popular and important algorithm for the minimization of logic functions known as Espresso has been defined almost entirely in terms of matrix operations of an algebra of five values. Espresso is well established as the industry standard of two-level logic minimization technology. This algorithm is described in detail in Logic Minimization Algorithms for VLSI Synthesis by Brayton et al. (Kluwer Academic Press, 1984).

The space and time limitations of logic minimization problems coincide with large, but sparse, matrices. A direct implementation of this algorithm, in any language, is less than optimal if one does not take advantage of the sparsity of large logic function arrays. A sparse-matrix approach directly alleviates the space problem and may improve the time performance.

This chapter describes a ``logical'' version of Espresso. This program is not yet complete but describes the basic structure and pieces of the fundamental algorithms.

The primary motivations for developing this example are:

Software Engineering: The Prolog definitions of these algorithms are nearly as compact as the original pseudo-code.
Prototyping/Research: Since the Prolog description is compact and clearly defines the logic of the algorithms, it is more ammenable to experimentation and improvement.
Education: We demonstrate how an efficient implementation can be developed for an algorithm considered by many to be inappropriate for Prolog. In finding an efficient Prolog implementation for Espresso, we have established a general method for constructing efficient Prolog programs for sparse-matrix operations over finite algebras.

The Espresso Algorithm

Espresso-II is a widely used algorithm for the minimization of logic circuits in VLSI design.

The N-column matrix of boolean values (including don't cares) represents the function of N variables for which our combinatorial circuit must produce a true output. Each column translates directly into a network of inverters and AND gates with the number of inputs equal to the number of elements with 0 and 1 values. Each row corresponds to the input of an OR-gate in the final circuit.

Espresso uses a set of powerful heuristics to reduce these matrices to find the minimal circuit design for that function.

Briefly, there are three steps in each iteration of the minimization process, but a non-changing cost after any step directs the algorithm to enter the LAST_GASP phase which may move the current solution out of a local minimum.

The main, three-step, reduction algorithm is then re-tried until no minimization is possible in either the inner or by trying LAST_GASP.

The Espresso-II Algorithm, from page 56 of Logic Minimization Algorithms for VLSI Synthesis, by Brayton et. al., Kluwer Academic Publishers, 1984.

Procedure ESPRESSO-II (F, D)

/* Given F, a cover of {f,d,r} = (on-set, don't-care, off-set)
/* and D a cover of d, minimizes Phi(F)=(NPT,NLI,NLO)
/* where NPT is the number of cubes, NLI is the number of
/* input literals and NLO is the number of output literals.
/* Returns a minimized cover F and its cost Phi.

Begin
  F <= UNWRAP(F)
  R <= COMPLEMENT(F, D)
  Phi1 <= Phi2 <= Phi3 <= Phi4 <= COST(F)  /* Initialize Cost

LOOP1: (Phi,F) <= EXPAND(F,R)              /* F is prime and SCC-minimal
       if (First-Pass)                     /* Move essential primes
       (Phi,F,D,E) <= ESSENTIAL_PRIMES(F,D)/* into don't care set
       if (Phi == Phi1) goto OUT           /* Check termination criterion
       Phi1 <= Phi$
       (Phi, F) <= IRREDUNDANT_COVER(F,D)  /* F  is minimal cover
       if(Phi == Phi2) goto OUT            /* of prime implicants
       Phi2 <= Phi

 LOOP2: (Phi,F) <= REDUCE(F,D)             /* Each cube of F  replaced by
    if (Phi == Phi3) goto OUT              /* smallest cube containing
    Phi3 <= Phi                            /* its ``essential'' vertices.
    goto LOOP1

 OUT: if (Phi == Phi4) goto QUIT
    (Phi', F) <= LAST_GASP(F, D, R)        /* If no further improvement
    if (Phi == Phi') goto QUIT             /* terminate
    Phi1 <= Phi2 <= Phi3 <= Phi4 <= Phi'
    goto LOOP2

 QUIT: F <= union(F,E)                     /* Put essential primes E  back
    D <= D - E                             /* into cover and out of D
    (Phi, F) <= MAKE_SPARSE(F,D,R)         /* Concentrate on literals
 return(Phi, F)
 End

Espresso-II in Prolog

Global data and seven Go-tos may leave us wondering about correctness and termination. We can hide the (now local) state with DCGs.

costs(Phi1, Phi2, Phi3, Phi4)-pla(F,R,D,E)

Where each Phi is a cost(#ProductTerms, #Inputs, #Outputs) term and the call to lexpress/2 is simply:

lexpress(_-pla(F,_,D,_)), costs(_,_,_,Phi(min))-pla(Fm,_,_,_)

lexpress -->   unwrap,
               complement,
               init_cost,
               expand,
               essential_primes,
               iterate,
	       add_to_care,
               sub_from_dont_care,
               make_sparse.

 iterate   --> irredundant,
   ( cost_changed(1) -> reduction ; out).

 reduction --> reduce,
   ( cost_changed(2) -> expansion ; out).

 expansion --> expand, 
   ( cost_changed(3) -> iterate   ; out).

 out --> 
   ( cost_changed(4) -> last_gasp ; {true}).

 last_gasp -> reduce2,
   ( cost_changed(4) -> init_cost, iterate ; {true}).

Note: The actual code is smaller and clearer than the pseudo-code, but we're not happy with the use of if-then-else.

Espresso-II in Prolog (again)

lexpress -->   unwrap,
               complement,
               init_cost,
               expand,
               essential_primes,
               iterate,
	       add_to_care,
               sub_from_dont_care,
               make_sparse.


iterate   -->
     irredundant,
     ( decrease(irredundant) ->  reduce,
	 ( decrease(reduce)   ->  expand, 
	   ( decrease(expand)  ->  iterate   
	   ; out
	   )
	 ; out 
	)
     ; out
     ).

out --> 
    ( decrease(global) -> reduce2,
       ( decrease(global) -> init_cost, iterate
       ; {true}
       )
   ; {true}
   ).

This more clearly shows the three-step inner loop and the two-step outer loop, but the if-then-elses are worse than ever.

Cost Computations

decrease(reduce,      costs(I,R,E,G)-P,costs(N,R,E,G)-P) :- newcost(P,I,N).
decrease(expand,      costs(I,R,E,G)-P,costs(I,N,E,G)-P) :- newcost(P,R,N).
decrease(irredundant, costs(I,R,E,G)-P,costs(I,R,N,G)-P) :- newcost(P,E,N).
decrease(global,      costs(I,R,E,G)-P,costs(I,R,E,N)-P) :- newcost(P,G,N).

sum_costs(pla(F,_,_,_), Old, cost(NP, NI, NO)) :-
    sum_costs(PLA, 0, NP, 0, NI, 0, NO),
    cost(NP, NI, NO) @< Old.

sum_costs([], P, P, In, In, Out, Out).
sum_costs([c(Input,Output)|Cs],P0,P,In0,In,Out0,Out) :-
    P1 is P0 + 1,
    length(Input, LI), In1 is LI + In0,
    length(Output,LO), Out1 is LO + 0ut0,
    sum_costs(Cs, P1, P, In1, In, Out1, Out).

Espresso-II in Prolog (yet again)

lexpress -->   unwrap,
		complement,
		init_cost,
		expand,
		essential_primes,
		iterate,
		add_to_care,
		sub_from_dont_care,
		make_sparse.

iterate(done)     --> [].
iterate(Step)     -->
        step(Step),
        change_cost(Step, Next),
        iterate(Next).

step(reduce)      --> reduce.
step(expand)      --> expand.
step(irredundant) --> irredundant.
step(iterate)     --> init_cost, irredundant.
step(reduce2)     --> reduce2.

change_cost(reduce,     expand)      --> decrease(reduce), !.
change_cost(expand,     irredundant) --> decrease(expand), !.
change_cost(irredundant,reduce)      --> decrease(irredundant), !.
change_cost(iterate,    reduce)      --> decrease(global), !.
change_cost(reduce2,    iterate)     --> decrease(global), !.
change_cost(reduce2,    done)        --> [], !.
change_cost(      _,    reduce2)     --> [].

The termination of iterate//1 is now a little clearer, but we've gotten rid of the if-then-elses only to have a bunch of cuts (This is an obvious result when you consider how an if-then-else acts behaves like a local cut).

Espresso-II in Prolog (finally)

lexpress -->   unwrap,
               complement,
               expand,
               essential_primes,
               init_cost,
               iterate,
               dd_to_care,
               sub_from_dont_care,
               make_sparse.

iterate(done) -->  [].
iterate(Step) -->
        step(Step, Continue, Stop),
        check_cost(Step, CostChange),
        next_step(CostChange, Continue, Stop, Next),
        iterate(Next).

% step(CurrentStep, ContinueStep, StopStep) --> current_step.

step(irredundant,     reduce, reduce2) --> irredundant.
step(reduce,          expand, reduce2) --> reduce.
step(expand,     irredundant, reduce2) --> expand.
step(reduce2,        iterate,    done) --> reduce2.
step(iterate,         reduce,    done) --> init_cost,
					    irredundant.

check_cost(Step, CostChange) :-
    cost_value(Step, Previous, cost(NP, NI, NO), PLA),
    { sum_costs(PLA, cost(0, 0, 0), NewCost),
      compare(CostChange, Previous, NewCost)
    }.

next_step(=, Stop, _, Stop) --> [].
next_step(<, _, Cont, Cont) --> [].
next_step(>, _, Cont, Cont) --> [].

The five-step process (three inner and two outer) is represented as a state table, clearly showing the continuation and termination states. No cuts or if-then-elses!

Cost Computations

% cost_value(+Step, -Previous, -New, -PLA, +DataIn, -DataOut)

cost_value(reduce,     I, N, P, costs(I,R,E,G)-P,costs(N,R,E,G)-P).
cost_value(expand,     R, N, P, costs(I,R,E,G)-P,costs(I,N,E,G)-P).
cost_value(irredundant,E, N, P, costs(I,R,E,G)-P,costs(I,R,N,G)-P).
cost_value(iterate,    G, N, P, costs(I,R,E,G)-P,costs(I,R,E,N)-P).
cost_value(reduce2,    G, N, P, costs(I,R,E,G)-P,costs(I,R,E,N)-P).

sum_costs([], Cost, Cost).
sum_costs([c(Input,Output)|Cs], cost(P0,In0,Out0), Cost).
   length(Input, LI),
   length(Output,LO),
   P1 is P0 + 1,
   In1 is LI + In0,
   Out1 is LO + 0ut0,
   sum_costs(Cs, cost(P1,In1,Out1), Cost).

Prolog is able to abstract and clarify fragments of code at the very lowest levels of the algorithm as well as the highest. Some of the operations in this algorithm are trivial in Prolog.

The predicate add_to_care//0 computes the union of the logic function cover and the essential prime factors. What sounds like a complex and subtle computation turns out to be append/3 in the representation we have chosen.

add_to_care(C-pla(F0,R,D,E),C-pla(F,R,D,E)) :- append(F0, E, F).

And the cost initializer seen before is easily defined in terms of the sum_costs/7 predicate we have just seen.

init_cost(_-PLA, costs(C,C,C,C)-PLA) :-
    sum_costs(PLA, 0, NP, 0, NI, 0, NO),
    C = cost(NP, NI, NO).

More generally, Espresso computations involve things like measuring the Hamming distance between two rows (``cubes'' in boolean N-space), and computing intersections.

Cube Consensus

consensus(C, D, Consensus) :-
    distance(C, D, In, Out),
    consensus1(In, Out, C, D, Consensus).

consensus1(0, 0, C, D, Consensus) :-
    cube_intersect(C, D, Consensus).
consensus1(1, 0, C, D, Consensus) :-
    raised_intersection(C, D, Consensus).
consensus1(0, 1, C, D, Consensus) :-
    lower_outputs(C, D, Consensus).

Here we justify the use of if-then-else because in simple arithmetic tests, the creation of choice points can be completely avoided.

Co-Factors of a single cube.

An example of the most fundamental of the low-level operations is the computation of the Shannon co-factor of a matrix, relative to a particular positive or negative variable.

The computation of the co-factor corresponds to the following definition from [Brayton84] (They used 3 and 4 for the 0s and 1s in the output terms).

When the Factor p is just a single coordinate, this becomes:

co_cover([],_,[]).
co_cover([C|Cs],P,[X|Xs]) :-
   cofactor(C,P,X),
   !,
   co_cover(Cs,P,Xs).
co_cover([_|Cs],P,Xs) :-
   co_cover(Cs,P,Xs).

cofactor([],_,[]).
cofactor([C|Cs],P,Xs) :-
    ( C  =:=  P   -> Xs = Cs
    ; C>>1 > P>>1 -> Xs = [C|Cs]
    ; C>>1 < P>>1 -> Xs = [C|X1s],
		     cofactor(Cs,P,X1s)
    ).

Shannon Co-Factors of a ``Cover''

Our ``cover'' for a function is the entire matrix, where each row is a ``cube'' in the space of the boolean variables. We frequently need to compute both positive and negative Shannon co-factors of the entire matrix.

cofactors(Cover,Var,C1,C0) :-
    V1 is Var<<1 / 1,
    V0 is Var<<1 / 0,
    co_cover(Cover,V1,C1),
    co_cover(Cover,V0,C0).

And a more general version of the cofactor routine is provided to accept arbitrary cubes, rather than a single variable.

gen_cofactor([],_,[]) :- !.

gen_cofactor(_,[],[]) :- !.

gen_cofactor([C|Cs],[F|Fs],Xs) :-
    ( C>>1 > F>>1 -> evaluate(default,F,X),
		     gen_cofactor(Cs,F,X1s)
    ; C>>1 < F>>1 -> Xs = [C|X1s],
		     gen_cofactor(Cs,F,X1s)
    ; evaluate(C,F,X) -> Xs = [X|X1s],
			 gen_cofactor(Cs,F,X1s)
    ; gen_cofactor(Cs,Fs,Xs)
    ).

The purpose of this is not to dazzle you with terminology and algorithms from the world of logic minimization, but to show how we can create an (efficient) implementation which is still very close to a mathematical description of the problem. Once you understand the matrix encoding, there is very little implementation detail to stand in the way of understanding the algorithm.

This implementation of Expresso is described in less than 1000 lines of Prolog code, compared with 11,000 lines of C code.

End of Case Study #1

Lessons from L'EXPRESS Development

Bit-twiddling in Prolog is okay -- but there must be a real benefit. In particular, if it is ``bits'' that the application is talking about, then it is ``bits'' we want to see in the implementation.
Good Logic programs are state machines without state, or rather where the state is a parameter that is passed along with the point of computation. This can be done invisibly with DCG's. If you use global state then you are not really using Prolog.
Asserts and retracts are generally not okay. Except maybe in a first draft. It may be okay to be a bit lazy while prototyping, but if you let it go on to long, and you begin to feel that you have an ``investement'' in the poorly written code, then of course you will be correct. But it will be a bad investement.
Prolog re-implementations of C programs require between $1over 10$ and $1over 20$ as much source code, and are much easier to understand. This translates to between $1over 100$ and $1over 400$ as much maintenance effort.
If your code contains too much punctuation (e.g. !), it may become extinct the next time the Equilibrium is ``Punctuated''.

Case Study #2: BLACKBOX reusability

MULTI/PLEX: A Tool for Formal Languages

Wouldn't it be great if we could read and write logical forms of all formal languages as easily as we can read/1 and write/1 Prolog clauses.

Furthermore, suppose we could avoid writing the readers and writers for all the (formal) languages in the world and good get both input and output functionality from a single, declarative representation of each language's grammar.

MULTI/PLEX is a combination of two general-purpose tools which, when combined, result in a language-independent translation system. The first tool is a long-overdue version of the UNIX lex [Lesk75] program for Prolog. It is tempting, though misleading, to describe the second tool as the Prolog counterpart of YACC [Johnson78] program. The obvious objection that Prolog has little need for a parser generator is answered by pointing out a few additional features. From a single BNF-style specification of a language, MULTI creates both a parser and a pretty-printer. Furthermore, parser/generators can be constructed from the textual user specifications at run time, avoiding intermediate compilation steps.

By combining these tools in the program MULTI/PLEX, a language-independent translator is created which is driven only by the information in the user-provided language specification files. The three goals of this work can be summed up as:

Add a LEX-like tool to the Prolog development environment (PLEX).
Provide automatic transformation of a single grammatical description of a language into a parser and a pretty-printer (MULTI).
Develop a translation system which can operate from external high-level descriptions of the languages involved.

The MULTI/PLEX Program

Prolog can be at its best when used to manipulate formal languages, but with so many languages around, we will be spending a lot of time writing parsers and pretty-printers.

We begin by describing a complete application, consisting of only 12 lines of code, which uses the MULTI/PLEX module as a black box. This program constructs and then executes a translator for a pair of formal languages. The specifications (grammars) for these languages must construct identical parse-trees for this naive form of MULTI/PLEX to work correctly. The input to this program consists of high-level specifications which define the lexical and syntactical structure of the languages involved.

The seven phases of MULTI/PLEX

Consult specification and construct Parser
Construct a Tokenizer
Execute the Tokenizer
Execute the Parser
Consult specification and construct Printer
Execute the Printer

The MULTI/PLEX Translator

:- use_module(multi). % includes plex

main(InFile, OutFile) :-
    name_relation(InFile, Spec,  Lexer, Parser, _),
    consult(Spec),  % CREATES PARSER AND TOKENIZER
    see(Input),
    get_file(Chars),
    call(Lexer, Chars, Tokens),
    call(Parser, Data, Tokens, []),

  %%%% RECONCILE DIFFERENCES BETWEEN PARSE-TREES

    name_relation(OutFile, OutSpec,  _, _, Printer),
    consult(OutSpec), % THIS CREATES THE PRINTER
    call(Printer, Data, OutChars, []),
    write_list(OutChars, 0).

Prolog Tokenizers: The Problem

It is quite easy to write tokenizers in Prolog by following O'Keefe's recipe for defining simple finite-automata [OKeefe90]. However, this technique has two drawbacks. One, it is a repetitive task which must be re-verified (at least partially) for each new tokenizer that is created. Two, many of the predicates require cuts to remove unnecessary choice points or have lengthy if-then-else chains to distinguish characters.

If we were to write 128 clauses for every transition, deterministically indexing on the entire ASCII character set, we would avoid the creation of choice points and eliminate the need for cuts or if-then-else constructs. Unfortunately, manually writing 128 clauses for each character class is tedious to the point of being impractical.

Prolog Tokenizers: The Solution

A PLEX specification defines patterns, goals to call when the pattern has been recognized and a term representing the object to be passed back. If the atom text appears in a goal, it will be replaced by the list of characters matching the pattern (like yytext in LEX).

lang lexicon
 "[ tn]+" is    [];
 "begin"    is    begin ;
 "end"      is    end ;
 "."       is    '.';
 ";"        is    ';';
 "*"        is    '*';
 "+"        is    '+';
 "-"        is    '-';
 "[0-9]+"   is integer(N)
	     if name(N,text);

 "[0-9]+.[0-9]+([eE][+-]?[0-9]+)?" is float(F)
				     if name(F,text);

 "[a-zA-Z_][a-zA-Z_0-9]*"  is identifier(N)
			    if name(N,text).

Using Term Expansion

I did not want to solve the ``Tower of Babel'' problem by defining a new language, so I used Prolog term_expansion/2 to define a dialect of Prolog for lexical and BNF specifications. Although I use a few operator definitions, I strongly advice novice programmers to guard againts Operatitis, the disease of defining operators to make Prolog programs more like natural language. Adding operators for cosmetic reasons will make your code less readable by Prolog programmers.

term_expansion(lexicon(Lang,Rules),Module:Clauses):-
 ( plex:plex(Lang, Rules, Lexer) -> true
 ; user:message_hook('Lexicon error'(Lang), _, _)
 ),
 Clauses = [(:-no_style_check(discontiguous))|Lexer],
 strings:concat_atom([Lang,'_parse'], Module), 
 Compile = Module:Clauses.

term_expansion((A::=B), Clauses) :-
 ( multi:multiplex_expansion(A, B, Clauses) ->  true
 ; user:message_hook('Syntax error in BNF'(Lang),_,_)
 ),

Operators/Data Structures

Code to call rule compiler and establish goals to be called in the acceptor state.

setup_rules(X is Result if Goal, In, Out) :-
      rule(Classes,[end(Goal,Result)], X, []),
      append(Classes, Out, In).


setup_rules(X is Result, In, Out) :-
      rule(Classes,[end(true,Result)], X, []),
      append(Classes, Out, In).


setup_rules((R;Rs)) -->
      setup_rules(R),
      setup_rules(Rs).

Intermediate Forms

 "[0-9]+"  is integer(N) if name(N,text);

is turned into:

class("0123456789",+,[end(name(N,text),integer(N))])

and when converted to primitive classes this will be:

class("0123456789",one,
   [class("0123456789",*,
      [end(name(N,text), integer(N))]) 
   ])

Finite-State Automata for Recognizing Tokens

Current(C,[C2|Cs],[C|T],Text,Token,Rest) :- % On Set
     Next(C2, Cs, T, Text, Token, Rest).

Current(C, Cs, Accum, Text, Token, Rest) :- % Off Set
     Next(C, Cs, Accum, Text, Token, Rest).

Accept(C, Cs, [], Text, Token, [C|Cs]) :- Goal.

It is interesting to see this obvious and natural application of a state-transition design after struggling at length with a very different application and finally arriving at a simple state-transition model.

Next time, perhaps we should start by writing down the state-transition model for the program rather than ``random'' psuedo-code.

MULTI: Bi-Directional Grammars

xyz_file := [ library, Name ], cells,
	     update(type,library),
	     update(name,Name).


cell :=  [ Type, '(' ], arguments(Params),
      down(Name),
	      update(parameters,Params),
	  [ begin, Name ], newline,
	  indent,
	    statements,
	  undent,
	  [ end ], optional([Name]), [';'], newline,
      up.


statements := value_attribute, newline,
	       statements.

statements := cell, newline,
	       statements.

statements := [].

Bi-Directional Grammars (cont.)

value_attribute :=
      [ Name, '=' ],
	value(V),
      [ ';' ], newline,
      update(Name, V).

value(Vs) := [ '(' ], arguments(Vs).
value(V)  := [  V  ].

arguments([])     := [ ')'    ].
arguments(V)      := [  V, ')'].
arguments([V|Vs]) := [  V     ], more_values(Vs).

End of Case Study #2

Lessons from MULTI/PLEX

Prolog is the almost certainly the best programming language for manipulating languages.
The largest productivity gains have always come the development of good representations, whether this is a data structure or a language.
Prolog re-implementations of C programs require between 1/10 and 1/20 as much source code, and are much easier to understand. This translates to between 1/100 and 1/400 as much maintenance effort.
Isn't it great now that we can read and write logical forms of all languages (almost) as easily as we can read/1 and write/1 Prolog clauses.

Application Composition and Decomposition

An interesting thing about the two (fairly large) software projects examined is that there is no apparent connection between them. This is good, since it makes it more likely that they can be combined into a larger program.

The data-structures that these two programs must share appear in the L'EXPRESS source code and the language specifications which are independent of the MULTI/PLEX code.

Learn to think about the components of large applications as tools and think of existing tools as potential components of a large application. We can compose elements like this to get extremely powerful programs with excellent maintainability (Both L'Express and MULTI/PLEX exist as a stand-alone programs and as library modules).

Compose Programs for functionality

De-Compose for Maintenance it (Re-Writing)

Multi-Lingual Logic Synthesis

runtime_entry(start) :-
    unix(argv(CmdLine)),
    assert(type(fd)),     /* default */
    options(CmdLine, File),
    multplex_input(File, PLA0),
    type(Type),
    compute_other(Type, PLA0, PLA1),
    lexpress(_-PLA1, costs(_,_,_,Cost)-PLAMin),
    format(user_error,"Final PLA cost:~q~n",[Cost]),
    change_suffix(File, '.po', OutFile),
    multiplex_output(OutFile, PLAMin).

options([], _).
options([Op|Ops], File) :-
      option(Op, Ops, Rest),
      !,
      option(Rest, File).
options([File|Ops], File) :-
      option(Ops, _).

option('-Decho',   T,  T) :- assert(echo).
option('-eness',   T,  T) :- assert(ness).
option('-t', [Type|T], T) :- retract(type(_)),
			      assert(type(Type)).

Main Routine for L'Express (again)

runtime_entry(start) :-
    unix(argv(CmdLine)),
    options(CmdLine, File, fd, InType),
    multplex_input(File, PLA0),
    compute_other(InType, PLA0, PLA1),
    lexpress(_-PLA1, costs(_,_,_,Cost)-PLAMin),
    format(user_error,"Final PLA cost:~q~n",[Cost]),
    change_suffix(File, '.po', OutFile),
    multiplex_output(OutFile, PLAMin).


options([], _) --> [].
options([Op|Ops], File) -->
      option(Op, Ops, Rest),
      !,
      option(Rest, File).
options([File|Ops], File) -->
      option(Ops, File).

option('-Decho',   R, R) --> {assert(echo)}.
option('-eness',   R, R) --> {assert(ness)}.
option('-t', [Type|R],R, _, Type).

Performance Monitoring

A Poor Man's Profiler

The best performance monitors are those that are built into Prolog Systems. However, here is another way to measure the system resources used by a particular call.

:- op(900, fx, '$').

$ G :-
      init_measure(G),
      ( G
      ; finish_measure(G),
	 fail
      ),
      backtrack_measure(G),
      finish_measure(G).

backtrack_measure(_).
backtract_measure(G) :- 
      init_measure(G),
      fail.

init_measure(G) :-
     initial_values(G, Values),
     findall(stat(Type,Value), 
	      (statistic_type(Type,_),
	       get_statistics(Type, Value)),
	      Before),
     assert(measurement(G, Values, Before).

initial_values(G, Values) :-
     retract(measurement(G, Values,_),
     !.
initial_values(G, Values) :-
     findall(stat(Type,0),
	      statistic_type(Type,_),
	      Values).

finish_measure(G) :-
     findall(stat(Type,Value), 
	      (statistic_type(Type,_),
	       get_statistics(Type, Value)),
	      After),
     retract(measurement(G, SoFar, Before).
     combine_statistics(Before, After, SoFar, Total),
     assert(measurement(G, Total, After)).

combine_statistics([], [], [], []).
combine_statistics([A|As],[B|Bs],[P|Ps],[T|Ts]) :-
     combine_statistic(A, B, P, T),
     combine_statistics(As, Bs, Ps, Ts).

combine_statistic(stat(T,Before), stat(T,After),
		   stat(T,Prev),  stat(T, Total)) :-
      statistic_type(T, Op),
      combine(Op, Before, After, Prev, Total).



combine(add, Before, After, Prev, Total) :-
     Total is Prev + (After - Before).
combine(max, _, After, Prev, Total) :-
     compare(Op, After, Prev),
     maximum(Op, After, Prev, Total).



maximum(<, A, B, B).
maximum(=, A, _, A).
maximum(>, A, _, A).

statistic_type(runtime,            add).
statistic_type(global_stack,       max).
statistic_type(memory,             max).
statistic_type(local_stack,        max).
statistic_type(trail,              max).
statistic_type(garbage_collection, add).


get_statistic(runtime, Value) :-
      statistics(runtime, [_,Value]),
      !.
get_statistic(_, Value) :-
      statistics(runtime, [Value|_]).


program :- 
	$ work(X),
	  write(X), nl,
	$ work(Y),
	  write(Y), nl.

The Prolog Programmer Who Knew Too Much

We are going to consider how a typical Prolog programmer might spend his day. Like most Prolog programmers, ours has created a predicate which correctly specifies and performs the necessary function, and he would now like to improve its performace.

Identifying Constant Columns

An important operation in L'Express is the identification of the constant columns of a matrix. That is, any number which appears in all rows of the matrix.

constant_columns(
  [[4, 9, 23, 55, 63, 107, 239],
   [5, 9, 31, 55, 60, 73, 82, 99, 107],
   [9, 23, 55, 107, 128, 512],
   [6, 9, 13, 17, 22, 55, 63, 107 ]], CC).

CC = [ 9, 55, 107 ]

Constant Column #1

An obvious logical formulation of this problem is simply to find those columns which are members of every row.

constant1(M,Cols) :-
     findall(Col, constant(M,Col), Cols).


constant([], _).
constant([R|Rs], C) :-
    member(C, R),
    constant(Rs, C).

member(H,[H|_]).
member(H,[_|T]) :-
    member(H,T).

Constant Column #2

constant2([R|Rs], Cols) :-
    findall(X, (member(X, R),
		 column(Rs, X)), Cols).

column([], _).
column([R|Rs], I) :-
     item(I, R),
     column(Rs, I).

item(I, [H|T]) :-
     ( I =:= H -> true
     ; I > H   -> item(I, T)
     ).

Constant Column #3


constant3([R|Rs], Cols) :-
    constant3(R, Rs, Cols).

constant3([], _, []).
constant3([I|Is], Cs, Cols) :-
    column(Cs, I),
    !,
    Cols = [I|Xs],
    constant3(Is, Cs, Xs).
constant3([_|Is], Cs, Cols) :-
    constant3(Is, Cs, Cols).



column([], _).
column([R|Rs], I) :-
     item(I, R),
     column(Rs, I).

item(I, [H|T]) :-
     ( I =:= H -> true
     ; I > H   -> item(I, T)
     ).

Constant Column #4

constant4([R|Rs], Cols) :-
    constant4(Rs, R, Cols).

constant4([], Cols, Cols).
constant4([C|Cs], Ref, Cols) :-
    intersect4(C, Ref, Result),
    constant4(Cs, Result, Cols).

intersect4(_, [], [])  :- !.
intersect4([], _, []).
intersect4([C|Cs], [P|Ps], Xs) :-
    !,
    (C =:= P ->  Xs = [C|X1s],
		  intersect4(Cs,Ps,X1s)
    ; C > P  ->  intersect4([C|Cs], Ps, Xs)
    ;            intersect4(Cs,[P|Ps], Xs)
    ).

Constant Column #5

constant5([R|Rs], Cols) :-
    constant5(Rs, R, Cols).

constant5([], Cols, Cols).
constant5([C|Cs], Ref, Cols) :-
    intersect5(C, Ref, Result),
    constant5(Cs, Result, Cols).

intersect5([C|Cs], [P|Ps], Xs) :-
    !,
    (C =:= P ->  Xs = [C|X1s],
		  intersect5(Cs,Ps,X1s)
    ; C > P  ->  intersect5([C|Cs], Ps, Xs)
    ;            intersect5(Cs,[P|Ps], Xs)
    ).
intersect5(_, [], [])  :- !.
intersect5([], _, []).

Constant Column #6

constant6([R|Rs], Cols) :-
    constant6(Rs, R, Cols).

constant6([], Cols, Cols).
constant6([C|Cs], Ref, Cols) :-
    intersect6(C, Ref, Result),
    constant6(Cs, Result, Cols).

intersect6([], _, []).
intersect6([C|Cs], Ref, Result) :-
    intersect6(Ref, C, Cs, Result).

intersect6([], _, _, []).
intersect6([P|Ps], C, Cs, Xs) :-
    (C =:= P ->  Xs = [C|X1s],
		  intersect6(Cs,Ps,X1s)
    ; C > P  ->  intersect6(Ps, C, Cs, Xs)
    ;            intersect6(Cs,[P|Ps], Xs)
    ).

Profile Data

In scenerio 1, there are only a few constant columns near the left side of the matrix.

      constant1    687
      constant2    216        constant2a   170
      constant3    218        constant3a   172
      constant4    105
      constant5    108
      constant6    126

But in scenerio 2, the rightmost column in the matrix is constant, eliminating the gains of our ``clever'' algorithms.

      constant1     704
      constant2    1030      constant2a    782
      constant3    1032      constant3a    784
      constant4    1427
      constant5    1397
      constant6    1672

Constant1 is much too expensive in scenario 1, so this is probably unacceptable, but constant2 is only twice as expensive as the best version and it is second best in scenario 2.

Using compare/3

item(I, [H|T]) :-
     ( I =:= H -> true
     ; I > H   -> item(I, T)
     ).

New version of item/2 using compare/3 and indexing.

item(I, [H|T]) :-
     compare(Op, I, H),
     item(Op, I, T).

item(=, _, _).
item(>, I, T) :-
      item(I,T).

Final version of Constant Column #2a

constant2([R|Rs], Cols) :-
    findall(X, (member(X, R),
		 column(Rs, X)), Cols).

column([], _).
column([R|Rs], I) :-
     item(I, R),
     column(Rs, I).

item(I, [H|T]) :-
     compare(Op, I, H),
     item(Op, I, T).

item(=, _, _).
item(>, I, T) :-
      item(I,T).

SICStus Profiler vs. Poor Man's

	      SICStus      PoorMan (1000x)
  constant1    687             290
  constant2a   170              83
  constant3a   172              75
  constant4    105              47
  constant5    108              45
  constant6    126              39

	       SICStus    PoorMan (1000x)
  constant1     704            257
  constant2a    782            371
  constant3a    784            362   
  constant4    1427            542   
  constant5    1397            492
  constant6    1672            520

Summary

Consider rewriting code at regular intervals
Read O'Keefe
Don't be afraid to twiddle bits in Prolog
How many bits do you use...
- In Indexing
- In Data Structures
Don't complain about too many Arguments; examine them.
Integrate documentation with programming
Decompose: analyze major subcomponents for generality and reusability
Whitebox
Blackbox
Do Performance Measurement Last
Eliminate cuts and if-then-elses
Eliminate append/3 (as in non-naive reverse)
Eliminate non-determinism; think of your program as a state-machine.
Use DCGs instead of assert to add ``state'' to your program.

Peter Reintjes
Tue Sept 1 15:21:00 EDT 1994