The path towards what eventually became known as Prolog starts with Alan Robinson, who around 1965 developed his resolution method (which includes unification) as a computational approach to theorem proving. The next step occurred in the early 1970s, when a group headed by Robert Kowalski in the AI Department of the University of Edinburgh was using interpreters for logic in experiments on natural language understanding. Although this work was very interesting, the programs were too inefficient to be practical. A little later, Alain Colmerauer in France built a much more efficient interpreter, and called the system Prolog (for "programming in logic"); he also introduced some features which departed significantly from pure logic, such as the "cut". In the 1980s, David Warren at SRI designed an efficient abstract machine which became the basis for all subsequent Prolog compilers. Logic programming was the focus of the Japanese Fifth Generation Project, which aimed to build highly efficient machines for logic programming languages, and use them for applications that required much greater intelligence than was then available. The fact that the Fifth Generation Project was a very expensive failure contributed to a decline in interest in logic programming for large applications. However, logic programming remains one of the best approaches for natural language processing, as well as a very interesting adventure in programming language design. Note that all these projects were AI oriented; as usual, this cultural context helps to understand the design decisions that went into Prolog, such as its reuse of the Lisp data structure for lists.
As a footnote, Alfred Horn, for whom Horn clauses are named, had nothing to do with logic programming; he was a professor of logic at UCLA who in 1951 wrote paper using the sentences that now bear his name for reasons having little to do with computer science. As a second footnote, it seems to me rather misleading to call Prolog a "logic programming" language, since it departs rather far from logic; I would rather have had it called a "relational programming" language, because it is the use and manipulation of relations that is most characteristic of its programming style.
BinProlog is the "official" version of Prolog for this class. You should read the short "getting started" Notes on BinProlog, where many important and useful facts are given. However, if you prefer to use a different version of Prolog for exercises, that is certainly permissible.
We may divide programming languages into procedural languages, and non-procedural languages, where the first includes imperative languages (where assignment is fundamental, as in C, Java, etc.) and functional languages (where functions are first class citizens, as in ML, LISP, etc.). Non-procedural languages are not just what's left over, but are supposed to have declarative flavor, i.e., to (more or less) allow programers to state a problem without stating how to solve it. Most non-procedural languages are logic languages, in which the syntax and semantics are supposed to be based on some form of logic. For example, Prolog is (partially) based on Horn clause logic, and OBJ is based on equational logic. However, SETL is based on set theory, which is not a form of logic, so it is a non-logical and non-procedural language.
The following may help you to convert from thinking in the imperative programming paradigm to that of the logic paradigm; this material was adapted by Fox Harrell from The Art of Prolog, by Leon Sterling and Ehud Shapiro.
delete
procedure which removes all occurrences
of some element from a list. ?- delete([a,b,c,b],b,X)
,
and its answer, which in this case should be X = [a,c]
.[X|Xs]
. There are two
possibilities. In the second case since the head of the list is not equal to X we want to keep it around. In this case an appropriate rule is
There is an interesting reason why it is difficult to explain cut well: whereas Prolog is supposed to be a non-procedural language, cut cannot be understood in non-procedural terms; i.e. cut is procedural, and thus is in conflict with the basic design philosophy of the language; i.e., cut is a kind of kludge, and kludges are usually difficult to explain. Also, one can do some rather surprising things with cut, which at first may seem counter-intuitive, and are easy to get wrong.
Cut is a nullary predicate, denoted !, which always succeeds when the search reaches it, but which also has a rather drastic side effect on the search. I like to think of a cut in a clause as a way for the programmer to tell Prolog something like the following:
p1(X)
and p2(X)
, which are mutually
exclusive, i.e. exactly one of them is true for each value of X
;
an example is predicates for even and odd integers. Now suppose we have the
following:
X
is even, we should check a(X)
, but there
is no need to check the second q(X)
clause, because we know it
will fail. Therefore it is safe to write the following, which is also more
efficient:
X
is even, the first literal of the first clause succeeds,
so the cut prevents backtracking on q(X)
or
even(X)
, and the second clause will not be tried.
Here is another example, which is a little more complex, and which also illustrates the Prolog search strategy:
a(X)
, the first clause will match, giving us
a tentative value 1
for X
, provided the body
b
of the clause succeeds; next, the third clause matches
b
, and raises the goal c
, which fails; because of
the cut, Prolog will not undo its commitment to the third clause for
b
, and so it will never find the fifth clause, which would have
succeeded; therefore it backtracks all the way to the top, undoes the first
clause, and tries the second, now with a tentative value 2
for
X
and a subgoal e
, which then succeeds. So in this
case, the cut causes Prolog to miss a solution. Of course, without the cut,
Prolog will find both solutions. (Note that the fifth clause would not be
needed for some other versions of Prolog, which automatically fail if there is
no clause for some literal.) Here is a link to the
BinProlog output for this example, which includes a trace.
Now let's look at a couple of stranger things that one can do with cut. The first is something that rather excited researchers when they first discovered it, a way of introducing exceptions into Horn clause rule sets. This was considered important because exceptions are common in knowledge representation, and knowledge representation was supposed to be one of the most important applications of Prolog. This example says that all birds fly, except for penguins, which are birds, but do not fly (the version of this example on page 201 of Stansifer is wrong):
fly(sparrow)
will succeed in the usual
way, but fly(penguin)
will fail due to the fourth clause, and the
cut will prevent to fifth clause from being tried, so penguins will not be
claimed to fly, even though they are birds. The query bird(X)
will find all three birds, but the query fly(X)
does not work
correctly (at least, not in BinProlog); this is due to the conflict between
the logic and procedural features.
Another odd looking use of cut is to implement so-called negation by failure, the code for which is as follows:
not
is fly(penguin)
. The above
query succeeds by showing that fly(penguin)
fails. Here's how it
works: the first clause for not
will try to make
fly(penguin)
succeed; if it did succeed, then the subsequent cut
and fail
would make that clause fail without trying the next
not
clause. However, since fly(penguin)
fails, the
second not
clause is tried; it matches, and causes
not(fly(penguin))
to succeed. On the other hand, the query
We can look at unification as solving systems of equations, using a process rather like what is called Gaussian elimination in linear algebra. Let's do some simple examples to illustrate the idea. Suppose the two terms to be unified are
s
such that
X, Y, Z
such that
f
, so this is OK. Then in order for
the two terms to be equal, the following equalities have to hold:
Y
in terms of
X
, while the second is solved in the same way that we started
this process, checking that the head functors are equal, and then noting that
for the two terms to be equal, we must have
Y = g(X)
, from the second of these
equations we get
X = Z
, which we have already
deduced. So putting everything together, we get
X
, and then the values of Y, Z
are
uniquely determined from that. In fact, the above is a substitution,
and moreover, it is the most general substitution which is a solution
to the original unification problem, in the sense that any other solution is a
substitution instance of this one. The existence of such solutions when there
are any solutions at all, is the main result about unification, first proved
by Alan Robinson.
The above procedure is essentially the unification algorithm devised by Montanari and Martelli; it is linear in the size of the problem. Notice that if the problem had been slightly different, there would have been no solution. For example, if the original terms had been
Of course, unification can also fail in simpler ways, for example, if the heads of subterms fail to match, as in
f(X, Y)
and h(Y, g(Z))
cannot be
unified because f
is unequal to h
.
The algorithm used for solving certain queries in BinProlog actually goes beyond unification, in that it introduces new variables, and searches for values for them; this is called narrowing. Examples of this can be given using the the code in pl/addn.html, such as
Difference lists are an important idiom in Prolog programming; they
are a tricky way of representing lists that allow for certain kinds of
computation on lists to go much faster, including parsing. First, the
intuition behind this idiom is to represent a list as the "difference" of two
other lists. As an analogy, the number 6
can be represented as
the difference 9 - 3
, or 13 - 7
, or more generally,
(6 + Y) - Y
for any number Y
. Similarly, we can
represent the list [1, 2, 3]
as the difference between [1,
2, 3, 4, 5]
and [4, 5]
, or between [1, 2, 3, a, b,
c]
and [a, b, c]
, or more generally between [1, 2, 3
| Y]
and Y
, for any list Y
. Symbolically, we
may write
dl
is particularly common. Thus, the above representations of
[1, 2, 3]
would be written
One very nice example of the use of this idiom is to get a very fast version of append for difference lists; the definition is also very simple.
X
and Y
with the difference of
Y
and Z
is the difference of X
and
Z
. In the analoguous arithmatic notation, we may write this as
It is common to use difference lists for parsing in Prolog, instead of using append, because difference lists can be made to do appends implicitly, which is much more efficient. This is illustrated in the linked Prolog code for parsing a simple English language grammar; and here is a link to its output. A good reference for difference lists and their use in parsing is The Art of Prolog, by Leon Sterling and Ehud Shapiro (MIT Press).
There is a small bug on page 458, on the 4th line from the bottom, "L = [a]" should be "X = [a]" (or else previous X's should be L's).