A Prolog program consists of a sequence of sentences or lists of sentences. Each sentence is a Prolog term. How terms are interpreted as sentences is defined below (see section Syntax of Sentences as Terms). Note that a term representing a sentence may be written in any of its equivalent syntactic forms. For example, the 2-ary functor `:-' could be written in standard prefix notation instead of as the usual infix operator.
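As an illustration of equivalent syntactic forms, the following sketch writes one clause with the usual infix operator and another in standard prefix notation. The predicate names human/1, mortal/1, philosopher/1 and same_sentence/0 are invented for this example and are not part of any library.

```prolog
human(socrates).

% Usual infix form of a non-unit clause.
mortal(X) :- human(X).

% A clause written in standard prefix notation: ':-' is used as an
% ordinary 2-ary functor, so this sentence is read exactly like
%     philosopher(X) :- human(X).
:-(philosopher(X), human(X)).

% Both notations denote the same term.
same_sentence :-
    (a :- b) == ':-'(a, b).
```

Once the clauses are loaded, the goals mortal(socrates), philosopher(socrates) and same_sentence should all succeed.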
Terms are written as sequences of tokens. Tokens are sequences of characters which are treated as separate symbols. Tokens include the symbols for variables, constants and functors, as well as punctuation characters such as brackets and commas.
We define below how lists of tokens are interpreted as terms (see section Syntax of Terms as Tokens). Each list of tokens which is read in (for interpretation as a term or sentence) has to be terminated by a full-stop token. Two tokens must be separated by a layout-text token if they could otherwise be interpreted as a single token. Layout-text tokens are ignored when interpreting the token list as a term, and may appear at any point in the token list.
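The role of layout-text can be seen in a small sketch. The predicate names not_not_true/0 and layout_demo/1 are hypothetical and serve only to show where layout-text is required and where it is merely permitted.

```prolog
% Without the blank, \+\+ would be read as one symbol name, so the two
% prefix operators must be separated by layout-text.
not_not_true :- \+ \+ true.

% Layout-text, including comments, may appear between any two tokens.
layout_demo( X )   /* a comment counts as layout-text */
    :-
    X = ok .    % the full-stop must itself be followed by layout-text
```

Note that no layout-text appears between the functor layout_demo and its opening bracket; that is the one place in this clause where it would change the reading.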
We also define below how tokens are represented as strings of characters (see section Syntax of Tokens as Character Strings). But we start by describing the notation used in the formal definition of Prolog syntax (see section Notation).
A syntactic rule takes the general form
C --> F1 | F2 | F3
which states that an entity of category C may take any of the alternative forms F1, F2, F3, etc.
sentence --> module : sentence
| list
{ where list is a list of sentence }
| clause
| directive
| grammar-rule
clause --> non-unit-clause | unit-clause
directive --> command | query
non-unit-clause --> head :- body
unit-clause --> head
{ where head is not otherwise a sentence }
command --> :- body
query --> ?- body
head --> module : head
| goal
{ where goal is not a variable }
body --> module : body
| body -> body ; body
| body -> body
| \+ body
| body ; body
| body , body
| goal
goal --> term
{ where term is not otherwise a body }
grammar-rule --> gr-head --> gr-body
gr-head --> module : gr-head
| gr-head , terminals
| non-terminal
{ where non-terminal is not a variable }
gr-body --> module : gr-body
| gr-body -> gr-body ; gr-body
| gr-body -> gr-body
| \+ gr-body
| gr-body ; gr-body
| gr-body , gr-body
| non-terminal
| terminals
| gr-condition
non-terminal --> term
{ where term is not otherwise a gr-body }
terminals --> list | string
gr-condition --> ! | { body }
module --> atom
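The top-level alternatives of the grammar above can be approximated by a small classifier over terms. This is a minimal sketch, not the way any Prolog system actually loads programs: sentence_kind/2 and category/2 are names invented here, and lists of sentences and module prefixes are deliberately left out.

```prolog
% sentence_kind(+Sentence, -Kind): Kind names the syntactic category
% of the term Sentence. A variable is not classified at all.
sentence_kind(Sentence, Kind) :-
    nonvar(Sentence),
    category(Sentence, Kind0),
    !,                      % commit to the first matching category
    Kind = Kind0.

% Tried in order; the last clause covers every remaining term,
% mirroring "where head is not otherwise a sentence".
category((:- _),    command).
category((?- _),    query).
category((_ --> _), grammar_rule).
category((_ :- _),  non_unit_clause).
category(_,         unit_clause).
```

For example, the goal sentence_kind((s --> np, vp), K) binds K to grammar_rule, and sentence_kind(likes(mary, wine), K) binds K to unit_clause.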
term-read-in --> subterm(1200) full-stop
subterm(N) --> term(M)
{ where M is less than or equal to N }
term(N) --> op(N,fx) subterm(N-1)
{ except in the case of a number }
{ if subterm starts with a (,
op must be followed by layout-text }
| op(N,fy) subterm(N)
{ if subterm starts with a (,
op must be followed by layout-text }
| subterm(N-1) op(N,xfx) subterm(N-1)
| subterm(N-1) op(N,xfy) subterm(N)
| subterm(N) op(N,yfx) subterm(N-1)
| subterm(N-1) op(N,xf)
| subterm(N) op(N,yf)
term(1000) --> subterm(999) , subterm(1000)
term(0) --> functor ( arguments )
{ provided there is no layout-text between
the functor and the ( }
| ( subterm(1200) )
| { subterm(1200) }
| list
| string
| constant
| variable
op(N,T) --> name
{ where name has been declared as an
operator of type T and precedence N }
arguments --> subterm(999)
| subterm(999) , arguments
list --> []
| [ listexpr ]
listexpr --> subterm(999)
| subterm(999) , listexpr
| subterm(999) | subterm(999)
constant --> atom | number
number --> unsigned-number
| sign unsigned-number
| sign inf
| sign nan
unsigned-number --> natural-number | unsigned-float
atom --> name
functor --> name
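The effect of operator types and precedences can be checked by pairing an operator-syntax term with the same term in standard functional notation. The sketch below assumes the usual built-in operator declarations (for instance - as yfx at 500 and ^ as xfy at 200); equivalent_forms/2 and all_forms_agree/0 are hypothetical names.

```prolog
% equivalent_forms(OperatorSyntax, StandardSyntax)
equivalent_forms((a :- b, c),    ':-'(a, ','(b, c))).    % ',' binds tighter than ':-'
equivalent_forms(1 - 2 - 3,      -(-(1, 2), 3)).         % yfx: groups to the left
equivalent_forms(2 ^ 3 ^ 4,      ^(2, ^(3, 4))).         % xfy: groups to the right
equivalent_forms((p -> q ; r),   ';'('->'(p, q), r)).    % '->' binds tighter than ';'
equivalent_forms(- a,            -(a)).                  % prefix operator
equivalent_forms([a, b | T],     [a | [b | T]]).         % list tail notation

% Succeeds if every pair above denotes one and the same term.
all_forms_agree :-
    \+ ( equivalent_forms(X, Y), X \== Y ).
```

The goal all_forms_agree should succeed; changing any right-hand side to a different grouping, such as -(1, -(2, 3)), makes it fail.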
By default, SICStus Prolog uses the ISO 8859/1 character set standard, but will
alternatively support the EUC (Extended UNIX Code) standard. This is
governed by the value of the environment variable SP_CTYPE
(see section Getting Started).
The character categories used below are defined as follows in the two standards:
token --> name
| natural-number
| unsigned-float
| variable
| string
| punctuation-char
| layout-text
| full-stop
name --> quoted-name
| word
| symbol
| solo-char
| [ ?layout-text ]
| { ?layout-text }
quoted-name --> ' ?quoted-item... '
quoted-item --> char { other than ' or \ }
| ''
| \ escape-sequence
word --> small-letter ?alpha...
symbol --> symbol-char...
{ except in the case of a full-stop
or where the first 2 chars are /* }
natural-number --> digit...
| base ' alpha...
{ where each alpha must be less than the base,
treating a,b,... and A,B,... as 10,11,... }
| 0 ' char-item
{ yielding the character code for char }
char-item --> char { other than \ }
| \ escape-sequence
base --> digit... { in the range [2..36] }
unsigned-float --> simple-float
| simple-float exp exponent
simple-float --> digit... . digit...
exp --> e | E
exponent --> digit... | sign digit...
sign --> - | +
variable --> underline ?alpha...
| capital-letter ?alpha...
string --> " ?string-item... "
string-item --> char { other than " or \ }
| ""
| \ escape-sequence
layout-text --> layout-text-item...
layout-text-item --> layout-char | comment
comment --> /* ?char... */
{ where ?char... must not contain */ }
| % ?char... LFD
{ where ?char... must not contain LFD }
full-stop --> .
{ the following token, if any, must be layout-text}
char --> { any character, i.e. }
layout-char
| alpha
| symbol-char
| solo-char
| punctuation-char
| quote-char
alpha --> capital-letter | small-letter | digit | underline
escape-sequence --> b { backspace, character code 8 }
| t { horizontal tab, character code 9 }
| n { newline, character code 10 }
| v { vertical tab, character code 11 }
| f { form feed, character code 12 }
| r { carriage return, character code 13 }
| e { escape, character code 27 }
| d { delete, character code 127 }
| a { alarm, character code 7 }
| x alpha alpha
{treating a,b,... and A,B,... as 10,11,... }
{ in the range [0..15], hex character code }
| digit ?digit ?digit
{ in the range [0..7], octal character code }
| ^ ? { delete, character code 127 }
| ^ capital-letter
| ^ small-letter
{ the control character alpha mod 32 }
| c ?layout-char... { ignored }
| layout-char { ignored }
| char { other than the above, represents itself }
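A few of the token forms defined above, shown as a sketch. The predicate name token_examples/0 is invented for illustration, and the example assumes a system that accepts the base'digits notation described in this section.

```prolog
token_examples :-
    0'a   =:= 97,                % 0 ' char-item: the character code of a
    2'101 =:= 5,                 % base ' alpha...: 101 read in base 2
    16'ff =:= 255,               % letters a,b,... count as 10,11,...
    1.5e2 =:= 150.0,             % simple-float exp exponent
    atom(hello_World1),          % word: small letter, then alphas
    atom('Hello world'),         % quoted-name
    atom(+++),                   % symbol: a run of symbol-chars
    atom_length('don''t', 5),    % '' in a quoted name is a single quote
    var(_Anything).              % variable: underline, then alphas
```

Each conjunct is expected to hold, so the goal token_examples should succeed.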
A backslash occurring inside integers in `0'' notation or inside quoted atoms or strings has special meaning, and indicates the start of an escape sequence. Character escaping can be turned off for compatibility with old code. The following escape sequences exist:
\^char denotes the control character whose code is char mod 32, where char is a letter.
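Some of the single-letter escape sequences can be checked against their character codes as follows. This sketch assumes character escaping is turned on and covers only a subset of the sequences; escape_code/2 and escapes_agree/0 are hypothetical names.

```prolog
% escape_code(Atom, Code): Atom is a one-character quoted atom written
% with an escape sequence; Code is the character code it stands for.
escape_code('\a', 7).     % alarm
escape_code('\b', 8).     % backspace
escape_code('\t', 9).     % horizontal tab
escape_code('\n', 10).    % newline
escape_code('\v', 11).    % vertical tab
escape_code('\f', 12).    % form feed
escape_code('\r', 13).    % carriage return
escape_code('\\', 92).    % a backslash representing itself

% Succeeds if every atom above consists of exactly the stated code.
escapes_agree :-
    \+ ( escape_code(Atom, Code), \+ atom_codes(Atom, [Code]) ).
```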
X,Y
denotes the term ','(X,Y) in standard syntax.
(X)
denotes simply the term X.
{X}
denotes the term {}(X) in standard syntax.
-3 denotes a number whereas -(3)
denotes a compound term which has the 1-ary functor - as its
principal functor.
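The notes above can be restated as goals. This is a sketch under the reading given in this section; notes_hold/0 is a name invented for the example, and the treatment of -(3) in particular may differ in systems that follow other conventions.

```prolog
notes_hold :-
    (x, y) == ','(x, y),       % X,Y denotes ','(X,Y)
    (x) == x,                  % brackets only group
    {x} == '{}'(x),            % {X} denotes {}(X)
    integer(-3),               % -3 denotes a number
    Term = -(3),
    compound(Term),            % -(3) denotes a compound term
    functor(Term, (-), 1),     % with 1-ary principal functor -
    arg(1, Term, 3).
```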