Nero Tutorial
This section describes the Nero language as a standalone language. Nero
in this form can be used via the joe nero
tool.
Standalone Nero, as provided by joe nero
, is useful in these ways:
- To experiment with and learn Nero syntax.
- To prototype Nero rule sets for particular purposes.
- As a test bed for Nero's Datalog implementation.
Axioms and Rules
A Nero program consists of a set of axioms and rules. Axioms assert facts and rules derive new facts from known facts. These derived or inferred facts constitute the output of the program.
For example the following program asserts the parent/child relationships
between certain individuals, and gives rules for determine ancestors and their
descendants. (Comments begin with //
and continue to the end of the line,
as in Joe.)
// Parent/2 - parent, child
Parent(#anne, #bert);
Parent(#bert, #clark);
// Ancestor/2 - ancestor, descendant
Ancestor(x, y) :- Parent(x, y);
Ancestor(x, y) :- Parent(x, z), Ancestor(z, y);
The axiom Parent(#anne, #bert)
asserts the logical predicate that
#anne
is the Parent
of #bert
.
Parent
is called the relation.#anne
and#bert
are constant terms representing individuals.- A literal constant term in a Nero program must be one of the following:
- A Joe keyword, string, number,
true
,false
, ornull
.
- A Joe keyword, string, number,
- A literal constant term in a Nero program must be one of the following:
- The string
Parent/2
appearing in the comment indicates thatParent
is a relation with an arity of 2, i.e., it always takes two terms. - A form consisting of a relation and one or more terms is called an atom.
The collection of Parent
axioms is essentially equivalent to a table
with two columns in a relational database.
The rule Ancestor(x, y) :- Parent(x, y);
simply asserts that if some
individual x
is the parent of some individual y
, then x
is also
an ancestor if y
.
Ancestor
is thus the relation of a new collection of predicates.- The single atom to the left of the
:-
token is called the head of the rule. - The atom(s) to the right of the
:-
token are called the body of the rule x
andy
are called variable terms. In Nero, variable terms are just normal Joe identifiers.
Similarly, the rule Ancestor(x, y) :- Parent(x, z), Ancestor(z, y);
simply asserts that if x
is the parent of z
, AND z
is an ancestor
of y
, then x
must be an ancestor of y
.
A Nero program can contain any number of axioms and rules, making use of any number of distinct relations.
Execution
Executing a Nero program amounts to iterating over the rules, matching the body of each rule against the known facts. For each successful match, Nero infers a new fact using the rule's head. This process continues until no new facts are produced.
For the simple program we've been looking at,
// Parent/2 - parent, child
Parent(#anne, #bert);
Parent(#bert, #clark);
// Ancestor/2 - ancestor, descendant
Ancestor(x, y) :- Parent(x, y);
Ancestor(x, y) :- Parent(x, z), Ancestor(z, y);
we get the following output.
Ancestor(#anne, #bert)
Ancestor(#anne, #clark)
Ancestor(#bert, #clark)
Parent(#anne, #bert)
Parent(#bert, #clark)
The Parent
facts are inferred trivially from the axioms; and then the
Ancestor
facts are inferred from the Parent
facts.
Notice how the variables are used in the second rule:
- First, some
Parent
fact is found and the variable termsx
andz
are bound to the parent and child in that fact. - Then, some
Ancestor
fact is found in which individualz
is the ancestor;y
is then bound to the descendant. - The bound values of
x
andy
are then inserted intoAncestor(x, y)
to produce the newAncestor
fact.
Wildcards
Suppose we wanted to produce a list of all the individuals mentioned in our input database. We could use a rule like this:
Person(x) :- Parent(x, y);
Person(x) :- Parent(y, x);
That is, x
is a person if x
is either a parent or a child in a
parent/child relationship.
Notice that the x
variable does all the work in these two rules. The
y
variable matches something, but we never care what. In cases like this
we usually use wildcard terms instead of variable terms:
Person(x) :- Parent(x, _);
Person(x) :- Parent(_, x);
A wildcard term can match anything at all; syntactically, a wildcard
is simply an identifier beginning with an underscore, usually just the
bare _
. A rule's body can contain any number of wildcards, each of which
matches something different.
It's desirable to use wildcards in cases like these, especially in more complicated rules, as it explicitly says that "this term doesn't matter".
Negation
The rules we've seen so far look for matches with known facts; it's also
possible to look for an absence of matches. For example, suppose we
want to identify the individuals in the Parent
data who have no known
parent. We could add a rule like this:
Progenitor(x) :- Parent(x, y), not Parent(_, x);
This says that x
is a "progenitor" is x
is the parent of some individual
y
but there is no individual in the input data that is the parent of
x
.
This technique is called negation, and the body atom following the
not
token is said to be negated.
A negated atom can only refer to variables bound in the non-negated
atoms to its left; because a negated atom matches an absence of facts,
not a specific fact, it cannot bind new variables. Consequently, we
have to use a wildcard for the first term in not Parent(_, x)
rather
than a variable. (In practice, we'd probably use a wildcard for the
y
term in Parent(x, y)
as well.)
Using negation requires that the rule set be stratified; Nero will flag an error if it is not. This is not usually a problem in practice; see the discussion of stratification in the Technical Details section if the error rears its ugly head.
Constraints
Barring time travel, it is impossible for a person to be his or her own ancestor; if our graph of parents and children contains a cycle, that implies an error in our data. We can write a rule to detect such errors by using a constraint:
CycleError(x) :- Ancestor(x, y) where x == y;
We flag person x
as an error if x
is his own ancestor.
A constraint is a simple comparison consisting of:
- A variable term, bound in a body atom.
- A comparison operator, one of
==
,!=
,>
,>=
,<
,<=
. - A second term, either a constant term or another variable term bound in a body atom.
CycleError(x) :- Ancestor(x, x);
The rule will fire for any Ancestor
fact whose first and second terms
are the same. This shows an important principle: when matching facts,
Datalog binds a variable to a value on first appearance from left to
right. Subsequent appearances must match the bound value.