Some general utility programs include operators for manipulating the Joy stack just below the top element, further operators for manipulating aggregate values, and a few output programs. Generally useful are the stack type and the queue type. Values and operators of these two types are easily implemented as Joy lists and list operators.
Another collection of useful operators take an aggregate as parameter and produce a list of subaggregates. These operators are polymorphic in the sense that the aggregate parameter can be a (small) set, a string, or a list. For example, one such operator can take a set as parameter and produces a list of its subsets. All of these operators are definable without recursion by using the linrec combinator.
Some arithmetic operators are often used to illustrate recursive definitions, although it is well known that iterative execution is more efficient. In particular the use of accumulating parameters can often replace recursion. This is easily done in Joy using various iteration combinators.
Values of sequence types, such as strings and lists, can be sorted, and sorted sequences can be merged. Programs for doing this are easily written in Joy without recursive definitions but using appropriate combinators instead.
Joy's inbuilt set type is implemented just as bitstrings, and hence it is limited to small sets of small numbers. The more useful big set type, which allows large sets of elements of any type, can be implemented in any language which has lists. It is simple to do in Joy, and the usual set-theoretic operations are easily provided. A similar implementation can be used for the dictionary type, which uses lookup tables for finite functions.
Also useful is the tree type, of lists, or lists of lists, or lists of lists of lists ... of elements other than lists.
The remainder of this paper illustrates programming in Joy by way of simple examples. Many of the programs are first written in pseudo-code and the translated into Joy. Some of the Joy programs here have been adapted from the literature on ML or Miranda\footnote{"Miranda" is a trademark of Research Software Ltd}. The next section defines some useful general purpose operators and combinators. This is followed by a section on two little collections of operators for two datatypes, stacks and queues. A longer section deals with programs for creating and manipulating lists of subaggregates. A shorter section then illustrates the use of accumulating parameters for the efficient implementation of numeric functions. Then there is another section on sorting sequences and on merging sorted sequences. Another section gives an implementation of unrestricted sets and of lookup dictionaries.
This section describes some very simple utilities which are useful in very different settings. Joy definitions are of the form
ATOM == PROGRAMwhere the long equals == means that the atom on the left is being defined to cause the execution of the program on the right.
Joy has three important operations for manipulating the top few items of the stack: pop for removing the top item, dup for creating a copy of the top item, and swap for interchanging the top two items. Often it is necessary to perform similar operations further below the stack. The following define three similar operators which leave the top element of the stack intact and perform the work just below that. All three use the dip combinator which takes as a parameter a quoted program and below that a further item. The item is saved, the quoted program is executed and the item then restored.
popd == [pop ] dip dupd == [dup ] dip swapd == [swap] dipThe popd operator removes the second item. The dupd operator duplicates the second item. The swapd operator interchanges the second and third item.
Three similar operators affect the order of the top three items on the stack. The rollup operator places items one, two and three from the top in the order two, three, one. The rolldown operator places items one, two and three in the order three, one, two. The rotate operator places them in the order three, two, one.
rollup == swap [swap] dip rolldown == [swap] dip swap rotate == swap [swap] dip swap
The next examples are for unary operators
which expect an item on the stack
and replace it with either a set or a string or a list
containing just that item.
All three operators work by first pushing
The empty set {}
or the empty string ""
or the empty list []}
on top of the stack.
Then they use the cons operation to
add the item below into the aggregate.
The items have to be of the right type.
The unitset operator requires a small number,
the unitstring operator requires a character,
and the unitlist operator requires anything.
If the items are not of the right type,
an error occurs when the cons
is executed.
unitset == {} cons unitstring == "" cons unitlist == [] consThe action of all three is reversed by the single standard operator first.
Analogously one may define three operators pairset, pairstring and pairlist, which form a set, string or list from two appropriate items on top of the stack:
pairset == {} cons cons pairstring == "" cons cons pairlist == [] cons consThe action of all three is reversed by a single operator unpair, which may be defined in either of two equivalent ways:
unpair == uncons uncons pop unpair == uncons first
Joy has two operators applicable to non-empty sets, strings and lists: The operator first extracts the first item of a string or list, and that item from a set which is the first in the underlying order. The operator rest removes the first item and returns the remaining set, string or list. Sometimes it is necessary to extract the second or third item. Suitable definitions are these:
second == rest first third == rest rest first
The operators uncons and unswons undo what is done by
cons
and swons
,
Often it is useful to dissect not just one aggregate into
its first
and rest
,
but to dissect two aggregates.
This can be done by uncons2 and unswons2,
defined as follows:
uncons2 == [uncons ] dip uncons swapd unswons2 == [unswons] dip unswons swapdBoth expect two aggregates on top of the stack, both leave two
first
s and two rest
s on the stack.
For uncons2
the two first
s are items 3 and 4 on the stack,
the two rest
s are items 1 and 2.
For unswons2
it is the other way around.
Similarly, it is sometimes necessary to test whether at least
one of two aggregates
is empty, or whether at least one of two numeric values is equal to zero.
For a single parameter this is done by null
,
for two parameters it is done by null2,
defined in either of two equivalent ways:
null2 == [null] [true] [pop null] ifte null2 == [ [[null] true] [pop null] ] cond
Strings and lists are special kinds of aggregates, they are sequences. Sometimes it is necessary to reverse sequences. The naive way of doing this is recursively as follows:
To reverse a sequence S: If S is empty then return the empty sequence else remove the first element reverse the rest of S append the first element of S at the tailIt is easy to see that this is very inefficient because the append operation requires a lot of copying, and every element to be appended requires portions of the rest to be copied again and again. A well-known optimisation uses an extra parameter, an accumulating parameter, to obtain the same effect. The idea is to prepend the elements of the original list onto the accumulating parameter. Sometimes this is expressed by analogy with railways. The shunt operator takes two sequences as parameters and, starting at the front of the topmost parameter, moves all items onto the front of the second parameter. Joy has a combinator step for stepping through all items of its top parameter, and for each item executing a program that is given as a further parameter. That program has to take the item and add it to the accumulating parameter, so it is the swons operator. So this is how
shunt
can be defined:
shunt == [swons] stepTo reverse a list or a string, an empty list or empty string has to be supplied as an accumulating parameter just below the list or string that is to be reversed. So here are definitions for reverselist and reversestring:
reverselist == [] swap shunt reversestring == "" swap shuntBut there is something unsatisfactory about this, the reversal operation should be polymorphic. So the following version of reverse first tests whether the sequence to be reversed is a list or not, and inserts the appropriate accumulating parameter. The testing is done by the iflist combinator which takes two (here rather tiny) programs as parameters. and below that one other item, the list or string to be reversed. If the latter happens to be a list, then the first quoted program is executed, and it will push an empty list. Otherwise the second program is executed, and it will push an empty string. In either case the two top items are now
swap
ped and then shunt
ed.
reverse == [[]] [""] iflist swap shunt
It comes as a surprise that lists can be reversed in another way. The idea is this: When a list is executed by a combinator, all the members of the list will be literals, so they will each be pushed onto the stack. The last element of the list will end up on top of the stack. So the elements of the list will then be in reverse order. To make use of this idea we have to arrange that an initially empty list is treated as a stack. This is what the infra combinator does. It takes a list as one parameter and a program as the second. It uses the list as a temporary stack and executes the program. The resultant stack is then pushed as a list. For the reversal problem the program is the list to be reversed, and the other parameter has to be the empty list. That empty list first has to be inserted below the list to be reversed. So another program to reverse a list is this:
reverselist == [] swap infraWhat makes this version possible is that in Joy the principle that program = data is extended to program = data = memory. This version is actually more efficient than the one given earlier. Of course it cannot be adapted for reversing strings.
The two principal operators for explicit output are
put, which prints a single value of any type,
and putch, which prints a single stand alone character
without quote symbol.
Two useful little utility operators are worth defining.
The putchars operator uses the step combinator
to step through the characters in a list or string
and writes them to the output file without the enclosing
[]
or ""
.
The newline operator just outputs
the newline character \n
to terminate a line.
putchars == [putch] step newline == '\n put
Using the step
combinator it is easy
to define several conversion operators which can be useful.
The first two produce sets from aggregates of upper or lower case
aggregates.
The last two produce strings of upper or lower case characters
from aggregates of small numbers.
upper2set == {} swap [64 - swons] step lower2set == {} swap [96 - swons] step set2upper == "" swap [64 + swons] step set2lower == "" swap [96 + swons] step
The dip combinator expects a quotation and below that an item that will be saved before execution of the quotation and restored afterwards. Sometimes one wants to save and restore two or even three items, so it is useful to have further variants dip2 and dip3, defined as follows:
dip2 == [dip] cons dip dip3 == [dip] cons dip2Note that cons is being used to build a constructed program that is then supplied as a parameter to the last combinator.
This section and the next two contain implementations
of two simple data types.
Members of the stack type
are linear structures which allow read and write access
at one end only,
and members of the queue type
are linear structures which allow read access at one end
and write access at the other end.
Both have this much in common:
the stack or queue remains on the Joy stack,
and any stack or queue operations using it leave it there.
Of course it can be explicitly removed with pop
.
1. First, the stack type. A stack is a linear structure that can grow by having items added, inspected and removed all from the same end. The simplest way to implement stacks in Joy is as lists. One essential operator is st-new for the creation of a new empty stack, which is just an empty list. So the definition is
st-new == []
Stacks can have additional items pushed onto them,
and this is done by adding them to the front of the list.
Since the stack is already there,
the new item will typically be first pushed onto the Joy
stack, and then it is to be pushed onto the stack.
One way to do this is to swap
the item and the stack
and then perform a cons
operation.
But Joy has an operation which combines these two,
namely swons.
So the st-push operation can be defined by
st-push == swonsAn essential predicate is st-null for testing whether a stack is empty or null. However it will not do to just use
null
, since this will
remove the stack --- but typically the stack is intended for
further applications.
So, to avoid losing the stack, it has to be dup
licated first,
and then the null
test can be applied to the duplicate:
st-null == dup null
The previous two operations both make sense when the stack is empty, but this is not the case for the stack operations to follow. The first of these is st-top for extracting the top element of the stack, while leaving the stack itself unchanged. The second is st-pop for removing the top element. The third is st-pull which combines the last two, it is the opposite of push. It extracts the top item and pops the stack. Ignoring the complication of an empty stack, the definitions would simply be:
st-top == dup first st-pop == rest st-pull == unswonsTo guard against an empty stack, a test has to be performed to determine whether the stack is empty. If it is, then an error message should be given, otherwise the operation should be performed. So for all three operations the structure will be of the form
== [null] [ERROR-MESSAGE] [PERFORM OPERATION] ifteThe error messages should state which of the operations was being attempted, but otherwise they should be the same. So the name of the operation is given as a string parameter to an error handling operation. That particular operation will be called
_st-error
,
and we leave the details of its implementation till a little later.
The leading underscore _
in the name has been
added because this operation is not intended to be used
by the programmer; in the current implementation
the help command hides
identifiers with a leading underscore.
The remaining stack operations are then:
st-top == [null] ["st-top" _st-error] [dup first] ifte st-pop == [null] ["st-pop" _st-error] [rest ] ifte st-pull == [null] ["st-pull" _st-error] [unswons ] ifteAs may be seen, the three operations still have a lot in common, and one might consider extracting that further. However, the result is likely to be less clear to the human reader. It remains to implement the error operation. It should state that an error has occurred due to an empty stack, and this part is the same for all three operations. It should also state which of the operations failed. So a minimal implementation of
_st-error
would simply write one string which is common for any call,
and another string which is the specific parameter.
This is crude but very easy to implement:
_st-error == "non_empty stack needed for " put putA minor improvement is to concatenate the two strings (in the right order), so that only one string has to be written. But the double quotation marks in the output still look silly. So instead of writing the one or two strings with
put
,
it looks nicer to write their constituent characters with putchars.
Also, the line should be terminated with newline.
Finally, there is little sense in continuing the computation,
so after the two parts of the error message have been displayed,
it is best to abort, to return to the top level.
In a library implementation for the collection of definitions of stack operations might look like this:
LIBRA (* stack *) _st-error == "non-empty stack needed for " putchars putchars newline abort; st-new == []; st-push == swons; st-null == dup null; st-top == [null] ["st-top" _st-error] [dup first] ifte; st-pop == [null] ["st-pop" _st-error] [rest ] ifte; st-pull == [null] ["st-pull" _st-error] [unswons ] ifte.As may be seen from the example, a library declaration begins with the word LIBRA and terminates with a period. In between is a sequence of definitions separated by semicolons ;. A definition consists of an atomic symbol and then the symbol == followed by a Joy program. Note again that the
_st-error
operator
is not really intended to be used outside the remaining definitions.
It could well be hidden completely from outside.
A mechanism for this will be illustrated below.
2. Next, the queue type. A queue is a linear structure that can grow by having items added at one end, and inspected or removed from the other end. A simple minded implementation would consists of a single Joy list to which items are added at the end and removed from the front. But adding something at the end requires copying the entire list every time. Nothing would be gained by reversing the role of front and end, because in that case the removal requires copying of (almost) the entire list. A better implementation uses two lists. Conceptually one is the front of the queue, and items are removed at the front. Conceptually the other list is the back of the queue, but in reverse, and items are added to the front of this list. If at any time the list implementing the front of the queue becomes empty, the other list gets explicitly reversed and it becomes the front, and the empty list becomes the rear. There are two auxiliary operators that need only be visible to the remaining operators but not to the outside; in the following they are hidden in the private part of this module. Because they are hidden, there is no need to choose names which indicate the datatype on which they operate.
LIBRA (* queue *) HIDE error == "non_empty queue needed for " putchars putchars newline abort; prepare == [null] [swap reverse] [] ifte IN q-new == [] []; q-add == swap [swons] dip; q-addl == swap [shunt] dip; q-null == prepare dup null; q-front == prepare [null] ["q-front" error] [dup first] ifte; q-rem == prepare [null] ["q-rem " error] [unswons ] ifte END.
As may be seen, such a declaration consists of the word HIDE, followed by a sequence of definitions, then the word IN followed by another sequence of definitions, and then the word END. A sequence of definitions is again separated by semicolons ;. The whole declaration can occur inside a library declaration where a single definition can occur. Any hiding declaration can occur wherever a single definition can occur, so they can be nested.
The first auxiliary error reporting procedure, error
, is
similar to the one for stacks.
The second auxiliary operation, prepare
,
prepares the two lists: if the list implementing the front
happens to be empty,
the roles of the two lists are interchanged.
If the front list is not empty,
nothing is done.
A new queue is created by q-new in the form of two empty lists. An item can be added (to the "rear of the queue") by q-add which adds it to the front of the second list. The members of a whole list can be added to the rear by q-addl. The operator q-null first prepares the two lists so that the list implementing the front is not empty, if that is possible at all. It then tests the front list. The operator q-front and q-rem extract respectively a copy of the front element or the front element itself. The copy or the original are left above the two lists. Both operators require the queue to be prepared so that the list implementing the front is not empty. Also, both operators need to check whether the front really is non-empty. If it is not, the error operator is called.
The definitions for stacks and queues are part of the library file TYPLIB.JOY.
true
on top of the stack,
the then-part is executed and the combinator exits.
Otherwise the rec1-part is executed,
then the combinator calls itself with the same four parts,
and then the rec2-part is executed.
The first definition below is for an operator restlist
which takes any aggregate as parameter
and produces the list of all those subaggregates
that would be formed by repeatedly taking the rests
of the aggregate.
Such an operator can of course be defined recursively
and this could be done in any language.
But in Joy it is possible to use a non-recursive definition
using the linrec
combinator.
Here is some pseudocode:
1. If the aggregate is empty 2. then form its unitlist 3. else take a copy and the rest of that, recurse using this rest, eventually forming a list of aggregates, 4. use cons to add the original aggregate to the front of this listThe above pseudocode translates directly into a recursive form in any language, but in Joy a non-recursive definition is also possible. The four program parameters for the
linrec
combinator
correspond exactly to the labelled lines.
Nothing corresponds to the unlabelled line,
the linrec
combinator recurses here automatically.
restlist == 1. [ null ] 2. [ unitlist ] 3. [ dup rest ] 4. [ cons ] linrec
The next program also takes an aggregate as parameter
and produces a list of subaggregates.
But the subaggregates are those obtained by successively
deleting the last elements.
In analogy with the previous operator it will be called
the frontlist operator.
For empty aggregate parameters again the unitlist
has to be returned,
so the if-part and the then-part are the same as before.
Also, for non-empty aggregates the aggregate has to be taken apart
in the rec1-part.
This can be done in two ways.
We can take the front aggregate and the last element,
but that would require defining a suitable operator,
and it would require expensive copying in the case of list
or string aggregates.
Alternatively we can just uncons
.
This leaves only the rec2-part to be written.
But it will
be more complicated than for the previous operator.
Let us ignore for the moment that the operator is intended to be used
for aggregates of any of the three types.
When the anonymous recursion has completed,
the stack will contain the first item of the non-empty aggregate
and above that the frontlist
of its rest.
The first item has to be cons
ed into each member
of the frontlist,
and that is best done by
[cons] mapThen that first item, which is now still the second element on the stack, has to be deleted. This can be done by a variant of
pop
, namely popd
.
Finally, assuming that the operator is to be used for lists,
the empty list has to be added to the frontlist,
and the easiest way is by [] swap cons
,
or simply by [] swons
.
This gives the following provisional rec2-part:
[ [cons] map popd [] swons ]But the assumption that the operator is to be used only for lists is unnecessarily restrictive. The final part, adding an empty aggregate, should depend on what the initial aggregate was, a set, a string or a list. This can be achieved by looking up the first element of the frontlist, it is a one element aggregate and taking its rest produces the required empty aggregate of that type. So the required rec2-part is:
[ [cons] map popd dup first rest swons ]The entire definition for
frontlist
,
applicable to any aggregate, now is:
frontlist == [ null ] [ unitlist ] [ uncons ] [ [cons] map popd dup first rest swons ] linrec
The next program defines an operator subseqlist which is in some ways a combination of the preceding ones. Again it takes any aggregate as parameter and returns a list of subaggregates. This time the subaggregates are all those obtainable from the parameter aggregate by deleting first or last elements. For ordered aggregates, lists and strings, the resulting subaggregates will still contain elements in the same order as the parameter. It is tempting to define the operator very simply by
== frontlist [restlist] mapBut this produces not a list of subsequences but a list of list of subsequences. This list of lists could then be flattened to a single list, even if this is somewhat inefficient. However, a different solution is possible.
The if-part and the then-part are as
for restlist
and frontlist
, of course.
The rec2-part is easy, it is only necessary to concat
enate
two lists that were produced by the rec1-part.
But the rec1-part is rather complex,
and this is what it has to do:
the first element of the aggregate has to be extracted
and later it has to be cons
ed into every
subaggregate of the frontlist
of the rest
of the aggregate.
But also the rest of the aggregate
has to be made available for the linrec
combinator
to work on.
So the rec1-part must uncons
the aggregate,
and produce a second copy of the rest.
The second copy has to be kept aside by using the dip
combinator to work on the original copy.
So an intermediate draft of the rec1-part looks like this:
[ uncons dup [ ... ] dip ]The
[...]
program must take the frontlist
(of the original copy of the rest) and then cons
the first element into each of the members of the result.
We already know how to do that, and how to delete the
hidden first member.
So the rec2-part is the following:
[ uncons dup [ frontlist [cons] map popd ] dip ]The entire program now is this:
subseqlist == [ null ] [ unitlist ] [ uncons dup [frontlist [cons] map popd] dip ] [ concat ] linrecThe program uses
frontlist
,
but because the latter is defined without recursion,
it is possible to simply use the RHS of the definition
of frontlist
and insert that textually into the
definition of subseqlist
.
The frontlist
and subseqlist
operators
were adapted from recursive programs in
\AX{Thompson}{1991 p 247}{Thompson:91}.
The next program defines a unary operator powerlist which for any aggregate returns a list of all subaggregates. If the parameter aggregate has N members, the resulting list of subaggregates has 2^N members.
powerlist == [ null ] [ unitlist ] [ uncons ] [ dup swapd [cons] map popd swoncat ] linrecIt uses the
linrec
combinator and the same
if-part and then-part as the previous programs.
It also uses the same rec1-part as the frontlist
program: before recursing,
the parameter is split into its first and its rest by uncons
.
The recursion then produces the powerlist of the rest.
The rec2-part then uses that result to produce
two copies, using dup
.
One copy is left untouched,
the other has the original first element
cons
ed before every sublist using map
.
For this to work, the original first element
has to be placed in the right position by swapd
and eventually deleted by popd
.
The two resultant lists are finally combined with
swoncat
.
This produces the list of subaggregates starting with the empty one;
if concat
is used instead of swoncat
,
the list ends with the empty one.
Neither method yields the subsequences sorted according to size
;
but see the later section on sorted sequences.
The next program defines a binary operator insertlist. This operator expects a list or a string as a parameter and below that a potential list member or a character. It returns a list whose elements are either lists or strings, each with the potential member or character inserted once at all possible positions. So if the original list or string has N members, the resultant list has N+1 lists or strings as members.
insertlist == (* Item Sequence -> List(Sequence) *) cons [ small ] [ unitlist ] [ dup (* keep original *) unswons [uncons] dip swons ] (* take out second *) [ swap [swons] cons map (* swons in second *) cons ] (* cons in original *) linrecThe operator can also be used, with a set parameter instead of a string or list. Then it produces a list of N+1 identical sets, each with the new member added. But such a use will rarely be wanted.
The permlist operator expects a sequence and returns the list of all permutations of that sequence. So if the original sequence has N members, the result list has factorial(N) members.
permlist == [ small ] [ unitlist ] [ uncons ] [ swap [swap insertlist] cons map flatten ] linrecNote again that in the two preceding programs the
map
combinator
uses a constructed program.
The zip operator expects two aggregate parameters,
not necessarily of the same type, and not necessarily of the same size
.
It produces a list of two element lists by
combining corresponding elements from the two aggregates.
The result list contains as many pairs as the smaller of the two
parameter aggregates.
Here is the definition:
zip == [ null2 ] [ pop pop [] ] [ uncons2 ] [ [pairlist] dip cons ] linrecThis might be paraphrased as: If one or the other of the two parameter aggregates is
null
,
then pop
them both and return the empty list []
,
otherwise take out the two first
elements
and the two rest
s,
recurse with the two rest
s
producing a result list of two element lists of that,
then dip
below that result list to combine the two previous first
s
with pairlist to form a two element list,
and cons
that into the front of the result list.
Related to this is the more general zipwith combinator,
adapted from
\AX{Bird and Wadler}{1988 p 57}{Bird:88}.
It takes three parameters,
two aggregates and one quotation which can be used
to combine members of the aggregates.
The program again uses the linrec
combinator
which needs four quoted programs as parameters.
The fourth quotation now has to be a constructed program,
it is built from the program parameter of zipwith
and the program stub [dip cons]
already seen
for zip
.
The other three program parameters for linrec
first have to be dip
ped below the parameter of zipwith
.
The definition for this combinator thus is:
zipwith == 1. [ [ null2 ] 2. [ pop pop [] ] 3. [ uncons2 ] ] dip 4. [ dip cons ] cons linrec
A list of sequences can be concatenated into a single sequence
by the unary operator flatten.
The code is straightforward:
if the parameter list is empty,
then there is nothing to concatenate, leave it as it is.
Otherwise use uncons
to take out its first and its rest,
recurse anonymously on the rest to produce the flatten
ed
result of that,
and finally concat
enate the saved first part
to the front of the last result.
flatten == [null] [] [uncons] [concat] linrec
A two dimensional matrix can be implemented as a list of lists. One important matrix operation is the interchange of rows and columns, performed by the unary operator transpose. A draft of the program is the following:
To transpose a list of lists LL : 1 If LL is empty or some sublist of LL is empty 2 then pop it off and return the empty list [] else (all sublists are non-empty) 3a construct a list of all the first s of sublists 3b construct a list of all the rest s of sublists recurse anonymously on the list of rests 4 cons the list of firsts into the result.This version has been adapted from \AX{Reade}{1989 p 133}{Reade:89}. To test the disjunction whether LL is empty or some of its sublists are empty, we use the conditional combinator ifte. The if-part has to test whether LL is empty, and if it is, then the then-part has to return
true
.
Otherwise the then-part will have to determine whether some
sublists of LL are empty.
This is best done with the combinator some.
So part 1 of the Joy version is:
1 [null] [true] [[null] some] ifteFor parts 3a and 3b it is necessary to use the parameter LL to produce two lists, of the
first
s and the rest
s.
Either of these two can be obtained with the map combinator.
To obtain the two lists, the cleave combinator
can be used, it takes two quotation parameters and a further
parameter, and produces two values,
one from each of the two quotations.
So parts 3a and 3b are just:
3a [ [first] map ] 3b [ [rest ] map ] cleaveThe entire Joy program thus is:
transpose == 1 [ [null] [true] [[null] some] ifte ] 2 [ pop [] ] 3 [ [[first] map] [[rest] map] cleave ] 4 [ cons ] linrec;
Alternatively, line 1 can be replaced by the following:
1 [ [null] [[null] some] disjoin i ]Here the two tests are first disjoined to form a single test predicate. This is the called by the
i
combinator.
The net effect is exactly the same as in the version given earlier.
A common binary operation on aggregates
is that of forming the Cartesian product.
It will take two aggregates as parameters
and produce the list of two element list
which each contain one element from each of the two
aggregates.
If the two aggregates have M and N members respectively,
then the resultant list has M \times N elements.
In order to form the Cartesian product,
it is necessary to consider each of the members of one aggregate
with each of the members of the other aggregate.
This is like step
ping through an aggregate,
except that there are two aggregate to be stepped through.
It will be useful first to define a combinator step2
which does just that,
leaving at each step two items on top of the stack.
Then the Cartesian product operator will just
form their pairs, and then form a list of all these pairs.
An application of the step2
combinator will look like this:
A1 A2 [P] step2The implementation is based on the simple idea that
A2
and [P]
be used to construct a program
which is then used by the ordinary step
combinator
to step through the elements of A1
:
A1 [ .. A2 .. [P] .. ] stepWe now fill in the dots. The program to be constructed has to
step
through
the members of A2
using a program which depends on [P]
.
It has to do this for each member of A1
,
and when it has done that the current member of A1
can be pop
ped off.
So the program will have to look like this:
A1 [ A2 [ .. P ] step pop ] stepSince
P
must be allowed to consume a current member of
A1
but this still has to be available
for the next inner step
,
that member of A1
first has do be duplicated,
below the current member of A2
.
So the dots are just [dup] dip
.
In sum, the required program is
A1 [ A2 [ [dup] dip P ] step pop ] stepThe definition of step2 must construct that program from
A2
and [P]
and then call step
with just two parameters, the program just constructed
and below that the other aggregate A1
.
The definition looks like this:
step2 == [[dup] dip] swoncat (* form inner quote *) [step pop] cons (* form outer quote *) cons (* insert A2 *) step (* through A1 *)
It is now relatively easy to define the Cartesian product operator
as follows.
First we need to insert an accumulating parameter,
an empty list []
. It has to be inserted below
the two aggregates of which the Cartesian product is to be computed.
This is easily done with [] rollup
.
The program which is then used by the step2
combinator
has to form the pairlist
of the two items on top of the stack.
The resultant pair has to be inserted into the accumulator with
swons
.
But between the accumulator and the just formed pair
is the current original member of the first aggregate
which must be left intact.
So the pair and the member have to be swap
ped,
and the swons
has to be done below the member,
by dip
.
This is the program that is given as the parameter for step2
.
So the definition for the cartproduct operator is
cartproduct == [] rollup [pairlist swap [swons] dip] step2The program works for aggregates of any type, and the two aggregates do not have to be of the same type. If both are lists of numbers or sets of small numbers, other variations are possible: By changing the pairing operator
pairlist
to, say,
multiplication *
,
we obtain a program which produces a list of all products.
By further changing the accumulator []
to
the number 0
and the insertion operation swons
to, say, addition +
,
we obtain a program which produces the sum of all M \times N products.
For two numeric aggregates of the same size, say N,
another binary operator can be defined, the scalarproduct:
scalarproduct == 0 rollup (* accumulator *) [ null2 ] [ popd ] [ uncons2 [* +] dip2 ] tailrecIt produces the sum of all N products of pairs taken from corresponding positions in the two aggregates.
This section gives efficient implementation of several well-known functions which are often used in the literature for demonstration purposes: the factorial, the Fibonacci-function, exponentiation and the greatest common divisor. All of them are often defined recursively, and of course they can be defined recursively in Joy. Using one of several suitable recursion combinators they can be computed recursively in Joy without a recursive definition. But recursive execution in any language can be inefficient.
There are well known techniques for removing linear recursion, see for example \AX{Bauer and W\"ossner}{1982 Chapter 4}{Bauer-Woessner:82}. The same topic is discussed in \AX{Henson}{1987 Chapter 4}{Henson:87} using the useful concept of eureka definitions due to Burstall and Darlington. These involve creative steps in the production of more efficient versions of programs, and hence would be difficult to perform by an optimising program.
Several of the functions to be defined require a little program to be executed a number of times. A useful combinator for this is times. It requires the program to be repeated as the top element of the stack and the required number of repetitions to be the second element on the stack.
The factorial of a number n is simply
the product of n factors from 1 to n.
To compute it using times
,
a small program has to be pushed on top
of the number n which is the parameter.
The number itself will be consumed by times
.
The program works on two other numbers on the stack.
One of these is the accumulating parameter,
it has to start at 1.
The other is the next factor to be used by
the program with which to multiply the accumulator.
The multiplication has to be done without losing
the factor, so it has to be duplicated first.
Apart from doing the multiplication,
the program also has to increment the factor using
the successor operator succ.
The program which is the parameter to times
thus is
[dup [*] dip succ]Before the
times
combinator can get to work
on the parameter n and the quoted program,
the accumulator and the first factor
have to be placed in position, below the parameter n.
Both begin with the value 1,
so the rolldown
operator can be used
to push these two values below n.
Finally, after times
has completed,
the stack will contain the required accumulator value
but also on top of that the next factor.
The latter is simply pop
ped off.
The entire definition of fact is:
fact == 1 1 rolldown [dup [*] dip succ] times pop
The Fibonacci function can be computed in a similar way.
Again there is a certain computation that has to be repeated
a number of times as given by the parameter n.
Again the computation involves two further numbers,
the larger one is to be replaced by their sum,
and the smaller one is to be replaced by the former larger one.
Adding the two must not destroy the original larger number,
so again it has to be dup
licated.
The addition is then performed under the control of dip
.
Then the two numbers are swap
ped.
This describes the little program that serves as the parameter
to the times
combinator.
Before it can start, the two initial values 0 and 1
have to be placed below the parameter n with rolldown
.
When it has completed, the required accumulated sum
is the second element,
and the top element, the useless next summand, is pop
ped.
So this is the definition of fib:
fib == 0 1 rolldown [dup [+] dip swap] times popThis version of the Fibonacci function requires a computation time which is a linear function of the parameter n.
The recursive version of the Fibonacci function requires quadratic computation time. Since the result values are not very large, it is often used as a test program. What is of interest is the number of recursive calls made during the computation, to be divided by the total time it took. To obtain the number of recursive calls it is often convenient to use a variant of the Fibonacci function, sometimes called nfib. It has the property that the value returned is the same as the number of calls made during recursive execution. The following are recursive definitions of Fibonacci and its variant:
r-fib == [small] [] [pred dup pred [r-fib ] app2 + ] ifte r-nfib == [small] [pop 1] [pred dup pred [r-nfib] app2 + succ] ifteThese are recursive definitions which of course are intended to run in quadratic time. The following is a definition of nfib which uses accumulators to run in linear time. Of course it does not measure its own runtime, it is included here to illustrate a programming technique.
nfib == 1 1 rolldown [dup [+ succ] dip swap] times pop
The next two programs are for binary operators
which compute functions of two parameters:
the exponentiation function
and the greatest common divisor.
Exponentiation can be computed by performing
a certain operation as many times as given by
the exponent.
This description again suggests using the times
combinator to execute a quoted program several times.
The operation to be repeated consists
in multiplying an accumulator by the base
which is the second parameter.
So it is necessary to construct
a little program from the base which
for every call will multiply by the base.
Assuming that the base is in the right position on the stack,
the program is easily constructed,
by
[*] consBefore the constructed program can be handed to
times
,
the initial value 1 has to be placed as an accumulator
below the two numbers which are the two parameters,
the base and the exponent.
This is done by 1 rollup
.
To get the two parameters into the order appropriate
for cons
it is necessary to perform a rotate
first.
So here is the exp operator:
exp == 1 rotate [*] cons timesThe technique of first constructing a program (here by
cons
)
and then supplying it to a combinator (here times
)
is very useful in Joy.
The next program computes the greatest common divisor
of two numbers,
using Euclid's algorithm.
The algorithm uses two numbers
and repeatedly takes the remainder after
dividing one by the other.
The remainder obtained is then used to replace
the dividend.
The process is repeated as long as
the potential divisor is positive.
So, unlike the previous programs, we cannot use the times
combinator.
Instead a combinator called while
is used
which resembles while-loops in imperative languages.
It takes two parameters:
the while-part is a quoted program which must return a truth value,
and the do-part is a quoted program which can compute anything.
The while-part in the following gcd program
is of course very similar to a corresponding part
in the fib
program.
gcd == [0 >] [dup [rem] dip swap] while pop
Two other arithmetic functions
that are sometimes useful are for
computing the sum or the product
of a set or a list of numbers.
Both are best implemented by stepping through
all members of the set or list,
doing additions or multiplications with an accumulator every time.
The initial accumulator value, 0
or 1
,
is first pushed onto the stack below the parameter set or list.
For comparison, the third line below gives a definition
of the size operator which is applicable to any aggregate.
The fourth line below gives a definition of a similar operator
size2 for determining the total number
of elements in a list of aggregates.
If these aggregates are themselves lists,
then their members are counted but not the members of their sublists.
sum == 0 swap [+ ] step product == 1 swap [* ] step size == 0 swap [pop succ] step size2 == 0 swap [size + ] stepA generalisation of
size2
for counting the leaves
in recursive lists or trees is treesize,
defined later.
The sequence types of Joy are the string type and the list type. Values of these types can be ordered. Strings contain just characters, but lists may contain anything. So for lists it only makes sense to speak of ordering if the elements are characters or integers or something else that has an ordering defined on it.
An informal description of the quicksort algorithm is this:
To sort a sequence S : 1 If S is small (has only 0 or 1 element) 2 then it is sorted already, leave it alone else (S has at least one element) 3 using the first of S as a "pivot" for comparison, split the rest of S into two portions - those that are less than the pivot and those that are not separately sort both portions P1 and P2 4 concatenate the now sorted lesser portion, the pivot, and the sorted other portion.The following is a definitions of an operator qsort which uses the above algorithm. But instead of using explicit binary recursion it uses the binrec combinator. This is like the
linrec
combinator
except that it recurses twice,
once each on the top two element of the stack.
The recursions again occur between the rec1-part
and the rec2-part.
The program also uses another combinator split
which takes as parameter an aggregate
and above that a quoted program which must return a truth value.
The split
combinator returns two aggregates,
containing those elements for which the test yields false
and those for which it yields true
.
The split
combinator has access
to the remainder of the stack which in this case contains
the pivot.
So the test determines whether the pivot is >
than the element being examined.
qsort == 1 [ small ] 2 [ ] 3 [ uncons [>] split ] 4 [ swap23 cons concat ] binrec
Sometimes it is required to sort a list
of aggregates on the basis of their first elements.
In that case it is necessary to supply
to the comparison operator >
not the pivot
and the element to be apportioned by split
,
but their first elements instead.
This is conveniently done by the app2 combinator
which applies a quoted program to two elements
on top of the stack and replaces them by whatever
values the programs return.
qsort1 == 1 [ small ] 2 [ ] 3 [ uncons [[first] app2 > ] split ] 4 [ swap23 cons concat ] binrec
Note that in part 3 when the first element of the pivot
has to be compared with the first element of the
aggregate to be apportioned,
the first element of the pivot is being extracted every time.
It would perhaps be more efficient if the first element of the
pivot is extracted just once,
as soon as the pivot is available.
In that case it is necessary to take the pivot apart with
unswons
, but this has to be done by dip
ping
below the rest of the list still to be sorted.
Then the quotation parameter to split
just needs to take out the first
of the current aggregate
and compare it with the first of the pivot.
After split
has done its job,
the pivot has to be re-assembled by swons
,
but this now has to be done below the two
portions with dip2.
So part 3 can be replaced by
3 [ uncons [unswons] dip [first >] split [swons] dip2 ]
Sometimes it might be necessary to sort a list of items
on the basis not of their first element
but on their size or their second or third element
or even the size of the second of the third element.
For the last example it would only be necessary
to use [third second size]
instead of [first]
in the qsort1
program.
But it would be impossible to anticipate
all alternative sorting bases for a library,
and it would be awkward to have to write
the appropriate sorting program on every special occasion.
It is possible to write a general quicksort program
which takes as an additional parameter
something like
[first]
or [third second size]
.
The mk-qsort combinator does just that:
mk_qsort == [ [small] [] ] dip [ app2 >] cons [split] cons [uncons] swoncat [ swap23 cons concat ] binrecIt begins in line 1 by inserting the standard if-part and then-part below its parameters. In line 2 it uses the parameter to build a constructed program, the required rec1-part. Then in line 3 it pushes the standard rec2-part. At this point the top five elements of the stack are the list to be sorted and above that the four program parts needed for
binrec
.
The latter now executes.
For example the program
[third second size] mk-qsortwill sort a list of lists of three or more elements whose third member are aggregates of two or more elements. It will sort according to the size of the second of the third element.
The binary operator insert takes a sorted sequence and a potential new member as parameters, it returns a new sequence with the additional member inserted in the appropriate position. Here is a draft program:
To insert an item into a sorted sequence : 1 If the sequence is empty or its first element is >= than the item 2 then add the new item in the front of the sequence 3 else set aside the first item of the sequence recurse with the rest of the sequence and the new item 4 add the previously set aside first item to the frontThe disjunction in line 1 is best handled by the disjoin operator on programs. It expects two quoted programs which return a truth value, and it returns a single quoted program which computes their disjunction. So line 1 consists of two quoted programs one of them tests whether the sequence is empty, the other tests whether its first element is
>=
than the item
to be inserted.
The disjoin
operator then produces their disjunction.
The resulting program is the if-part for the linrec
combinator.
The other three parts are now quite obvious.
So the definition in Joy is:
insert == 1 [ pop null ] [ [first] dip ] disjoin 2 [ swons ] 3 [ [uncons] dip ] 4 [ cons ] linrec
Two sorted sequences can be merged into a single sequence which respects the original ordering. Here is a very informal algorithm for a recursive version:
To merge two sorted sequences : If the first sequence is empty, then throw it away and return the second sequence. If the second sequence is empty, then throw it away and return the first sequence. (Both sequences are non-empty, so both have a first element:) If the first of the first sequence is less than the first of the second, then set the lesser element aside, recurse using the rest of the first sequence, prepend the previously set aside element. If the first of the first sequence is greater than the first of the second, then set the lesser element aside, recurse using the rest of the second sequence, prepend the previously set aside element. (The two first elements of the sequences are equal:) Default set both first elements aside, recurse using the rests of both sequences, prepend the two previously set aside elements.
Like just about all programming languages,
Joy has an if-then-else construct (ifte
)
for two-way branching.
Multiway branching can be achieved by
nested ifte
s, but this can become difficult to read.
Joy has another combinator for multi-way branching
borrowed from Lisp.
The combinator cond
expects one parameter which is a list of cases.
The last case is the default case,
the other cases each consist of a condition or if-part
and a program or then-part
The condition is a quoted program
in front of the program.
Execution of the cond
combinator
tests successive conditions,
and for the first condition that yields
true
the associated program is executed.
If none of the conditions is true,
the default case is executed.
The informal algorithm given earlier
now translates into the following recursive definition
of r-merge:
r-merge == [ [ [null] pop] [ [pop null] swap pop] [ [unswons2 <] [uncons] dip r-merge cons] [ [unswons2 >] uncons swap23 r-merge cons] [ uncons2 r-merge cons cons] ] cond
As may be seen from the earlier informal version
and the above Joy version,
for each case the program recurses at most once.
Therefore the program has the pattern of linear recursion.
However, because there are three cases in which
recursion occurs,
it is not possible to use the linrec
combinator.
However, Joy has a combinator condlinrec
which has features of cond
and linrec
.
The combinator condlinrec
also expects
one parameter which is a list of cases.
Again the last case is the default case,
and the other cases consist of a list of two or three quoted programs.
If there are just two parts,
then they are called the if-part and the then-part.
Their meaning is as for cond
.
If there are three parts,
then they are called the if-part, the rec1-part and the rec2-part.
In that case linear recursion occurs between execution
of the rec1-part and the rec2-part.
The following is a non-recursive definition
of merge:
merge == [ [ [null] [pop] ] [ [pop null] [swap pop] ] [ [unswons2 <] [[uncons] dip] [cons] ] [ [unswons2 >] [uncons swap23] [cons] ] [ [uncons2] [cons cons] ] ] condlinrec;
Sometimes it is necessary to merge two lists of aggregates
on the basis of their first elements.
In that case the comparisons <
and >
should not be applied to the elements of the sequences
but to their first members.
A simple solution is to replace the two comparisons
respectively by the following two:
[first] app2 < [first] app2 >So the definition of the merge1 operator could be
merge1 == [ [ [null] [pop] ] [ [pop null] [swap pop] ] [ [unswons2 [first] app2 <] [[uncons] dip] [cons] ] [ [unswons2 [first] app2 >] [uncons swap23] [cons] ] [ [uncons2] [cons cons] ] ] condlinrecThe definition of
merge
(and especially merge1
) could be
optimised so that the unswons
(and the first
)
is not done repeatedly for each comparison.
As the definitions stand, they are easy to understand
and work correctly.
Computer words are short bit-sequences and a common size is 32. These can be used to implement small sets of small numbers 0..31, with a few common set operations implemented in hardware. Joy uses this in its set type. But often it is necessary to have either much larger sets or sets of larger elements. Such a big set type can be implemented in various ways: as unordered lists, as ordered lists, as unbalanced trees or as balanced trees. Each implementation method has its advantages and disadvantages. The following implementation of big sets in terms of ordered lists has been adapted from \AX{Bird and Wadler}{1988 p 230 ff}{Bird:88}.
The empty set is represented as an empty list, in this library it is written as bs-new.
LIBRA (* big sets *) bs-new == [];
One very important binary set operation is union. The two parameters are sorted lists, and the returned value also has to be a sorted list. It would appear that the two lists should be simply merged. But if they have an element in common, then the returned list would then contain the element twice. However, in sets any element should occur at most once. This consideration affects the default case, the last case of the program list which is the parameter. The case occurs when the first elements of the two parameter lists are equal. So in the definition of bs-union instead of saving and later restoring both, only one is saved and later restored.
bs-union == [ [ [null] [pop] ] [ [pop null] [swap pop] ] [ [unswons2 <] [[uncons] dip] [cons] ] [ [unswons2 >] [uncons swap23] [cons] ] [ [rest [uncons] dip] [cons] ] ] condlinrec;
The same situation arises for inserting or adding a new member to a set. If the new member is already in the set, then it should not be inserted again. So if the first member of the current list is equal to the candidate new member, then the candidate is just popped off in the third line below. In the definition of bs-insert the only recursion occurs in the last, the default case.
bs-insert == [ [ [pop null] [swons] ] [ [[first] dip >] [swons] ] [ [[first] dip =] [pop] ] [ [[uncons] dip] [cons] ] ] condlinrec;
The next operator tests for membership,
so it must return a truth value.
If the list is null or its first element
is >
than the candidate,
then false
is returned.
If the first element is =
to the candidate,
then true
is returned.
In the default case, when the relation is <
,
the list has to be inspected recursively further down.
So the case must contain two programs to effect
the recursion.
However, on the way back from the recursion,
the last returned truth value is the one to be used.
Hence no further action is required,
and the second program is just []
.
This is the definition of bs-member:
bs-member == [ [ [pop null] [pop2 false] ] [ [[first] dip >] [pop2 false] ] [ [[first] dip =] [pop2 true] ] [ [[rest] dip] [] ] ] condlinrec;
The same device is used in the default case of the definition of bs-differ
for finding the difference between two sets.
As may be seen, there are two further recursive cases,
for <
and >
, and one of them uses the same device again.
bs-differ == [ [ [null] [pop]] [ [pop null] [pop pop []] ] [ [unswons2 <] [[uncons] dip] [cons] ] [ [unswons2 >] [rest] [] ] [ [[rest] dip rest] [] ] ] condlinrec;
The next definition is for bs-delete, it deletes a specified member from a set, if it is a member at all. The only recursive case is the default case.
bs-delete == [ [ [pop null] [pop] ] [ [[first] dip >] [pop] ] [ [[first] dip =] [pop rest] ] [ [[uncons] dip] [cons] ] ] condlinrec.
The operations of inserting or deleting members into or from a set are essentially special cases of taking unions or differences with unitsets. So the following definitions might have been given instead of the earlier, more efficient definitions:
bs-insert == unitlist bs-union; bs-delete == unitlist bs-differ;
A dictionary is a way of implementing finite functions
as argument-value pairs.
A pair is best implemented in Joy as a two element list.
The totality of pairs is then essentially a big set,
and any of the ways of implementing these is suitable here.
If the argument part of pairs is subject to an ordering relation,
the sets of pairs can be implemented as
lists ordered in accordance with the first element,
the argument of the pairs.
Not surprisingly then,
some of the code to follow is reminiscent
of code for qsort1
and merge1
.
The following is a library for the dictionary type.
A new dictionary is created by d-new.
A predicate d-null returns true
or false
according as the parameter dictionary is empty or not.
New pairs are added by d-add,
they are inserted in the correct place based on the ordering
of the first member of the pairs.
The union or difference of two dictionaries
is given by the two binary operators d-union and d-differ.
A single pair is removed by the binary operator d-rem,
it removes the pair whose first member matches the given query
parameter.
Instead of a test for membership there is a binary operator
d-look which extracts the first pair whose first element
matches the query.
Only the program for one of the operators will be developed here, the program for d-union:
To form the union of two dictionaries D1 and D2: 1 If D2 is empty, pop it off and return just D1 2 If D1 is empty, retain D2, pop D1 and return D2 3 Extract the first pairs from D1 and D2, from both pairs compare their firsts with < If the comparison is true, below D2, uncons D1 into its first and rest recurse anonymously on the rest of D1 and D2 cons the saved first pair from D1 into the result 4 Extract the first pairs from D1 and D2, from both pairs compare their firsts with > If the comparison is true, uncons D2, put its first below D2 recurse anonymously on D1 and the rest of D2 cons the saved first pair from D2 into the result Default (the firsts of the first pairs of D1 and D2 are =): uncons both D1 and D2 into their first and rest, recurse on the two rests to form their union, cons the two saved firsts into the result.In the default case both first pairs are retained, so that if one is deleted, the other one, which may well have a different second component, is still available.
As may be seen, the d-union operator is
very similar to the bs-union
operator.
The other three operators d-differ, d-look and d-rem
are similar to their counterparts for big sets.
The entire library is the following:
LIBRA (* dictionary *) d_new == []; d_null == null; d_add == [ [ [pop null] [swons] ] [ [[first] dip [first] app2 >=] [swons] ] [ [[uncons] dip] [cons] ] ] condlinrec; d_union == [ [ [null] [pop] ] [ [pop null] [popd] ] [ [unswons2 [first] app2 <] [[uncons] dip] [cons] ] [ [unswons2 [first] app2 >] [uncons swap23] [cons] ] [ [uncons2] [cons cons] ] ] condlinrec; d_differ == [ [ [null] [pop]] [ [pop null] [pop pop []] ] [ [unswons2 [first] app2 <] [[uncons] dip] [cons] ] [ [unswons2 [first] app2 >] [rest] [] ] [ [[rest] dip rest] [] ] ] condlinrec; d_look == [dup] dip [ [ [pop null] [pop pop "not found"] ] [ [[first first] dip >] [pop pop "not found"] ] [ [[first first] dip =] [pop first] ] [ [[rest] dip] [] ] ] condlinrec; d_rem == [ [ [pop null] [pop] ] [ [[first first] dip >] [pop] ] [ [[first first] dip =] [pop rest] ] [ [[uncons] dip] [cons] ] ] condlinrec.The definitions of big sets and dictionaries are part of the library file TYPLIB.JOY.
Apart from the aggregate types it is useful to have another type, the tree type. These are lists which can contain lists as members which might contain lists as members and so on. Formally define a leaf to be anything which is not a list. Then a tree is defined to be either a leaf or a list of trees. Sometimes one needs the concept of a proper tree -- this is just a list of trees. Trees are similar to other aggregates, but since the tree datatype is recursive, a special treatment is generally needed.
Just as there is the step
combinator to step
through the elements of an aggregate,
so there is a treestep combinator to step
through the leaves of a tree.
For example,
the following are the already familiar program
for computing the sum of the numbers in an aggregate
and a similar program for computing the sum
of the numbers in a tree:
sum == 0 swap [+] step treesum == 0 swap [+] treestepIn the same way, the following are a familiar program and a new one for determining the
size
of an aggregate
and the treesize of a tree:
size == 0 swap [pop succ] step treesize == 0 swap [pop succ] treestepSimilarly, the following are a familiar program and a new one for shunting members of an aggregate or a tree, respectively, into an initially empty list:
shunt == [swons] step treeshunt == [swons] treestepFor the binary operator treeshunt the all leaves will appear in the result list, but in reverse order.
A tree may be flattened completely, losing its entire internal structure but retaining the order of the leaves by the unary operator treeflatten:
treeflatten == [] swap treeshunt reverse
From a given tree we can obtain the reverse list of its leaves by
[] swap treeshuntBut this may not be what is wanted. To reverse the tree while retaining its structure it is necessary to reverse the top level list, reverse the second level lists, reverse the third level lists and so on. For tasks such as this Joy has a ternary combinator treegenrec for general recursion through trees. It is used like this:
[O1] [O2] [C] treerecgenHere
[O1]
must be a program applicable to leaves,
[O2]
must be an operator applicable to lists,
and [C]
must be a combinator
applicable to lists with operators such as [O2]
.
Different choices of the three quotation parameters
yield surprisingly different operators for trees
or combinators applicable to trees.
Using this combinator the unary treereverse operator
is defined by
treereverse == [] [reverse] [map] treegenrec
The same treegenrec
combinator can be used to define
a unary combinator treemap which takes a tree and quoted program
as parameters and returns a tree of the same structure
but with each leaf as modified by the program parameter.
treemap == [] [map] treegenrec
The same combinator can be used to define a unary combinator treefilter which expects a tree and a quoted predicate. What is returned is a tree of the same structure but with only those leaves which pass the test predicate.
treefilter == [] swap orlistfilter [map] treegenrecThe first portion,
[] swap
just inserts the required [O1]
which in this case
does nothing.
Following that is a modification of the test predicate,
to be explained presently.
The rest of the definition is familiar.
treefilter == [] swap orlistfilter [map] treegenrecThe
[O2]
operator to be used here is constructed
from the test predicate [P]
by orlistfilter,
which constructs
[ [ [list] [P] disjoin ] filter ]The
orlistfilter
is defined in two steps:
orlist == [list] swap disjoin orlistfilter == orlist [filter] consAn operator to remove all leaves from a tree, but retaining its list structure is treestrip, defined as follows:
treestrip == [list] treefilter
Trees cannot have lists as leaves, but otherwise they are very flexible. In particular they can be used as queues. The following is a small collection of operations for manipulating trees when the focus is only on their leaves. A new empty tree is generated by t-new. A new leaf or a whole tree of leaves is added to an existing tree by the operator t-add; it always ensures that the tree is of a form suitable for the remaining operators. The tree predicate t-null tests whether the tree is empty. It first has to prepare the tree by ensuring that it does not consist of lists of lists and so on which ultimately only contain the empty list. Since this is also required by two other operators, the preparing is done by a hidden unary operator. Two other operator t-front and t-rem produce, respectively, the first leaf together with the remainder of the tree, or just the remainder of the tree after removing the first leaf. Both operators first have to check that the tree is non-empty; if it is, then an error is reported. A leaf or proper tree can be turned into a suitable form by t-reset.
The implementation is as follows.
A proper tree is always a list,
and an empty tree starts off by t-new as an empty list.
Anything can be added by t-add to an existing tree,
and this has to ensure that the result has a suitable standard form.
The same is true for t-reset which firstmakes a copy of an existing tree.
The other operators, t-null, t-front and t-rem
all require the tree to be in a suitable standard form.
This is done by prepare
which is defined using condlinrec.
If the tree is null
, it is left as it is.
If the first
is null
, then the rest
is taken
and condlinrec
recurses.
If the first
of the first
is a list,
then that is unswonsed
, condlinrec
recurses
and on return does nothing further.
In all other cases the tree is left as it is.
HIDE (* tree *) error == "non-empty tree needed for" putchars putchars abort; prepare == [ [ [null] [] ] [ [first null] [rest] [] ] [ [first first list] [[unswons] infra] [] ] [ [] ] ] condlinrec IN t-new == []; t-reset == dup unitlist unitlist; t-add == unitlist unitlist cons; t-null == prepare dup null; t-front == prepare [null] ["t-front\n" error] [dup first first] ifte; t-rem == prepare [null] ["t-rem\n" error] [unswons unswons [swons] dip] ifte END
The definitions of trees is partof the library file TYPLIB.JOY.