# A Data Manipulation Language - Nial Introduction, Chapter 3

# A Data Manipulation Language

All data objects in Nial are treated as rectangular nested arrays with dimen sionality, extent and depth. The array

```
X := 2 3 reshape (8 9) 23 'hello world' "goodbye (tell 3 2)([sin,cos]0.5)
+-----------------------------------+
|8 9 | 23|hello world |
+-------+---------+-----------------|
|goodbye|+-------+|0.479426 0.877583|
| ||0 0|0 1|| |
| |+---+---|| |
| ||1 0|1 1|| |
| |+---+---|| |
| ||2 0|2 1|| |
| |+-------+| |
+-----------------------------------+
```

is a 2 by 3 array of other arrays. The upper left entry (at address 0 0) is the pair 8 9 and the item at address 1 1 is the table of addresses for a 3 by 2 array. X holds a character string at address 0 2 and a phrase at address 1 0.

There are three measurements of the structure of an array:

valence | the number of dimensions (or axes) |

shape | the list of lengths of the array along each dimension |

tally | the number of items in the array. |

The measurements are related. For every array A, the following equations hold:

```
valence A = tally shape A
tally A = product shape A
```

The definitions along with the second equation indicate that the term item refers to top level objects in an array and not the most deeply nested ones. The meaning of these measure ments for the array X is as follows:

- the valence is 2 (X has two dimen sions),
- the shape is the list 2 3 (X has two rows and three columns), and
- the tally is 6 (X has six items).

The expression for the table assigned to X involves the infix use of
operation *reshape* between two lists, the first giving the shape of the
table and the second being a list of the arrays to be used as the items
of the table.

The syntax for a list is a sequence of two or more array expressions placed side-by-side. This construct is called a strand. A strand evaluates to the list that has as its items the values of the expressions in the strand. An item of a strand can be an atomic array, string or an expression in parentheses.

## Functional Objects

Nial has two classes of functional objects: operations and transformers.
Operations are the functional objects that map arrays to arrays, such as
*rows*, *reshape* and *link*. Operations are either predefined (implemented by
the Q’Nial interpreter) or defined by the user during a session.

Operations in Nial are said to be first order functions because they act directly on an argument array to produce a result array. Since a list of arrays is itself an array, some operations require that the argument array have a specific number of items. As well, many operations return a result that is a list of arrays computed by the operation.

The syntax for the use of an operation allows it to be used in both a
prefix and an infix manner. For example, operation *link* can be used
between two arguments that are lists in order to join the two lists
together, or it can be used in front of an argument that is a list of an
arbitrary number of lists in order to join all the lists together.

An operation is applied to an array and results in an array value. For
example, applying the operation *sum* to the list 3 4 5 results in the
value 12.

The notation for applying an operation in Nial is based on juxtaposition (side-by-side placement of objects). The operation is placed before the argument. The following example shows the juxtaposition syntax for applying an operation.

```
sum (3 4 5)
12
```

The parentheses used in the above example are not necessary. Nial would give the same result if they were omitted.

The notation for infix use of an operation like *reshape* places the
operation between two arguments. It is interpreted as the operation
applied to the pair formed by the two arguments.

```
2 3 reshape 4 5 6 7 8 9
4 5 6
7 8 9
reshape ((2 3) (4 5 6 7 8 9))
4 5 6
7 8 9
```

A transformer is a functional object that maps an operation to a
modified operation. It is said to be second order function, because it
is a functional object that operates on one or more first order
functions and produces a function. The effect of applying the
transformer *EACH* to the operation *first*, for example, results in an
modified operation (called the EACH transform of first) that applies
*first* to each item of its argument. Juxtaposition is used to denote the
application of a transformer to an operation. The application of the
modified operation in

```
(EACH first) ((2 3) (4 5 6 7 8 9))
2 4
```

is equivalent to applying first to the lists in the pair

```
(first (2 3)) (first (4 5 6 7 8 9))
2 4
```

Juxtaposition of objects can be used in three additional ways besides its use in forming strands. Juxtaposition can occur in prefix use of an operation, infix use of an operation and in application of a transformer. To avoid ambiguity, specific rules have been adopted:

- Strand formation takes precedence over either infix or prefix operation application.
- Transformer application takes precedence over operation application.
- If two or more prefix uses of operations occur in a row, as in
*sum link A*, the rightmost operation is done first and its result is used as the argument of the operation to the left. - If an infix use precedes a prefix use, as in
*A reshape link B*, the prefix application is done first.

The first rule means that *sum 3 4 5* is the same as *sum (3 4 5)* and that
*2 3 link 4 5* is the same as *(2 3) link (4 5)*.

The second rule means that the expression *EACH first (2 3 4) (5 6 7 8 9)*
applies the modified operation *EACH first* to the pair formed by the
strand.

The third and fourth rules exist because there is no other sensible interpretation of these juxtapositions.

A modified operation can also be used in an infix manner. For example, in the operation labeltable described in the previous chapter, the expression

```
Rowlabels EACHBOTH hitch rows Table
```

uses *EACHBOTH hitch* in an infix manner between Rowlabels and the prefix
expression rows Tables.

The reading of the above expression is helped by the spelling convention
in Nial for names associated with the different classes of objects. An
operation is spelled in lower case, a transformer in upper case and an
array variable or named expression begins with a capital with the rest
in lower case. These rules make it possible to recognize the kind of
object associated with the name. The user of Q’Nial is not required to
spell names according to these rules. When definitions are displayed by
*see* or *defedit* they appear in canonical form with this prescribed
spelling format. Nial is case insensitive; *first* and *First* are both
legal spellings, but the former is the canonical one.

The interpretation of *sum link A* to mean *sum (link A)* given above
suggests that the juxtaposition sum link could be used to denote the
composition of operations *sum* and *link*, resulting in the following
equation:

```
(sum link) A = sum (link A)
```

As well, it is useful to have this notation for composition in order to specify the composition of two operations as the argument of a transformer. For example:

```
EACH (link second) A
```

The interpretation of the above rules is illustrated by the following example:

```
2 3 4 EACH first sum 4 5 6
= (2 3 4) EACH first sum (4 5 6) (strand rule)
= (2 3 4)(EACH first) sum (4 5 6) (transformer precedence)
= (2 3 4)(EACH first)(sum (4 5 6)) (prefix operations rule)
= (2 3 4)(EACH first) 15 (meaning of sum)
= (EACH first) ((2 3 4) 15) (infix rule)
= (first (2 3 4)) (first 15) (meaning of each)
= 2 15 (meaning of first)
```

A feature of Nial syntax is that all operations are implicitly unary and their interpretation when used in an infix way is determined by the above rules rather than by an operation precedence table. Thus,

```
2 + 3 * 4
= + (2 3) * 4
= 5 * 4
= * (5 4)
= 20
```

This example demonstrates that infix uses of operations are evaluated left to right without precedence. The same expression in Pascal or C would be evaluated by doing the multiplication first.

## Bracket Notation

Nial syntax has a second way to construct a list. It is called bracket-comma notation. The expression

```
[34,275,86,-52]
```

has the same meaning as the strand

```
34 275 86 -52
```

The bracket-comma notation has the advantage that it can represent lists of length one and length zero. Thus,

```
[235]
```

denotes the list of length one holding the atom 235 as its item and the empty list Null is denoted by

```
[]
```

There is a corresponding syntactic object for operations. It is called an atlas (a list of maps). An atlas applies each component operation to the argument producing a list of results. Thus,

```
[sin,cos,sqrt] 3.14
```

is equivalent to

```
[sin 3.14,cos 3.14,sqrt 3.14]
```

An elegant example of atlas notation is to rewrite the definition of average as:

```
average IS / [sum,tally]
```

The equivalence with the earlier definition can be demonstrated as follows:

```
average A
= (/ [sum,tally]) A
= / ([sum,tally] A) (prefix operations rule)
= / [sum A,tally A] (atlas rule)
= / (sum A) (tally A) (strand equivalence)
= (sum A) / ( tally A) (infix rule)
```

The atlas notation is mainly used as a shorthand for describing operations without the need to explicitly name the argument. It is also used to form an operation argument for a transformer that uses two or more operations.

## Addresses and Indexing

In procedural programming languages, arrays are treated as subscripted variables. A name declared to be an array is said to denote a collection of variables, each of which can be assigned a value independently. In such languages, there is a notation for denoting one variable from the collection, similar to the idea of using a subscript in mathematical notation. In such languages, array computations use looping constructs to do a computation that accesses all the items of an array.

In Nial, arrays are implicitly given an addressing scheme. For the table
T, the addresses are given by the operation *grid* as an array of the same
shape as T, with each location holding its own address.

```
T := 3 4 reshape count 12
1 2 3 4
5 6 7 8
9 10 11 12
grid T
+---------------+
|0 0|0 1|0 2|0 3|
+---+---+---+---|
|1 0|1 1|1 2|1 3|
+---+---+---+---|
|2 0|2 1|2 2|2 3|
+---------------+
```

The addresses of an array are integers for a 1-dimensional array (a list) or lists of integers for other arrays. The numbering scheme uses zero-origin counting. For the table T above, the addresses are lists of length two. The list A below has addresses that are integers.

```
set "diagram;
A := count 5
+---------+
|1|2|3|4|5|
+---------+
grid A
+---------+
|0|1|2|3|4|
+---------+
```

The operation *pick* is the fundamental selection operation in Nial based
on addressing. The expression

```
1 2 pick T
7
```

selects the item of T at address 1 2. In picking from the list A, either an integer or a list of one integer corresponding to an address can be used:

```
4 pick A
5
[4] pick A
5
```

The operation *choose* can be used to select multiple items from an array.
For example, in the expression below, *choose* returns the items of T at
the three addresses in the left argument:

```
[1 2,2 3,0 0] choose T
7 12 1
```

Nial also has an indexing notation similar to subscript notation in other languages. The following expression selects the item of T at address 1 2.

```
T@(1 2)
7
```

This form of selection is called at-indexing. One difference between it
and using *pick* for selection is that *pick* can be modified using a
transformer because it is an operation but the @ symbol is a syntactic
construct and cannot be modified by a transformer. Also the at-notation
can only be used with a variable, whereas pick can select from the
result of any expression.