Evaluating Vector Expressions from DERVISH
Dervish supports a reasonably full set of mathematical operations upon
VECTORs of floating point values;
these are (yet another) array datatype. Fortunately they are transparently
converted from ARRAYs (or AFs) as needed, and may be converted back to AFs
at the whim of the user.
VECTORs can now do what AFs can do: there are TCL and C bindings for VECTORs
that mimic (and improve a bit on) what AFs do.
The evaluator will accept most expressions familiar from C, with a few
additions (and, of course, some omissions). Details are given in the section
Expressions acceptable to the vector expression package.
The full grammar is presented in
The Grammar for Vector Expressions.
The following discussion details each TCL expression and, whenever
appropriate, includes parts of TCL procedures as examples.
In the discussions for each TCL extension, the following conventions apply:
All required parameters are denoted by delimiting them with < and >,
example <VECTOR>
All optional parameters are denoted by delimiting them with [ and ];
example [pos]
All optional command line switches are denoted by preceding the switch
with a - (minus sign); example -increasing
Description of TCL expressions
Creation and Deletion
  vectorExprNew      Create a new VECTOR
  vectorExprDel      Delete a VECTOR
Miscellaneous Operations
  vectorExprEval     Evaluate an expression
  vectorExprSet      Evaluate an expression, and set a preexisting VECTOR
  vectorExprPrint    Print a vector
  vectorExprPlot     Plot a pair of vectors
  vectorExprToAf     Convert a vector to an AF
----------------------------------------------------------------------------
vectorExprNew
Creates a new vector and returns a handle bound to it. The dimension of the
vector must be specified; the new vector will be initialised to zero.
TCL SYNTAX:
vectorExprNew <DIMEN>
<DIMEN> - length of the vector to create
RETURNS:
TCL_OK Successful completion. Interp result contains the handle of
the newly created vector
TCL_ERROR On an error. Interp result will contain reason for the error
----------------------------------------------------------------------------
vectorExprDel
Deletes an existing vector, including all of its elements.
TCL SYNTAX:
vectorExprDel <VECTOR>
<VECTOR> handle of the vector to be deleted
RETURNS:
TCL_OK Successful completion. No result string
TCL_ERROR Error occurred. Interp result will contain the reason
---------------------------------------------------------------------------
vectorExprEval
Evaluate an expression, and return a handle to the resulting vector.
This is the basic method used to evaluate an expression, but it creates
both a new vector and a new handle every time that it is called. You may
prefer to use vectorExprSet for most purposes.
Because the <EXPR> is a single argument, it must be quoted if it contains
any spaces.
TCL SYNTAX:
vectorExprEval <EXPR>
<EXPR> - An expression to evaluate
RETURNS:
TCL_OK Successful completion. Interp result contains the handle of
the newly created vector
TCL_ERROR On an error. Interp result will contain reason for the error
----------------------------------------------------------------------------
vectorExprSet
Evaluate an expression, and set a preexisting vector <handle> to the result.
The previous value of the <handle> is freed (but only after it has been used!
It is safe to say
set a [vectorExprSet $a asin($a)]
without fear that you are taking the arcsin of a non-existent vector).
This is the verb equivalent to an assignment command in a programming language;
you are required to have `declared' the vector before it is used, using either
vectorExprNew or vectorExprEval.
Because the <EXPR> is a single argument, if it contains any spaces
it must be quoted.
TCL SYNTAX:
vectorExprSet <VECTOR> <EXPR>
<VECTOR> - A preexisting handle to a vector, to be replaced by the value of
<EXPR>.
<EXPR> - An expression to evaluate
RETURNS:
TCL_OK Successful completion. Interp result contains the handle of
the updated vector
TCL_ERROR On an error. Interp result will contain reason for the error
----------------------------------------------------------------------------
vectorExprPrint
Print the values of a vector
TCL SYNTAX:
vectorExprPrint <VECTOR>
<VECTOR> - vector to print
RETURNS:
TCL_OK Successful completion.
TCL_ERROR On an error. Interp result will contain reason for the error
----------------------------------------------------------------------------
vectorExprPlot
Use pgplot to plot two vectors <x> and <y>.
You can specify that you want the vectors connected, drawn as points, or
drawn as specified by a PGSTATE object. Optionally, error bars may be drawn.
The axis limits must be set before this command is used.
TCL SYNTAX:
vectorExprPlot [ -pgstate <PGSTATE> | -connect | -points <MARK-TYPE>] <X> <Y>
<X> - The handle to the vector of x-coordinates
<Y> - The handle to the vector of y-coordinates
-connect - Connect the points
-pgstate <PGSTATE> - Handle to desired PGSTATE
-points <MARK-TYPE> - Draw the points, using a marker of type <MARK-TYPE>
-lxerror - The handle to the vector of x-coordinates of lower error bar end
-uxerror - The handle to the vector of x-coordinates of upper error bar end
-lyerror - The handle to the vector of y-coordinates of lower error bar end
-uyerror - The handle to the vector of y-coordinates of upper error bar end
-eBarSize - The relative size of the bars across the error bar tip
RETURNS:
TCL_OK Successful completion.
TCL_ERROR On an error. Interp result will contain reason for the error
An example of the use of this command would be:
set pg [pgstateNew]
pgstateOpen $pg
pgPage
pgWindow -1.1 1.1 -1.1 1.1
pgBox
set x [vectorExprEval -1,1,.1]
vectorExprPlot -pgstate $pg $x [vectorExprEval sin($x)]
----------------------------------------------------------------------------
vectorExprToAf
Converts a <vector> to an <af>. The error and mask elements will be
set to 0.
TCL SYNTAX:
vectorExprToAf <VECTOR>
<VECTOR> - The vector to convert
RETURNS:
TCL_OK Successful completion. Interp result contains the handle of
the newly created AF
TCL_ERROR On an error. Interp result will contain reason for the error
A VECTOR is a C structure with the definition:
typedef struct {
int dimen;
float *vec;
} VECTOR;
which is used by the expression package. The package could have been built upon
Dervish ARRAYs; maybe one day it will be.
It could also have been built upon AFs, but this did not seem like a good idea
for a number of reasons. The vector expression words subsume almost all of
the AF functionality in a different syntax, and making the two coexist within one
Dervish package would have been awkward. AFs are very `heavy weight', with each
value carrying around a mask and an error. Neither of these extra fields is
included in the vector package, as the functionality can easily be supplied
when desired.
The expression package will correctly convert an ARRAY (of suitable type)
or an AF to a VECTOR when referenced in a vector context.
Expressions acceptable to the vector expression package
The full grammar acceptable to the expression evaluator is given in the
section on the expression grammar.
All operators operate upon every element of the vector, so
set x [vectorExprEval 0,5]
set y [vectorExprEval $x^2]
vectorExprPrint $y
prints 0 1 4 9 16 25.
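The element-wise behaviour can be modelled outside Dervish. The following Python sketch only illustrates the semantics described above; the helper name elementwise is hypothetical and is not part of Dervish.

```python
def elementwise(op, *vectors):
    """Apply op to corresponding elements of equal-length vectors."""
    return [op(*elems) for elems in zip(*vectors)]


# Model of: set x [vectorExprEval 0,5]
x = [float(i) for i in range(6)]
# Model of: set y [vectorExprEval $x^2]
y = elementwise(lambda e: e ** 2, x)
print(y)
```

Every element is squared independently, so the result has the same dimension as the input.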
Operator precedence for expressions is given by the following table. The
highest precedence operators are given at the top, and all operators
in the same row have equal precedence. All operators are left associative
(i.e. 1 / 2 / 3 is evaluated as (1 / 2) / 3 rather than 1 / (2 / 3),
which is the behaviour that you'd hope for). `e' is used to mean an expression.
---------------------------------------------
<> [] () {} (int)
+ - ! unary
^ e,e e,e,e
* /
+ -
== != < <= > >=
&& ||
e?e:e if
---------------------------------------------
In the first row, () refers both to parentheses used to change precedence
(e.g. 3*(1+2)) and to function calls. The functions supported are
abs, asin, acos, atan, atan2, cos, lg, ln, sin, sqrt, and tan. The only one
of these that may be unfamiliar is atan2; it takes two arguments, y and x
and returns the arctangent of y/x in the proper quadrant.
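To see why the two-argument form matters, the sketch below uses Python's math.atan2 as a stand-in. It assumes that the Dervish atan2 follows the same convention as the C library function of that name, which the text above suggests but does not state.

```python
import math

# The ratio y/x is 1 in both cases, but the points lie in different quadrants.
first_quadrant = math.atan2(1.0, 1.0)     # y > 0, x > 0
third_quadrant = math.atan2(-1.0, -1.0)   # y < 0, x < 0

# A one-argument arctangent only sees the ratio, so the quadrant is lost.
ratio_only = math.atan(-1.0 / -1.0)

print(first_quadrant, third_quadrant, ratio_only)
```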
Because TCL makes typing [] (for array indexing) hard, you are permitted
to use < and > instead. But this makes the grammar
ambiguous; this is resolved by demanding that < used as a subscript
operator must have no space between the array name and the <,
and that < used as a relational operator must have space.
The unfamiliar construct {} allows you to enter an explicit set of numbers,
but as {} are special to TCL you have to double them:
vectorExprPrint [vectorExprEval 2*{{3 1 4 1 5 9}}]
prints 6 2 8 2 10 18.
(int) is a cast to int.
The next row consists of unary operators, for example -1 or !10.
Next comes ^, used to calculate powers (e.g. 2^3), and implicit DO loops;
for example
vectorExprPrint [vectorExprEval 1,4,.5]
prints 1 1.5 2 2.5 3 3.5 4.
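A rough Python model of the implicit DO loop is given below. It assumes, based on the two examples in this document, that the end value is included and that the step defaults to 1; do_loop is a hypothetical name, not a Dervish function.

```python
def do_loop(start, stop, step=1.0):
    """Model of the implicit DO loop start,stop[,step] with an inclusive end."""
    # The small tolerance keeps floating point rounding in the division
    # from dropping the final element.
    n = int((stop - start) / step + 1e-9) + 1
    return [start + i * step for i in range(n)]


print(do_loop(1, 4, 0.5))        # model of 1,4,.5
print(len(do_loop(-1, 1, 0.1)))  # model of -1,1,.1
```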
As mentioned above, < used as a relational operator must be
preceded by white space. Sorry.
The logical operators && and || do not `short-circuit'; that is,
both sides are evaluated even when evaluating only the left side would be
sufficient.
The last row provides two interesting operators. e1?e2:e3 is just like the
corresponding expression in C (if e1 is true return e2, otherwise return e3)
but it is much more useful in a vector context, for example
set x [vectorExprEval -1,1,.1]
vectorExprPrint [vectorExprEval "$x >= 0 ? ln($x) : -1000"]
evaluates to the natural log of each non-negative element of $x, and to -1000 for the others.
Note that the result has the same dimension as the input vector.
Due to the way that things are written, both expressions e2 and e3 are
currently evaluated so you may see some spurious error messages.
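The following Python sketch models why spurious messages can appear: both branches are computed for every element, and the condition only chooses between the finished results. The names ternary, noisy_ln and domain_errors are hypothetical, and the input avoids 0 so that the model stays simple.

```python
import math

domain_errors = []  # stands in for the spurious error messages


def noisy_ln(e):
    """Natural log that records a complaint instead of raising."""
    if e <= 0:
        domain_errors.append(e)
        return float("nan")
    return math.log(e)


def ternary(cond, if_true, if_false):
    """Element-wise e1 ? e2 : e3 on already evaluated vectors."""
    return [t if c else f for c, t, f in zip(cond, if_true, if_false)]


x = [-1.0, -0.5, 0.5, 1.0]
e1 = [e >= 0 for e in x]
e2 = [noisy_ln(e) for e in x]   # evaluated for every element, even negative ones
e3 = [-1000.0] * len(x)
result = ternary(e1, e2, e3)
print(result, domain_errors)
```

The complaints recorded for the negative elements do not affect the result, because those elements are taken from e3.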
The if statement is related, but leads to a vector with a smaller dimension
than the input; for example
vectorExprSet $x "$x if $x >= 0"
vectorExprPrint [vectorExprSet $x ln($x)]
creates a vector with only 11 elements.
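As an illustration of how if differs from ?:, the Python sketch below keeps only the elements whose condition is true, so the result is shorter than the input. vec_if is a hypothetical name used for this model only.

```python
def vec_if(values, cond):
    """Model of `values if cond': keep elements where cond is true."""
    return [v for v, c in zip(values, cond) if c]


# Model of x = -1,1,.1 (21 elements from -1 to 1)
x = [i / 10 for i in range(-10, 11)]
kept = vec_if(x, [e >= 0 for e in x])
print(len(x), len(kept))
```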
You might be wondering how this expression code interacts with the words that
evaluate handle expressions,
such as h0.mask->rows<0><0>. The answer
is that it doesn't (the contexts where you'd want to use them are rather
different), but that you can of course mix them through the magic of TCL:
vectorExprEval sin([eval "handleBindNew [handleExprEval h22.arr]"])
should work.
Select uses the second vector as a set of indices into the first, and returns
a vector with those elements. It combines in interesting ways with
sort. Consider:
set v [vectorExprEval {{2 1 3 0}}]
set v2 [vectorExprEval {{30 20 10 0}}]
vectorExprPrint [vectorExprEval sort($v)]
vectorExprPrint [vectorExprEval sort($v:$v2)]
set ind [vectorExprEval sort($v:)]
vectorExprPrint [vectorExprEval select($v:$ind)]
vectorExprPrint [vectorExprEval select($v:sort($v:))]
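A Python model of how select and sort fit together is sketched below. It assumes zero-based indices, which this document does not state explicitly, and it does not attempt to model the two-argument form sort(expr1:expr2). All function names are hypothetical.

```python
def sort_values(v):
    """Model of sort(v): the elements in increasing order."""
    return sorted(v)


def sort_indices(v):
    """Model of sort(v:): the indices that would sort v."""
    return sorted(range(len(v)), key=lambda i: v[i])


def select(v, idx):
    """Model of select(v:idx): pick the elements of v named by idx."""
    return [v[int(i)] for i in idx]


v = [2.0, 1.0, 3.0, 0.0]
ind = sort_indices(v)
print(ind)
print(select(v, ind))
```

Under these assumptions, selecting with the sorted indices gives the same vector as sorting the values directly.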
Ntile counts the elements of the first vector that are less than
the value of the second. The second vector may be either of size 1 or of
dimen(vector1). When divided by dimen(vector1), the results of
ntile are the n-tile of vector1. If vector1 is sorted, the
result is the index of the first element in vector1 that is greater
than or equal to the value.
Cumulative functions are two vectors, one with the values, the other
with the cumulative distribution evaluated at the values.
To build a cumulative function, use Ntile:
set v [vectorExprEval {{2 1 3 0 4 1 9}}]
set values [vectorExprEval {{1 3 10}}]
set cumul [vectorExprEval ntile($v:$values)/dimen($v)]
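The counting behind this example can be checked by hand with a small Python model. It assumes that ntile returns one count per entry of its second argument, as the example above implies; the function names are hypothetical.

```python
def ntile(v, thresholds):
    """For each threshold, count the elements of v strictly less than it."""
    return [sum(1 for e in v if e < t) for t in thresholds]


def dimen(v):
    """Model of dimen(v): the number of elements."""
    return len(v)


v = [2.0, 1.0, 3.0, 0.0, 4.0, 1.0, 9.0]
values = [1.0, 3.0, 10.0]
counts = ntile(v, values)
cumul = [c / dimen(v) for c in counts]
print(counts, cumul)
```

Here one element is below 1, four are below 3 and all seven are below 10, so the cumulative distribution reaches 1 at the last value.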
To manipulate cumulative functions, one needs a "getIndex" which
returns the index of the first element above some value and this is
exactly what ntile does. One uses that index to find the value
of the values vector. Cumulative functions are also important in
Monte Carlo simulations. One may represent, for instance,
a luminosity function by building the cumulative function of magnitudes.
One may then wish to construct many realizations of that luminosity
function. This may be done by the following:
# assume values are in $values and cumulative function is in $cumul
set n 100
set min [vectorExprNew $n]
set max [vectorExprNew $n]
vectorExprSet $max 1
set mags [vectorExprEval select($values:ntile($cumul:rand($min:$max)))]
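The idea behind this recipe is inverse-transform sampling: draw a uniform number, find where it falls in the cumulative function, and return the matching value. The Python sketch below models that idea with hypothetical helper names and with Python's own random number generator rather than the Dervish one.

```python
import random


def ntile_scalar(v, t):
    """Count the elements of v strictly less than t."""
    return sum(1 for e in v if e < t)


def realize(values, cumul, n, rng):
    """Draw n values distributed according to the cumulative function."""
    draws = []
    for _ in range(n):
        u = rng.random()               # uniform in [0, 1)
        idx = ntile_scalar(cumul, u)   # first index with cumul[idx] >= u
        draws.append(values[idx])      # model of select(values:idx)
    return draws


values = [1.0, 3.0, 10.0]
cumul = [1 / 7, 4 / 7, 1.0]
mags = realize(values, cumul, 100, random.Random(10))
print(len(mags), sorted(set(mags)))
```

Because the last entry of cumul is 1 and the uniform draws are below 1, the index never runs past the end of values.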
The rand and randn functions use the C library function rand(). Its seed is
set using srand(), which is wrapped in a TCL verb called
seedSet. If one wishes reproducible random numbers, the sequence
would be:
seedSet 10
set random [vectorExprEval rand(0:1)]
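The effect of seeding can be modelled in Python as follows. This is only an analogy: random.Random(seed) plays the role of seedSet, rand_vec is a hypothetical stand-in for rand(expr1:expr2) with vector limits, and the numbers produced will not match those of the C library generator.

```python
import random


def rand_vec(lo, hi, rng):
    """One uniform draw per element, between the matching lo and hi entries."""
    return [rng.uniform(a, b) for a, b in zip(lo, hi)]


def seeded_draw(seed, n):
    """Seed a generator, then draw n uniform numbers between 0 and 1."""
    rng = random.Random(seed)
    return rand_vec([0.0] * n, [1.0] * n, rng)


first = seeded_draw(10, 5)
second = seeded_draw(10, 5)
other = seeded_draw(11, 5)
print(first == second, first == other)
```

The same seed reproduces the same sequence, while a different seed gives a different one.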
The grammar used by the expression evaluator is:
expr : logical_expr # toplevel expression
| expr ? expr : expr
| expr IF expr
| expr CONCAT expr
logical_expr : rel_expr # logical expression; no short circuit
| rel_expr && rel_expr
| rel_expr || rel_expr
rel_expr : add_expr # relational expression
| add_expr == add_expr
| add_expr != add_expr
| add_expr < add_expr # see note at end of this section
| add_expr <= add_expr # " " " " " " " "
| add_expr > add_expr
| add_expr >= add_expr
add_expr : mult_expr # additive expression
| mult_expr + mult_expr
| mult_expr - mult_expr
mult_expr : pow_expr # multiplicative expression
| pow_expr * pow_expr
| pow_expr / pow_expr
pow_expr : unary_expr # evaluate a power or implicit DO loop
| unary_expr ^ unary_expr
| unary_expr, unary_expr
| unary_expr, unary_expr, unary_expr
unary_expr : primary # unary expression
| + expr
| - expr
| ! expr
primary : WORD # primary
| number
| { number_list }
| WORD < add_expr > # see note at end of this section
| WORD [ add_expr ]
| ( expr )
| ( INT ) expr
| ABS ( expr )
| ASIN ( expr )
| ACOS ( expr )
| ATAN ( expr )
| ATAN2 ( expr : expr )
| COS ( expr )
| DIMEN ( expr )
| LG ( expr )
| LN ( expr )
| MAP ( expr1 : index : expr2 ) # replace expr2 elements given by index with expr1
| NTILE ( expr1 : expr2 ) # count in expr1 the values less than expr2
# => find the index into expr1 that maps expr2 onto expr1
| RAND ( expr1 : expr2 ) # uniform random numbers between expr1 and expr2
| RANDN ( expr1 : expr2 ) # normal random numbers with mean expr1 and sigma expr2
| SELECT ( expr1 : expr2 ) # select expr1 according to expr2
| SIN ( expr )
| SORT ( expr ) # sort expr
| SORT ( expr : ) # return sorted indices
| SORT ( expr1 : expr2 ) # sort expr1 into expr2's order
| SQRT ( expr )
| SUM ( expr )
| TAN ( expr )
Auxiliary rules:
number_list : number
| number_list number
number : FLOAT
| INT
| - FLOAT
| - INT
Because TCL makes typing [] hard, by default this expression evaluator
permits you to use < and > instead. But this makes the grammar
ambiguous; this is resolved by demanding that < used as a subscript
operator must have no space between the array name and the <,
and that < used as a relational operator must have space.
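The whitespace rule can be expressed as a short classifier. The Python sketch below is a model of the rule as worded here, not the evaluator's actual lexer, and classify_less_than is a hypothetical name.

```python
def classify_less_than(expr):
    """Return (position, kind) for every '<' in expr.

    A '<' that directly follows a non-space character is treated as a
    subscript; one at the start or after white space is relational.
    """
    kinds = []
    for i, ch in enumerate(expr):
        if ch != "<":
            continue
        if i > 0 and not expr[i - 1].isspace():
            kinds.append((i, "subscript"))
        else:
            kinds.append((i, "relational"))
    return kinds


print(classify_less_than("x<3>"))
print(classify_less_than("x < 3"))
print(classify_less_than("x<0> < y<1>"))
```

In the last expression the first and third < are subscripts and the middle one is a comparison, which is the reading the rule is meant to enforce.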