Evaluating Vector Expressions from DERVISH

Dervish supports a reasonably full set of mathematical operations on VECTORs of floating point values; VECTORs are (yet another) array datatype. Fortunately, they are transparently converted from ARRAYs (or AFs) as needed, and may be converted back to AFs at the whim of the user.

VECTORs can now do what AFs can do. There are TCL and C bindings for VECTORs that mimic (and improve a bit on) what AFs do.

The evaluator will accept most expressions familiar from C, with a few additions (and, of course, some omissions). Details are given in the section Expressions acceptable to the vector expression package. The full grammar is presented in The Grammar for Vector Expressions.

The following discussion details each TCL expression and, wherever appropriate, includes parts of TCL procedures as examples.

In the discussions for each TCL extension, the following conventions apply:

  • All required parameters are delimited by < and >; for example, <VECTOR>
  • All optional parameters are delimited by [ and ]; for example, [pos]
  • All optional command line switches are preceded by a - (minus sign); for example, -increasing

    Description of TCL expressions

    Creation and Deletion

  • vectorExprNew Create a new VECTOR
  • vectorExprDel Delete a VECTOR

    Miscellaneous Operations

  • vectorExprEval Evaluate an expression
  • vectorExprSet Evaluate an expression, and set a preexisting VECTOR
  • vectorExprPrint Print a vector
  • vectorExprPlot Plot a pair of vectors
  • vectorExprToAf Convert a vector to an AF

    ----------------------------------------------------------------------------

    vectorExprNew

    Creates a new vector and returns a handle bound to it. The dimension of the vector must be specified; the new vector will be initialised to zero.

    TCL SYNTAX:
      vectorExprNew <DIMEN>

      <DIMEN> - length of the vector to create

    RETURNS:
      TCL_OK     Successful completion. Interp result contains the handle of the newly created vector
      TCL_ERROR  On an error. Interp result will contain reason for the error

    ----------------------------------------------------------------------------

    vectorExprDel

    Deletes an existing vector, including all of its elements.

    TCL SYNTAX:
      vectorExprDel <VECTOR>

      <VECTOR> - handle of the vector to be deleted

    RETURNS:
      TCL_OK     Successful completion. No result string
      TCL_ERROR  Error occurred. Interp result will contain the reason

    ----------------------------------------------------------------------------

    vectorExprEval

    Evaluate an expression, and return a handle to the resulting vector. This is the basic method used to evaluate an expression, but it creates both a new vector and a new handle every time that it is called. You may prefer to use vectorExprSet for most purposes.

    Because the <EXPR> is a single argument, it must be quoted if it contains any spaces.

    TCL SYNTAX:
      vectorExprEval <EXPR>

      <EXPR> - An expression to evaluate

    RETURNS:
      TCL_OK     Successful completion. Interp result contains the handle of the newly created vector
      TCL_ERROR  On an error. Interp result will contain reason for the error

    ----------------------------------------------------------------------------

    vectorExprSet

    Evaluate an expression, and set a preexisting vector <handle> to the result. The previous value of the <handle> is freed, but only after it has been used, so it is safe to say

    	set a [vectorExprSet $a asin($a)]

    without fear that you are taking the arcsin of a non-existent vector.

    This is the verb equivalent to an assignment command in a programming language; you are required to have `declared' the vector before it is used, using either vectorExprNew or vectorExprEval.

    Because the <EXPR> is a single argument, if it contains any spaces it must be quoted.

    TCL SYNTAX:
      vectorExprSet <VECTOR> <EXPR>

      <VECTOR> - A preexisting handle to a vector, to be replaced by the value of <EXPR>.
      <EXPR>   - An expression to evaluate

    RETURNS:
      TCL_OK     Successful completion. Interp result contains the handle of the newly created vector
      TCL_ERROR  On an error. Interp result will contain reason for the error

    ----------------------------------------------------------------------------

    vectorExprPrint

    Print the values of a vector.

    TCL SYNTAX:
      vectorExprPrint <VECTOR>

      <VECTOR> - vector to print

    RETURNS:
      TCL_OK     Successful completion.
      TCL_ERROR  On an error. Interp result will contain reason for the error

    ----------------------------------------------------------------------------

    vectorExprPlot

    Use pgplot to plot two vectors <x> and <y>. You can specify that you want the vectors connected, drawn as points, or drawn as specified by a PGSTATE object. Optionally, error bars may be drawn.

    The axis limits must be set before this command is used.

    TCL SYNTAX:
      vectorExprPlot [ -pgstate <PGSTATE> | -connect | -points <MARK-TYPE> ] <X> <Y>

      <X>                  - The handle to the vector of x-coordinates
      <Y>                  - The handle to the vector of y-coordinates
      -connect             - Connect the points
      -pgstate <PGSTATE>   - Handle to desired PGSTATE
      -points <MARK-TYPE>  - Draw the points, using a marker of type <MARK-TYPE>
      -lxerror             - The handle to the vector of x-coordinates of lower error bar end
      -uxerror             - The handle to the vector of x-coordinates of upper error bar end
      -lyerror             - The handle to the vector of y-coordinates of lower error bar end
      -uyerror             - The handle to the vector of y-coordinates of upper error bar end
      -eBarSize            - The relative size of the bars across the error bar tip

    RETURNS:
      TCL_OK     Successful completion.
      TCL_ERROR  On an error. Interp result will contain reason for the error

    An example of the use of this command would be:

    	set pg [pgstateNew]
    	pgstateOpen $pg
    	pgPage
    	pgWindow -1.1 1.1 -1.1 1.1
    	pgBox
    	set x [vectorExprEval -1,1,.1]
    	vectorExprPlot -pgstate $pg $x [vectorExprEval sin($x)]

    ----------------------------------------------------------------------------

    vectorExprToAf

    Converts a <vector> to an <af>. The error and mask elements will be set to 0.

    TCL SYNTAX:
      vectorExprToAf <VECTOR>

      <VECTOR> - The vector to convert

    RETURNS:
      TCL_OK     Successful completion. Interp result contains the handle of the newly created AF
      TCL_ERROR  On an error. Interp result will contain reason for the error

    What is a VECTOR

    A VECTOR is a C structure with the definition:

    	typedef struct {
    	   int dimen;
    	   float *vec;
    	} VECTOR;

    which is used by the expression package. The package could have been built upon Dervish ARRAYs; maybe one day it will be.

    It could also have been built upon AFs, but this did not seem like a good idea for a number of reasons. The vector expression words subsume almost all of the functionality of AFs in a different syntax, and making the two coexist within one Dervish package would have been awkward. AFs are very `heavyweight', with each value carrying around a mask and an error. Neither of these extra fields is included in the vector package, as the functionality can easily be supplied when desired.

    The expression package will correctly convert an ARRAY (of suitable type) or an AF to a VECTOR when referenced in a vector context.
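
    The relationship between the two datatypes can be pictured with a small model. This is only an illustrative sketch in plain Python, not Dervish code: the Af class and its field names (values, errors, masks) are assumptions made for the example, while Vector mirrors the dimen and vec members of the C structure above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vector:
    """Model of the VECTOR struct: a length and a block of floats."""
    dimen: int
    vec: List[float]


@dataclass
class Af:
    """Hypothetical model of an AF: each value carries an error and a mask."""
    values: List[float]
    errors: List[float]
    masks: List[int]


def af_to_vector(af: Af) -> Vector:
    """Keep only the values; a VECTOR has no error or mask fields."""
    return Vector(dimen=len(af.values), vec=list(af.values))


def vector_to_af(v: Vector) -> Af:
    """Mirror vectorExprToAf: error and mask elements are set to 0."""
    return Af(values=list(v.vec), errors=[0.0] * v.dimen, masks=[0] * v.dimen)


af = Af(values=[1.0, 2.0], errors=[0.1, 0.2], masks=[0, 1])
v = af_to_vector(af)
round_trip = vector_to_af(v)
```

    In this model a round trip through a VECTOR keeps the values but resets the errors and masks to zero.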

    Expressions acceptable to the vector expression package

    The full grammar accepted by the expression evaluator is given in the section The Grammar for Vector Expressions.

    All operators operate upon every element of the vector, so

    	set x [vectorExprEval 0,5]
    	set y [vectorExprEval $x^2]
    	vectorExprPrint $y

    prints 0 1 4 9 16 25.
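
    As a rough model of what "operating upon every element" means, the squaring example can be written out in plain Python. The function name elementwise_pow is made up for this sketch; it is not part of Dervish.

```python
def elementwise_pow(vec, exponent):
    """Raise every element of vec to a scalar exponent."""
    return [v ** exponent for v in vec]


x = [float(i) for i in range(0, 6)]   # models the vector produced by 0,5
y = elementwise_pow(x, 2)             # models $x^2
print(y)  # [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
```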

    Operator precedence for expressions is given by the following table. The highest precedence operators are given at the top, and all operators in the same row have equal precedence. All operators are left associative (i.e. 1 / 2 / 3 is evaluated as (1 / 2) / 3 rather than 1 / (2 / 3), which is the behaviour that you'd hope for). `e' is used to mean an expression.

    ---------------------------------------------
    <>  []  ()  {}  (int)
    +  -  !                  unary
    ^  e,e  e,e,e
    *  /
    +  -
    ==  !=  <  <=  >  >=
    &&  ||
    e?e:e  if
    ---------------------------------------------

    In the first row, () refers both to parentheses used to change precedence (e.g. 3*(1+2)) and to function calls. The functions supported are abs, asin, acos, atan, atan2, cos, lg, ln, sin, sqrt, and tan. The only one of these that may be unfamiliar is atan2; it takes two arguments, y and x, and returns the arctangent of y/x in the proper quadrant.
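
    The phrase "in the proper quadrant" is easiest to see with numbers. The sketch below uses Python's math.atan2, which follows the same (y, x) convention; it illustrates the idea and is not a statement about how Dervish implements the function.

```python
import math

y, x = 1.0, -1.0                 # a point in the second quadrant
naive = math.atan(y / x)         # the quotient alone loses the quadrant
proper = math.atan2(y, x)        # uses the signs of both arguments

print(naive)    # close to -pi/4
print(proper)   # close to 3*pi/4
```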

    Because TCL makes typing [] (for array indexing) hard, you are permitted to use < and > instead. But this makes the grammar ambiguous; this is resolved by demanding that < used as a subscript operator must have no space between the array name and the <, and that < used as a relational operator must have space.
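
    The whitespace rule can be modelled with a few lines of Python. This is only a sketch of the rule as stated above, not the evaluator's actual lexer; the function name classify_less_than is hypothetical, and treating a < at the very start of the input as relational is an assumption.

```python
def classify_less_than(expr):
    """Label each '<' as 'subscript' or 'relational' using the whitespace rule."""
    kinds = []
    for i, ch in enumerate(expr):
        if ch != "<":
            continue
        # Relational if preceded by white space; a subscript if it directly
        # follows the array name.
        preceded_by_space = i == 0 or expr[i - 1].isspace()
        kinds.append("relational" if preceded_by_space else "subscript")
    return kinds


print(classify_less_than("x<2> < 3"))   # ['subscript', 'relational']
```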

    The unfamiliar construct {} allows you to enter an explicit set of numbers, but as {} are special to TCL you have to double them:

    	vectorExprPrint [vectorExprEval 2*{{3 1 4 1 5 9}}]

    prints 6 2 8 2 10 18.

    (int) is a cast to int.

    The next row consists of unary operators, for example -1 or !10.

    Next comes ^, used to calculate powers (e.g. 2^3), and implicit DO loops; for example

    	vectorExprPrint [vectorExprEval 1,4,.5]

    prints 1 1.5 2 2.5 3 3.5 4.
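
    A minimal Python model of the implicit DO loop follows. It assumes, from the examples in this section, that the end point is included and that the step defaults to 1 when omitted; the function name implicit_do and the rounding tolerance are choices made for this sketch, and only positive steps are handled.

```python
import math


def implicit_do(start, stop, step=1.0):
    """Return start, start+step, ... up to and including stop (step > 0)."""
    # A small tolerance guards against floating-point rounding in the count.
    count = int(math.floor((stop - start) / step + 1e-9)) + 1
    return [start + i * step for i in range(count)]


print(implicit_do(1, 4, 0.5))   # [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
print(implicit_do(0, 5))        # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
```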

    As mentioned above, < used as a relational operator must be preceded by white space. Sorry.

    The logical operators && and || do not `shortcircuit'; that is, both sides are evaluated even when evaluating only the left side would be sufficient.

    The last row provides two interesting operators. e1?e2:e3 is just like the corresponding expression in C (if e1 is true return e2, otherwise return e3) but it is much more useful in a vector context, for example

    	set x [vectorExprEval -1,1,.1]
    	vectorExprPrint [vectorExprEval "$x >= 0 ? ln($x) : -1000"]

    prints a vector holding the natural log of each element of $x that is non-negative, and -1000 for every other element. Note that the result has the same dimension as the input vector. Due to the way that things are written, both expressions e2 and e3 are currently evaluated for every element, so you may see some spurious error messages.
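
    The source of the spurious messages can be seen in a small Python model in which both branches are computed for every element before the condition picks between them. The names safe_ln and ternary are hypothetical, and recording the offending inputs in a list merely stands in for whatever messages the real evaluator emits.

```python
import math


def safe_ln(v, complaints):
    """Natural log that records a complaint instead of raising an error."""
    if v <= 0:
        complaints.append(v)
        return float("nan")
    return math.log(v)


def ternary(cond, if_true, if_false):
    """Element-wise e1 ? e2 : e3 on already-evaluated branches."""
    return [t if c else f for c, t, f in zip(cond, if_true, if_false)]


x = [-1.0, -0.5, 0.5, 1.0]
complaints = []
cond = [v >= 0 for v in x]
branch_true = [safe_ln(v, complaints) for v in x]   # evaluated for ALL elements
branch_false = [-1000.0] * len(x)
result = ternary(cond, branch_true, branch_false)
```

    The negative elements trigger complaints even though their logs never reach the result, which has the same length as x.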

    The if statement is related, but leads to a vector with a smaller dimension than the input; for example

    	vectorExprSet $x "$x if $x >= 0"
    	vectorExprPrint [vectorExprSet $x ln($x)]

    creates a vector with only 11 elements.
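
    In contrast to the ?: operator, if acts as a filter. A plain Python sketch of that behaviour, with vec_if as a made-up name, shows why 21 input elements become 11:

```python
def vec_if(values, cond):
    """Keep only the elements of values whose condition is true."""
    return [v for v, keep in zip(values, cond) if keep]


x = [i / 10 for i in range(-10, 11)]        # models -1,1,.1 (21 elements)
kept = vec_if(x, [v >= 0 for v in x])       # models "$x if $x >= 0"
print(len(x), len(kept))  # 21 11
```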

    You might be wondering how this expression code interacts with the words that evaluate handle expressions, such as h0.mask->rows<0><0>. The answer is that it doesn't (the contexts where you'd want to use them are rather different), but that you can of course mix them through the magic of TCL:

    	vectorExprEval sin([eval "handleBindNew [handleExprEval h22.arr]"])

    should work.

    Select uses the 2nd vector as an index to the first, and returns a vector with those elements. It combines in interesting ways with sort. Consider:

    	set v [vectorExprEval {{2 1 3 0}}]
    	set v2 [vectorExprEval {{30 20 10 0}}]
    	vectorExprPrint [vectorExprEval sort($v)]
    	vectorExprPrint [vectorExprEval sort($v:$v2)]
    	set ind [vectorExprEval sort($v:)]
    	vectorExprPrint [vectorExprEval select($v:$ind)]
    	vectorExprPrint [vectorExprEval select($v:sort($v:))]
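
    The interplay between select and the index form of sort can be modelled in plain Python. The sketch assumes zero-based indices and covers only sort(v), sort(v:) and select; the two-vector form sort(v:v2) is left out because its exact semantics are not spelled out here. All function names are hypothetical.

```python
def sort_values(v):
    """Model of sort(v): the elements in increasing order."""
    return sorted(v)


def sort_indices(v):
    """Model of sort(v:): the indices that would put v in increasing order."""
    return sorted(range(len(v)), key=lambda i: v[i])


def select(v, index):
    """Model of select(v:index): pick the elements of v named by index."""
    return [v[int(i)] for i in index]


v = [2.0, 1.0, 3.0, 0.0]
ind = sort_indices(v)
print(ind)              # [3, 1, 0, 2]
print(select(v, ind))   # [0.0, 1.0, 2.0, 3.0]
```

    Selecting with the sorted indices reproduces the sorted vector, which is why the last two lines of the Dervish example print the same values as sort($v).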
    

    Ntile counts the elements of the 1st vector that are less than the value of the 2nd. The 2nd vector may be either of size 1 or of size dimen(vector1). When divided by dimen(vector1), the results of ntile are the n-tile of vector1. If vector1 is sorted, the result is the index of the first element in vector1 that is greater than or equal to the value.

    A cumulative function is represented by two vectors: one with the values, the other with the cumulative distribution evaluated at those values. To build a cumulative function, use ntile:

    	set v [vectorExprEval {{2 1 3 0 4 1 9}}]
    	set values [vectorExprEval {{1 3 10}}]
    	set cumul [vectorExprEval ntile($v:$values)/dimen($v)]
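
    The counting that ntile performs can be checked by hand with a small Python model. The function name is borrowed from the Dervish function, but the code is only a sketch of the description above (count the elements strictly less than each value), not the actual implementation.

```python
def ntile(v, values):
    """For each entry of values, count the elements of v that are less than it."""
    return [sum(1 for e in v if e < x) for x in values]


v = [2.0, 1.0, 3.0, 0.0, 4.0, 1.0, 9.0]
values = [1.0, 3.0, 10.0]
counts = ntile(v, values)
cumul = [c / len(v) for c in counts]   # models ntile(...)/dimen($v)
print(counts)  # [1, 4, 7]
```

    With a sorted input the count doubles as an index: sorted(v) is [0, 1, 1, 2, 3, 4, 9], and the count of 4 for the value 3 is the position of the first element that is greater than or equal to 3.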
    
    To manipulate cumulative functions, one needs a "getIndex" that returns the index of the first element above some value, and this is exactly what ntile does. One uses that index to look up the corresponding element of the values vector.

    Cumulative functions are also important in Monte Carlo simulations. One may represent, for instance, a luminosity function by building the cumulative function of magnitudes, and then construct many realizations of that luminosity function. This may be done as follows:

    	# assume values are in $values and cumulative function is in $cumul
    	set n 100
    	set min [vectorExprNew $n]
    	set max [vectorExprNew $n]
    	vectorExprSet $max 1

    	set mags [vectorExprEval select($values:ntile($cumul:rand($min:$max)))]
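
    The same recipe can be traced step by step in plain Python: draw uniform numbers, use ntile to turn each one into an index into the cumulative function, and use select to look up the value at that index. This is a sketch under the assumptions already stated for ntile and select (strict less-than counting, zero-based indices); Python's random module stands in for the rand function, and the function name sample is hypothetical.

```python
import random


def ntile(v, values):
    """For each entry of values, count the elements of v that are less than it."""
    return [sum(1 for e in v if e < x) for x in values]


def select(v, index):
    """Pick the elements of v named by index (zero-based)."""
    return [v[int(i)] for i in index]


def sample(values, cumul, uniforms):
    """Model of select($values:ntile($cumul:rand(...)))."""
    return select(values, ntile(cumul, uniforms))


values = [1.0, 3.0, 10.0]
cumul = [1 / 7, 4 / 7, 1.0]

rng = random.Random(10)                       # fixed seed for repeatability
uniforms = [rng.random() for _ in range(100)]  # uniform numbers in [0, 1)
mags = sample(values, cumul, uniforms)
```

    Because the last entry of cumul is 1 and the uniform numbers are always below 1, the computed index never runs past the end of values.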
    

    The rand and randn functions use the C library function rand. The seed for rand is set using srand, which is wrapped in a TCL verb called seedSet. If one wishes to have reproducible random numbers, the sequence would be:

    	seedSet 10
    	set random [vectorExprEval rand(0:1)]
    

    The Grammar for Vector Expressions

    The grammar used by the expression evaluator is:

    expr : logical_expr                     # toplevel expression
         | expr ? expr : expr
         | expr IF expr
         | expr CONCAT expr

    logical_expr : rel_expr                 # logical expression; no short circuit
         | rel_expr && rel_expr
         | rel_expr || rel_expr

    rel_expr : add_expr                     # relational expression
         | add_expr == add_expr
         | add_expr != add_expr
         | add_expr < add_expr              # see note at end of this section
         | add_expr <= add_expr             # see note at end of this section
         | add_expr > add_expr
         | add_expr >= add_expr

    add_expr : mult_expr                    # additive expression
         | mult_expr + mult_expr
         | mult_expr - mult_expr

    mult_expr : pow_expr * pow_expr         # multiplicative expression
         | pow_expr / pow_expr

    pow_expr : unary_expr                   # evaluate a power or implicit DO loop
         | unary_expr ^ unary_expr
         | unary_expr, unary_expr
         | unary_expr, unary_expr, unary_expr

    unary_expr : primary                    # unary expression
         | + expr
         | - expr
         | ! expr

    primary : WORD                          # primary
         | number
         | { number_list }
         | WORD < add_expr >                # see note at end of this section
         | WORD [ add_expr ]
         | ( expr )
         | ( INT ) expr
         | ABS ( expr )
         | ASIN ( expr )
         | ACOS ( expr )
         | ATAN ( expr )
         | ATAN2 ( expr : expr )
         | COS ( expr )
         | DIMEN ( expr )
         | LG ( expr )
         | LN ( expr )
         | MAP ( expr1 : index : expr2 )    # replace expr2 elements given by index with expr1
         | NTILE ( expr1 : expr2 )          # count on expr1 the values less than expr2
                                            # => find the index into expr1 that maps expr2 onto expr1
         | RAND ( expr1 : expr2 )           # uniform random #'s between expr1 and expr2
         | RANDN ( expr1 : expr2 )          # normal random #'s with mean expr1 and sigma expr2
         | SELECT ( expr1 : expr2 )         # select expr1 according to expr2
         | SIN ( expr )
         | SORT ( expr )                    # sort expr
         | SORT ( expr : )                  # return sorted indices
         | SORT ( expr1 : expr2 )           # sort expr1 into expr2's order
         | SQRT ( expr )
         | SUM ( expr )
         | TAN ( expr )

    Auxiliary rules:

    number_list : number
         | number_list number

    number : FLOAT
         | INT
         | - FLOAT
         | - INT

    Because TCL makes typing [] hard, by default this expression evaluator permits you to use < and > instead. But this makes the grammar ambiguous; this is resolved by demanding that < used as a subscript operator must have no space between the array name and the <, and that < used as a relational operator must have space.