TBLCOL/Schema Conversion

Overview

  • Translation Table Syntax
  • Portable Enumeration and Heap Capabilities
  • Examples

    Translation Table Syntax

    Single Line Entries

    Single-line entries in a translation table are self-explanatory. Each one matches a specific FITS field name to a field in a schema, and the conversion routines fill one field with the values found in the other. Note that when adding an entry, the 1st field is always the FITS side and the 2nd is always the schema side. This way, a single translation table can be used for conversion in either direction.

    The following example shows how to add an entry for converting the field called "id" in a schema to the FITS field "MY_ID".

    dervish> schemaTransNew
    h1
    dervish> schemaTransEntryAdd h1 name MY_ID id
    Entry added
    dervish>

    Note that capitalizing MY_ID is optional, as the routine converts all FITS field names to upper case.
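
    As a rough illustration of why one table can serve both directions, the Python sketch below models an entry as a (FITS name, schema field) pair. It is not Dervish code: TransEntry, to_schema and to_fits are hypothetical names introduced here, and only the FITS-first ordering and the upper-casing of FITS names come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransEntry:
    """One single-line entry: FITS side first, schema side second."""
    fits_name: str
    schema_field: str

    def __post_init__(self):
        # FITS field names are always stored in upper case.
        object.__setattr__(self, "fits_name", self.fits_name.upper())


def to_schema(table, fits_row):
    """FITS -> schema: copy each listed FITS column into its schema field."""
    return {e.schema_field: fits_row[e.fits_name]
            for e in table if e.fits_name in fits_row}


def to_fits(table, schema_obj):
    """Schema -> FITS: the same table, read in the opposite direction."""
    return {e.fits_name: schema_obj[e.schema_field]
            for e in table if e.schema_field in schema_obj}


table = [TransEntry("my_id", "id")]      # lower case is accepted
obj = to_schema(table, {"MY_ID": 42})    # FITS row -> schema object
row = to_fits(table, obj)                # schema object -> FITS row
```

    Because the entry only records which name sits on which side, neither helper needs a separate table.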

    Single-line entries are very useful when the fields of interest are all elementary fields. However, that is not always the case. If a user-defined structure is present in a schema and the user wants to convert it, a multiple-line entry is used. A multiple-line entry has a single main line followed by one or more continuation lines.

    Continuation line

    Let's see an example. Suppose we want to match a FITS field called GSC_ID to nrow, i.e., region.mask.nrow (or even region->mask->nrow), where region is a REGION and mask is a MASK. The translation table looks like:

    ConvType  FITS-side  SCHEMA-side  FieldType  ratio  constructor  size
    name      GSC_ID     mask         struct     1.000  none         none
    cont      mask       nrow         int        1.000  none         none

    This is achieved by:

    dervish> schemaTransEntryAdd $xtbl name GSC_ID mask struct
    dervish> schemaTransEntryAdd $xtbl cont mask nrow int

    For each add, we may specify options like -proc, -dimen and -ratio to elaborate the conversion at that level. In this case, mask has a Tcl constructor called maskNew, so we use:

    dervish> schemaTransEntryAdd $xtbl name GSC_ID mask struct -proc maskNew
    dervish> schemaTransEntryAdd $xtbl cont mask nrow int

    The table now looks like:

    ConvType  FITS-side  SCHEMA-side  FieldType  ratio  constructor  size
    name      GSC_ID     mask         struct     1.000  maskNew      none
    cont      mask       nrow         int        1.000  none         none

    So each time a mask is found, maskNew will be used to construct that object. (If one doesn't give a constructor, the conversion routines will try other ways. See Object State Initialization for more details.)
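
    To make the relation between a main line and its continuation lines concrete, here is a small Python sketch that collapses such lines into a single dotted schema path. The function resolve_paths and the tuple layout are assumptions made for illustration; the actual conversion routines may represent entries differently.

```python
def resolve_paths(entries):
    """Collapse name/cont lines into (FITS field, schema path, leaf type).

    Each entry is (conv_type, fits_side, schema_side, field_type). A "cont"
    line names the parent schema field in its second column and the child
    field in its third column, as in the table above.
    """
    resolved = []
    current = None  # [fits_name, path_parts, field_type] of the open entry
    for conv, left, right, ftype in entries:
        if conv == "name":
            if current is not None:
                resolved.append(current)
            current = [left, [right], ftype]
        elif conv == "cont":
            if current is None or current[1][-1] != left:
                raise ValueError(f"continuation {left!r} has no parent line")
            current[1].append(right)
            current[2] = ftype
        else:
            raise ValueError(f"unknown entry type {conv!r}")
    if current is not None:
        resolved.append(current)
    return [(fits, ".".join(path), ftype) for fits, path, ftype in resolved]


entries = [
    ("name", "GSC_ID", "mask", "struct"),
    ("cont", "mask", "nrow", "int"),
]
paths = resolve_paths(entries)  # GSC_ID pairs with mask.nrow, an int
```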

    Array/Pointer Specification

    Array specifications are allowed in a translation table. Before explaining them, we first go through some basics.

    A schema field can be a combination of the following three types:

  • C elementary types (e.g., char, int) or custom-defined types (e.g., struct)
  • Pointer type with a number of indirections (stars), e.g., char ***s;
  • Array, e.g., int i[10][20];

    Although C generally allows users to mix pointers with arrays, these two types have important differences. Array memory is static and contiguous, whereas pointers require initialization and/or run-time memory allocation, and may point to non-contiguous memory blocks. Loosely speaking, the former doesn't need dereferencing (indirect addressing), while the latter does.

    For a field that is already an array (static, contiguous memory), say, int c[5], the translation table allows specifications like c[3] to get the 4th element of the array. In other words, you can do

    dervish> schemaTransEntryAdd $xtbl name GSC_ID {c\[3\]} int

    where the escape \ is used. Now the translation table looks like:

    ConvType  FITS-side  SCHEMA-side  FieldType  ratio  constructor  size
    name      GSC_ID     c[3]         int        1.000  none         none

    Therefore the value of GSC_ID will get copied to c[3], but not other elements of the array.

    Multidimensional arrays have similar syntax. Say c is now defined as c[5][10]. You may then specify c[3][2] in the table to access the 3rd element of the 4th array, whose value will be set to that of GSC_ID.

    You may even specify c[3] for the above multi-D array. In this case, c[3] just means c[3][0], but now GSC_ID has to be a 1-D array of size 10, because c[3] is an array of size 10 whose start is pointed to by &c[3][0]!
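
    The two cases can be mimicked in Python with a list of rows. This is only a model of the indexing rule described above; new_array and assign are hypothetical helpers, not part of Dervish.

```python
ROWS, COLS = 5, 10


def new_array():
    """int c[5][10], modelled as a zero-filled list of rows."""
    return [[0] * COLS for _ in range(ROWS)]


def assign(array, indices, value):
    """Store value at array[indices]; a partial index selects a whole row."""
    if len(indices) == 2:            # e.g. c[3][2]: one scalar element
        row, col = indices
        array[row][col] = value
    elif len(indices) == 1:          # e.g. c[3]: the whole row of 10
        if len(value) != COLS:
            raise ValueError(f"need a 1-D array of size {COLS}")
        array[indices[0]][:] = value
    else:
        raise ValueError("expected one or two indices")


c = new_array()
assign(c, (3, 2), 7)                      # scalar GSC_ID -> c[3][2]

d = new_array()
assign(d, (3,), list(range(100, 110)))    # 10-element GSC_ID -> c[3][0..9]
```

    A partial index therefore changes what the FITS side must supply: a scalar for c[3][2], a 10-element array for c[3].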

    A field might also be just a pointer to pointer to ... pointer, say,

    REGION ****region;
    int ****i;

    The first declaration may mean either a 3-D array of REGION pointers or a 4-D array of REGIONs. However, in practice, such an ambiguity doesn't exist because, in the single-star case, REGION *region indicates only one object, not a 1-D array of REGION objects. This is partly because region (of type REGION *) only points to ONE object at a time, and objects like REGION are meant to have their own memory blocks, which may not be contiguous.

    When elementary types are associated with pointers (in the above example, i), the concept of an object is much weaker. These fields, say i, don't have constructors and their memory can be (and in fact is) contiguous. So, in the above case, i can be treated as a 4-D array.

    We now agree that,

  • A pointer to pointer ... to pointer to a complex type is not meant to be an array of that many dimensions. Instead, it is an array of pointers to the complex type, with a dimension equal to the number of indirections minus 1.
  • A pointer to pointer ... to pointer to an elementary type is meant to be an array of that many dimensions. It points to an array of the elementary type whose dimension equals the number of indirections.

    Briefly, a pointer to a complex type is NOT an array but an indirection to a single object, whereas a pointer to an elementary type is an array.

    Dimension and size specification

    When indirection is present, the dimension/size specification is given in one single string, i.e., the option -dimen for schemaTransEntryAdd, with x used to separate the dimensions. For example, -dimen 5x3 means a 2-D array of size 5 and 3 in each dimension, just like C's [5][3], and indicates that the field has 3 indirections if it is a complex type, and 2 indirections if it is an elementary type.

    When giving dimension information, one must fully resolve the indirections for elementary types and leave one indirection for complex types.
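
    The rule can be written down as a short Python sketch. The helpers parse_dimen and required_indirections are hypothetical names, and the sketch handles only purely numeric -dimen strings such as 5x3.

```python
def parse_dimen(spec):
    """Split a -dimen string such as "5x3" into a list of sizes."""
    return [int(part) for part in spec.split("x")]


def required_indirections(spec, is_elementary):
    """Number of stars the field must have for this -dimen string.

    Elementary types are fully resolved (one star per dimension); complex
    types keep one extra star, the pointer to each individual object.
    """
    rank = len(parse_dimen(spec))
    return rank if is_elementary else rank + 1


sizes = parse_dimen("5x3")                               # sizes per dimension
stars_elementary = required_indirections("5x3", True)    # e.g. int **
stars_complex = required_indirections("5x3", False)      # e.g. REGION ***
```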

    Array access

    Array access, when indirection is present, is the same as for ordinary arrays, treating the dimensions added by -dimen as a regular part of the array. For example, MyType ****mytype[3][4] is just like a 6-D array when MyType is an elementary type, but is a 5-D array of pointers to MyType when it is a complex type.

    FITS-side Array access

    Array specification is also allowed in specifying FITS-side fields and follows a syntax similar to that for schema-side fields.

    dervish> schemaTransEntryAdd h1 name {NROW\[0\]} {subRegs\[0\]} struct -proc regNew -dimen 3
    dervish> schemaTransEntryAdd h1 cont {subRegs\[0\]} nrow int
    dervish> schemaTransEntryAdd h1 name {NROW\[1\]} {subRegs\[1\]} struct
    dervish> schemaTransEntryAdd h1 cont {subRegs\[1\]} nrow int
    dervish> schemaTransEntryAdd h1 name {NROW\[2\]} {subRegs\[2\]} struct
    dervish> schemaTransEntryAdd h1 cont {subRegs\[2\]} nrow int

    In the above example, field "subRegs" is of type REGION**. A 1-D array of size 3 is specified, with the Tcl verb "regNew" used to construct all 3 objects. Each object's "nrow" will be copied to NROW, which is a 1-D array of size 3.

    Angle brackets (< >) are also legitimate array specifiers.

    Portable Enumeration and Heap Capabilities

    Portable Enumeration

    Enumerated types are integral types. While they are distinct from integers, C allows conversion (or casting) between an enum value and an integer. When written to (or read from) a FITS file, however, an enum value becomes meaningless because it loses its context. To avoid this, the code is designed so that, when writing a FITS file, a composite ASCII string that includes both the type and the enum member associated with the value is written out; when reading a FITS file, both the type and the value string are read in, and the integer value associated with this string in this type is assigned to the field. Therefore, the context is maintained throughout the FITS I/O process.

    dervish> schemaTransEntryAdd h1 name MY_ENUM type enum

    Note that, instead of enum, one can still use "int".
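
    The round trip can be sketched in Python as follows. The enum Band, the registry ENUM_TYPES and the "TYPE:MEMBER" string layout are all assumptions made for this illustration; the actual composite string format written by the conversion routines is not specified here.

```python
from enum import IntEnum


class Band(IntEnum):
    """Hypothetical enum standing in for a C enum in a schema."""
    U = 0
    G = 1
    R = 2


ENUM_TYPES = {"Band": Band}  # registry playing the role of the schema


def enum_to_fits(value):
    """Write side: emit type and member name instead of the bare integer."""
    return f"{type(value).__name__}:{value.name}"


def enum_from_fits(text):
    """Read side: look the member up by name in the named type."""
    type_name, member = text.split(":", 1)
    return ENUM_TYPES[type_name][member]


stored = enum_to_fits(Band.G)        # composite string, not the integer 1
restored = enum_from_fits(stored)    # back to the enum member
```

    Storing the name rather than the integer means the file stays meaningful even if the numeric values of the members change between versions of the schema.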

    Heap Capability

    The heap is a variable-length storage area available in FITS. If a particular field varies in size, then you want to use the heap. Specifying heap is not much different from specifying other types. In place of "int", "enum", "struct", or whatever you may use as the 4th parameter, one simply says "heap". In addition, heap requires at least one more parameter, "-heaptype", and, if you are converting schema objects to FITS (TBLCOL), "-heaplength".

    Heaptype is the heap base type. It is usually one of the C elementary types, but one can use customized structures. Heaplength is an expression whose result gives the length of the heap, i.e., the number of base blocks. Obviously, when reading a FITS file, this information is not needed because the length information is stored in FITS.

    dervish> schemaTransEntryAdd h1 name HP test heap -heaptype FLOAT

    This example indicates that "test" is a float heap. Data from HP in FITS should be copied to test. When writing FITS, heaplength is absolutely required:

    dervish> schemaTransEntryAdd h1 name HP contents heap -heaptype FLOAT \
                 -heaplength ncol

    This example shows how a heap in a REGION is taken care of. Contents is a heap of type FLOAT. The heap length is given by the value of ncol in the same object. If heaplength is not a numerical string (say, 10, a case when one wants to use a fixed length for all objects), it should be a field specified with respect to the object. In this example, ncol is a field in REGION. If the desired length is not given directly in REGION but in objects pointed to by some field in REGION, one should use normal C syntax, e.g., mask->row0. The expression will be evaluated for each object to convert, and hence the length may vary between objects.

    There are cases where a heap field is multi-dimensional. The "rows_u16" in a REGION, for example, has two indirections (i.e., two stars) and hence acts like a 2-D array. A "-dimen" option has to be specified, just as for any other multi-indirection field. Like "-heaplength", this "-dimen" parameter also takes expressions. The result of evaluating the expressions yields the dimension information, which may vary from object to object.

    dervish> schemaTransEntryAdd h1 name HP rows_u16 heap -heaptype FLOAT \
                 -heaplength ncol -dimen nrow

    The result is that rows_u16 will be a field that has a size of nrow x ncol x sizeof(float) bytes.

    Multiplication is allowed when specifying -dimen or -heaplength. For example, one may use -heaplength 2 x ncol, or -dimen nrow x ncol, etc., where "x" is interpreted as the multiplication operation. For -heaplength, only the result of this product is significant. But for -dimen, "x" also conveys dimension information, which will be matched against the number of indirections of this field minus 1. For example, "rows_u16" has two indirections, and therefore "-dimen" should consist of only a single number; no "x" is allowed (the fastest-changing dimension is always given by -heaplength). If "rows_u16" were "float ***", then "-dimen" would require one "x" to be present, making it something like "nrow x ncol".

    In short, for heap types, the total number of blocks is given by the product of dimen and heaplength, with each block having sizeof(heaptype) bytes.
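
    As a worked version of this rule, the Python sketch below evaluates -heaplength and -dimen style products against one object and returns the heap size in bytes. The helpers eval_factors and heap_bytes and the HEAP_TYPE_SIZES table are hypothetical; the sketch assumes whitespace-separated factors such as "2 x ncol" and does not handle C expressions like mask->row0.

```python
import math
import struct

# Assumed base-type sizes, taken from the native C sizes Python reports.
HEAP_TYPE_SIZES = {"FLOAT": struct.calcsize("f"), "INT": struct.calcsize("i")}


def eval_factors(expr, obj):
    """Turn "2 x ncol" into a list of integers using one object's fields."""
    tokens = [tok for tok in expr.split() if tok != "x"]
    return [int(tok) if tok.isdigit() else int(obj[tok]) for tok in tokens]


def heap_bytes(obj, heaptype, heaplength, dimen=None):
    """Total heap size in bytes for one object."""
    length = math.prod(eval_factors(heaplength, obj))
    dims = eval_factors(dimen, obj) if dimen else []
    blocks = math.prod(dims) * length   # product of dimen and heaplength
    return blocks * HEAP_TYPE_SIZES[heaptype]


region = {"nrow": 3, "ncol": 4}   # stand-in for one REGION object
size = heap_bytes(region, "FLOAT", "ncol", dimen="nrow")
```

    Since the expressions are evaluated per object, two REGIONs with different nrow or ncol values would yield different heap sizes from the same table entry.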

    Both portable enums and heaps can be used in multi-line table entries.

    Tcl Interface Examples

  • Converting TBLCOL to one instance of schema

    Suppose we want to convert a FITS file to a schema HG. Specifically, the FITS file we use has fields RA_DEG, GSC_ID, PLATE_ID and MULTIPLE, which are double, int, char*4 and char respectively. We now want to convert them to sum, id, xLabel and yLabel, which in schema HG are double, int, char* and char* respectively.

    Build the translation table

    dervish> set xtbl [schemaTransNew]

    dervish> schemaTransEntryAdd $xtbl name RA_DEG sum double -ratio 2
    dervish> schemaTransEntryAdd $xtbl name GSC_ID id int -ratio 20
    dervish> schemaTransEntryAdd $xtbl name PLATE_ID xLabel string -dimen 5
    dervish> schemaTransEntryAdd $xtbl name MULTIPLE yLabel string -dimen 2

    Note that the conversion will match the other fields in HG not listed here directly to FITS fields and will ignore those it fails to locate in the FITS file.

    Get FITS file into TBLCOL

    dervish> set tblcol [handleNewFromType TBLCOL]
    dervish> fitsRead $tblcol /data/sdss2/GuideStars/v1_1/gsc/n0000/0001.gsc -hdu 1

    Create an instance of HG

    dervish> set schema [hgNew]

    Now convert

    dervish> tblToSchema $tblcol $schema $xtbl

    Note that it is at the user's discretion whether to delete the TBLCOL after the conversion is done.

    Write back to FITS just for fun

    dervish> set ntblcol [schemaToTbl $schema $xtbl]
    dervish> fitsWrite $ntblcol junkFits -ascii
  • Converting all in TBLCOL to array of instances of a schema

    Suppose we now want to convert a FITS file to a schema REGION. We convert a FITS file whose GSC_ID interests us to nrow, mask.nrow and type, which in schema REGION are int, int (in MASK) and int respectively. To make it more interesting, we specify different conversion ratios for these fields. Note that, for type MASK, we have to specify a Tcl constructor, maskNew.

    Build a translation table...

    dervish> set xtbl [schemaTransNew]
    dervish> schemaTransEntryAdd $xtbl name GSC_ID nrow int -ratio 2
    dervish> schemaTransEntryAdd $xtbl name GSC_ID mask struct -proc maskNew
    dervish> schemaTransEntryAdd $xtbl cont mask nrow int -ratio 20
    dervish> schemaTransEntryAdd $xtbl name GSC_ID type int -ratio 20

    Create a TBLCOL and get the FITS...

    dervish> set tblcol [handleNewFromType TBLCOL]
    dervish> fitsRead $tblcol /data/sdss2/GuideStars/v1_1/gsc/s8230/9537.gsc -hdu 1

    Create an empty container...

    dervish> set container [handleNewFromType ARRAY]

    Convert TBLCOL to array...

    dervish> tblToSchema $tblcol $container $xtbl -proc regNew -schemaName REGION

    Note that on the command line we used -proc regNew to construct the REGION objects to put in the array. If no -proc is specified, the default constructor or manual malloc will be used.

    Write back for fun...

    dervish> set ntblcol [schemaToTbl $container $xtbl -schemaName REGION]
    dervish> fitsWrite $ntblcol junkFits -ascii -pdu MINIMAL

    When one or more FITS-side fields are specified as arrays, one has to write to a binary FITS file.

    dervish> fitsWrite $ntblcol junkFits -binary -pdu MINIMAL