TBLCOL/Schema Conversion
Translation Table Syntax
Portable Enumeration and Heap
Capabilities
Examples
Single-line entries in a translation table are self-explanatory. Each one
matches a specific FITS field name to a field in a schema, and the conversion
routines fill one field with the values found in the other. Note that, when
adding entries, the 1st field is always on the FITS side and the 2nd is always
on the schema side. This way, a single translation table can be used for
conversion in either direction.
The following example shows how to add an entry for converting the field called "id"
in a schema to the FITS field "MY_ID".
dervish> schemaTransNew
h1
dervish> schemaTransEntryAdd h1 name MY_ID id
Entry added
dervish>
Note that capitalizing MY_ID is optional, as the routine converts all FITS field names
to upper case.
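To illustrate why one table serves both directions, here is a minimal Python sketch of single-line entries. It is not Dervish code: the names Entry, add_entry, fits_to_schema and schema_to_fits are hypothetical, and only the two behaviours stated above (FITS name first and upper-cased, schema name second) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One single-line entry: FITS-side name first, schema-side name second."""
    fits_name: str
    schema_name: str


def add_entry(table, fits_name, schema_name):
    # FITS field names are stored in upper case, so "my_id" and "MY_ID" are equivalent.
    table.append(Entry(fits_name.upper(), schema_name))


def fits_to_schema(table, fits_row):
    # Reading direction: copy each listed FITS value into its schema field.
    return {e.schema_name: fits_row[e.fits_name]
            for e in table if e.fits_name in fits_row}


def schema_to_fits(table, schema_obj):
    # Writing direction: the same entries, read the other way round.
    return {e.fits_name: schema_obj[e.schema_name]
            for e in table if e.schema_name in schema_obj}


table = []
add_entry(table, "my_id", "id")
print(fits_to_schema(table, {"MY_ID": 42}))
print(schema_to_fits(table, {"id": 42}))
```

Because each entry only records a pair of names, no direction is built into the table itself; the direction is chosen by the routine that walks it.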
Single-line entries are very useful when the fields one is interested in are all
elementary fields. However, that is not always the case. If a user-defined
structure is present in a schema and the user wants to convert it, a multiple-line
entry is used. A multiple-line entry has a single main line followed by one or
more continuation lines.
Let's see an example. Suppose we want to match a FITS
field called GSC_ID to nrow, i.e., region.mask.nrow (or even region->mask->nrow),
where region is a REGION and mask is a MASK. The translation table looks like:
ConvType  FITS-side  SCHEMA-side  FieldType  ratio  constructor  size
name      GSC_ID     mask         struct     1.000  none         none
cont      mask       nrow         int        1.000  none         none
This is achieved by:
dervish> schemaTransEntryAdd $xtbl name GSC_ID mask struct
dervish> schemaTransEntryAdd $xtbl cont mask nrow int
For each add, we may specify options like -proc, -dimen and -ratio to
elaborate the conversion at that level. In this case, mask has a Tcl
constructor called maskNew, so we write:
dervish> schemaTransEntryAdd $xtbl name GSC_ID mask struct -proc maskNew
dervish> schemaTransEntryAdd $xtbl cont mask nrow int
The table now looks like:
ConvType  FITS-side  SCHEMA-side  FieldType  ratio  constructor  size
name      GSC_ID     mask         struct     1.000  maskNew      none
cont      mask       nrow         int        1.000  none         none
So each time a mask is found, maskNew will be used to construct that object.
(If no constructor is given, the conversion routines will try
other ways. See Object State Initialization
for more details.)
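The way a main line and its continuation lines chain together can be sketched in Python. This is only an illustration under assumptions: the function resolve_paths and the tuple layout are hypothetical, and the sketch assumes that a continuation line names its parent field in the FITS-side column, as in the table above.

```python
def resolve_paths(entries):
    """Map each FITS field to the dotted schema path its entry describes.

    Each entry is (conv_type, fits_side, schema_side, field_type).
    """
    paths = {}
    current = None  # FITS field of the main line being continued
    for conv_type, left, right, _field_type in entries:
        if conv_type == "name":
            # A main line starts a new path at the schema-side field.
            current = left
            paths[current] = [right]
        elif conv_type == "cont":
            # A continuation line must extend the last field of the current path.
            if current is None or paths[current][-1] != left:
                raise ValueError(f"continuation for {left!r} has no matching parent")
            paths[current].append(right)
        else:
            raise ValueError(f"unknown entry type {conv_type!r}")
    return {fits: ".".join(parts) for fits, parts in paths.items()}


entries = [
    ("name", "GSC_ID", "mask", "struct"),
    ("cont", "mask", "nrow", "int"),
]
print(resolve_paths(entries))
```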
Array specifications are allowed in a translation
table. Before explaining them, we first go through some basics.
A schema field can be a combination of the following three types:
C elementary types (e.g., char, int) or custom-defined types (e.g., struct)
Pointer type with a number of indirections (stars), e.g., char ***s;
Array, e.g., int i[10][20];
Although C generally allows users to mix pointers with arrays, these two
types have important differences. Array memory is static and contiguous,
whereas pointers
require initialization and/or run-time memory allocation, and may point to
discrete memory blocks. Loosely speaking, accessing the former does not
require dereferencing (indirect addressing), while accessing the latter does.
For a field that is already an array (static, contiguous memory), say int
c[5], the translation table allows specifications like c[3] to get the
4th element of the array. In other words, you can do
dervish> schemaTransEntryAdd $xtbl name GSC_ID {c\[3\]} int
where the escape character \ is used. Now the translation table looks like:
ConvType  FITS-side  SCHEMA-side  FieldType  ratio  constructor  size
name      GSC_ID     c[3]         int        1.000  none         none
Therefore the value of GSC_ID will get copied to c[3], but not other
elements of the array.
Multidimensional arrays have a similar syntax. Say c is now defined as c[5][10].
You may then specify c[3][2] in the table to access the 3rd element of the 4th array,
whose value will be set to that of GSC_ID.
You may even specify c[3] for the above multidimensional array.
In this case, c[3] just means c[3][0], but GSC_ID now has to be a 1-D array of
size 10, because c[3] is an array of size 10 whose first element is at &c[3][0].
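The relationship between a full subscript such as c[3][2] and a partial one such as c[3] can be checked with a small Python sketch of C's row-major layout. The function select is hypothetical and only models the arithmetic; it says nothing about how the conversion routines are implemented.

```python
def select(shape, index):
    """Return (offset, count): where the selected elements start in the
    flattened row-major array, and how many contiguous elements are selected."""
    if len(index) > len(shape):
        raise IndexError("too many subscripts")
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"subscript {i} out of range for size {n}")
    # Missing trailing subscripts are treated as 0, so c[3] starts at c[3][0].
    padded = list(index) + [0] * (len(shape) - len(index))
    offset = 0
    for i, n in zip(padded, shape):
        offset = offset * n + i
    # Every dimension left unsubscripted contributes its full size.
    count = 1
    for n in shape[len(index):]:
        count *= n
    return offset, count


print(select((5,), (3,)))       # int c[5],     c[3]
print(select((5, 10), (3, 2)))  # int c[5][10], c[3][2]
print(select((5, 10), (3,)))    # int c[5][10], c[3]
```

The last call selects 10 contiguous elements, which is why the matching FITS field must itself be a 1-D array of size 10.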
A field might also be just a pointer to pointer to ... pointer, say,
REGION ****region;
int ****i;
This may mean either a 3-D array of REGION pointers or
a 4-D array of REGIONs. In practice, however, such an ambiguity doesn't
exist because, in the single-star case, REGION *region indicates
only one object, not a 1-D array of REGION objects. This is
so partly because region (of type REGION*) points to only ONE object
at a time, and objects like REGION are meant to have their own
memory blocks, which may not be contiguous.
When elementary types are associated with pointers (i in the above example),
the concept of an object is much weaker. These fields, say i, don't have
constructors and their memory can be (and in fact is) contiguous. So,
in the above case, i can be treated as a 4-D array.
We therefore adopt the following conventions:
A pointer to pointer ... to pointer to a complex type is not meant to
be an array of that many dimensions. Instead, it is an array of pointers to
the complex type, with a dimension equal to the number of indirections minus 1.
A pointer to pointer ... to pointer to an elementary type is meant to
be an array of that many dimensions. It points to an array of the elementary
type whose dimension equals the number of
indirections.
Briefly, a pointer to a complex type is NOT an array but an indirection to
a single object, whereas a pointer to an elementary type is an array.
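The two conventions reduce to a one-line rule, sketched here in Python. The helper array_rank and the ELEMENTARY_TYPES set are hypothetical and deliberately naive: the sketch only counts stars in a declaration and treats any base type outside the set as a complex type.

```python
# Assumed list of C elementary types, for illustration only.
ELEMENTARY_TYPES = {"char", "short", "int", "long", "float", "double"}


def array_rank(declaration):
    """Number of array dimensions implied by the stars in a declaration."""
    base_type = declaration.split()[0]
    stars = declaration.count("*")
    if base_type in ELEMENTARY_TYPES:
        # Every indirection is an array dimension.
        return stars
    # One indirection is reserved for pointing at the object itself.
    return max(stars - 1, 0)


for decl in ("int ****i", "REGION ****region", "REGION *region"):
    print(decl, "->", array_rank(decl))
```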
When indirection is present, the dimensions and sizes are specified in
one single string, given to the -dimen option of schemaTransEntryAdd,
with x separating the dimensions. For example, -dimen
5x3 means a 2-D array of sizes 5 and 3, just like C's [5][3],
and indicates that the field has 3 indirections if it is a complex type, and
2 indirections if it is an elementary type.
When giving dimension information, one must fully resolve the indirections
for elementary types and leave one indirection for complex types.
Array access, when indirection is present, is just the same as
for ordinary arrays, treating the dimensions added by -dimen
as a regular part of the array. For example, MyType ****mytype[3][4]
is just like a 6-D array when MyType is an elementary type, but is a 5-D
array of pointers to MyType when it is a complex type.
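A short Python sketch can tie the -dimen string to the indirection count and to the combined rank of a field that mixes stars and brackets. All names here (parse_dimen, indirections_for, total_rank, ELEMENTARY_TYPES) are hypothetical, and the sketch assumes that -dimen holds plain integers separated by x, as in 5x3.

```python
# Assumed list of C elementary types, for illustration only.
ELEMENTARY_TYPES = {"char", "short", "int", "long", "float", "double"}


def parse_dimen(dimen):
    """Turn a -dimen string such as "5x3" into a list of sizes."""
    return [int(size) for size in dimen.split("x")]


def indirections_for(dimen, base_type):
    """Indirections a field must have for this -dimen string to fit it."""
    rank = len(parse_dimen(dimen))
    # Complex types keep one extra indirection that points at the object.
    return rank if base_type in ELEMENTARY_TYPES else rank + 1


def total_rank(base_type, indirections, static_shape):
    """Rank of a field declared with both stars and [n] brackets."""
    if base_type in ELEMENTARY_TYPES:
        pointer_rank = indirections
    else:
        pointer_rank = max(indirections - 1, 0)
    return pointer_rank + len(static_shape)


print(parse_dimen("5x3"))
print(indirections_for("5x3", "int"), indirections_for("5x3", "REGION"))
# MyType ****mytype[3][4], first as an elementary type, then as a complex one.
print(total_rank("int", 4, (3, 4)), total_rank("MyType", 4, (3, 4)))
```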
Array specification is also allowed for FITS-side fields and
follows a syntax similar to that for schema-side fields.
dervish> schemaTransEntryAdd h1 name {NROW\[0\]} {subRegs\[0\]} struct -proc regNew -dimen 3
dervish> schemaTransEntryAdd h1 cont {subRegs\[0\]} nrow int
dervish> schemaTransEntryAdd h1 name {NROW\[1\]} {subRegs\[1\]} struct
dervish> schemaTransEntryAdd h1 cont {subRegs\[1\]} nrow int
dervish> schemaTransEntryAdd h1 name {NROW\[2\]} {subRegs\[2\]} struct
dervish> schemaTransEntryAdd h1 cont {subRegs\[2\]} nrow int
In the above example, the field "subRegs" is of type REGION**. A 1-D array of size 3
is specified, with the Tcl verb "regNew" used to construct all 3 objects. Each object's
"nrow" will be copied to NROW, which is a 1-D array of size 3.
Arrow brackets are also legitimate array specifiers.
Portable Enumeration
Enumerated types are integral types. While they are distinct from integers,
C allows conversion (or casting) between enum values and integers. When written
to (or read from) a FITS file, however, a bare enum value becomes meaningless
because it loses its context. To avoid this, the code was written so that, when
writing a FITS file, a composite ASCII string that includes both the type and the
enum member associated with the value is written out; when reading a FITS file,
both the type and the value string are read in, and the integer value associated
with this string in that type is assigned to the field. The context is therefore
maintained throughout the FITS I/O process.
dervish> schemaTransEntryAdd h1 name MY_ENUM type enum
Note that, instead of enum, one can still use "int".
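The round trip through a composite string can be sketched in Python. Everything specific here is an assumption made for illustration: the text does not give the actual layout of the composite string, so the "Type:MEMBER" form, the Color enum and the REGISTRY lookup are all hypothetical.

```python
from enum import IntEnum


class Color(IntEnum):
    """Hypothetical enumerated type used only for this illustration."""
    RED = 0
    GREEN = 1
    BLUE = 2


# Known enum types, looked up by name when reading.
REGISTRY = {"Color": Color}
SEPARATOR = ":"  # assumed separator; the real format is not given in the text


def encode(value):
    # Writing: keep the type name and member name, not the bare integer.
    return f"{type(value).__name__}{SEPARATOR}{value.name}"


def decode(text):
    # Reading: find the type first, then the member within that type.
    type_name, member_name = text.split(SEPARATOR, 1)
    return REGISTRY[type_name][member_name]


written = encode(Color.GREEN)
print(written, int(decode(written)))
```

Storing names rather than integers means the stored value stays meaningful even when read without the original C header at hand.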
Heap Capability
The heap is a variable-length storage area available in FITS. If a particular
field has variable size, then you want to use the heap. Specifying heap is not much
different from specifying other types. In place of "int", "enum", "struct",
or whatever you may use as the 4th parameter, one simply says "heap".
In addition, heap requires at least one more parameter, "-heaptype",
and, if you are converting schema objects to FITS (TBLCOL), also "-heaplength".
Heaptype is the heap base type. It is usually one of the C elementary types,
but customized structures can be used. Heaplength is an expression whose
result gives the length of the heap, i.e., the number of base blocks. When
reading a FITS file, this information is not needed because the length
is stored in the FITS file.
dervish> schemaTransEntryAdd h1 name HP test heap -heaptype FLOAT
This example indicates that "test" is a float heap. Data from HP in FITS will be
copied to test.
When writing FITS, heaplength is absolutely required:
dervish> schemaTransEntryAdd h1 name HP contents heap -heaptype FLOAT \
-heaplength ncol
This example shows how a heap in a REGION is handled. Contents is a heap
of type FLOAT. The heap length is given by the value of ncol in the same object.
If heaplength is not a numerical string (such as 10, used when one wants a
fixed length for all objects), it should be a field specified with respect to
the object. In this example, ncol is a field in REGION. If the desired length is
not given directly in REGION but in an object pointed to by some field in REGION,
one should use normal C syntax, e.g., mask->row0. The expression is
evaluated for each object being converted, and hence the length may vary between
objects.
There are cases where a heap field is multi-dimensional. The "rows_u16"
in a REGION, for example, has two indirections (i.e., two stars) and
hence acts like a 2-D array. A "-dimen" option has to be specified, just
as for any other multi-indirection field. Like "-heaplength", this
"-dimen" parameter also takes expressions. Evaluating the
expressions yields the dimension information, which may vary from object
to object.
dervish> schemaTransEntryAdd h1 name HP rows_u16 heap -heaptype FLOAT \
-heaplength ncol -dimen nrow
The result is that rows_u16 will be a field that has a size of nrow x ncol
x sizeof(float) bytes.
Multiplication is allowed when specifying -dimen or -heaplength. For example,
one may use -heaplength 2 x ncol, or -dimen nrow x ncol, etc., where "x"
is interpreted as the multiplication operator. For -heaplength, only the
result of the product is significant. For -dimen, however, "x" also conveys
dimension information, which is matched against the number of
indirections of the field minus 1. For example, "rows_u16" has two
indirections, and therefore "-dimen" should consist of a single number only;
no "x" is allowed. (The fastest-changing dimension is always given by
-heaplength.) If "rows_u16" were "float ***", then "-dimen" would require
one "x" to be present, making it something like "nrow x ncol".
In short, for heap types, the total number of blocks is given by the product
of dimen and heaplength, with each block having sizeof(heaptype) bytes.
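The sizing rule can be sketched in Python as follows. This is an illustration under assumptions, not the conversion code: the functions lookup, factors and heap_size are hypothetical, objects are modelled as nested dictionaries with made-up values, and factors are assumed to be separated by " x " with surrounding spaces.

```python
import math
import struct


def lookup(obj, path):
    """Follow a C-style path such as "mask->row0" through nested dicts."""
    for part in path.split("->"):
        obj = obj[part]
    return obj


def factors(expression, obj):
    """Split "a x b" into factors; each is a number or a field of obj."""
    values = []
    for token in expression.split(" x "):
        token = token.strip()
        values.append(int(token) if token.isdigit() else int(lookup(obj, token)))
    return values


def heap_size(obj, heaplength, itemsize, indirections=1, dimen=None):
    """Return (blocks, bytes) for one object's heap field."""
    dims = factors(dimen, obj) if dimen else []
    # -dimen must supply one dimension per indirection, minus 1.
    if len(dims) != indirections - 1:
        raise ValueError("-dimen does not match the number of indirections")
    blocks = math.prod(dims) * math.prod(factors(heaplength, obj))
    return blocks, blocks * itemsize


FLOAT_SIZE = struct.calcsize("=f")  # 4 bytes
region = {"nrow": 3, "ncol": 4, "mask": {"row0": 7}}  # made-up values
print(heap_size(region, "ncol", FLOAT_SIZE))
print(heap_size(region, "ncol", FLOAT_SIZE, indirections=2, dimen="nrow"))
```

Because the expressions are evaluated against each object in turn, two objects with different nrow or ncol values get heaps of different sizes.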
Both portable enum and heap can be used in multi-line table entries.
Converting TBLCOL to one instance of schema
Suppose we want to convert a FITS file to a schema HG. Specifically,
the FITS file we use has fields such as RA_DEG, GSC_ID, PLATE_ID and
MULTIPLE, which are double, int, char*4 and char respectively. We now
want to convert them to sum, id, xLabel and yLabel, which in schema HG
are double, int, char* and char* respectively.
Build the translation table
dervish> set xtbl [schemaTransNew]
dervish> schemaTransEntryAdd $xtbl name RA_DEG sum double -ratio 2
dervish> schemaTransEntryAdd $xtbl name GSC_ID id int -ratio 20
dervish> schemaTransEntryAdd $xtbl name PLATE_ID xLabel string -dimen 5
dervish> schemaTransEntryAdd $xtbl name MULTIPLE yLabel string -dimen 2
Note that the conversion will match the other fields in HG not listed here
directly to FITS fields, and will ignore those it fails to locate in the FITS file.
Get FITS file into TBLCOL
dervish> set tblcol [handleNewFromType TBLCOL]
dervish> fitsRead $tblcol /data/sdss2/GuideStars/v1_1/gsc/n0000/0001.gsc -hdu 1
Create an instance of HG
dervish> set schema [hgNew]
Now convert
dervish> tblToSchema $tblcol $schema $xtbl
Note that it is at the user's discretion whether to delete the TBLCOL after
the conversion is done.
Write back to FITS just for fun
dervish> set ntblcol [schemaToTbl $schema $xtbl]
dervish> fitsWrite $ntblcol junkFits -ascii
Converting all in TBLCOL to array of instances of a schema
Suppose we now want to convert a FITS file to a schema REGION. We convert
a FITS file whose GSC_ID interests us to nrow, mask.nrow and type,
which in schema REGION are int, int (in MASK) and int respectively.
To make it more interesting, we specify different conversion ratios for
these fields. Note that, for type MASK, we have to specify a Tcl constructor,
maskNew.
Build a translation table...
dervish> set xtbl [schemaTransNew]
dervish> schemaTransEntryAdd $xtbl name GSC_ID nrow int -ratio 2
dervish> schemaTransEntryAdd $xtbl name GSC_ID mask struct -proc maskNew
dervish> schemaTransEntryAdd $xtbl cont mask nrow int -ratio 20
dervish> schemaTransEntryAdd $xtbl name GSC_ID type int -ratio 20
Create a TBLCOL and get the FITS...
dervish> set tblcol [handleNewFromType TBLCOL]
dervish> fitsRead $tblcol /data/sdss2/GuideStars/v1_1/gsc/s8230/9537.gsc -hdu 1
Create an empty container...
dervish> set container [handleNewFromType ARRAY]
Convert TBLCOL to array...
dervish> tblToSchema $tblcol $container $xtbl -proc regNew -schemaName REGION
Note that on the command line we used -proc regNew to construct the REGION objects to put
in the array. If no -proc is specified, the default constructor or manual memory
allocation will be used.
Write back for fun...
dervish> set ntblcol [schemaToTbl $container $xtbl -schemaName REGION]
dervish> fitsWrite $ntblcol junkFits -ascii -pdu MINIMAL
When one or more FITS-side fields are specified as arrays, one has to write
to a binary FITS file.
dervish> fitsWrite $ntblcol junkFits -binary -pdu MINIMAL