FITS ASCII and Binary Tables are read into the columnar (TBLCOL) format. The following interfaces are provided for reading in FITS files:
As data is read from the FITS file, the scaling and zero adjustments (TSCALn and TZEROn keywords respectively) are applied in particular instances only; for other cases, it is up to the user application to apply scaling and zeroing, along with any other adjustments appropriately.
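Where the reader leaves scaling to the user application, the usual FITS relation (physical value = TZEROn + TSCALn * stored value) can be applied after the read. The following is only a minimal sketch in Python; the function name `apply_scaling` and the sample values are hypothetical and are not part of Dervish.

```python
def apply_scaling(stored_values, tscal=1.0, tzero=0.0):
    """Return physical values using the FITS relation tzero + tscal * stored.

    Hypothetical helper for illustration; Dervish itself leaves this step
    to the user application in most cases.
    """
    return [tzero + tscal * value for value in stored_values]


# Example: a field whose header carries TSCALn = 0.5 and TZEROn = 100.0.
raw_field = [0, 10, 20]
physical = apply_scaling(raw_field, tscal=0.5, tzero=100.0)
```

The defaults of 1.0 and 0.0 leave the stored values unchanged, which matches the meaning of absent TSCALn and TZEROn keywords.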
TSCALn and TZEROn are applied only for Binary Tables, and only when they have the appropriate values for flipping the "signedness" of an integer data type. This data type flipping follows an unwritten FITS convention (it is not part of the FITS Standard). Sign conversion is done when TSCALn is 1 and TZEROn is one of:
  128        = 2^7
  32768      = 2^(16-1)
  2147483648 = 2^(32-1)
Note that the `H', `U', and `V' Binary Table data types are extensions to the FITS Standard.
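The sign-flip test described above might be sketched as follows. This is an illustration under stated assumptions, not the Dervish implementation: the pairing of each offset with an 8, 16, or 32 bit integer width, the acceptance of either sign of TZEROn, and the names `SIGN_FLIP_OFFSETS`, `is_sign_flip`, and `flip_sign` are all assumptions made for the example.

```python
# Assumed pairing: an integer field of `bits` width uses the offset 2**(bits - 1).
SIGN_FLIP_OFFSETS = {8: 128, 16: 32768, 32: 2147483648}


def is_sign_flip(tscal, tzero, bits):
    """True when TSCALn/TZEROn only flip signedness (assumed rule)."""
    return tscal == 1 and abs(tzero) == SIGN_FLIP_OFFSETS.get(bits)


def flip_sign(stored, tzero):
    """Apply the offset in integer arithmetic, avoiding a trip through floats."""
    return stored + int(tzero)


# A signed 16 bit field with TSCALn = 1 and TZEROn = 32768 holds unsigned data.
lowest = flip_sign(-32768, 32768)
highest = flip_sign(32767, 32768)
```

Any other TSCALn/TZEROn combination fails the test, and the stored values are then left for the user application to scale.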
For cases where the "signedness" of the data type is not flipped, the user application must understand how values are read to some internal format. Some problems/questions arise if the FITS reader were to apply these adjustments:
Also, the TNULLn keyword value is read, but not applied to any data. No checks are made of incoming data against the values of keywords such as DATAMAX or DATAMIN (which are, strictly, used only for the Primary array or Random Groups). Thus, not applying TNULLn to undefined values in a Binary Table should not be noticed by the user. However, the non-application of TNULLn to ASCII Table fields might be noticed, since there the character representation of a number may be invalid. Yet, because the Dervish FITS reader converts ASCII Table numbers to their binary representation, applying TNULLn, which is an arbitrary character string, does not make sense.
The FITS Table reader does not convert between the FITS Standard floating point format (IEEE) and the native machine's floating point representation; it simply copies the IEEE values into the in-memory copy. This poses no problem when reading ASCII Tables. For Binary Tables, however, problems will occur if the platform on which the FITS Table reader runs does not use the IEEE format.
ASCII Tables specify data types for the fields using the TFORMn keyword. In general, FORTRAN-77 syntax for FORMAT statements is used. The Dervish FITS reader supports the following data types:
When an ASCII Table is read in, the TFORMn data type is converted to an object schema data type maintained in the ARRAY structure's data.schemaType field (which describes the data type of the in-memory copy of the data). The ASCII data type is also implicitly mapped to one of the Binary Table data types. This mapping allows an ASCII Table to be written out as a Binary Table. The mapping from ASCII to Binary data types and object schema types is as follows:
No scaling or zeroing is applied, nor is any "signedness" conversion performed (as the `I' data type does not have any implicit size for which appropriate TSCALn and TZEROn can be detected).
Although the TFORMn keyword uses a restricted FORTRAN-77 FORMAT specification to describe the field, FORTRAN style input of the field is not necessarily followed.
If the field contains neither a decimal point nor an exponent, it is treated as a real number of w digits, in which the rightmost d digits are to the right of the implicit decimal point, with leading zeros assumed, if necessary. Placement of the implicit decimal point is based on the last non-blank character in the field (trailing blanks do not count as positions to the right of the implicit decimal point).
Blank characters in a field are not interpreted as zeros; all zeros, even trailing zeros, must be explicit (this is equivalent to specifying BLANK= 'NULL' in the FORTRAN-77 OPEN statement).
Text of "NaN", or some derivative, indicates a Not-a-Number.
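The implicit decimal point rule can be illustrated with a short Python sketch. It is an approximation of the behaviour described above, not the Dervish code: the name `parse_ascii_real` is hypothetical, and the handling of FORTRAN `D' exponents by rewriting them to `E' is an assumption of this example.

```python
def parse_ascii_real(field, d):
    """Parse one ASCII Table real field with d implied fractional digits.

    Hypothetical helper. Blanks are never read as zeros: surrounding blanks
    are dropped, so the implicit decimal point is placed relative to the
    last non-blank character.
    """
    text = field.strip(" ")
    if "." in text or any(ch in text for ch in "eEdD"):
        # Explicit decimal point or exponent: d is not used.
        # Assumption: FORTRAN 'D' exponents are read like 'E' exponents.
        return float(text.replace("D", "E").replace("d", "e"))
    # Digits only: the rightmost d digits follow the implicit decimal
    # point, with leading zeros assumed when the field is short.
    return int(text) / 10 ** d


value = parse_ascii_real("  12345", 2)    # digits "12345" with d = 2
short = parse_ascii_real("5   ", 2)       # trailing blanks do not shift the point
```

A field that is entirely blank, or that has blanks between digits, raises `ValueError` here; the text above does not say how Dervish treats such fields.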
The FITS Standard allows ASCII Tables to have overlapped fields. Under Dervish, though, when fields are read in, they are allocated their own space for data. Thus, overlapped fields will become disjoint upon being read. The possibly expected behaviour of modifying one field with the intent of updating an overlapped field will not occur. Writing of overlapped fields to a FITS file will also not be handled as expected.
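The effect of giving each field its own storage can be shown with a small Python sketch. The record contents, the field names, and the column ranges are made up for the illustration.

```python
# One hypothetical ASCII Table row in which fields A and B share two columns.
record = b"0123456789"
field_ranges = {"A": (0, 6), "B": (4, 10)}   # bytes 4 and 5 overlap

# Reading copies each field into its own buffer, as Dervish is described to do.
columns = {
    name: bytearray(record[start:stop])
    for name, (start, stop) in field_ranges.items()
}

# Changing the shared columns through field A leaves field B untouched.
columns["A"][4:6] = b"XX"
```

After the assignment, field A holds the new bytes while field B still holds the values read from the file, which is the disjoint behaviour described above.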
The reading of floating-point data recognizes "NaN" as a keyword for a Not-a-Number (NaN). Many forms of "NaN" are recognized: leading whitespace is ignored, the check is case-insensitive, and any text after "NaN" is ignored. The binary result in the TBLCOL is an IEEE floating-point quiet (non-signalling) NaN.
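The matching rules for "NaN" given above can be sketched in Python as follows. The function name `parse_float_or_nan` is hypothetical, and Python's `float("nan")` stands in for the IEEE quiet NaN that the reader stores.

```python
import math


def parse_float_or_nan(field):
    """Return NaN for any recognized form of "NaN", else the parsed float.

    Hypothetical helper: leading whitespace is ignored, the comparison is
    case-insensitive, and anything after the first three letters is ignored.
    """
    text = field.lstrip()
    if text[:3].lower() == "nan":
        return float("nan")
    return float(text)


results = [parse_float_or_nan(s) for s in ("  NaN", "nan", "NANxyz")]
```

Each entry of `results` is a NaN, so it must be tested with `math.isnan` rather than compared with `==`.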
Binary Tables specify data types for the fields using the TFORMn keyword. The overall format is
[r]T[ignored]
where optional items are surrounded by square brackets ([]). The r specifies an element count. If missing, it defaults to one (1). It can be zero (0), indicating that the record storage area of the Binary Table HDU does not contain any data for that field.
The T specifies the data type. The Dervish FITS Binary Table reader supports the following data types:
[r]P[T[(maxElemCnt)]]
The character following `P' is taken as the data type of the heap data. It may be any of the supported Binary Table data types except `P' itself. The optional maximum element count, as permitted by a FITS Standard convention, is not used or checked by the Binary Table reader.
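A TFORMn value of the form described above could be split apart along these lines. This Python sketch is an assumption-laden illustration, not the Dervish parser: the name `parse_binary_tform`, the regular expression, and the choice to raise `ValueError` on malformed input are all made up for the example.

```python
import re

# Optional repeat count, one type letter, then text that is otherwise ignored.
_TFORM_PATTERN = re.compile(r"\s*(\d*)([A-Za-z])(.*)")


def parse_binary_tform(tform):
    """Split a Binary Table TFORMn value into (count, type, heap_type).

    Hypothetical helper. heap_type is only set for `P' fields and is the
    character that follows the `P'; a trailing (maxElemCnt) is ignored.
    """
    match = _TFORM_PATTERN.match(tform)
    if match is None:
        raise ValueError("unrecognized TFORMn value: %r" % tform)
    count = int(match.group(1)) if match.group(1) else 1   # default is one
    data_type = match.group(2)
    rest = match.group(3).strip()
    heap_type = None
    if data_type == "P" and rest:
        heap_type = rest[0]
        if heap_type == "P":
            raise ValueError("heap data type may not be `P' itself")
    return count, data_type, heap_type


parsed = [parse_binary_tform(t) for t in ("3J", "E", "0A", "1PE(100)")]
```

A count of zero is returned as is, so the caller can recognize a field that has no data in the record storage area.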
Dervish extends the FITS Binary Table Standard with the following data types:
When a Binary Table is read in, the TFORMn data type is converted to an object schema data type maintained in the ARRAY structure's data.schemaType field (which describes the data type of the in-memory copy of the data). There is also an implicit mapping to an ASCII data type, allowing a Binary Table to be written out as an ASCII Table. The mapping from Binary to ASCII data types and object schema types is as follows:
Scaling and zeroing are applied only when changing the "signedness" of integer data. When the TSCALn and TZEROn values are proper, the following sign conversions are done:
Additional data types are recognized that are not part of the FITS Binary Table Standard. These new data types are recognized even if the Primary HDU specified SIMPLE=T (indicating that all HDUs in the FITS file comply with the FITS Standard). These extra data types are needed to address FITS' limitation of integer types with respect to their signs. The extension data types, `U', `V', and `H', handle this deficiency. This extension maintains FITS' rule, "once FITS, always FITS." Older FITS files will be read properly and reading new FITS files with any of these types should fail under older readers.
Particular TSCALn and TZEROn values (set by a FITS community convention) flip the "signedness" of the input data (and data type). This convention is also applied to heap data (if the heap data type is appropriate), even though the FITS Standard indicates that for `P' data types, TSCALn and TZEROn are not defined.
The FITS Standard does not disallow Binary Table heap data to overlap. Under Dervish, though, when heap data is read in, each variable length array is allocated its own space for data. Thus, overlapped data will become disjoint upon being read. The possibly expected behaviour of modifying one datum with the intent of updating an overlapped datum will not occur. Writing of overlapped heap data to a FITS file will also not be handled as expected.
When an ASCII or Binary Table is read, the FITS header associated with that Table is also read and retained. Although the retained Dervish header appears to be a FITS header, it should not be treated as such. The Dervish header will not contain all the original FITS header keywords and information (such as some comments, etc.).
Although Dervish headers are not FITS headers, Dervish places the same restrictions on the format of Dervish headers as the FITS Standard places on FITS headers. For example, the length of character string values is limited. This allows Dervish headers to be readily converted to FITS headers, without any information loss, when a Table is written out to a FITS file.
Some FITS header keywords are adjusted when they are read:
Dervish header space is allocated automatically for certain object schemas:
Using the columnar (TBLCOL) format allows immediate access to FITS Tables from Dervish without the need for object schemas and handles to be defined. The FITS Table reader, fitsRead / shFitsRead, will read into a TBLCOL format when a TBLCOL Dervish handle is used.
In the TBLCOL format, the values from all rows for a field in the Table are read into an ARRAY. These values are stored in the same form as described in the FITS Table header (TFORMn keyword) with any conversions necessary for the machine type. For example, `I' data types are signed two byte integers (for Binary Tables; for ASCII Tables, `I' data types are the natural integer size for the particular platform). When stored into the array, the values are aligned on natural boundaries for the particular machine type. Besides describing Table fields, ARRAY object schemas also point to the field's data and optional information (TBLFLD object schemas).
When the FITS Binary Table reader detects an empty field (the TFORMn element count is zero (0)), the field is retained in the Dervish Table. Dimension information (in the ARRAY's dimCnt and dim members) will be present, but may not necessarily indicate the absence of data. Instead, the ARRAY field's arrayPtr and data.dataPtr will be null pointers.
For the `A' data type (TFORMn keyword), an additional character is allocated per character string (in case the field is multidimensional) to permit all character strings to be null-terminated. This extra character is not output when writing a FITS file.
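The extra terminator can be pictured with the following Python sketch. The helper names `store_strings` and `strip_terminators` are hypothetical, and the flat byte layout is a simplification of how the in-memory copy might be arranged.

```python
def store_strings(raw, width):
    """Copy width-byte strings out of raw, adding one NUL after each string."""
    stored = bytearray()
    for start in range(0, len(raw), width):
        stored += raw[start:start + width] + b"\0"
    return bytes(stored)


def strip_terminators(stored, width):
    """Drop the added NUL bytes again, as is done when writing a FITS file."""
    step = width + 1
    return b"".join(stored[i:i + width] for i in range(0, len(stored), step))


# A field holding two 3-character strings (for example, a dimensioned `A' field).
in_memory = store_strings(b"ABCDEF", 3)
on_disk = strip_terminators(in_memory, 3)
```

The in-memory copy is therefore one byte longer per string than the data in the file.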
For the `L' data type (TFORMn keyword), the FITS file value, a single character of `T' or `F', is converted to an unsigned byte value of 1 or 0 respectively. This permits a more `natural' conditional test in languages such as C.
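A minimal Python sketch of this conversion follows. The name `logical_to_byte` is hypothetical, and raising an error for any character other than `T' or `F' is an assumption of the example; the text above does not say how Dervish treats other values.

```python
def logical_to_byte(ch):
    """Map a FITS logical character to 1 or 0 (hypothetical helper)."""
    if ch == "T":
        return 1
    if ch == "F":
        return 0
    # Assumption: anything else is rejected in this sketch.
    raise ValueError("not a FITS logical value: %r" % ch)


flags = [logical_to_byte(ch) for ch in "TFT"]

# The converted values can be tested directly, without comparing characters.
selected = [index for index, flag in enumerate(flags) if flag]
```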
A User's Guide for the Flexible Image Transport System (FITS) (Version 3.1, May 2, 1994) describes a feature that is not supported:
Heap data is read from Binary Tables in a similar fashion as data from the Record Storage Area (RSA) (the data that is read into the ARRAYs). There are some differences:
The cnt (count) member is the dimension of the variable length array (for the particular Table row and field).
Most conditions that form an invalid FITS file result in aborting the read of the FITS file. However, there are some conditions that are handled, since the ASCII and Binary Table standards do not address them explicitly:
The reading of the Table is aborted.
NOTE: The FITS Binary Table standard does not mention overlapped fields. And, using precedents, the ASCII Tables extension does not permit fields to extend beyond NAXIS1 (see A User's Guide for the Flexible Image Transport System (FITS) (Version 3.1, May 2, 1994), Section 3.4.3, "Data Records in an ASCII Tables Extension" (page 42)).
Some Binary Table specific invalid conditions and resulting actions are:
Data from the heap area will be read "normally."
Data reading is aborted.
NOTE: There may be some controversy as to what the FITS keyword TDIMn really means for variable length data (it should be noted that the FITS Binary Table Standard lists TDIMn as a convention for multidimensional arrays, not as part of the Standard). Because of this, Dervish does not support dimensioned variable length data.
Data reading is aborted.