FITS Table Reading Under Dervish

FITS ASCII and Binary Tables are read into the columnar (TBLCOL) format. The following interfaces are provided for reading in FITS files:

  • ANSI C
  • Tcl

    Adjustments to Numeric Data

    As data is read from the FITS file, the scaling and zero adjustments (the TSCALn and TZEROn keywords, respectively) are applied only in particular instances. In all other cases, the user application must apply scaling and zeroing, along with any other adjustments, as appropriate.
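
    Where scaling is left to the user application, the usual FITS relation is physical = TZEROn + TSCALn * stored. The following is a minimal sketch, assuming the application has already obtained TSCALn and TZEROn from the header; the name apply_scaling is hypothetical and is not a Dervish routine.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Dervish): apply the FITS scaling
 * relation  physical = tzero + tscal * stored  to a column of values
 * that the reader left unscaled. */
void apply_scaling(const short *stored, double *physical, size_t n,
                   double tscal, double tzero)
{
    size_t i;

    for (i = 0; i < n; i++) {
        physical[i] = tzero + tscal * (double)stored[i];
    }
}
```

    The result is written to a separate double array so that the stored integer values remain available to the caller.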

    TSCALn and TZEROn are applied only for Binary Tables when they are the appropriate value for flipping the "signedness" of an integer data type. This data type flipping follows an unwritten FITS convention (it is not part of the FITS Standard). Sign conversion is done when TSCALn is 1 and

    These TZEROs correspond to

       128 = 2^(8-1)      32768 = 2^(16-1)      2147483648 = 2^(32-1)
    
    Note that the `H', `U', and `V' Binary Table data types are extensions to the FITS Standard.
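
    For the 16-bit case, the effect of TSCALn = 1 with TZEROn = 32768 can be sketched as follows. This assumes the stored values are signed two's-complement integers that are to be interpreted as unsigned; the function name is hypothetical, and the code is an illustration rather than the reader's actual implementation.

```c
#include <stdint.h>

/* Hypothetical helper: interpret a stored signed 16-bit value as an
 * unsigned one under TSCALn = 1, TZEROn = 32768 (2^(16-1)). */
uint16_t flip_to_unsigned16(int16_t stored)
{
    /* Adding 32768 modulo 2^16 is the same as toggling the top bit. */
    return (uint16_t)((uint16_t)stored ^ 0x8000u);
}
```

    Toggling the top bit avoids any intermediate overflow: the smallest stored value maps to 0 and the largest to 65535.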

    For cases where the "signedness" of the data type is not flipped, the user application must understand how values are read into the internal format. Some problems/questions would arise if the FITS reader were to apply these adjustments:

    Also, the TNULLn keyword value is read, but not applied to any data. No checks are made of incoming data against the values of keywords such as DATAMAX or DATAMIN (which are, strictly, used only for the Primary array or Random Groups). Thus, the user should not notice that TNULLn is not applied to undefined values in a Binary Table. The non-application of TNULLn to ASCII Table fields might be noticed, however, since the character representation of a number there may be invalid. Yet, because the Dervish FITS reader converts ASCII Table numbers to their binary representation, applying TNULLn does not make sense, as it is an arbitrary character string.

    Platform-specific Floating Point Values

    The FITS Table reader does not convert between the FITS Standard floating point format (IEEE) and the native machine's floating point representation; it simply copies the IEEE values into the in-memory copy. This poses no problem when reading ASCII Tables, where numbers are converted from their character representation. For Binary Tables, however, problems will occur if the platform on which the FITS Table reader runs does not use the IEEE format.
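
    An application that wants to guard against a non-IEEE platform could run a check along the following lines before trusting floating point columns from a Binary Table. This is a heuristic sketch; the function name is hypothetical and the test is not something the Dervish reader is known to perform.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the native float and double look
 * like IEEE 754 single and double precision, 0 otherwise. */
int native_floats_look_ieee(void)
{
    float one = 1.0f;
    uint32_t bits;

    if (sizeof(float) != 4 || sizeof(double) != 8) return 0;
    if (FLT_RADIX != 2 || FLT_MANT_DIG != 24 || DBL_MANT_DIG != 53) return 0;

    /* In IEEE single precision, 1.0 has the bit pattern 0x3F800000. */
    memcpy(&bits, &one, sizeof bits);
    return bits == 0x3F800000u;
}
```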

    Supported ASCII Table Data Types

    ASCII Tables specify data types for the fields using the TFORMn keyword. In general, FORTRAN-77 syntax for FORMAT statements is used. The Dervish FITS reader supports the following data types:

    where w is the width of the field and d indicates the position of the decimal point.

    Mapping ASCII Data Types

    When an ASCII Table is read in, the TFORMn data type is converted to an object schema data type maintained in the ARRAY structure's data.schemaType field (which describes the data type of the in-memory copy of the data). The ASCII data type is also implicitly mapped to one of the Binary Table data types. This mapping allows an ASCII Table to be written out as a Binary Table. The mapping from ASCII to Binary data types and object schema types is as follows:

    No scaling or zeroing is applied, nor is any "signedness" conversion performed (as the `I' data type does not have any implicit size for which appropriate TSCALn and TZEROn can be detected).

    Input Processing of Fields

    Although the TFORMn keyword uses a restricted FORTRAN-77 FORMAT specification to describe the field, FORTRAN style input of the field is not necessarily followed.

    Overlapped Fields

    The FITS Standard allows ASCII Tables to have overlapped fields. Under Dervish, though, when fields are read in, they are allocated their own space for data. Thus, overlapped fields will become disjoint upon being read. The possibly expected behaviour of modifying one field with the intent of updating an overlapped field will not occur. Writing of overlapped fields to a FITS file will also not be handled as expected.

    Things to Sort Out

    Extensions to the ASCII Table Standard

    The reading of floating-point data recognizes "NaN" as a keyword for a Not-a-Number (NaN). Many forms of "NaN" are recognized: leading whitespace is ignored, the check is case-insensitive, and any text after "NaN" is ignored. The binary result in the TBLCOL is an IEEE floating-point quiet (non-signalling) NaN.
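
    The recognition rule described above (skip leading whitespace, compare case-insensitively, ignore trailing text) can be sketched as follows. The function name and the explicit field width argument are assumptions made for this illustration; they do not describe the reader's actual code.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the field text starts with "NaN"
 * after optional leading whitespace, compared case-insensitively.
 * Anything after the three letters is ignored. */
int field_is_nan(const char *field, size_t width)
{
    size_t i = 0;

    while (i < width && isspace((unsigned char)field[i])) i++;
    if (width - i < 3) return 0;    /* too short to hold "NaN" */

    return tolower((unsigned char)field[i])     == 'n' &&
           tolower((unsigned char)field[i + 1]) == 'a' &&
           tolower((unsigned char)field[i + 2]) == 'n';
}
```

    Under this rule, fields such as "  NaN", "nan", and "NAN(quiet)" all match, while a field holding only blanks does not.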

    Supported Binary Table Data Types

    Binary Tables specify data types for the fields using the TFORMn keyword. The overall format is

         [r]T[ignored]
    
    where optional items are surrounded by square brackets ([]). The r specifies an element count. If missing, it defaults to one (1). It can be zero (0), indicating that the record storage area of the Binary Table HDU does not contain any data for that field.

    The T specifies the data type. The Dervish FITS Binary Table reader supports the following data types:

    The `P' data type may have optional information following the `P' specifier.
         [r]P[T[(maxElemCnt)]]
    
    The following character is taken as the data type of the heap data. It may be any of the supported Binary Table data types except `P' itself. The optional maximum element count, as permitted by a FITS Standard convention, is not used or checked by the Binary Table reader.
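
    The two TFORMn layouts above can be parsed with a few lines of C. This is a minimal sketch: the TformSketch structure and the function name are hypothetical, no check is made that the type letter is one of the supported data types, and the optional maximum element count is skipped, as the text describes.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical result type for this sketch; not a Dervish structure. */
typedef struct {
    long count;     /* r, defaults to 1 when absent          */
    char type;      /* T, the data type letter               */
    char heapType;  /* for `P' only: type of heap data, or 0 */
} TformSketch;

/* Parse "[r]T[ignored]" and "[r]P[T[(maxElemCnt)]]".
 * Returns 0 on success, -1 if the value is malformed. */
int parse_binary_tform(const char *tform, TformSketch *out)
{
    char *end;

    out->count = 1;
    out->type = '\0';
    out->heapType = '\0';

    if (isdigit((unsigned char)*tform)) {
        out->count = strtol(tform, &end, 10);
        tform = end;
    }
    if (!isalpha((unsigned char)*tform)) return -1;
    out->type = *tform++;

    if (out->type == 'P' && isalpha((unsigned char)*tform)) {
        if (*tform == 'P') return -1;   /* heap data may not itself be `P' */
        out->heapType = *tform;         /* (maxElemCnt), if any, is ignored */
    }
    return 0;
}
```

    For example, "1PE(100)" yields a count of 1, type `P', and heap type `E', while "0A" yields a count of 0, the empty-field case.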

    Dervish extends the FITS Binary Table Standard with the following data types:

    The following data types are currently not supported:

    Mapping Binary Data Types

    When a Binary Table is read in, the TFORMn data type is converted to an object schema data type maintained in the ARRAY structure's data.schemaType field (which describes the data type of the in-memory copy of the data). There is also an implicit mapping to an ASCII data type, allowing a Binary Table to be written out as an ASCII Table. The mapping from Binary to ASCII data types and object schema types is as follows:

    The following Binary data types are not mapped

    Scaling and zeroing are applied only when changing the "signedness" of integer data. When the TSCALn and TZEROn values are proper, the following sign conversions are done:

    Extensions to the Binary Table Standard

    Additional data types are recognized that are not part of the FITS Binary Table Standard. These new data types are recognized even if the Primary HDU specified SIMPLE=T (indicating that all HDUs in the FITS file comply with the FITS Standard). These extra data types are needed to address FITS' limitation of integer types with respect to their signs. The extension data types, `U', `V', and `H', handle this deficiency. This extension maintains FITS' rule, "once FITS, always FITS." Older FITS files will be read properly and reading new FITS files with any of these types should fail under older readers.

    Particular TSCALn and TZEROn values (set by a FITS community convention) flip the "signedness" of the input data (and data type). This convention is also applied to heap data (if the heap data type is appropriate), even though the FITS Standard indicates that for `P' data types, TSCALn and TZEROn are not defined.

    Overlapped Heap Data

    The FITS Standard does not prohibit Binary Table heap data from overlapping. Under Dervish, though, when heap data is read in, each variable length array is allocated its own space for data. Thus, overlapped data will become disjoint upon being read. The possibly expected behaviour of modifying one datum with the intent of updating an overlapped datum will not occur. Writing of overlapped heap data to a FITS file will also not be handled as expected.

    FITS/Dervish Headers

    When an ASCII or Binary Table is read, the FITS header associated with that Table is also read and retained. Although the retained Dervish header appears to be a FITS header, it should not be treated as such. The Dervish header will not contain all the original FITS header keywords and information (such as some comments, etc.).

    Although Dervish headers are not FITS headers, Dervish places the same restrictions on the format of Dervish headers as those that apply to FITS headers. For example, the length of character string values is limited. This allows Dervish headers to be readily converted to FITS headers, without any information loss, when a Table is written out to a FITS file.

    Adjustments to Keyword Values

    Some FITS header keywords are adjusted when they are read:

    Allocation of Dervish Header Space

    Dervish header space is allocated automatically for certain object schemas:

    Dervish header space for these object schemas is also automatically freed when the handles are deleted. For object schemas other than these, space for the Dervish header is allocated automatically, but the user is responsible for deleting these Dervish headers when the objects with which they are associated are deleted.

    Reading a Table in TBLCOL Format

    Using the columnar (TBLCOL) format allows immediate access to FITS Tables from Dervish without the need for object schemas and handles to be defined. The FITS Table reader, fitsRead / shFitsRead, will read into a TBLCOL format when a TBLCOL Dervish handle is used.

    In the TBLCOL format, the values from all rows for a field in the Table are read into an ARRAY. These values are stored in the same form as described in the FITS Table header (TFORMn keyword) with any conversions necessary for the machine type. For example, `I' data types are signed two byte integers (for Binary Tables; for ASCII Tables, `I' data types are the natural integer size for the particular platform). When stored into the array, the values are aligned on natural boundaries for the particular machine type. Besides describing Table fields, ARRAY object schemas also point to the field's data and optional information (TBLFLD object schemas).
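
    As an illustration of a machine-type conversion, FITS stores binary integers most significant byte first, so a two byte `I' value must be reassembled on little-endian machines. The sketch below decodes one such value portably; the function name is hypothetical and the code is not taken from the Dervish reader.

```c
#include <stdint.h>

/* Hypothetical helper: decode one Binary Table `I' value (two bytes,
 * most significant byte first, two's complement) into a native signed
 * 16-bit integer, independent of the host byte order. */
int16_t decode_fits_i16(const unsigned char *p)
{
    uint16_t u = (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);

    /* Map the unsigned bit pattern to its two's-complement value. */
    return (u & 0x8000u) ? (int16_t)((int)u - 65536) : (int16_t)u;
}
```

    Building the value from individual bytes, rather than casting the buffer pointer, also avoids alignment problems when the record storage area is not aligned on natural boundaries.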

    When the FITS Binary Table reader detects an empty field (the TFORMn element count is zero (0)), the field is retained in the Dervish Table. Dimension information (in the ARRAY's dimCnt and dim members) will be present, but may not necessarily indicate the absence of data. Instead, the ARRAY field's arrayPtr and data.dataPtr will be null pointers.

    Beyond the FITS Standard

    For the `A' data type (TFORMn keyword), an additional character is allocated per character string (in case the field is multidimensional) to permit all character strings to be null-terminated. This extra character is not output when writing a FITS file.

    For the `L' data type (TFORMn keyword), the FITS file value, a single character of `T' or `F', is converted to an unsigned byte value of 1 or 0 respectively. This permits a more `natural' conditional test in languages such as C.
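
    Both conversions can be sketched in a few lines. The helper names are hypothetical, and treating every character other than `T' as false is an assumption of this sketch, not a documented behaviour of the reader.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map a FITS logical character to 1 or 0.
 * Any character other than 'T' is treated as false in this sketch. */
unsigned char logical_to_byte(char fitsValue)
{
    return (unsigned char)(fitsValue == 'T');
}

/* Hypothetical helper: copy a fixed-width `A' field of `width`
 * characters into `dst`, which must hold width + 1 bytes, and
 * null-terminate the copy. */
void copy_char_field(char *dst, const char *src, size_t width)
{
    memcpy(dst, src, width);
    dst[width] = '\0';
}
```

    With the logical stored as 1 or 0, a C test can be written simply as if (flag), and the extra terminator lets each string be passed directly to routines such as strlen or printf.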

    Missing Out on FITS Features

    A User's Guide for the Flexible Image Transport System (FITS) (Version 3.1, May 2, 1994) describes a feature that is not supported:

    Reading Heap Data

    Heap data is read from Binary Tables in a similar fashion to data from the Record Storage Area (RSA) (the data that is read into the ARRAYs). There are some differences:

    Invalid Conditions and How They are Handled

    Most conditions that form an invalid FITS file result in aborting the read of the FITS file. However, there are some conditions that are handled, since the ASCII and Binary Table standards do not address them explicitly:

    Some Binary Table specific invalid conditions and resulting actions are: