This document gives an insider's look at the design of the translation table, a data structure now heavily used and relied upon in the SDSS environment.

Why Translation Table

Filling predefined data structures with data from arbitrary FITS (Flexible Image Transport System) files is highly desirable. Before the TBLCOL/translation table package existed, FITS I/O was mostly written for individual structures on an ad hoc basis. However, because of the large number of different data structures (roughly 50 in dervish alone), writing ad hoc FITS I/O code becomes very expensive and inefficient. Dervish (Survey Human Interface and Visualization Environment) can already recognize data structures at runtime, and with TBLCOL it can read arbitrary FITS files into a TBLCOL data structure. To make full use of these capabilities, the translation table scheme was devised to decouple structure-specific details from data conversion.

In this scheme, a translation table is built at runtime to map data sources to data destinations, while the actual data conversion is done by compiled code. In other words, a translation table is a set of string-based commands that instruct the compiled code how the data conversion should be done. Translation tables contain information specific to a particular data structure, but the underlying code is general and invariant.
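The split between string-based table entries and generic compiled code can be illustrated with a minimal sketch. Everything in it is hypothetical and invented for illustration: DemoObj, DemoEntry, DemoColumn, the demo_schema array (a stand-in for the runtime SCHEMA information) and demo_convert() are not part of dervish, and source values are stored as doubles only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical destination structure; not an SDSS type. */
typedef struct { int id; double mag; } DemoObj;

/* Stand-in for runtime schema information: field name, type, offset. */
typedef enum { F_INT, F_DOUBLE } FieldType;
typedef struct { const char *name; FieldType type; size_t offset; } FieldInfo;

static const FieldInfo demo_schema[] = {
    { "id",  F_INT,    offsetof(DemoObj, id)  },
    { "mag", F_DOUBLE, offsetof(DemoObj, mag) },
};

/* One string-based command: copy source column `src` into field `dst`. */
typedef struct { const char *src; const char *dst; } DemoEntry;

/* A source column: a name plus one value per row. */
typedef struct { const char *name; const double *values; } DemoColumn;

static const FieldInfo *find_field(const char *name)
{
    size_t n = sizeof demo_schema / sizeof demo_schema[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(demo_schema[i].name, name) == 0)
            return &demo_schema[i];
    return NULL;
}

static const DemoColumn *find_column(const DemoColumn *cols, size_t ncols,
                                     const char *name)
{
    for (size_t i = 0; i < ncols; i++)
        if (strcmp(cols[i].name, name) == 0)
            return &cols[i];
    return NULL;
}

/* Generic conversion: it knows the destination only through the schema
 * and the table. Returns the number of entries that were applied. */
int demo_convert(const DemoEntry *table, size_t nentries,
                 const DemoColumn *cols, size_t ncols,
                 size_t row, void *obj)
{
    int applied = 0;
    assert(table != NULL && cols != NULL && obj != NULL);
    for (size_t i = 0; i < nentries; i++) {
        const FieldInfo *f = find_field(table[i].dst);
        const DemoColumn *c = find_column(cols, ncols, table[i].src);
        if (f == NULL || c == NULL)
            continue;                       /* skip unresolvable entries */
        char *p = (char *)obj + f->offset;  /* address of the field */
        if (f->type == F_INT)
            *(int *)p = (int)c->values[row];
        else
            *(double *)p = c->values[row];
        applied++;
    }
    return applied;
}
```

Supporting a different destination structure in this sketch means supplying a different schema and a different table; demo_convert() itself stays unchanged, which is the point of the scheme.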

How a Translation Table Works

For the translation table syntax, see the separate syntax documentation.

A translation table is a collection of translation entries; each entry effectively instructs the conversion code to do one conversion. For an instruction to be flexible enough, a table entry must:

  1. Allow the data source and data destination to be specified at runtime.

  2. Allow the destination data type to be specified at runtime.

    Note that the source data type does not have to be specified. If the source is a TBLCOL or a C object, the source type information is readily available from the TBLCOL or the SCHEMA. The same is most likely true of the destination data, but there are cases where the destination type does have to be specified, e.g., aliased types (created by typedef).

  3. Allow the destination dimensionality to be specified at runtime for pointer-type destinations.

    This requirement is particularly useful when the destination is a string represented by a pointer to char: the source data should be copied to the memory area pointed to by the pointer, not to the memory area occupied by the pointer itself. The same applies to other pointer types, pointers with multiple levels of indirection, etc.

  4. Support the case where the destination is a field in a structure that is embedded or reachable through indirection (pointers).

    In SDSS, data structures in which one field points to another structure are frequently encountered. The translation table scheme must be able to copy the data to the real destination.

    Note that we can imitate the process of traversing links in memory by using multiple source/destination pairs, where each pair is an entry. Thus the translation table should support multi-line translation entries.

  5. Be able to call a Tcl constructor for destinations that are pointers to other data structures.

  6. Support the heap data type, whose dimensionality is known only at runtime and can vary from one object to another.

    Note that this requirement differs from requirement 3, where the dimensionality is invariant among objects.

  7. Support an optional source/destination data ratio change.

    This is useful when the source and destination differ in units.
Translation Table Data Structure

These requirements lead to the following table entry data structure:

    typedef struct _xtable_entry {
        CONVTYPE type;              /* Req.4: conversion type */
        char* src;                  /* Req.1: source string */
        char* dst;                  /* Req.1: destination string */
        char* dsttype;              /* Req.2: dst type string */
        char* heaptype;             /* Req.6: heap base type */
        char* heaplen;              /* Req.6: number of blocks */
        DSTTYPE dstDataType;        /* Req.2: destination type */
        char* proc;                 /* Req.5: malloc command for dst */
        char* size;                 /* Req.3: size string */
        int num[MAX_INDIRECTION];   /* number of blocks to malloc */
        double srcTodst;            /* Req.7: optional ratio of data */
    } SCHEMATRANS_ENTRY;

where dstDataType and num are the enumerated and numeric representations of dsttype and size, respectively. Note that num is not filled in when the entry is added; it is filled in later, when the table syntax is checked.
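As an illustration of how a size string might be turned into its numeric form, here is a minimal sketch. It rests on assumptions that the text does not confirm: the size string is taken to be a list of positive integers separated by 'x' (e.g. "5x10"), DEMO_MAX_INDIRECTION is an arbitrary stand-in for MAX_INDIRECTION, and demo_parse_size() is a hypothetical helper, not a dervish routine.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define DEMO_MAX_INDIRECTION 4   /* assumed value, for illustration only */

/* Parse an assumed "AxBxC" size string into num[].
 * Returns the number of dimensions found, or -1 if the string is
 * malformed or has more than DEMO_MAX_INDIRECTION dimensions. */
int demo_parse_size(const char *size, int num[DEMO_MAX_INDIRECTION])
{
    const char *p = size;
    int n = 0;

    assert(size != NULL && num != NULL);
    for (int i = 0; i < DEMO_MAX_INDIRECTION; i++)
        num[i] = 0;

    for (;;) {
        char *end;
        long v;

        if (n == DEMO_MAX_INDIRECTION)
            return -1;                    /* too many dimensions */
        v = strtol(p, &end, 10);
        if (end == p || v <= 0 || v > INT_MAX)
            return -1;                    /* not a positive integer */
        num[n++] = (int)v;
        if (*end == '\0')
            return n;                     /* consumed the whole string */
        if (*end != 'x')
            return -1;                    /* unexpected separator */
        p = end + 1;
    }
}
```

Keeping both the string and the parsed numbers in the entry mirrors the layout above: the string can be written back out unchanged, while the conversion code works from the integers.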

A translation table keeps track of the access status of all its entries and uses a hashing scheme:

    typedef struct _xtabl_hash      /* hash table for SCHEMATRANS */
    {
        int totalNum;
        int curNum;
        int * entries;
    } SCHTR_HASH;

    typedef struct _xtable
    {
        int totalNum;                   /* total number of entries, empty or not */
        int entryNum;                   /* current number of filled entries */
        SCHEMATRANS_ENTRY* entryPtr;    /* pointed to entryNum entries */
        SCHTR_STATUS * status;          /* status of entries */
                                        /* 0 = fresh otherwise visited */
        SCHTR_HASH hash[SCHTRS_HASH_TBLLEN];
    } SCHEMATRANS;
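The bucket layout (a capacity, a fill count and an array of entry indices) suggests a lookup along the following lines. This is a sketch under assumptions, not the dervish implementation: the hash function, the table length DEMO_HASH_LEN, the choice of a name string as the key, and the names DemoBucket, demo_hash_add() and demo_hash_find() are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_HASH_LEN 17   /* assumed table length, for illustration only */

/* Same shape as a bucket above: capacity, fill count, entry indices. */
typedef struct { int totalNum; int curNum; int *entries; } DemoBucket;

/* Assumed string hash; the real hash function is not given in the text. */
static unsigned demo_hash(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % DEMO_HASH_LEN;
}

/* Record that the entry at `entryIndex` is keyed by `key`.
 * Returns 0 on success, -1 if memory could not be allocated. */
int demo_hash_add(DemoBucket table[DEMO_HASH_LEN], const char *key,
                  int entryIndex)
{
    DemoBucket *b;

    assert(table != NULL && key != NULL);
    b = &table[demo_hash(key)];
    if (b->curNum == b->totalNum) {          /* bucket full: grow it */
        int newTotal = b->totalNum ? 2 * b->totalNum : 4;
        int *grown = realloc(b->entries, (size_t)newTotal * sizeof *grown);
        if (grown == NULL)
            return -1;
        b->entries = grown;
        b->totalNum = newTotal;
    }
    b->entries[b->curNum++] = entryIndex;
    return 0;
}

/* Return the index of the entry whose name equals `key`, or -1.
 * names[i] is the key string of entry i. */
int demo_hash_find(const DemoBucket table[DEMO_HASH_LEN],
                   const char *const names[], const char *key)
{
    const DemoBucket *b;

    assert(table != NULL && names != NULL && key != NULL);
    b = &table[demo_hash(key)];
    for (int i = 0; i < b->curNum; i++)
        if (strcmp(names[b->entries[i]], key) == 0)
            return b->entries[i];
    return -1;
}
```

Storing indices rather than pointers in each bucket keeps the buckets valid even if the entry array is reallocated as the table grows, which may be one reason for the int array in the structure above.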

Routines That Operate on Translation Tables

A set of C routines is defined to work on translation tables. Some of the important ones are:

  • shSchemaTransNew()
    Creates a bare-bones translation table in memory, with no entries yet.

  • shSchemaTransDel()
    Deletes a translation table, as well as all the entries in it, and frees the memory.

  • shSchemaTransEntryAdd()
    Adds a table entry to a translation table.

  • shSchemaTransClearEntries()
    Clears all the entries in a table and frees their memory, but keeps the bare-bones structure.

  • shSchemaTransCreateFromFile()
    Creates a complete translation table from a given ASCII file.

  • shSchemaTransWriteToFile()
    Dumps the contents of the translation table to an ASCII file. Users may edit the output file.

  • shSpptGeneralSyntaxCheck()
    Checks the translation table syntax against a specific schema. The routine checks for correct grammar and tests whether the fields exist in the schema. At this time, it also fills in a few attributes of each entry, e.g., the "num" field in SCHEMATRANS_ENTRY.