Match Lists of Objects using FOCAS Algorithm

This file contains documentation for a set of routines which allow the user to match up two sets of objects -- say, from two different images of the same field, or from a catalog and an image. The code which determines the transformation from one coordinate system to the other is based on the FOCAS Automatic Catalog Matching Algorithm, described in PASP 107, 1119 (1995).

One difference between the matching routines described here and other ASTROTOOLS matching routines is that the routines here not only find the transformation between two coordinate systems, but also create new CHAINs of matched pairs of objects. The atMatchChains routine employs an exclusive matching algorithm, which prevents any object from being part of more than a single matched pair.

The C bindings for these routines are in atPMatch.html, and the TCL bindings in tclPMatch.html.

An example of using the routines

In order to use the matching routines described herein, one must have the items to be matched in the form of a pair of CHAINs. Each CHAIN can hold a different type of item and a different number of entries. Let's first discuss the manner in which one can turn catalogs into CHAINs.

If the catalog is a FITS binary table, then one can simply use the fits2Schema routine to read the table and automatically create a CHAIN. In this case, the elements of the CHAIN will have names supplied by the FITS table.

If the catalog is in ASCII text format, one can edit it slightly and then use the param2Chain tool in DERVISH. Suppose that there is a catalog of stars in an ASCII file, starcat.txt, with entries like this:

    HD_2234    12.34564  +22.23542  1950  9.95  0.04 0.00 
    HD_2235    12.36042  -03.09953  1950 10.12  0.45 1.45 
    HD_2236    12.37023  +09.88432  1950  8.85  0.33 0.88 
in which the second and third items in each line are the RA and Dec, and the fifth item is a magnitude. One can edit starcat.txt, adding a header which describes its format, and then inserting a type identifier (CATSTAR, in the example below) at the start of each line in the file. After editing the file, it might look like this:
    typedef struct {
      char name[20];
      double ra;
      double dec;
      float epoch;
      float mag;
      float bv;
      float ub;
    } CATSTAR;

    CATSTAR    HD_2234    12.34564  +22.23542  1950  9.95  0.04 0.00
    CATSTAR    HD_2235    12.36042  -03.09953  1950 10.12  0.45 1.45
    CATSTAR    HD_2236    12.37023  +09.88432  1950  8.85  0.33 0.88
One can now call the function param2Chain (now in DERVISH) to turn the text file into a CHAIN of type CATSTAR:

    astls> set ch [param2Chain starcat.txt hdr]
    h2
    astls> set el [chainElementGetByPos $ch 0]
    h3
    astls> exprGet $el.ra
    12.34564

In order to use the matching routines described here, each CHAIN must contain items which have (among others) fields corresponding to an "x" coordinate, a "y" coordinate, and a magnitude.

Note that the fields do not have to be called "x", "y" and "mag". There is no provision for correcting non-cartesian coordinates of objects, so if one uses Right Ascension and Declination, one will have to restrict the area of interest to a small enough patch of the sky that there is little spherical curvature.

Warning! I have discovered that one should be careful in providing x and y values. If the dynamic range of the coordinates is too large, the matching routines can fail to find a proper transformation. For example, if one is trying to match a small field (10 by 10 arcmin), it is much better to provide coordinates in "small" units (arcsec, or pixels), than "big" units (degrees or radians). In other words,

    this is a good set:        12.2, -8.3, 100.0
    this is a good set:        1510, -2000, 830
    this is not a good set:    180.0123, 180.0234, 180.0567
Subtracting a constant from "large" coordinate values can help prevent errors.
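The advice above can be sketched in C. This is only an illustration and not part of ASTROTOOLS: the function name recenter_to_arcsec is hypothetical, and using the mean as the constant to subtract is an assumption (any value near the middle of the field would do).

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an ASTROTOOLS routine): shift an array of
 * coordinates given in degrees so that they are measured from their
 * mean, then convert the offsets to arcseconds. Double precision is
 * used so that the subtraction itself does not lose digits.
 */
void recenter_to_arcsec(double *coord_deg, size_t n)
{
    double mean = 0.0;
    size_t i;

    if (n == 0) {
        return;
    }
    for (i = 0; i < n; i++) {
        mean += coord_deg[i];
    }
    mean /= (double) n;

    /* Offsets from the mean, converted from degrees to arcseconds. */
    for (i = 0; i < n; i++) {
        coord_deg[i] = (coord_deg[i] - mean) * 3600.0;
    }
}
```

Applied to the "not a good set" above, this yields offsets of about -66.6, -26.6 and 93.2 arcsec, which span a range comparable to the two "good" sets.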

To continue with our example, let us suppose that one has run a program which finds objects in an image and measures their properties, creating a CHAIN of the following OBJ structures:

    typedef struct {
      float row;        /* centroid in the row direction, in pixels */
      float col;        /* centroid in the column direction, in pixels */
      float rowWidth;   /* extent in row direction, in pixels */
      float colWidth;   /* extent in column direction, in pixels */
      float apCounts;   /* sum of counts inside an aperture */
      float apMag;      /* apCounts, converted to a magnitude via */
                        /*    apMag = 30 - 2.5*log10(apCounts) */
    } OBJ;

Note that one cannot use the apCounts field to compare this set of OBJs against the set of CATSTARS, since bright stars will have large values of apCounts, but small values of mag in the catalog.

The matching code can understand a single level of indirection in structure field names. Therefore, given a structure like this:

    struct {
      float row;
      float col;
      float apMag[5];   /* aperture mag in u, g, r, i, z */
    }

the user may choose to use element 3 of the "apMag" field for matching purposes. As usual, one must use angle brackets to indicate subscripts at the TCL level. Thus, to use an element of an array in the matching process, one might type

    astls> atFindTrans $ch1 row col apMag<3> $ch2 row col mag

Suppose, then, that one has a CHAIN of CATSTAR structures, "$cat_chain", and a CHAIN of OBJ structures, "$obj_chain".

Step 1: Finding the TRANS which matches two sets of objects

The first step is to find a coordinate transformation which brings the two sets of items into a common coordinate system. We assume that the two can be connected via a transformation of the form

          x' = A + Bx + Cy
          y' = D + Ex + Fy
which allows for a simple translation in each direction, a rotation, and a scale change. The coefficients A,B,C,D,E,F of these equations can be stored in a TRANS structure. We will call a TCL verb which returns such a structure.

With a few caveats (there must be at least 6 objects in each list, and no objects can be NULL), we can find the required coordinate transformation like so:

    astls> set trans [atFindTrans $cat_chain ra dec mag $obj_chain x y apMag]
    h7

The returned TRANS will transform the coordinates of the first CHAIN of objects ("$cat_chain" in the example above) into those of the second CHAIN ("$obj_chain").

The user can find default values for the optional radius, maxdist, and nobj arguments in the file atPMatch.h.

In addition, there is another optional argument, scale. This is the ratio of the size of coordinates in list B to coordinates in list A; i.e., if we have points

then this ratio is 5.0. If the user supplies a value, only similar triangles with a ratio of sizes within 10 percent of scale will be counted as "matches."
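The 10 percent test can be pictured with the C sketch below. It is an assumption about how such a check might look, not the code in atPMatch.c: the function name, the use of a single "size" number per triangle, and the definition of the tolerance as a fraction of scale are all hypothetical.

```c
#include <math.h>

/*
 * Hypothetical check: does the ratio size_b / size_a lie within a
 * fractional tolerance tol (0.10 for 10 percent) of the expected scale?
 * Returns 1 if so, 0 otherwise.
 */
int ratio_matches_scale(double size_a, double size_b,
                        double scale, double tol)
{
    double ratio;

    if (size_a <= 0.0 || scale <= 0.0) {
        return 0;
    }
    ratio = size_b / size_a;
    return fabs(ratio - scale) <= tol * scale;
}
```

With scale = 5.0 and tol = 0.10, a pair of triangles whose sizes have a ratio of 5.2 would be accepted, while a ratio of 6.0 would be rejected.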

As the paper by Valdes et al. describes, this algorithm attempts to find a transformation using only the nobj brightest items from each catalog. The user must be sure that there are at least 6 matches between these nobj brightest objects. If, for example, one set of stars is the Yale Bright Star Catalog, which spans the range -2 < mag < 6, and the other set of stars is from a survey which spans the range 4 < mag < 10, then it's possible that the brightest objects in each set are disjoint. It may help to select only those objects from a catalog which are precisely within the area covered by an image, or to prune from one set of stars all objects outside the magnitude range of the other.
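The pruning suggested above, followed by the "nobj brightest" selection, might be sketched in C as follows. The Star structure and the function select_brightest are hypothetical and exist only for this example; the real routines work on CHAINs and choose their bright subset internally.

```c
#include <stdlib.h>

/* Hypothetical minimal record: a position and a magnitude. */
typedef struct {
    double x, y;
    double mag;
} Star;

/* qsort comparator: smaller magnitude (brighter) sorts first. */
static int compare_mag(const void *pa, const void *pb)
{
    const Star *a = (const Star *) pa;
    const Star *b = (const Star *) pb;

    if (a->mag < b->mag) return -1;
    if (a->mag > b->mag) return 1;
    return 0;
}

/*
 * Keep only stars with mag_min <= mag <= mag_max, packing them at the
 * front of the array (discarded entries are overwritten), then sort the
 * survivors brightest first. Returns how many leading entries to use:
 * the smaller of nobj and the number of survivors.
 */
size_t select_brightest(Star *stars, size_t n,
                        double mag_min, double mag_max, size_t nobj)
{
    size_t i, kept = 0;

    for (i = 0; i < n; i++) {
        if (stars[i].mag >= mag_min && stars[i].mag <= mag_max) {
            stars[kept++] = stars[i];
        }
    }
    qsort(stars, kept, sizeof(Star), compare_mag);
    return kept < nobj ? kept : nobj;
}
```

In the bright-star example above, calling this on both sets with the overlap range 4 < mag < 6 would leave only stars that could plausibly appear in both lists.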

There are several #define'd values in atPMatch.h which can alter the behavior of the matching algorithm. If the routines fail repeatedly, it might be worthwhile to fiddle with them. Read the comments in atPMatch.h concerning

Step 2: Applying the TRANS to one set of objects

The next step is to apply this TRANS to the coordinates of each element in one of the CHAINs, so that both sets of objects have coordinates in the same system. It will then be easy to find matching pairs. Since we gave "$cat_chain" as the first argument to atFindTrans in our example above, we must apply the returned TRANS structure to "$cat_chain", like so:

astls> atApplyTrans $cat_chain ra dec $trans

The elements of structures in the chain specified by the xname and yname arguments may have types int, float, or double. All values are read internally into variables of type float for calculations.

Step 3: Finding matched pairs of objects

Finally, we are ready to find matching pairs of objects, since the items on both CHAINs now have coordinates in the same system. Recall that this is the coordinate system of the OBJs, so the units are those of the OBJ structure (pixels). We must choose the maximum difference in coordinates allowed for matching objects; suppose we let radius = 5 pixels. We must supply to the atMatchChains verb the names of the two CHAINs, and the names of the fields within each that specify the "x" and "y" coordinates.

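    astls> chainSize $cat_chain
    103
    astls> chainSize $obj_chain
    45
    astls> atMatchChains $cat_chain ra dec $obj_chain x y 5 mcat mobj ucat uobj
    astls> chainSize $mcat
    39
    astls> chainSize $mobj
    39
    astls> chainSize $ucat
    64
    astls> chainSize $uobj
    6

The output CHAINs contain the following information:

    mcat    items from "$cat_chain" which have a match in "$obj_chain"
    mobj    items from "$obj_chain" which have a match in "$cat_chain"
    ucat    items from "$cat_chain" which have no match
    uobj    items from "$obj_chain" which have no match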

Since the atMatchChains function enforces exclusive matching, each item is either part of exactly one match or unmatched. Therefore, the sizes of mcat and ucat add up to that of "$cat_chain", and the sizes of mobj and uobj add up to that of "$obj_chain". And, of course, the sizes of mcat and mobj must be equal.
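The idea of exclusive matching can be illustrated with the C sketch below. It is not the algorithm used inside atMatchChains, whose details are not given here; it is one simple way to guarantee exclusivity, by repeatedly pairing the closest two unmatched points that lie within the radius. The Point type and the function name are hypothetical.

```c
#include <stddef.h>

/* Hypothetical minimal record: a position in a common coordinate system. */
typedef struct {
    double x, y;
} Point;

/*
 * Greedy exclusive matching. On return, match_a[i] holds the index in b
 * paired with a[i], or -1 if a[i] is unmatched; match_b[j] likewise.
 * Each point appears in at most one pair. Returns the number of pairs.
 */
size_t match_exclusive(const Point *a, size_t na,
                       const Point *b, size_t nb,
                       double radius, long *match_a, long *match_b)
{
    size_t i, j, npairs = 0;
    double r2 = radius * radius;

    for (i = 0; i < na; i++) match_a[i] = -1;
    for (j = 0; j < nb; j++) match_b[j] = -1;

    for (;;) {
        double best = r2;      /* squared distance of best pair so far */
        long bi = -1, bj = -1;

        /* Find the closest pair of still-unmatched points. */
        for (i = 0; i < na; i++) {
            if (match_a[i] >= 0) continue;
            for (j = 0; j < nb; j++) {
                double dx, dy, d2;

                if (match_b[j] >= 0) continue;
                dx = a[i].x - b[j].x;
                dy = a[i].y - b[j].y;
                d2 = dx * dx + dy * dy;
                if (d2 <= best) {
                    best = d2;
                    bi = (long) i;
                    bj = (long) j;
                }
            }
        }
        if (bi < 0) break;     /* no remaining pair within the radius */

        match_a[bi] = bj;
        match_b[bj] = bi;
        npairs++;
    }
    return npairs;
}
```

Counting the entries of match_a that are -1 gives the analogue of ucat, and the rest the analogue of mcat, so the two counts necessarily add up to na; the same holds for match_b and nb.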

Beware the properties of the output CHAINs: each contains pointers to the very same objects which are included in the original, input CHAINs. Therefore, if one deletes the items in the input CHAINs, one will lose the information in the output CHAINs as well.
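The sharing described above is the ordinary behavior of copying pointers rather than objects, as this small C sketch shows. The Item type and the function share_items are hypothetical; a CHAIN is a more elaborate container than a plain array of pointers.

```c
#include <stdlib.h>

/* Hypothetical minimal record. */
typedef struct {
    double x, y;
} Item;

/*
 * Return a newly allocated array holding pointers to the first nsel
 * elements of input. Only the pointers are copied: the Items themselves
 * are shared with the input array. The caller releases the returned
 * array (but not the Items) with free(). Returns NULL on allocation
 * failure.
 */
Item **share_items(Item **input, size_t nsel)
{
    Item **out = (Item **) malloc(nsel * sizeof *out);
    size_t i;

    if (out == NULL) {
        return NULL;
    }
    for (i = 0; i < nsel; i++) {
        out[i] = input[i];
    }
    return out;
}
```

Any change made to an Item through the input array is visible through the output array, and freeing an Item leaves a dangling pointer in both.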