STILTS-TMATCH1(1) | Stilts commands | STILTS-TMATCH1(1) |
NAME
stilts-tmatch1 - Performs a crossmatch internal to a single table
SYNOPSIS
stilts tmatch1 [matcher=<matcher-name>] [params=<match-params>] [tuning=<tuning-params>] [values=<expr-list>] [action=identify|keep0|keep1|wide2|wideN] [progress=none|log|time|profile] [runner=parallel|parallel<n>|parallel-all|sequential|classic|partest] [ifmt=<in-format>] [istream=true|false] [in=<table>] [icmd=<cmds>] [ocmd=<cmds>] [omode=out|meta|stats|count|checksum|cgi|discard|topcat|samp|tosql|gui] [out=<out-table>] [ofmt=<out-format>]
DESCRIPTION
tmatch1 performs efficient and flexible crossmatching between the rows of a single table. It can match rows on the basis of their relative position in the sky, or alternatively using many other criteria such as separation in some isotropic or anisotropic Cartesian space, identity of a key value, or some combination of these; the full range of match criteria is discussed in SUN/256.
The basic task performed by the intra-table matcher is to identify groups of rows within the table which match each other. See SUN/256 for an explanation of exactly what constitutes a match group. The result of identifying these groups is expressed as an output table in one of a variety of ways, specified by the action parameter. These options include marking group membership in added columns and eliminating some or all rows which form part of a match group.
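As a minimal sketch of a typical invocation, the following marks groups of rows lying within 1 arcsecond of each other on the sky. The file names catalogue.fits and groups.fits, the column names RA and DEC, and the helper name run_match are hypothetical; the sky matcher itself is described in SUN/256. The command is run only if stilts and the input file are actually present.

```shell
# Hypothetical input and output files; adjust to your own table.
IN=catalogue.fits
OUT=groups.fits

# run_match IN OUT
# sky matcher: values gives the RA and Dec columns (degrees),
# params gives the maximum separation (arcseconds).
run_match() {
    stilts tmatch1 matcher=sky params=1 values='RA DEC' \
        action=identify in="$1" out="$2"
}

# Run only when stilts and the input table are available.
if command -v stilts >/dev/null 2>&1 && [ -f "$IN" ]; then
    run_match "$IN" "$OUT"
else
    echo "skipped: stilts or $IN not found"
fi
```

With action=identify the output should contain every input row, with group membership recorded in the added GroupID and GroupSize columns.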
OPTIONS
The action parameter takes one of the following values:
- identify: The output table is the same as the input table except that it contains two additional columns, GroupID and GroupSize, following the input columns. Each group of rows which matched is assigned a unique integer, recorded in the GroupID column, and the size of each group is recorded in the GroupSize column. Rows which don't match any others (singles) have null values in both these columns.
- keep0: The result is a new table containing only "single" rows, that is ones which don't match any other rows in the table. Any other rows are thrown out.
- keep1: The result is a new table in which only one row (the first in the input table order) from each group of matching ones is retained. A subsequent intra-table match with the same criteria would therefore show no matches.
- wideN: The result is a new "wide" table consisting of matched rows in the input table stacked next to each other. Only groups of exactly N rows in the input table are used to form the output table; each row of the output table consists of the columns of the first group member, followed by the columns of the second group member and so on. The output table therefore has N times as many columns as the input table. The column names in the new table have _1, _2, ... appended to them to avoid duplication.
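The effect of the action values listed above can be compared by running the same match with different actions. This is a sketch only: the helper name match_with_action, the file names and the RA and DEC columns are assumptions, not part of STILTS.

```shell
# Hypothetical input file; adjust to your own table.
IN=catalogue.fits

# match_with_action ACTION IN OUT
# Runs the same 1-arcsecond sky match, varying only the action.
match_with_action() {
    stilts tmatch1 matcher=sky params=1 values='RA DEC' \
        action="$1" in="$2" out="$3"
}

if command -v stilts >/dev/null 2>&1 && [ -f "$IN" ]; then
    # keep1: one row per match group, i.e. a de-duplicated table.
    match_with_action keep1 "$IN" deduplicated.fits
    # wide2: groups of exactly two rows, laid out side by side.
    match_with_action wide2 "$IN" pairs.fits
else
    echo "skipped: stilts or $IN not found"
fi
```

keep1 is the usual choice for removing duplicate detections, while wide2 is convenient when the matched pairs themselves are to be compared column by column.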
The options for the progress parameter are:
- none: no progress is shown
- log: progress information is shown
- time: progress information and some time profiling information is shown
- profile: progress information and limited time/memory profiling information are shown
The runner parameter takes one of the following values:
- parallel: uses multithreaded implementation for large tables, with default parallelism, which is the smaller of 6 and the number of available processors
- parallel<n>: uses multithreaded implementation for large tables, with parallelism given by the supplied value <n>
- parallel-all: uses multithreaded implementation for large tables, with a parallelism given by the number of available processors
- sequential: uses multithreaded implementation but with only a single thread
- classic: uses legacy sequential implementation
- partest: uses multithreaded implementation even when tables are small
The parallel* options should normally run faster than sequential or classic (which are provided mainly for testing purposes), at least for large matches and where multiple processing cores are available.
The default value "parallel" is currently limited to a parallelism of 6 since larger values yield diminishing returns given that some parts of the matching algorithms run sequentially (Amdahl's Law), and using too many threads can sometimes end up doing more work or impacting on other operations on the same machine. But you can experiment with other concurrencies, e.g. "parallel16" to run on 16 cores (if available) or "parallel-all" to run on all available cores.
The value of this parameter should make no difference to the matching results. If you notice any discrepancies please report them.
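One way to check that the runner setting does not affect the results is to run the same match with two runners and compare the summaries written by omode=count. The following is a sketch under assumed names: count_with_runner, catalogue.fits and the RA and DEC columns are hypothetical.

```shell
# Hypothetical input file; adjust to your own table.
IN=catalogue.fits

# count_with_runner RUNNER IN
# Writes the column/row count summary of the keep1 result to stdout.
count_with_runner() {
    stilts tmatch1 runner="$1" matcher=sky params=1 values='RA DEC' \
        action=keep1 in="$2" omode=count
}

if command -v stilts >/dev/null 2>&1 && [ -f "$IN" ]; then
    par=$(count_with_runner parallel4 "$IN")
    seq=$(count_with_runner sequential "$IN")
    if [ "$par" = "$seq" ]; then
        echo "runners agree: $par"
    else
        echo "runners differ: '$par' vs '$seq'" >&2
    fi
else
    echo "skipped: stilts or $IN not found"
fi
```

Equal counts do not prove that the two outputs are identical, but a difference would be worth reporting.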
The in parameter gives the location of the input table, and may take one of the following forms:
- A filename.
- A URL.
- The special value "-", meaning standard input. In this case the input format must be given explicitly using the ifmt parameter. Note that not all formats can be streamed in this way.
- A scheme specification of the form :<scheme-name>:<scheme-args>.
- A system command line with either a "<" character at the start, or a "|" character at the end ("<syscmd" or "syscmd|"). This executes the given pipeline and reads from its standard output. This will probably only work on unix-like systems.
In any case, compressed data in one of the supported compression formats (gzip, Unix compress or bzip2) will be decompressed transparently.
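The standard-input form can be sketched as follows. The helper name match_from_stdin, the file names and the RA and DEC columns are hypothetical, and FITS is assumed here to be one of the formats that can be streamed; because in=- is used, ifmt is given explicitly.

```shell
# match_from_stdin OUT
# Reads the input table from standard input ("-"), so the input
# format has to be named with ifmt.
match_from_stdin() {
    stilts tmatch1 in=- ifmt=fits matcher=sky params=1 values='RA DEC' \
        action=identify out="$1"
}

# Hypothetical input file, fed to the command on standard input.
if command -v stilts >/dev/null 2>&1 && [ -f catalogue.fits ]; then
    match_from_stdin groups.fits < catalogue.fits
else
    echo "skipped: stilts or catalogue.fits not found"
fi
```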
For both icmd and ocmd, commands may alternatively be supplied in an external file, by using the indirection character '@'. Thus a value of "@filename" causes the file filename to be read for a list of filter commands to execute. The commands in the file may be separated by newline characters and/or semicolons, and lines which are blank or which start with a '#' character are ignored. A backslash character '\' at the end of a line joins it with the following line.
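The '@' indirection can be sketched as follows: a small file of filter commands is written, then passed to icmd so that the filters are applied to the input table before matching. The file name prefilter.txt, the helper name match_with_filters and the columns RA, DEC and MAG are hypothetical; select and keepcols are standard STILTS filter commands documented in SUN/256.

```shell
# Write a filter command file. Blank lines and lines starting
# with '#' are ignored when STILTS reads it.
FILTERS=prefilter.txt
cat > "$FILTERS" <<'EOF'
# Keep only bright sources (MAG is a hypothetical column)
select MAG<20

# Drop columns that are not needed downstream
keepcols "RA DEC MAG"
EOF

# match_with_filters FILTERFILE IN OUT
match_with_filters() {
    stilts tmatch1 icmd="@$1" matcher=sky params=1 values='RA DEC' \
        action=keep1 in="$2" out="$3"
}

# Hypothetical input file; run only if everything is available.
if command -v stilts >/dev/null 2>&1 && [ -f catalogue.fits ]; then
    match_with_filters "$FILTERS" catalogue.fits bright_unique.fits
else
    echo "skipped: stilts or catalogue.fits not found"
fi
```

Keeping the filters in a file avoids shell quoting problems when the expressions contain spaces or quote characters.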
Possible values of omode are
- out
- meta
- stats
- count
- checksum
- cgi
- discard
- topcat
- samp
- tosql
- gui
Use the help=omode flag or see SUN/256 for more information.
The out and ofmt parameters must only be given if omode has its default value of "out".
SEE ALSO
If the package stilts-doc is installed, the full documentation
SUN/256 is available in HTML format:
file:///usr/share/doc/stilts/sun256/index.html
VERSION
STILTS version 3.5.1-debian
This is the Debian version of Stilts, which lacks support for
some file formats and network protocols. For differences see
file:///usr/share/doc/stilts/README.Debian
AUTHOR
Mark Taylor (Bristol University)
Mar 2017 |