clm close(1) | USER COMMANDS | clm close(1) |
NAME
clm_close - Fetch connected components from graphs or subgraphs
clmclose is not, in fact, a program in its own right. This manual page documents the behaviour and options of the clm program when invoked in mode close. The options -h, --apropos, --version, -set, --nop are accessible in all clm modes and are described in the clm manual page.
SYNOPSIS
clm close -imx <fname> [options]
clm close -imx fname (specify matrix input)
-abc fname (specify label input)
-dom fname (input domain/cluster file)
[-o fname (output file)]
[--is-undirected (trust input graph to be undirected)]
[-levels LO/STEP/HI[/prefix] (write cluster size distribution for each cutoff)]
[-levels-norm num (divide each level by num to define cutoff)]
[--write-count (output component count)]
[--write-sizes (output component sizes (default))]
[--write-size-counts (output compressed list of component sizes)]
[--write-cc (output components as clustering)]
[--write-block (output graph restricted to -dom argument)]
[--write-blockc (output graph complement of -dom argument)]
[-cc-bound num (select components with size at least num)]
[--sl (output single linkage tree as list of joins (for -imx input))]
[-write-sl-list fname (write list of join order with weights)]
[-tf spec (apply tf-spec to input matrix)]
[-h (print synopsis, exit)]
[--apropos (print synopsis, exit)]
[--version (print version, exit)]
DESCRIPTION
Use clm close to fetch the connected components from a graph. Different output modes are supported (see below). In matrix mode (i.e. using the -imx option) the output returned with --write-cc can be used in conjunction with mcxsubs to retrieve individual subgraphs corresponding to connected components.
OPTIONS
-abc <fname> (label input)
The file name for input that is in label format.
-imx <fname> (input matrix)
The file name for input that is in mcl native matrix format.
-o fname (output file)
Specify the file to which output is sent. The default is STDOUT.
-dom fname (input domain/cluster file)
If this option is used, clm close will, as a first step, for each of the
domains in file fname retrieve the associated subgraph from
the input graph. These are then further decomposed into connected
components, and the program will process these in the normal manner.
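The two-step behaviour described for -dom can be illustrated with a small Python sketch. It is not clm's implementation and does not read or write mcl formats; the function name components_within_domains and the toy edge list are hypothetical, and the graph is assumed to be undirected.

```python
def components_within_domains(edges, domains):
    """Split each domain into the connected components of its induced subgraph."""
    result = []
    for domain in domains:
        members = set(domain)
        # Step 1: restrict the graph to the domain (keep edges with both ends inside).
        adjacency = {n: set() for n in members}
        for a, b in edges:
            if a in members and b in members:
                adjacency[a].add(b)
                adjacency[b].add(a)
        # Step 2: decompose the restricted graph into connected components.
        seen = set()
        for start in sorted(members):
            if start in seen:
                continue
            stack, component = [start], []
            seen.add(start)
            while stack:
                node = stack.pop()
                component.append(node)
                for neighbour in adjacency[node]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
            result.append(sorted(component))
    return result


# Toy input: a path 0-1-2-3 plus an edge 4-5, and two domains that cut across it.
edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
domains = [[0, 1, 3], [2, 4, 5]]
split = components_within_domains(edges, domains)
print(split)
```

In this toy input, node 3 is only reachable from the rest of its domain through node 2, which lies in the other domain, so it ends up as a singleton.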
--write-count (output component count)
--write-sizes (output component sizes (default))
--write-size-counts (output compressed list of component sizes)
--write-cc (output components as clustering)
--write-block (output graph restricted to -dom argument)
--write-blockc (output graph complement of -dom argument)
The default behaviour is currently to output the sizes of the connected
components. It is also possible to output only the number of components
with --write-count, to write a counted list of sizes with
--write-size-counts, or to write the components as a clustering in
mcl format with --write-cc. In addition, --write-block outputs the
restriction of the input graph to a domain, and --write-blockc outputs the
complement of this restriction.
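As a rough illustration of what the count, sizes and size-counts outputs summarise, the following Python sketch computes connected components with a union-find structure and derives the three summaries from them. It is a sketch under stated assumptions, not clm's code: the function name connected_components is hypothetical, the graph is assumed undirected, and the exact output layout of clm close is not reproduced.

```python
from collections import Counter


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sorted node lists."""
    parent = {n: n for n in nodes}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Largest component first; ties broken by smallest member.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g[0]))


nodes = range(7)
edges = [(0, 1), (1, 2), (3, 4)]
components = connected_components(nodes, edges)

count = len(components)                                      # cf. --write-count
sizes = [len(c) for c in components]                         # cf. --write-sizes
size_counts = sorted(Counter(sizes).items(), reverse=True)   # cf. --write-size-counts
print(count, sizes, size_counts)
```

The list in components itself corresponds conceptually to the clustering written by --write-cc, where each component is one cluster.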
-levels LO/STEP/HI[/prefix] (write cluster size distribution for
each cutoff)
-levels-norm num (divide each level by num to define cutoff)
Use -levels to inspect the cluster size distribution at various
cut-offs by specifying a triplet of numbers (separated by forward slashes),
the first of which is the starting point, the second is the step size, and
the third is the end point. If a fourth argument (preceded by another slash)
is given, all clusterings are written to a file based on the supplied
argument as file name prefix. Each level is divided by the argument to
-levels-norm to obtain the cut-off that is applied.
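The level sweep can be sketched in Python as follows. Several points here are assumptions rather than documented behaviour: that HI is inclusive, that an edge is kept when its weight is at least the cut-off, and that levels are integers. The names component_sizes and sizes_per_level are hypothetical.

```python
def component_sizes(nodes, edges):
    """Sizes of the connected components of an undirected graph, largest first."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path while walking up
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    counts = {}
    for n in nodes:
        root = find(n)
        counts[root] = counts.get(root, 0) + 1
    return sorted(counts.values(), reverse=True)


def sizes_per_level(nodes, weighted_edges, lo, step, hi, norm=1):
    """Component size distribution for each level in LO/STEP/HI (HI assumed inclusive)."""
    distribution = {}
    for level in range(lo, hi + 1, step):
        cutoff = level / norm  # cf. -levels-norm: divide each level by num
        # Assumption: edges with weight >= cutoff survive the cut-off.
        kept = [(a, b) for a, b, w in weighted_edges if w >= cutoff]
        distribution[level] = component_sizes(nodes, kept)
    return distribution


nodes = [0, 1, 2, 3]
weighted_edges = [(0, 1, 0.9), (1, 2, 0.5), (2, 3, 0.2)]
# Mirrors "-levels 2/3/8 -levels-norm 10", i.e. cut-offs 0.2, 0.5 and 0.8.
levels = sizes_per_level(nodes, weighted_edges, 2, 3, 8, norm=10)
print(levels)
```

Using a normalisation factor in this way lets fractional cut-offs be written as whole-number levels.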
--sl (output single linkage tree as list of joins (for -imx
input))
-write-sl-list fname (write list of join order with weights)
A primary use case for this is to apply single link clustering to the rcl
(restricted contingency linkage) graph that is output by clm vol with
its write-rcl option. This rcl graph encodes a consensus clustering
derived from the multiple clusterings that are given to clm vol.
The output (save with -o or UNIX redirection) can be supplied to rcl-res.pl with a list of varying resolution parameters to produce a small number of nested clusterings. The resolution parameters (second and subsequent arguments) to rcl-res.pl are set sizes: for each supplied resolution res, the script descends the tree as long as the current node has some split below it where both clusters are of size at least res. Note that the resulting clustering may still contain smaller clusters and singletons (resulting from other splits).
The mcl distribution has an example script
graphs/rcl-example.sh that illustrates the different steps.
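For readers unfamiliar with single linkage, the idea of a tree expressed as a list of joins can be sketched in Python. This is a generic textbook construction, not the algorithm or output format used by --sl: edge weights are assumed to be similarities (stronger edges join first), and the function name single_linkage_joins and the tuple layout are hypothetical.

```python
def single_linkage_joins(nodes, weighted_edges):
    """List the joins of a single linkage tree as (node_a, node_b, weight, merged_size)."""
    parent = {n: n for n in nodes}
    size = {n: 1 for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path while walking up
            x = parent[x]
        return x

    joins = []
    # Assumption: weights are similarities, so process edges from strongest to weakest.
    for a, b, weight in sorted(weighted_edges, key=lambda e: -e[2]):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue  # both ends already in the same cluster; no new join
        if size[root_a] < size[root_b]:
            root_a, root_b = root_b, root_a
        parent[root_b] = root_a          # merge the smaller cluster into the larger
        size[root_a] += size[root_b]
        joins.append((a, b, weight, size[root_a]))
    return joins


nodes = [0, 1, 2, 3]
weighted_edges = [(0, 1, 0.9), (2, 3, 0.8), (1, 2, 0.4), (0, 3, 0.3)]
joins = single_linkage_joins(nodes, weighted_edges)
print(joins)
```

Each join records the edge that triggered the merge, its weight, and the size of the resulting cluster; the edge with weight 0.3 produces no join because its endpoints are already connected.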
--is-undirected (omit graph undirected check)
With this option the transformation to make sure that the input is undirected
is omitted. This will be slightly faster. Using this option while the input
is directed may lead to erroneous results.
-cc-bound num (select components with size at least num)
Only components with at least num nodes are selected.
-tf spec (apply tf-spec to input matrix)
Transform the input matrix values according to the syntax described in
mcxio(5).
AUTHOR
Stijn van Dongen.
SEE ALSO
mclfamily(7) for an overview of all the documentation and the utilities in the mcl family.
9 Oct 2022 | clm close 22-282 |