GT_MPI_GATHER(1) | User Commands | GT_MPI_GATHER(1) |
NAME
gt_mpi_gather - MPI gatherer for GenomicsDB
SYNOPSIS
gt_mpi_gather [options]
OPTIONS
- --help, -h
- Print a usage message summarizing options available and exit
- --json-config=<query json file>, -j <query json file>
- The query JSON file can specify workspace, array, query_column_ranges, query_row_ranges, vid_mapping_file, callset_mapping_file, query_attributes, query_filter, reference_genome, etc. as fields, e.g.
    {
      "workspace" : "/tmp/ws",
      "array" : "t0_1_2",
      "query_column_ranges" : [ [ [0, 100 ], 500 ] ],
      "query_row_ranges" : [ [ [0, 2 ] ] ],
      "vid_mapping_file" : "/tests/inputs/vid.json",
      "callset_mapping_file": "/tests/inputs/callset_mapping.json",
      "query_attributes" : [ "REF", "ALT", "BaseQRankSum", "MQ", "MQ0", "ClippingRankSum", "MQRankSum", "ReadPosRankSum", "DP", "GT", "GQ", "SB", "AD", "PL", "DP_FORMAT", "MIN_DP" ]
    }
- --loader-json-config=<loader json file>, -l <loader json file>
- Optional, if vid_mapping_file and callset_mapping_file fields are specified in the query json file
- --workspace=<workspace dir>, -w <GenomicsDB workspace dir>
- Optional, if workspace is specified in any of the json config files
- --array=<array dir>, -A <GenomicsDB array dir>
- Optional, if array is specified in any of the json config files
- --print-calls
- Optional, prints VariantCalls in a JSON format
- --print-csv
- Optional, outputs CSV with the fields and the order of CSV lines determined by the query attributes
- --produce-Broad-GVCF
- Optional, produces a combined gVCF from the GenomicsDB data constrained by the query configuration
- --output-format=<output_format>, -O <output_format>
- Used with --produce-Broad-GVCF
- Output format can be one of the following strings: "z[0-9]" (compressed VCF), "b[0-9]" (compressed BCF) or "bu" (uncompressed BCF). Default is uncompressed VCF if not specified.
- --produce-histogram
- Optional
- --produce-interesting-positions
- Optional
- --version
- Print version and exit
- If none of the print/produce arguments are specified, the tool prints all the Variants constrained by the query configuration in a JSON format
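The query configuration passed to --json-config is an ordinary JSON document, so it can be generated programmatically instead of written by hand. The following is a minimal Python sketch, not part of gt_mpi_gather itself: the field names are taken from the option description above, the values are the placeholder paths from the man page example, and the helper name write_query_config and the output location are assumptions made for illustration.

```python
import json
import os
import tempfile

# Field names come from the --json-config description; the values mirror the
# man page example and are placeholders, not a tested configuration.
query_config = {
    "workspace": "/tmp/ws",
    "array": "t0_1_2",
    "query_column_ranges": [[[0, 100], 500]],
    "query_row_ranges": [[[0, 2]]],
    "vid_mapping_file": "/tests/inputs/vid.json",
    "callset_mapping_file": "/tests/inputs/callset_mapping.json",
    "query_attributes": ["REF", "ALT", "DP", "GT", "GQ"],
}


def write_query_config(config, path):
    """Serialize the query configuration to `path` as JSON and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    return path


# Hypothetical output location; the resulting file is what -j / --json-config expects.
query_json_path = write_query_config(
    query_config, os.path.join(tempfile.gettempdir(), "query.json")
)
```

Serializing with json.dump guarantees balanced brackets in the nested range lists, which are easy to get wrong when editing the file manually.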
- Parallel Querying
- MPI can be used for parallel querying, e.g.
    mpirun -n <num_processes> -hostfile <hostfile> ./bin/gt_mpi_gather -j <query.json> -l <loader.json> [<other_args>]
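When the parallel invocation is driven from a script, it can help to assemble the argument list in one place. The sketch below is an illustration in Python under stated assumptions: build_mpirun_command is a hypothetical helper, the file names are placeholders, and the -O check reads the documented patterns literally ("z" or "b" followed by one digit, or "bu"); whether the tool accepts other spellings is not stated here. The function only builds the command and does not run it.

```python
import re
import shlex

# Literal reading of the documented -O values: "z" or "b" plus one digit, or "bu".
OUTPUT_FORMAT_RE = re.compile(r"(?:[zb][0-9]|bu)")


def build_mpirun_command(num_processes, hostfile, query_json, loader_json=None,
                         output_format=None, binary="./bin/gt_mpi_gather"):
    """Return the argv list for a parallel gt_mpi_gather query (nothing is executed)."""
    argv = ["mpirun", "-n", str(num_processes), "-hostfile", hostfile,
            binary, "-j", query_json]
    if loader_json is not None:
        argv += ["-l", loader_json]
    if output_format is not None:
        if not OUTPUT_FORMAT_RE.fullmatch(output_format):
            raise ValueError(f"unsupported output format: {output_format!r}")
        # -O is documented as being used together with --produce-Broad-GVCF.
        argv += ["--produce-Broad-GVCF", "-O", output_format]
    return argv


# Placeholder file names; substitute real hostfile and JSON paths.
command = build_mpirun_command(4, "hosts.txt", "query.json", "loader.json",
                               output_format="z1")
print(shlex.join(command))
```

Keeping the command as a list avoids shell quoting problems if it is later handed to subprocess.run; shlex.join is used only to print a copy-pasteable form.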
July 2022 | gt_mpi_gather |