SETOP(1) User Commands SETOP(1)

NAME

setop - make set of strings from input

SYNOPSIS

setop [-h] [--quiet | --verbose] [-C] [--include-empty] [-n insepar | -l elregex] [-o outsepar] [-t trimchars] [-u|i|s] [inputfilename]* [-d filename]* [-# | --is-empty | -c element | -e filename | -b filename | -p filename]

DESCRIPTION

Apply set operations such as union, intersection, or set difference to input files and print the resulting set (sorted, with unique string elements) to standard output, or answer special queries such as the number of elements.

OPTIONS

-h
produce this help message and exit
output name and version
--quiet
suppress all output messages in case of special queries (e. g. when checking whether an element is contained in the set)
--verbose
always use output messages in case of special queries (i. e. also output a message on success)
-C
handle input elements case-insensitively
--include-empty
don’t ignore empty elements (these can come from empty lines, trimming, etc.)
-n [ --input-separator ] insepar
describe the form of an input separator as a regular expression in ECMAScript syntax; the default is new line (if --input-element is not given); if you set the input separator manually and still want new lines to separate elements, remember to include the new line character \n
-l [ --input-element ] elregex
describe the form of input elements as a regular expression in ECMAScript syntax
-o outsepar
output separator; escape sequences are allowed
-t [ --trim ] trimchars
trim all given characters at the beginning and end of elements (escape sequences allowed)
-u
unite all given input sets (default)
-i
intersect all given input sets
-s
build the symmetric difference of all given input sets
-d filename
subtract all elements in the given file from the output set
-# [ --count ]
just output the number of (different) elements, don’t list them
--is-empty
check if the resulting set is empty
-c element
check if the given element is contained in the set
-e filename
check set equality, i. e. check if the output corresponds with the content of the file
-b filename
check if the content of the file is a subset of the output set
-p filename
check if the content of the file is a superset of the output set

If no input filename is given, or if "-" is given, setop reads from standard input.

setop proceeds as follows: First, all input files are parsed and combined according to one of the options -u, -i, or -s. Then all inputs given with option -d are parsed and removed from the result of the first step. Finally, the desired output is printed to the screen: the set itself, its number of elements, a comparison to another set (option -e), etc.
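
The three stages can be modelled with ordinary set operations. The following Python sketch is an illustration only, not setop's implementation. The function name run_setop_model and its parameters are hypothetical, inputs are assumed to be already parsed into sets of strings, and for more than two inputs the sketch assumes the symmetric difference is folded pairwise from left to right.

```python
from functools import reduce


def run_setop_model(inputs, mode="union", subtract=()):
    """Model of the three stages; all names here are hypothetical."""
    combine = {
        "union": set.union,                     # -u (default)
        "intersection": set.intersection,       # -i
        "symmetric": set.symmetric_difference,  # -s, folded pairwise (assumption)
    }[mode]
    # Stage 1: combine all input sets with the selected operation.
    result = reduce(combine, (set(s) for s in inputs))
    # Stage 2: remove every element found in the -d inputs.
    for s in subtract:
        result -= set(s)
    # Stage 3: the caller prints the sorted set or evaluates a query on it.
    return sorted(result)


a = {"apple", "pear", "plum"}
b = {"pear", "fig"}
result = run_setop_model([a, b], mode="symmetric", subtract=[{"fig"}])
print(result)                    # ['apple', 'plum']
print(len(result))               # 2, comparable to -#
print({"apple"} <= set(result))  # True, comparable to -b (file is a subset of the output)
```

Note the direction of the comparison queries in this model: -b asks whether the file is a subset of the output, while -p asks whether the file is a superset of it.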

By default, each line of an input stream is considered an element. You can change this by defining regular expressions with the options --input-separator or --input-element. When both are used, the input stream is first split according to the separator and then filtered by the desired input element form. Once the elements are found, they are trimmed according to the argument given with --trim. The option -C treats Word and WORD as equal; only the first occurrence across all input streams is kept. Note that -C does not affect the regular expressions used in --input-separator and --input-element.
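
The order of these steps can be sketched in Python. This is a rough model under stated assumptions, not setop's code: parse_elements and its parameters are hypothetical names, Python's re module stands in for the ECMAScript regex syntax (the two dialects differ in details), and the element regex is assumed to be searched for inside each chunk produced by the separator.

```python
import re


def parse_elements(text, separator=r"\n", element=None, trim="",
                   ignore_case=False, include_empty=False):
    """Hypothetical model of element extraction; not setop's actual code."""
    # Step 1: split the stream at every match of the separator regex.
    chunks = re.split(separator, text)
    # Step 2: if an element regex is given, keep only its matches
    # (assumption: matches are searched for inside each chunk).
    if element is not None:
        chunks = [m.group(0) for c in chunks for m in re.finditer(element, c)]
    # Step 3: trim the given characters from both ends of every element.
    if trim:
        chunks = [c.strip(trim) for c in chunks]
    seen = {}
    for c in chunks:
        if c == "" and not include_empty:
            continue  # empty elements are ignored unless requested
        key = c.casefold() if ignore_case else c
        seen.setdefault(key, c)  # with ignore_case, the first spelling wins
    return sorted(seen.values())


print(parse_elements("Word;WORD; :x-;\n", separator=r"[;\n]",
                     trim=" :-", ignore_case=True))
# ['Word', 'x']
```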

When describing strings and characters for the output separator or for the option --trim, you can use escape sequences like \t, \n, \" and \'. Be aware that some of these sequences (especially \\ and \") might be interpreted by your shell before the string is passed to setop. In that case you have to write \\\\ to describe a \ and \\\" to describe a ". You can check your shell’s behavior with echo "\\ and \""
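
To see which characters survive the shell's own quote processing, Python's shlex module can serve as a stand-in. It follows POSIX quoting rules, so it only approximates what a particular shell does; the command lines below are illustrative.

```python
import shlex

# Inside double quotes, a POSIX shell only consumes a backslash that
# precedes another backslash or a double quote; \t is passed on untouched.
trim_args = shlex.split(r'setop --trim ":-\t"')
print(trim_args)    # ['setop', '--trim', ':-\\t']

# Four backslashes on the command line arrive as two, which setop can
# then read as the escape sequence for a single backslash.
sep_args = shlex.split(r'setop -o "\\\\"')
print(sep_args)     # ['setop', '-o', '\\\\']

# The check suggested above: one level of escaping is consumed.
echo_args = shlex.split(r'echo "\\ and \""')
print(echo_args)    # ['echo', '\\ and "']
```

The printed lists use Python's repr notation, so each `\\` in the output stands for one literal backslash.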

Special boolean queries (e. g. checking whether an element is contained in the set) print nothing on success unless --verbose is given; they only return the exit code EXIT_SUCCESS (0). If the query is unsuccessful (e. g. the element is not contained in the set), the exit code is guaranteed to differ from both EXIT_SUCCESS and EXIT_FAILURE (1); here it is 3. This way, setop can be used in shell scripts.
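
A calling program can use these exit codes to tell "no" apart from a real error. The sketch below shows one way to do that from Python. The helper names interpret_query_exit and contains are hypothetical, and contains assumes a setop binary is available on PATH.

```python
import subprocess


def interpret_query_exit(code):
    """Map a setop query exit code to True/False (hypothetical helper)."""
    if code == 0:
        return True   # EXIT_SUCCESS: the query holds
    if code == 3:
        return False  # the value this page documents for an unsuccessful query
    raise RuntimeError(f"setop failed with exit code {code}")


def contains(element, *files):
    """Run `setop --quiet -c element files...`; requires setop on PATH."""
    proc = subprocess.run(["setop", "--quiet", "-c", element, *files])
    return interpret_query_exit(proc.returncode)
```

A call such as contains("foobar", "A.txt") would then return True or False, and raise only when setop itself reports a failure.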

EXAMPLES

setop -c ":fooBAR-:" --trim ":-\t" -C -d B.txt A.txt

check case-insensitively whether the element "foobar" is contained in A minus B

setop A.txt - -i B.txt --input-element "\d+"

output the intersection of standard input, A, and B, where elements are recognized as strings of one or more digits; i. e. elements are non-negative integers
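
The effect of this example can be approximated in Python. The input texts below are made up for illustration, the helper name digits is hypothetical, and restricting \d to 0-9 via re.ASCII is an assumption about how setop's regex engine treats \d.

```python
import re


def digits(text):
    # re.ASCII restricts \d to the characters 0-9 (assumption, see above).
    return set(re.findall(r"\d+", text, flags=re.ASCII))


stdin_text = "7 apples, 42 pears"   # stands in for standard input
a_text = "42\n7\n100\n"             # stands in for A.txt
b_text = "id=42; id=7; id=8"        # stands in for B.txt

common = sorted(digits(stdin_text) & digits(a_text) & digits(b_text))
print(common)  # ['42', '7']
```

The elements are strings, so this sketch sorts them lexicographically rather than numerically.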

setop -s A.txt B.txt --input-separator "[[:space:]-]"

find all elements contained in A *or* B, but not in both, where a whitespace character (i. e. \v \t \n \r \f or space) or a minus sign is interpreted as a separator between elements
December 2023 setop 0.1