SETOP(1) | User Commands | SETOP(1) |
NAME
setop - make set of strings from input
SYNOPSIS
setop [-h] [--quiet | --verbose] [-C] [--include-empty] [-n insepar | -l elregex] [-o outsepar] [-t trimchars] [-u|i|s] [inputfilename]* [-d filename]* [-# | --is-empty | -c element | -e filename | -b filename | -p filename]
DESCRIPTION
Apply set operations like union, intersection, or set difference to input files and print the resulting set (sorted and with unique string elements) to standard output, or answer special queries such as the number of elements.
OPTIONS
- --help
- produce this help message and exit
- --version
- output name and version
- --quiet
- suppress all output messages in case of special queries (e. g. when checking whether an element is contained in the set)
- --verbose
- always use output messages in case of special queries (i. e. also output message on success)
- -C [ --ignore-case ]
- handle input elements case-insensitively
- --include-empty
- don’t ignore empty elements (these can come from empty lines, trimming, etc.)
- -n [ --input-separator ] arg
- describe the form of an input separator as a regular expression in ECMAScript syntax; the default is a new line (if --input-element is not given); when you set the input separator manually, remember to include the new line character \n if line breaks should still separate elements
- -l [ --input-element ] arg
- describe the form of input elements as regular expression in ECMAScript syntax
- -o [ --output-separator ] arg (=\n)
- string for separating output elements; escape sequences are allowed
- -t [ --trim ] arg
- trim all given characters at beginning and end of elements (escape sequences allowed)
- -u [ --union ]
- unite all given input sets (default)
- -i [ --intersection ]
- intersect all given input sets
- -s [ --symmetric-difference ]
- build symmetric difference for all given input sets
- -d [ --difference ] arg
- subtract all elements in given file from output set
- -# [ --count ]
- just output number of (different) elements, don’t list them
- --is-empty
- check if resulting set is empty
- -c [ --contains ] arg
- check if given element is contained in set
- -e [ --equal ] arg
- check set equality, i. e. check if output corresponds with content of file
- -b [ --subset ] arg
- check if content of file is subset of output set
- -p [ --superset ] arg
- check if content of file is superset of output set
No input filename or "-" is equal to reading from standard input.
The sequence of events of setop is as follows: First, all input files are parsed and combined according to one of the options -u, -i, or -s. After that, all inputs from option -d are parsed and removed from the result of the first step. Finally, the desired output is printed to the screen: the set itself, or its number of elements, or a comparison to another set (option -e), etc.
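The three steps can be imitated with standard tools. The following is only a rough sketch of the order of operations, not how setop works internally; the file names and their contents are made up for illustration, and the corresponding setop call would presumably be setop A.txt B.txt -d D.txt -#.

```shell
# Scratch directory with two input sets and one set to subtract (all hypothetical).
tmp=$(mktemp -d)
printf 'b\na\nc\n' > "$tmp/A.txt"
printf 'c\nd\n'    > "$tmp/B.txt"
printf 'a\n'       > "$tmp/D.txt"

export LC_ALL=C   # make sort and comm agree on the ordering

# Step 1: combine all inputs (here a union, as with -u).
sort -u "$tmp/A.txt" "$tmp/B.txt" > "$tmp/combined"

# Step 2: remove every element listed in the file given with -d.
result=$(sort -u "$tmp/D.txt" | comm -23 "$tmp/combined" -)

# Step 3: print the set itself, or a property of it such as its size (as with -#).
printf '%s\n' "$result"
count=$(printf '%s\n' "$result" | wc -l | tr -d ' ')
echo "number of elements: $count"

rm -rf "$tmp"
```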
By default, each line of an input stream is considered to be an element. You can change this by defining regular expressions with the options --input-separator or --input-element. When both are used, the input stream is first split according to the separator and then filtered by the desired input element form. Once the elements have been found, they are trimmed according to the argument given with --trim. The option -C lets you treat Word and WORD as equal; only the first occurrence across all input streams is kept. Note that -C does not affect the regular expressions used in --input-separator and --input-element.
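To make the parsing order concrete, here is a simplified imitation with standard tools: split on separators, trim, drop empty elements, then fold case while keeping the first spelling. It is a sketch under assumptions, not setop's implementation; the sample input is invented, the --input-element filter is left out, and the final sorting that setop applies is not modeled.

```shell
export LC_ALL=C
input='  Apple, banana ;apple;; Cherry '

elements=$(printf '%s\n' "$input" |
  tr ',;' '\n\n' |               # split on the separator characters , and ;
  sed 's/^ *//; s/ *$//' |       # trim spaces at both ends of each element
  grep -v '^$' |                 # drop empty elements (default without --include-empty)
  awk '!seen[tolower($0)]++')    # like -C: keep only the first spelling of each element

printf '%s\n' "$elements"
```

The second "apple" disappears because "Apple" was seen first, and the empty element between the two semicolons is dropped.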
When describing strings and characters for the output separator or for the option --trim, you can use escape sequences like \t, \n, \" and \'. Be aware that some of these sequences (especially \\ and \") might be interpreted by your shell before the string is passed to setop. In that case you have to write \\\\ and \\\" just to describe a \ or a ", respectively. You can check your shell’s behavior with echo "\\ and \""
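The following sketch shows what a POSIX shell hands to a program for double-quoted and single-quoted arguments. It uses printf instead of echo, because some echo implementations interpret backslashes themselves; the helper name show is made up for this illustration.

```shell
# Print the first argument exactly as the shell delivered it.
show() { printf '%s\n' "$1"; }

double=$(show "\\ and \"")   # double quotes: the shell turns \\ into \ and \" into "
single=$(show '\\ and \"')   # single quotes: every character is passed through unchanged

printf 'double-quoted: %s\n' "$double"
printf 'single-quoted: %s\n' "$single"
```

Single quotes are therefore often the simplest way to pass an escape sequence such as \t to setop unmodified.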
Special boolean queries (e. g. check if element is contained in set) don’t return anything in case of success except their exit code EXIT_SUCCESS (0). In case the query is unsuccessful (e. g. element not contained in set) the exit code is guaranteed to be unequal to EXIT_SUCCESS and to EXIT_FAILURE (1). (Here it is 3.) This way, setop can be used in the shell.
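A script can branch on these exit codes. The sketch below assumes only what is stated above (0 on success, 1 for EXIT_FAILURE, any other value for a negative answer); the helper name describe_status, the file A.txt, and the element foobar are placeholders.

```shell
# Translate a setop exit status into a short description (hypothetical helper).
describe_status() {
  case "$1" in
    0) echo "query succeeded" ;;
    1) echo "setop failed (EXIT_FAILURE)" ;;
    *) echo "query answered no" ;;
  esac
}

# Run the real query only if setop and the example file are present.
if command -v setop >/dev/null 2>&1 && [ -f A.txt ]; then
  status=0
  setop --quiet -c "foobar" A.txt || status=$?
  describe_status "$status"
fi
```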
EXAMPLES
setop -c ":fooBAR-:" --trim ":-\t" -C -d B.txt A.txt
- case-insensitive check if element "foobar" is contained in A minus B
setop A.txt - -i B.txt --input-element "\d+"
- output intersection of standard input, A, and B, where elements are recognized as strings of digits with at least one character; i. e. elements are non-negative integers
setop -s A.txt B.txt --input-separator "[[:space:]-]"
- find all elements contained in A *or* B, not both, where a whitespace (i. e. \v \t \n \r \f or space) or a minus is interpreted as a separator between elements
December 2023 | setop 0.1 |