mlpack_preprocess_binarize(1) User Commands mlpack_preprocess_binarize(1)

NAME

mlpack_preprocess_binarize - binarize data

SYNOPSIS


mlpack_preprocess_binarize -i unknown [-d int] [-t double] [-V bool] [-o unknown] [-h -v]

DESCRIPTION

This utility takes a dataset and binarizes its variables to either 0 or 1 according to a given threshold. Binarization can be applied to a single dimension or to the whole dataset. The dimension to binarize can be specified using the '--dimension (-d)' parameter; if left unspecified, every dimension will be binarized. The threshold for binarization can be specified with the '--threshold (-t)' parameter; the default threshold is 0.0.

The binarized matrix may be saved with the '--output_file (-o)' output parameter.

For example, if we want to set all variables greater than 5 in the dataset 'X.csv' to 1 and all variables less than or equal to 5 to 0, and save the result to 'Y.csv', we could run

$ mlpack_preprocess_binarize --input_file X.csv --threshold 5 --output_file Y.csv

But if we want to apply this to only the first (0th) dimension of 'X.csv', we could instead run

$ mlpack_preprocess_binarize --input_file X.csv --threshold 5 --dimension 0 --output_file Y.csv
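The rule applied by the two commands above can be sketched in a few lines of Python. This is only an illustration of the thresholding described here, not mlpack's implementation: the function name binarize, the sample values, and the assumption that each row of the CSV file is one point (so dimension 0 is the first column) are all hypothetical.

```python
def binarize(rows, threshold=0.0, dimension=None):
    """Return a copy of rows with selected values mapped to 0.0 or 1.0.

    A value strictly greater than threshold becomes 1.0; any other value
    becomes 0.0. If dimension is None, every column is binarized;
    otherwise only the column with that index is changed.
    """
    result = []
    for row in rows:
        new_row = [
            (1.0 if value > threshold else 0.0)
            if dimension is None or index == dimension
            else value
            for index, value in enumerate(row)
        ]
        result.append(new_row)
    return result


# Hypothetical stand-in for the contents of X.csv (assumed: one point per row).
data = [[1.0, 7.0], [5.0, 5.5], [9.0, 2.0]]

# Mirrors "--threshold 5": every dimension is binarized.
whole = binarize(data, threshold=5)

# Mirrors "--threshold 5 --dimension 0": only the first dimension is binarized.
first_only = binarize(data, threshold=5, dimension=0)

print(whole)
print(first_only)
```

Note that a value exactly equal to the threshold (5.0 in the sample) maps to 0, matching the "less than or equal to" wording above.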

REQUIRED INPUT OPTIONS

--input_file (-i) [unknown] Input data matrix.

OPTIONAL INPUT OPTIONS

--dimension (-d) [int] Dimension to apply the binarization to. If not set, the program will binarize every dimension by default. Default value 0.
--help (-h) [bool] Default help info.
--info [string] Print help on a specific option. Default value ''.
--threshold (-t) [double] Threshold to be applied for binarization. If not set, the threshold defaults to 0.0. Default value 0.
--verbose (-v) [bool] Display informational messages and the full list of parameters and timers at the end of execution.
--version (-V) [bool] Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

--output_file (-o) [unknown] Matrix in which to save the output.

ADDITIONAL INFORMATION

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.

11 January 2024 mlpack-4.3.0