User Commands

NAME

mlpack_preprocess_scale - scale data

SYNOPSIS



 mlpack_preprocess_scale -i unknown [-r double] [-m unknown] [-f bool] [-e int] [-b int] [-a string] [-s int] [-V bool] [-o unknown] [-M unknown] [-h -v]

DESCRIPTION

This utility takes a dataset and performs feature scaling using one of six scaler methods: 'max_abs_scaler', 'mean_normalization', 'min_max_scaler', 'standard_scaler', 'pca_whitening' and 'zca_whitening'. It takes a matrix given with '--input_file (-i)' and a scaling method given with the '--scaler_method (-a)' parameter (the default is 'standard_scaler'), and outputs a matrix with scaled features.
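The two simplest methods can be illustrated with a short Python sketch. This is not mlpack code: the function names are hypothetical, features are taken to be the columns of the input (as in a CSV file with one point per row), and the formulas are the usual textbook definitions, which are assumed here to match what 'max_abs_scaler' and 'mean_normalization' compute.

```python
def max_abs_scale(rows):
    """Divide each column by its largest absolute value (hypothetical helper)."""
    cols = list(zip(*rows))
    # Fall back to 1.0 for an all-zero column to avoid dividing by zero
    # (a choice made for this sketch).
    peaks = [max(abs(v) for v in c) or 1.0 for c in cols]
    return [[v / p for v, p in zip(row, peaks)] for row in rows]


def mean_normalize(rows):
    """Subtract the column mean, then divide by the column range."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 for a constant column (a choice made for this sketch).
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, spans)] for row in rows]


data = [[-4.0, 1.0], [2.0, 2.0], [1.0, 3.0]]
max_abs_result = max_abs_scale(data)
mean_norm_result = mean_normalize(data)
```

After max-abs scaling every value lies in [-1, 1]; after mean normalization every column has zero mean and a range of 1.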

The scaled feature matrix may be saved with the '--output_file (-o)' output parameter.

The model used to scale features can be saved using '--output_model_file (-M)' and later loaded back using '--input_model_file (-m)'.

As a simple example, to scale the dataset 'X.csv' into 'X_scaled.csv' with standard_scaler as the scaler_method, we could run

$ mlpack_preprocess_scale --input_file X.csv --output_file X_scaled.csv --scaler_method standard_scaler
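As a rough illustration of what standard scaling does to each feature, the following Python sketch centers every column on zero and divides by its standard deviation. It is not mlpack code: the function name is hypothetical, and the use of the population standard deviation (dividing by n) and the handling of constant columns are assumptions of this sketch.

```python
import math


def standard_scale(rows):
    """Scale each column to zero mean and unit standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Population standard deviation (divide by n); assumed for this sketch.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    # Leave constant columns unscaled instead of dividing by zero.
    stds = [s if s != 0 else 1.0 for s in stds]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)]
              for row in rows]
    return scaled, means, stds


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled, means, stds = standard_scale(data)
```

Both columns of `scaled` end up with the same values even though the original columns differ by a factor of ten, which is the point of scaling features before distance-based methods.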

To whiten the dataset 'X.csv' into 'X_whitened.csv' with pca_whitening as the scaler_method and 0.01 as the regularization parameter, we could run

$ mlpack_preprocess_scale --input_file X.csv --output_file X_whitened.csv --scaler_method pca_whitening --epsilon 0.01

You can also transform a scaled dataset back to the original using '--inverse_scaling (-f)'. An example that rescales 'X_scaled.csv' back into 'X.csv' using a saved model loaded with '--input_model_file (-m)' is:

$ mlpack_preprocess_scale --input_file X_scaled.csv --output_file X.csv --inverse_scaling --input_model_file saved.bin
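Inverse scaling works because the saved model keeps the statistics learned from the original data. The Python sketch below shows the round trip for standard scaling. It is not mlpack code: the dictionary used as a "model" and the function names are hypothetical stand-ins for the model file, and the population standard deviation is an assumption of this sketch.

```python
import math


def fit(rows):
    """Learn per-column mean and standard deviation (the 'model')."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    # Leave constant columns unscaled instead of dividing by zero.
    stds = [s if s != 0 else 1.0 for s in stds]
    return {"mean": means, "std": stds}


def transform(rows, model):
    """Forward scaling: (x - mean) / std."""
    return [[(v - m) / s for v, m, s in zip(row, model["mean"], model["std"])]
            for row in rows]


def inverse_transform(rows, model):
    """Inverse scaling: x * std + mean, using the same model."""
    return [[v * s + m for v, m, s in zip(row, model["mean"], model["std"])]
            for row in rows]


original = [[1.0, 10.0], [2.0, 20.0], [4.0, 50.0]]
model = fit(original)
scaled = transform(original, model)
restored = inverse_transform(scaled, model)
```

`restored` matches `original` up to floating-point rounding. Without the model, the scaled values alone are not enough to recover the original data.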

As another example, to scale the dataset 'X.csv' into 'X_scaled.csv' with min_max_scaler as the scaler_method, using a scaling range of 1 to 3 instead of the default 0 to 1, we could run

$ mlpack_preprocess_scale --input_file X.csv --output_file X_scaled.csv --scaler_method min_max_scaler --min_value 1 --max_value 3
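The effect of the '--min_value' and '--max_value' parameters can be sketched in Python as a linear map that sends each column's minimum to min_value and its maximum to max_value. This is not mlpack code: the function name is hypothetical and the handling of constant columns is a choice made for this sketch.

```python
def min_max_scale(rows, min_value=0.0, max_value=1.0):
    """Map each column linearly onto [min_value, max_value]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    # Use a span of 1.0 for constant columns to avoid dividing by zero.
    spans = [(h - l) if h != l else 1.0 for l, h in zip(lows, highs)]
    width = max_value - min_value
    return [[min_value + (v - l) * width / s
             for v, l, s in zip(row, lows, spans)]
            for row in rows]


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = min_max_scale(data, min_value=1.0, max_value=3.0)
```

With the range 1 to 3, the smallest value in every column becomes 1 and the largest becomes 3.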

REQUIRED INPUT OPTIONS

--input_file (-i) [unknown]: Matrix containing data.

OPTIONAL INPUT OPTIONS

--epsilon (-r) [double]: Regularization parameter for pca_whitening or zca_whitening; should be between -1 and 1. Default value 1e-06.
--help (-h) [bool]: Default help info.
--info [string]: Print help on a specific option. Default value ''.
--input_model_file (-m) [unknown]: Input scaling model.
--inverse_scaling (-f) [bool]: Apply inverse scaling to recover the original dataset.
--max_value (-e) [int]: Ending value of range for min_max_scaler. Default value 1.
--min_value (-b) [int]: Starting value of range for min_max_scaler. Default value 0.
--scaler_method (-a) [string]: Method to use for scaling. Default value 'standard_scaler'.
--seed (-s) [int]: Random seed (0 for std::time(NULL)). Default value 0.
--verbose (-v) [bool]: Display informational messages and the full list of parameters and timers at the end of execution.
--version (-V) [bool]: Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

--output_file (-o) [unknown]: Matrix to save scaled data to.

--output_model_file (-M) [unknown]: Output scaling model.

ADDITIONAL INFORMATION

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.

28 January 2025

mlpack-4.5.1
