
mlpack_lars(1) User Commands mlpack_lars(1)

NAME

mlpack_lars - lars

SYNOPSIS


mlpack_lars [-i unknown] [-m unknown] [-l double] [-L double] [-n bool] [-N bool] [-r unknown] [-t unknown] [-c bool] [-V bool] [-M unknown] [-o unknown] [-h -v]

DESCRIPTION

An implementation of LARS: Least Angle Regression (Stagewise/laSso). This is a stage-wise homotopy-based algorithm for L1-regularized linear regression (LASSO) and L1+L2-regularized linear regression (Elastic Net).

This program is able to train a LARS/LASSO/Elastic Net model or load a model from file, output regression predictions for a test set, and save the trained model to a file. The LARS algorithm is described in more detail below:

Let X be a matrix where each row is a point and each column is a dimension, and let y be a vector of targets.

The Elastic Net problem is to solve


min_beta 0.5 || X * beta - y ||_2^2 + lambda_1 ||beta||_1 + 0.5 lambda_2 ||beta||_2^2

If lambda_1 > 0 and lambda_2 = 0, the problem is the LASSO. If lambda_1 > 0 and lambda_2 > 0, the problem is the Elastic Net. If lambda_1 = 0 and lambda_2 > 0, the problem is ridge regression. If lambda_1 = 0 and lambda_2 = 0, the problem is unregularized linear regression.
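
The objective and the four cases above can be written out as a short Python sketch. This is only an illustration of the formula, not the LARS solver and not mlpack code; the function names elastic_net_objective and problem_type are hypothetical, and the sketch assumes both regularization parameters are non-negative.

```python
def elastic_net_objective(X, y, beta, lambda1, lambda2):
    """Evaluate 0.5*||X*beta - y||_2^2 + lambda1*||beta||_1 + 0.5*lambda2*||beta||_2^2.

    X is a list of rows (one row per point), y is a list of targets,
    and beta is a list of coefficients (one per dimension).
    """
    # One residual per point: (row of X) . beta - target.
    residuals = [
        sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) - y_i
        for row, y_i in zip(X, y)
    ]
    squared_error = 0.5 * sum(r * r for r in residuals)
    l1_penalty = lambda1 * sum(abs(b) for b in beta)
    l2_penalty = 0.5 * lambda2 * sum(b * b for b in beta)
    return squared_error + l1_penalty + l2_penalty


def problem_type(lambda1, lambda2):
    """Name the problem selected by a (lambda1, lambda2) pair.

    Assumes lambda1 >= 0 and lambda2 >= 0.
    """
    if lambda1 > 0 and lambda2 > 0:
        return "Elastic Net"
    if lambda1 > 0:
        return "LASSO"
    if lambda2 > 0:
        return "ridge regression"
    return "unregularized linear regression"
```

For instance, problem_type(0.4, 0) names the LASSO, which is the setting used in the training example in this page.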

For efficiency reasons, it is not recommended to use this algorithm with '--lambda1 (-l)' = 0. In that case, use the 'linear_regression' program, which implements both unregularized linear regression and ridge regression.

To train a LARS/LASSO/Elastic Net model, the '--input_file (-i)' and '--responses_file (-r)' parameters must be given. The '--lambda1 (-l)', '--lambda2 (-L)', and '--use_cholesky (-c)' parameters control the training options. A trained model can be saved with the '--output_model_file (-M)' output parameter. If no training is desired at all, a model can be passed via the '--input_model_file (-m)' parameter.

The program can also provide predictions for test data using either the trained model or the given input model. Test points can be specified with the '--test_file (-t)' parameter. Predicted responses to the test points can be saved with the '--output_predictions_file (-o)' output parameter.

For example, the following command trains a model on the data 'data.csv' and responses 'responses.csv' with lambda1 set to 0.4 and lambda2 set to 0 (so, LASSO is being solved), and then the model is saved to 'lasso_model.bin':

$ mlpack_lars --input_file data.csv --responses_file responses.csv --lambda1 0.4 --lambda2 0 --output_model_file lasso_model.bin

The following command uses the 'lasso_model.bin' to provide predicted responses for the data 'test.csv' and save those responses to 'test_predictions.csv':

$ mlpack_lars --input_model_file lasso_model.bin --test_file test.csv --output_predictions_file test_predictions.csv
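
When the program is driven from a script, the parameter rules described above can be checked before the command is launched. The following Python sketch assembles an argument list from the long options named in this page. The helper build_lars_command is hypothetical and not part of mlpack; it only encodes the rule that training needs both '--input_file' and '--responses_file', and that otherwise a model must be supplied with '--input_model_file'.

```python
import shlex


def build_lars_command(input_file=None, responses_file=None,
                       input_model_file=None, lambda1=None, lambda2=None,
                       test_file=None, output_model_file=None,
                       output_predictions_file=None):
    """Return an argument list for mlpack_lars (hypothetical helper)."""
    training = input_file is not None or responses_file is not None
    if training and (input_file is None or responses_file is None):
        raise ValueError(
            "training needs both --input_file and --responses_file")
    if not training and input_model_file is None:
        raise ValueError(
            "without training data, --input_model_file must be given")

    # Only options documented in this page are emitted, in a fixed order.
    options = [
        ("--input_file", input_file),
        ("--responses_file", responses_file),
        ("--input_model_file", input_model_file),
        ("--lambda1", lambda1),
        ("--lambda2", lambda2),
        ("--test_file", test_file),
        ("--output_model_file", output_model_file),
        ("--output_predictions_file", output_predictions_file),
    ]
    command = ["mlpack_lars"]
    for flag, value in options:
        if value is not None:
            command.extend([flag, str(value)])
    return command


if __name__ == "__main__":
    # Rebuild the prediction example from this page and print it as a shell line.
    cmd = build_lars_command(input_model_file="lasso_model.bin",
                             test_file="test.csv",
                             output_predictions_file="test_predictions.csv")
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run from the Python standard library on a system where mlpack_lars is installed; the sketch itself only builds and prints the command.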

OPTIONAL INPUT OPTIONS

--help (-h) [bool]  Default help info.
--info [string]  Print help on a specific option. Default value ''.
--input_file (-i) [unknown]  Matrix of covariates (X).
--input_model_file (-m) [unknown]  Trained LARS model to use.
--lambda1 (-l) [double]  Regularization parameter for l1-norm penalty. Default value 0.
--lambda2 (-L) [double]  Regularization parameter for l2-norm penalty. Default value 0.
--no_intercept (-n) [bool]  Do not fit an intercept in the model.
--no_normalize (-N) [bool]  Do not normalize data to unit variance before modeling.
--responses_file (-r) [unknown]  Matrix of responses/observations (y).
--test_file (-t) [unknown]  Matrix containing points to regress on (test points).
--use_cholesky (-c) [bool]  Use Cholesky decomposition during computation rather than explicitly computing the full Gram matrix.
--verbose (-V) [bool]  Display informational messages and the full list of parameters and timers at the end of execution.
--version (-v) [bool]  Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

--output_model_file (-M) [unknown]  Output LARS model.
--output_predictions_file (-o) [unknown]  If --test_file is specified, this file is where the predicted responses will be saved.

ADDITIONAL INFORMATION

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.

23 September 2024 mlpack-4.5.0