LIBLINEAR-TRAIN(1)

General Commands Manual


NAME

liblinear-train - train a linear classifier and produce a model

SYNOPSIS

liblinear-train [options] training_set_file [model_file]

DESCRIPTION

liblinear-train trains a linear classifier using liblinear and produces a model suitable for use with liblinear-predict(1).

training_set_file is the file containing the data used for training. model_file is the file to which the model will be saved. If model_file is not provided, it defaults to training_set_file.model.
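This page does not describe the layout of training_set_file. As a minimal sketch, assuming the sparse text format commonly used by LIBSVM and LIBLINEAR (one instance per line, written as a label followed by ascending 1-based index:value pairs), a line could be produced as follows. The helper name format_instance is hypothetical and is not part of liblinear.

```python
def format_instance(label, features):
    """Render one training instance as 'label index:value ...'.

    `features` maps 1-based feature indices to values. Zero values are
    skipped because the format is sparse (assumption: LIBSVM-style input).
    """
    parts = [f"{label:g}"]
    for index in sorted(features):
        value = features[index]
        if value != 0:
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


# Two instances with labels +1 and -1.
lines = [
    format_instance(1, {3: 0.5, 1: 2.0}),
    format_instance(-1, {2: 1.0, 4: 0.0}),
]
print("\n".join(lines))
```

Writing these lines to a file would give an input that can be passed as training_set_file.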

To obtain good performance, it is sometimes necessary to scale the data. This can be done with svm-scale(1).

OPTIONS

A summary of options is included below.

-s type: Set the type of the solver (default 1):

for multi-class classification

0 ... L2-regularized logistic regression (primal)

1 ... L2-regularized L2-loss support vector classification (dual)

2 ... L2-regularized L2-loss support vector classification (primal)

3 ... L2-regularized L1-loss support vector classification (dual)

4 ... support vector classification by Crammer and Singer

5 ... L1-regularized L2-loss support vector classification

6 ... L1-regularized logistic regression

7 ... L2-regularized logistic regression (dual)

for regression

11 ... L2-regularized L2-loss support vector regression (primal)

12 ... L2-regularized L2-loss support vector regression (dual)

13 ... L2-regularized L1-loss support vector regression (dual)

for outlier detection

21 ... one-class support vector machine (dual)

-c cost: Set the parameter C (default: 1)
-p epsilon: Set the epsilon in loss function of epsilon-SVR (default: 0.1)
-n nu: Set the parameter nu of one-class SVM (default: 0.5)
-e epsilon: Set the tolerance of the termination criterion:

-s 0 and 2:

|f'(w)|_2 <= epsilon*min(pos,neg)/l*|f'(w0)|_2, where f is the primal function and pos/neg are the number of positive/negative data (default: 0.01)

-s 11:

|f'(w)|_2 <= epsilon*|f'(w0)|_2 (default: 0.0001)

-s 1, 3, 4, 7 and 21:

Dual maximal violation <= epsilon; similar to libsvm (default: 0.1, except 0.01 for -s 21)

-s 5 and 6:

|f'(w)|_inf <= epsilon*min(pos,neg)/l*|f'(w0)|_inf, where f is the primal function (default: 0.01)

-s 12 and 13:

|f'(alpha)|_1 <= epsilon |f'(alpha0)|, where f is the dual function (default: 0.1)
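To make the -s 0 and 2 criterion concrete, the sketch below evaluates the right-hand side of that inequality, which is the value the gradient norm must fall below. It assumes that l is the total number of training instances (pos + neg in the two-class case); the function name and the sample numbers are illustrative only.

```python
def primal_stop_threshold(epsilon, pos, neg, initial_grad_norm):
    """Right-hand side of |f'(w)|_2 <= epsilon*min(pos,neg)/l*|f'(w0)|_2.

    Assumption: l = pos + neg, the number of training instances.
    """
    l = pos + neg
    return epsilon * min(pos, neg) / l * initial_grad_norm


# With the default epsilon of 0.01, 100 positive and 900 negative
# instances, and an initial gradient norm of 50:
threshold = primal_stop_threshold(0.01, 100, 900, 50.0)
print(threshold)
```

The factor min(pos,neg)/l shrinks the threshold for unbalanced data, so training runs longer there than the same epsilon would imply on balanced data.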

-B bias

If bias >= 0, then instance x becomes [x; bias]; if bias < 0, then
no bias term is added (default: -1)
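The effect of -B on a single instance can be sketched as follows, with the instance represented as a plain dense list. The function name augment is hypothetical; liblinear itself stores instances sparsely.

```python
def augment(x, bias=-1.0):
    """Return [x; bias] when bias >= 0, otherwise x unchanged."""
    if bias >= 0:
        return list(x) + [bias]
    return list(x)


print(augment([0.5, 2.0], bias=1.0))  # an extra constant feature is appended
print(augment([0.5, 2.0]))            # default bias of -1: no bias term
```

The weight learned for the appended feature then plays the role of the bias term.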

-R

Do not regularize the bias; must be used with -B 1 so that a bias term exists. DON'T use this unless you
know what it is (for -s 0, 2, 5, 6, 11)

-wi weight

Weights adjust the parameter C of different classes (see README for details)

-v n

n-fold cross validation mode

-C

Find parameters (C for -s 0, 2 and C, p for -s 11)

-q

Quiet mode (no outputs).

Option -v randomly splits the data into n parts and calculates cross validation accuracy on them.
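The random split behind -v can be illustrated with a short sketch. This is not liblinear's own partitioning code, only an assumed equivalent: instance indices are shuffled and dealt into n folds, and each fold is held out once while the rest is used for training.

```python
import random


def split_folds(n_instances, n_folds, seed=0):
    """Shuffle instance indices and deal them into n_folds parts."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


folds = split_folds(10, 5)
for held_out, test_idx in enumerate(folds):
    # Everything outside the held-out fold is used for training.
    train_idx = [i for fold in folds if fold is not test_idx for i in fold]
    print(held_out, sorted(test_idx), len(train_idx))
```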

Option -C conducts cross validation under different parameters and finds the best one. This option is supported only by -s 0, -s 2 (for finding C) and -s 11 (for finding C, p). If the solver is not specified, -s 2 is used.

EXAMPLES

Train a linear SVM using L2-loss function:



 liblinear-train data_file

Train a logistic regression model:



 liblinear-train -s 0 data_file

Train a linear one-class SVM which selects roughly 10% of the data as outliers:



 liblinear-train -s 21 -n 0.1 data_file

Do five-fold cross-validation using L2-loss SVM. Use a stopping tolerance of 0.001, smaller than the default 0.1, to obtain more accurate solutions:



 liblinear-train -v 5 -e 0.001 data_file

Conduct cross validation many times by L2-loss SVM and find the parameter C which achieves the best cross validation accuracy:



 liblinear-train -C data_file

For parameter selection by -C, users can specify other solvers (currently -s 0, -s 2 and -s 11 are supported) and a different number of CV folds. Further, users can use the -c option to specify the smallest C value of the search range. This is useful for rerunning the parameter selection procedure from a specified C under a different setting, such as the stricter stopping tolerance -e 0.0001 in the example below. Similarly, for -s 11, users can use the -p option to specify the maximal p value of the search range.



 liblinear-train -C -s 0 -v 3 -c 0.5 -e 0.0001 data_file
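The search performed by -C can be pictured as a loop over candidate C values that keeps the one with the best cross validation accuracy. The sketch below is a simplified illustration under stated assumptions: candidates grow by doubling from the smallest value given with -c, the upper bound max_c is arbitrary, and cv_accuracy is a hypothetical callback standing in for one cross validation run. The real procedure in liblinear may use different bounds and stopping rules.

```python
def search_c(cv_accuracy, start_c=0.5, max_c=1024.0, ratio=2.0):
    """Return (best_c, best_accuracy) over start_c, start_c*ratio, ..."""
    best_c, best_acc = None, float("-inf")
    c = start_c
    while c <= max_c:
        acc = cv_accuracy(c)  # hypothetical: run n-fold CV with this C
        if acc > best_acc:
            best_c, best_acc = c, acc
        c *= ratio
    return best_c, best_acc


# Toy accuracy curve that peaks at C = 8, used only for demonstration.
def toy_accuracy(c):
    return 1.0 - abs(c - 8.0) / 100.0


print(search_c(toy_accuracy, start_c=0.5, max_c=64.0))
```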

Train four classifiers:

positive    negative       Cp    Cn
class 1     class 2,3,4    20    10
class 2     class 1,3,4    50    10
class 3     class 1,2,4    20    10
class 4     class 1,2,3    10    10



 liblinear-train -c 10 -w1 2 -w2 5 -w3 2 four_class_data_file
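The Cp column in the four-class example follows from multiplying the base C by each class weight, with a weight of 1 for classes that have no -wi option. A small sketch of that arithmetic (the function name class_costs is hypothetical):

```python
def class_costs(c, weights, labels):
    """Per-class C: base C times the class weight (1 if not given)."""
    return {label: c * weights.get(label, 1) for label in labels}


# Corresponds to: -c 10 -w1 2 -w2 5 -w3 2 on data with classes 1..4
costs = class_costs(10, {1: 2, 2: 5, 3: 2}, [1, 2, 3, 4])
print(costs)
```

Class 4 has no weight option, so it keeps the base value of 10, which is also the Cn used for the negative side of every classifier.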

If there are only two classes, we train ONE model. The C values for the two classes are 10 and 50:



 liblinear-train -c 10 -w3 1 -w2 5 two_class_data_file

Output probability estimates (for logistic regression only) using liblinear-predict(1):



 liblinear-predict -b 1 test_file data_file.model output_file
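For a two-class logistic regression model, a probability estimate can be understood as the logistic function applied to the decision value w·x. The sketch below is an illustration under that assumption, using dense lists and made-up weights; it does not read a liblinear model file, and the function name is hypothetical.

```python
import math


def positive_class_probability(w, x):
    """Logistic function of the decision value w.x (two-class case)."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-score))


w = [0.8, -0.4]   # made-up weights for demonstration
x = [1.0, 0.5]
p = positive_class_probability(w, x)
print(p, 1.0 - p)  # estimates for the two classes sum to 1
```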

AUTHORS

liblinear-train was written by the LIBLINEAR authors at National Taiwan University for the LIBLINEAR Project.

This manual page was written by Christian Kastner <ckk@debian.org> for the Debian project (and may be used by others).

October 21, 2019

Source file:	liblinear-train.1.en.gz (from liblinear-tools 2.43+dfsg-1)
Source last updated:	2022-03-07T17:13:44Z