mlpack_nmf(1) User Commands mlpack_nmf(1)

NAME

mlpack_nmf - non-negative matrix factorization

SYNOPSIS


mlpack_nmf -i unknown -r int [-q unknown] [-p unknown] [-m int] [-e double] [-s int] [-u string] [-V bool] [-H unknown] [-W unknown] [-h -v]

DESCRIPTION

This program performs non-negative matrix factorization (NMF) on the given dataset, storing the resulting decomposed matrices in the specified files. For an input dataset V, NMF decomposes V into two matrices W and H such that, approximately,

V = W * H

where all elements in W and H are non-negative. If V is of size (n x m), then W will be of size (n x r) and H will be of size (r x m), where r is the rank of the factorization (specified by the '--rank (-r)' parameter).
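The shape relationship above can be checked with a small sketch. The factors below are hypothetical values chosen by hand, not output from mlpack_nmf, and the 'matmul' helper is written only for this illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    assert all(len(row) == len(b) for row in a), "inner dimensions must match"
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical rank-1 factors: W is (n x r) = (3 x 1), H is (r x m) = (1 x 2).
W = [[1.0], [2.0], [3.0]]
H = [[4.0, 5.0]]

# The product has the shape of the original dataset: (n x m) = (3 x 2).
V = matmul(W, H)
print(len(V), "x", len(V[0]))
print(V)
```

A smaller rank r gives a more compact representation, since W and H together hold r * (n + m) values instead of the n * m values in V.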

Optionally, the desired update rules for each NMF iteration can be chosen from the following list:

  • multdist: multiplicative distance-based update rules (Lee and Seung 1999)
  • multdiv: multiplicative divergence-based update rules (Lee and Seung 1999)
  • als: alternating least squares update rules (Paatero and Tapper 1994)
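As an illustration of the first rule in the list, the following is a minimal sketch of the multiplicative updates commonly attributed to Lee and Seung for the Euclidean distance objective. It is written in plain Python for readability and is not mlpack's implementation; the update order, the small 'eps' guard against division by zero, and the example matrix are assumptions made for this sketch.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def scale(x, num, den, eps=1e-12):
    """Element-wise x * num / den; eps (an assumption) avoids division by zero."""
    return [[x[i][j] * num[i][j] / (den[i][j] + eps) for j in range(len(x[0]))]
            for i in range(len(x))]


def multiplicative_step(V, W, H):
    """One distance-based multiplicative update of H, then of W."""
    Wt = transpose(W)
    H = scale(H, matmul(Wt, V), matmul(matmul(Wt, W), H))
    Ht = transpose(H)
    W = scale(W, matmul(V, Ht), matmul(W, matmul(H, Ht)))
    return W, H


def frobenius_error(V, W, H):
    """Euclidean (Frobenius) distance between V and W * H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra)) ** 0.5


# Hypothetical non-negative input of size (4 x 3), factorized with rank 2.
V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.5, 0.0, 1.5], [3.0, 1.0, 2.0]]
n, m, r = 4, 3, 2
rng = random.Random(0)
W = [[rng.uniform(0.1, 1.0) for _ in range(r)] for _ in range(n)]
H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(r)]

error_before = frobenius_error(V, W, H)
for _ in range(100):
    W, H = multiplicative_step(V, W, H)
error_after = frobenius_error(V, W, H)
print(error_before, error_after)
```

Because each update only multiplies existing entries by non-negative ratios, W and H stay non-negative as long as they start that way.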

The maximum number of iterations is specified with '--max_iterations (-m)', and the minimum residue required for algorithm termination is specified with the '--min_residue (-e)' parameter.
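How the two stopping parameters interact can be sketched as a generic loop. The function below is hypothetical and only mirrors the documented behaviour (an iteration cap, where 0 means no cap, plus a residue threshold); the 'step' and 'residue' callables and the toy halving example are assumptions, and the way mlpack actually computes its residue is not reproduced here.

```python
def iterate(step, residue, state, max_iterations=10000, min_residue=1e-5):
    """Apply `step` until the residue falls below `min_residue` or the
    iteration cap is reached. A cap of 0 means run until convergence."""
    iterations = 0
    while max_iterations == 0 or iterations < max_iterations:
        state = step(state)
        iterations += 1
        if residue(state) < min_residue:
            break
    return state, iterations


# Toy stand-in for an NMF iteration: the "residue" halves on every step.
halve = lambda x: x / 2.0

converged, steps_needed = iterate(halve, abs, 1.0)                   # stops on residue
capped, steps_capped = iterate(halve, abs, 1.0, max_iterations=5)    # stops on the cap
uncapped, steps_uncapped = iterate(halve, abs, 1.0, max_iterations=0)
print(steps_needed, steps_capped, steps_uncapped)
```

With a cap of 0 the loop relies entirely on the residue test, so a threshold that is never reached would keep it running indefinitely.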

For example, to run NMF on the input matrix 'V.csv' using the 'multdist' update rules with a rank-10 decomposition and storing the decomposed matrices into 'W.csv' and 'H.csv', the following command could be used:

$ mlpack_nmf --input_file V.csv --w_file W.csv --h_file H.csv --rank 10 --update_rules multdist
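The command above expects a non-negative matrix in 'V.csv'. For experimentation, such a file could be generated with a short script like the sketch below. The helper names, the matrix size, and the use of a temporary directory are assumptions made for this example; how mlpack interprets the rows and columns of the file is not covered here.

```python
import csv
import os
import random
import tempfile


def write_nonnegative_csv(path, rows, cols, seed=0):
    """Write a rows x cols matrix of values in [0, 1) as plain CSV."""
    rng = random.Random(seed)
    matrix = [[rng.random() for _ in range(cols)] for _ in range(rows)]
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(matrix)
    return matrix


def read_csv_matrix(path):
    """Read a numeric CSV file back into a list of rows."""
    with open(path, newline="") as handle:
        return [[float(value) for value in row] for row in csv.reader(handle) if row]


# Write the file into a temporary directory and read it back as a sanity check.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "V.csv")
    written = write_nonnegative_csv(path, rows=20, cols=15)
    loaded = read_csv_matrix(path)

print(len(loaded), "x", len(loaded[0]))
```

The same 'read_csv_matrix' helper could be pointed at 'W.csv' and 'H.csv' after a run to confirm that every entry is non-negative.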

REQUIRED INPUT OPTIONS

--input_file (-i) [unknown] Input dataset to perform NMF on.
--rank (-r) [int] Rank of the factorization.

OPTIONAL INPUT OPTIONS

--help (-h) [bool] Default help info.
--info [string] Print help on a specific option. Default value ''.
--initial_h_file (-q) [unknown] Initial H matrix.
--initial_w_file (-p) [unknown] Initial W matrix.
--max_iterations (-m) [int] Number of iterations before NMF terminates (0 runs until convergence). Default value 10000.
--min_residue (-e) [double] The minimum root mean square residue allowed for each iteration, below which the program terminates. Default value 1e-05.
--seed (-s) [int] Random seed. If 0, 'std::time(NULL)' is used. Default value 0.
--update_rules (-u) [string] Update rules for each iteration; ( multdist | multdiv | als ). Default value 'multdist'.
--verbose (-v) [bool] Display informational messages and the full list of parameters and timers at the end of execution.
--version (-V) [bool] Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

--h_file (-H) [unknown] Matrix to save the calculated H to.
--w_file (-W) [unknown] Matrix to save the calculated W to.

ADDITIONAL INFORMATION

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.

11 January 2024 mlpack-4.3.0