mlpack_kernel_pca(1) - Linux man page
Name
kernel_pca - kernel principal components analysis
Synopsis
kernel_pca [-h] [-v] -i string -k string -o string [-b double] [-d double] [-S double] [--new_dimensionality int] [-O double] [-s]
Description
This program performs Kernel Principal Components Analysis (KPCA) on the specified dataset with the specified kernel. It projects the data onto the kernel principal components and can optionally reduce the dimensionality by discarding the kernel principal components with the smallest eigenvalues.
When a linear kernel is used, KPCA reduces to regular PCA.
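As a rough illustration of the procedure described above, the following NumPy sketch builds a kernel matrix, centers it, and projects the points onto the leading eigenvectors. It is not mlpack's implementation: the function name kernel_pca, the row-per-point layout, and the scaling of the output coordinates by the square root of each eigenvalue are assumptions made for this example, and other implementations may scale the output differently.

```python
import numpy as np

def kernel_pca(X, kernel, new_dimensionality=0):
    """Sketch of KPCA. X has one point per row (an assumption of this
    example); kernel is a callable taking two vectors."""
    n = X.shape[0]
    # Kernel (Gram) matrix of all pairwise evaluations.
    K = np.array([[kernel(x, y) for y in X] for x in X])
    # Center the kernel matrix, i.e. center the data in feature space.
    J = np.eye(n) - np.full((n, n), 1.0 / n)
    Kc = J @ K @ J
    # Eigendecomposition of the symmetric centered kernel matrix,
    # sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Optionally drop the components with the smallest eigenvalues.
    if new_dimensionality > 0:
        eigvals = eigvals[:new_dimensionality]
        eigvecs = eigvecs[:, :new_dimensionality]
    # Coordinates of each point on the kernel principal components
    # (scaling convention assumed for this sketch).
    return eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
```

With a linear kernel such as `lambda a, b: float(a @ b)`, the coordinates returned by this sketch agree with ordinary PCA scores up to the sign of each component, which is the sense in which KPCA reduces to regular PCA.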
The kernels that are supported are listed below:
• 'linear': the standard linear dot product (same as normal PCA): K(x, y) = x^T y
• 'gaussian': a Gaussian kernel; requires bandwidth: K(x, y) = exp(-(|| x - y || ^ 2) / (2 * (bandwidth ^ 2)))
• 'polynomial': polynomial kernel; requires offset and degree: K(x, y) = (x^T y + offset) ^ degree
• 'hyptan': hyperbolic tangent kernel; requires scale and offset: K(x, y) = tanh(scale * (x^T y) + offset)
• 'laplacian': Laplacian kernel; requires bandwidth: K(x, y) = exp(-(|| x - y ||) / bandwidth)
• 'cosine': cosine distance: K(x, y) = 1 - (x^T y) / (|| x || * || y ||)
The parameters for each of the kernels should be specified with the options --bandwidth, --kernel_scale, --offset, or --degree (or a combination of those options).
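The kernel formulas listed above can be written out directly. The sketch below is a plain NumPy transcription of those formulas, not mlpack code; the function names are chosen to match the kernel names, and the default parameter values follow the defaults documented for --bandwidth, --degree, --kernel_scale, and --offset.

```python
import numpy as np

def linear(x, y):
    # K(x, y) = x^T y
    return float(x @ y)

def gaussian(x, y, bandwidth=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2)))

def polynomial(x, y, offset=0.0, degree=1.0):
    # K(x, y) = (x^T y + offset) ^ degree
    return float((x @ y + offset) ** degree)

def hyptan(x, y, scale=1.0, offset=0.0):
    # K(x, y) = tanh(scale * (x^T y) + offset)
    return float(np.tanh(scale * (x @ y) + offset))

def laplacian(x, y, bandwidth=1.0):
    # K(x, y) = exp(-||x - y|| / bandwidth)
    return float(np.exp(-np.linalg.norm(x - y) / bandwidth))

def cosine(x, y):
    # K(x, y) = 1 - (x^T y) / (||x|| * ||y||), as stated in the list above
    return float(1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))
```

Each function takes two one-dimensional arrays of equal length. Note that with the default degree of 1 and offset of 0, the polynomial kernel coincides with the linear kernel.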
Required Options
--input_file (-i) [string]
    Input dataset to perform KPCA on.
--kernel (-k) [string]
    The kernel to use; see the above documentation for the list of usable kernels.
--output_file (-o) [string]
    File to save modified dataset to.
Options
--bandwidth (-b) [double]
    Bandwidth, for 'gaussian' and 'laplacian' kernels. Default value 1.
--degree (-d) [double]
    Degree of polynomial, for 'polynomial' kernel. Default value 1.
--help (-h)
    Default help info.
--info [string]
    Get help on a specific module or option. Default value ''.
--kernel_scale (-S) [double]
    Scale, for 'hyptan' kernel. Default value 1.
--new_dimensionality [int]
    If not 0, reduce the dimensionality of the output dataset by ignoring the dimensions with the smallest eigenvalues. Default value 0.
--offset (-O) [double]
    Offset, for 'hyptan' and 'polynomial' kernels. Default value 0.
--scale (-s)
    If set, the data will be scaled before performing KPCA such that the variance of each feature is 1.
--verbose (-v)
    Display informational messages and the full list of parameters and timers at the end of execution.
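The preprocessing described for the --scale option above can be pictured with a short NumPy sketch. This is an illustration only: the function name scale_to_unit_variance is hypothetical, the data is assumed to hold one point per row, and the use of the sample variance (ddof=1) is an assumption, since the text above does not say which variance estimate the program uses.

```python
import numpy as np

def scale_to_unit_variance(X):
    """Divide each feature (column) by its standard deviation so that
    every feature has variance 1. Assumes no feature is constant."""
    std = X.std(axis=0, ddof=1)  # sample standard deviation (assumed)
    return X / std
```

Scaling of this kind matters most when features are measured in very different units, because the kernels above depend on dot products and distances that large-valued features would otherwise dominate.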
Additional Information
For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of MLPACK.