JML
JML is a pure Java library for machine learning. The goal of JML is to make machine learning methods easy to use and to speed up code translation from MATLAB to Java.
JML vs. LAML:
LAML is much faster than JML (more than 3 times faster) due to two implementation considerations. First, LAML allows full control of dense and sparse matrices and vectors. Second, LAML makes extensive use of in-place matrix and vector operations, thus avoiding excessive memory allocation and garbage collection.
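The in-place point can be illustrated with a tiny plain-Java sketch (illustrative only, not LAML's actual code): an allocating vector add creates a fresh temporary on every call, while the in-place version reuses the caller's buffer, so an iterative solver allocates nothing inside its loop.

```java
// Illustrative sketch only (plain Java, not LAML's actual API): why in-place
// updates reduce garbage-collection pressure in iterative algorithms.
public class InPlaceDemo {

    // Allocating version: creates a new array on every call.
    static double[] plus(double[] a, double[] b) {
        double[] r = new double[a.length];
        for (int i = 0; i < a.length; i++) r[i] = a[i] + b[i];
        return r;
    }

    // In-place version: writes the result into an existing buffer,
    // so an iterative solver can reuse one buffer across iterations.
    static void plusAssign(double[] a, double[] b) {
        for (int i = 0; i < a.length; i++) a[i] += b[i];
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[] g = {0.5, -0.5};
        // 1000 iterations of the allocating version would create 1000
        // temporaries; the in-place version allocates nothing in the loop.
        for (int t = 0; t < 1000; t++) plusAssign(x, g);
        System.out.println(x[0] + " " + x[1]);  // 501.0 -498.0
    }
}
```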
JML relies on a third-party linear algebra library, Apache Commons Math. Sparse matrices and vectors have been deprecated in Commons Math 3.0+ and will ultimately be eliminated, whereas LAML has its own built-in linear algebra library.
Like JML, LAML also provides many commonly used matrix functions with the same signatures as in MATLAB, so it can also be used to manually convert MATLAB code to Java code.
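To illustrate the MATLAB-mirroring idea with a self-contained sketch (plain Java, not JML's or LAML's actual implementation), here is a reshape that follows MATLAB's column-major semantics:

```java
// Minimal plain-Java sketch of a MATLAB-style reshape (illustrative only):
// elements are read and written in column-major order, as in MATLAB.
public class ReshapeDemo {

    static double[][] reshape(double[][] A, int m, int n) {
        int M = A.length, N = A[0].length;
        if (M * N != m * n) throw new IllegalArgumentException("size mismatch");
        double[][] B = new double[m][n];
        for (int k = 0; k < m * n; k++) {
            // MATLAB linear indexing is column-major on both sides.
            B[k % m][k / m] = A[k % M][k / M];
        }
        return B;
    }

    public static void main(String[] args) {
        double[][] A = {{1, 3, 5}, {2, 4, 6}};  // 2 x 3
        double[][] B = reshape(A, 3, 2);
        // Same as MATLAB's reshape([1 3 5; 2 4 6], 3, 2):
        // 1 4
        // 2 5
        // 3 6
        System.out.println(java.util.Arrays.deepToString(B));
    }
}
```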
In short, JML has been replaced by LAML.
The current version implements logistic regression, Maximum Entropy modeling (MaxEnt), AdaBoost, LASSO, KMeans, spectral clustering, Nonnegative Matrix Factorization (NMF), sparse NMF, Latent Semantic Indexing (LSI), Latent Dirichlet Allocation (LDA) (by Gibbs sampling, based on LdaGibbsSampler.java by Gregor Heinrich), joint l_{2,1}-norm minimization, Hidden Markov Models (HMM), Conditional Random Fields (CRF), Robust PCA, Matrix Completion (MC), etc., as examples of implementing machine learning methods with this general framework. The SVM package LIBLINEAR is also incorporated. I will try to add more important models, such as Markov Random Fields (MRF), to this package if I find the time. :)
Another advantage of JML is its complete independence from feature engineering, so any preprocessed data can be fed in. For example, in natural language processing, feature engineering is crucial for MaxEnt, HMM, and CRF to work well, and it is often embedded in model training. However, we believe it is better to separate feature engineering from parameter estimation. On the one hand, modularization is achieved, so people can focus on one module without needing to consider the others; on the other hand, implemented modules can be reused without compatibility concerns.
JML also provides implementations of several efficient, scalable, and widely used general-purpose optimization algorithms, which are essential for making machine learning methods applicable to large-scale data, though a specialized optimization strategy that exploits the characteristics of a particular problem is usually more effective and efficient (e.g., dual coordinate descent for bound-constrained quadratic programming in SVMs). Currently supported optimization algorithms are limited-memory BFGS, projected limited-memory BFGS (non-negativity-constrained or bound-constrained), nonlinear conjugate gradient, the primal-dual interior-point method, general quadratic programming, accelerated proximal gradient, and accelerated gradient descent. I would like to implement more practical, efficient optimization algorithms over time.
Several dimensionality reduction algorithms are implemented: PCA, kernel PCA, Multi-dimensional Scaling (MDS), Isomap, and Locally Linear Embedding (LLE).
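As a self-contained illustration of the simplest of these (plain Java with a hypothetical class name, not JML's PCA implementation), the first principal component is the dominant eigenvector of the data's covariance matrix, which power iteration recovers:

```java
// Conceptual sketch only (not JML's API): recover the first principal
// component of an n x d data matrix via power iteration on the covariance.
public class PcaSketch {

    // Returns the (unit-norm) first principal direction.
    static double[] firstComponent(double[][] X, int iters) {
        int n = X.length, d = X[0].length;
        // Center the data.
        double[] mean = new double[d];
        for (double[] x : X) for (int j = 0; j < d; j++) mean[j] += x[j] / n;
        // Unnormalized covariance matrix (scaling does not change eigenvectors).
        double[][] C = new double[d][d];
        for (double[] x : X)
            for (int i = 0; i < d; i++)
                for (int j = 0; j < d; j++)
                    C[i][j] += (x[i] - mean[i]) * (x[j] - mean[j]);
        // Power iteration: repeatedly apply C and renormalize.
        double[] v = new double[d];
        v[0] = 1;
        for (int t = 0; t < iters; t++) {
            double[] w = new double[d];
            for (int i = 0; i < d; i++)
                for (int j = 0; j < d; j++) w[i] += C[i][j] * v[j];
            double norm = 0;
            for (double wi : w) norm += wi * wi;
            norm = Math.sqrt(norm);
            for (int i = 0; i < d; i++) v[i] = w[i] / norm;
        }
        return v;
    }

    public static void main(String[] args) {
        // Points spread mainly along the direction (1, 1) / sqrt(2).
        double[][] X = {{-2, -2.1}, {-1, -0.9}, {0, 0.1}, {1, 1.1}, {2, 1.9}};
        double[] v = firstComponent(X, 100);
        System.out.printf("%.3f %.3f%n", v[0], v[1]);
    }
}
```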
Two online classification algorithms are incorporated: Perceptron and Winnow.
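For readers unfamiliar with these, here is a minimal sketch of the classic Perceptron update rule (the generic algorithm, not JML's Perceptron class): on each mistake, add y * x to the weight vector.

```java
// Minimal Perceptron sketch (generic algorithm, illustrative only).
public class PerceptronSketch {

    // Trains on rows of X with labels in {-1, +1}; last weight is the bias.
    static double[] train(double[][] X, int[] y, int epochs) {
        int d = X[0].length;
        double[] w = new double[d + 1];
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < X.length; i++) {
                double s = w[d];
                for (int j = 0; j < d; j++) s += w[j] * X[i][j];
                if (y[i] * s <= 0) {  // mistake: move w toward y * x
                    for (int j = 0; j < d; j++) w[j] += y[i] * X[i][j];
                    w[d] += y[i];
                }
            }
        }
        return w;
    }

    static int predict(double[] w, double[] x) {
        double s = w[w.length - 1];
        for (int j = 0; j < x.length; j++) s += w[j] * x[j];
        return s >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        // A linearly separable toy set.
        double[][] X = {{2, 2}, {1, 3}, {-1, -2}, {-2, -1}};
        int[] y = {1, 1, -1, -1};
        double[] w = train(X, y, 20);
        for (int i = 0; i < X.length; i++)
            System.out.println(predict(w, X[i]) + " (true: " + y[i] + ")");
    }
}
```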
I hope this library can help engineers and researchers speed up their development cycles.
GitHub:
https://github.com/MingjieQian/JML
Documentation:
For more details about the JML API, please refer to the online documentation.
Tutorial:
Brief overview: Tutorial-JML.pdf
Examples:
# Multi-class SVM (linear kernel)
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{-1.2, 0.4, 3.2} };
double[][] labels = { {1, 0, 0},
{0, 1, 0},
{0, 0, 1} };
double C = 1;
double eps = 0.01;
Classifier multiClassSVM = new MultiClassSVM(C, eps);
multiClassSVM.feedData(data);
multiClassSVM.feedLabels(labels);
multiClassSVM.train();
RealMatrix Y_pred = multiClassSVM.predictLabelMatrix(data);
display(Y_pred);
# Predicted label matrix
1 0 0
0 1 0
0 0 1
# Predicted label score matrix:
0.7950 -0.2050 -0.5899
-0.3301 0.6635 -0.3333
-0.6782 -0.1602 0.8383
# Projection matrix (with bias):
-0.1723 0.2483 -0.0760
0.2337 -0.1901 -0.0436
-0.0152 -0.1310 0.1462
-0.1525 0.0431 0.1094
-0.0207 0.0114 0.0093
# -------------------------------------------------------------------------- #
# Multi-class Logistic Regression
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{-1.2, 0.4, 3.2} };
double[][] labels = { {1, 0, 0},
{0, 1, 0},
{0, 0, 1} };
Options options = new Options();
options.epsilon = 1e-6;
// Multi-class logistic regression by using limited-memory BFGS method
Classifier logReg = new LogisticRegressionMCLBFGS(options);
logReg.feedData(data);
logReg.feedLabels(labels);
logReg.train();
RealMatrix Y_pred = logReg.predictLabelScoreMatrix(data);
display(Y_pred);
# Output
# Predicted probability matrix:
1.0000 0.0000 0.0000
0.0000 1.0000 0.0000
0.0000 0.0000 1.0000
# Projection matrix:
-2.0348 3.0363 -1.0015
2.9002 -2.1253 -0.7749
-0.4261 -1.5264 1.9524
-2.0445 0.4998 1.5447
# -------------------------------------------------------------------------- #
# AdaBoost with Logistic Regression
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{5.3, 2.2, -1.5},
{-1.2, 0.4, 3.2} };
int[] labels = {1, 1, -1, -1, -1};
RealMatrix X = new BlockRealMatrix(data);
X = X.transpose();
Options options = new Options();
options.epsilon = 1e-5;
Classifier logReg = new LogisticRegressionMCLBFGS(options);
logReg.feedData(X);
logReg.feedLabels(labels);
logReg.train();
RealMatrix Xt = X;
double accuracy = Classifier.getAccuracy(labels, logReg.predict(Xt));
fprintf("Accuracy for logistic regression: %.2f%%\n", 100 * accuracy);
int T = 10;
Classifier[] weakClassifiers = new Classifier[T];
for (int t = 0; t < T; t++) {
    options = new Options();
    options.epsilon = 1e-5;
    weakClassifiers[t] = new LogisticRegressionMCLBFGS(options);
}
Classifier adaBoost = new AdaBoost(weakClassifiers);
adaBoost.feedData(X);
adaBoost.feedLabels(labels);
adaBoost.train();
Xt = X.copy();
display(adaBoost.predictLabelScoreMatrix(Xt));
display(full(adaBoost.predictLabelMatrix(Xt)));
display(adaBoost.predict(Xt));
accuracy = Classifier.getAccuracy(labels, adaBoost.predict(Xt));
fprintf("Accuracy for AdaBoost with logistic regression: %.2f%%\n", 100 * accuracy);
// Save the model
String modelFilePath = "AdaBoostModel";
adaBoost.saveModel(modelFilePath);
// Load the model
Classifier adaBoost2 = new AdaBoost();
adaBoost2.loadModel(modelFilePath);
accuracy = Classifier.getAccuracy(labels, adaBoost2.predict(Xt));
fprintf("Accuracy: %.2f%%\n", 100 * accuracy);
# Output
Accuracy for logistic regression: 60.00%
Accuracy for AdaBoost with logistic regression: 100.00%
Model saved.
Loading model...
Model loaded.
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# Spectral Clustering
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{-1.2, 0.4, 3.2} };
SpectralClusteringOptions options = new SpectralClusteringOptions();
options.nClus = 2;
options.verbose = false;
options.maxIter = 100;
options.graphType = "nn";
options.graphParam = 2;
options.graphDistanceFunction = "cosine";
options.graphWeightType = "heat";
options.graphWeightParam = 1;
Clustering spectralClustering = new SpectralClustering(options);
spectralClustering.feedData(data);
spectralClustering.clustering();
display(spectralClustering.getIndicatorMatrix());
# Output
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 2.
KMeans complete.
Spectral clustering complete.
1 0
1 0
0 1
# -------------------------------------------------------------------------- #
# KMeans
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{-1.2, 0.4, 3.2} };
KMeansOptions options = new KMeansOptions();
options.nClus = 2;
options.verbose = true;
options.maxIter = 100;
KMeans KMeans = new KMeans(options);
KMeans.feedData(data);
KMeans.clustering(null); // Use null for random initialization
System.out.println("Indicator Matrix:");
Matlab.printMatrix(Matlab.full(KMeans.getIndicatorMatrix()));
# Output
Iter 1: sse = 3.604 (0.127 secs)
KMeans complete.
Indicator Matrix:
1 0
1 0
0 1
# -------------------------------------------------------------------------- #
# NMF
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{1.2, 0.4, 3.2} };
NMFOptions NMFOptions = new NMFOptions();
NMFOptions.nClus = 2;
NMFOptions.maxIter = 50;
NMFOptions.verbose = true;
NMFOptions.calc_OV = false;
NMFOptions.epsilon = 1e-5;
Clustering NMF = new NMF(NMFOptions);
NMF.feedData(data);
NMF.clustering(null); // If null, KMeans will be used for initialization
System.out.println("Basis Matrix:");
Matlab.printMatrix(Matlab.full(NMF.getCenters()));
System.out.println("Indicator Matrix:");
Matlab.printMatrix(Matlab.full(NMF.getIndicatorMatrix()));
# Output
Iter 1: sse = 3.327 (0.149 secs)
KMeans complete.
Iteration 10, delta G: 0.000103
Converge successfully!
Basis Matrix:
1.5322 4.3085
0.5360 4.4548
4.7041 0.2124
3.6581 0.9194
Indicator Matrix:
0 1.0137
0.0261 0.7339
0.8717 0.0001
# -------------------------------------------------------------------------- #
# Limited-memory BFGS
double fval = ...; // Current objective function value
double epsilon = ...; // Convergence tolerance
RealMatrix G = ...; // Gradient at the current matrix variable
RealMatrix W = ...; // Current matrix (or vector) variable to optimize
boolean[] flags = null;
while (true) {
    flags = LBFGS.run(G, fval, epsilon, W); // Update W
    if (flags[0]) { // flags[0] indicates whether L-BFGS has converged
        break;
    }
    fval = ...; // Compute the new objective function value at W
    if (flags[1]) { // flags[1] indicates whether the gradient at W is required
        G = ...; // Compute the new gradient at W
    }
}
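To make the calling convention concrete, here is a self-contained toy instance of the driver pattern above. A plain gradient-descent step stands in for LBFGS.run (so the sketch runs without JML); the caller-supplied pieces are the objective f(w) = ||w - a||^2 and its gradient 2 * (w - a).

```java
// Toy instance of the flags-based driver loop; gradient descent stands in
// for the real LBFGS.run (illustrative only, not JML's implementation).
public class LbfgsDriverSketch {

    static double[] w = {0.0, 0.0};          // variable being optimized
    static final double[] a = {3.0, -1.0};   // minimizer of f
    static final double STEP = 0.1;

    // Stand-in for LBFGS.run: takes one descent step and reports
    // flags[0] = converged, flags[1] = gradient needed for the next call.
    static boolean[] run(double[] G, double fval, double epsilon, double[] W) {
        double gnorm = Math.hypot(G[0], G[1]);
        if (gnorm < epsilon) return new boolean[] {true, false};
        for (int i = 0; i < W.length; i++) W[i] -= STEP * G[i];
        return new boolean[] {false, true};
    }

    static double f(double[] w) {
        double d0 = w[0] - a[0], d1 = w[1] - a[1];
        return d0 * d0 + d1 * d1;
    }

    static double[] grad(double[] w) {
        return new double[] {2 * (w[0] - a[0]), 2 * (w[1] - a[1])};
    }

    public static void main(String[] args) {
        double epsilon = 1e-6;
        double fval = f(w);
        double[] G = grad(w);
        while (true) {
            boolean[] flags = run(G, fval, epsilon, w); // update w
            if (flags[0]) break;         // converged
            fval = f(w);                 // new objective value at w
            if (flags[1]) G = grad(w);   // new gradient at w
        }
        System.out.printf("w = (%.4f, %.4f)%n", w[0], w[1]); // close to (3, -1)
    }
}
```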
# -------------------------------------------------------------------------- #
# LASSO
double[][] data = {{1, 2, 3, 2},
{4, 2, 3, 6},
{5, 1, 2, 1}};
double[][] depVars = {{3, 2},
{2, 3},
{1, 4}};
Options options = new Options();
options.maxIter = 600;
options.lambda = 0.05;
options.verbose = false;
options.epsilon = 1e-5;
Regression LASSO = new LASSO(options);
LASSO.feedData(data);
LASSO.feedDependentVariables(depVars);
LASSO.train();
fprintf("Projection matrix:\n");
display(LASSO.W);
RealMatrix Yt = LASSO.predict(data);
fprintf("Predicted dependent variables:\n");
display(Yt);
# Output
Projection matrix:
-0.2295 0.5994
0 0
1.1058 0.5858
-0.0631 -0.1893
Predicted dependent variables:
2.9618 1.9782
2.0209 3.0191
1.0009 3.9791
# -------------------------------------------------------------------------- #
# LDA
int[][] documents = { {1, 4, 3, 2, 3, 1, 4, 3, 2, 3, 1, 4, 3, 2, 3, 6},
{2, 2, 4, 2, 4, 2, 2, 2, 2, 4, 2, 2},
{1, 6, 5, 6, 0, 1, 6, 5, 6, 0, 1, 6, 5, 6, 0, 0},
{5, 6, 6, 2, 3, 3, 6, 5, 6, 2, 2, 6, 5, 6, 6, 6, 0},
{2, 2, 4, 4, 4, 4, 1, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 0},
{5, 4, 2, 3, 4, 5, 6, 6, 5, 4, 3, 2} };
LDAOptions LDAOptions = new LDAOptions();
LDAOptions.nTopic = 2;
LDAOptions.iterations = 5000;
LDAOptions.burnIn = 1500;
LDAOptions.thinInterval = 200;
LDAOptions.sampleLag = 10;
LDAOptions.alpha = 2;
LDAOptions.beta = 0.5;
LDA LDA = new LDA(LDAOptions);
LDA.readCorpus(documents);
LDA.train();
fprintf("Topic--term associations: \n");
display(LDA.topicMatrix);
fprintf("Document--topic associations: \n");
display(LDA.indicatorMatrix);
# Output
Topic--term associations:
0.1258 0.0176
0.1531 0.0846
0.0327 0.3830
0.0418 0.1835
0.0360 0.2514
0.2713 0.0505
0.3393 0.0294
Document--topic associations:
0.2559 0.7441
0.1427 0.8573
0.8573 0.1427
0.6804 0.3196
0.5491 0.4509
0.4420 0.5580
# -------------------------------------------------------------------------- #
# Joint l_{2,1}-norms minimization: Supervised Feature Selection
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{-1.2, 0.4, 3.2} };
double[][] labels = { {1, 0, 0},
{0, 1, 0},
{0, 0, 1} };
SupervisedFeatureSelection robustFS = new JointL21NormsMinimization(2.0);
robustFS.feedData(data);
robustFS.feedLabels(labels);
robustFS.run();
System.out.println("Projection matrix:");
display(robustFS.getW());
# Output
Projection matrix:
-0.0144 0.1194 0.0066
0.1988 -0.0777 -0.0133
-0.0193 -0.0287 0.2427
-0.0005 0.0004 0.0009
# -------------------------------------------------------------------------- #
# Hidden Markov Models
int numStates = 3;
int numObservations = 2;
double epsilon = 1e-6;
int maxIter = 1000;
double[] pi = new double[] {0.33, 0.33, 0.34};
double[][] A = new double[][] {
{0.5, 0.3, 0.2},
{0.3, 0.5, 0.2},
{0.2, 0.4, 0.4}
};
double[][] B = new double[][] {
{0.7, 0.3},
{0.5, 0.5},
{0.4, 0.6}
};
// Generate the data sequences for training
int D = 10000;
int T_min = 5;
int T_max = 10;
int[][][] data = HMM.generateDataSequences(D, T_min, T_max, pi, A, B);
int[][] Os = data[0];
int[][] Qs = data[1];
// Train HMM
HMM HMM = new HMM(numStates, numObservations, epsilon, maxIter);
HMM.feedData(Os);
HMM.feedLabels(Qs); // If not given, random initialization will be used
HMM.train();
HMM.saveModel("HMMModel.dat");
// Predict the single best state path
int ID = new Random().nextInt(D);
int[] O = Os[ID];
HMM HMMt = new HMM();
HMMt.loadModel("HMMModel.dat");
int[] Q = HMMt.predict(O);
fprintf("Observation sequence: \n");
HMMt.showObservationSequence(O);
fprintf("True state sequence: \n");
HMMt.showStateSequence(Qs[ID]);
fprintf("Predicted state sequence: \n");
HMMt.showStateSequence(Q);
double p = HMMt.evaluate(O);
System.out.format("P(O|Theta) = %f\n", p);
# Output
True Model Parameters:
Initial State Distribution:
0.3300
0.3300
0.3400
State Transition Probability Matrix:
0.5000 0.3000 0.2000
0.3000 0.5000 0.2000
0.2000 0.4000 0.4000
Observation Probability Matrix:
0.7000 0.3000
0.5000 0.5000
0.4000 0.6000
Trained Model Parameters:
Initial State Distribution:
0.3556
0.3511
0.2934
State Transition Probability Matrix:
0.5173 0.3030 0.1797
0.2788 0.4936 0.2276
0.2307 0.3742 0.3951
Observation Probability Matrix:
0.6906 0.3094
0.5324 0.4676
0.3524 0.6476
Observation sequence:
0 0 0 0 1 1 1 1 1
True state sequence:
0 2 0 0 2 0 2 2 2
Predicted state sequence:
0 0 0 0 2 2 2 2 2
P(O|Theta) = 0.001928
# -------------------------------------------------------------------------- #
# Maximum Entropy Modeling Using Limited-memory BFGS
double[][][] data = new double[][][] {
{{1, 0, 0}, {2, 1, -1}, {0, 1, 2}, {-1, 2, 1}},
{{0, 2, 0}, {1, 0, -1}, {0, 1, 1}, {-1, 3, 0.5}},
{{0, 0, 0.8}, {2, 1, -1}, {1, 3, 0}, {-0.5, -1, 2}},
{{0.5, 0, 0}, {1, 1, -1}, {0, 0.5, 1.5}, {-2, 1.5, 1}},
};
/*double [][] labels = new double[][] {
{1, 0, 0},
{0, 1, 0},
{0, 0, 1},
{1, 0, 0}
};*/
int[] labels = new int[] {1, 2, 3, 1};
MaxEnt maxEnt = new MaxEnt();
maxEnt.feedData(data);
maxEnt.feedLabels(labels);
maxEnt.train();
fprintf("MaxEnt parameters:\n");
display(maxEnt.W);
String modelFilePath = "MaxEnt-Model.dat";
maxEnt.saveModel(modelFilePath);
maxEnt = new MaxEnt();
maxEnt.loadModel(modelFilePath);
fprintf("Predicted probability matrix:\n");
display(maxEnt.predictLabelScoreMatrix(data));
fprintf("Predicted label matrix:\n");
display(full(maxEnt.predictLabelMatrix(data)));
fprintf("Predicted labels:\n");
display(maxEnt.predict(data));
# Output
Training time: 0.087 seconds
MaxEnt parameters:
12.1659
-1.8211
-4.4031
-1.8199
Model saved.
Loading model...
Model loaded.
Predicted probability matrix:
1.0000 0.0000 0.0000
0.0000 1.0000 0.0000
0.0000 0.0000 1.0000
1.0000 0.0000 0.0000
Predicted label matrix:
1 0 0
0 1 0
0 0 1
1 0 0
Predicted labels:
1
2
3
1
# -------------------------------------------------------------------------- #
# Conditional Random Field Using L-BFGS
// Number of data sequences
int D = 10;
// Minimal length for the randomly generated data sequences
int n_min = 5;
// Maximal length for the randomly generated data sequences
int n_max = 10;
// Number of feature functions
int d = 5;
// Number of states
int N = 3;
// Sparseness for the feature matrices
double sparseness = 0.8;
// Randomly generate labeled sequential data for CRF
Object[] dataSequences = CRF.generateDataSequences(D, n_min, n_max, d, N, sparseness);
RealMatrix[][][] Fs = (RealMatrix[][][]) dataSequences[0];
int[][] Ys = (int[][]) dataSequences[1];
// Train a CRF model for the randomly generated sequential data with labels
double epsilon = 1e-4;
CRF CRF = new CRF(epsilon);
CRF.feedData(Fs);
CRF.feedLabels(Ys);
CRF.train();
// Save the CRF model
String modelFilePath = "CRF-Model.dat";
CRF.saveModel(modelFilePath);
fprintf("CRF Parameters:\n");
display(CRF.W);
// Prediction
CRF = new CRF();
CRF.loadModel(modelFilePath);
int ID = new Random().nextInt(D);
int[] Yt = Ys[ID];
RealMatrix[][] Fst = Fs[ID];
fprintf("True label sequence:\n");
display(Yt);
fprintf("Predicted label sequence:\n");
display(CRF.predict(Fst));
# Output
Initial ofv: 117.452
Iter 1, ofv: 84.5760, norm(Grad): 4.71289
Iter 2, ofv: 11.9435, norm(Grad): 3.75200
Iter 3, ofv: 11.7764, norm(Grad): 0.619218
Objective function value doesn't decrease, iteration stopped!
Iter 4, ofv: 11.7764, norm(Grad): 0.431728
Model saved.
CRF Parameters:
1.1178
3.1087
-2.3664
-0.3754
-0.8732
Loading model...
Model loaded.
True label sequence:
2 1 1 0 1
Predicted label sequence:
P*(YPred|x) = 0.382742
2 1 1 0 1
# -------------------------------------------------------------------------- #
# General Quadratic Programming by Primal-dual Interior-point Methods
/*
* Number of unknown variables
*/
int n = 5;
/*
* Number of inequality constraints
*/
int m = 6;
/*
* Number of equality constraints
*/
int p = 3;
RealMatrix x = rand(n, n);
RealMatrix Q = x.multiply(x.transpose()).add(times(rand(1), eye(n)));
RealMatrix c = rand(n, 1);
double HasEquality = 1;
RealMatrix A = times(HasEquality, rand(p, n));
x = rand(n, 1);
RealMatrix b = A.multiply(x);
RealMatrix B = rand(m, n);
double rou = -2;
RealMatrix d = plus(B.multiply(x), times(rou, ones(m, 1)));
/*
* General quadratic programming:
*
* min 2 \ x' * Q * x + c' * x
* s.t. A * x = b
* B * x <= d
*/
GeneralQP.solve(Q, c, A, b, B, d);
# Output
Phase I:
Terminate successfully.
x_opt:
4280.0981 2366.3295 -5241.6878 -3431.7929 -1243.1621
s_opt:
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
lambda for the inequalities s_i >= 0:
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
B * x - d:
-1622.6174 -3726.7628 -326.3261 -2610.0973 -3852.1984 -3804.4964
lambda for the inequalities fi(x) <= s_i:
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
nu for the equalities A * x = b:
-0.0000 -0.0000 -0.0000
residual: 1.15753e-11
A * x - b:
-0.0000 -0.0000 0.0000
norm(A * x - b, "fro"): 0.000000
fval_opt: 4.78833e-11
The problem is feasible.
Computation time: 0.083000 seconds
halt execution temporarily in 1 seconds...
Phase II:
Terminate successfully.
residual: 4.97179e-12
Optimal objective function value: 201.252
Optimizer:
7.0395 21.8530 -12.9418 -5.8026 -14.0337
B * x - d:
-0.0000 -9.1728 -0.0000 -2.9475 -0.5927 -7.7247
lambda:
17.6340 0.0000 186.0197 0.0000 0.0000 0.0000
nu:
7.4883 -55.7357 -57.6134
norm(A * x - b, "fro"): 0.000000
Computation time: 0.048000 seconds
# -------------------------------------------------------------------------- #
# Matrix Recovery: Robust PCA
int m = 8;
int r = m / 4;
RealMatrix L = randn(m, r);
RealMatrix R = randn(m, r);
RealMatrix A_star = mtimes(L, R.transpose());
RealMatrix E_star = zeros(size(A_star));
int[] idx = randperm(m * m);
int nz = m * m / 20;
int[] nz_idx = new int[nz];
for (int i = 0; i < nz; i++) {
nz_idx[i] = idx[i] - 1;
}
RealMatrix E_vec = vec(E_star);
setSubMatrix(E_vec, nz_idx, new int[] {0}, (minus(rand(nz, 1), 0.5).scalarMultiply(100)));
E_star = reshape(E_vec, size(E_star));
// Input
RealMatrix D = A_star.add(E_star);
double lambda = 1 * Math.pow(m, -0.5);
// Run Robust PCA
RobustPCA robustPCA = new RobustPCA(lambda);
robustPCA.feedData(D);
robustPCA.run();
RealMatrix A_hat = robustPCA.GetLowRankEstimation();
RealMatrix E_hat = robustPCA.GetErrorMatrix();
fprintf("A*:\n");
disp(A_star, 4);
fprintf("A^:\n");
disp(A_hat, 4);
fprintf("E*:\n");
disp(E_star, 4);
fprintf("E^:\n");
disp(E_hat, 4);
fprintf("rank(A*): %d\n", rank(A_star));
fprintf("rank(A^): %d\n", rank(A_hat));
fprintf("||A* - A^||_F: %.4f\n", norm(A_star.subtract(A_hat), "fro"));
fprintf("||E* - E^||_F: %.4f\n", norm(E_star.subtract(E_hat), "fro"));
# Output
A*:
-0.3167 -0.9318 -0.0798 -0.7203 -0.8664 -0.6440 1.0025 -0.0680
-0.6284 0.9694 -0.4561 0.3986 0.5307 0.9091 0.0128 0.1906
-1.4436 2.5714 -1.0841 1.1390 1.4941 2.3557 -0.2121 0.4776
1.6382 1.8747 0.7239 1.8157 2.1306 1.0458 -3.1202 0.0115
2.0220 -3.8960 1.5496 -1.7862 -2.3277 -3.5280 0.5034 -0.7029
0.1649 -0.5466 0.1505 -0.2941 -0.3726 -0.4653 0.2016 -0.0837
-0.3799 -0.9994 -0.1082 -0.7872 -0.9449 -0.6807 1.1196 -0.0679
0.4860 0.1538 0.2572 0.2777 0.3109 -0.0019 -0.6435 -0.0430
A^:
-0.3167 -0.9318 -0.0798 -0.7203 -0.8664 -0.6440 1.0025 -0.0680
-0.6284 0.9694 -0.4561 0.3986 0.5307 0.9091 0.0128 0.1906
-1.4436 2.5714 -1.0841 1.1390 1.4941 2.3557 -0.2121 0.4776
0.9864 1.6651 0.3792 1.4410 1.7109 1.0458 -2.2548 0.0688
2.0220 -3.8960 1.5496 -1.7862 -2.3277 -3.5280 0.5034 -0.7029
0.1649 -0.5466 0.1505 -0.2941 -0.3726 -0.4653 0.2016 -0.0837
-0.3799 -0.9994 -0.1082 -0.7872 -0.9449 -0.6807 1.1196 -0.0679
0.4860 0.1538 0.2572 0.2777 0.3109 -0.0019 -0.6435 -0.0430
E*:
0 0 0 0 0 0 0 0
0 0 0 0 0 45.2970 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
26.9119 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 -9.2397 0 0
E^:
0 0 0 0 0 0 0 0
0 0 0 0 0 45.2970 0 0
0 0 0 0 0 0 0 0
0.6518 0.2097 0.3447 0.3747 0.4196 0 -0.8654 -0.0574
0 0 0 0 0 0 0 0
26.9119 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 -9.2397 0 0
rank(A*): 2
rank(A^): 2
||A* - A^||_F: 1.2870
||E* - E^||_F: 1.2870
# -------------------------------------------------------------------------- #
# Matrix Completion
int m = 6;
int r = 1;
int p = (int) Math.round(m * m * 0.3);
RealMatrix L = randn(m, r);
RealMatrix R = randn(m, r);
RealMatrix A_star = mtimes(L, R.transpose());
int[] indices = randperm(m * m);
minusAssign(indices, 1);
indices = linearIndexing(indices, colon(0, p - 1));
RealMatrix Omega = zeros(size(A_star));
linearIndexingAssignment(Omega, indices, 1);
RealMatrix D = zeros(size(A_star));
linearIndexingAssignment(D, indices, linearIndexing(A_star, indices));
RealMatrix E_star = D.subtract(A_star);
logicalIndexingAssignment(E_star, Omega, 0);
// Run matrix completion
MatrixCompletion matrixCompletion = new MatrixCompletion();
matrixCompletion.feedData(D);
matrixCompletion.feedIndices(Omega);
matrixCompletion.run();
// Output
RealMatrix A_hat = matrixCompletion.GetLowRankEstimation();
fprintf("A*:\n");
disp(A_star, 4);
fprintf("A^:\n");
disp(A_hat, 4);
fprintf("D:\n");
disp(D, 4);
fprintf("rank(A*): %d\n", rank(A_star));
fprintf("rank(A^): %d\n", rank(A_hat));
fprintf("||A* - A^||_F: %.4f\n", norm(A_star.subtract(A_hat), "fro"));
# Output
A*:
0.3070 -0.3445 -0.2504 0.0054 0.4735 -0.0825
0.0081 -0.0091 -0.0066 0.0001 0.0125 -0.0022
1.4322 -1.6069 -1.1681 0.0252 2.2090 -0.3848
-0.6194 0.6950 0.5052 -0.0109 -0.9554 0.1664
-0.3616 0.4057 0.2949 -0.0064 -0.5577 0.0971
-0.3382 0.3795 0.2758 -0.0059 -0.5216 0.0909
A^:
0.2207 0.0000 -0.2504 0.0039 0.4735 0
0.0081 0.0000 -0.0066 0.0001 0.0125 0
1.0296 0.0000 -1.1681 0.0181 2.2090 0
-0.6194 -0.0000 0.5052 -0.0109 -0.9554 0
0 0 0 0 0 0
-0.2431 -0.0000 0.2758 -0.0043 -0.5216 0
D:
0 0 0 0 0.4735 0
0 0 -0.0066 0.0001 0.0125 0
0 0 -1.1681 0 2.2090 0
-0.6194 0 0.5052 -0.0109 0 0
0 0 0 0 0 0
0 0 0.2758 0 -0.5216 0
rank(A*): 1
rank(A^): 2
||A* - A^||_F: 2.0977
# -------------------------------------------------------------------------- #
Features:
A general framework is provided for users to implement machine learning tools from MATLAB code.
jml.clustering package implements clustering related models.
jml.classification package implements classification related methods.
jml.regression package implements regression models.
jml.topics includes topic modeling and topic mining methods.
jml.data provides reading and writing functions for dense or sparse matrices in formats that can be easily read by MATLAB.
jml.kernel computes kernel matrices between two matrices; currently supported kernel types are linear, poly, rbf, and cosine.
jml.manifold implements manifold learning related functions, e.g., computation of adjacency and Laplacian matrices. This package is very useful for semi-supervised learning.
jml.matlab implements some frequently used MATLAB matrix functions with the same input signatures, such as sort, sum, max, min, kron, vec, repmat, reshape, and colon, so MATLAB code can be more easily converted to Java code.
jml.optimization provides implementations of several of the most important general-purpose optimization algorithms.
jml.feature.selection provides feature selection algorithms (supervised, unsupervised, or semi-supervised).
jml.sequence implements sequential learning algorithms (e.g., HMM or CRF).
jml.random implements random distributions; currently the multivariate Gaussian distribution is supported.
jml.subspace includes several dimensionality reduction algorithms (PCA, kernel PCA, MDS, Isomap, and LLE).
jml.online package implements online learning methods (Perceptron and Winnow).
jml.recovery implements matrix recovery and matrix completion methods.
Feature engineering and model training are completely separated, which increases the applicability and flexibility of the learning models and methods in the library. For feature generation, we suggest using the TextProcessor package.
Well documented source code.
Note that a key advantage of this library is that it is very convenient to translate a MATLAB implementation into a Java implementation using the jml.matlab functions.
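As an illustration of what a kernel computation does (a plain-Java sketch under my own naming, not jml.kernel's actual code), the rbf kernel matrix between the rows of X1 and the rows of X2 is K[i][j] = exp(-||x1_i - x2_j||^2 / (2 * sigma^2)):

```java
// Illustrative rbf kernel matrix computation (plain Java, not jml.kernel).
public class KernelSketch {

    static double[][] rbf(double[][] X1, double[][] X2, double sigma) {
        double[][] K = new double[X1.length][X2.length];
        for (int i = 0; i < X1.length; i++) {
            for (int j = 0; j < X2.length; j++) {
                double sq = 0;  // squared Euclidean distance
                for (int k = 0; k < X1[i].length; k++) {
                    double d = X1[i][k] - X2[j][k];
                    sq += d * d;
                }
                K[i][j] = Math.exp(-sq / (2 * sigma * sigma));
            }
        }
        return K;
    }

    public static void main(String[] args) {
        double[][] X = {{1, 0}, {0, 1}};
        double[][] K = rbf(X, X, 1.0);
        // Diagonal entries are exp(0) = 1; off-diagonal exp(-2/2) = exp(-1).
        System.out.println(K[0][0] + " " + K[0][1]);
    }
}
```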
Dependencies:
JML depends on the Apache Commons Math library (commons-math-2.2 or later) and LIBLINEAR.
Note:
I chose Commons Math because it supports both dense and sparse matrices and, via the BlockRealMatrix class, offers good numerical performance in terms of both speed and memory. For moderately sized data, JMLBLAS is faster than JML because it exploits the jblas library for basic matrix operations, but JMLBLAS doesn't support sparse matrices.
-----------------------------------
Author: Mingjie Qian
Version: 2.8
Date: Nov. 22nd, 2013