Mahout was specially built as a scalable machine learning library that implements many different approaches to machine learning.
The project currently contains implementations of algorithms for classification, clustering, genetic programming and collaborative filtering, all enabled to scale by leveraging the power of Hadoop's Map-Reduce implementation.
Here are some key features of "Mahout":
· Collaborative Filtering
· K-Means, Fuzzy K-Means clustering
· Mean Shift clustering
· Dirichlet process clustering
· Latent Dirichlet Allocation
· Singular value decomposition
· Parallel Frequent Pattern mining
· Complementary Naive Bayes classifier
· Random forest decision tree based classifier
· High performance java collections (previously colt collections)
· A vibrant community
· and many more cool stuff to come by this summer thanks to Google summer of code
Requirements:
· Java
What's New in This Release: [ read full changelog ]
· Improved Lanczos solver: graceful restarts, better scalability
· LDA improvements: document-topic distribution output, graceful restarts
· Stochastic Singular Value Decomposition implementation
· Incremental SVD implementation
· Alternating Least Squares with Weighted Regularization collaborative filtering implementation, both distributed and non-distributed
· SVDRecommender enhancements
· Initial work at merging clustering and classification infrastructure
· Better control over candidate item selection in item-based recommenders
· Significant removal of deprecated or dead code
· Many bug fixes, refactorings and other small improvements