Apache Mahout
Apache Mahout is an open-source project for building scalable machine learning algorithms. Companies such as Adobe, Facebook, LinkedIn, Foursquare, Twitter, and Yahoo use Mahout internally. Mahout practitioners remain in short supply across the IT market worldwide.
Duration: 30 hours
Course Content:
Introduction to Machine Learning and Mahout
- Machine Learning Fundamentals, Apache Mahout Basics, History of Mahout, Supervised and Unsupervised Learning techniques, Mahout and Hadoop, Introduction to Clustering and Classification
Apache Mahout and Hadoop
- Mahout on Apache Hadoop, Setting up Mahout and Myrrix
Recommendation Engine in Mahout
- Recommendations using Apache Mahout, Introduction to Recommendation Systems, Content-Based Recommendation, Mahout Optimizations
Implementing a Recommender and Recommendation Platform
- User-based Recommendation, User Neighbourhood, Item-based Recommendation, Implementing a Recommender using MapReduce Platforms, Similarity Measures, Manhattan Distance, Euclidean Distance, Cosine Similarity, Pearson’s Correlation Similarity, Log-likelihood Similarity, Tanimoto Coefficient, Evaluating Recommendation Engines (Online and Offline), Recommenders in Production
Clustering
- Clustering, Common Clustering Algorithms in Apache Mahout, K-means, Canopy Clustering, Fuzzy K-means, Mean Shift, etc., Representing Data, Feature Selection, Vectorization in Apache Mahout, Representing Vectors, Clustering Documents through an Example, TF-IDF, Implementing Clustering in Hadoop
Classification
- Examples, Basics, Predictor Variables and Target Variables, Common Algorithms, SGD, SVM, Naive Bayes, Random Forests, Training and Evaluating a Classifier, Developing a Classifier
Apache Mahout and Amazon EMR
- Mahout on Amazon EMR, Mahout vs. R, Introduction to tools like Weka, Octave, Matlab, and SAS
Project Included in the Mahout Training
- A complete recommendation engine built on application logs and transactions