Machine Learning with Data Science

Course Overview

The Data Science and Machine Learning course helps you master data science and analytics using a range of machine learning techniques, gain a deeper understanding of data manipulation in R, and get an introduction to the Hadoop architecture.

At the end of the training, participants will be able to:

  1. Manipulate and visualise data using machine learning techniques
  2. Write and optimize Java code using the Hadoop framework

Prerequisites

  1. A background in Java is required
  2. This machine learning and data science course is appropriate for developers who wish to write, maintain and/or optimize Java code using the Hadoop framework
  3. Hands-on experience writing Java programs in the Eclipse editor would be a plus

Duration

3 days

Course Outline

  1. Introduction
  2. Understanding Big Data
  3. Understand how different companies use big data for their business needs
  4. Big Data Challenges
  5. Introduction to Data Science
  6. Types of Data Scientists
  7. Data Science Components
  8. Data Science Use Cases
  9. Introduction to R and Hadoop
  10. R and Hadoop Integration
  11. Machine Learning with Mahout

  1. HDFS: Hadoop Distributed File System
  2. Assumptions and Goals
  3. CAP principle
  4. Anatomy of Hadoop Cluster
  5. Anatomy of a File Write
  6. Anatomy of a File Read
  7. MapReduce Framework Architecture
  8. Hadoop Processes
  9. Understanding various configuration properties of Hadoop

  1. Introduction to R
  2. Describe why R is used
  3. Implement R programming concepts
  4. Learn Data Import techniques
  5. Analyze the processing of the data

  1. Observation and Experiments
  2. Sampling Methods
  3. Quantitative Variables
  4. Skewness, Modality and Measures of Center
  5. Variance, Standard Deviation, Interquartile Range
  6. Probability Rules
  7. Disjoint and Non-Disjoint Events, Independence
  8. Conditional Probability
  9. Probability Distributions

  1. Understand Machine Learning
  2. Use Cases Walkthrough
  3. Machine Learning Techniques
  4. Describe Clustering
  5. Analyze Clustering Scenarios using Clustering Algorithms
  6. Learn TF-IDF and cosine similarity

  1. Understand Supervised Learning Technique
  2. Classification
  3. Recommendation
  4. Learn Decision Tree Classifier
  5. Understand how various Decision Tree algorithms work
  6. Apply these techniques to smaller datasets using R for better understanding

  1. Understand Unsupervised Learning Technique
  2. Understand the implementation of Random Forest Classifier
  3. Understand the implementation of Naive Bayes Classifier
  4. Apply both techniques on smaller datasets using R
  5. Understand Association Rule Mining

  1. Understand the need for R integration with Hadoop
  2. Learn the ways to integrate R and Hadoop
  3. Understand the usage of RHadoop package
  4. Perform R integration with Hadoop and run MapReduce examples

  1. Understand Mahout
  2. Gain insight on implementing Machine Learning with Mahout
  3. Understand Learning, Classification and Clustering techniques with Mahout
  4. Implement Recommendation technique and Frequent Pattern Mining in Mahout

  1. Understand Mahout Algorithms and Parallel processing
  2. Learn Advanced techniques in R
  3. Implement Parallel Random Forest
  4. Understand Data Visualization
