Apache Spark and Scala

Course Overview

This Spark and Scala training module equips candidates with the skills to build Spark applications in Scala. It also compares Spark with Hadoop and covers techniques for improving application performance and enabling high-speed processing.

Using advanced cloud labs, candidates gain hands-on experience by working through a range of use cases.

At the end of the training, participants will be able to:

  1. Describe Scala and its implementation
  2. Explain control structures, loops, and collections
  3. Apply the concepts of traits and OOP in Scala
  4. Explain Functional programming in Scala
  5. Interpret Big Data challenges

Prerequisites

  1. Prior programming experience in Java or another language is required
  2. Basic familiarity with Linux or Unix is preferred
  3. An intermediate-level understanding of Hadoop is helpful

Duration

3 days

Course Outline

  1. Spark Overview
  2. MapReduce vs. Spark
  3. Advantages of Spark over MapReduce
  4. Spark Components and Full Stack
  5. Working with Spark
  6. Demo: Spark Installation
  7. Spark Comparison with Hadoop
  1. Introduction to RDDs
  2. Working on Spark Project
  3. Demo: Building a Scala project with the SBT tool
  4. Demo: Run a Scala application using a JAR file
  5. Demo: Scala application to read Hadoop data
  6. Working with RDDs
  7. Demo: Scala application that performs a GroupBy operation
  1. Spark SQL Overview
  2. Working with Spark Session
  3. Demo: Word count using Dataset API
  4. Working with DataFrames
  5. Demo: Spark SQL using DataFrame operations
  6. Interoperability using different Approaches
  7. Demo: Spark SQL using reflection-based approach
  8. Demo: Run Spark SQL programmatically
  9. Working with Datasets
  10. Demo: Ways of creating Datasets
  11. Demo: Dataset operations and joining Datasets
  12. Operating on various Data Sources
  13. Demo: Infer JSON dataset schema and load as a Dataset
  14. Demo: Run Hive queries using Spark SQL
  15. Catalog API
  1. Introduction to Spark Streaming
  2. Demo: Execute word count operation in streaming
  3. Introduction to DStreams
  4. Spark Streaming Sources
  5. Transformation and Operations on DStreams
  6. Demo: Perform DataFrame and SQL operations
  7. Demo: Perform join operations
  8. Performance Tuning
  9. Demo: Capture and process netcat data
  10. Demo: Capture and process Flume data
  11. Demo: Capture Twitter data
  1. Introduction to Spark Structured Streaming
  2. Demo: Batch vs. Streaming
  3. Structured Streaming Architecture, Model and Components
  4. Demo: Word count steps in Structured Streaming
  5. Structured Streaming APIs
  6. Demo: Operations on DataFrames/Datasets
  7. Demo: Data parsing with schema inference
  8. Demo: Column construction in Structured Streaming
  9. Demo: Using “groupBy” and “aggregation”
  10. Demo: Capturing and processing of real data
  1. Machine Learning Applications and its Types
  2. Machine Learning using Spark MLlib and Spark ML
  3. ML pipeline
  4. Spark MLlib Supported Types and Algorithms
  5. Demo: Perform clustering using k-means
  6. Demo: Perform classification using Linear Regression
  7. Demo: Run linear regression
  8. Demo: Perform recommendation using collaborative filtering
  9. Demo: Run recommendation system
  1. Graph and Graph Parallel System
  2. GraphX and Property Graph
  3. Demo: Create a Graph using GraphX
  4. Graph Operator
  5. Demo: Perform graph operations using GraphX
  6. Demo: Perform subgraph operations
  7. Graph Analytics
  8. Introduction to GraphFrames
  9. Demo: Implement presidential election results using GraphFrames
  10. Demo: Create GraphFrame
  11. Demo: Perform operations on GraphFrames
  12. Demo: Working with GraphFrames
  13. Demo: Select subgraph based on motif finding
  14. GraphFrame algorithms
