Apache Storm Training

Live Online & Classroom Enterprise Training Course

Become an expert in Apache Storm. Unlock the full potential of Storm and gain expertise in real-time data processing through our comprehensive course. Take your Storm expertise to new heights by exploring advanced topics that amplify your real-time data processing capabilities.

Introduction to Apache Storm: Features and Benefits

Course Overview:

Apache Storm is a distributed real-time processing system that is designed to process high volumes of data quickly. It can be used to process streams of data in real time, making it suitable for use cases such as real-time analytics, online machine learning, and more. Apache Storm Training is a course that teaches participants how to use Apache Storm to build and deploy distributed real-time processing systems. The training covers topics such as the architecture of Apache Storm, how to write and deploy Storm topologies, and how to use Storm with other big data technologies.
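
To make the overview concrete, here is a minimal sketch of the kind of topology participants build in the course: a spout that emits sentences, a bolt that splits them into words, and a TopologyBuilder wiring the two together on an in-process cluster. The class and component names are illustrative only, and the code assumes Storm 2.x package names (org.apache.storm); it is a sketch, not course material.

```java
import java.util.Map;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class HelloStormTopology {

    // Spout: the source of the stream; emits one sentence per second.
    public static class SentenceSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;

        @Override
        public void open(Map<String, Object> conf, TopologyContext context,
                         SpoutOutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void nextTuple() {
            Utils.sleep(1000); // throttle emission for the demo
            collector.emit(new Values("storm processes streams in real time"));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("sentence"));
        }
    }

    // Bolt: splits each sentence into words and emits one tuple per word.
    public static class SplitBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            for (String word : input.getStringByField("sentence").split(" ")) {
                collector.emit(new Values(word));
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }
    }

    public static void main(String[] args) throws Exception {
        // Wire the components: spout -> bolt, with a shuffle grouping.
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sentences", new SentenceSpout());
        builder.setBolt("splitter", new SplitBolt(), 2).shuffleGrouping("sentences");

        // Run in-process for local development; a real deployment would use
        // StormSubmitter.submitTopology(...) against a remote cluster.
        try (LocalCluster cluster = new LocalCluster()) {
            cluster.submitTopology("hello-storm", new Config(), builder.createTopology());
            Utils.sleep(10_000); // let the topology run briefly before shutdown
        }
    }
}
```
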

At the end of the training, participants will be able to:

  1. Recognize differences between batch and real-time data processing
  2. Define Storm elements including tuples, streams, spouts, topologies, worker processes, executors, and stream groupings
  3. Recognize/interpret Java code for a spout, bolt, or topology
  4. Identify how to develop and submit a topology to a local or remote distributed cluster
  5. Recognize and explain the differences between reliable and unreliable Storm operation
  6. Manage and monitor Storm using the command-line client or browser-based Storm User Interface (UI)
  7. Define Trident elements including tuples, streams, batches, partitions, topologies, Trident spouts, and operations
  8. Recognize the differences between the different types of Trident state
  9. Recognize the differences in fault tolerance between different types of Trident spouts
  10. Define Kafka topics, producers, consumers, and brokers
  11. Publish Kafka messages to Storm or Trident topologies
  12. Work on real-world projects using Storm

Prerequisites

  1. Prior programming experience and familiarity with the basic concepts of Core Java
  2. Prior knowledge of object-oriented programming concepts
  3. A basic understanding of Hadoop

Duration

2 days

Course Outline

  1. Bayes' Law
  2. Hadoop Distributed Computing
  3. Legacy Architecture of Real-Time Systems
  4. Differences between Storm and Hadoop
  5. Fundamental Concepts of Storm
  6. Storm Development Environment
  • Real-Life Storm Project
  1. Apache Storm Installation
  2. Storm Architecture
  3. Logical Dynamics and Components in Storm
  4. Topology in Storm
  5. Storm Execution Components
  6. Stream Grouping
  7. Tuple
  8. Spout
  9. Reliable versus Unreliable Messages
  10. Getting Data: Direct Connection, Enqueued Messages, and DRPC
  11. Bolt Lifecycle
  12. Bolt Structure
  13. Bolt Example: The Normalization Bolt
  14. Reliable versus Unreliable Bolts
  15. Multiple Streams
  16. Multiple Anchoring
  17. Using IBasicBolt to Ack Automatically (see the sketch after this list)
  18. Hands-On:
  • Creating a Storm project in Eclipse
  • Running Storm bolts and spouts
  • Running a Twitter example using Storm
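
The reliability topics above (reliable versus unreliable bolts, anchoring, and IBasicBolt) can be illustrated with a small sketch. Both bolts below are hypothetical examples, assuming Storm 2.x APIs: the first anchors and acks tuples explicitly through the low-level OutputCollector, while the second relies on BaseBasicBolt (the IBasicBolt family) to anchor and ack automatically.

```java
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// Reliable bolt written against the low-level API: the output tuple is
// anchored to the input tuple, and the input must be acked (or failed)
// explicitly so Storm can track it through the topology.
class ReliableUpperCaseBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map<String, Object> topoConf, TopologyContext context,
                        OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        try {
            String word = input.getStringByField("word");
            collector.emit(input, new Values(word.toUpperCase())); // anchored emit
            collector.ack(input);                                  // explicit ack
        } catch (Exception e) {
            collector.fail(input);                                 // let the spout replay
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}

// The same bolt using the IBasicBolt family: BaseBasicBolt anchors emitted
// tuples and acks the input automatically once execute() returns.
class BasicUpperCaseBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        collector.emit(new Values(input.getStringByField("word").toUpperCase()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
```
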
  1. Grouping and its different types
  2. Reliable and unreliable messaging
  3. How to get data: direct connection and enqueued messages
  4. Life cycle of a bolt
  1. Stream Grouping
  2. Fields Grouping
  3. All Grouping
  4. Custom Grouping
  5. Direct Grouping
  6. Global Grouping
  7. None Grouping
  8. Hands-On:
  • Using different grouping techniques in Storm topologies (see the sketch after this list)
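
As a rough illustration of the grouping types listed above, the sketch below wires the same placeholder bolt to a spout with four different groupings. It assumes Storm 2.x, uses TestWordSpout from Storm's testing utilities, and defines a hypothetical PrinterBolt inline; the component names are illustrative only.

```java
import org.apache.storm.generated.StormTopology;
import org.apache.storm.testing.TestWordSpout;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;

public class GroupingExamples {

    // Trivial terminal bolt: prints whatever it receives and emits nothing.
    public static class PrinterBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            System.out.println(Thread.currentThread().getName() + " -> " + input.getValues());
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // terminal bolt: no output stream to declare
        }
    }

    public static StormTopology build() {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("words", new TestWordSpout());

        // Shuffle grouping: tuples are distributed randomly but roughly evenly.
        builder.setBolt("shuffled", new PrinterBolt(), 4).shuffleGrouping("words");

        // Fields grouping: tuples with the same "word" value always reach the same task.
        builder.setBolt("by-word", new PrinterBolt(), 4)
               .fieldsGrouping("words", new Fields("word"));

        // All grouping: every task of this bolt receives a copy of every tuple.
        builder.setBolt("broadcast", new PrinterBolt(), 2).allGrouping("words");

        // Global grouping: the whole stream goes to a single task (lowest task id).
        builder.setBolt("single", new PrinterBolt()).globalGrouping("words");

        return builder.createTopology();
    }
}
```
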
  1. What is Trident?
  2. Trident Spouts
  3. Types of Trident Spouts
  4. Trident Spout components
  5. Trident spout Interface
  6. Trident Filters, Functions & Aggregators
  7. Hands-On:
  • Implementing Trident spouts and bolts (see the sketch after this list)
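
A compact way to see Trident spouts, functions, and aggregators working together is the word-count pattern sketched below. It is illustrative only, assuming Storm 2.x: FixedBatchSpout and MemoryMapState come from Trident's testing utilities, and Split is a hypothetical function defined inline.

```java
import org.apache.storm.generated.StormTopology;
import org.apache.storm.trident.TridentTopology;
import org.apache.storm.trident.operation.BaseFunction;
import org.apache.storm.trident.operation.TridentCollector;
import org.apache.storm.trident.operation.builtin.Count;
import org.apache.storm.trident.testing.FixedBatchSpout;
import org.apache.storm.trident.testing.MemoryMapState;
import org.apache.storm.trident.tuple.TridentTuple;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class TridentWordCount {

    // Trident "function": one input tuple in, zero or more tuples out.
    public static class Split extends BaseFunction {
        @Override
        public void execute(TridentTuple tuple, TridentCollector collector) {
            for (String word : tuple.getString(0).split(" ")) {
                collector.emit(new Values(word));
            }
        }
    }

    public static StormTopology build() {
        // FixedBatchSpout is a testing spout that replays a small set of batches.
        FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 3,
                new Values("storm processes streams"),
                new Values("trident adds state and exactly once semantics"));
        spout.setCycle(true);

        TridentTopology topology = new TridentTopology();
        topology.newStream("sentences", spout)
                .each(new Fields("sentence"), new Split(), new Fields("word"))
                .groupBy(new Fields("word"))
                // persistentAggregate keeps running counts in Trident-managed state
                .persistentAggregate(new MemoryMapState.Factory(), new Count(),
                                     new Fields("count"));
        return topology.build();
    }
}
```
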
  1. Transactional Topologies
  2. Partitioned Transactional Spouts
  3. Opaque Transactional Topologies
  4. Hands-On:
  • Implementing a transactional system using transactional topologies
  1. Basic Kafka Concepts (see the producer sketch after this list)
  2. Kafka vs Other Messaging Systems
  3. Intra-Cluster Replication
  4. An Inside Look at Kafka’s Components
  5. Log Administration, Retention, and Compaction
  6. Hardware and Runtime Configurations
  7. Monitoring and Alerting
  8. Cluster Administration
  9. Securing Kafka
  10. Using Kafka Connect to Move Data
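
To tie the Kafka module back to Storm, the sketch below publishes messages with the plain Kafka producer API and wires a KafkaSpout into a topology so those messages flow into downstream bolts. It is illustrative only: the broker address, topic name, and group id are placeholders, and the KafkaSpout wiring assumes the storm-kafka-client module (Storm 1.2 or later).

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.storm.kafka.spout.KafkaSpout;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.topology.TopologyBuilder;

public class KafkaToStormExample {

    // Publish a message to a Kafka topic using the plain Kafka producer API.
    public static void publish() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("sentences", "storm reads this in real time"));
        }
    }

    // Wire a KafkaSpout (storm-kafka-client) into a topology so the messages
    // published above become the input stream of a Storm topology.
    public static TopologyBuilder buildTopology() {
        KafkaSpoutConfig<String, String> spoutConfig =
                KafkaSpoutConfig.builder("localhost:9092", "sentences")
                                .setProp("group.id", "storm-course-demo")
                                .build();

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout<>(spoutConfig));
        // builder.setBolt(...) would attach the processing bolts covered earlier.
        return builder;
    }
}
```
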
