Apache Hive

CloudLabs

Projects

Assignment

24x7 Support

Lifetime Access

Course Overview

Master the core concepts of the Hadoop Distributed File System (HDFS) and learn Apache Pig and advanced Apache Hive programming with our certified experts. Learn how to use HCatalog, join datasets in Apache Hive and run HDFS commands. Gain practical experience importing RDBMS data into HDFS and exporting it back, analyzing clickstream data and analyzing stock market data using quantiles. With our CloudLabs, get hands-on experience running a YARN application, programming in Apache Hive, analyzing big data with Apache Hive, joining datasets with Apache Pig and starting an HDP cluster.

At the end of the training, participants will be able to:

Explain Hadoop and the Hadoop Distributed File System (HDFS)
Interpret Common HDFS Commands
Export Data to a Table
Distinguish between Relational Databases and Hadoop
Explain the Purpose of NameNodes, DataNodes, MapReduce and the Map and Reduce Phases
Differentiate Pig Latin Relation Names and Field Names
Explain Programming Concepts Using Pig and Hive
Perform Inner, Outer and Replicated Join
Demonstrate the Use of HCatLoader and HCatStorer with Apache Pig
Explain Lifecycle of YARN Applications
List Common Use Cases of Spark
Load Data and Perform a Word Count
Perform SQL Queries
Perform DataFrame Operations
Submit an Apache Oozie Workflow

Pre-requisite

Participants should be familiar with programming principles and have experience in software development.
SQL knowledge is also helpful.
No prior Hadoop knowledge is required.

Duration

2 days

Course Outline

Understanding Hadoop and HDFS

List the Three “V”s of Big Data
List the Six Key Hadoop Data Types
Describe Hadoop, YARN and Use Cases for Hadoop
Describe Hadoop Ecosystem Tools and Frameworks
Describe the Differences Between Relational Databases and Hadoop
Describe What is New in Hadoop 2.x
Describe the Hadoop Distributed File System (HDFS)
Describe the Differences Between HDFS and an RDBMS
Describe the Purpose of NameNodes and DataNodes
List Common HDFS Commands
Describe HDFS File Permissions
List Options for Data Input
Describe WebHDFS
Describe the Purpose of Sqoop and Flume
Describe How to Export to a Table
Describe the Purpose of MapReduce
Define Key/Value Pairs in MapReduce
Describe the Map and Reduce Phases
Describe Hadoop Streaming
Starting an HDP Cluster
Demonstration: Understanding Block Storage (Lab)
Using HDFS Commands (Lab)
Importing RDBMS Data into HDFS (Lab)
Exporting HDFS Data to an RDBMS (Lab)
Importing Log Data into HDFS Using Flume (Lab)
Demonstration: Understanding MapReduce (Lab)
Running a MapReduce Job (Lab)

Pig Programming

Describe the Purpose of Apache Pig
Describe the Purpose of Pig Latin
Demonstrate the Use of the Grunt Shell
List Pig Latin Relation Names and Field Names
List Pig Data Types
Define a Schema
Describe the Purpose of the GROUP Operator
Describe Common Pig Operators (ORDER BY, CASE, DISTINCT, PARALLEL, FLATTEN, FOREACH)
Perform an Inner, Outer and Replicated Join
Describe the Purpose of the DataFu Library
Demonstration: Understanding Apache Pig (Lab)
Getting Started with Apache Pig (Lab)
Exploring Data with Apache Pig (Lab)
Splitting a Dataset (Lab)
Joining Datasets with Apache Pig (Lab)
Preparing Data for Apache Hive (Lab)
Demonstration: Computing Page Rank (Lab)
Analyzing Clickstream Data (Lab)
Analyzing Stock Market Data Using Quantiles (Lab)

Hive Programming

Describe the Purpose of Apache Hive
Describe the Differences Between Apache Hive and SQL
Describe the Apache Hive Architecture
Demonstrate How to Submit Hive Queries
Describe How to Define Tables
Describe How to Load Data into Hive
Define Hive Partitions, Buckets and Skew
Describe How to Sort Data
List Hive Join Strategies
Describe the Purpose of HCatalog
Describe the HCatalog Ecosystem
Define a New Schema
Demonstrate the Use of HCatLoader and HCatStorer with Apache Pig
Perform a Multi-table/File Insert
Describe the Purpose of Views
Describe the Purpose of the OVER Clause
Describe the Purpose of Windows
List Hive Analytics Functions
List Hive File Formats
Describe the Purpose of Hive SerDe
Understanding Hive Tables (Lab)
Understanding Partition and Skew (Lab)
Analyzing Big Data with Apache Hive (Lab)
Demonstration: Computing NGrams (Lab)
Joining Datasets in Apache Hive (Lab)
Computing NGrams of Emails in Avro Format (Lab)
Using HCatalog with Apache Pig (Lab)

Advanced Hive Programming, Hadoop 2 and YARN

Describe the Purpose of HDFS Federation
Describe the Purpose of HDFS High Availability (HA)
Describe the Purpose of the Quorum Journal Manager
Demonstrate How to Configure Automatic Failover
Describe the Purpose of YARN
List the Components of YARN
Describe the Lifecycle of a YARN Application
Describe the Purpose of a Cluster View
Describe the Purpose of Apache Slider
Describe the Origin and Purpose of Apache Spark
List Common Spark Use Cases
Describe the Differences Between Apache Spark and MapReduce
Demonstrate the Use of the Spark Shell
Describe the Purpose of a Resilient Distributed Dataset (RDD)
Demonstrate How to Load Data and Perform a Word Count
Define Lazy Evaluation
Describe How to Load Multiple Types of Data
Demonstrate How to Perform SQL Queries
Demonstrate How to Perform DataFrame Operations
Describe the Purpose of the Optimization Engine
Describe the Purpose of Apache Oozie
Describe Apache Pig Actions
Describe Apache Hive Actions
Describe MapReduce Actions
Describe How to Submit an Apache Oozie Workflow
Define an Oozie Coordinator Job
Advanced Apache Hive Programming (Lab)
Running a YARN Application (Lab)
Getting Started with Apache Spark (Lab)
Exploring Apache Spark SQL (Lab)
Defining an Apache Oozie Workflow (Lab)

+91-81029 35454

info@greaterinsights.in

GREATERINSIGHTS LLP

LOCATION

768, 14th Cross Rd, 2nd Stage, Kumaraswamy Layout, Bengaluru, Karnataka 560078

© Copyright 2025 by GREATERINSIGHTS LLP. All Rights Reserved.