Apache Impala

Course Overview

Apache Impala is an open-source, distributed SQL query engine for analyzing data stored in Apache Hadoop. It is designed for fast, interactive queries against data stored in HDFS and Apache HBase. This course teaches participants how to use Impala to analyze data stored in Hadoop. It covers installing and configuring Impala, writing and optimizing Impala SQL queries, and integrating Impala with other Hadoop technologies. On completing the training, participants will have the skills and knowledge needed to perform fast, interactive analysis of large data sets stored in Hadoop.

At the end of the training, participants will be able to:

  1. Explain the architecture of Impala and its business use cases
  2. Install and configure Impala and integrate it with your organization’s Hadoop ecosystem
  3. Configure Impala for data access and metadata management
  4. Query data with Impala or Hive
  5. Partition tables and optimize performance
  6. Work with existing Hadoop clusters, file systems, and file types

Prerequisites

  1. Knowledge of the Apache Hadoop ecosystem and SQL is required.
  2. A basic understanding of database administration is helpful.
 

Duration

2 days

Course Outline

  1. What is Impala
  2. Benefits of Impala
  3. Exploratory Business Intelligence
  4. Impala Installation
  5. Starting and Stopping Impala
  6. Data Storage
  7. Managing Metadata
  8. Controlling Access to Data
  9. Impala Shell Commands and Interface

  1. Querying with Hive and Impala
  2. SQL Language Statements
  3. DDL Statements
  4. DML Statements
  5. CREATE DATABASE
  6. CREATE TABLE
  7. CREATE TABLE – Examples
  8. Internal and External Tables
  9. Loading Data into Impala Table
  10. ALTER TABLE
  11. DROP TABLE
  12. DROP DATABASE
  13. DESCRIBE Statement
  14. EXPLAIN Statement
  15. SHOW TABLES Statement
  16. INSERT Statement
  17. INSERT Statement – Examples
  18. SELECT Statement
  19. Data Types
  20. Operators
  21. Functions
  22. CREATE VIEW in Impala
  23. Hive and Impala Query Syntax

  1. Data Storage and File Format
  2. Partitioning Tables
  3. SQL Statements for Partitioned Tables
  4. File Format and Performance Considerations
  5. Choosing File Type and Compression Technique

  1. Working with Impala
  2. Impala Architecture
  3. Impala Daemon
  4. Impala Statestore
  5. Impala Catalog Service
  6. Query Execution Flow in Impala
  7. User-Defined Functions
  8. Hive UDFs with Impala
  9. Demo – UDF in Impala
  10. Improving Impala Performance
