Apache Cassandra
Course Overview
Gain a practical working knowledge of Cassandra architecture, interfaces and data model. Master the deployment of Apache Cassandra – an open-source distributed NoSQL database. Build scalable database solutions with high availability and performance. Deploy Cassandra to manage your big data with tunable consistency.
At the end of the training, participants will be able to:
- Explain big data and NoSQL databases
- List the features of Cassandra
- Explain the architecture and data model of Cassandra
- Discuss the Hadoop ecosystem of products around Cassandra
- Deploy NoSQL database solutions using Cassandra
Prerequisites
Exposure to SQL databases and Java programming.
Duration
3 days
Course Outline
- What is Big Data?
- Three V’s of Big Data
- Data Volume
- Data Velocity
- Data Variety
- Evolution of Data
- Features of Big Data
- What is Apache Hadoop?
- Components of Hadoop: MapReduce, HDFS
- What is NoSQL?
- Difference between RDBMS and NoSQL Databases
- CAP Principle
- Types of NoSQL
- NoSQL Cassandra Database
- What is Cassandra?
- Cassandra: Use Cases
- Cassandra: Use in Industry
- Features of Cassandra
- Advantages of Cassandra
- Cassandra Commands for Linux
- Cassandra Architecture
- Cassandra Architecture Components
- Data Replication
- Simple Strategy
- Network Topology Strategy
- Data Partition
- Snitches
- Gossip Protocol
- Seed Nodes
- What is Token?
- What is Virtual Node?
- Write Process
- Read Process
- Introduction to Data Model
- Features of Cassandra Data Model
- Cassandra Data Model Rules
- Cassandra Data Model Components
- UUID and TimeUUID
- Counter
- Features of Counter
- Compound Key
- Indexes
- Collections
- Types of Collections
- CQL
- DML Statements
- DDL Statements
- What is CQL?
- What is cqlsh?
- cqlsh Options
- cqlsh Shell Commands
- CQL Data Definition
- CQL Data Manipulation
- Java Interfaces
- Queries Using Java Interfaces
- ODBC Driver for Cassandra
- What is Partitioning?
- Features of Partitioners
- Types of Partitioners
- Replication of Data
- Replication Strategy
- Common Types of Replication Strategies
- Tunable Consistency
- Read Consistency
- Write Consistency
- Hinted Handoff
- Time to Live
- Tombstones
- Monitoring the Cluster
- Monitoring with nodetool
- Monitoring with OpsCenter
- CassandraStream
- Apache Storm
- Apache Kafka
- Real Time Data Analysis Platform
- Apache Spark
- Spark and Scala