DATA ANALYTICS - SPARK/STORM
Spark and Storm are part of the big data ecosystem and are used for big data processing. This course helps learners understand the concept of in-memory distributed datasets and how they are used in big data processing.
Our Data Analytics - Spark and Storm course is well suited to individuals who want to start a career in data analytics and to organisations seeking to train their employees.
The training course can be tailored to your organisation's or your individual needs. Please contact us for details and prices of private in-house training services.
Scheduled start date: Contact us for dates
Introduction to Big Data, Hadoop, HDFS, MapReduce
- Rise of Big Data
- Comparing Hadoop with traditional systems
- Hadoop Master-Slave Architecture
- Understanding HDFS Architecture
- NameNode, DataNode, Secondary NameNode
- JobTracker and TaskTracker
- Understanding MapReduce Architecture
Apache Spark
- Introduction to Apache Spark
- Comparing Apache Spark with the MapReduce computational framework
- Spark Basics
- Working with RDDs in Spark
- Aggregating Data with Pair RDDs
- Writing and Deploying Spark Applications
- Parallel Processing
- Spark RDD Persistence
- Basic Spark Streaming
- Advanced Spark Streaming
- Common Patterns in Spark Data Processing
- Improving Spark Performance
- Spark SQL and DataFrames
- Practice lab exercises using Apache Spark
Apache Storm
- Storm Basics
- Storm Core Concepts
- Understanding Architecture of Storm
- Storm Workflow
- Installation of Apache Storm
- Grouping
- Overview of Trident
- Bootstrapping
- Storm Working Example
- Storm in Twitter
- Improving Storm Performance
- Practice lab exercises using Apache Storm