Duration:
- 2 days (8 hours/day)
Day | Topics – Using Hadoop’s Core | Format
---|---|---
Day 1 | HDFS: what it is and how it works | Demo + Lab
 | Analytics overview |
 | Installing the MovieLens dataset into HDFS using the Ambari UI and the command line |
 | MapReduce: what it is and how it works |
 | Installing Python, MRJob, and nano |
 | Programming Hadoop with Pig |
 | Introducing Ambari |
 | Introducing Pig and more Pig Latin |
 | Programming Hadoop with Spark |
 | Why Spark? |
 | Resilient Distributed Datasets (RDDs) |
 | Datasets and Spark 2.0 |
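Day 1 covers MapReduce: what it is and how it works. As a taste of that topic, the map/shuffle/reduce flow can be sketched in plain Python — no Hadoop or MRJob required; the function names and sample data here are illustrative, not taken from the course labs:

```python
from collections import defaultdict

def mapper(line):
    # Map task: emit a (word, 1) pair for every word in a line of input.
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    # Reduce task: sum all the counts emitted for one key.
    return (word, sum(counts))

def map_reduce(lines):
    # Shuffle phase: group the mapper output by key before reducing.
    groups = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            groups[word].append(count)
    return dict(reducer(word, counts) for word, counts in groups.items())

if __name__ == "__main__":
    data = ["the quick brown fox", "the lazy dog"]
    print(map_reduce(data))  # word counts, e.g. "the" appears twice
```

On a real cluster, Hadoop runs the map and reduce tasks in parallel across nodes and handles the shuffle over the network; the single-process version above only shows the programming model.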
Day | Topics – Using Relational/Non-Relational Data Stores with Hadoop | Format
---|---|---
Day 2 | What Hive is and how it works | Demo + Lab
 | Integrating MySQL with Hadoop |
 | Why NoSQL? |
 | What is HBase? |
 | Cassandra overview |
 | MongoDB overview |
 | Querying your data interactively |
 | Overview of Drill, Phoenix, and Presto |
 | Setting up Drill, Phoenix, and Presto |
 | Managing your cluster |
 | YARN, Tez, Mesos, ZooKeeper, Oozie, Zeppelin, and Hue explained |
 | Feeding data to your cluster |
 | Kafka and Flume explained |
 | Setting up Kafka and Flume |
 | Analyzing streams of data |
 | Spark Streaming, Apache Storm, and Flink overview |
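Day 2 opens with Hive, which lets you query data in Hadoop with SQL-like statements (HiveQL). The kind of query written in that lab can be sketched locally with Python's built-in `sqlite3` as a stand-in engine — the simplified MovieLens-style schema (`user_id`, `movie_id`, `rating`) and the sample rows are assumptions for illustration, not the exact lab setup:

```python
import sqlite3

# In the lab, a similar query would run in Hive against MovieLens data in HDFS;
# sqlite3 is used here only so the SQL can run locally without a cluster.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (user_id INT, movie_id INT, rating INT)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(1, 50, 5), (2, 50, 4), (1, 172, 3)],  # illustrative sample rows
)

# Average rating per movie, highest first -- a classic SQL-on-Hadoop warm-up.
for movie_id, avg_rating in conn.execute(
    "SELECT movie_id, AVG(rating) FROM ratings GROUP BY movie_id ORDER BY 2 DESC"
):
    print(movie_id, avg_rating)
```

Hive compiles such statements into jobs that run on the cluster (via MapReduce or Tez), which is why the same SQL skills transfer directly to the interactive engines covered later in the day (Drill, Phoenix, Presto).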
Notes:
- The e-kit is not available for custom courses.
- The trainer will provide a slide deck (in PDF format) used to deliver the theory sessions.
- The custom lab environment (if applicable) will be available only for the duration of the training.