HADOOP Administration (Cloudera & Apache)

A Hadoop administrator is responsible for capacity planning: estimating the requirements for scaling the Hadoop cluster up or down, and deciding the size of the cluster based on the data to be stored in HDFS.
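
As a rough illustration of that sizing exercise, the sketch below estimates how many DataNodes a cluster would need for a given volume of data. The 500 TB figure, replication factor of 3, 25% headroom for temporary/intermediate data, and 48 TB of usable disk per node are all illustrative assumptions, not fixed recommendations.

    // Back-of-envelope HDFS capacity estimate; every figure here is an assumption for illustration.
    public class ClusterSizing {
        public static void main(String[] args) {
            double rawDataTb = 500;        // data to be stored in HDFS, in TB (assumed)
            int replicationFactor = 3;     // default HDFS replication factor
            double overhead = 0.25;        // headroom for temp/intermediate data (assumed 25%)
            double usableTbPerNode = 48;   // usable disk per DataNode after OS/log space (assumed)

            double totalTbNeeded = rawDataTb * replicationFactor * (1 + overhead);
            int dataNodes = (int) Math.ceil(totalTbNeeded / usableTbPerNode);

            System.out.printf("Total HDFS capacity needed: %.0f TB%n", totalTbNeeded);
            System.out.printf("DataNodes required: %d%n", dataNodes);
        }
    }

With these assumptions the cluster needs roughly 1,875 TB of raw HDFS capacity, or about 40 DataNodes; changing any input changes the answer, which is exactly the planning exercise the course walks through.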

Course Overview

What will you learn?

Who should go for this training?
The following professionals can go for this course:

What are the prerequisites for this Course?

Course Content

  • Hadoop Introduction
  • Big Data usage & benefits
  • Hadoop Architecture
  • MapReduce 1.0 & 2.0
  • Hadoop cluster setup
  • File read / write operations (see the API sketch after this list)
  • Creating Yum Repository
  • .repo files
  • Host inspectors
  • Block Size
  • Mount directories for DataNodes, NameNode, Secondary NameNode
  • Fsimage
  • Edit logs
  • core-site.xml
  • hdfs-site.xml
  • Secondary NameNode
  • Cloudera Manager
  • Setup ZooKeeper services using Cloudera Manager
  • Setup Journal Nodes using Cloudera Manager
  • Active & Standby NameNodes
  • Understanding ZooKeeper & Journal Nodes
  • High availability configurations (core-site.xml / hdfs-site.xml; see the configuration sketch after this list)
  • Understanding cdhns
  • Add YARN service
  • YARN Architecture
  • Application Masters
  • Resource Managers
  • Containers
  • Cluster Metrics
  • Typical Configuration, HA Configuration, Federation Configurations
  • Hardware consideration for slaves and servers
  • Awareness of hyperthreading, blade servers
  • Operating system considerations
  • Workload, CPU, memory, I/O, storage, disks, JBOD, RAID, etc.
  • Recommendations by Cloudera
  • Storage Intensive, Compute intensive, Balanced configurations
  • Network designs for Hadoop
  • Understanding Pig Latin
  • Grunt shell, local mode and MR mode
  • Testing & Executing PIG jobs
  • Executing Big Data analysis using PIG
  • Decommission/Recommission nodes
  • Add/remove Hadoop nodes
  • Working with cloudera-scm-agent
  • Running balancer
  • Creating and applying host templates
  • Deploy Client Configuration
  • Hadoop daemon logs
  • Understanding common errors & troubleshooting them
  • Setup Environment Variables
  • Setup core-site.xml
  • hdfs-site.xml
  • Make datanode directories
  • dfsadmin
  • File system commands (fs, jar, fsck)
  • HDFS Balancer
  • Quota
  • HDFS health checks
  • Configure yarn-site.xml & mapred-site.xml
  • Running MRv2 jobs
  • HDFS logs directories
  • Observing charts
  • Setup root user
  • dfs.blocksize, dfs.data.dir, and other properties
  • Fault tolerance
  • Checksum
  • Working with HDFS Blocksize
  • HDFS replication
  • Observing Namenode UI
  • Block pool
  • Dfsadmin
  • Safemode
  • saveNamespace
  • fsimage / edits log
  • Namenode recovery after server crash process
  • Datanode registrations
  • File system commands (fs, jar, fsck)
  • fs – appendToFile, put, copyToLocal, chown, chgrp, df, du, moveFromLocal, moveToLocal, stat, etc.
  • mysql-connector-java
  • mysql_secure_installation
  • Setup Metastore on MySQL DB
  • HIVE architecture
  • Understanding HIVE warehouse directory
  • Hive logs
  • Creating HIVE tables & testing HIVE
  • Executing Big Data analysis using HIVE
  • Data backup using SNAPSHOTS
  • Data migrations using ‘distcp’ utility
  • Getting Ready
  • Setup Environment Variables
  • Test HDFS
  • Understanding Gateway node functionality
  • Setup hdfs-site.xml
  • Understanding rack awareness
  • Setup topology.script
  • topology.data
  • Zookeeper configuration
  • Setup Journal Nodes
  • Setup automatic failover
  • Setup folder structures for HA
  • About QuorumPeerMain
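
The "File read / write operations" topic referenced above is usually demonstrated either with the hdfs dfs shell or with the Java FileSystem API. The sketch below uses the Java API; the NameNode URI hdfs://namenode:8020 and the /user/training/hello.txt path are placeholders for whatever the lab cluster actually uses.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Minimal HDFS write-then-read example (NameNode URI and path are placeholders).
    public class HdfsReadWrite {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://namenode:8020"); // placeholder NameNode address

            FileSystem fs = FileSystem.get(conf);
            Path file = new Path("/user/training/hello.txt");

            // Write a small file into HDFS
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.writeUTF("Hello HDFS");
            }

            // Read the same file back and print its contents
            try (FSDataInputStream in = fs.open(file)) {
                System.out.println(in.readUTF());
            }

            fs.close();
        }
    }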
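
For the high-availability topic, the course works with properties spread across core-site.xml and hdfs-site.xml. The sketch below sets the usual HA property names on a Configuration object purely to gather them in one place; the nameservice name cdhns (assumed here to be the cluster's HA nameservice), the host names, and the ports are placeholder values.

    import org.apache.hadoop.conf.Configuration;

    // HA-related properties normally split across core-site.xml and hdfs-site.xml,
    // set programmatically here only to list them together. Hostnames and ports are placeholders.
    public class HaConfigSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // core-site.xml: clients address the nameservice, not a single NameNode
            conf.set("fs.defaultFS", "hdfs://cdhns");
            conf.set("ha.zookeeper.quorum", "zk1:2181,zk2:2181,zk3:2181");

            // hdfs-site.xml: the nameservice, its two NameNodes, and the JournalNode quorum
            conf.set("dfs.nameservices", "cdhns");
            conf.set("dfs.ha.namenodes.cdhns", "nn1,nn2");
            conf.set("dfs.namenode.rpc-address.cdhns.nn1", "namenode1:8020");
            conf.set("dfs.namenode.rpc-address.cdhns.nn2", "namenode2:8020");
            conf.set("dfs.namenode.shared.edits.dir",
                     "qjournal://jn1:8485;jn2:8485;jn3:8485/cdhns");
            conf.set("dfs.client.failover.proxy.provider.cdhns",
                     "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
            conf.set("dfs.ha.automatic-failover.enabled", "true");

            System.out.println("fs.defaultFS = " + conf.get("fs.defaultFS"));
            System.out.println("NameNode RPC addresses: "
                    + conf.get("dfs.namenode.rpc-address.cdhns.nn1") + ", "
                    + conf.get("dfs.namenode.rpc-address.cdhns.nn2"));
        }
    }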

Modes of Training

Classroom Training

Live interactive sessions delivered in our classroom by our expert trainers with real-time scenarios.

Online Training

Learn from anywhere over the internet, joining the live sessions delivered by our expert trainers.

Self-Paced Training

Learn through pre-recorded video sessions delivered by experts at your own pace and timing.

Frequently Asked Questions

Our trainer is an OCP & OCM certified consultant with 18 years of experience working with the technology.

Once you register, our back-end team will share the details for joining the live session over an online portal that can be accessed through a browser.

Each of our live sessions is recorded. If you miss any, you can request the link to that particular session.

For practical execution, our trainer/technical team will provide server access details to the student.

Yes. We provide a step-by-step document that you can follow, and if required, our technical team will assist you.

Live online training is where you attend a live session with the trainer and can clarify queries as you go.

Pre-recorded sessions are videos provided to you that you can watch and learn from anytime, at a place convenient to you. For any doubts about the videos, you can email the trainer.

You can contact our support team, or just drop an email to help@sarinfotechindia.com with your queries.

The course material and recorded videos are provided during the course period. You can download them anytime.

Visit our website regularly to check for discount offers from time to time. We also provide a discount for single participants & special discounts for 2 or more participants.

* If the cancellation request is made within 2 days of enrolment, 100% of the fee is refunded.

* If the request is made after 2 days, the refund is issued after deducting the administration fee.