Course Outline
Section 1: Introduction to Hadoop
- Hadoop history and concepts
- Ecosystem
- Distributions
- High-level architecture
- Hadoop myths
- Hadoop challenges
- Hardware / software
- Lab: first look at Hadoop
Section 2: HDFS
- Design and architecture
- Concepts (horizontal scaling, replication, data locality, rack awareness)
- Daemons: NameNode, Secondary NameNode, DataNode
- Communication / heartbeats
- Data integrity
- Read / write path
- NameNode High Availability (HA), Federation
- Labs: interacting with HDFS (see the sketch below)
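As a taste of the HDFS lab, here is a minimal Java sketch of interacting with HDFS through the FileSystem API. The NameNode address and paths are hypothetical placeholders; the lab cluster's actual values will differ.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal sketch: writing and listing a file on HDFS via the Java FileSystem API.
public class HdfsQuickLook {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");   // hypothetical NameNode address

        FileSystem fs = FileSystem.get(conf);

        // Write a small file; HDFS replicates its blocks across DataNodes automatically.
        Path file = new Path("/user/student/hello.txt");
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.writeUTF("Hello HDFS");
        }

        // List the directory, similar to `hdfs dfs -ls /user/student`.
        for (FileStatus status : fs.listStatus(new Path("/user/student"))) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }
    }
}
```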
Section 3: MapReduce
- Concepts and architecture
- Daemons (MRv1): JobTracker / TaskTracker
- Phases: driver, mapper, shuffle/sort, reducer
- MapReduce version 1 and version 2 (YARN)
- MapReduce internals
- Introduction to Java MapReduce programming
- Labs: running a sample MapReduce program (see the sketch below)
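To illustrate the Java MapReduce material, here is a sketch of the classic word-count job; the driver, mapper, and reducer correspond to the phases listed above. Input and output paths are supplied on the command line.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Classic word count: the mapper emits (word, 1) pairs, the shuffle/sort phase
// groups them by word, and the reducer sums the counts.
public class WordCount {

    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    // Driver: configures and submits the job; input/output paths come from the command line.
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```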
Section 4: Pig
- Pig vs Java MapReduce
- Pig job flow
- Pig Latin language
- ETL with Pig
- Transformations & joins
- User-defined functions (UDFs)
- Labs: writing Pig scripts to analyze data (see the sketch below)
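Pig scripts are normally written in Pig Latin and run from the Grunt shell; since the course exercises are in Java, the sketch below embeds a small, hypothetical ETL flow (load, filter, group, aggregate) in Java via Pig's PigServer API. The input file, field layout, and output path are assumptions, not lab data.

```java
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

// Embedding a small Pig Latin flow in Java with PigServer.
public class PigEtlSketch {
    public static void main(String[] args) throws Exception {
        PigServer pig = new PigServer(ExecType.MAPREDUCE);

        // LOAD -> FILTER -> GROUP -> aggregate: a typical ETL pipeline in Pig Latin.
        pig.registerQuery("logs = LOAD '/user/student/access_log' "
                + "USING PigStorage(' ') AS (ip:chararray, url:chararray, bytes:long);");
        pig.registerQuery("big = FILTER logs BY bytes > 1024;");
        pig.registerQuery("by_ip = GROUP big BY ip;");
        pig.registerQuery("traffic = FOREACH by_ip GENERATE group AS ip, SUM(big.bytes) AS total;");

        // Materialize the result on HDFS (equivalent to a STORE statement in a .pig script).
        pig.store("traffic", "/user/student/traffic_by_ip");
    }
}
```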
Section 5: Hive
- Architecture and design
- Data types
- SQL support in Hive
- Creating Hive tables and querying them
- Partitions
- Joins
- Text processing
- Labs: processing data with Hive (see the sketch below)
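A hedged sketch of querying Hive from Java over JDBC (HiveServer2), covering creation of a partitioned table and a simple aggregate query; the host name, table, and columns are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Querying Hive from Java over JDBC (HiveServer2).
public class HiveQuerySketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hiveserver:10000/default", "student", "");
             Statement stmt = conn.createStatement()) {

            // DDL: a partitioned table, as discussed in the partitions topic.
            stmt.execute("CREATE TABLE IF NOT EXISTS page_views "
                    + "(url STRING, hits BIGINT) PARTITIONED BY (dt STRING)");

            // A simple aggregate query; Hive compiles it into distributed jobs.
            ResultSet rs = stmt.executeQuery(
                    "SELECT dt, SUM(hits) FROM page_views GROUP BY dt");
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}
```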
Section 6: HBase
- Concepts and architecture
- HBase vs RDBMS vs Cassandra
- HBase Java API
- Time-series data in HBase
- Schema design
- Labs: interacting with HBase using the shell; programming with the HBase Java API (see the sketch below); schema design exercise
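A short sketch in the spirit of the HBase Java API labs: writing and reading a single cell using a time-series-style row key. The table name, column family, and row key layout are hypothetical examples, not the lab schema.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Writing and reading one cell with the HBase Java client.
// A time-series row key typically combines an entity id with a timestamp,
// as covered in the schema design topic.
public class HBaseApiSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("sensor_readings"))) {

            // Put: row key "sensor42#20240101T1200", column family "d", qualifier "temp".
            Put put = new Put(Bytes.toBytes("sensor42#20240101T1200"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("temp"), Bytes.toBytes("21.5"));
            table.put(put);

            // Get the row back and print the cell value.
            Result result = table.get(new Get(Bytes.toBytes("sensor42#20240101T1200")));
            byte[] value = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("temp"));
            System.out.println("temp = " + Bytes.toString(value));
        }
    }
}
```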
Requirements
- Comfortable with the Java programming language (most programming exercises are in Java)
- Comfortable in a Linux environment (able to navigate the Linux command line and edit files with vi / nano)
Lab environment
Zero install: there is no need to install Hadoop software on students' machines! A working Hadoop cluster will be provided for students.
Students will need the following:
- An SSH client (Linux and Mac already include one; for Windows, PuTTY is recommended)
- A browser to access the cluster; we recommend Firefox
Testimonials (5)
The live examples
Ahmet Bolat - Accenture Industrial SS
Course - Python, Spark, and Hadoop for Big Data
During the exercises, James explained every step in more detail wherever I was getting stuck. I was completely new to NiFi. He explained the actual purpose of NiFi, even basics such as it being open source. He covered every concept of NiFi from beginner level to developer level.
Firdous Hashim Ali - MOD A BLOCK
Course - Apache NiFi for Administrators
The trainer's preparation and organization, and the quality of the materials provided on GitHub.
Mateusz Rek - MicroStrategy Poland Sp. z o.o.
Course - Impala for Business Intelligence
That I had it in the first place.
Peter Scales - CACI Ltd
Course - Apache NiFi for Developers
The practical, hands-on way of doing things; the theory was also delivered well by Ajay.