Administrator Training for Apache Hadoop

Hadoop Administration

Apache® Hadoop™ is the most popular framework for processing Big Data on clusters of servers. Our high-quality Hadoop Administrator Training is delivered by our consultants, who work with Hadoop technology on a daily basis.

Whether you work with pure Apache Hadoop or a third-party commercial distribution such as Teradata, Cloudera, Hortonworks, or MapR, you will learn to plan, deploy, manage, and optimize a Hadoop environment, and to monitor and troubleshoot applications running on Apache Hadoop.


This course is intended for system administrators, DevOps engineers, and IT managers who are interested in setting up and managing a large Hadoop cluster and who have basic Linux experience. If you are an administrator, or want to become one, and you are ready to build and maintain a production-level Hadoop cluster running Cloudera, Teradata, or Hortonworks, then this course is for you.

Teaching method

Through instructor-led training with hands-on exercises, participants gain practical experience with Apache Hadoop and its ecosystem.

Course outline

  1. Hadoop as a distributed computing engine
  2. Internals of the YARN scheduler, the MapReduce paradigm, and HDFS
  3. Determining the correct hardware and infrastructure for your cluster
  4. Loading data into the cluster from various sources such as databases, message queues, and log files
  5. Best practices for preparing and maintaining Apache Hadoop in production
  6. Monitoring, diagnosing, troubleshooting and solving Hadoop issues
  7. Application use cases that might run on your cluster


Instructor Vladimir Smida

Vladimir Smida is a Big Data Engineer at Comiit. Over the years Vlad has architected and developed enterprise-ready production systems based on Apache Hadoop, Apache Spark, and HBase, deployed on various commercial distributions such as Cloudera, Hortonworks, and Azure. He has worked as a big data consultant with DevOps teams at the biggest IT companies in the world, and his experience ranges from proofs of concept to large (600+ node) Hadoop clusters. Outside working hours Vlad runs the biggest community of data scientists in Scandinavia, called