  • Global training info
  • Category Big Data
  • Language English
  • Duration 4 days
  • Time 09:00 - 17:00
  • Lunch Included

Cloudera Developer for Spark & Hadoop I

Learn how Apache Spark integrates with the entire Hadoop ecosystem. This four-day training, taught in English, gives you the skills you need to ingest data on a Hadoop cluster and process it with Spark, Hive, Flume, Sqoop, Impala, and other Hadoop ecosystem tools. Through instructor-led discussions and interactive, hands-on exercises, you learn to identify the right tool(s) for any situation and how to use them. You’ll walk away from this training with the practical knowledge you need to tackle the real-world challenges Hadoop developers face every day.

“The training was interesting, and the trainers were very knowledgeable.” - Data Scientist

Audience: Cloudera Developer for Spark & Hadoop I

You will benefit from Cloudera Developer for Spark & Hadoop I training if you:

  • Work as a developer or engineer
  • Have programming experience
  • Can program in Scala and/or Python (Apache Spark examples and hands-on exercises are presented in those languages)
  • Have basic familiarity with the Linux command line
  • Have some knowledge of SQL

Prior knowledge of Hadoop is not required.

Achievements Upon Completion

Cloudera Developer for Spark & Hadoop I gives you skills, knowledge, tools and training in the following areas:

You will learn:

  • How data is distributed, stored, and processed in a Hadoop cluster
  • How to use Sqoop and Flume to ingest data
  • How to process distributed data with Apache Spark
  • How to model structured data as tables in Impala and Hive
  • How to choose the best data storage format for different data usage patterns
  • Best practices for data storage

You will gain experience in:

  • Choosing the right tool(s) for any situation and using them
  • Applying best practices for data storage
  • Working with RDDs in Spark

You will develop the skills to:

  • Process distributed data with Apache Spark
  • Choose the best data storage format for different data usage patterns

Additional Information

Requirements

  • Please bring your own laptop

Certification

  • This training is an excellent place to begin working towards the CCP: Data Engineer certification and covers many of the subjects tested. However, further study is required; we recommend Developer Training for Spark and Hadoop II: Advanced Techniques before taking the exam.

Xebia Academy (based in Hilversum, Amsterdam) is an official training partner of Cloudera, the leader in Apache Hadoop-based software and services.


http://training.xebia.com/big-data/cloudera-developer-for-spark-and-hadoop-1