Cloudera Data Analyst
This training shows analysts and database administrators how to apply traditional data analytics and business intelligence skills to Big Data. Learn the tools data professionals need to access, manipulate, and analyze complex data sets using SQL and familiar scripting languages.
Good overview of what is possible with Hadoop, Hive and Pig.
Data Scientist, Elmar Reizen
Audience Profile: Cloudera Data Analyst training
This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Knowledge of SQL is assumed, as is basic Linux command-line familiarity. Knowledge of at least one scripting language (such as Bash scripting, Perl, Python, or Ruby) would be helpful but is not essential. No prior knowledge of Hadoop is required.
Achievements upon completion
This four-day hands-on course, focusing on Apache Pig, Apache Hive, and Apache Impala (incubating), will teach you to access, manipulate, and analyze massive data sets in your Hadoop cluster. Using SQL and a simple scripting language, you will learn how to perform ETL tasks and gain valuable insight from your data.
You will learn:
- The features that Pig, Hive, and Impala offer for data acquisition, storage, and analysis
- The fundamentals of data ETL (extract, transform, load), ingestion, and processing with Hadoop tools
- How Pig, Hive, and Impala improve productivity for typical analysis tasks
- Joining diverse datasets to gain valuable business insight
- Performing real-time, complex queries on datasets
You will have hands-on experience in:
- Joining multiple data sets and analyzing disparate data with Pig
- Organizing data into tables, performing transformations, and simplifying complex queries with Hive
- Making multi-structured data accessible with Hive
You will have the skills to:
- Perform real-time interactive analyses on massive datasets stored in HDFS or HBase using SQL with Impala
- Pick the best analysis tool for a given task in Hadoop
- Enable real-time interactive analysis of the data stored in Hadoop via a native SQL environment with Cloudera Impala
Please note: you need to bring your own laptop for this training.
Xebia Academy (based in Hilversum, Amsterdam area) is an official training partner of Cloudera, the leader in Apache Hadoop-based software and services.