This three-day training is for data engineers, analysts, and architects; software engineers; IT operations staff; and technical managers who want a thorough, hands-on overview of the Apache Spark platform.
Each topic includes slide and lecture content along with hands-on use of Spark through the elegant Databricks web-based notebook environment. Inspired by tools like IPython/Jupyter and MATLAB, Databricks notebooks let attendees code jobs, run data analysis queries, and generate visualizations on their own Spark cluster, accessed through a web browser.
Q: Is Spark Programming training right for me?
- Yes - if you are a data analyst, engineer, architect, software developer, IT operations professional, or technical manager looking to use Spark to build data pipelines and create streaming and machine learning jobs.
- Yes - if you have a basic understanding of Python or Scala. Knowledge of SQL is helpful but not required.
Q: What will I achieve by completing this training?
The training covers the core Spark APIs, the platform's fundamental mechanisms and basic internals, SQL and other high-level data access tools, and Spark's streaming capabilities and machine learning APIs.
You will learn:
- How to describe Spark’s fundamental mechanics
You will gain hands-on experience in:
- Using the core Spark APIs to operate on data
- Experimenting with typical use cases for Spark
- Building data pipelines with Spark SQL and DataFrames
- Analyzing Spark jobs using the UIs and logs
You will develop the skills to:
- Create streaming and machine learning jobs
Q: What else should I know?
Xebia Academy is a Databricks-certified Training Partner that delivers Databricks Spark training with a Databricks-certified Spark instructor.
Please bring your own laptop to the training. We will use the following software, so please make sure it is installed on your laptop: Chrome