Spark Programming

This three-day training is for data engineers, analysts, architects, software engineers, IT operations staff, and technical managers interested in a thorough, hands-on overview of the Apache Spark platform. Each topic includes slide and lecture content along with hands-on use of Spark through the Databricks web-based notebook environment. Inspired by tools like IPython/Jupyter and MATLAB, Databricks notebooks allow attendees to code jobs, run data analysis queries, and generate visualizations using their own Spark cluster, accessed through a web browser.

Q: Is Spark Programming training right for me?

  • Yes - if you are a data analyst, engineer, architect, software developer, IT operations professional, or technical manager looking to use Spark to build data pipelines and create streaming and machine learning jobs
  • Yes - if you have a basic understanding of Python or Scala. Knowledge of SQL is helpful but not required

Q: What will I achieve by completing this training?

The training covers the core Spark APIs, the platform's fundamental mechanisms and basic internals, SQL and other high-level data access tools, as well as Spark's streaming capabilities and machine learning APIs.

You will learn:

  • How to describe Spark’s fundamental mechanics

You will gain hands-on experience in:

  • Using the core Spark APIs to operate on data
  • Experimenting with typical use cases for Spark
  • Building data pipelines with Spark SQL and DataFrames
  • Analyzing Spark jobs using the UIs and logs

You will develop the skills to:

  • Create streaming and machine learning jobs

Q: What else should I know?

Xebia Academy is a Databricks-certified training partner, delivering Databricks Spark training with a Databricks-certified Spark instructor.

Prerequisites

Basic knowledge of programming in Python (i.e., knowledge of the concepts stated under Basics on learnpython.org) or Scala

Familiarity with the basics of SQL is helpful but not required

Please bring your own laptop to the training. We will use the following software during the training, so please make sure it is installed on your laptop: Chrome.

This training is taught by our training partner, GoDataDriven.