Intro to Building PySpark Applications Bootcamp

Los Angeles Campus

GA Los Angeles (DTLA)
360 E. 2nd Street, Suite 400
Los Angeles CA 90012

About this workshop

With the advent of ubiquitous mobile devices and a growing number of users coming online every day, the age of Big Data is upon us. While the term Big Data is widely used, there are many misconceptions about it.

Businesses today need to collect, store, analyze, and visualize data in ways they never have before, and it's important to keep learning to stay relevant as the field evolves. We will begin with an introduction to the emerging challenges of Big Data, review concepts from distributed systems, and discover how Spark as a framework meets these challenges. This tutorial is hands-on, pragmatic, and intended to give you a working knowledge of common Spark workflows.

In this bootcamp, students will learn:

  • The basics of distributed systems
  • How to install and set up a Spark environment locally
  • Spark framework internals and Execution Context
  • Various APIs for large-scale data parsing, transformation, and analysis: RDD, DataFrame, and Spark SQL
  • Functions, Transformations & Actions
  • Spark Deployment Modes

Takeaways

  • What Spark is and how it works
  • How to process, transform, and visualize data
  • Working knowledge of how to build Spark Applications
  • Practical case studies demonstrating most common features
  • How to avoid common mistakes

Prereqs & Preparation

A very basic understanding of programming is helpful but not necessary. Everyone is welcome to take this introductory bootcamp.
