Intermediate Python for Data Science | Online | December 11

$199 AUD each
Friday, 10 December, 2021
8 pm – 2 am EST
Perth (Online)

About This Workshop

Python is an excellent programming language for data science, due to its rich set of data structures, the maturity of its data-focused libraries, and its ability to execute all aspects of a data science pipeline.

This workshop will train you towards becoming a well-rounded data scientist, with a good balance between understanding the data exploration aspect needed for modelling and the engineering part necessary to create machine learning pipelines.

This class is also a recommended preparatory workshop for students planning to enroll in General Assembly’s Data Science course.

Working with real-world examples, you’ll:

  • Learn to read, write and analyse any dataset with Python & pandas.
  • Learn key techniques for modelling, such as data pre-processing, handling missing data, identifying outliers, and data transformations.
  • Build a machine learning pipeline with an understanding of performance bottlenecks.

This bootcamp is broken down into 3 modules:

Part 1: Using pandas for data cleaning and pre-processing [2.5 hours]

  • Loading & saving data: from flat files, APIs, databases.
  • Data cleaning: filtering data, casting data types, imputing missing data, identifying outliers.
  • Combining data: merge and concatenate
  • Transformations: group by & apply, create custom transformations
  • Basic data visualisation (and how to think about data viz)
  • Understanding vectorized operations
  • Clean pandas: method chaining
  • When to use pandas vs other tools
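
As a rough illustration of the Part 1 topics (casting types, imputing missing data, flagging outliers, group-by, and method chaining), here is a minimal sketch. The sample data, the column names, the `flag_outliers` helper and the 1.5 × IQR rule are all assumptions made for illustration; they are not taken from the workshop materials.

```python
import pandas as pd

# Hypothetical sample data; the workshop's own datasets are not specified.
raw = pd.DataFrame(
    {
        "city": ["Perth", "Perth", "Sydney", "Sydney", "Sydney"],
        "price": ["10.5", "12.0", None, "11.0", "250.0"],  # strings with a gap
    }
)


def flag_outliers(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Add a boolean column marking values outside 1.5 * IQR (one common rule)."""
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    inside = df[column].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df.assign(is_outlier=~inside)


# Method chaining: each step returns a new DataFrame, so the whole
# cleaning recipe reads top to bottom without intermediate variables.
clean = (
    raw
    .assign(price=lambda d: pd.to_numeric(d["price"]))               # cast type
    .assign(price=lambda d: d["price"].fillna(d["price"].median()))  # impute
    .pipe(flag_outliers, "price")                                    # custom step
)

# Group by and aggregate; this runs as a vectorized operation, not a Python loop.
summary = clean.groupby("city")["price"].mean()
print(clean)
print(summary)
```

Median imputation and the IQR rule are only one reasonable choice each; the right approach depends on the dataset.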

Part 2: Performance Python for Data Science [1.5 hours]

  • Thinking about performance: when it matters, types of performance issues
  • Caching strategies
  • Multiprocessing with Python
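
The two strategies named in Part 2 can be sketched with the standard library alone. This is an illustrative sketch, not the workshop's own code: `expensive_lookup` and `cpu_bound` are hypothetical stand-ins, and `functools.lru_cache` and `multiprocessing.Pool` are just one way to approach caching and multiprocessing.

```python
import time
from functools import lru_cache
from multiprocessing import Pool


@lru_cache(maxsize=None)
def expensive_lookup(key: int) -> int:
    """Hypothetical slow call; results are memoised per argument."""
    time.sleep(0.01)  # simulate latency (e.g. an API or database call)
    return key * key


def cpu_bound(n: int) -> int:
    """Hypothetical CPU-heavy task: sum of squares below n."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # Caching: the second call with the same argument skips the slow path.
    expensive_lookup(12)
    expensive_lookup(12)
    print(expensive_lookup.cache_info())

    # Multiprocessing: independent tasks are spread across worker processes.
    # The __main__ guard is needed on platforms that spawn new interpreters.
    with Pool(processes=2) as pool:
        results = pool.map(cpu_bound, [10_000, 20_000, 30_000])
    print(results)
```

Caching tends to help when the same inputs recur, while multiprocessing tends to help when tasks are CPU-bound and independent. Each worker process keeps its own cache, so the two do not combine automatically.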

Part 3: Machine Learning Pipeline [2 hours]

  • What is a machine learning pipeline? When is it needed?
  • Understanding the building blocks of an ML pipeline: load/clean, feature engineering, train model, validate model, hyperparameter optimization, save model, predict & persist results
  • Building a complete ML pipeline
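
The building blocks listed above can be sketched end to end as follows. The listing does not say which library the workshop uses; scikit-learn and joblib are assumed here, and the synthetic data, step names and parameter grid are illustrative choices only.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load: synthetic data stands in for a real dataset (assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Clean + feature engineering + model, chained as a single estimator so the
# same transformations are applied at training and prediction time.
pipeline = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# Hyperparameter optimization: cross-validated grid search on training data.
search = GridSearchCV(pipeline, {"model__C": [0.1, 1.0, 10.0]}, cv=3)
search.fit(X_train, y_train)

# Validate: accuracy on data the search never saw.
test_accuracy = search.score(X_test, y_test)

# Save model: persist the fitted pipeline, then reload it.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "pipeline.joblib"
    joblib.dump(search.best_estimator_, model_path)
    restored = joblib.load(model_path)

# Predict with the restored pipeline.
predictions = restored.predict(X_test)
print(search.best_params_, test_accuracy)
```

Wrapping the pre-processing steps inside the pipeline means cross-validation refits them on each training fold, which helps avoid leaking information from validation data.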

Takeaways

Ability to use Python for basic data analysis tasks: read & write data, data cleaning, aggregations, handling missing data and outliers.

Advanced understanding of pandas, to transform datasets for analysis or modelling

Practical knowledge of performance strategies: caching and multiprocessing

Skills to prepare a simple machine learning pipeline

Greater preparedness for GA’s Data Science course

Preparation

Laptop with a working install of Python and Jupyter Notebook

Understanding of Python’s data structures and control flow

Ability to use Python for basic data manipulation tasks

This workshop will be completely live and online. In preparation, please download Zoom and Slack.

Please note: This session will not be recorded; however, slides will be shared following the class.

About the Instructor

Jireh Tan

Data Science Instructor, General Assembly

Jireh is a data scientist with a passion for applied statistics. He has a decade of experience working in tech, running the gamut from seed-stage startups to publicly-traded companies. His favorite Python abstraction is the iterable and his favorite Python keyword is ‘yield’. In his spare time Jireh enjoys bouldering and eating very spicy food.

Refund Policy

We understand that, sometimes, plans change. If you can no longer make it to a class or workshop, please email us at least 7 days before the scheduled event date. No refunds will be given for cancellations made within a week of the class or workshop.

Community Code of Conduct

Your registration for or attendance at any General Assembly offering indicates your agreement to abide by this Community Code of Conduct policy and its terms.
