Intermediate Python for Data Science | Online
22

Friday, 21 May

9 pm – 3 am EDT

GA Singapore (Livestream)

Online Class
Livestream
$250 SGD
Regular Ticket
$250 SGD
Total

Questions? Read our FAQs

Intermediate Python for Data Science | Online | May 22

Conor Smyth
Lead Instructor, General Assembly

About This Workshop

Python is an excellent programming language for data science, due to its rich set of data structures, the maturity of its data-focused libraries, and its ability to execute all aspects of a data science pipeline.

This workshop will train you towards becoming a well-rounded data scientist, with a good balance between understanding the data exploration aspect needed for modelling and the engineering part necessary to create machine learning pipelines.

This class is also a recommended preparatory workshop for students planning to enroll in General Assembly’s Data Science course.

Working with real-world examples, you’ll:

  • Learn to read, write, and analyse any dataset with Python and pandas.
  • Learn key techniques for modelling, such as data pre-processing, handling missing data, identifying outliers, and transforming data.
  • Build a machine learning pipeline with an understanding of performance bottlenecks.

This workshop is broken down into 3 modules:

Part 1: Using pandas for data cleaning and pre-processing [2.5 hours]

  • Loading and saving data: flat files, APIs, and databases
  • Data cleaning: filtering data, casting data types, imputing missing data, identifying outliers
  • Combining data: merge and concatenate
  • Transformations: group by and apply, custom transformations
  • Basic data visualisation (and how to think about data viz)
  • Understanding vectorized operations
  • Clean pandas: method chaining
  • When to use pandas vs other tools
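
As a rough illustration of several Part 1 topics (not taken from the workshop materials), the sketch below loads a small table, casts a data type, treats an implausible reading as missing, imputes missing values, and aggregates with group by, all written as a single method chain. The sample data, the column names, the plausibility range, and the clean helper are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a flat file on disk.
RAW_CSV = """city,temp_c,reading_date
Dublin,12.5,2021-05-01
Dublin,,2021-05-02
Dublin,13.5,2021-05-03
Dublin,999.0,2021-05-04
Singapore,31.0,2021-05-01
Singapore,30.0,2021-05-02
"""


def clean(raw: pd.DataFrame) -> pd.DataFrame:
    """Cast types, blank out implausible readings, then impute per city."""
    return (
        raw
        # Cast the date column from text to a datetime type.
        .assign(reading_date=lambda df: pd.to_datetime(df["reading_date"]))
        # Treat readings outside an assumed plausible range as missing.
        .assign(temp_c=lambda df: df["temp_c"].where(df["temp_c"].between(-90, 60)))
        # Impute missing readings with the mean of the same city.
        .assign(
            temp_c=lambda df: df["temp_c"].fillna(
                df.groupby("city")["temp_c"].transform("mean")
            )
        )
    )


readings = clean(pd.read_csv(io.StringIO(RAW_CSV)))
city_means = readings.groupby("city")["temp_c"].mean()
print(city_means)
```

Each step in the chain returns a new DataFrame, so the cleaning logic reads top to bottom without intermediate variables, and the column-wise calls are vectorized rather than looping over rows.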

Part 2: Performance Python for Data Science [1.5 hours]

  • Thinking about performance: when it matters, types of performance issues
  • Caching strategies
  • Multiprocessing with Python
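
To make the two Part 2 strategies concrete, here is a minimal sketch using only the standard library: functools.lru_cache for caching and multiprocessing.Pool for spreading CPU-bound work across processes. The function names and workloads are invented for illustration, and the workshop itself may use different tools.

```python
import functools
from multiprocessing import Pool


@functools.lru_cache(maxsize=None)
def expensive_lookup(n: int) -> int:
    """Hypothetical stand-in for a slow, pure computation."""
    return sum(n for _ in range(n))  # deliberately slow way to compute n * n


def count_primes_below(limit: int) -> int:
    """Hypothetical CPU-bound task: count primes below `limit` by trial division."""
    return sum(
        1
        for k in range(2, limit)
        if all(k % d for d in range(2, int(k ** 0.5) + 1))
    )


def parallel_prime_counts(limits, workers=2):
    """Run count_primes_below over `limits` in separate worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes_below, limits)


if __name__ == "__main__":
    expensive_lookup(2_000)  # computed on the first call
    expensive_lookup(2_000)  # served from the cache on the second call
    print(expensive_lookup.cache_info())
    print(parallel_prime_counts([10_000, 20_000, 30_000]))
```

Caching only helps when the same inputs recur and the function is pure. Multiprocessing only helps when the work is CPU-bound and large enough to outweigh the cost of starting processes and passing data between them.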

Part 3: Machine Learning Pipeline [2 hours]

  • What is a machine learning pipeline? When is it needed?
  • Understanding the building blocks of an ML pipeline: load and clean, feature engineering, train model, validate model, hyperparameter optimization, save model, predict and persist results
  • Building a complete ML pipeline
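
The building blocks listed above can be sketched end to end as follows. This is one possible arrangement, assuming scikit-learn and joblib (the listing does not say which libraries the workshop uses). The dataset, step names, and parameter grid are chosen purely for illustration.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load: a small built-in dataset stands in for real project data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Clean, engineer features, and train, chained as one estimator.
pipeline = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# Hyperparameter optimization with cross-validation on the training split.
search = GridSearchCV(pipeline, {"model__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Validate on data the search never saw.
test_accuracy = search.score(X_test, y_test)

# Save the fitted pipeline, reload it, and predict with the reloaded copy.
model_path = os.path.join(tempfile.mkdtemp(), "iris_pipeline.joblib")
joblib.dump(search.best_estimator_, model_path)
reloaded = joblib.load(model_path)
predictions = reloaded.predict(X_test)
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the pre-processing steps inside the pipeline means they are re-fitted on each cross-validation fold, which avoids leaking information from validation data into training.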

Takeaways

  • Ability to use Python for basic data analysis tasks: read & write data, data cleaning, aggregations, handling missing data and outliers.
  • Advanced understanding of pandas, to transform datasets for analysis or modelling
  • Practical knowledge of performance strategies: caching and multiprocessing
  • Skills to prepare simple machine learning pipelines
  • Greater preparedness for GA’s Data Science course

Preparation

  • Laptop with a working install of Python and Jupyter Notebook
  • Understanding of Python’s data structures and control flow
  • Ability to use Python for basic data manipulation tasks
  • This workshop will be completely live and online. In preparation, please download Zoom and Slack.

Please note: This session will not be recorded; however, slides will be shared following the class.

About the Instructor

Conor Smyth
Lead Instructor, General Assembly

Conor is a research mathematician and data scientist with 7 years' experience applying statistical machine learning, artificial intelligence, and topological data analysis to large heterogeneous data sets. He is adept at communicating data science, machine learning, and other high-level mathematical concepts to both technical and non-technical/business audiences. He has diverse industry experience, including Sainsbury's Bank and Marks and Spencer. He holds a degree in mathematics and physics, and both a Master's and a PhD in pure mathematics.

Refund Policy

We understand that, sometimes, plans change. If you can no longer make it to a class or workshop, please email us at least 7 days before the scheduled event date. No refunds will be given for cancellations made within a week of the class or workshop.

Community Code of Conduct

Your registration for or attendance at any General Assembly offering indicates your agreement to abide by this Community Code of Conduct policy and its terms.
