Python is an excellent programming language for data science, due to its rich set of data structures, the maturity of its data-focused libraries, and its ability to execute all aspects of a data science pipeline.
This workshop will help you become a well-rounded data scientist, balancing the data exploration skills needed for modelling with the engineering skills needed to build machine learning pipelines.
This class is also a recommended preparatory workshop for students planning to enroll in General Assembly’s Data Science course.
Working with real-world examples, you’ll gain:
Ability to use Python for basic data analysis tasks: reading and writing data, cleaning data, computing aggregations, and handling missing data and outliers
Advanced understanding of pandas for transforming datasets for analysis or modelling
Practical knowledge of performance strategies: caching and multiprocessing
Skills to prepare simple machine learning pipelines
Greater preparedness for GA’s Data Science course
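As a small taste of the caching strategy listed above, here is a minimal sketch using the standard library's `functools.lru_cache`. The function `slow_square` and the sample values are hypothetical stand-ins chosen for illustration; they are not taken from the workshop materials.

```python
from functools import lru_cache

# Counts how often the function body actually runs (illustration only).
call_count = 0


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Hypothetical stand-in for an expensive computation."""
    global call_count
    call_count += 1
    return n * n


# Repeated inputs are served from the cache instead of being recomputed.
values = [3, 4, 3, 3, 4]
results = [slow_square(v) for v in values]

# cache_info() reports how many calls hit the cache versus missed it.
stats = slow_square.cache_info()
```

Caching of this kind only pays off when the same inputs recur and the function has no side effects that callers depend on; the counter here exists purely to make the saved work visible.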
You’ll need:
Laptop with a working install of Python and Jupyter Notebook
Understanding of Python’s data structures and control flow
Ability to use Python for basic data manipulation tasks
Please note: This session will not be recorded; however, slides will be shared following the class.
Jireh is a data scientist with a passion for applied statistics. He has a decade of experience working in tech, running the gamut from seed-stage startups to publicly traded companies. His favorite Python abstraction is the iterable and his favorite Python keyword is ‘yield’. In his spare time, Jireh enjoys bouldering and eating very spicy food.
Your registration for or attendance at any General Assembly offering indicates your agreement to abide by this Community Code of Conduct policy and its terms.