Have you ever visited a website rich with the data you want, only to find it offers no "download" button for retrieving that data conveniently? Or have you ever found the perfect source for the CSV or other data files you need, but had to tediously click "next" and "download" dozens or even hundreds of times to get them?
In this course, you'll learn how to build Python web scraping programs that navigate a website programmatically and retrieve its data in a structured format. You'll learn how to use Python to automate data collection from sites that require logins, present data in tables, and more, making your job easier and more efficient. We'll also discuss the ethics surrounding these practices, so you understand when it's okay to use scraping and when you need to find an alternative route.
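As a small taste of what "retrieving data in a structured format" can look like, here is a minimal sketch that turns an HTML table into a list of Python dictionaries. It uses only Python's standard library and works on a made-up snippet of HTML rather than a live site. The sample data and the names `TableParser` and `table_to_records` are invented for illustration; the tools used in class may differ.

```python
from html.parser import HTMLParser

# Hypothetical page content, standing in for HTML fetched from a real site.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>30720</td></tr>
  <tr><td>Shelbyville</td><td>18442</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Return one dict per data row, keyed by the table's header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = table_to_records(SAMPLE_HTML)
for record in records:
    print(record)
```

Every value comes back as text, so a column like the population figures above would still need converting to numbers before analysis.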
Whether you work with data for personal, professional, or academic reasons, you'll walk away with a concrete new skill that helps you automate and streamline tasks.
This workshop is designed for students with a basic knowledge of Python, or experience programming in another language. Anyone who has taken a Python workshop at GA will be well-equipped for this course, but self-taught learners and anyone who is willing to follow along are welcome! Additionally, knowledge of basic HTML syntax will be incredibly useful, but it will not be assumed.
What to bring to class: All students must bring their own laptops with Anaconda (Python 3.6) installed. Anaconda is a free distribution of Python that includes libraries of open source Python tools. We also highly encourage opening an account on Google Colaboratory, a cloud-based Python environment, as a backup in case of technical difficulties on your own computer.