Web Scraping with Python

Online Campus

Online
Anywhere
Online

About this workshop

Have you ever visited a website rich with the data you want, only to find no "download" button for retrieving it in a convenient way? Or have you ever found the perfect source for the CSV or other data files you need, but had to tediously click the "next" and "download" buttons dozens or even hundreds of times to get them?

In this course, you'll learn how to build Python web scraping programs that navigate a website and retrieve its data in a structured format. You'll learn how to use Python to automate data collection from sites that require logins, present data in tables, and more, making your job easier and more efficient. We'll also discuss the ethics surrounding these practices, so you understand when it's okay to use scraping and when you need to find an alternative route.
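As one small illustration of the "when is it okay" question, a site's robots.txt file states which paths its owner asks automated clients to avoid. The sketch below is not taken from the workshop materials; it uses Python's standard-library `urllib.robotparser` on a made-up robots.txt, and the `is_allowed` helper, the `workshop-bot` user agent, and the example.com URLs are all hypothetical. A robots.txt check is only one signal; a site's terms of service and the nature of the data matter too.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_ROBOTS_TXT, "workshop-bot", "https://example.com/data/page1"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "workshop-bot", "https://example.com/private/report"))
```

With this sample file, the first URL is permitted and the second is not, because it falls under the disallowed `/private/` path.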

Whether you work with data for personal, professional, or academic reasons, you'll walk away with a concrete new skill that helps you automate and streamline tasks.

Takeaways

  • Explore the ethical debate surrounding web scraping
  • Understand how web scraping works and why Python is an excellent tool to programmatically extract data from websites
  • Gain practice scraping web pages with Python using Requests, BeautifulSoup, and Selenium
  • Learn how to properly format and store the scraped data as a CSV
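To give a rough sense of the last two takeaways, here is a minimal sketch of turning an HTML table into CSV text. It is not the workshop's own code: to stay dependency-free it swaps in the standard-library `html.parser` where the workshop would use BeautifulSoup, and it parses a made-up HTML string rather than a page downloaded with Requests. The `TableParser` class, the `table_to_csv` function, and the sample data are hypothetical names chosen for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in html as CSV-formatted text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    # csv.writer quotes any field that contains a comma.
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Made-up table, standing in for a scraped page.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>30,000</td></tr>
  <tr><td>Shelbyville</td><td>25,000</td></tr>
</table>
"""

print(table_to_csv(SAMPLE_HTML))
```

Letting the `csv` module handle quoting, rather than joining fields with commas by hand, is what keeps a value such as `30,000` in a single column when the file is opened later.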

Prereqs & Preparation

This workshop is designed for students with a basic knowledge of Python or experience programming in another language. Anyone who has taken a Python workshop at GA will be well equipped for this course, but self-taught learners and anyone willing to follow along are welcome! Knowledge of basic HTML syntax will also be very useful, but it is not assumed.

What to bring to class: All students must bring their own laptops with an installation of Anaconda 3.6, a free distribution of Python that includes libraries of open source Python tools. In case of technical difficulties on your local computer, opening an account on Google Colaboratory, a cloud-based Python environment, is highly encouraged.
