11-week technology course

Learn Core Skills

Math and programming skills

Applying your math and programming skills to make meaning out of large data sets

Data manipulation tools

Learning how to analyze and manipulate data with Python

Learn to make predictions with modeling

Learning how to make predictions about data using fundamental modeling techniques that will help you make better-informed business decisions

Meet the team who's here for you

Our educational excellence is a community effort. When you learn at GA, you can always rely on an in-house team of experts to provide guidance and support, whenever you need it.

  • Your Instructor Is An Experienced Professional

    Learn industry-grade frameworks, tools, vocabulary, and best practices from a teacher whose daily work involves using them expertly.

  • Your TAs Are Always Available For Extra Help

    Taking on new material isn't always easy. Through office hours and other channels, our TAs are here to provide you with answers, tips, and more.

  • Your Course Producer Puts Your Success First

    Our alumni love their Course Producers. Yours will be in constant touch to check in one-on-one and keep you motivated throughout the course.

My team at Amazon couldn't have built its recommendation system without the foundational data mining and machine learning skills taught in this course. When contributing to the curriculum, I was careful to balance the theory with the real-world challenges of applying it to big data.

Frank Kane
Former Senior Manager,

Embrace The Details

Unit 1: The Basics

Introduction to Data Exploration

  • Describe the data mining workflow and the key traits of a successful data scientist.
  • Extract, format, and preprocess data using UNIX command-line tools.
  • Explore & visualize data.

Introduction to Machine Learning

  • Explain the concepts and applications of supervised & unsupervised learning techniques.
  • Describe categorical and continuous feature spaces, including examples and techniques for each.
  • Discuss the purpose of machine learning and the interpretation of predictive modeling results.

Unit 2: Fundamental Modeling Techniques

K-Nearest Neighbors Classification

  • Describe the setting and goal of a classification task.
  • Minimize prediction error using training & test sets, and optimize predictive performance using cross-validation.
  • Understand the kNN classification algorithm, its intuition and implementation.
  • Implement the "hello world" of machine learning (kNN classification of iris dataset).
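
For a flavor of the kNN exercise, here is a minimal sketch in plain Python. It is not the course's own material: the function name `knn_predict` and the handful of measurements are made up for illustration, and the real iris dataset has 150 samples with four features each. The idea is simply to find the k closest training points and let them vote.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and let them vote.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up (petal length, petal width) values in cm, for illustration only.
points = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
labels = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

print(knn_predict(points, labels, (1.6, 0.25)))
print(knn_predict(points, labels, (4.6, 1.45)))
```

In practice you would hold out a test set, or use cross-validation, to choose k rather than fixing it at 3.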

Naive Bayes Classification

  • Outline the basic principles of probability, including conditional probability and Bayes’ theorem.
  • Describe inference in the Bayesian setting, including the prior and posterior distributions and the likelihood function.
  • Understand the naive Bayes classifier and its assumptions.
  • Implement a spam filter using the naive Bayes technique.
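
As a rough illustration of the spam filter idea, the sketch below trains a word-count naive Bayes classifier on four invented messages. The function names, the "spam"/"ham" labels, and the training data are assumptions for this example, and add-one (Laplace) smoothing is one common choice rather than the only one.

```python
from collections import Counter
import math

def train_naive_bayes(messages):
    """messages: list of (text, label). Returns per-label word and message counts."""
    word_counts = {}          # label -> Counter of word occurrences
    label_counts = Counter()  # label -> number of training messages
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label with the highest log posterior, using add-one smoothing."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Log prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Log likelihood of each word, treated as independent (the "naive" part).
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training messages, for illustration only.
training = [
    ("win cash prize now", "spam"),
    ("claim your free prize", "spam"),
    ("lunch meeting at noon", "ham"),
    ("project meeting notes attached", "ham"),
]
word_counts, label_counts = train_naive_bayes(training)
print(classify("free cash prize", word_counts, label_counts))
print(classify("meeting notes", word_counts, label_counts))
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on longer messages.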

Regression & Regularization

  • Explain the concepts of regression models, including their assumptions and applications.
  • Discuss the motivation for regularization techniques and their use.
  • Implement a regularized fit.
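
To make "a regularized fit" concrete, here is a small sketch of ridge regression with a single feature, where the closed-form solution fits in a few lines. The function name `ridge_fit`, the penalty strength `alpha`, and the data points are assumptions for illustration. The intercept is left unpenalized, which is a common convention.

```python
def ridge_fit(xs, ys, alpha=1.0):
    """Fit y = slope * x + intercept, penalizing slope**2 with weight alpha."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of products and squares.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The penalty adds alpha to the denominator, shrinking the slope toward zero.
    slope = sxy / (sxx + alpha)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data that roughly follows y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

ols_slope, _ = ridge_fit(xs, ys, alpha=0.0)     # alpha = 0 is ordinary least squares
ridge_slope, _ = ridge_fit(xs, ys, alpha=10.0)  # a heavy penalty shrinks the slope
print(ols_slope, ridge_slope)
```

The larger the value of `alpha`, the more the fit trades accuracy on the training data for a smaller, more stable coefficient.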

Logistic Regression

  • Describe the applications of logistic regression to classification problems and probability estimation.
  • Introduce the concepts underlying logistic regression, including its relation to other regression models.
  • Predict the probability of a user action on a website using logistic regression.
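
The sketch below shows one way such a prediction could look: a one-feature logistic regression fitted by batch gradient descent. The scenario (minutes on a site versus whether the visitor purchased), the data, and the names `fit_logistic`, `learning_rate`, and `epochs` are all assumptions made up for this example.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Invented data: minutes spent on the site, and whether the visitor purchased (1) or not (0).
minutes = [1, 2, 3, 4, 5, 6, 7, 8]
purchased = [0, 0, 0, 1, 0, 1, 1, 1]

weight, bias = fit_logistic(minutes, purchased)
print(sigmoid(weight * 1 + bias))  # estimated purchase probability after 1 minute
print(sigmoid(weight * 8 + bias))  # estimated purchase probability after 8 minutes
```

Unlike a hard classifier, the model returns a probability, so you can pick a decision threshold that suits the business problem.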

K-Means Clustering

  • Explain the purpose of exploratory data analysis, its applications in continuous and categorical feature spaces, and the interpretation and use of clustering results.
  • Discuss the importance of the distance function in cluster formation, as well as the importance of scale normalization.
  • Implement a k-means clustering algorithm.
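
A bare-bones version of k-means (Lloyd's algorithm) might look like the sketch below. The function name, the fixed iteration count, and the hand-picked starting centroids are assumptions that keep the example deterministic. Real implementations usually choose starting centroids randomly and stop once assignments no longer change.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and recomputing centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Made-up 2-D points forming two visible groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
final_centroids, final_clusters = k_means(points, [(1, 1), (9, 9)])
print(final_centroids)
```

Because the assignment step depends entirely on Euclidean distance, features measured on very different scales should be normalized first, or the largest-scale feature will dominate the clusters.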

Unit 3: Further Modeling Techniques

Ensemble Techniques

  • Describe general ensemble techniques such as bagging and boosting.
  • Build an enhanced classification algorithm using AdaBoost.

Decision Trees & Random Forests

  • Describe the use and construction of decision trees for classification tasks.
  • Create a random forest model for ensemble classification.

Dimensionality Reduction

  • Explain the practical and conceptual difficulties in working with very high-dimensional data.
  • Understand the application and use of dimensionality reduction techniques.
  • Draw inferences from high-dimensional datasets using principal components analysis.

Recommendation Systems

  • Explain the use of recommendation systems, and discuss several familiar examples.
  • Understand the underlying concepts, including collaborative & content-based filtering.
  • Implement a recommendation system.
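
To illustrate collaborative filtering, the sketch below recommends items by weighting other users' ratings by how similar those users are to the target. It is only one simple, user-based approach. The user names, book titles, ratings, and function names are invented for this example, and missing ratings are treated as zero when computing cosine similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts, treating unrated items as 0."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = math.sqrt(sum(r ** 2 for r in a.values()))
    norm_b = math.sqrt(sum(r ** 2 for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(target, ratings, top_n=1):
    """Rank items the target has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        similarity = cosine_similarity(ratings[target], other_ratings)
        if similarity <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
                weights[item] = weights.get(item, 0.0) + similarity
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# Invented ratings on a 1-5 scale.
ratings = {
    "ana": {"book_a": 5, "book_b": 4},
    "ben": {"book_a": 5, "book_b": 5, "book_c": 5, "book_d": 1},
    "cat": {"book_a": 1, "book_c": 2, "book_d": 5},
}
print(recommend("ana", ratings))
```

Here "ana" rates books much like "ben" does, so ben's favorite unseen title ranks first. A content-based filter would instead compare attributes of the items themselves.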

Unit 4: Other Tools

Database Technologies

  • Introduce concepts and use of relational databases, alternative database technologies such as NoSQL, and popular examples of each.

Network Analysis

  • Describe the use of graphs and graph theory to analyze problems in network analysis.
  • Explore network visualization.

Parallel Computing & Map-Reduce

  • Describe the concepts of parallel computing and applications to problems in big data.
  • Introduce the map-reduce framework.
  • Implement and explore examples of map-reduce tasks.
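
The classic first map-reduce task is a word count. The sketch below imitates the three phases (map, shuffle, reduce) in a single Python process. The function names and sample documents are assumptions for illustration, and a real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(mapped_pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)

# Made-up input documents.
documents = ["big data needs big tools", "data tools for data science"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle_phase(mapped).items())
print(counts)
```

Each call to `map_phase` depends on one document only, and each call to `reduce_phase` on one key only, which is what lets the work be split across machines.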

I learned more from the hands-on methods at General Assembly than I did during my entire three years in law school. GA's courses allow students to focus on the ideas that are most interesting to them and that will help propel their careers forward.

Whitney Meers, Digital Content Specialist

Upcoming in 

Sep 22 – Dec 8

Tue & Thu

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor info will be announced soon

Apply now to reserve your spot.

Nov 30 – Feb 15

Mon & Wed

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor info will be announced soon

Apply now to reserve your spot.

Financing Options

Need payment assistance? Our financing options allow you to focus on your goals instead of the barriers that keep you from reaching them.

Let us figure out the best option for you.

¹Must be a US citizen; approval pending state of residency.

Financing options differ in each market. Contact a local admissions officer for more info.

Join an info session

We’re holding an info session on Monday, September 21, at 6:30pm EDT.

  • Meet the GA team and potential classmates
  • Get an overview of the curriculum and course objectives
  • Get your questions answered about the course and GA
  • Discover the perks of being a student at GA

GA Washington D.C. (1776 8th Floor)

1133 15th Street NW, 8th Floor
Washington, DC 20005

By providing us with your email, you agree to the terms of our Privacy Policy and Terms of Service.

Get Answers

We love questions almost as much as we love providing answers. Here is a sampling of what we’re typically asked, along with our responses:

Q: Why is this course relevant today?

Technology is everywhere, and the online world holds vast amounts of data about users, products, and the content we generate. Businesses could make far better-informed decisions if this data were analyzed more deeply using data science. The data science course provides the tools, methods, and practical experience you need to make accurate predictions about data, which ultimately leads to better business decisions and smarter technology (think recommendation systems or targeted ads).

Q: What practical skill sets can I expect to have upon completion of the course?

This course will provide you with technical skills in machine learning, algorithms, and data modeling, which will allow you to make accurate predictions about your data. You will create your models in Python, so you will gain a good grasp of this programming language. You will also learn how to parse and clean your data, which can take up to 70% of your time as a data scientist.

Q: Who will I be sitting next to in this course?

Individuals who have a strong interest in manipulating large data sets, finding patterns in data, and making predictions.

Software developers who want to solve problems that involve large data sets, such as predicting user behavior on their website, supporting decisions, or finding the best way to classify content.

Individuals with a good grasp of programming and a solid knowledge of statistics and probability who have not yet worked at the intersection of the two.

Q: Are there any prerequisites?

  • A good grasp of college-level statistics and probability.
  • Ability to program in a scripting language such as Python or R.

Ask Admissions

Call us at +1 (202) 517-1777
or email us.

Never Stop Learning

Join our community and receive half off any of our offerings priced below $100. (First-time subscribers only; excludes full-time courses, part-time courses, and Circuits).
