Data Science

11-week technology course


Learn Core Skills

Math and programming skills

Applying your math and programming skills to make meaning out of large data sets

Data manipulation tools

Learning how to analyze and manipulate data with Python

Learn to make predictions with modeling

Learning how to make predictions about data using fundamental modeling techniques that will help you make better-informed business decisions

My team at Amazon couldn't have built its recommendation system without the foundational data mining and machine learning skills taught in this course. When contributing to the curriculum, I was careful to balance the theory with the real-world challenges of applying it to big data.

Frank Kane
Former Senior Manager, Amazon.com


Embrace The Details

Unit 1: The Basics

Introduction to Data Exploration

  • Describe the data mining workflow and the key traits of a successful data scientist.
  • Extract, format, and preprocess data using UNIX command-line tools.
  • Explore & visualize data.

Introduction to Machine Learning

  • Explain the concepts and applications of supervised & unsupervised learning techniques.
  • Describe categorical and continuous feature spaces, including examples and techniques for each.
  • Discuss the purpose of machine learning and the interpretation of predictive modeling results.

Unit 2: Fundamental Modeling Techniques

K-Nearest Neighbors Classification

  • Describe the setting and goal of a classification task.
  • Minimize prediction error using training & test sets, and optimize predictive performance using cross-validation.
  • Understand the kNN classification algorithm, its intuition and implementation.
  • Implement the "hello world" of machine learning (kNN classification of the iris dataset).
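
To give a flavor of the kNN exercise, here is a minimal sketch in plain Python. It is an illustration only, not the course's own material: the `TRAIN` measurements are made-up values in the spirit of the iris dataset, and the helper names `knn_predict` and `holdout_accuracy` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data in the spirit of the iris dataset:
# (petal length cm, petal width cm) -> species. Values are made up.
TRAIN = [
    ((1.4, 0.2), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.5, 1.5), "versicolor"),
    ((4.1, 1.3), "versicolor"),
    ((4.7, 1.4), "versicolor"),
    ((5.9, 2.1), "virginica"),
    ((5.6, 2.4), "virginica"),
    ((6.1, 2.3), "virginica"),
]


def knn_predict(train, point, k=3):
    """Return the majority label among the k training points nearest to `point`."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def holdout_accuracy(train, test, k=3):
    """Fraction of held-out points whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)
```

Calling `knn_predict(TRAIN, (4.4, 1.4))` classifies a new flower by letting its three nearest neighbors vote, and `holdout_accuracy` scores predictions on points that were kept out of the training set.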

Naive Bayes Classification

  • Outline the basic principles of probability, including conditional probability and Bayes’ theorem.
  • Describe inference in the Bayesian setting, including the prior and posterior distributions and the likelihood function.
  • Understand the naive Bayes classifier and its assumptions.
  • Implement a spam filter using the naive Bayes technique.

Regression & Regularization

  • Explain the concepts of regression models, including their assumptions and applications.
  • Discuss the motivation for regularization techniques and their use.
  • Implement a regularized fit.

Logistic Regression

  • Describe the applications of logistic regression to classification problems and probability estimation.
  • Introduce the concepts underlying logistic regression, including its relation to other regression models.
  • Predict the probability of a user action on a website using logistic regression.

K-Means Clustering

  • Explain the purpose of exploratory data analysis, its applications in continuous and categorical feature spaces, and the interpretation and use of clustering results.
  • Discuss the importance of the distance function in cluster formation, as well as the importance of scale normalization.
  • Implement a k-means clustering algorithm.
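
The points above about distance and scale can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions, not the course's implementation: it uses Euclidean distance, min-max scaling as the normalization, and seeds the centroids with the first k points so the result is deterministic. The function names `min_max_scale` and `k_means` are hypothetical.

```python
import math


def min_max_scale(points):
    """Rescale each feature to [0, 1] so no single feature dominates the distance."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]


def k_means(points, k, iterations=20):
    """Lloyd's algorithm with Euclidean distance.

    Assumption for this sketch: the first k points seed the centroids,
    which keeps the run deterministic but is not a recommended default.
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids
```

Without the scaling step, a feature measured in thousands would swamp one measured in single digits, and the clusters would effectively be formed on that one feature alone.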

Unit 3: Further Modeling Techniques

Ensemble Techniques

  • Describe general ensemble techniques such as bagging and boosting.
  • Build an enhanced classification algorithm using AdaBoost.

Decision Trees & Random Forests

  • Describe the use and construction of decision trees for classification tasks.
  • Create a random forest model for ensemble classification.

Dimensionality Reduction

  • Explain the practical and conceptual difficulties in working with very high-dimensional data.
  • Understand the application and use of dimensionality reduction techniques.
  • Draw inferences from high-dimensional datasets using principal components analysis.

Recommendation Systems

  • Explain the use of recommendation systems, and discuss several familiar examples.
  • Understand the underlying concepts, including collaborative & content-based filtering.
  • Implement a recommendation system.

Unit 4: Other Tools

Database Technologies

  • Introduce the concepts and use of relational databases, alternative database technologies such as NoSQL, and popular examples of each.

Network Analysis

  • Describe the use of graphs and graph theory to analyze problems in network analysis.
  • Explore network visualization.

Map-Reduce

  • Describe the concepts of parallel computing and applications to problems in big data.
  • Introduce the map-reduce framework.
  • Implement and explore examples of map-reduce tasks.
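
As a rough picture of the map-reduce pattern, here is a word count written in plain Python. It is a single-process sketch only: the function names are hypothetical, and a real framework would run the map and reduce calls in parallel across many machines rather than in one loop.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine each key's values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))
```

Because each document is mapped independently and each key is reduced independently, both phases can be split across workers, which is what makes the pattern useful for big data.
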
I learned more from the hands-on methods at General Assembly than I did during my entire three years in law school. GA's courses allow students to focus on the ideas that are most interesting to them and that will help propel their careers forward.

Whitney Meers, Digital Content Specialist


Upcoming in 


Apr 28 – Jul 7


Tue, Thu

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor

Arun Ahuja
Data Scientist, Mount Sinai

Arun Ahuja recently joined Mount Sinai to research the applications of Hadoop, data analysis, and machine learning in medicine and genetics. Prior to that, Arun was a Data Scientist at Integral Ad Science, where he investigated advertiser effectiveness and webpage classification for optimizing ad placement using machine learning. He was also in the Electronic Trading and Real Time Applications group at Morgan Stanley.



Apr 28 – Jul 9


Tue, Thu

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor info will be announced soon

Apply now to reserve your spot.



May 5 – Jul 16


Tue, Thu

6pm - 9pm

£3,000 GBP

Payment plans available

Instructor

Dr. Jesús Rogel-Salazar
Principal Data Scientist, Prudential Assurance

Curiosity and enquiry have helped Jesús follow a winding path through physics, chemistry, finance, analytics and even data journalism. He obtained a PhD at Imperial College for work on quantum atom optics and has worked in academic and research posts for Imperial College London and the University of Hertfordshire, as well as for organisations such as Dow Jones and Prudential. He has published a number of academic papers and is the author of a beginner's programming book, "Essential Matlab and Octave". When not working, you can find Jesús tuned into sci-fi films (good and bad), watching rugby or writing bits and pieces for his blog http://quantumtunnel.wordpress.com.



Jun 1 – Aug 12


Mon, Wed

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor

Kevin Markham
Co-Founder, Causetown

Kevin Markham has spent his career building technology for social good, ranging from an innovative online bookstore for low-income children to a donation platform connecting companies with excess goods and the charities that need them. Most recently, he co-founded Causetown to enable small businesses to easily launch charitable promotions that drive sales while doing good. Kevin is passionate about learning from data and is excited to introduce students to the fundamentals of data science. He enjoys cooking, theatre, cycling, and participating in machine learning competitions. He graduated from Vanderbilt University with a Bachelor's degree in Computer Engineering.


SANTA MONICA

Jun 1 – Aug 12


Mon, Wed

7pm - 10pm

$4,000 USD

Payment plans available

Instructor info will be announced soon

Apply now to reserve your spot.


WEDNESDAYS 7PM-10PM

Jun 6 – Aug 29


Sat

10am - 1pm

$28,000 HKD

Payment plans available

Instructor

Mart van de Ven
Data Architect, Technologist

For over a decade, Mart has used web technologies to build services with a purpose. Following his Master's in Linguistics and Information Design, he developed ontologies for SAP, built social media apps with Onoko, and most recently was a Data Architect at Demand Analytics. He teaches technologies as he does linguistics: with an orientation towards uncovering the implicit structures and contexts, surpassing the technicalities, and ultimately communicating a greater purpose, above mere technical skills. He is an advocate for open source, social coding and web standards, especially given how they enrich and accelerate your web-dev learning process.



Jun 9 – Aug 20


Tue, Thu

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor info will be announced soon

Apply now to reserve your spot.



Jun 15 – Aug 26


Mon, Wed

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor info will be announced soon

Apply now to reserve your spot.



Aug 18 – Oct 29


Tue, Thu

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor info will be announced soon

Apply now to reserve your spot.



Sep 8 – Nov 19


Tue, Thu

6:30pm - 9:30pm

$4,000 USD

Payment plans available

Instructor info will be announced soon

Apply now to reserve your spot.



Fund Your Future

Need payment assistance? Our financing options allow you to focus on your goals instead of the barriers that keep you from reaching them.

¹Must be a US citizen; approval pending state of residency.

Financing options differ in each market. Contact a local admissions officer for more info.

Get Answers

We love questions almost as much as we love providing answers. Here is a sampling of what we’re typically asked, along with our responses:

Q: Why is this course relevant today?

Given the prevalence of technology and the amount of data available online about users, products, and the content we generate, businesses could make far better-informed decisions if this vast amount of data were analyzed more deeply using data science. The data science course provides the tools, methods, and practical experience to enable you to make accurate predictions about data, which ultimately lead to better decision-making in business and to smarter technology (think recommendation systems or targeted ads).

Q: What practical skill sets can I expect to have upon completion of the course?

This course will provide you with technical skills in machine learning, algorithms, and data modeling, which will allow you to make accurate predictions about your data. You will create your models using Python, so you will gain a good grasp of this programming language. You will also learn how to parse and clean your data, which can take up to 70% of your time as a data scientist.

Q: Who will I be sitting next to in this course?

Individuals who have a strong interest in manipulating large data sets, finding patterns in data, and making predictions.

Software developers who want to solve problems that involve large data sets, such as predicting user behavior on their website, making decisions, or finding the best way to classify content.

Individuals with a good grasp of programming and a solid knowledge of statistics and probability who are missing the intersection of the two.

Q: Are there any prerequisites?

  • A good grasp of college-level statistics and probability.
  • Ability to program in a scripting language such as Python or R.


Ask Admissions

Call us at +44 (0) 20 3818 3251
or email us.

Call us at +1 (877) 348-5665
or email us.

Call us at 1300 779 580
or email us.

Call us at 852 9473-3608
or email us.

Call us at 1300 779 650
or email us.
