Data Category Archives - General Assembly Blog | Page 3

7 Tips to Learn Tableau Fast

Featuring Insights From Iun Chen & Vish Srivastava

Read: 2 Minutes

Let’s get it straight: How difficult is it to learn Tableau for a complete beginner? Are there shortcuts to learning Tableau? Any tips, tricks, or time-saving work-arounds? Thankfully, the answer is yes. Try these top tips, approved by our expert instructors, and start data viz now.

“It’s a little overwhelming at first but as soon as you understand the basics, like what are dimensions and measures, everything falls into place pretty quickly,” says Vish Srivastava, product leader at Evidation Health and GA instructor.

“In essence, you need to understand two things: The basics on how data works — for example, what are common formats of data and what is a primary key? And a basic understanding of data visualization in a business setting. Can you answer the question: When is a time series vs. a pie chart valuable for decision making?”

But can you really learn the basics of Tableau in an afternoon?

“The best way to learn is to download a sample dataset and dive right in and start creating data visualizations. To keep going from there, check out various portfolios online to get inspiration, and try to build those.”

According to Iun Chen, who conducts internal Tableau training at LinkedIn, Tableau is easy to learn, but hard to master.

“The basic concepts of charting and color theory are easy to pick up and can take just a few weeks. However, if you are looking to be a subject matter expert, this can take years to perfect,” she says. 

Chen preps students in her Intro to Data Analytics course to achieve close-to-mastery in these key areas. Can they:

  1. Quickly prep and analyze large volumes of data?
  2. Identify key information and determine the best visual method to present it?
  3. Take business questions and determine which visualizations to use?
  4. Translate raw datasets to storylines with a beginning, middle, and end?
  5. Format charts, graphs, titles, text, and images for a polished deliverable?
  6. Articulate best practices on design and visualization techniques?
  7. Provide feedback on ineffective visualizations and how to improve them?

This checklist is the closest thing to a Tableau cheat sheet you’ll find. Prioritize these skills, and you’ll waste no time learning Tableau. Now that you know what you need to succeed, you can choose whether to take our Data Analytics course fast or slow. Learn Tableau, along with the data analytics tools SQL and Excel, in a 1-week accelerated format or over 10 weeks in the evening.

Chen sums it up perfectly: “As long as you are actively learning, applying your learnings, and ensuring innovation of your work, you will be a data visualization expert in no time.”

Want to learn more about Iun?
https://www.linkedin.com/in/iunchen

Want to learn more about Vish?
https://www.linkedin.com/in/vishrutps

Top 3 Reasons To Learn Tableau

Featuring Insights From GA instructor Candace Pereira-Roberts

Read: 2 Minutes

Do you communicate data? Do you want to create more effective data visualizations? Tableau is the data analytics tool you’re looking for. Here are the top three reasons why you should learn how to use Tableau, the popular data viz software focused on business intelligence. Read on for the advantages of being a Tableau professional.

#1 Tableau Is Easy

Data can be complicated. Tableau makes it easy. Tableau is a data visualization tool that takes data and presents it in a user-friendly format of charts and graphs. And here’s the best part: There is no code writing required. You’ll easily master the end-to-end cycle of data analytics.


Need to showcase trends or surface findings? Tableau will make you an expert. Proficiency in business intelligence is a transferable skill that is quickly becoming the lifeblood of organizations. 

“I see students who are new to analytics learn Tableau desktop and be able to develop Tableau worksheets, interactive dashboards, and story points in a couple of weeks — essentially a complete data analysis project,” says Candace Pereira-Roberts, FinServ data engineer and one of our Data Analytics course instructors. She adds, “I like to share knowledge and watch people grow. I learn from my students as well.” 

#2 Tableau Is Tremendously Useful

Would you rather tell visual stories with data? Or present the same old boring reports and tables? Is that even a question?

“Anyone who works in data should learn tools that help tell data stories with quality visual analytics.” Full stop.

Savvy data analysts, data scientists, and data engineers were quick to adopt Tableau, and it has given those roles a key competitive advantage in the recent data-related hiring frenzy. But their secret is out. And the advantages go beyond the usual tech roles. Having a working knowledge of data, and specifically knowing how to use Tableau, can help many more tech professionals become more attractive to recruiters and hiring managers.

Plus, it has a built-in career boost. Tableau’s visualizations are so elegant, you’ll be confident presenting the business intelligence and actionable insights to key stakeholders. Improving your presentation skills is par for the course.

#3 Tableau Data Analysts Are in Demand

As more and more businesses discover the value of data, the demand for analysts is growing. One advantage of Tableau is that it is so visually pleasing and easy for busy executives — and even the tech-averse — to use and understand. Tableau presents complicated and sophisticated data in a simple visualization format. In other words, CEOs love it.

Think of Tableau as your secret weapon. Once you learn it, you can easily surface critical information to stakeholders in a visually compelling format. That will make you a rockstar in any organization. 

“Tableau helps organizations leverage business intelligence to become more data-driven in their decision-making process,” Pereira-Roberts says. She recommends participating in Makeover Monday to take your skills to an even higher level.


Want to learn more about Candace? Check out her thoughts on how to become a business intelligence analyst, or connect with her on LinkedIn.

What Is Data Visualization?

An Interview With Iun Chen

Read: 4 Minutes

Data is big, and it’s getting bigger. How do you parse and understand data when the sheer amount of information can be overwhelming? The answer is data visualization. The discipline of data visualization, or data viz, is essentially the graphic representation of data, drawing on design theory concepts such as color and layout. We called on one of our data viz experts, Iun Chen, to break it down further.

Let’s start with an introduction and how you came to the world of data viz.

IC: I’m Iun (pronounced ‘yoon’), and I work in the data analytics space focusing on business intelligence tools and building scalable resources for LinkedIn. I also teach the 10-week Intro to Data Analytics course for GA, which includes the professional skills of SQL, Tableau, and Excel.

In college, I was a business major with a specialization in marketing and advertising. I became more interested in how the ad business model worked behind the scenes and in how software and systems worked. As a result, I worked at many major media companies in a quantitative capacity — revenue planning, ad pricing, finance, ad sales strategy. That led me into a formalized analytics route.

How do you define data visualization?

IC: Data visualization is the idea of communicating information graphically. It’s the science of information design, in which you take massive amounts of data in whatever format it comes in and use it to surface high-level insights and findings in a visually compelling way so audiences can easily understand the main points.

How does data visualization differ from data analytics?

IC: Data analytics is the process of cleaning, prepping, analyzing, and presenting data. Data visualization is part of the presenting data step and is defined as the act of visually organizing data through the use of charts, graphs, and dashboards. Concepts of data visualization are closely aligned with concepts of design theory: color, font, scale, layout, organization.

Why is data viz important?

IC: Data visualization is easy to learn but hard to master. In my classes, I heavily emphasize the design element of data visualization. It’s easy to whip together a quick bar or pie chart, but is it the best way to communicate the point you are trying to make? The goal of collecting mass amounts of data is to be able to quickly translate it into insights that can help make smart business decisions. The final form of this translation is often a chart or graph, which is why the ability to design and visualize these mass amounts of data grows as we collect more of it.

What is a data narrative?

IC: People think in stories and narratives, not in black and white figures. Just like you would share a story with a friend using a beginning, middle, and endpoint, you would do the same when sharing details about data analysis. Here’s a simple example.

  1. Beginning: Sales are down year-over-year; identify the symptoms.
  2. Middle: Furniture sales — our largest segment — are doing poorly in the last six months; conduct the analysis to investigate reasons and uncover root causes.
  3. End: Review retail store reports and conduct manufacturer visits; recommend next steps.

The key point of any data narrative is that it should present a compelling business case and surface unrealized insights to the audience. The business challenges, rationale, and next steps should be clearly presented, and people in the room should be able to walk away knowing what action to take.

Which tech roles use data visualization?

Data visualization, like data analytics, is a skill set that can be applied to any job. But if you are looking for a job that has data visualization skills as part of the function and responsibilities, look for roles like business analyst, data analyst, business intelligence analyst, data scientist, and data engineer. Keep in mind that the formal skill of data visualization is still relatively new, so depending on the maturity of the company, those functions may not be fully established yet. However, with the increase of data in the world, there’s a growing need for experts who understand data visualization techniques.

Check out this Medium post which details how Spotify’s business has evolved with the creation of their data visualization roles.

What’s the future of data visualization?

As we continue to collect more and more data, the need for people with the skills to analyze and present data becomes ever-growing and critical in the workplace environment. More companies will need to generate insights quickly to keep up with advances and competition in their respective industries. The skill of data visualization will become more and more attractive as teams and organizations seek to translate their data into insights more efficiently and effectively. The ability to work with data is increasingly critical to the success of any company in any job function. 

Iun Chen’s Recommended Data Viz Reading List

FlowingData

StorytellingWithData

InformationIsBeautiful

Tableau Public Gallery

New York Times Data Journalism

The WSJ Guide to Information Graphics

Storytelling with Data: A Data Visualization Guide for Business Professionals 

Good Charts: The HBR Guide to Making Smarter, More Persuasive Data Visualizations

Edward Tufte’s The Visual Display of Quantitative Information

Want to learn more about Iun?
https://www.linkedin.com/in/iunchen

Business Analytics Vs Data Analytics: What’s the Difference?

Featuring Insights From Iun Chen & Vish Srivastava

Read: 4 Minutes

Data analytics and business analytics are often confused, understandably, because both data analysts and business analysts work with data. What matters — and differentiates these two roles — is what the data is intended to do.

When comparing the roles of business analyst and data analyst, one must consider the audience. Who will be taking action based on the analyses?

Business analysts use data to improve business metrics.

Business analysts work directly with stakeholders to steer company objectives and keep the business on a successful path. They set and maintain key performance indicators for the organization. A business analyst may recommend strategies or business plans to executives, sometimes when a company is at a critical juncture, say quarterly or during a turnaround. Stakes can be high, but so can the rewards. (Think McKinsey analysts or other coveted consultancy jobs.) Business analysts are more likely to use presentation skills as they’ll need to present findings to executives and give recommendations in high-level meetings. 

Data analysts collect, extract, and analyze data.

Data analysts are more technically focused. They are responsible for getting the data and analyzing it, working with datasets and tables. For example, a data analyst at an eCommerce company may analyze customer information, aggregate email marketing lists, or use data to identify demographics for new customer acquisition plans. Data analysts are more likely to work in teams alongside marketing partners or with other technology roles such as programmers or product managers, depending on the size of the company. They also work with business partners across entire organizations, including business analysts, as needed for tasks and projects. 

Different roles mean different salaries.

Both business analysts and data analysts solve business problems. As such, they are in high demand. According to Glassdoor, the average salary for a data analyst in the U.S. is $72K. Compensation for business analysts is a bit more, averaging $79K. Of course, exact amounts depend on location and will vary from country to country. While a business analyst can command a higher salary, there is wider latitude for data analysts to carve out their niche in practically any industry. Since the function of data is increasingly integral to every enterprise, there is more flexibility for data analysts to dig into areas of the business where they can make the most difference, with more potential for creativity.  

In GA’s Intro to Data Analytics course, Iun Chen teaches SQL, Tableau, and Excel, business intelligence tools she uses in her professional role as a data analyst at LinkedIn.

“My formal job function is to build data tools for internal colleagues so they can successfully grow our business,” she says. “I create dashboards, reports, and anything else to ensure revenue keeps going up and anticipated risks go down for the company. In my experience, the skill set and mindset of the individual can define the role of a data analyst in any organization, large or small. Everyone uses data in their day to day so being able to clean, prep, analyze, and report data — regardless of what your actual job title is — is critical to not only the company’s success but your personal success as well.”

Both business analysts and data analysts are storytellers. 

Whether a business analyst’s more strategic and decision-making role is for you, or the technical, numbers-crunching, team-playing data analyst sounds more your speed, know that the two roles share one crucial skill: They use data to tell stories. Those stories lend insights that factor into decisions that affect the bottom line. Translating raw data into digestible and human narratives can be one of the most challenging skills for analysts to master, according to Vish Srivastava, who’s led multidisciplinary teams across tech sectors. So how does an analyst develop this multifaceted skill and set their career on the path for success?

“My recommendation is twofold,” he says. “One, always start your analysis with a hypothesis that you’re testing. You need to know right out of the gate why your analysis is going to matter. Two, after you’ve spent some time with your data, step away and write down your presentation storyline in three to five bullets. The final bullet should be your recommended next step. Of course, make sure you have the analysis and charts to back up your storyline and fill in the gaps as needed.”

When it comes to storytelling with data, the difference between a boring story and a compelling one can come down to data visualization. The tools at your disposal and your proficiency with them can make or break a presentation. Communicating the insights for business intelligence hinges on clear and impactful data viz, whether we’re talking business analytics or data analytics.

One classic example of data visualization’s power is the cholera map by John Snow, an early pioneer of disease mapping. “This is a beautiful example of how collecting data and visually presenting it can generate amazing insight,” says Srivastava. “In this case, the insight was that the sewer systems were spreading disease. This informed public policy and saved so many lives.”

The future of business intelligence will be determined by the democratization of data.

The prevalence of data and its part in tech careers is changing. To hear Srivastava tell it, future conversations on business intelligence will center less on the specificities of data analysis vs. business analysis and more on how data is creeping into even more roles.

“We’ve come a long way, but there is still far to go for data analysis skills to be deeply embedded in all functions across a company. In the future, I think we will see fewer dedicated teams for business analysis and data analysis; instead, all professionals will have these skills and utilize them daily. This democratization of data analysis will be incredibly powerful. It will create even more emphasis on making high-quality data available across every enterprise.”

Want to learn more about Iun?
https://www.linkedin.com/in/iunchen

Want to learn more about Vish?
https://www.linkedin.com/in/vishrutps

Tableau vs. Power BI

Featuring Insights From Matt Brems

Read: 2 Minutes

Tableau and Power BI are powerful tools for business intelligence, with capabilities to take loads of big data and create elegant visualizations that convey key insights to stakeholders in easily digestible presentations. Both help organizations leverage business intelligence to become more data-driven in their decision-making process. So which tool is better? We asked a few industry experts their thoughts on the data analysis tools Tableau and Power BI. Here’s what they had to say.

Candace Pereira-Roberts, Data Engineer & GA Data Analytics Instructor

“Anyone who works in data should learn tools that help tell data stories with quality visualizations. Tableau is a wonderful tool for the technical and nontechnical to build these visualizations. I love how we teach the Tableau unit in the Data Analytics bootcamp. I see students who are new to analytics learn Tableau desktop and be able to develop Tableau worksheets, dashboards, and story points in a couple of weeks to do a complete analysis project.”

Iun Chen, GA Instructor & Data Analyst at LinkedIn 

“In my professional capacity, I lead data visualization workshops to share best practices on charting and design theory, with a focus on Tableau. But with the growth of big data analytics, there are more players in the data viz space. Looker, Qlik, Domo, and MicroStrategy are a few with out-of-the-box solutions. Check out other marketplace BI and analytics leaders and their reviews at Gartner.

Alternatively, if you are up for the challenge you can start from scratch and build out completely customized solutions through coding packages, such as with Python plotting libraries Matplotlib, Pandas, and Seaborn.”

Matt Brems, GA Instructor & Data Consultant at BetaVector 

“Most data analyst roles will expect some experience with data visualization. They may prefer your visualization experience be tied to a certain tool like Tableau or Power BI or simply want you to have experience designing graphics or dashboards. As with any platform, the human element is key. A good data analyst is curious and detail-oriented. Diving into the data and spotting anomalies or identifying patterns requires curiosity. Looking at large datasets for long periods of time can invite mistakes, so being detail-oriented ensures you’re interpreting the data correctly.” 

Vish Srivastava, GA Instructor & Product Leader at Evidation Health

“Most teams I’ve seen are not comparing Tableau and Power BI. Instead, it’s more about whether to adopt a business intelligence tool at all, or whether to use Tableau or Power BI in place of Excel. Tableau is a great option when you need to quickly create data visualizations. Tableau is incredibly powerful because it’s designed for nontechnical users, meaning business users can set up and tweak dashboards and charts without the support of engineering or data science teams.”

When it comes to research, the most common data analytics tool is SQL — no surprise there. But once you get into more niche industries, that can vary, says Brems.

“In academia, R is probably the most prevalent data analysis tool, though Python is quickly gaining popularity. SAS and Stata are often used in specific industries, though their popularity is diminishing. (R and Python are open source tools, which means, among other things, that they are free.)”

Want to learn more about Candace?
https://www.coursereport.com/blog/how-to-become-a-business-intelligence-analyst
https://generalassemb.ly/instructors/candace-roberts/13840
www.linkedin.com/in/candaceproberts

Want to learn more about Iun?
https://www.linkedin.com/in/iunchen 

Want to learn more about Matt?
https://betavector.com/
https://www.linkedin.com/in/matthewbrems

Want to learn more about Vish?
https://www.linkedin.com/in/vishrutps

Today’s Best Data Analytics Tools

Featuring Insights From Matt Brems

Read: 3 Minutes

Our Data-Driven World

We live in a world of data — swimming in statistics, numbers, information — and the amount of data seems to be growing faster than we can keep up. More people are using data points to make decisions large and small. From which restaurant has the highest Yelp rating to which city has the lowest rates of COVID-19, using data to navigate everyday life is now the norm. Indeed, the pandemic has only increased our reliance on data. We have come to expect this tsunami of data to explain, and in some cases solve, many of the most vexing problems faced by society today. But finding key insights takes careful analysis of a staggering amount of data. No small feat.

It’s true that more data is released than ever before. In the U.S., there are currently over 290,000 datasets on data.gov alone. Clearly, there’s a growing need for data analysts and the data analytics tools that help us understand these numbers. From small businesses to the highest levels of governments, decisions turn on interpretations of data. Big data can have big consequences.
 

So how do data analysts find the insights lurking in a database? And what are the best tools to analyze all those numbers? Read on to discover the best data analytics tools in the market.

Data scientist and GA instructor since 2016, Matt Brems currently runs a data science consultancy called BetaVector. We asked him to share his go-to data analysis tools. “People who want to analyze data use many different tools; I like to break these down into three different types,” he says.

Let’s get to it.

Type #1: Tabular Data Tools

Data analysts need to get data out of databases and analyze that information. And to do that, they use tabular data tools. According to Brems, the most important ones to know are Microsoft Excel, Google Sheets, and SQL, or Structured Query Language. Generally considered the best data analysis tool for research, SQL is the most common qualification found in job descriptions for a data analyst.

“Most data that data analysts analyze comes in the form of a table, called tabular data. This just means that data is organized into rows and columns, like a spreadsheet. Most data analysts will use a spreadsheet tool like Microsoft Excel or Google Sheets. When working with significant amounts of data (large tables, many tables, or both), organizations will often use a database. In order to interact with most databases, SQL is by far the language of choice.”
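To make the idea of querying tabular data concrete, here is a minimal sketch using Python’s built-in sqlite3 module and an in-memory database. The orders table, its columns, and its rows are made up for illustration; a real company database will have its own schema and will usually be a different database system.

```python
import sqlite3

# An in-memory database, so the sketch runs without any setup.
conn = sqlite3.connect(":memory:")

# Hypothetical table: each row is one order, organized into columns.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, segment TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (segment, amount) VALUES (?, ?)",
    [("Furniture", 120.0), ("Office", 35.5), ("Furniture", 80.0)],
)

# A typical analyst question: total sales per segment, largest first.
rows = conn.execute(
    "SELECT segment, SUM(amount) AS total "
    "FROM orders GROUP BY segment ORDER BY total DESC"
).fetchall()
conn.close()

print(rows)
```

The same SELECT, GROUP BY, and ORDER BY pattern carries over to most SQL databases, though exact syntax can differ slightly from one system to another.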

Type #2: Programming Language Tools

Proficiency in a few programming tools, while not a prerequisite for basic data analysis, can give analysts the ability to perform a wide variety of tasks. While the needed programming language tools will vary from company to company and even from job to job, having this skill set as a data analyst is clearly an advantage for job seekers.

“Python and R are the most common programming language tools in data analysis, though Stata and SAS are also used in some industries. These tools can be used to perform automation, statistical modeling, forecasting, and visualization.”

Type #3: Data Visualization Tools

Since data analysts are frequently tasked with presenting results to stakeholders, a good data visualization tool is essential. Brems recommends Tableau and Microsoft Power BI.

“While you can visualize data using programming languages, Tableau and Power BI are two standalone tools that are used almost exclusively for the purposes of building static data visualizations and dashboards.”

A Note on Research 

When it comes to research, the most common data analytics tool is SQL — no surprise there. But once you get into more niche industries, that can vary, says Brems.

“In academia, R is probably the most prevalent data analysis tool, though Python is quickly gaining popularity. SAS and Stata are often used in specific industries, though their popularity is diminishing. (R and Python are open source tools, which means, among other things, that they are free.)”

Want to learn more about Matt?

https://betavector.com/

https://www.linkedin.com/in/matthewbrems

Beginner’s Python Cheat Sheet

Do you want to be a data scientist? Data science and machine learning are rapidly becoming vital disciplines for all types of businesses. An ability to extract insight and meaning from a large pile of data is a skill set worth its weight in gold. Due to its versatility and ease of use, Python has become the programming language of choice for data scientists.

In this Python crash course, we will walk you through a couple of examples using two of the most-used data types: the list and the pandas DataFrame. The list is self-explanatory; it’s a collection of values set in a one-dimensional array. A pandas DataFrame is just like a tabular spreadsheet; it has data laid out in columns and rows.

Let’s take a look at a few neat things we can do with lists and DataFrames in Python!

Lists

Creating Lists

Let’s start this Python tutorial by creating lists. Create an empty list and use a for loop to append new values. What you need to do is:

# add two to each value
my_list = []
for x in range(1, 11):
    my_list.append(x + 2)

We can also do this in one step using list comprehension:

my_list = [x + 2 for x in range(1, 11)]

Creating Lists with Conditionals

As above, we will create a list, but now we will only add 2 to the value if it is even.

# add two, but only if x is even
my_list = []
for x in range(1, 11):
    if x % 2 == 0:
        my_list.append(x + 2)
    else:
        my_list.append(x)

Using a list comprehension:

# no backslash is needed; lines can break freely inside brackets
my_list = [x + 2 if x % 2 == 0 else x
           for x in range(1, 11)]
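One detail that often trips up beginners is where the `if` goes in a list comprehension. The sketch below contrasts the two placements; the variable names `transformed` and `evens_plus_two` are ours, chosen only for this illustration.

```python
# Conditional expression before `for`: every x is kept,
# and the if/else decides which value goes into the list.
transformed = [x + 2 if x % 2 == 0 else x for x in range(1, 11)]

# Filter after `for`: odd values are dropped entirely,
# so the result is shorter than the input range.
evens_plus_two = [x + 2 for x in range(1, 11) if x % 2 == 0]

print(transformed)     # [1, 4, 3, 6, 5, 8, 7, 10, 9, 12]
print(evens_plus_two)  # [4, 6, 8, 10, 12]
```

Use the first form when you want to change some values, and the second when you want to keep only some values.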

Selecting Elements and Basic Stats

Select elements by index.

# get the first/last element
first_ele = my_list[0]
last_ele = my_list[-1]

Some basic stats on lists:

# get max/min/mean value
biggest_val = max(my_list)
smallest_val = min(my_list)
avg_val = sum(my_list) / len(my_list)
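If you would rather not compute the average by hand, Python’s standard library has a statistics module that covers the common cases. This is a small sketch of that alternative, not a replacement for the approach above; the list here is rebuilt so the example runs on its own.

```python
import statistics

my_list = [x + 2 for x in range(1, 11)]  # the values 3 through 12

avg_val = statistics.mean(my_list)       # arithmetic mean
median_val = statistics.median(my_list)  # middle value
spread = statistics.stdev(my_list)       # sample standard deviation

print(avg_val, median_val, round(spread, 2))
```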

DataFrames

Reading in Data to a DataFrame

We first need to import the pandas module.

import pandas as pd

Then we can read in data from csv or xlsx files:

df_from_csv = pd.read_csv('path/to/my_file.csv',
                          sep=',',
                          nrows=10)
xlsx = pd.ExcelFile('path/to/excel_file.xlsx')
df_from_xlsx = pd.read_excel(xlsx, sheet_name='Sheet1')
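If you don’t have a file handy, you can try `read_csv` on text held in memory. The sketch below wraps a small, made-up CSV string in `io.StringIO` so that pandas treats it like a file; the country and population figures are rough placeholders for illustration, not real data.

```python
import io

import pandas as pd

# Hypothetical sample standing in for 'path/to/my_file.csv'.
csv_text = """country,population
France,67000000
Iceland,370000
Japan,125000000
"""

# nrows=2 reads only the first two data rows, which is handy
# for peeking at a large file before loading all of it.
df = pd.read_csv(io.StringIO(csv_text), sep=",", nrows=2)

print(df.shape)   # (rows, columns)
print(df.dtypes)  # the type pandas inferred for each column
```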

Slicing DataFrames

We can slice our DataFrame using conditionals.

df_filter = df[df['population'] > 1000000]
df_france = df[df['country'] == 'France']

Sorting values by a column:

# sort_values returns a new DataFrame, so keep the result
df_sorted = df.sort_values(by='population',
                           ascending=False)
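Here is a self-contained sketch that puts slicing and sorting together on a tiny DataFrame. The rows are made up for illustration, and names such as `large_not_france` and `top_two` are ours.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "country": ["France", "Iceland", "Japan", "Malta"],
    "population": [67_000_000, 370_000, 125_000_000, 520_000],
})

# Combine conditions with & (and) or | (or);
# each condition needs its own parentheses.
large_not_france = df[(df["population"] > 1_000_000) & (df["country"] != "France")]

# sort_values returns a new DataFrame; df keeps its original order.
by_population = df.sort_values(by="population", ascending=False)
top_two = by_population.head(2)

print(large_not_france)
print(top_two)
```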

Filling Missing Values

Let’s fill in any missing values with that column’s average value.

df['population'] = df['population'].fillna(
    value=df['population'].mean()
)
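To see what the fill does, try it on a column with one gap. This sketch uses a three-row DataFrame with made-up numbers; `missing_before` is just a helper name for the count of empty cells.

```python
import pandas as pd

# One missing value in the middle; None becomes NaN in a numeric column.
df = pd.DataFrame({"population": [100.0, None, 300.0]})

missing_before = df["population"].isna().sum()

# mean() skips missing values by default, so the average here is 200.0.
df["population"] = df["population"].fillna(df["population"].mean())

print(missing_before)
print(df)
```

Filling with the mean is only one option. Depending on the data, the median or simply dropping the rows may be a better fit.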

Applying Functions to Columns

Apply a custom function to every value in one of the DataFrame’s columns.

def fix_zipcode(x):
    '''
    make sure that zipcodes all have leading zeros
    '''
    return str(x).zfill(5)

df['clean_zip'] = df['zip code'].apply(fix_zipcode)
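A quick way to check the function is to run it on a few values where the leading zeros have been lost. The zip codes below are sample values chosen for illustration.

```python
import pandas as pd


def fix_zipcode(x):
    """Return x as a five-character string padded with leading zeros."""
    return str(x).zfill(5)


# Zip codes stored as integers lose their leading zeros.
df = pd.DataFrame({"zip code": [2139, 10001, 501]})
df["clean_zip"] = df["zip code"].apply(fix_zipcode)

print(df)
```

One caveat: if the column was read in as floats (for example, 2139.0), `str(x)` will include the decimal point. Reading the column as text in the first place, for instance with `dtype={'zip code': str}` in `pd.read_csv`, avoids the problem.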

Ready to take on the world of machine learning and data science? Now that you know what you can do with lists and DataFrames in Python, check out our other Python beginner tutorials and learn about other important concepts of the Python programming language.

8 Tips for Learning Python Fast

It’s possible to learn Python fast. How fast depends on what you’d like to accomplish with it and how much time you can allocate to study and practice Python on a regular basis. Before we dive in further, I’d like to establish some assumptions I’ve made about you and your reasons for reading this article:

First, I’ll address how quickly you should be able to learn Python. If you’re interested in learning the fundamentals of Python programming, it could take you as little as two weeks to learn, with routine practice.

If you’re interested in mastering Python in order to complete complex tasks or projects or spur a career change, then it’s going to take much longer. In this article, I’ll provide tips and resources geared toward helping you gain Python programming knowledge in a short timeframe.

If you’re wondering how much it’s going to cost to learn Python, the answer there is also, “it depends”. There is a large selection of free resources available online, not to mention the various books, courses, and platforms that have been published for beginners.

Another question you might have is, “how hard is it going to be to learn Python?” That also depends. If you have any experience programming in another language such as R, Java, or C++, it’ll probably be easier for you to learn Python fast than for someone who hasn’t programmed before.

But learning a programming language like Python is similar to learning a natural language, and everyone’s done that before. You’ll start by memorizing basic vocabulary and learning the rules of the language. Over time, you’ll add new words to your repertoire and test out new ways to use them. Learning Python is no different.

By now you’re thinking, “Okay, this is great. I can learn Python fast, cheap, and easily. Just tell me what to read and point me on my way.” Not so fast. There’s a fourth thing you need to consider and that’s how to learn Python.

Research on learning has identified that not all people learn the same way. Some learn best by reading, while others learn best by seeing and hearing. Some people enjoy learning through games rather than courses or lectures. As you review the curated list of resources below, consider your own learning preferences as you evaluate options.

Now let’s dig in. Below are my eight tips to help you learn Python fast.


Data Literacy for Leaders

For years, the importance of data has been echoed in boardroom discussions and listed on company roadmaps. Now, with 99% of businesses reporting active investment in big data and AI, it’s clear that all businesses are beginning to recognize the power of data to transform our world of work.

While all leaders recognize the needs and benefits of becoming data-driven, only 24% have successfully created a data-driven organization. That is because transformation is not considered holistically and instead leaders focus on business, tools and technology and talent in silos. Usually leaving skill acquisition amongst leaders and the broader organization for last. It’s no wonder that 67% of leaders say they are not comfortable accessing or using data.

We’ve worked with businesses, such as Bloomberg, to help them gain the skills they need to successfully leverage data within their organizations, and we haven’t left leaders out of the conversation. In fact, we know that leaders are crucial to the success of data transformation efforts and, just like their teams, need to be equipped with the skills to understand and communicate with data.

Why Should I Train My Leaders on Data?

When embarking on a data transformation, we always recommend that leaders be trained as the first step in company-wide skill acquisition. We recommend this approach for a few reasons:

  • Leaders Need to Understand Their Role in Data Transformation: Analytics can’t be something data team members do in a silo. It needs to be fully incorporated into the business rather than treated as an afterthought. However, businesses will struggle to make that change unless every leader understands his or her responsibility in data transformation.
  • Leadership Training Shows a Commitment to Change: According to New Vantage Partners, 92% of data transformation failures are attributed to the inability of leaders to form a data-driven culture. In order for your employees to truly become data-driven, they have to be able to see a real commitment from leaders to organizational goals and operational change. Training your leaders first sends the message that data is here to stay.
  • Leaders Need to Be Prepared to Work With Data-Driven Teams: Increasingly, leaders are expected to make data-driven decisions that impact the success of the organization. Without literacy, leaders will continue to feel uncomfortable communicating with and using data to make decisions. This discomfort will trickle down to employees and real change will never be felt. 

Just like your broader organization, leaders cannot be expected to understand the role they play or the importance of data transformation without proper training. 

What Does Data Literacy For Leaders Look Like? 

Leaders need to be able to readily identify opportunities to use data effectively. To get there, leaders need to:

Build a Data-Driven Mindset:

While every leader brings a wealth of experience to your org, many leaders are not data natives, and it can be a big leap to make this shift in thinking. Training leaders all at once gives you the opportunity to get your leaders on the same page and build a shared understanding and vocabulary.

So what does building a data-driven mindset look like in practice? To truly have a data-driven mindset, leaders must be aware of the data landscape and the opportunity data presents, be mindful of the biases inherent in data with an eye toward overcoming them, and be curious about how data can influence decisions.

Leaders should walk away from training with a baseline understanding of key data concepts, a shared vocabulary, and knowledge of how data flows through an organization, and they should be able to pinpoint where data can have an impact in the org.

Understand the Data Life Cycle

Leaders are responsible for having oversight of every phase of the data life cycle and must be able to help teams weed out bias at any point. Without this foundation, leaders will have a hard time knowing where to invest in a data transformation and how to lead projects and teams.

All leaders should be equipped to think about and ask questions about each phase of the life cycle. For example:

  • Data Identification: What data do we have, and what form is it in? 
  • Data Generation: Where will the data come from and how reliable is the source? 
  • Data Acquisition: How will the data get from the source to us? 

It is not the leader’s role to know where all the data comes from or what gaps exist, but knowing which questions to ask is important for acquiring the insights needed to inform a sound business strategy.

Get to Know the Role of Data Within the Org

In an organization that’s undergoing a data transformation, there’s no shortage of projects that could command a leader’s attention and investment. Leaders must be equipped to understand where to invest to put their plans into action.

Based on existing structure, leaders need to understand the key data roles, such as data analysts or machine learning engineers, why they are important and how they differ. Once a leader has the knowledge of the data teams, they will be able to identify the opportunity of data within their team and role.

Make Better Data-Driven Decisions

Leaders who rely on intuition alone run the huge risk of being left behind by competitors that use data-driven insights. With more and more companies adjusting to this new world order, it’s imperative that leaders become more data literate in order to make important business-sustaining decisions moving forward. 

Getting Started With Leadership Training 

Including data training specifically for your leaders in your data transformation efforts is crucial. While leaders are busy tackling other important business initiatives, they, just like the rest of your organization, must be set up with the right skills to successfully meet the future of work. Investment in data skills for leaders will help you to forge a truly data-driven culture and business.

To learn more about how GA equips leaders and organizations to take on data transformation, get in touch with us here.

15 Data Science Projects to Get You Started

By

When it comes to getting a job in data science, data scientists need to think like Creatives. Yes, that’s correct. Those looking to enter this field need to have a data science portfolio of previously completed data science projects, similar to those in Creative professions. What better way to prove to your future data science team that you’re capable of being a data scientist than proving you can do the work?

A common problem for data science entrants is that employers want candidates with experience, but how do you get experience if no one will hire you without it? If you’re looking to get that first foot in the door, it will behoove you to undertake a couple of data science projects to show future employers you’ve got what it takes to use big data to identify opportunities and succeed in the field.

The good news is that we live in a time of open and abundant data. Websites like Kaggle offer a treasure trove of free data for deep learning on everything from crime statistics to Pokemon to Bitcoin and more. However, the wealth of easily accessible data can be overwhelming, which is why we’ve taken it upon ourselves to present 15 data science projects you can execute in Python to showcase and improve your skills in data analytics. Our data science project ideas cover various topics, from Spotify songs to fake news to fraud detection and techniques such as clustering, regression, and natural language processing.

Before you dive in, be sure to adhere to these four guidelines no matter which data science project idea you choose:

1. Articulate the Problem and/or Scenario

It’s not enough to do a project where you use “X” to predict “Y”; you need to add some context to your work because data science does not occur in a vacuum. Tell us what you’re trying to solve and how data science can address that. Employers want to know if you can turn a problem into a question and a question into a solution. A good place to start is to depict a real-world scenario in which your data project would be useful.

2. Publish & Explain Your Work

Create a GitHub repository where you can upload your Jupyter Notebooks and data. Write a blog post in which you narrate your project from start to finish. Talk about the problem or question at the heart of the project, and explain your decision to clean the data in a certain way or why you decided to use a certain algorithm. Why all this? Potential employers need to understand your methodology.

3. Use Domain Expertise

If you’re trying to break into a specific field such as finance, health, or sports, use your knowledge of this area to enhance your project. This could mean deriving a useful question to a pressing problem or articulating a well-thought-out interpretation of your project’s results. For example, if you’re looking to become a data scientist in the finance sector, it would be worthwhile to show how your methods can generate a return on investment.

4. Be Creative & Different

Anyone can copy and paste code that trains a machine learning algorithm. If you want to stand out, review existing data science projects that use the same data and fill in the gaps left by them. If you’re working on a prediction project, try coming up with an unexpected variable that you think would be beneficial.

Data Science Projects

1. Titanic Data

Working on the Titanic dataset is a rite of passage in data science. It’s a useful dataset that beginners can work with to improve their feature engineering and classification skills. Try using a decision tree to visualize the relationships between the features and the probability of surviving the Titanic.
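As a rough illustration of the decision tree idea, here is a minimal sketch using scikit-learn. The eight rows are made up for the example and are not the real Titanic data, and the column names (is_female, passenger_class, age) are assumptions about how you might encode the features.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up toy rows, NOT the real Titanic data: [is_female, passenger_class, age]
X = [
    [1, 1, 29], [1, 2, 35], [1, 3, 22], [1, 1, 58],
    [0, 1, 40], [0, 2, 31], [0, 3, 25], [0, 3, 19],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = survived, 0 = did not (illustrative labels)

# A shallow tree keeps the printed rules short and readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# export_text prints the learned splits as indented if/else rules.
feature_names = ["is_female", "passenger_class", "age"]
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

On the real dataset you would first load the CSV, handle missing ages, and encode categorical columns before fitting. The printed rules are one simple way to see which features the tree splits on first.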

2. Spotify Data

Spotify has an amazing API that provides access to rich data on their entire catalog of songs. You can grab cool attributes such as a song’s acoustics, danceability, and energy. The great thing about this data source is that the project possibilities are almost endless. You can use these features to try to predict genre or popularity. One fun idea would be to better understand your music by training a machine learning classifier on two sets of songs; songs you like and songs you do not.

3. Personality Data Clustering

You’ve probably heard the phrase, “There are X types of people.” Well, now you can actually find out how many types of people there really are. Using this dataset of almost 20k responses to the Big Five Personality Test, you can actually answer this question. Throw this data into a clustering algorithm such as KMeans and sort this into K number of groups. Once you decide on the optimal number of clusters, it’s incumbent on you to define each cluster. Come up with labels that add meaning to each group, and don’t be afraid to use plenty of charts and graphs to support your interpretation.
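One common way to pick the number of clusters is to compare silhouette scores across several values of K. The sketch below assumes that approach (the article does not prescribe one) and uses synthetic five-trait scores rather than the actual survey responses.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for survey data: three invented "types" scored on five traits.
rng = np.random.default_rng(0)
centers = np.array([
    [5, 1, 5, 1, 5],
    [1, 5, 1, 5, 1],
    [3, 3, 3, 3, 3],
], dtype=float)
X = np.vstack([c + rng.normal(scale=0.3, size=(50, 5)) for c in centers])

# Fit KMeans for several values of K and record the silhouette score of each.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The K with the highest silhouette score is one reasonable candidate.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 2))
```

Real survey data will rarely separate this cleanly, so treat the score as a guide and check whether the resulting groups are actually interpretable.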

4. Fake News

If you are interested in natural language processing, building a classifier to differentiate between fake and real news is a great way to demonstrate that. Fake news is a problem that social media platforms have been struggling with for the past several years and a project that tackles this problem is a great way to show you care about solving real-world problems. Use your classifier to identify interesting insights about the patterns in fake versus real news; for example, tell us which words or phrases are most associated with fake news articles.
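Before training a full classifier, you can get a first look at which words lean toward one class with a simple smoothed log-odds comparison. This is a simpler stand-in for inspecting a trained model, and the six headlines below are invented for illustration only.

```python
import math
from collections import Counter

# Invented example headlines, for illustration only (not a real dataset).
fake = [
    "shocking secret cure doctors hate",
    "you will not believe this shocking miracle",
    "secret miracle trick exposed",
]
real = [
    "city council approves budget for new school",
    "central bank holds interest rates steady",
    "council report reviews school budget",
]

def word_counts(docs):
    """Count how often each whitespace-separated word appears across docs."""
    return Counter(word for doc in docs for word in doc.split())

fake_counts, real_counts = word_counts(fake), word_counts(real)
vocab = set(fake_counts) | set(real_counts)
fake_total, real_total = sum(fake_counts.values()), sum(real_counts.values())

def log_odds(word):
    # Add-one smoothing keeps the ratio finite for words seen in only one class.
    p_fake = (fake_counts[word] + 1) / (fake_total + len(vocab))
    p_real = (real_counts[word] + 1) / (real_total + len(vocab))
    return math.log(p_fake / p_real)

# Highest log-odds first; ties are broken alphabetically for a stable order.
ranked = sorted(vocab, key=lambda w: (-log_odds(w), w))
top_fake_words = ranked[:3]
print(top_fake_words)
```

With a trained model such as logistic regression, the analogous step would be to look at the largest positive and negative coefficients.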

5. COVID-19 Dataset

There probably isn’t a more relevant use of data science than a project analyzing COVID-19. This dataset provides a wealth of information related to the pandemic. It provides a great opportunity to show off your exploratory data analysis chops. Take a deep dive into this data, and through data visualization unearth patterns about the rate of COVID infection by county, state, and country.

6. Telco Customer Churn

If you’re looking for a straightforward project that is extremely applicable to the business world, then this one’s for you. Use this dataset to train a classifier that predicts customer churn. If you can show employers you know how to prevent customers from leaving their business, you’ll most definitely grab their attention. Pro tip: this is a great project to show your understanding of classification metrics besides accuracy, such as precision and recall.
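To make precision and recall concrete, here is a small sketch that computes them by hand. The ten labels and predictions are hypothetical and do not come from the Telco dataset.

```python
# Hypothetical labels for ten customers: 1 = churned, 0 = stayed.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # churners correctly flagged
fp = sum(t == 0 and p == 1 for t, p in pairs)  # loyal customers flagged by mistake
fn = sum(t == 1 and p == 0 for t, p in pairs)  # churners the model missed
tn = sum(t == 0 and p == 0 for t, p in pairs)  # loyal customers correctly ignored

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # of the customers flagged, how many really churned
recall = tp / (tp + fn)     # of the customers who churned, how many were flagged

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In a churn setting, recall tells you how many departing customers you would have reached, while precision tells you how much retention budget would go to customers who were never going to leave.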

7. Lending Club Loans

Like the Telco project, the Lending Club loan dataset is extremely relevant to the business world. Here you can train a classifier that predicts whether or not a Lending Club borrower will pay back a loan using a wealth of information such as credit score, loan amount, and loan purpose. There are a lot of variables at your disposal, so I’d recommend starting with a handful of features and working your way up from there. See how far you can get with just the basics.

Also, this is a fairly untidy dataset that will require extensive cleaning and feature engineering, which is a good thing because that is often the case with real-world data. Be sure to explain your methodology behind preparing your dataset for the machine learning algorithm — this informs the audience of your domain expertise.

8. Breast Cancer Detection

This dataset provides a simpler classification scenario in which you can use health-related variables to predict instances of breast cancer. If you’re looking to apply your data science skills to the medical field, this is certainly worth a shot.

9. Housing Regression

If classification isn’t your thing, then might I recommend this ready-made regression project in which you can predict home prices using variables like square footage, number of bedrooms, and year built. A project such as this can help you understand the factors driving home sales and let you get creative in your feature engineering. Try to involve outside data that can serve as proxies for quality of life, education, and other things that might influence home prices. And if you want to show off your scraping skills, you can always create your dataset by scraping Zillow.
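As a starting point, a plain least-squares fit on the three variables mentioned above might look like the sketch below. The six homes and the pricing rule that generates their prices are invented and noise-free, so the fit is exact here in a way it never would be on real listings.

```python
import numpy as np

# Invented homes: [square_feet, bedrooms, year_built]
features = np.array([
    [850, 2, 1960],
    [1200, 3, 1995],
    [1500, 3, 1978],
    [1800, 4, 2005],
    [2400, 4, 1988],
    [3000, 5, 2015],
], dtype=float)

# Prices generated from a made-up rule so the example has a known answer.
prices = 150 * features[:, 0] + 10_000 * features[:, 1] + 500 * features[:, 2] + 20_000

# Append a column of ones so the model can learn an intercept.
X = np.column_stack([features, np.ones(len(features))])
coef, *_ = np.linalg.lstsq(X, prices, rcond=None)

# Predict the price of a new, hypothetical home.
new_home = np.array([2000, 3, 2000, 1], dtype=float)
estimate = new_home @ coef
print(round(float(estimate)))
```

From here you could add engineered features or outside data as extra columns and compare the fit on held-out homes.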

10. Seeds Clustering

The seeds dataset from UCI provides a simple opportunity to use clustering. Use the seven attributes to sort the 210 seeds into K number of groups. If you’re looking to go beyond KMeans, try using hierarchical clustering, which can be useful for this dataset because the low number of samples can be easily visualized with a dendrogram.
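Here is a minimal sketch of hierarchical clustering with SciPy. It uses 15 synthetic points with seven attributes instead of the real seeds data, and Ward linkage is just one reasonable choice of method.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Synthetic stand-in: three invented groups of five samples, seven attributes each.
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 7, [5.0] * 7, [10.0] * 7])
X = np.vstack([c + rng.normal(scale=0.5, size=(5, 7)) for c in centers])

# Build the merge tree bottom-up using Ward linkage.
Z = linkage(X, method="ward")

# Cut the tree so that at most three flat clusters remain.
labels = fcluster(Z, t=3, criterion="maxclust")

# no_plot=True returns the dendrogram layout without needing matplotlib.
tree = dendrogram(Z, no_plot=True)
print(labels)
print(tree["leaves"])
```

Dropping no_plot=True (with matplotlib installed) draws the dendrogram itself, which is the visual you would include in a write-up.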

11. Credit Card Fraud Detection

Another project idea for those of you intent on using business world data is to train a classifier to predict instances of credit card fraud. The value of this project to you comes from the fact that it’s an imbalanced dataset, meaning that one class vastly outweighs the other (in this case, non-fraudulent transactions versus fraudulent). A model that is 99% accurate can be essentially useless here, since one that labels every transaction as non-fraudulent could score just as well, so it’s up to you to use metrics other than accuracy to demonstrate the success of your model.
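To see why accuracy alone can mislead on imbalanced data, consider a baseline that never flags fraud. The sketch below assumes a 1% fraud rate purely for illustration; the real dataset's rate will differ.

```python
# Synthetic labels, assuming a 1% fraud rate purely for illustration.
y_true = [1] * 10 + [0] * 990   # 1 = fraud, 0 = legitimate
y_pred = [0] * len(y_true)      # baseline "model" that never flags fraud

# Accuracy rewards the baseline for getting the majority class right.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall asks how many of the actual fraud cases were caught.
caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = caught / sum(y_true)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```

The baseline looks excellent on accuracy while catching no fraud at all, which is why metrics such as recall and precision are more informative for this project.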

12. AutoMPG

This is a great beginner regression project in which you can use car features to predict their fuel efficiency. Given that this data is from the past, an interesting idea you can use is to see how well this model does on data from recent cars to show how car fuel efficiency has evolved over the years.

13. World Happiness

Using data science to unlock what’s behind happiness? Maybe you can with this dataset on world happiness rankings. You can go a number of ways with this project; you can use regression to predict happiness scores, cluster countries based on socio-economic characteristics, or visualize the change in happiness throughout the world from 2015 to 2019.

14. Political Identity

The Nationscape Data Set is an absolute goldmine of data on the demographics and political identities of Americans. If you’re a politics junkie, it’ll be sure to give you your fix. Their most recent round of data features over 300,000 instances of data collected from extensive surveys of Americans. If you’re interested in using demographic information to predict political ideology or party identification, this is the dataset for you. This is an especially great project to flex your domain expertise in study design, research, and conclusion. Political analysis is replete with shoddy interpretations that lack empirical data analysis, and you could use this dataset to either confirm or dispel them. But be warned that this data will require plenty of cleaning, which you’ll need to get used to, given that’s the majority of the job.

15. Box Office Prediction

If you’re a movie buff, then we’ve got you covered with the TMDB dataset. See if you can build a workable box office revenue prediction model trained on 5,000 movies’ worth of data. Does genre actually correlate with box office success? Can we use runtime and language to help explain the variation in revenue? Find out the answers to those questions and more with this project.