The Top Data Science Skills to Develop for Business
Understanding Math and Statistics for Data Science
Using Statistics in Data Science
A Brief History of Programming Languages
Using R Programming for Data Science
What Is Data Wrangling?
The Data Wrangling Process
- Clarify the data ask. The data wrangling process begins with a question, which can come from scientific researchers, academics, business leaders, or anyone else who wants to solve a problem. Before you can dive into the data, however, you must clarify what’s being asked. Confirm the time frame, the types of data to collect, the data subsets, and more before moving on to the next step.
- Collect the data. In some cases, you’ll need access to data that’s owned by other businesses, research firms, and governments for your wrangling projects. Because of legislation and pressure to protect this data, its owners may provide only scrubbed versions. The request process can take time, and if the provided data is incorrect or corrupted, the process has to start again.
- Remove object ambiguity. Data objects, also known as data entities, are the key data types in a dataset. A common key entity is customer ID. However, you must clarify what data will support this concept, such as customer address, bank account, and email. Without this clarification, later data modeling will be inaccurate.
- Identify relationships. This step of the data wrangling process leverages data warehouses. Data warehouses are used to examine large amounts of historical data, which can be aggregated to show relationships between various data sources.
- Create machine learning features. To leverage machine learning in the new data model, you’ll need to create features. Features are typically structured columns of data that algorithms use to find a result. If data is missing in the columns or time frames are mismatched, the model may fail or produce unreliable results.
- Explore data. The final step of data wrangling is to parse through the remaining data and remove redundancies. At this stage, redundancies can be tricky to spot. Algorithms may have a hard time selecting the correct data, which is why humans are still needed for the process.
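The later steps above (identifying relationships, creating features, and removing redundancies) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (customer_id, email, amount), and the helper functions drop_duplicates and build_features are all hypothetical, and a real project would more likely pull this data from a warehouse than from in-memory lists.

```python
from collections import defaultdict

# Hypothetical raw records; every field name here is an illustrative assumption.
customers = [
    {"customer_id": 1, "email": "ana@example.com"},
    {"customer_id": 2, "email": "ben@example.com"},
    {"customer_id": 2, "email": "ben@example.com"},  # exact duplicate row
    {"customer_id": 3, "email": None},               # missing value
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 15.0},
]


def drop_duplicates(rows):
    """Remove redundancies: keep the first occurrence of each identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def build_features(customers, orders):
    """Relate orders to customers and derive one feature row per customer."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for order in orders:
        totals[order["customer_id"]] += order["amount"]
        counts[order["customer_id"]] += 1

    features = []
    for customer in customers:
        cid = customer["customer_id"]
        features.append({
            "customer_id": cid,
            "has_email": customer["email"] is not None,  # flags missing data
            "order_count": counts.get(cid, 0),
            "total_spent": totals.get(cid, 0.0),
        })
    return features


clean_customers = drop_duplicates(customers)
features = build_features(clean_customers, orders)
for row in features:
    print(row)
```

Flagging the missing email as an explicit has_email column, rather than silently dropping the row, is one possible design choice. It keeps the gap visible to whoever reviews the features before modeling.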
Exploratory Data Analysis
Confirmatory Data Analysis
Data Analysis Techniques in Practice
1. Explore available data.
2. Remove data anomalies, outliers, and patterns.
3. Transform the remaining data visually.
4. Identify anomalies, outliers, and patterns.
5. Repeat steps 2-4 until data is cleaned.
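The loop described above can be sketched as follows. This is a minimal illustration, not a prescribed method: the source does not say how anomalies should be detected, so the sketch assumes Tukey's interquartile-range rule, and the sample readings and the function names (iqr_outliers, clean_until_stable, text_histogram) are hypothetical. A plain-text histogram stands in for the visual step.

```python
import statistics
from collections import Counter


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def text_histogram(values):
    """A rough visual transform: one row of '#' marks per distinct value."""
    counts = Counter(values)
    return "\n".join(f"{v:>4} {'#' * counts[v]}" for v in sorted(counts))


def clean_until_stable(values, max_rounds=10):
    """Remove outliers, re-check, and repeat until none are found."""
    data = list(values)
    for _ in range(max_rounds):
        if len(data) < 4:  # too few points to estimate quartiles sensibly
            break
        outliers = iqr_outliers(data)
        if not outliers:
            break
        data = [v for v in data if v not in outliers]
    return data


# Hypothetical sample with one obvious anomaly.
readings = [10, 12, 11, 13, 12, 11, 10, 95, 12, 13]
cleaned = clean_until_stable(readings)

print(text_histogram(readings))
print()
print(text_histogram(cleaned))
```

The max_rounds cap is a safeguard so that the loop cannot keep trimming a small dataset indefinitely. In practice a person would review each round's histogram before removing anything.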
What Is Data Visualization?
Making Decisions with Data Visualizations
- Viewing sales volumes over time to look for seasonal changes
- Monitoring customer satisfaction ratings
- Watching production costs across product lines
- Reporting how customers use the company’s website
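As a small illustration of the first use case, the sketch below totals sales by month and draws a plain-text bar chart so that a seasonal peak stands out. The sales figures are made up for the example, and the function names (monthly_totals, text_bar_chart) are hypothetical. A dedicated charting tool would normally replace the text output.

```python
from collections import defaultdict
from datetime import date

# Made-up (date, units sold) records, used only for illustration.
sales = [
    (date(2023, 1, 15), 120),
    (date(2023, 1, 28), 80),
    (date(2023, 6, 3), 150),
    (date(2023, 6, 21), 170),
    (date(2023, 12, 5), 300),
    (date(2023, 12, 19), 340),
]


def monthly_totals(records):
    """Sum units per (year, month), returned in chronological order."""
    totals = defaultdict(int)
    for day, units in records:
        totals[(day.year, day.month)] += units
    return dict(sorted(totals.items()))


def text_bar_chart(totals, width=40):
    """Render each month as a bar scaled against the busiest month."""
    peak = max(totals.values())
    lines = []
    for (year, month), units in totals.items():
        bar = "#" * round(units / peak * width)
        lines.append(f"{year}-{month:02d} {bar} {units}")
    return "\n".join(lines)


totals = monthly_totals(sales)
print(text_bar_chart(totals))
```

Scaling every bar against the busiest month keeps the chart readable at any sales volume, at the cost of hiding absolute differences between separate charts.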
Tools for Data Visualization
Data Communication: Effectively Explaining Findings to Management
Inspiring Others with Data
- Speak simply and in business terms. Most employees and executives won’t know programming or advanced mathematical languages. Even with a basic understanding, management may struggle to relate the findings to the business. Data scientists can provide context by speaking the “business language.” Using the same terms will foster collaboration and increase understanding.
- Use emotional storytelling techniques. Numbers don’t tell a story; humans do. The fact that stories more than 4,000 years old are still being told today shows how much we need them. To tell an emotionally engaging story, structure your findings in four parts: the situation, the problem, the solution, and the next steps. In each part, remember who the main character of the story is and how you’re trying to help that character.