7eaf518 · Aug 7, 2024

Module 1: Data Collection

Overview

This module covers the first stage of the project: collecting data from several sources to identify emerging skills and trends in demand. The data is gathered through APIs and web scraping, arrives in different formats, and is prepared for analysis in the subsequent modules.

Objectives

  • Work with data in various formats.
  • Collect data from multiple sources including job postings, training portals, and surveys.
  • Prepare the collected data for analysis.

Contents

1. Collecting Data Using APIs

  • Notebook: 01-API_Data_Collection.ipynb
    • Description: This notebook demonstrates how to collect data using APIs with the help of 01-Jobs_API.ipynb.
    • Output: 01-API_Job_Postings.xlsx, 01-API_Languages.xlsx
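
As a rough illustration of the API-based approach, the sketch below builds a query URL, fetches a JSON list of postings, and counts postings per value of a field. It uses only the standard library and is not taken from the notebook. The endpoint `BASE_URL`, the field names `Location` and `Key Skills`, and the sample records are all assumptions made up for this example.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: replace with the address the jobs API is actually served at.
BASE_URL = "http://localhost:5000/data"


def build_url(base_url, **params):
    """Return base_url with the given query parameters appended."""
    query = urlencode(params)
    return f"{base_url}?{query}" if query else base_url


def fetch_postings(url, timeout=10):
    """Fetch a URL and decode the JSON body (assumed to be a list of records)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def count_postings_by(postings, field):
    """Count postings per value of `field`, skipping records that lack it."""
    return Counter(p[field] for p in postings if field in p)


# Made-up records standing in for an API response (field names are assumptions).
SAMPLE_POSTINGS = [
    {"Location": "Seattle", "Key Skills": "Python"},
    {"Location": "Austin", "Key Skills": "SQL"},
    {"Location": "Seattle", "Key Skills": "Java"},
    {"Key Skills": "Python"},  # no Location, so it is skipped when counting by it
]

location_counts = count_postings_by(SAMPLE_POSTINGS, "Location")
```

With a live API, `fetch_postings(build_url(BASE_URL, Location="Seattle"))` would replace the sample list. Keeping the counting logic separate from the network call makes it easy to check against a small fixture first.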

2. Collecting Data Using Web Scraping

  • Notebook: 02-Web_Scraping_Data_Collection.ipynb
    • Description: This notebook demonstrates how to collect data using web scraping techniques.
    • Output: 02-Web_Scraped_Languages.csv
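
The scraping step can be sketched in the same spirit: parse an HTML table into rows, then write the rows out as CSV. This sketch uses the standard library's `html.parser` as a stand-in (the notebook may well use a different parsing library), and the HTML snippet, column names, and values are invented purely for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; cells containing commas are quoted."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Invented sample page; the values are placeholders, not real salary data.
SAMPLE_HTML = """
<table>
  <tr><th>Language</th><th>Average Annual Salary</th></tr>
  <tr><td>Python</td><td>$100,000</td></tr>
  <tr><td>Java</td><td>$90,000</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)
csv_text = rows_to_csv(rows)
```

To save the result to disk, the CSV text can be written to a file opened with `newline=""` so that the writer's own line endings are preserved.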

3. Exploring the Data

  • Notebook: 03-Data_Exploration.ipynb
    • Description: This notebook provides an initial exploration of the collected data to understand its structure and content.
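
A first look at a collected file typically checks its columns, its row count, and how many values are missing. The sketch below does this with the standard library's `csv` module as a minimal stand-in for whatever tooling the notebook uses; the function name `summarize` and the sample data are hypothetical.

```python
import csv
import io


def summarize(csv_text):
    """Return the row count, column names, and per-column count of empty cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    missing = {
        # Cells that are absent (None) or blank after stripping count as missing.
        column: sum(1 for row in rows if not (row.get(column) or "").strip())
        for column in columns
    }
    return {"rows": len(rows), "columns": columns, "missing": missing}


# Invented sample; the second record has no salary value.
SAMPLE_CSV = "Language,Average Annual Salary\nPython,100000\nJava,\n"

summary = summarize(SAMPLE_CSV)
```

Here `summary["missing"]` reports one empty cell in the salary column, which is the kind of gap that would be handled later during data wrangling.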

Key Points

  • Data collection is the first step in any data analysis project.
  • APIs and web scraping are two common methods for gathering data.
  • The collected data is saved as CSV and Excel files for further analysis.

Summary

Module 1 covers data collection using APIs and web scraping. The collected data is then explored to understand its structure and content, laying the groundwork for data wrangling and analysis in the following modules. The notebooks walk through each step in detail.