This module covers the first steps of data collection: gathering information from several sources to identify emerging skills and in-demand trends. We use APIs and web scraping to collect data in different formats, then prepare the collected data for analysis in later modules.
- Work with data in various formats.
- Collect data from multiple sources including job postings, training portals, and surveys.
- Prepare the collected data for analysis.
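
To make the "various formats" objective concrete, here is a minimal sketch that converts JSON records (the shape an API commonly returns) into CSV text using only the Python standard library. The function name `json_records_to_csv` and the sample field names are hypothetical and do not come from the course notebooks.

```python
import csv
import io
import json


def json_records_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text.

    Hypothetical helper for illustration; assumes every record is a flat dict.
    """
    records = json.loads(json_text)

    # Collect column names in first-seen order so records with
    # missing keys still line up under the same header.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()


# Made-up sample data, only to show the call.
sample = '[{"language": "Python", "postings": 3}, {"language": "SQL"}]'
csv_text = json_records_to_csv(sample)
```

For the `.xlsx` outputs listed in this module, a library such as pandas (`DataFrame.to_excel`) is one option; this README does not say which approach the notebooks take.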
- Notebook: 01-API_Data_Collection.ipynb
  - Description: This notebook demonstrates how to collect data using APIs with the help of 01-Jobs_API.ipynb.
  - Output: 01-API_Job_Postings.xlsx, 01-API_Languages.xlsx
- Notebook: 02-Web_Scraping_Data_Collection.ipynb
  - Description: This notebook demonstrates how to collect data using web scraping techniques.
  - Output: 02-Web_Scraped_Languages.csv
- Notebook: 03-Data_Exploration.ipynb
  - Description: This notebook provides an initial exploration of the collected data to understand its structure and content.
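
As a rough illustration of the API step, the sketch below builds a query URL, fetches JSON, and counts postings that list a given skill. It is not the code from the notebooks: the base URL, the `technology` parameter, and the `"Key Skills"` field are all assumptions chosen for the example, and the real Jobs API may use different names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, **params):
    """Append URL-encoded query parameters to a base URL."""
    return f"{base_url}?{urlencode(params)}" if params else base_url


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body (performs a real network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def count_postings(postings, field, skill):
    """Count postings whose comma-separated `field` lists `skill`.

    Assumes each posting is a dict and the field holds text such as
    "Python, SQL"; matching is case-insensitive on whole entries.
    """
    wanted = skill.strip().lower()
    total = 0
    for posting in postings:
        entries = str(posting.get(field, "")).split(",")
        if wanted in (entry.strip().lower() for entry in entries):
            total += 1
    return total


# Hypothetical endpoint and parameter name, shown without making a request.
url = build_url("http://localhost:5000/jobs", technology="Python")

# Made-up postings standing in for a decoded API response.
sample_postings = [
    {"Key Skills": "Python, SQL"},
    {"Key Skills": "JavaScript"},
    {"Key Skills": "python"},
]
python_count = count_postings(sample_postings, "Key Skills", "Python")
```

Matching on whole comma-separated entries rather than substrings keeps a search for "Java" from also counting "JavaScript".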
- Data collection is the first step in solving any analysis problem.
- Different methods like APIs and web scraping are used to gather data.
- The collected data is saved in various formats like CSV and Excel for further analysis.
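
For the web scraping method mentioned above, a minimal sketch of pulling rows out of an HTML table with the standard library's `html.parser` looks like this. The notebooks may well use a different library; the class name `TableParser`, the helper `parse_table`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return all table rows found in an HTML string."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up HTML standing in for a downloaded page.
sample_html = (
    "<table>"
    "<tr><th>Language</th><th>Postings</th></tr>"
    "<tr><td> Python </td><td>3</td></tr>"
    "</table>"
)
rows = parse_table(sample_html)
```

The resulting list of rows can be passed directly to `csv.writer(...).writerows(rows)` to produce a CSV file.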
Module 1 covers the essential task of data collection using APIs and web scraping. The collected data is explored to understand its structure and content, laying the groundwork for data wrangling and analysis in the following modules. The notebooks provide detailed steps for each process, ensuring the data is ready for subsequent analysis.