Commit 792dc5a

Merge pull request larymak#286 from gideonclottey/development
Development
2 parents 73d0c76 + 032912b commit 792dc5a

File tree

3 files changed: +107 -0 lines changed

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
Job Title,Location,Salary,Company Name
SEN Tutor,"SW1, South West London, SW1A 2DD",Recently,Deckers
"SW1, South West London, SW1A 2DD",Recently,£28 - £33 per hour,Deckers
Recently,£28 - £33 per hour,SEN Tutor,Targeted Provision Ltd
£28 - £33 per hour,SEN Tutor,"SW1, South West London, SW1A 2DD",Deckers
SEN Tutor,"SW1, South West London, SW1A 2DD",Recently,Deckers
"SW1, South West London, SW1A 2DD",Recently,£28 - £33 per hour,Deckers
Recently,£28 - £33 per hour,Supply Chain Administrator,Deckers
£28 - £33 per hour,Supply Chain Administrator,"WC2, Central London, WC2N 5DU",EMBS
Supply Chain Administrator,"WC2, Central London, WC2N 5DU",Recently,Deckers
"WC2, Central London, WC2N 5DU",Recently,Unspecified,CV Screen Ltd
Recently,Unspecified,Accounts Payable Assistant,Deckers
Unspecified,Accounts Payable Assistant,"St James, WC2N 5DU",Deckers
Accounts Payable Assistant,"St James, WC2N 5DU",Recently,Webhelp UK
"St James, WC2N 5DU",Recently,Unspecified,Applause IT Limited
Recently,Unspecified,Total Rewards Analyst,Johnson & Associates Rec Specialists Ltd
Unspecified,Total Rewards Analyst,"WC2, Central London, WC2N 5DU",Johnson & Associates Rec Specialists Ltd
Total Rewards Analyst,"WC2, Central London, WC2N 5DU",Recently,Johnson & Associates Rec Specialists Ltd
"WC2, Central London, WC2N 5DU",Recently,Unspecified,Johnson & Associates Rec Specialists Ltd
Recently,Unspecified,SEN Tutor,Elliot Marsh
Unspecified,SEN Tutor,"WC2, Central London, WC2N 5DU",Elliot Marsh
SEN Tutor,"WC2, Central London, WC2N 5DU",Recently,Get Recruited (UK) Ltd
"WC2, Central London, WC2N 5DU",Recently,£28 - £33 per hour,Elliot Marsh
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
import csv

import requests
from bs4 import BeautifulSoup

# URL of the job site (using totaljobs as an example)
url = 'https://www.totaljobs.com/jobs/in-london'

r = requests.get(url)

# Parse the HTML with BeautifulSoup
html_soup = BeautifulSoup(r.content, 'html.parser')

# Target the container that holds the job results
job_details = html_soup.find('div', class_='ResultsContainer-sc-1rtv0xy-2')

# Pull out the tags that carry the title, location and salary
job_titles = job_details.find_all(['h2', 'li', 'dl'])
company_name = job_details.find_all('div', class_='sc-fzoiQi')

# Write the data to a CSV file; an explicit encoding keeps '£' intact
with open('job_data_2.csv', mode='w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['Job Title', 'Location', 'Salary', 'Company Name'])  # header row
    min_length = min(len(job_titles), len(company_name))
    # The window advances one tag at a time, so consecutive rows overlap
    # (visible in the sample CSV above).
    for i in range(0, min_length - 3):
        job_title = job_titles[i].text.strip()
        location = job_titles[i + 1].text.strip()
        salary = job_titles[i + 2].text.strip()
        company = company_name[i + 3].text.strip()
        writer.writerow([job_title, location, salary, company])
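To sanity-check the output, the generated file can be read back (a quick illustrative sketch; job_data_2.csv is the filename the script writes):

import csv

# Read the generated file back and print the first few rows as a sanity check.
with open('job_data_2.csv', newline='', encoding='utf-8') as f:
    for row in list(csv.reader(f))[:5]:
        print(row)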
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
# WebScraping-for-job-Website

This code fetches information about available job listings from the job website totaljobs, filters them according to skills, and saves the output to a local file.

The program is able to fetch the following (a minimal sketch of the resulting CSV row follows the list):
* Job Title/Role needed
* Company name
* Location
* Salary
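For illustration, here is how one listing's four fields become a single CSV row. This is a minimal sketch, not the scraper itself; the values are taken from the sample output above, not live data.

```python
import csv

# Minimal sketch: one scraped listing written as one CSV row.
with open('job_data_2.csv', mode='w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['Job Title', 'Location', 'Salary', 'Company Name'])
    writer.writerow(['SEN Tutor', 'SW1, South West London, SW1A 2DD',
                     '£28 - £33 per hour', 'Deckers'])
```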
### User Story

As a data analyst, I want to be able to pull large amounts of web information into a CSV file.
### Acceptance Criteria

- It is done when I can make a request to a specified URL.
- It is done when I get a response from that URL.
- It is done when I get the target content from the URL.
- It is done when that content is saved in a CSV file (a minimal sketch follows this list).
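A minimal sketch mapping each criterion to a line of code. The `h2` selector and the `titles.csv` filename here are illustrative assumptions, not the selectors the script actually uses.

```python
import csv

import requests
from bs4 import BeautifulSoup

url = 'https://www.totaljobs.com/jobs/in-london'
r = requests.get(url)   # criterion 1: make a request to a specified URL
r.raise_for_status()    # criterion 2: we got a (successful) response

soup = BeautifulSoup(r.content, 'html.parser')
titles = [h2.text.strip() for h2 in soup.find_all('h2')]  # criterion 3: target content

with open('titles.csv', 'w', newline='', encoding='utf-8') as f:  # criterion 4: save to CSV
    csv.writer(f).writerows([t] for t in titles)
```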
#### Sample Output

![](https://github.com/larymak/Python-project-Scripts/blob/main/WebScraping/posts/Capture.PNG)
### Packages used

- BeautifulSoup (bs4)
- requests
- csv (standard library)
### Challenges encountered

- The only real difficulty was locating the precise IDs and selectors (finding elements by ID, XPath, and class with find and find_all) that would reliably return the right information; a defensive sketch follows this list.
- Overall, our team successfully applied Python web scraping to complete our assignment.
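One way to guard against that difficulty is to check that the container lookup actually matched before using it. This is a sketch assuming the same generated class name as the script; the exit message is illustrative.

```python
import requests
from bs4 import BeautifulSoup

# Generated class names like 'ResultsContainer-sc-1rtv0xy-2' can change when
# the site is redeployed, so verify the lookup matched before using it.
r = requests.get('https://www.totaljobs.com/jobs/in-london')
soup = BeautifulSoup(r.content, 'html.parser')

container = soup.find('div', class_='ResultsContainer-sc-1rtv0xy-2')
if container is None:
    raise SystemExit('Results container not found; the site markup may have changed.')
```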
## Steps To Execution

- Fork this repository and navigate to the WebScraping-Data-Analytics folder.
- Install the dependencies with `pip install requests beautifulsoup4`.
- Execute the program by running the pydatanalytics.py file using `$ python pydatanalytics.py`.
- The program will then fetch the information and write it to a CSV file.
### Team Members

- [@gideonclottey](https://github.com/gideonclottey)
- [@Dev-Godswill](https://github.com/Dev-Godswill)
- [@ozomata](https://github.com/ozomata)
- [@narinder-bit](https://github.com/narinder-bit)
- [@Sonia-devi](https://github.com/Sonia-devi)
