Project: BugHub - Team Collaboration & Knowledge Sharing Platform
BugHub is a SaaS pilot project aiming to integrate and analyze data from GitHub and StackOverflow to:
- Build a data pipeline from both sources.
- Visualize defect distributions over the development life cycle.
- Construct a Knowledge Graph using
NetworkX
. - Map StackOverflow questions to GitHub repositories based on tags and topics.
- Store and query data using a graph-based data model.
π DRE_QM.ipynb
- Visualizes defect distribution over software lifecycle stages using data from:
Release1.csv
Release2.csv
- Uses
matplotlib
to create stacked bar charts for:- Requirements
- Analysis
- Design
- Coding
- Unit Testing
- Integration Testing
- System Testing
- Field Errors
π Assignment_5_Part_2.ipynb
, StackOverflow_Tags.ipynb
- Retrieves and processes:
- GitHub repo metadata, issues, labels
- StackOverflow tags, questions, answers
- Uses the PyGitHub API and StackExchange API
- Implements initial knowledge graph modeling with
networkx
π Helper Scripts:
get_issues_details.py
: Gets issue metadata via GitHub APIget_repositories_details.py
: Retrieves repository-level info
π Assignment_5_Part_3 Sp.ipynb
- Combines GitHub and StackOverflow data into a unified BugHub Knowledge Graph
- Models relationships between:
- Repositories β Issues β StackOverflow Tags
- Contributors β Repos
- Tags β Questions β Answers
- Prepares schema for future ingestion into EdgeDB (EdgeQL/GraphQL)
You can install the required packages manually:
pip install pandas numpy matplotlib networkx requests PyGithub π Learning Outcomes Construct real-world data pipelines
Integrate APIs (GitHub + StackOverflow)
Build and visualize knowledge graphs using networkx
Understand lifecycle defect distribution
Prepare for using graph databases like EdgeDB
π§Ύ Resources Used GitHub REST API Docs
PyGitHub Library
StackExchange API
NetworkX Docs
EdgeDB Docs
π¨βπ« Instructor Atef Bader Course: Software Project Management