RExplain 🔍

AI-Powered GitHub Repository Analysis Platform

RExplain explains a GitHub repository using static analysis, architecture extraction, and Retrieval-Augmented Generation (RAG), so you can understand a large codebase without reading every file manually.

Paste a GitHub URL to get an AI-powered breakdown of the codebase.

✨ What It Does

Feature	Description
🧠 Framework Detection	Detects FastAPI, Flask, Django, React, Vue, Angular, SQLAlchemy, PostgreSQL, MongoDB, and more
🏗️ Architecture Diagrams	Generates dependency graphs and visual architecture diagrams
📄 README Extraction	Fetches and renders the repository README with syntax-highlighted code blocks
📊 Commit Analytics	Visualizes commit history, frequency, and contribution graphs
🤖 AI Chatbot (RAG)	Ask natural-language questions about any repository
⚡ Instant Reloads	PostgreSQL-backed cache skips re-analysis on unchanged repositories

🧱 Tech Stack

Backend

FastAPI — REST API framework
PostgreSQL + SQLAlchemy — persistent storage and ORM
SentenceTransformers (all-MiniLM-L6-v2) — embedding generation
Groq API — LLM inference
Graphviz — architecture diagram rendering
GitHub REST API — repository metadata and tree fetching
Docker — containerized deployment

Frontend

React + Vite — fast frontend build
TailwindCSS — utility-first styling
Framer Motion — animations
React Markdown — README rendering
Axios — HTTP client

🚀 How It Works

GitHub URL
   ↓
Repository metadata extraction
   ↓
Selective file fetching (GitHub Tree API)
   ↓
Framework detection
   ↓
Architecture generation
   ↓
README + intelligence extraction
   ↓
RAG indexing
   ↓
Interactive AI chat
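
As an illustration of the framework detection step, here is a minimal sketch that scans manifest files for known dependency names. The marker tables, the `detect_frameworks` name, and the `path -> content` input shape are assumptions made for this example, not the project's actual detection rules.

```python
import json

# Hypothetical marker tables: dependency name -> framework label.
PYTHON_MARKERS = {"fastapi": "FastAPI", "flask": "Flask", "django": "Django", "sqlalchemy": "SQLAlchemy"}
NODE_MARKERS = {"react": "React", "vue": "Vue", "@angular/core": "Angular"}


def detect_frameworks(files: dict[str, str]) -> list[str]:
    """Return sorted framework labels found in manifest files (path -> text)."""
    found: set[str] = set()
    for path, text in files.items():
        name = path.rsplit("/", 1)[-1]
        if name == "requirements.txt":
            for line in text.splitlines():
                # Drop comments, extras, and version specifiers,
                # e.g. "fastapi[all]>=0.110  # web" -> "fastapi".
                pkg = line.split("#", 1)[0].strip()
                for sep in ("[", "=", "<", ">", "~", "!", ";", " "):
                    pkg = pkg.split(sep, 1)[0]
                label = PYTHON_MARKERS.get(pkg.lower())
                if label:
                    found.add(label)
        elif name == "package.json":
            try:
                manifest = json.loads(text)
            except json.JSONDecodeError:
                continue  # unreadable manifest: skip rather than fail the analysis
            if not isinstance(manifest, dict):
                continue
            deps = {**(manifest.get("dependencies") or {}),
                    **(manifest.get("devDependencies") or {})}
            for dep in deps:
                label = NODE_MARKERS.get(dep.lower())
                if label:
                    found.add(label)
    return sorted(found)
```

Matching on manifests keeps detection cheap, since only a handful of small files need to be fetched; a fuller detector would likely also inspect imports and config files.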

🔑 Key Engineering Decisions

GitHub Tree API + Selective Fetching

Cloning an entire repository is a bottleneck for large repos and for those containing binaries, node_modules, or datasets. Instead, RExplain:

Fetches the repository tree structure via the GitHub API
Identifies and selectively retrieves only important files — README, package.json, requirements.txt, route files, configs, and manifests
Falls back to a shallow clone (depth=1) only if the API is unavailable

Result: Analysis avoids large clone delays, so its cost depends on the files selected rather than on total repository size.
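
The selection step can be sketched as a filter over the `tree` array that GitHub's Git Trees endpoint returns (`GET /repos/{owner}/{repo}/git/trees/{sha}?recursive=1`), where each entry carries `path`, `type`, and, for blobs, `size`. The name lists, the size limit, and the route-file heuristic below are assumptions for illustration; the project's real rules may differ.

```python
from pathlib import PurePosixPath

# Assumed selection rules, chosen for this example only.
IMPORTANT_NAMES = {"README.md", "package.json", "requirements.txt"}
SKIPPED_DIRS = {"node_modules", ".git", "dist", "build", "venv"}
MAX_FILE_BYTES = 200_000


def select_important_files(tree: list[dict]) -> list[str]:
    """Pick the paths worth fetching from a Git Trees API `tree` array."""
    selected = []
    for entry in tree:
        if entry.get("type") != "blob":  # skip directories and submodules
            continue
        path = PurePosixPath(entry["path"])
        if SKIPPED_DIRS.intersection(path.parts[:-1]):
            continue  # vendored or generated directory
        if entry.get("size", 0) > MAX_FILE_BYTES:
            continue  # likely a binary or dataset
        is_route_file = "route" in path.stem.lower() and path.suffix in {".py", ".js", ".ts"}
        if path.name in IMPORTANT_NAMES or is_route_file:
            selected.append(str(path))
    return selected
```

Only the selected paths then need a content request, which is what keeps the number of API calls small compared with a full clone.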

Persistent Embedding Cache

Embeddings and chunk metadata are stored in PostgreSQL so that repeated loads of an unchanged repository skip re-analysis:

Fetch the latest commit SHA and compare with the cached SHA
If unchanged → restore embeddings and chunks, skip full analysis
If changed → run the full pipeline and persist the new state
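
The SHA comparison above can be sketched as follows. This is a minimal illustration, assuming a plain dict in place of the PostgreSQL tables and a hypothetical `run_pipeline` callable that returns chunks and embeddings; none of these names come from the project.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CachedAnalysis:
    commit_sha: str
    chunks: list[str] = field(default_factory=list)
    embeddings: list[list[float]] = field(default_factory=list)


def load_or_analyze(
    repo: str,
    latest_sha: str,
    cache: dict[str, CachedAnalysis],  # stand-in for the PostgreSQL store
    run_pipeline: Callable[[str], tuple[list[str], list[list[float]]]],
) -> tuple[CachedAnalysis, bool]:
    """Return (analysis, cache_hit) for a repository at its latest commit."""
    cached = cache.get(repo)
    if cached is not None and cached.commit_sha == latest_sha:
        return cached, True  # unchanged: restore chunks and embeddings
    chunks, embeddings = run_pipeline(repo)  # changed or never seen: full analysis
    fresh = CachedAnalysis(latest_sha, chunks, embeddings)
    cache[repo] = fresh  # persist the new state
    return fresh, False
```

Keying the cache on the commit SHA means a single lightweight API call decides whether the expensive embedding work can be skipped.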

Startup Model Loading

The embedding model loads once at backend startup rather than per-request, eliminating timeout issues during AI chat.
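
One way to express the load-once pattern is a small holder object that is filled during startup and only read by request handlers. This sketch is framework-agnostic; `ModelHolder` and its methods are hypothetical names for illustration.

```python
from typing import Callable, Optional


class ModelHolder:
    """Holds one shared model instance, loaded during application startup."""

    def __init__(self, loader: Callable[[], object]):
        self._loader = loader
        self._model: Optional[object] = None

    def load(self) -> None:
        # Intended to be called once from the startup hook; repeat calls are no-ops.
        if self._model is None:
            self._model = self._loader()

    def get(self) -> object:
        # Request handlers only read the already-loaded model.
        if self._model is None:
            raise RuntimeError("model not loaded; call load() at startup")
        return self._model
```

In a FastAPI application, `load()` could plausibly be called from a lifespan handler, with a loader such as `lambda: SentenceTransformer("all-MiniLM-L6-v2")`, so that no request pays the model-loading cost.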

🤖 AI Chatbot (RAG)

Users can ask natural-language questions about any analyzed repository:

"What frameworks are used?"
"Where is the database set up?"
"How does authentication work?"
"What are the main API routes?"

How it works:

Repository files are chunked and embedded using all-MiniLM-L6-v2
On each question, relevant chunks are retrieved via semantic similarity search
Retrieved context is passed to an LLM (Groq) for a grounded, repository-specific answer
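
The three steps above can be sketched end to end in plain Python. To keep the example dependency-free, the embedding here is a bag-of-words token count, a deliberate stand-in for the all-MiniLM-L6-v2 vectors the project uses; the function names, chunk sizes, and prompt wording are also assumptions.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_lines: int = 20, overlap: int = 5) -> list[str]:
    """Split text into overlapping windows of lines."""
    lines = text.splitlines()
    step = max(1, max_lines - overlap)
    chunks, i = [], 0
    while i < len(lines):
        chunks.append("\n".join(lines[i:i + max_lines]))
        if i + max_lines >= len(lines):
            break  # the last window already reached the end of the file
        i += step
    return chunks


def embed(text: str) -> Counter:
    """Stand-in embedding: token counts instead of a sentence-transformer vector."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the grounded prompt that would be sent to the LLM."""
    joined = "\n---\n".join(context)
    return f"Answer using only this repository context:\n{joined}\n\nQuestion: {question}"
```

With real sentence embeddings the `cosine` ranking stays the same; only `embed` changes, and chunk embeddings would be computed once and cached rather than recomputed per question.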

🏗️ Architecture

┌─────────────────────────────────┐
│        React Frontend           │  ← Vercel
│  (Analysis Panel + AI Chat)     │
└────────────────┬────────────────┘
                 │ HTTPS
┌────────────────▼────────────────┐
│        FastAPI Backend          │  ← Render (Docker)
│  Analysis · RAG · Cache Layer   │
└──────┬──────────────────┬───────┘
       │                  │
┌──────▼──────┐   ┌───────▼──────┐
│ PostgreSQL  │   │  GitHub API  │
│   (Neon)    │   │  Groq API    │
└─────────────┘   └──────────────┘

☁️ Deployment

Layer	Platform
Frontend	Vercel
Backend	Render (Docker)
Database	Neon PostgreSQL

Docker is used for the backend because Graphviz requires system-level installation, which Render's standard Python runtime does not support natively.

⚠️ Known Limitations

GitHub API Timeouts — Very large repositories may be slower to analyze; retry logic and graceful fallbacks are in place
Cold Starts — Render's free tier sleeps inactive services; the first request after inactivity may take a few seconds longer
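
As a rough picture of what retry logic with a graceful fallback can look like, here is a minimal sketch using exponential backoff. The `with_retries` helper, its parameters, and the choice of built-in `TimeoutError` and `ConnectionError` are assumptions; a real HTTP client raises its own exception types, which would need to be caught instead.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    fallback: Optional[Callable[[], T]] = None,
) -> T:
    """Run `call`, retrying with exponential backoff, then fall back if given."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return call()
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # waits base_delay, then 2x, 4x, ...
    if fallback is not None:
        return fallback()  # e.g. a shallow clone when the API stays unavailable
    raise last_error if last_error else RuntimeError("attempts must be >= 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.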

🗺️ Roadmap

Interactive Diagrams — Replace Graphviz with React Flow for clickable, interactive dependency graphs
Advanced RAG — Hybrid retrieval, AST-aware chunking, reranking, and source citations
Streaming Responses — Real-time streaming AI chat
Multi-Repository Analysis — Analyze microservice ecosystems and org-wide dependency relationships
Background Workers — Celery + Redis for async job pipelines

🚀 Try it live at rexplain.vercel.app
