An intelligent Q&A bot for the Underworld3 geodynamics modeling framework, powered by Claude AI and semantic search.
HelpfulBatBot indexes Underworld3 documentation, examples, and tests to answer user questions with context-aware responses. It uses:
- FAISS for fast semantic search
- Sentence Transformers for document embeddings
- Claude AI with prompt caching for intelligent answers
- Multi-repository support via git integration
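To illustrate the retrieval step these components perform, here is a minimal sketch using only the standard library. It is not the bot's implementation: a toy bag-of-words vector stands in for a Sentence Transformers embedding, and a brute-force cosine ranking stands in for a FAISS index. The function names and the sample passages are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (a stand-in for a real sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(question: str, documents: list, k: int = 2) -> list:
    """Rank documents by similarity to the question and keep the best k."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


# Made-up sample passages, standing in for indexed documentation chunks.
docs = [
    "Create a mesh with a structured box of elements",
    "Swarms carry particle data through the mesh",
    "Install the package with pip",
]

question = "How do I create a mesh?"
context = top_k(question, docs, k=1)

# The retrieved passages would be placed ahead of the question in the prompt.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(prompt)
```

In the real bot the ranking would come from dense embeddings and an approximate or exact vector index, but the flow is the same: embed the question, retrieve the closest passages, and pass them to the model as context.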
# Clone this repository
git clone https://github.com/underworldcode/underworld-helpful-batbot.git
cd underworld-helpful-batbot
# Install dependencies
pip install -r requirements.txt
# Create .env file with your API key
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY
# Run the bot (will clone UW3 content on first run)
python HelpfulBat_app.py
# In another terminal
python ask.py "How do I create a mesh in Underworld3?"
python ask.py "What are swarms?"
python ask.py status
The bot automatically clones and indexes content from configured git repositories:
# content_sources.yaml
content_sources:
  - name: "underworld3"
    url: "https://github.com/underworldcode/underworld3.git"
    branch: "main"
    local_path: "./content_cache/underworld3"
    update_frequency: "daily"
    include_paths:
      - "docs/beginner/tutorials/*.ipynb"
      - "examples/*.py"
      - "tests/test_0[0-6]*.py"

- On startup: Checks if content needs updating based on update_frequency
- Cached locally: Cloned repos stored in ./content_cache/
- Shallow clones: Fast downloads with --depth 1
- Automatic pulls: Updates existing content without re-cloning
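The update policy described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in content_manager.py: the `MAX_AGE` table, `needs_update`, and `git_command` are hypothetical names, and only the "daily" frequency appears in this README.

```python
from datetime import datetime, timedelta

# Assumed mapping from update_frequency to a maximum cache age.
# Only "daily" is shown in the configuration; the other keys are hypothetical.
MAX_AGE = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def needs_update(last_updated, frequency, now):
    """Return True if the cache has never been filled or is older than allowed."""
    if last_updated is None:
        return True
    return now - last_updated >= MAX_AGE[frequency]


def git_command(url, branch, local_path, already_cloned):
    """Build the git invocation: pull into an existing clone, else shallow-clone."""
    if already_cloned:
        return ["git", "-C", local_path, "pull", "origin", branch]
    return ["git", "clone", "--depth", "1", "--branch", branch, url, local_path]


now = datetime(2024, 1, 2, 12, 0)
print(needs_update(datetime(2024, 1, 1, 9, 0), "daily", now))
print(git_command(
    "https://github.com/underworldcode/underworld3.git",
    "main",
    "./content_cache/underworld3",
    already_cloned=False,
))
```

The returned list can be passed to `subprocess.run(..., check=True)`. Building the command separately from running it keeps the decision logic easy to test without network access.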
From the Underworld3 repository (~86 files):
- ✅ Beginner tutorials
- ✅ Example scripts
- ✅ A/B grade test files
- ✅ Main documentation (README, CLAUDE.md)
- ❌ Source code internals (excluded)
- ❌ Developer documentation (excluded)
# Required
ANTHROPIC_API_KEY=sk-ant-...
# Optional (defaults shown)
CLAUDE_MODEL=claude-3-haiku-20240307
CONTENT_UPDATE_FREQUENCY=daily
PORT=8001
See content_sources.yaml for full configuration. You can add multiple repositories:
content_sources:
  - name: "underworld3"
    # ... config ...
  - name: "community-examples"
    url: "https://github.com/yourorg/uw3-examples.git"
    # ... config ...
# Install Fly CLI
curl -L https://fly.io/install.sh | sh
# Login
flyctl auth login
# Create app (Sydney region)
flyctl launch --name underworld-helpfulbat --region syd
# Create persistent volume for content cache
flyctl volumes create helpfulbat_content --size 5 --region syd
# Set secrets
flyctl secrets set ANTHROPIC_API_KEY=sk-ant-...
# Deploy
flyctl deploy
# Build image
docker build -t helpfulbatbot .
# Run with volume for content cache
docker run -d \
  -p 8001:8001 \
  -v helpfulbat_content:/app/content_cache \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  helpfulbatbot
Once running, the bot provides:
- POST /ask - Ask a question
- GET /health - Health check
- GET /docs - API documentation (Swagger UI)
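The POST /ask endpoint listed above can also be called from Python. The sketch below uses only the standard library. The field names `question` and `max_context_items` match the curl example in this README; the helper names `build_ask_request` and `ask` are hypothetical, and the shape of the JSON response is not specified here, so it is returned as-is.

```python
import json
import urllib.request


def build_ask_request(question, max_context_items=6, base_url="http://localhost:8001"):
    """Build (but do not send) a JSON POST request for the /ask endpoint."""
    payload = json.dumps(
        {"question": question, "max_context_items": max_context_items}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, **kwargs):
    """Send the question to a running bot and return the decoded JSON reply."""
    request = build_ask_request(question, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the bot to be running locally on port 8001.
    print(ask("How do I create a mesh?"))
```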
curl -X POST http://localhost:8001/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I create a mesh?", "max_context_items": 6}'
underworld-helpful-batbot/
├── HelpfulBat_app.py          # Main bot server
├── content_manager.py         # Multi-repo content management
├── content_sources.yaml       # Repository configuration
├── ask.py                     # CLI client
├── start_bot.sh               # Startup script
├── demo.sh                    # Demo script
├── requirements.txt           # Python dependencies
├── Dockerfile                 # Docker image
├── fly.toml                   # Fly.io configuration
├── .env.example               # Example environment config
└── content_cache/             # Cloned repositories (gitignored)
    └── underworld3/           # UW3 content
1. Edit content_sources.yaml:
   content_sources:
     - name: "new-repo"
       url: "https://github.com/org/repo.git"
       branch: "main"
       local_path: "./content_cache/new-repo"
       update_frequency: "daily"
       include_paths:
         - "**/*.md"
       exclude_paths:
         - ".git/**/*"
2. Restart the bot - it will automatically clone and index the new repository
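To show how include_paths and exclude_paths patterns like the ones above might select files, here is a small sketch based on `fnmatch`. Whether the bot matches patterns this way is an assumption; `select_files` and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(paths, include_paths, exclude_paths=()):
    """Keep paths that match any include pattern and no exclude pattern."""
    selected = []
    for path in paths:
        included = any(fnmatchcase(path, pattern) for pattern in include_paths)
        excluded = any(fnmatchcase(path, pattern) for pattern in exclude_paths)
        if included and not excluded:
            selected.append(path)
    return selected


# Hypothetical repository-relative paths.
files = ["docs/guide.md", ".git/hooks/notes.md", "src/solver.py"]
print(select_files(files, ["**/*.md"], [".git/**/*"]))
# prints ['docs/guide.md']
```

Note that with `fnmatch` a `*` also matches `/`, and `**/*.md` requires at least one directory separator, so a top-level `README.md` would need its own pattern under this matching scheme.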
from content_manager import ContentManager
# Load configuration
manager = ContentManager("content_sources.yaml")
# Force update all sources
manager.update_all(force=True)
# Get statistics
stats = manager.get_stats()
print(f"Total files: {stats['total_files']}")
With Claude Haiku and prompt caching:
- 100 questions/day: ~$25/month
- 500 questions/day: ~$125/month
- 2000 questions/day: ~$500/month
Hosting on Fly.io free tier: $0/month (within limits)
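The three estimates above scale linearly, which implies roughly $25 / (100 × 30) ≈ $0.0083 per question, assuming a 30-day month. A short sketch for estimating other volumes from that implied rate (the function name and the 30-day assumption are illustrative, and actual costs depend on question length and cache hit rate):

```python
# Per-question cost implied by the "100 questions/day: ~$25/month" figure,
# assuming a 30-day month.
COST_PER_QUESTION = 25 / (100 * 30)


def monthly_cost(questions_per_day, days=30):
    """Estimate the monthly API cost in dollars for a given daily volume."""
    return questions_per_day * days * COST_PER_QUESTION


for volume in (100, 500, 2000):
    print(f"{volume} questions/day: ~${monthly_cost(volume):.0f}/month")
```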
- Auto-respond to GitHub issues
- React to user feedback with 👍/👎
- Flag low-confidence answers for human review
- Embedded chat widget in documentation
- Context-aware help based on page
- Inline code assistance
- Syntax-aware Q&A
- Track user feedback
- Identify documentation gaps
- Auto-generate planning documents from user needs
[Add your license here]
Contributions welcome! Please see CONTRIBUTING.md for guidelines.