Be the first to know when I publish new AI builds + demos!
An autonomous AI researcher. It takes a research objective, breaks it into experiments, spins up separate agents with access to their own GPUs to run these experiments, and delivers a paper-style writeup with findings.
- Decomposes your prompt into experiments and assigns them to specialist researcher agents.
- Each agent can launch GPU-enabled sandboxes to train models/run inference/etc., evaluate, and collect evidence.
- Based on the results of these experiments, the orchestrator can decide to finalize, or run more experiments.
- The orchestrator goes over all of the results and turns them into a coherent "paper".
The fastest way to use it:
python run_app.py
This installs missing deps, starts the API + frontend, and opens the notebook. If Google/Modal keys aren’t set, the UI will prompt you and save them locally before the run starts.
- LLM key (at least one):
- Google AI Studio:
GOOGLE_API_KEY(for Gemini 3 Pro) - Anthropic:
ANTHROPIC_API_KEY(for Claude Opus 4.5)
- Google AI Studio:
- Modal tokens:
MODAL_TOKEN_IDandMODAL_TOKEN_SECRET(for GPU sandboxes) - Add them to
.envin the repo root, or paste them into the web prompt when asked.
Choose between Gemini 3 Pro and Claude Opus 4.5 from the dropdown in the web UI, or via CLI with --model.
Prefer the terminal?
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python main.py "Does label smoothing improve ViT-Base on CIFAR-10?" --mode single --gpu any --model gemini-3-pro-preview
Orchestrator (multi-agent):
python main.py "Characterize scaling laws for sparse attention transformers" \
--mode orchestrator --num-agents 3 --max-rounds 3 --max-parallel 2 --gpu any
Dry run:
python main.py "Sanity check the pipeline" --mode orchestrator --test-mode
Steps:
- Click the button above (or go to Railway and select "Deploy from GitHub repo")
- Connect your GitHub account and select this repo (or your fork)
- Railway will automatically detect the Dockerfile and build the app
- Once deployed, open the app URL and enter your API keys in the UI
Optional environment variables (if you want server-side defaults):
GOOGLE_API_KEY- Google AI Studio key for Gemini 3 ProANTHROPIC_API_KEY- Anthropic key for Claude Opus 4.5MODAL_TOKEN_IDandMODAL_TOKEN_SECRET- For GPU sandboxes
Note: Users can also enter their own keys directly in the web UI without setting environment variables.
This is a super-early, experimental harness. There are a number of improvements to be worked out (i.e. dataset sharing between agents, key management, etc.), literature search, that would make this way more capable. If anyone wants to add these in, feel free!