AI Researcher


An autonomous AI researcher. It takes a research objective, breaks it into experiments, spins up separate agents with access to their own GPUs to run these experiments, and delivers a paper-style writeup with findings.

How it Works

The orchestrator decomposes your prompt into experiments and assigns them to specialist researcher agents.
Each agent can launch GPU-enabled sandboxes to train models, run inference, evaluate, and collect evidence.
Based on the results of these experiments, the orchestrator decides whether to finalize or run more experiments.
Once it finalizes, the orchestrator reviews all of the results and turns them into a coherent "paper".
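
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in orchestrator.py: the names `orchestrate`, `ResearchState`, `ExperimentResult`, and the three callables are hypothetical, and the LLM and GPU sandbox calls are replaced with plain functions so the control flow is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ExperimentResult:
    experiment: str  # what was tried
    evidence: str    # what the agent found


@dataclass
class ResearchState:
    objective: str
    results: List[ExperimentResult] = field(default_factory=list)


def orchestrate(
    objective: str,
    plan: Callable[[ResearchState], List[str]],    # hypothetical: proposes experiments
    run_agent: Callable[[str], ExperimentResult],  # hypothetical: runs one experiment
    write_paper: Callable[[ResearchState], str],   # hypothetical: writes the report
    max_rounds: int = 3,
) -> str:
    """Plan, run, and re-plan until the planner finalizes or rounds run out."""
    state = ResearchState(objective)
    for _ in range(max_rounds):
        experiments = plan(state)
        if not experiments:  # an empty plan means "finalize"
            break
        for exp in experiments:
            state.results.append(run_agent(exp))
    return write_paper(state)


# Stub callables standing in for the LLM and the GPU sandboxes.
def demo_plan(state: ResearchState) -> List[str]:
    # Propose two experiments in the first round, then finalize.
    return [] if state.results else ["baseline", "label smoothing"]


def demo_agent(exp: str) -> ExperimentResult:
    return ExperimentResult(exp, f"evidence for {exp}")


def demo_paper(state: ResearchState) -> str:
    lines = [f"Objective: {state.objective}"]
    lines += [f"- {r.experiment}: {r.evidence}" for r in state.results]
    return "\n".join(lines)


paper = orchestrate("Does label smoothing help?", demo_plan, demo_agent, demo_paper)
print(paper)
```

Passing the planner, agent, and writer in as callables keeps the sketch runnable without any API keys; in the real project these steps are backed by the LLM and the sandboxes.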

Run it (web notebook, one command)

The fastest way to use it:

python run_app.py

This installs missing deps, starts the API + frontend, and opens the notebook. If Google/Modal keys aren’t set, the UI will prompt you and save them locally before the run starts.

Keys Needed

LLM key (at least one):
- Google AI Studio: GOOGLE_API_KEY (for Gemini 3 Pro)
- Anthropic: ANTHROPIC_API_KEY (for Claude Opus 4.5)
Modal tokens: MODAL_TOKEN_ID and MODAL_TOKEN_SECRET (for GPU sandboxes)
Add them to .env in the repo root, or paste them into the web prompt when asked.
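
If you want to check your environment before launching, a small script along these lines may help. It is only a sketch of the rule stated above (at least one LLM key, plus both Modal tokens); the function name `missing_credentials` is hypothetical and is not part of this repo, and it reads from the process environment rather than parsing .env itself.

```python
import os
from typing import List, Mapping

# Variable names taken from the list above.
LLM_KEYS = ("GOOGLE_API_KEY", "ANTHROPIC_API_KEY")
MODAL_KEYS = ("MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET")


def missing_credentials(env: Mapping[str, str]) -> List[str]:
    """Return a description of each missing requirement (empty list if none)."""
    problems: List[str] = []
    # At least one LLM key must be set to a non-empty value.
    if not any(env.get(key) for key in LLM_KEYS):
        problems.append("one of " + " / ".join(LLM_KEYS))
    # Both Modal tokens are needed for GPU sandboxes.
    problems.extend(key for key in MODAL_KEYS if not env.get(key))
    return problems


if __name__ == "__main__":
    print(missing_credentials(os.environ) or "all required keys are set")
```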

Model Selection

Choose between Gemini 3 Pro and Claude Opus 4.5 from the dropdown in the web UI, or via CLI with --model.

Optional CLI

Prefer the terminal?

python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python main.py "Does label smoothing improve ViT-Base on CIFAR-10?" --mode single --gpu any --model gemini-3-pro-preview

Orchestrator (multi-agent):

python main.py "Characterize scaling laws for sparse attention transformers" \
  --mode orchestrator --num-agents 3 --max-rounds 3 --max-parallel 2 --gpu any
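
As a rough mental model for --max-parallel, assuming it caps how many agents run at the same time within a round, the idea can be sketched with a bounded thread pool. The `run_round` function and its arguments are hypothetical and only illustrate the concept; the project's actual scheduling may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_round(
    experiments: List[str],
    run_agent: Callable[[str], str],  # hypothetical: runs one experiment
    max_parallel: int = 2,
) -> List[str]:
    """Run every experiment, with at most `max_parallel` in flight at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(run_agent, experiments))


results = run_round(["exp a", "exp b", "exp c"], str.upper, max_parallel=2)
print(results)
```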

Dry run:

python main.py "Sanity check the pipeline" --mode orchestrator --test-mode
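
For reference, the flags used in the commands above could be declared with argparse roughly like this. This is a hypothetical mirror, not the parser in main.py: the flag names come from the examples, but the defaults and types are assumptions chosen to match the values shown there.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Hypothetical mirror of the CLI flags shown in this README"
    )
    parser.add_argument("objective", help="research objective, as a quoted string")
    parser.add_argument("--mode", choices=["single", "orchestrator"], default="single")
    # Defaults below are assumptions, copied from the example commands.
    parser.add_argument("--gpu", default="any")
    parser.add_argument("--model", default="gemini-3-pro-preview")
    parser.add_argument("--num-agents", type=int, default=3)
    parser.add_argument("--max-rounds", type=int, default=3)
    parser.add_argument("--max-parallel", type=int, default=2)
    parser.add_argument("--test-mode", action="store_true", help="dry run")
    return parser


# Parse the dry-run command from above without touching sys.argv.
args = build_parser().parse_args(
    ["Sanity check the pipeline", "--mode", "orchestrator", "--test-mode"]
)
print(args.mode, args.test_mode, args.num_agents)
```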

Deploy to Railway

Steps:

Click the button above (or go to Railway and select "Deploy from GitHub repo")
Connect your GitHub account and select this repo (or your fork)
Railway will automatically detect the Dockerfile and build the app
Once deployed, open the app URL and enter your API keys in the UI

Optional environment variables (if you want server-side defaults):

GOOGLE_API_KEY - Google AI Studio key for Gemini 3 Pro
ANTHROPIC_API_KEY - Anthropic key for Claude Opus 4.5
MODAL_TOKEN_ID and MODAL_TOKEN_SECRET - For GPU sandboxes

Note: Users can also enter their own keys directly in the web UI without setting environment variables.

Status/Contribution

This is a super-early, experimental harness. A number of improvements still need to be worked out (e.g. dataset sharing between agents, key management, literature search) that would make it far more capable. If anyone wants to add these, feel free!

mshumer/autonomous-researcher
