Developed by the AI Verify Foundation, Moonshot is a tool that brings Benchmarking and Red-Teaming together to help AI developers and compliance teams evaluate LLM-based Apps and LLMs.
In the rapidly evolving landscape of Generative AI, ensuring the safety, reliability, and performance of LLM applications is paramount. Moonshot addresses this need by providing a unified platform for:
- Benchmark Tests: Systematically test LLM Apps or LLMs across critical trust & safety risks using a wide array of open-source benchmark datasets and metrics, including guided workflows to implement IMDA's Starter Kit for LLM-based App Testing.
- Red Team Attacks: Proactively identify vulnerabilities and potential misuse scenarios in your LLM applications through streamlined adversarial prompting.
- User-friendly Interfaces: Interact with Moonshot via an intuitive Web UI for visual insights, and an interactive Command Line Interface (CLI) for quick operations.
- Comprehensive Benchmarking:
  - View the list of available datasets
  - Test for Performance (e.g., accuracy, BLEU)
  - Ensure Trust & Safety (e.g., bias, toxicity, hallucination)
  - Use the built-in workflow to implement IMDA's Starter Kit for LLM-based App Testing. View available pre-built Cookbooks
- Powerful Red-Teaming:
  - View the list of available attack modules
  - Simplify adversarial prompt generation using algorithmic strategies or a generative LLM to uncover potential misuse.
  - Leverage prompt templates, context strategies, and automated attack modules.
- Customizable Recipes: Build your own benchmark tests with custom datasets (input-target pairs), prompt templates (optional), evaluation metrics, and grading scales. View available pre-built Recipes
- Insightful Reporting: Use our HTML reports with interactive charts for clear visualization of test results, and download detailed raw JSON results for deeper programmatic analysis.
- Extensible & Modular: Designed for easy extension and integration with new LLM applications, benchmarks, and attack techniques.
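To make the idea of a recipe concrete, here is a minimal, self-contained Python sketch of the four ingredients named under Customizable Recipes: a dataset of input-target pairs, an optional prompt template, an evaluation metric, and a grading scale. It does not use Moonshot's APIs or file formats. Every name in it (`DATASET`, `PROMPT_TEMPLATE`, `exact_match`, `grade`, `run_recipe`, `fake_llm`) and the grade thresholds are hypothetical and chosen only for illustration.

```python
from typing import Callable

# Hypothetical dataset: input-target pairs.
DATASET = [
    {"input": "2 + 2", "target": "4"},
    {"input": "capital of France", "target": "Paris"},
    {"input": "3 * 3", "target": "9"},
    {"input": "opposite of hot", "target": "cold"},
]

# Hypothetical prompt template wrapped around each input.
PROMPT_TEMPLATE = "Answer briefly: {input}"

# Hypothetical grading scale: (minimum score, grade), highest first.
GRADING_SCALE = [(80.0, "A"), (60.0, "B"), (40.0, "C"), (20.0, "D"), (0.0, "E")]


def exact_match(predictions: list[str], targets: list[str]) -> float:
    """Percentage of predictions equal to their target (case-insensitive)."""
    if not targets:
        return 0.0
    hits = sum(
        p.strip().lower() == t.strip().lower()
        for p, t in zip(predictions, targets)
    )
    return 100.0 * hits / len(targets)


def grade(score: float) -> str:
    """Map a 0-100 score to the first grade whose minimum it reaches."""
    for minimum, letter in GRADING_SCALE:
        if score >= minimum:
            return letter
    return GRADING_SCALE[-1][1]


def run_recipe(llm: Callable[[str], str]) -> tuple[float, str]:
    """Render prompts, query the model, then score and grade the answers."""
    prompts = [PROMPT_TEMPLATE.format(input=row["input"]) for row in DATASET]
    predictions = [llm(prompt) for prompt in prompts]
    score = exact_match(predictions, [row["target"] for row in DATASET])
    return score, grade(score)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: answers three of the four prompts correctly."""
    answers = {"2 + 2": "4", "capital of France": "paris", "3 * 3": "9"}
    question = prompt.removeprefix("Answer briefly: ")
    return answers.get(question, "unknown")


if __name__ == "__main__":
    print(run_recipe(fake_llm))
```

With the stand-in model, three of four answers match, so the sketch reports a score of 75.0 and the grade "B" under the assumed thresholds. In a real recipe, the metric and scale would be the ones you configure in Moonshot.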
Moonshot can be used through several interfaces:
- User-friendly Web UI - Web UI User Guide
- Interactive Command Line Interface - CLI User Guide
- Seamless Integration into your MLOps workflow via Moonshot Library APIs or Moonshot Web APIs - Notebook Examples, Web API Docs
This section will guide you through getting Moonshot up and running.
- Python: Version 3.11 is required.
- Git Version Control: Git is essential for cloning the repository.
- (Optional) Virtual Environment: Highly recommended to manage dependencies.
  # Create a virtual environment
  python -m venv venv
  # Activate the virtual environment
  source venv/bin/activate
- If you plan to install our Web UI, you will also need Node.js version 20.11.1 LTS or above.
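The prerequisites above can be checked before installing. The following is a rough sketch, not part of Moonshot: it assumes "Version 3.11 is required" means Python 3.11.x exactly, that `git` and `node` are on your `PATH`, and that both print a dotted version number in response to `--version`. The helper names (`parse_version`, `python_ok`, `node_ok`, `tool_version`) are hypothetical.

```python
import re
import shutil
import subprocess
import sys
from typing import Optional

REQUIRED_PYTHON = (3, 11)   # assumption: any 3.11.x release qualifies
MIN_NODE = (20, 11, 1)      # only needed for the Web UI


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted version number from text such as 'v20.11.1'."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def python_ok(version: tuple[int, ...] = tuple(sys.version_info[:2])) -> bool:
    """True if the major.minor version is exactly 3.11."""
    return tuple(version[:2]) == REQUIRED_PYTHON


def node_ok(version_text: str) -> bool:
    """True if the reported Node.js version is 20.11.1 or newer."""
    return parse_version(version_text) >= MIN_NODE


def tool_version(command: str) -> Optional[str]:
    """Return the output of '<command> --version', or None if it is not installed."""
    if shutil.which(command) is None:
        return None
    result = subprocess.run(
        [command, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__":
    print("Python 3.11:", "ok" if python_ok() else "not 3.11")
    print("Git:", tool_version("git") or "not found")
    node = tool_version("node")
    if node is None:
        print("Node.js: not found (only needed for the Web UI)")
    else:
        print("Node.js:", node, "ok" if node_ok(node) else "older than 20.11.1")
```

Run it with the interpreter you intend to use for Moonshot, since the Python check inspects the interpreter that executes the script.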
You can install Moonshot in several ways, depending on your needs.
1. Using pip (Recommended for most users)
# Install Project Moonshot's Python Library, which includes Moonshot's full functionalities (Library APIs, CLI and Web APIs)
pip install "aiverify-moonshot[all]"
# Clone and install test assets and Web UI
python -m moonshot -i moonshot-data -i moonshot-ui
If you plan to install our Web UI, you will also need moonshot-ui.
Check out our Installation Guide for more details.
2. From Source Code (For developers and contributors)
# To install from source code (Full functionalities)
git clone git@github.com:aiverify-foundation/moonshot.git
cd moonshot
pip install -r requirements.txt
If you have installation issues, please take a look at the Troubleshooting Guide.
Other installation options
Here's a summary of other installation commands available:
# To install Moonshot library APIs only
pip install aiverify-moonshot
# To install Moonshot library APIs and Web APIs only
pip install "aiverify-moonshot[web-api]"
# To install Moonshot library APIs and CLI only
pip install "aiverify-moonshot[cli]"
Check out our Installation Guide for more details.
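The installation options above differ only in which pip extra is requested. As a small illustration, the sketch below maps the interfaces you need to the matching requirement string. The function `pip_requirement` is hypothetical and not part of Moonshot. It also assumes that wanting both the CLI and the Web APIs is served by the `all` extra, which the pip instructions describe as including the Library APIs, CLI, and Web APIs.

```python
PACKAGE = "aiverify-moonshot"


def pip_requirement(cli: bool = False, web_api: bool = False) -> str:
    """Return the pip requirement string for the requested interfaces.

    The Library APIs are always included; extras add the CLI and/or Web APIs.
    """
    if cli and web_api:
        return f"{PACKAGE}[all]"
    if cli:
        return f"{PACKAGE}[cli]"
    if web_api:
        return f"{PACKAGE}[web-api]"
    return PACKAGE


if __name__ == "__main__":
    # Example: a CI job that only drives Moonshot through the CLI.
    print("pip install", f'"{pip_requirement(cli=True)}"')
```

The quotes around the requirement matter in shells such as zsh, where unquoted square brackets are treated as glob patterns.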
To start the Web UI, run:
python -m moonshot web
Open http://localhost:3000/ in a browser; you should see the Moonshot homepage.
Refer to this guide to discover the rich features available in the Moonshot Web UI.
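If you start the Web UI from a script, it can help to wait until http://localhost:3000/ responds before opening a browser or running further steps. The sketch below uses only the Python standard library. The helpers `is_up` and `wait_until_up`, along with the retry count and delay, are hypothetical choices and not part of Moonshot.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def wait_until_up(url: str, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll the URL until it responds or the attempts run out."""
    for attempt in range(attempts):
        if is_up(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, after launching `python -m moonshot web` in another terminal, calling `wait_until_up("http://localhost:3000/")` returns `True` once the page is being served, or `False` if nothing answers within the assumed 30 attempts.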
To start the interactive CLI, run:
python -m moonshot cli interactive
Refer to this Command List for the full set of Moonshot CLI commands.
For detailed information on configuring, using, and extending Moonshot, please refer to our comprehensive documentation:
- Getting Started with Moonshot Web UI
- Creating Your Custom Cookbook via Moonshot Web UI
- Creating Your Custom Connector Endpoint via Moonshot Web UI
- Running Benchmark Test on Moonshot Web UI
- Running Red Teaming on Moonshot Web UI
- Getting Started with Moonshot Interactive CLI
- Creating Your Custom Benchmark Tests for Your RAG Apps via Moonshot Interactive CLI
- Creating Your Custom Connector Endpoint via Moonshot Interactive CLI
- Running Benchmark Test on Moonshot Interactive CLI
- Running Red Teaming on Moonshot Interactive CLI
Moonshot is an open-source project, and we welcome contributions from the community! Whether fixing a bug, adding a new feature, improving documentation, or suggesting an enhancement, your efforts are highly valued.
Please refer to our Contributor Guide for details on how to get started.
Moonshot is currently in beta. We are actively developing new features, improving existing ones, and enhancing stability. We encourage you to try it out and provide feedback!
Moonshot is released under the Apache Software License 2.0.