O.P.E.R.A.T.O.R is an advanced AI assistant that transforms natural language into automated browser and computer tasks. It combines multiple state-of-the-art language models with a powerful automation engine to understand and execute complex workflows across web and desktop applications.
- Diverse AI Models: Choose from leading AI providers including OpenAI, Google, Alibaba, and ByteDance
- Specialized Capabilities: Each model excels in different areas (vision, reasoning, UI interaction, etc.)
- Model Comparison: Easily compare outputs from different models for optimal results
- Natural Language Understanding: Convert plain English instructions into automated actions
- Visual Grounding: Advanced computer vision for precise UI element interaction
- Workflow Automation: Chain multiple tasks into complex, automated workflows
- YAML Integration: Define and execute tasks using structured YAML configuration
- Cross-Platform: Works seamlessly across Windows, macOS, and Linux
- Smart Shopping Assistant: Price comparisons, deal tracking, and purchase automation
- Job Application Manager: Automate job searches, applications, and follow-ups
- Meeting Assistant: Join, transcribe, and summarize meetings with action items
- Workflow Automation: Connect multiple applications and services in custom workflows
- Data Extraction: Scrape and organize web data intelligently
- Node.js 16+ and npm 8+
- Modern web browser (Chrome, Firefox, Edge, or Safari)
- API keys for your preferred AI providers
```bash
# Clone the repository
git clone https://github.com/dexters-lab-ai/operator.git
cd operator

# Install dependencies
npm install
```
O.P.E.R.A.T.O.R uses environment variables for configuration. You must set up the following files:

- `.env` – for development (not committed)
- `.env.production` – for production (not committed)
- `.env.example` – template/example (committed)

Required variables:

- `NODE_ENV` – set to `development` or `production`
- `PORT` – backend API port (default: `3420`)
- `FRONTEND_URL` – frontend URL (default: `http://localhost:3000`)
- `API_URL` – backend API URL (default: `http://localhost:3420`)
- `VITE_FRONTEND_URL`, `VITE_API_URL`, `VITE_WS_URL` – for the Vite frontend (use production URLs in `.env.production`)
- All secret/API keys as needed

Tip: Copy `.env.example` to `.env` and `.env.production`, then fill in your values.

```bash
cp .env.example .env
cp .env.example .env.production
```
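As an illustration, a Node.js server might read these variables with the documented defaults like this (a sketch, not the project's actual startup code):

```javascript
// Sketch: read the environment variables above, falling back to the
// documented defaults when they are unset.
const config = {
  nodeEnv: process.env.NODE_ENV || 'development',
  port: parseInt(process.env.PORT || '3420', 10),
  frontendUrl: process.env.FRONTEND_URL || 'http://localhost:3000',
  apiUrl: process.env.API_URL || 'http://localhost:3420',
};

console.log(`Starting in ${config.nodeEnv} mode on port ${config.port}`);
```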
```bash
# Start Vite frontend (port 3000)
npm run dev

# Start Node.js backend (port 3420)
npm run serve:dev
```
- Access the app at: http://localhost:3000
- Backend API at: http://localhost:3420
O.P.E.R.A.T.O.R provides powerful Android device automation capabilities through multiple connection methods. Choose the option that best fits your needs:
Best for: Local development and debugging
Speed: ⚡⚡⚡⚡⚡ (Fastest)
Stability: ★★★★★ (Most stable)
Setup Instructions:
- Enable USB Debugging on your Android device:
  - Go to Settings > About Phone
  - Tap Build Number 7 times to enable Developer Options
  - Go to System > Developer Options
  - Enable USB Debugging
- Connect your device via USB
- In O.P.E.R.A.T.O.R, select USB as the connection type
- Click Connect and authorize the connection on your device
Best for: Wireless testing and multi-device setups
Speed: ⚡⚡⚡ (Depends on network)
Convenience: ★★★★★ (No cables needed)
Setup Instructions:
- First, connect your device via USB and enable USB debugging
- Open a command prompt/terminal and run:
  ```bash
  adb tcpip 5555
  ```
- Note your device's IP address in Settings > About Phone > Status
- In O.P.E.R.A.T.O.R, select Network as the connection type
- Enter your device's IP address (port defaults to 5555)
- Click Connect
- You can now disconnect the USB cable
Best for: Teams and CI/CD pipelines
Flexibility: ★★★★★ (Access anywhere)
Setup: ⚙️⚙️⚙️ (Advanced configuration)
Setup Instructions:
- On the computer with your Android device connected:
  - Open a command prompt/terminal as administrator
  - Navigate to your ADB directory (e.g., `cd C:\platform-tools`)
  - Start the ADB server:
    ```bash
    adb -a -P 5037 nodaemon server
    ```
  - Note the computer's IP address
- In O.P.E.R.A.T.O.R:
  - Select Remote ADB as the connection type
  - Enter the ADB server's IP and port (default: 5037)
  - If needed, specify the path to the ADB executable
  - Click Save Settings, then Test Connection
- Verify USB debugging is enabled
- Try a different USB cable/port
- Run:
  ```bash
  adb kill-server && adb start-server
  ```
- Ensure proper USB drivers are installed
- Check WiFi/network stability
- Ensure device IP hasn't changed (for network connections)
- Verify port 5555 is open (network mode)
- Check for firewall/antivirus blocking the connection
When running in a Docker container, network ADB connections may not work due to network namespace isolation. For development and testing, run the app directly on your host machine.
- Android device with USB debugging enabled
- ADB (Android Debug Bridge) installed
- For USB: Proper USB drivers for your device
- For network: Device and computer on the same network
- For remote ADB: ADB server running on the host machine
- Only enable USB debugging for trusted computers
- Be cautious when connecting to public networks
- Use secure connections for remote ADB access
- Keep your ADB version updated to the latest release
```bash
npm run build
npm run serve
```

- App runs with production settings from `.env.production`
- Access the app at your configured production URLs

To test your production build locally:

```bash
npm run build
cross-env NODE_ENV=production node server.js
```

- This uses `.env.production` and runs the server on port 3420 by default
- Access the frontend at http://localhost:3000 if using Vite preview, or at your configured `FRONTEND_URL`
- Environment Variables Not Loading?
  - Ensure you use the correct `.env` file for the mode (`.env` for dev, `.env.production` for prod)
  - On Windows, always use `cross-env` in npm scripts to set `NODE_ENV`
  - Restart your terminal after changing env files
- Ports Not Matching?
  - Backend defaults to `3420`, frontend to `3000` (update your env files if you change these)
- Missing Dependencies?
  - Run `npm install` to ensure all dependencies (including `cross-env`) are installed
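The `cross-env` tip above can be wired into `package.json` scripts along these lines (a sketch; the actual script bodies in the repo may differ):

```json
{
  "scripts": {
    "serve:dev": "cross-env NODE_ENV=development node server.js",
    "serve": "cross-env NODE_ENV=production node server.js"
  }
}
```

With this in place, `npm run serve:dev` and `npm run serve` set `NODE_ENV` correctly on Windows as well as on macOS/Linux.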
O.P.E.R.A.T.O.R relies on WebSocket connections to deliver real-time features such as the Neural Canvas and planLogs. If a user is not connected, some features will not work or render.
- Check WebSocket Connection Status
  - In the browser console, look for `[WebSocket] Connection established for userId=...`.
  - If you see repeated reconnect attempts or errors, the connection is failing.
  - On the server, logs like `[WebSocket] Sending update to userId=...` and `connectionCount` show whether the backend sees the user as connected.
- Test with `test-websocket.mjs`
  - Use the provided `test-websocket.mjs` script to simulate a user connection and check whether the server accepts and responds.
- User ID Sync
  - The frontend (`CommandCenter.jsx`) syncs the userId with `/api/whoami` and stores it in local/session storage. If the userId is missing or not synced, the WebSocket will not initialize.
- Missing WebSocket Events
  - If the Neural Canvas or planLogs do not render, the browser may not be receiving `functionCallPartial` or related events from the server.
  - This is often caused by a lost or failed WebSocket connection. Check browser and server logs for errors.
- Queued Messages
  - The server queues messages for users who are temporarily disconnected and sends them when the user reconnects. If you see `[WebSocket] No active connections for userId=... Queuing message.`, the user is not currently connected.

- Refresh the browser and check for connection logs.
- Check your `.env` and `.env.production` for correct `VITE_WS_URL` and `API_URL` values.
- Make sure your firewall or reverse proxy is not blocking WebSocket traffic (port 3420 by default).
- If using a production deployment, ensure your frontend is connecting to the correct backend WebSocket endpoint.
- For persistent issues, check both client and server logs for `[WebSocket]` errors or warnings.
If the Neural Canvas or other real-time features do not update, it is almost always a user connection/WebSocket issue.
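The per-user queuing behavior described above can be sketched roughly as follows (an illustrative model only, not the actual server code; the class and method names are invented):

```javascript
// Illustrative sketch of per-user WebSocket delivery with offline queuing:
// messages sent while a user has no active connection are held and
// flushed when the user reconnects.
class UserMessageHub {
  constructor() {
    this.connections = new Map(); // userId -> Set of sockets
    this.queues = new Map();      // userId -> pending messages
  }

  connect(userId, socket) {
    if (!this.connections.has(userId)) this.connections.set(userId, new Set());
    this.connections.get(userId).add(socket);
    // Flush anything queued while the user was offline.
    for (const msg of this.queues.get(userId) || []) {
      socket.send(JSON.stringify(msg));
    }
    this.queues.delete(userId);
  }

  disconnect(userId, socket) {
    const sockets = this.connections.get(userId);
    if (!sockets) return;
    sockets.delete(socket);
    if (sockets.size === 0) this.connections.delete(userId);
  }

  // Returns true if delivered immediately, false if queued.
  send(userId, message) {
    const sockets = this.connections.get(userId);
    if (sockets && sockets.size > 0) {
      for (const socket of sockets) socket.send(JSON.stringify(message));
      return true;
    }
    if (!this.queues.has(userId)) this.queues.set(userId, []);
    this.queues.get(userId).push(message); // "No active connections ... Queuing message."
    return false;
  }
}
```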
1. Create a DigitalOcean Account
   - Sign up at DigitalOcean
   - Complete email verification and account setup
2. Create a New App
   - Go to the Apps section
   - Click "Create" > "Apps"
   - Choose GitHub as the source
   - Select your repository with this codebase
   - Select the main branch
3. Configure App Settings
   - Under "App Name", enter your preferred name
   - Select the region closest to your users
   - Click "Next" to continue
4. Environment Variables
   - Go to the "Settings" tab
   - Click "Edit" in the "App-Level Environment Variables" section
   - Copy and paste the contents of your `.env.production` file
   - Click "Save"
5. Configure Ports
   - In the "App" section of Settings, click "Edit"
   - Add the following HTTP ports: `3000` (Frontend) and `3420` (Backend API)
   - Set the health check to use the TCP protocol
6. Run Command
   - In the "App" section, find "Run Command"
   - Enter: `node server.js`
   - Click "Save"
7. Deploy
   - Go to the "Deployments" tab
   - Click "Deploy"
   - Wait for the deployment to complete
8. Access Your App
   - Once deployed, find your app's URL in the "Domains" section
   - The app will be available at `https://your-app-name.ondigitalocean.app`
O.P.E.R.A.T.O.R supports various AI models, each with unique strengths:
| Model | Provider | Strengths | Best For |
|---|---|---|---|
| GPT-4o | OpenAI | Advanced reasoning, code generation | General tasks, complex workflows |
| Qwen-2.5-VL 72B | Alibaba | Visual grounding, UI interaction | Precise element targeting |
| Gemini-2.5-Pro | Google | Visual understanding, multimodal | Research, data analysis |
| UI-TARS | ByteDance | End-to-end GUI automation | Complex UI workflows |
| Claude 3 Opus | Anthropic | Safety, instruction-following | Sensitive tasks |
| Grok-1 | xAI | Real-time data, conversational | Interactive tasks |
- Step Planning (Default)
  - Processes tasks step-by-step with validation
  - Provides detailed progress updates
  - Ideal for complex or critical tasks
- Action Planning (Autopilot)
  - Plans the complete sequence of actions upfront
  - More efficient for routine tasks
  - Reduces completion time
- YAML Planning (Recommended)
  - Uses structured YAML for precise control
  - Enables complex workflow definitions
  - Provides transparency and reproducibility
Define complex workflows using YAML:

```yaml
name: Research Assistant
version: 1.0
tasks:
  - name: Search Academic Papers
    action: web.search
    params:
      query: "machine learning applications in healthcare"
      source: "google_scholar"
      limit: 5
  - name: Extract Key Findings
    action: ai.analyze
    params:
      content: "{{task_1.results}}"
      instructions: "Summarize key findings and methodologies"
  - name: Generate Report
    action: docs.create
    params:
      title: "Research Summary - {{date}}"
      content: "{{task_2.summary}}"
      format: "markdown"
```
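The `{{task_1.results}}` placeholders above pass one task's output into the next. A minimal sketch of how such references could be resolved (the function name and lookup rules are assumptions for illustration, not O.P.E.R.A.T.O.R's actual implementation):

```javascript
// Resolve {{dotted.path}} placeholders against a context of prior
// task results. Unknown paths are left untouched.
function interpolate(template, context) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    const value = path
      .split('.')
      .reduce((obj, key) => (obj == null ? undefined : obj[key]), context);
    return value === undefined ? match : String(value);
  });
}
```

For example, `interpolate('Research Summary - {{task_2.summary}}', { task_2: { summary: 'Key findings' } })` yields `'Research Summary - Key findings'`.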
Create a `.env` file in the root directory:

```env
PORT=3420
NODE_ENV=development
API_KEYS={"openai": "your-openai-key", "google": "your-google-key", "qwen": "your-qwen-key"}
```
Configure your API keys in the Settings panel:
- Click the gear icon in the top-right corner
- Navigate to "API Keys" tab
- Enter your keys for each provider
- Click "Save"
For detailed documentation, please visit our Documentation Portal.
- Frontend: React with Vite (plus vanilla JS)
- Backend: Node.js with Express
- Real-time: WebSocket integration
- Database: MongoDB (optional)
We welcome contributions! Please read our Contributing Guidelines to get started.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Join our growing community:
- Dorahacks - Follow the Buidl
- GitHub Issues - Report issues
- Twitter - Latest updates
- Telegram - Professional network
- Jesus first
- All the amazing open-source projects that made this possible
- Our wonderful community of contributors and users
- The AI/ML community for continuous innovation