- Prerequisites
- Downloading the Project
- Environment Variables
- Signal Bot Registration
- Twilio Configuration
- Domain and DNS Setup
- Cloudflare Tunnel (Alternative to Caddy)
- Starting the Application
- Phone Call Forwarding
- Contact Book (Optional)
- ElevenLabs TTS (Optional)
- Customising the Assistant
- Verifying the Setup
- Call Logs
- Signal Commands During a Call
- Running Costs
- Troubleshooting
- Security
Before starting the installation, make sure you have the following:
Infrastructure:
- A VPS or dedicated server with a public IP address
- Operating system: Linux (Ubuntu 22.04+, Debian 12+, or any OS with Docker support)
- Ports 80 and 443 open and reachable from the internet
- Docker Engine (version 20.10 or later)
- Docker Compose v2 (the docker compose command)
- A domain name with access to DNS settings (A/AAAA records)
Accounts and services:
- Twilio account (https://console.twilio.com) with a purchased phone number
- OpenAI API key (https://platform.openai.com)
- A separate phone number (SIM card) to register as the Signal bot
- Your personal Signal number to receive notifications
Optional:
- ElevenLabs account (https://elevenlabs.io) for higher-quality speech synthesis
Copy the project files to your server:
cd /opt
git clone <repository-url> ava
cd ava
If you are not using git, upload the files via SCP or SFTP to /opt/ava.
Create the required directories:
mkdir -p data/calls
Copy the template file and open it in your editor:
cp .env.example .env
nano .env
Below is a description of each variable:
| Variable | Description | Example |
|---|---|---|
| TWILIO_ACCOUNT_SID | Your Twilio account identifier. Found on the Twilio Console dashboard. | ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
| TWILIO_AUTH_TOKEN | Twilio auth token. Also used for webhook signature validation. | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
| TWILIO_PHONE_NUMBER | The Twilio phone number that will receive forwarded calls. | +48123456789 |
| Variable | Description | Example |
|---|---|---|
| SIGNAL_CLI_URL | Internal address of the signal-cli container. Do not change this value. | http://signal-cli:8080 |
| SIGNAL_SENDER_NUMBER | The Signal bot phone number (a separate SIM card, registered in step 4). | +48111222333 |
| SIGNAL_RECIPIENT | Your personal Signal number. AVA sends all notifications here. | +48999888777 |
| Variable | Description | Example |
|---|---|---|
| OPENAI_API_KEY | Your OpenAI API key. | sk-proj-... |
| OPENAI_MODEL | (deprecated, use LLM_MODEL) GPT model for conversations. | gpt-4o |
| LLM_PROVIDER | LLM backend: openai (default) or groq. | openai |
| LLM_MODEL | Model name. Default: gpt-4o-mini (OpenAI), llama-3.3-70b-versatile (Groq). | gpt-4o-mini |
| LLM_SUMMARY_MODEL | Model for call summaries. Defaults to LLM_MODEL. | gpt-4o-mini |
| GROQ_API_KEY | Groq API key. Required when LLM_PROVIDER=groq. | (blank or key) |
| Variable | Description | Example |
|---|---|---|
| ELEVENLABS_API_KEY | ElevenLabs API key. Leave blank to skip ElevenLabs. | (blank or key) |
| ELEVENLABS_VOICE_ID | Single multilingual voice ID (used for all languages). Browse at https://elevenlabs.io/voice-library | WAhoMTNdLdMoq1j3wf3I |
| ELEVENLABS_MODEL | ElevenLabs model. eleven_multilingual_v2 (best quality) or eleven_turbo_v2_5 (faster, lower latency). | eleven_turbo_v2_5 |
| OPENAI_TTS_VOICE | Fallback OpenAI TTS voice (used when ElevenLabs is unavailable). Options: alloy, echo, fable, onyx, nova, shimmer. | nova |
| Variable | Description |
|---|---|
| OWNER_CONTEXT | Assistant persona and owner information, injected into the LLM system prompt. Must be a single line (no newlines). This is your private configuration: it stays in .env and is never committed to git. See section 12 for details. |
| Variable | Description | Example |
|---|---|---|
| DEFAULT_STT_LANG | Default Twilio STT language before prefix detection. | en-US |
| SIGNAL_LANG | Language for Signal notifications and summaries (en or pl). | en |
| Variable | Description | Example |
|---|---|---|
| COMPOSE_PROFILES | Docker Compose profile controlling the ingress method. caddy = Caddy + Let's Encrypt, tunnel = Cloudflare Tunnel. | caddy |
| PUBLIC_URL | The public HTTPS address where the server is reachable from the internet. | https://ava.your-domain.com |
| DOMAIN | The domain name (without https://). Used by Caddy to obtain an SSL certificate. Required only with the caddy profile. | ava.your-domain.com |
| CLOUDFLARE_TUNNEL_TOKEN | Tunnel token from the Cloudflare Zero Trust dashboard. Required only with the tunnel profile. | eyJhIjo... |
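A quick pre-flight check can catch a missing variable before the containers start. The sketch below is illustrative and not part of AVA: the function name missing_vars is hypothetical, and the set of variables treated as mandatory is an assumption drawn from the tables above (variables with a documented default or marked optional are skipped).

```python
from typing import Dict, List

# Assumption: these have no documented default, so they are treated as mandatory.
ALWAYS_REQUIRED = [
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "SIGNAL_SENDER_NUMBER",
    "SIGNAL_RECIPIENT",
    "OPENAI_API_KEY",
    "PUBLIC_URL",
]


def missing_vars(env: Dict[str, str]) -> List[str]:
    """Return the names of required variables that are absent or blank."""
    required = list(ALWAYS_REQUIRED)
    # GROQ_API_KEY is documented as required only when LLM_PROVIDER=groq.
    if (env.get("LLM_PROVIDER") or "openai") == "groq":
        required.append("GROQ_API_KEY")
    # DOMAIN belongs to the caddy profile, the tunnel token to the tunnel profile.
    if (env.get("COMPOSE_PROFILES") or "caddy") == "tunnel":
        required.append("CLOUDFLARE_TUNNEL_TOKEN")
    else:
        required.append("DOMAIN")
    return [name for name in required if not (env.get(name) or "").strip()]
```

Calling missing_vars(dict(os.environ)) after loading .env into the environment would return an empty list when nothing is missing.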
AVA communicates with you through Signal. You need a separate SIM card whose number will be registered as the "bot".
Important: Signal requires a CAPTCHA verification during registration. You must complete it in a browser first, then pass the token to the API.
Start the signal-cli container:
docker compose up signal-cli -d
Wait about 15 seconds for the container to start.
Open this URL in your browser:
https://signalcaptchas.org/registration/generate.html
Complete the CAPTCHA challenge. After solving it, the page will redirect to a URL starting with signalcaptcha://. Copy the entire value after signalcaptcha:// — this is your captcha token.
Tip: In most browsers, the redirect will fail (page not found). That's expected. Just copy the full URL from the address bar and extract everything after signalcaptcha://.
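Extracting the token by hand is easy to get wrong when the URL is long. As a minimal sketch (the helper names are hypothetical; only the signalcaptcha:// prefix and the use_voice and captcha fields come from this guide), the token and the registration request body could be prepared like this:

```python
import json

CAPTCHA_PREFIX = "signalcaptcha://"


def extract_captcha_token(redirect_url: str) -> str:
    """Return everything after the signalcaptcha:// prefix."""
    url = redirect_url.strip()
    if not url.startswith(CAPTCHA_PREFIX):
        raise ValueError("URL does not start with signalcaptcha://")
    token = url[len(CAPTCHA_PREFIX):]
    if not token:
        raise ValueError("URL contains no token after the prefix")
    return token


def registration_body(redirect_url: str) -> str:
    """Build the JSON body for the registration request."""
    return json.dumps({"use_voice": False, "captcha": extract_captcha_token(redirect_url)})
```

The string returned by registration_body can be passed to curl with -d in place of the hand-edited JSON.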
curl -X POST "http://localhost:8080/v1/register/+48BOT_NUMBER" \
  -H "Content-Type: application/json" \
  -d '{"use_voice": false, "captcha": "PASTE_CAPTCHA_TOKEN_HERE"}'
Replace +48BOT_NUMBER with the phone number of your bot SIM card (E.164 format).
You will receive an SMS with a verification code on the bot SIM card. Enter it:
curl -X POST "http://localhost:8080/v1/register/+48BOT_NUMBER/verify/YOUR_CODE"
Verify that the registration was successful:
curl http://localhost:8080/v1/accounts
You should see your number listed among the registered accounts.
Enter this number as SIGNAL_SENDER_NUMBER in the .env file.
- Log in to the Twilio Console: https://console.twilio.com
- Navigate to: Phone Numbers > Manage > Buy a Number
- Select a number with the appropriate country prefix (e.g. +48 for Poland)
- Purchase the number
After starting the server (step 8), return to the Twilio Console:
- Go to: Phone Numbers > Manage > Active Numbers
- Click on your purchased number
- In the "Voice & Fax" section, set:
| Field | Value |
|---|---|
| A Call Comes In | Webhook, https://your-domain.com/twilio/incoming, POST |
| Call Status Changes | https://your-domain.com/twilio/status, POST |
Replace your-domain.com with the actual address of your server.
Twilio requires webhooks to be served over HTTPS. Caddy (included in the project) automatically obtains a Let's Encrypt certificate.
- In your domain registrar's DNS panel, add a record:
| Type | Name | Value |
|---|---|---|
| A | ava (or @) | Your server's IP address |
- Wait for DNS propagation (usually a few minutes to a few hours)
- Make sure that the DOMAIN and PUBLIC_URL variables in .env are correct:
DOMAIN=ava.your-domain.com
PUBLIC_URL=https://ava.your-domain.com
If you prefer not to open ports 80/443 on your server, you can use Cloudflare Tunnel to securely expose AVA to the internet through Cloudflare's network. The tunnel establishes an outbound connection from your server -- no public IP or open ports required.
- Log in to the Cloudflare Zero Trust dashboard: https://one.dash.cloudflare.com
- Navigate to: Networks > Tunnels > Create a tunnel
- Choose the Cloudflared connector type and give the tunnel a name (e.g. ava)
- Copy the tunnel token and paste it into .env:
CLOUDFLARE_TUNNEL_TOKEN=eyJhIjoiYWJjZGVmLi4uIn0=...
In the tunnel settings, add a route (Public Hostname):
| Subdomain | Domain | Service |
|---|---|---|
| ava | your-domain.com | http://ava:8000 |
Cloudflare will automatically provision an SSL certificate and proxy traffic to the AVA container.
In .env, set the profile to tunnel:
COMPOSE_PROFILES=tunnel
Then start normally. Only the cloudflared container will run (Caddy will not):
docker compose up -d
- The COMPOSE_PROFILES variable in .env controls which ingress service runs: caddy (default) or tunnel
- Update PUBLIC_URL in .env to match the hostname configured in the tunnel
- Check the tunnel status:
docker compose logs ava-cloudflared
Once steps 3 through 7 are complete, start the full stack:
docker compose up -d
Check the status of the containers:
docker compose ps
You should see the running containers: ava, ava-signal-cli, ava-caddy (or ava-cloudflared if using Cloudflare Tunnel).
Follow the logs in real time:
docker compose logs -f ava
Test that the server is responding:
curl https://your-domain.com/health
Expected response:
{"status": "ok"}
Forward calls from your personal phone to the Twilio number.
- Open the Phone app
- Menu (three dots) > Settings > Call forwarding
- Set forwarding for "When busy" or "When declined" to your Twilio number
Alternatively, dial from the keypad:
**67*TWILIO_NUMBER#
- Settings > Phone > Call Forwarding
- Enter the Twilio number
The exact path may vary depending on your carrier. If you run into issues, contact your carrier and ask them to enable conditional forwarding (CFB/CFNRy).
To allow AVA to recognise callers by name, copy the example file and edit it:
cp data/contacts.json.example data/contacts.json
nano data/contacts.json
The file uses a dictionary format. Each value is either a plain name string or an object with a name and an optional lang:
{
"+48123456789": "Jan Kowalski",
"+48987654321": "Anna Nowak",
"+41761234567": {"name": "Hans Müller", "lang": "de"},
"+44207123456": {"name": "John Smith", "lang": "en"}
}
The optional lang field forces the STT language for this contact, overriding automatic phone prefix detection. This is useful when someone calls from a foreign number but speaks a different language.
By providing lang for a contact, you skip the Whisper-based language detection step on the first turn. The call will be immediately answered by the AI in the specified language, resulting in lower latency.
Notes:
- Direct call access: contacts in the contact book can call the Twilio number directly (without forwarding) and AVA will answer. Unknown callers must go through call forwarding.
- Numbers should be in E.164 format (with country prefix, e.g. +48)
- Bare 9-digit numbers without a prefix are automatically treated as Polish (+48)
- The file is loaded once at container startup; changes require a restart: docker compose restart ava
- If a contact is not found locally, AVA attempts a Twilio CNAM Lookup (approximately $0.01 per query)
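The two value shapes and the 9-digit rule described above can be illustrated with a short sketch. This is not AVA's actual lookup code: the function names are hypothetical, and how numbers other than bare 9-digit ones are handled is an assumption.

```python
from typing import Dict, Optional, Tuple, Union

Contact = Union[str, Dict[str, str]]


def normalize_number(raw: str) -> str:
    """Normalize a number towards E.164; bare 9-digit numbers become +48."""
    stripped = raw.strip()
    digits = "".join(ch for ch in stripped if ch.isdigit())
    if stripped.startswith("+"):
        return "+" + digits
    if len(digits) == 9:
        return "+48" + digits
    # Assumption: anything else is returned unchanged.
    return stripped


def lookup_contact(
    contacts: Dict[str, Contact], number: str
) -> Tuple[Optional[str], Optional[str]]:
    """Return (name, lang) for a caller; both are None for unknown numbers."""
    entry = contacts.get(normalize_number(number))
    if entry is None:
        return None, None
    if isinstance(entry, str):
        return entry, None  # plain string: name only, no forced language
    return entry.get("name"), entry.get("lang")
```

With the example file above loaded via json.load, lookup_contact(contacts, "123456789") would yield the name "Jan Kowalski" and no forced language.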
By default, AVA uses OpenAI TTS (model tts-1, voice nova). For higher voice quality:
- Create an account at https://elevenlabs.io
- Go to: Profile > API Keys and create a key
- Browse the Voice Library (https://elevenlabs.io/voice-library) and copy the Voice ID
- Enter the key and voice ID in .env:
ELEVENLABS_API_KEY=your_key
ELEVENLABS_VOICE_ID=WAhoMTNdLdMoq1j3wf3I
ELEVENLABS_MODEL=eleven_multilingual_v2
A single multilingual voice is used for all languages. The eleven_multilingual_v2 model supports 29 languages.
TTS fallback chain: ElevenLabs (with circuit breaker) > OpenAI TTS (OPENAI_TTS_VOICE) > Twilio Polly.
After changing voice/model, clear the cache: docker exec ava sh -c 'rm -f /tmp/tts_cache/*.mp3'
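The fallback chain can be pictured as an ordered list of providers where only the first one is guarded by a circuit breaker. The following is a simplified sketch, not the code in tts.py: the class and function names are hypothetical, and the 10-minute cooldown and the 401/403/429 trigger are taken from the timeouts table later in this guide.

```python
import time
from typing import Callable, List, Optional, Tuple

TRIP_STATUSES = {401, 403, 429}


class ProviderRejected(Exception):
    """Raised by a provider callable when the API answers with an HTTP error."""

    def __init__(self, status: int):
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


class CircuitBreaker:
    def __init__(self, cooldown_s: float = 600.0, clock: Callable[[], float] = time.monotonic):
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._open_until: Optional[float] = None

    def is_open(self) -> bool:
        return self._open_until is not None and self._clock() < self._open_until

    def trip(self) -> None:
        self._open_until = self._clock() + self.cooldown_s


def synthesize(
    text: str,
    providers: List[Tuple[str, Callable[[str], bytes]]],
    breaker: CircuitBreaker,
) -> Optional[Tuple[str, bytes]]:
    """Try providers in order; the first one is skipped while the breaker is open."""
    for index, (name, provider) in enumerate(providers):
        guarded = index == 0
        if guarded and breaker.is_open():
            continue
        try:
            return name, provider(text)
        except ProviderRejected as exc:
            if guarded and exc.status in TRIP_STATUSES:
                breaker.trip()
        except Exception:
            pass  # any other failure: fall through to the next provider
    return None
```

While the breaker is open, requests go straight to the second provider, which avoids paying the first provider's timeout on every turn.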
AVA adjusts its behaviour based on the OWNER_CONTEXT variable in the .env file. This text is injected into the GPT-4o system prompt.
Example configuration:
OWNER_CONTEXT=The phone owner is John Smith. Birthday: 15 March 1990. Working hours: Monday-Friday, 9:00-17:00 CET. Expected calls: clients asking about project status, suppliers confirming deliveries, IT team reporting infrastructure issues. IT/infrastructure emergencies: always treat as HIGH priority and note as an urgent callback. Recruiters and sales calls: politely thank them and end the call. Callback policy: "The owner will call back as soon as possible during working hours."
For more advanced changes, edit the SYSTEM_PROMPT variable in app/conversation.py.
After completing the configuration, run the following tests:
- Check server availability:
curl https://your-domain.com/health
- Review logs for errors:
docker compose logs ava | tail -50
docker compose logs ava-signal-cli | tail -20
docker compose logs caddy | tail -20
- Send a test message to the Signal bot (from the SIGNAL_RECIPIENT number):
status
You should receive a reply: "No active call at the moment."
- Call your personal number from a different phone (or, if that phone's number is in contacts.json, call the Twilio number directly). AVA should answer, greet the caller, and carry on a conversation.
- After the call ends, verify that:
  - You received a summary on Signal
  - A JSON file appeared in the data/calls/ directory
After every call (including missed calls), AVA saves the data to a JSON file in the data/calls/ directory.
File naming: YYYYMMDD_HHMMSS_CALLSID.json
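The naming pattern is easy to build and parse with standard datetime formatting. A small sketch follows; the helper names are hypothetical, and using the call start time for the timestamp is an assumption.

```python
from datetime import datetime
from typing import Tuple


def call_log_filename(started: datetime, call_sid: str) -> str:
    """Build a YYYYMMDD_HHMMSS_CALLSID.json file name."""
    return f"{started:%Y%m%d_%H%M%S}_{call_sid}.json"


def parse_call_log_filename(name: str) -> Tuple[datetime, str]:
    """Split a log file name back into its timestamp and Call SID."""
    if not name.endswith(".json"):
        raise ValueError("not a call log file name")
    date_part, time_part, call_sid = name[: -len(".json")].split("_", 2)
    return datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S"), call_sid
```

Because the timestamp comes first, sorting the file names alphabetically also sorts the logs chronologically.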
Example contents:
{
"call_sid": "CA1a2b3c4d...",
"caller_number": "+48123456789",
"caller_name": "John Smith",
"start_time": "2026-02-23T14:32:15",
"end_time": "2026-02-23T14:35:02",
"language": "pl-PL",
"summary": "John Smith from Acme Corp called about invoice #456...",
"transcript": [
{"role": "user", "text": "Good afternoon, I'm calling about the invoice...", "time": "..."},
{"role": "assistant", "text": "Good afternoon, please tell me...", "time": "..."}
],
"call_meta": {
"urgency": "medium",
"topic": "invoice dispute",
"caller_name_detected": "John"
}
}
When AVA is handling a call, you can send instructions via Signal:
| Command | Effect |
|---|---|
| status or ? | Reports whether a call is currently active |
| end, stop, finish, hang up | AVA wraps up and ends the call |
| tell him/her <message> | AVA relays the message to the caller |
| ask him/her <question> | AVA asks the caller that question |
| Any other text | Forwarded to AVA as a general instruction |
Polish equivalents also work: koniec, zakoncz, powiedz <wiadomosc>, zapytaj <pytanie>.
AVA confirms every instruction with a reply on Signal.
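To make the command table concrete, here is one way such messages could be classified. It is a sketch only: the function name and the returned labels are hypothetical, and AVA's real parser in owner_channel.py may match differently.

```python
import re
from typing import Tuple

END_WORDS = {"end", "stop", "finish", "hang up", "koniec", "zakoncz"}
TELL = re.compile(r"^(?:tell (?:him|her)|powiedz)\s+(.+)$", re.IGNORECASE | re.DOTALL)
ASK = re.compile(r"^(?:ask (?:him|her)|zapytaj)\s+(.+)$", re.IGNORECASE | re.DOTALL)


def parse_owner_message(text: str) -> Tuple[str, str]:
    """Classify a Signal message as (kind, payload)."""
    stripped = text.strip()
    lowered = stripped.lower()
    if lowered in {"status", "?"}:
        return "status", ""
    if lowered in END_WORDS:
        return "end", ""
    match = TELL.match(stripped)
    if match:
        return "tell", match.group(1).strip()
    match = ASK.match(stripped)
    if match:
        return "ask", match.group(1).strip()
    # Any other text is forwarded as a general instruction.
    return "instruction", stripped
```

For example, parse_owner_message("tell her I'll call back at 5") yields the kind "tell" with the payload "I'll call back at 5".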
| Command | Description |
|---|---|
| /ping | Alive check + timestamp |
| /status | Uptime, active calls, public URL |
| /stats | Call count, memory, TTS cache size |
| /calls | Last 5 call records with topics |
| /debug | Latency breakdown (avg from last 10 calls). /debug -1 for last call detail. |
| /billings | Check API balances (ElevenLabs characters, Twilio balance, OpenAI costs) |
| /recording-on | Start recording calls via Twilio |
| /recording-off | Stop recording calls |
| /restart | Restart AVA (requires /restart confirm) |
| /help | Command list |
Estimated costs for a typical 2-minute call:
| Service | Rate | Cost per call |
|---|---|---|
| Twilio Voice | $0.013/min | approx. $0.03 |
| Twilio STT (enhanced) | $0.02/15 s | approx. $0.16 |
| OpenAI Whisper | $0.006/min | approx. $0.001 (first turn only) |
| OpenAI GPT-4o-mini | approx. $0.0006/1k tokens | approx. $0.001 |
| ElevenLabs | from $5/month (30k chars free) | -- |
| Twilio CNAM Lookup | $0.01/query | $0.01 (unknown numbers only) |
Total cost of a typical call: approximately $0.20-0.25.
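The per-call figures follow directly from the rates in the table. The sketch below reproduces the arithmetic; the rates are the ones listed above, while the defaults of roughly 10 seconds of Whisper audio and 1,500 LLM tokens per call are assumptions chosen to match the table's rounded values. The ElevenLabs subscription is not included.

```python
def estimate_call_cost(
    minutes: float = 2.0,
    whisper_seconds: float = 10.0,  # assumption: first-turn audio only
    llm_tokens: int = 1500,         # assumption: total tokens per call
    cnam_lookup: bool = True,       # only unknown numbers trigger a lookup
) -> dict:
    """Estimate the per-call cost in USD from the rates in the table."""
    costs = {
        "twilio_voice": 0.013 * minutes,
        "twilio_stt": 0.02 * (minutes * 60 / 15),  # billed per 15 s
        "whisper": 0.006 * (whisper_seconds / 60),
        "llm": 0.0006 * (llm_tokens / 1000),
        "cnam": 0.01 if cnam_lookup else 0.0,
    }
    costs["total"] = sum(costs.values())
    return costs
```

With the defaults, Twilio STT dominates at about $0.16 of a total just under $0.20, so call duration is the main cost driver.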
# Check if the server responds
curl -I https://your-domain.com/health
# Check the SSL certificate
docker compose logs caddy | grep -i "certificate"
# Check that ports 80/443 are open
ss -tlnp | grep -E ':(80|443)'
# Check TTS logs
docker compose logs ava | grep -i tts
# Make sure PUBLIC_URL is reachable from the internet
curl https://your-domain.com/audio/test.mp3
# Expected: 404 (file does not exist, but the endpoint works)
# Check signal-cli logs
docker compose logs ava-signal-cli
# Check registered accounts
curl http://localhost:8080/v1/accounts
# Check AVA logs for Signal errors
docker compose logs ava | grep -i signal
- Make sure the Twilio Console webhooks point to the correct address
- Verify that call forwarding is active on your phone
- Review the logs:
docker compose logs -f ava
# Restart all services
docker compose restart
# Rebuild after code changes
docker compose up -d --build
AVA includes the following security mechanisms:
| Mechanism | Description |
|---|---|
| Twilio signature validation | Every request to /twilio/* must carry a valid X-Twilio-Signature header. Forged requests are rejected with HTTP 403. |
| Direct call rejection | Only forwarded calls are answered. Direct calls to the Twilio number are rejected (busy), unless the caller is in contacts.json. |
| Rate limiting | A maximum of 30 requests per minute from a single IP address. Exceeding the limit results in HTTP 429. |
| Hidden application port | Port 8000 is not exposed to the internet. Traffic passes exclusively through Caddy (HTTPS on port 443). |
| Signal sender filtering | Signal messages are accepted only from the SIGNAL_RECIPIENT number. All others are logged and ignored. |
| Audio file protection | File names are validated with a regular expression (MD5 hash + .mp3 only). Path traversal attacks are blocked. |
| Security headers | Caddy adds: HSTS, X-Frame-Options DENY, X-Content-Type-Options nosniff, and strips the Server header. |
| Disabled API documentation | The /docs, /redoc, and /openapi.json endpoints are turned off. |
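For readers curious how the signature check in the first row works: Twilio documents the scheme for form-encoded POST webhooks as an HMAC-SHA1 over the full request URL followed by every POST parameter name and value, sorted by name, keyed with the auth token and Base64-encoded. The sketch below implements that scheme with the standard library; the function names are hypothetical and this is not necessarily how AVA performs the check (the official twilio helper library ships a RequestValidator for the same purpose).

```python
import base64
import hashlib
import hmac
from typing import Mapping


def compute_twilio_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Compute the expected X-Twilio-Signature for a form-encoded POST."""
    # Full URL, then each parameter name immediately followed by its value, sorted by name.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(
    auth_token: str, url: str, params: Mapping[str, str], header_signature: str
) -> bool:
    """Compare in constant time to avoid leaking signature bytes via timing."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected.encode("utf-8"), header_signature.encode("utf-8"))
```

The URL must be exactly the public one Twilio called, which is one reason PUBLIC_URL has to match the address configured in the Twilio Console.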
┌─────────────────────────────────────────────────────────────────┐
│ EXTERNAL SERVICES │
│ ┌──────────┐ ┌───────────┐ ┌──────────────┐ │
│ │ Twilio │ │ OpenAI │ │ ElevenLabs │ │
│ │ Voice/STT │ │ GPT-4o │ │ TTS (voice) │ │
│ └─────┬─────┘ │ TTS fbk │ └──────┬───────┘ │
│ │ └─────┬─────┘ │ │
└────────┼──────────────────┼───────────────────┼─────────────────┘
│ HTTPS │ HTTPS │ HTTPS
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ DOCKER HOST (your server) │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Caddy :443 (Let's Encrypt) OR Cloudflare Tunnel │ │
│ └──────────────────────┬─────────────────────────────────────┘ │
│ │ ava-net (Docker bridge) │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ AVA (FastAPI :8000) │ │
│ │ │ │
│ │ main.py ─── conversation.py ─── tts.py │ │
│ │ │ │ │ │ │
│ │ Twilio hooks GPT-4o/Groq ElevenLabs→OpenAI │ │
│ │ Rate limiter Streaming →Polly fallback │ │
│ │ Whisper async Meta parsing TTS cache (MD5) │ │
│ │ Audio serve Summarizer Circuit breaker │ │
│ │ Diagnostics │ │
│ │ │ │ │
│ │ owner_channel.py ─── contact_lookup.py ─── i18n.py │ │
│ │ │ │ │ │
│ │ Signal notify contacts.json 11+ langs │
│ │ Signal poll (3s) CNAM lookup Signal │
│ │ Slash commands Lang from prefix templates │
│ │ Owner instructions Per-contact lang │
│ └─────────┬────────────────────────────────────────────┘ │
│ │ HTTP │
│ ▼ │
│ ┌─────────────────┐ ┌──────────────────────┐ │
│ │ signal-cli :8080 │ │ Volumes: │ │
│ │ REST API │ │ tts_cache (MP3s) │ │
│ │ Signal servers │ │ /data/calls/ (JSON) │ │
│ └─────────────────┘ │ /data/contacts.json │ │
│ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
▲
│ Signal protocol
▼
┌────────────┐
│ Owner's │
│ Signal │
└────────────┘
| Parameter | Value | Description |
|---|---|---|
| speech_timeout | 1 s | Silence after speech before Twilio fires callback |
| LLM max_tokens | 180 | Max response length per turn |
| Hard turn limit | 10 exchanges | AVA wraps up the call |
| ElevenLabs timeout | 15 s | HTTP timeout for TTS API |
| ElevenLabs circuit breaker | 10 min | Auto-disable on 401/403/429 |
| Signal poll interval | 3 s | Check for new owner messages |
| Rate limiter | 30 req/min/IP | Sliding window |
| Call cleanup | 90 s delay | Cleanup after call ends |
| TTS cache | no expiry | Persists in Docker volume |
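The sliding-window rate limiter in the table can be sketched in a few lines. This is an illustration of the technique under the stated limits (30 requests per 60 seconds per IP), not AVA's implementation; the class name and the injectable clock are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    def __init__(self, limit: int = 30, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        """Return True if the request fits in the window, False if it should get HTTP 429."""
        now = self._clock()
        hits = self._hits[ip]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Unlike a fixed per-minute counter, the window moves with each request, so a burst straddling a minute boundary cannot exceed the limit.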
Internet traffic reaches AVA via Caddy (ports 80/443) or Cloudflare Tunnel (no open ports required). All other services run exclusively on the internal Docker network.