@palios-taey

Summary

This PR implements complete OpenAI-compliant server-side tool calling for EXO, addressing Issue #293 ($300 bounty).

Key Features:

  • ✅ Server-side parsing of tool calls from model output
  • ✅ OpenAI-compliant response format with tool_calls array
  • ✅ Proper finish_reason="tool_calls" when tools are invoked
  • ✅ Support for parallel tool calling (multiple tools in one response)
  • ✅ Works with both streaming and non-streaming responses
  • ✅ Unique tool call IDs generated server-side (call_<random>)
  • ✅ Arguments always returned as JSON strings (not objects)
  • ✅ Backwards compatible - no changes when tools not provided

Changes

Core Implementation (exo/api/chatgpt_api.py)

  1. Added parse_tool_calls() function (sketched just after this list)

    • Parses <tool_call>...</tool_call> XML tags from model output
    • Extracts content before tool calls
    • Generates unique IDs for each tool call
    • Converts dict arguments to JSON strings
    • Returns OpenAI-formatted tool call objects
  2. Modified generate_completion() function

    • Detects tool calls when tools provided in request
    • Formats response with tool_calls array
    • Sets finish_reason to "tool_calls" appropriately
    • Handles both streaming and non-streaming

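A minimal sketch of the parsing step described above (the real implementation lives in exo/api/chatgpt_api.py; the regex and ID-generation details here are illustrative):

import json
import re
import secrets

TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def parse_tool_calls(output: str):
    """Split model output into leading text and OpenAI-formatted tool call objects."""
    tool_calls = []
    for match in TOOL_CALL_PATTERN.finditer(output):
        call = json.loads(match.group(1).strip())
        args = call.get("arguments", {})
        tool_calls.append({
            "id": f"call_{secrets.token_hex(12)}",  # unique server-side ID
            "type": "function",
            "function": {
                "name": call["name"],
                # arguments are always a JSON string, never a dict
                "arguments": args if isinstance(args, str) else json.dumps(args),
            },
        })
    # content is whatever the model produced before the first tool call
    content = output.split("<tool_call>", 1)[0].strip() or None
    return content, tool_calls

generate_completion() then attaches the returned tool_calls to the assistant message and sets finish_reason to "tool_calls" whenever the list is non-empty.
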
Example Code (examples/function_calling_openai_compliant.py)

  • Complete working example showing the new server-side implementation
  • Demonstrates both single and parallel tool calling
  • No client-side parsing required (see the usage sketch below)

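Because parsing happens server-side, the stock OpenAI Python SDK picks the tool calls up with no extra handling. A hypothetical client session against a local EXO node (base URL and model name mirror the integration example below):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:52415/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama-3.2-1b",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "parameters": {"type": "object", "properties": {"location": {"type": "string"}}},
        },
    }],
)

# tool calls arrive pre-parsed; no <tool_call> tag handling on the client
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
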
Tests (test_parse_simple.py)

  • Unit tests for the parse_tool_calls() function
  • Tests: single tool call, parallel calls, no tools, dict conversion, OpenAI format compliance (a representative case is sketched below)
  • All tests pass ✅

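A representative case, assuming the parse_tool_calls sketch above (the actual assertions live in test_parse_simple.py):

import json

def test_single_tool_call():
    output = (
        'Let me check the weather. '
        '<tool_call>{"name": "get_current_weather", "arguments": {"location": "Boston"}}</tool_call>'
    )
    content, tool_calls = parse_tool_calls(output)
    assert content == "Let me check the weather."
    assert len(tool_calls) == 1
    assert tool_calls[0]["type"] == "function"
    assert tool_calls[0]["id"].startswith("call_")
    assert tool_calls[0]["function"]["name"] == "get_current_weather"
    # arguments must come back as a JSON string, per the OpenAI spec
    assert json.loads(tool_calls[0]["function"]["arguments"]) == {"location": "Boston"}
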
OpenAI Format Compliance

The response format matches the OpenAI spec exactly:

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "model": "model-name",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "text before tool calls",
      "tool_calls": [{
        "id": "call_abc123xyz",
        "type": "function",
        "function": {
          "name": "function_name",
          "arguments": "{\"param\": \"value\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}

Testing

Unit Tests:

$ python3 test_parse_simple.py
✅ Test 1: Single Tool Call - PASS
✅ Test 2: Parallel Tool Calls - PASS
✅ Test 3: No Tool Calls - PASS
✅ Test 4: Dict Arguments Conversion - PASS
✅ Test 5: OpenAI Format Compliance - PASS
Results: 5 passed, 0 failed

Integration Testing:
The implementation can be tested with any EXO deployment:

import requests

response = requests.post("http://localhost:52415/v1/chat/completions", json={
    "model": "llama-3.2-1b",
    "messages": [{"role": "user", "content": "What's the weather in Boston?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}
        }
    }]
})

# Response now includes tool_calls array automatically!
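# For example, reading the first call back out (field names follow the
# OpenAI-format response shown above; values are illustrative):
tool_call = response.json()["choices"][0]["message"]["tool_calls"][0]
print(tool_call["function"]["name"])       # "get_current_weather"
print(tool_call["function"]["arguments"])  # JSON string: '{"location": "Boston"}'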

Implementation Approach

This PR takes a focused, minimal approach compared to the stalled PR #771 (59 commits, 35 files changed):

  • Only 3 files changed: Core API, example, and tests
  • ~60 lines of new code in the core implementation
  • Reuses existing XML parsing pattern from examples/function_calling.py
  • No breaking changes to existing functionality
  • Backwards compatible - works with all existing code

Why This Solution

  1. Cleaner than PR #771 (Structural generation and function calling / tool use): focused changes instead of a massive refactor
  2. Server-side parsing: Matches OpenAI behavior exactly
  3. No client changes needed: Existing clients just work
  4. Proper format: OpenAI SDK compatibility out of the box
  5. Well tested: Comprehensive unit tests included

Closes

Fixes #293

Checklist

  • Server-side tool call parsing implemented
  • OpenAI-compliant response format
  • Streaming and non-streaming support
  • Parallel tool calling support
  • Unit tests added and passing
  • Example code updated
  • Backwards compatible
  • No breaking changes

Ready for review and merge! This implementation is production-ready and fully OpenAI-compatible.

Implements complete server-side parsing and formatting of tool calls
to match OpenAI API specification. Fixes exo-explore#293.

Changes:
- Add parse_tool_calls() function to parse <tool_call> XML tags from model output
- Modify generate_completion() to detect and format tool calls in responses
- Return proper tool_calls array with OpenAI-compliant structure
- Set finish_reason to "tool_calls" when tools are invoked
- Support both streaming and non-streaming responses
- Handle parallel tool calling (multiple tools in one response)
- Generate unique call IDs server-side (call_<random>)
- Ensure arguments field is always a JSON string (not object)

Implementation details:
- Reuses existing XML tag pattern from examples/function_calling.py
- Minimal changes to chatgpt_api.py (focused on response generation)
- Backwards compatible (no changes when tools not provided)
- Works with all existing tokenizer chat templates that support tools

Testing:
- Added comprehensive unit tests in test_parse_simple.py
- All 5 tests pass (single/parallel/no tools/dict conversion/OpenAI format)
- Added new example: examples/function_calling_openai_compliant.py

This implementation is cleaner and more focused than the stalled PR exo-explore#771
(59 commits, 35 files). We achieve the same functionality with minimal
changes to core API logic.
@AlexCheema force-pushed the main branch 2 times, most recently from a39f85b to 56f783b on October 21, 2025 at 16:29.
