
Realtime SDK + Azure: unsupported fields; initial config ignored #466

@AlexRRR

Description

Summary

When using the Agents Realtime SDK with Azure OpenAI Realtime (WebRTC), the SDK emits OpenAI-specific session fields that Azure rejects (e.g., session.type, output_modalities). In addition, with WebRTC the session is created during SDP negotiation, so the initial agent config (instructions/tools) is ignored until the data channel opens; the subsequent SDK-driven session.update calls then include fields Azure doesn't accept, preventing the config from applying cleanly.

Notes

I recognize these incompatibilities stem primarily from Azure OpenAI’s Realtime implementation differing from OpenAI’s GA interface (for example, rejecting session.type and output_modalities, and disallowing ["audio"]-only modalities). Still, I wanted to open this issue so the community has a clear record of the symptoms, error messages, and a practical workaround. Even if the “fix” ultimately belongs on the Azure side (though I’m not sure where to report it there), a provider-aware path or extensibility hook in the SDK would help developers avoid common pitfalls and ship faster across providers.

Environment

  • @openai/agents-realtime: 0.1.0
  • @openai/agents: 0.1.0
  • openai: 5.19.1
  • Next.js: 15.5.2
  • React: 19.1.1
  • Node.js: 20.x
  • Client: Chrome (WebRTC, ephemeral client key)
  • Provider: Azure OpenAI Realtime (region WebRTC endpoint)
  • Model: gpt-realtime

Repro steps

  1. Browser, WebRTC:
    import { RealtimeAgent, RealtimeSession } from "@openai/agents-realtime";
    
    const agent = new RealtimeAgent({ name: "Assistant", instructions: "..." });
    const session = new RealtimeSession(agent, {
      model: "gpt-realtime",
      // baseUrl points to Azure WebRTC endpoint
      // e.g. https://<region>.realtimeapi-preview.ai.azure.com/v1/realtimertc
    });
    
    await session.connect({
      apiKey: "<ephemeral>",
      url: "https://<region>.realtimeapi-preview.ai.azure.com/v1/realtimertc",
    });
  2. Observe transport events (session.created, then session.update).

Observed behavior

  • session.created shows Azure defaults (expected for WebRTC).
  • SDK sends session.update with OpenAI-only fields:
    • Azure error: Unknown parameter: session.type.
    • Azure error: Unknown parameter: session.output_modalities.
  • Attempting to set modalities: ["audio"]:
    • Azure error: Invalid modalities. Supported: ["text"] or ["audio","text"].
  • Agent instructions/tools often don’t take effect because the follow-up session.update is rejected due to unsupported fields.

Example error events (from logs):

  • Unknown parameter: 'session.type'
  • Unknown parameter: 'session.output_modalities'
  • Invalid modalities: ['audio']
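The errors above suggest a simple allowlist filter over outgoing session payloads. A standalone sketch (hypothetical helper, not part of the SDK; the key set is an assumption inferred from the observed errors, not a documented Azure contract):

```typescript
// Hypothetical helper: keep only the session keys Azure's Realtime preview
// has been observed to accept; OpenAI-only fields such as `type` and
// `output_modalities` are dropped rather than forwarded.
const AZURE_SESSION_KEYS = new Set([
  "model", "instructions", "tools", "tool_choice", "voice",
]);

function toAzureSession(
  session: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(session)) {
    if (AZURE_SESSION_KEYS.has(key) && value !== undefined) out[key] = value;
  }
  return out;
}
```

The same filtering idea is what the custom transport workaround below applies inside updateSessionConfig.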

Expected behavior

  • SDK should allow provider-aware session.update payloads and avoid sending fields the provider does not support.
  • Post-connect updates should apply minimal, Azure-accepted changes (e.g., instructions, tools, tool_choice, voice, model) without error.

Workarounds that work

  • Custom WebRTC transport that strips OpenAI-only fields and allowlists Azure-accepted keys for session.update. Also avoid sending session.type in tracing updates.
    import { OpenAIRealtimeWebRTC } from "@openai/agents-realtime";
    
    class AzureRealtimeWebRTC extends OpenAIRealtimeWebRTC {
      // Send tracing updates without session.type, which Azure rejects.
      protected _updateTracingConfig(tracing: unknown) {
        this.sendEvent({
          type: "session.update",
          session: { tracing: tracing === "auto" ? "auto" : tracing },
        });
      }
    
      // Allowlist only the session keys Azure has been observed to accept.
      updateSessionConfig(config: Partial<Record<string, unknown>>) {
        const merged = this._getMergedSessionConfig(config) as Record<string, unknown>;
        const session: Record<string, unknown> = {};
        if (typeof merged.model === "string") session.model = merged.model;
        if (typeof merged.instructions === "string") session.instructions = merged.instructions;
        if (Array.isArray(merged.tools)) session.tools = merged.tools;
        if (typeof merged.tool_choice !== "undefined") session.tool_choice = merged.tool_choice;
        if (typeof merged.voice === "string") session.voice = merged.voice;
        this.sendEvent({ type: "session.update", session }); // no type/output_modalities/audio/tracing
      }
    
      async connect(options: any) {
        // Ensure Azure SDP URL includes ?model=<deployment>
        const baseUrl = options?.url;
        const model = options?.model;
        let urlToUse = baseUrl;
        if (baseUrl && model) {
          try {
            const u = new URL(baseUrl);
            if (!u.searchParams.has("model")) u.searchParams.set("model", model);
            urlToUse = u.toString();
          } catch { /* keep baseUrl */ }
        }
        return await super.connect({ ...options, url: urlToUse });
      }
    }
  • Use minimal initial agent; apply instructions/tools right after connect via a minimal session.update.
  • Keep tracing: null to avoid tracing-shaped updates that include unsupported fields.
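The ?model handling in the transport's connect override can be exercised in isolation; the same logic extracted as a pure function (illustrative sketch, not SDK code):

```typescript
// Append ?model=<deployment> to the Azure SDP URL when not already present;
// fall back to the original string if it is not a parseable URL.
function withModelParam(baseUrl: string, model?: string): string {
  if (!baseUrl || !model) return baseUrl;
  try {
    const u = new URL(baseUrl);
    if (!u.searchParams.has("model")) u.searchParams.set("model", model);
    return u.toString();
  } catch {
    return baseUrl;
  }
}
```

An existing model query parameter is left untouched, so a caller-supplied URL always wins over the SDK-level model setting.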

Usage of workaround:

// usage in a browser (WebRTC)
import { RealtimeAgent, RealtimeSession } from "@openai/agents-realtime";
import { AzureRealtimeWebRTC } from "./azure-realtime-transport";

// Keep initial agent minimal; Azure creates session during SDP, then we update
const agent = new RealtimeAgent({
  name: "Assistant",
  instructions: "",
});

// Azure region WebRTC endpoint (no model param needed here; transport adds it)
const baseUrl = "https://<region>.realtimeapi-preview.ai.azure.com/v1/realtimertc";
const model = "gpt-realtime"; // your Azure deployment name

const transport = new AzureRealtimeWebRTC({
  baseUrl,
  useInsecureApiKey: true, 
});

const session = new RealtimeSession(agent, {
  model,
  transport,
  config: { tracing: null }, // avoid tracing-shaped updates Azure may reject
});

// Obtain ephemeral client key from your backend first
await session.connect({
  apiKey: "<ephemeral_client_key>",
  url: baseUrl,
  model,
  initialSessionConfig: {
    instructions: "You are a helpful assistant. Always use tools when available.",
    tools: [your_tools],
  },
});

Why this matters

  • The SDK defaults are tuned for OpenAI GA. Azure’s Realtime (preview) differs and rejects certain fields. Without a provider-aware path, common SDK flows (initialSessionConfig, updateAgent) fail on Azure.

Proposed solutions

  • Add a provider adapter/flag (provider: "openai" | "azure") that adjusts session.update payloads.
  • Or expose a hook to transform outgoing session.update before sendEvent.
  • Make tracing updates optional and avoid session.type when disabled; avoid output_modalities for non-OpenAI providers.
  • Document provider differences (Azure WebRTC: session created at SDP; valid modalities; minimal post-connect updates).
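The hook in the second bullet could look roughly like this (hypothetical API; transformSessionUpdate, buildSessionUpdate, and the option shape are illustrative names, not existing SDK surface):

```typescript
// Hypothetical extensibility hook: let callers rewrite outgoing
// session.update payloads before the transport sends them.
type SessionUpdateTransform = (
  session: Record<string, unknown>,
) => Record<string, unknown>;

interface TransportOptions {
  transformSessionUpdate?: SessionUpdateTransform;
}

function buildSessionUpdate(
  session: Record<string, unknown>,
  options: TransportOptions = {},
): { type: "session.update"; session: Record<string, unknown> } {
  const transform = options.transformSessionUpdate ?? ((s) => s);
  return { type: "session.update", session: transform(session) };
}

// An Azure-flavored transform would drop the fields reported above.
const azureTransform: SessionUpdateTransform = (s) => {
  const rest = { ...s };
  delete rest.type;
  delete rest.output_modalities;
  return rest;
};
```

With such a hook, the custom-transport workaround above would shrink to a one-line option instead of a subclass.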
