Model Protocol Strategy

Purpose

This document records the protocol strategy for model integration in the design-to-code project.

Related implementation record:

docs/large-output-generation-plan.md: strategy for handling large model outputs with chunked generation, continuation recovery, and incremental assembly.

This document exists to answer four practical questions:

how providers should be configured
how the service layer should isolate protocol differences
why the project should not hard-code everything around OpenAI-compatible assumptions
how to keep the open-source version simple while still leaving room for future commercial hosting or multi-provider support

This document is intended to guide future refactoring of app/services/model.ts.

Current State

The current generation pipeline already has a useful task-oriented shape.

Current strengths:

the business tasks are already explicit: ui-schema, design-image, image-parse, html-code
provider-level model lists already exist
the main service API is task-oriented rather than vendor-oriented
OpenAI-compatible providers can already be wired with provider-specific base URLs and paths

Current limitations:

protocol concerns and provider concerns are mixed together in ModelProviderConfig
openAICompatible is too coarse to express real protocol differences
corsEndpoint currently acts as a generic fallback rather than a clean adapter type
the service assumes that many providers can be treated as OpenAI-like even when compatibility is only partial
stream support is effectively tied to OpenAI-compatible SSE behavior

The result is acceptable for a first phase, but it will become harder to maintain once the project needs to support more than one family of protocol behavior.

Decision Summary

The project should not continue on a long-term path where every provider is modeled as OpenAI-compatible.

The project should also not try to implement many protocols immediately.

Recommended strategy:

keep OpenAI-compatible support as the primary implementation path for now
refactor the service so OpenAI-compatible support becomes one adapter rather than the architecture itself
separate business tasks from protocol details
make room for future adapters such as Anthropic-style, Gemini-style, or fully custom REST integrations

In short:

short term: one stable adapter
medium term: adapter-based architecture
long term: multi-protocol support without leaking protocol-specific fields into the business layer

Why Not Use OpenAI-Compatible Everywhere

The phrase OpenAI-compatible is useful, but it is not a sufficient architecture boundary.

Different vendors that claim OpenAI compatibility often diverge in practice on one or more of these points:

unsupported fields such as response_format
differences in multi-modal message content format
differences in streaming event structure
differences in image generation endpoints and request bodies
differences in token limit parameters, sampling parameters, and naming conventions
differences in authentication, rate limiting, and error response shape

This means OpenAI-compatible should be treated as a protocol family, not as a guarantee of full interchangeability.
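One way to make the "protocol family" idea concrete is to let each OpenAI-compatible provider declare its known divergences, and have the request builder honor them. The sketch below is illustrative only: OpenAICompatQuirks, ChatRequestInput, and buildChatBody are hypothetical names, not part of the current codebase, and only two of the divergences listed above are modeled.

```typescript
// Hypothetical per-provider quirk flags; the names are illustrative only.
interface OpenAICompatQuirks {
  // False when the provider rejects or ignores response_format.
  supportsResponseFormat: boolean
  // Some providers expect a different token-limit field name.
  maxTokensField: 'max_tokens' | 'max_completion_tokens'
}

interface ChatRequestInput {
  model: string
  prompt: string
  maxOutputTokens: number
  wantJson: boolean
}

// Builds a chat request body, leaving out fields the provider is known to reject.
function buildChatBody(
  input: ChatRequestInput,
  quirks: OpenAICompatQuirks,
): Record<string, unknown> {
  const body: Record<string, unknown> = {
    model: input.model,
    messages: [{ role: 'user', content: input.prompt }],
  }
  body[quirks.maxTokensField] = input.maxOutputTokens
  if (input.wantJson && quirks.supportsResponseFormat) {
    body['response_format'] = { type: 'json_object' }
  }
  return body
}
```

With this shape, a provider that lacks response_format support still receives a valid request, and the caller falls back to prompt-level JSON instructions instead of a protocol field.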

Why Not Add Many Protocols Immediately

The project is still in an open-source, product-shaping stage.

The current priority is:

preserve a working pipeline
keep the integration understandable for contributors
avoid exploding service complexity too early

Implementing multiple protocols before the architecture is clarified would create unnecessary maintenance cost.

The correct move is to establish a clean adapter boundary first, then add protocols only when a real provider requires them.

Architectural Principle

The architecture should be organized around three layers.

1. Task Layer

This is the business-facing layer.

It should expose operations like:

generate UI Schema
generate design preview
parse image into schema
generate HTML plus Tailwind

This layer should not know whether the underlying provider uses OpenAI chat, Anthropic messages, Gemini content generation, or a custom REST gateway.

2. Protocol Adapter Layer

This layer is responsible for translating task requests into provider-specific HTTP or SDK calls.

Examples of adapter families:

OpenAI-compatible chat adapter
OpenAI-compatible image adapter
custom REST adapter
future Anthropic adapter
future Gemini adapter

Each adapter should own:

request body mapping
response parsing
stream handling
protocol-specific error interpretation

3. Provider Configuration Layer

This layer describes each concrete provider instance.

It should answer questions like:

which protocol family does this provider use
which tasks does it support
which endpoints does it expose
which auth method does it require
which models should be used for each task

Recommended Direction For Config Shape

The current ModelProviderConfig should evolve from a mixed structure into a clearer composition.

Recommended conceptual shape:

type TaskType = 'ui-schema' | 'design-image' | 'image-parse' | 'html-code'

type ProtocolKind =
  | 'openai-compatible'
  | 'custom-rest'
  | 'anthropic'
  | 'gemini'

type AuthScheme = 'bearer' | 'api-key' | 'custom'

interface ProviderProtocolConfig {
  kind: ProtocolKind
  baseUrl?: string
  authScheme?: AuthScheme
  endpoints?: {
    chat?: string
    image?: string
    vision?: string
    customTask?: string
  }
}

interface ProviderCapabilities {
  streamText?: boolean
  imageInput?: boolean
  imageGeneration?: boolean
  structuredJson?: boolean
}

interface ProviderTaskModels {
  uiSchema: string[]
  designImage: string[]
  imageParse: string[]
  htmlCode: string[]
}

interface ProviderConfig {
  providerId: string
  apiKey: string
  protocol: ProviderProtocolConfig
  capabilities: ProviderCapabilities
  taskModels: ProviderTaskModels
}

This is not a requirement to refactor immediately line for line, but it is the target shape to keep in mind.
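As an illustration of the target shape, the sketch below fills in one provider entry and shows how a task name could be mapped to its model list. The provider id, base URL, endpoint paths, and model names are placeholders, and TASK_MODEL_KEY and modelsForTask are hypothetical helpers introduced only for this example. The types are condensed copies of the conceptual shape above so the block stands on its own.

```typescript
type TaskType = 'ui-schema' | 'design-image' | 'image-parse' | 'html-code'
type ProtocolKind = 'openai-compatible' | 'custom-rest' | 'anthropic' | 'gemini'
type AuthScheme = 'bearer' | 'api-key' | 'custom'

interface ProviderConfig {
  providerId: string
  apiKey: string
  protocol: {
    kind: ProtocolKind
    baseUrl?: string
    authScheme?: AuthScheme
    endpoints?: { chat?: string; image?: string; vision?: string; customTask?: string }
  }
  capabilities: {
    streamText?: boolean
    imageInput?: boolean
    imageGeneration?: boolean
    structuredJson?: boolean
  }
  taskModels: {
    uiSchema: string[]
    designImage: string[]
    imageParse: string[]
    htmlCode: string[]
  }
}

// Maps the kebab-case task names to the camelCase keys used in taskModels.
const TASK_MODEL_KEY: Record<TaskType, keyof ProviderConfig['taskModels']> = {
  'ui-schema': 'uiSchema',
  'design-image': 'designImage',
  'image-parse': 'imageParse',
  'html-code': 'htmlCode',
}

// Placeholder values throughout; no real provider is implied.
const exampleProvider: ProviderConfig = {
  providerId: 'example-provider',
  apiKey: '<set at runtime>',
  protocol: {
    kind: 'openai-compatible',
    baseUrl: 'https://api.example.com/v1',
    authScheme: 'bearer',
    endpoints: { chat: '/chat/completions', image: '/images/generations' },
  },
  capabilities: { streamText: true, imageInput: true, imageGeneration: false, structuredJson: false },
  taskModels: {
    uiSchema: ['example-text-model'],
    designImage: [],
    imageParse: ['example-vision-model'],
    htmlCode: ['example-text-model'],
  },
}

// Returns the candidate models a provider offers for a task (possibly empty).
function modelsForTask(config: ProviderConfig, task: TaskType): string[] {
  return config.taskModels[TASK_MODEL_KEY[task]]
}
```

An empty list, as for designImage here, is one possible way to say "this provider does not serve this task" without adding another flag.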

Recommended Adapter Boundary

The service layer should eventually dispatch by protocol adapter rather than by boolean flags.

Conceptual direction:

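interface ProtocolAdapter {
  // Protocol family this adapter implements; used for dispatch.
  kind: ProtocolKind
  // Single request/response call.
  invoke(input: AdapterInvokeInput): Promise<ModelRawResponse>
  // Optional: adapters without stream support omit this method.
  invokeStream?(input: AdapterStreamInput): Promise<ModelRawResponse>
}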

The important point is not the exact TypeScript signature.

The important point is:

the service chooses a task
the task resolves to a provider and model
the provider points to an adapter family
the adapter handles protocol-specific details

This keeps protocol logic out of the business flow.
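The dispatch described above could be sketched as a small registry keyed by protocol kind. This is a minimal sketch under stated assumptions: AdapterRegistry and stubAdapter are hypothetical, and AdapterInvokeInput and ModelRawResponse are reduced to placeholder fields because their real shapes are not defined in this document.

```typescript
type ProtocolKind = 'openai-compatible' | 'custom-rest' | 'anthropic' | 'gemini'

// Placeholder shapes; the real input and response types are assumptions.
interface AdapterInvokeInput {
  model: string
  prompt: string
}

interface ModelRawResponse {
  text: string
}

interface ProtocolAdapter {
  kind: ProtocolKind
  invoke(input: AdapterInvokeInput): Promise<ModelRawResponse>
}

// Looks up adapters by protocol kind instead of branching on boolean flags.
class AdapterRegistry {
  private adapters = new Map<ProtocolKind, ProtocolAdapter>()

  register(adapter: ProtocolAdapter): void {
    this.adapters.set(adapter.kind, adapter)
  }

  resolve(kind: ProtocolKind): ProtocolAdapter {
    const adapter = this.adapters.get(kind)
    if (!adapter) {
      throw new Error(`No adapter registered for protocol "${kind}"`)
    }
    return adapter
  }
}

// Stub adapter standing in for the extracted OpenAI-compatible logic.
const stubAdapter: ProtocolAdapter = {
  kind: 'openai-compatible',
  invoke: async (input) => ({ text: `stub response from ${input.model}` }),
}

const registry = new AdapterRegistry()
registry.register(stubAdapter)
```

A missing adapter fails loudly at resolve time, which surfaces a misconfigured provider before any request is sent.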

Recommended Task Flow

The task flow should remain task-oriented.

Example sequence for UI Schema generation:

task layer receives ui-schema
service resolves candidate providers and models
service selects the adapter from provider protocol config
adapter builds the request for that protocol family
response is normalized back into project-level output

This means retry logic and provider fallback remain valid even as protocols diversify.
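The fallback part of this flow can stay protocol-agnostic if it only sees candidates and an invoke function. The following is a rough sketch, not the project's existing retry code: Candidate, AttemptRecord, and runWithFallback are hypothetical names, and the invoke callback stands in for "select the adapter, build the request, normalize the response".

```typescript
interface Candidate {
  providerId: string
  model: string
}

// One record per provider attempt, successful or not.
interface AttemptRecord {
  providerId: string
  model: string
  ok: boolean
  error?: string
}

interface FallbackResult<T> {
  output: T
  attempts: AttemptRecord[]
}

// Tries candidates in order and returns the first successful output,
// together with a record of every attempt made along the way.
async function runWithFallback<T>(
  candidates: Candidate[],
  invoke: (candidate: Candidate) => Promise<T>,
): Promise<FallbackResult<T>> {
  const attempts: AttemptRecord[] = []
  for (const candidate of candidates) {
    try {
      const output = await invoke(candidate)
      attempts.push({ ...candidate, ok: true })
      return { output, attempts }
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err)
      attempts.push({ ...candidate, ok: false, error: message })
    }
  }
  throw new Error(`All ${attempts.length} provider attempts failed`)
}
```

Because the loop never inspects protocol fields, adding a second adapter family would not require changes here.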

Streaming Strategy

Streaming is currently the area most tightly coupled to OpenAI-compatible behavior.

This is acceptable for now, but the code should explicitly treat streaming support as capability-dependent.

Required rule:

streaming must not be assumed for every provider

Recommended direction:

OpenAI-compatible adapter may implement SSE-based streaming first
future adapters can either implement their own streaming behavior or declare no stream support
the task layer should ask whether the provider supports stream text before attempting stream mode

This will prevent accidental coupling between task behavior and a single protocol family's stream format.
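A capability check of this kind might look like the sketch below. It assumes the streamText capability flag from the conceptual config shape and an optional invokeStream method on the adapter; StreamAwareAdapter and resolveInvocationMode are hypothetical names, and the method signatures are simplified for the example.

```typescript
interface ProviderCapabilities {
  streamText?: boolean
}

// Simplified adapter surface; the real signatures are assumptions.
interface StreamAwareAdapter {
  invoke(prompt: string): Promise<string>
  invokeStream?(prompt: string, onChunk: (chunk: string) => void): Promise<string>
}

type InvocationMode = 'stream' | 'single'

// Streams only when the caller wants it, the provider declares support,
// and the adapter actually implements a stream method.
function resolveInvocationMode(
  wantsStream: boolean,
  capabilities: ProviderCapabilities,
  adapter: StreamAwareAdapter,
): InvocationMode {
  const canStream =
    capabilities.streamText === true && typeof adapter.invokeStream === 'function'
  return wantsStream && canStream ? 'stream' : 'single'
}
```

Requiring both the declared capability and the implemented method keeps a stale config flag from sending a task into a stream path the adapter cannot serve.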

Role Of Custom REST

The current corsEndpoint and corsImageEndpoint concept should be reinterpreted as a protocol adapter family rather than a fallback hack.

Recommended interpretation:

custom-rest is a first-class adapter type
it is valid for hosted proxies, vendor wrappers, internal bridges, or future commercial routing services

This is important because the project may later want a hosted mode without forcing all providers through an OpenAI-shaped abstraction.
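To show what "first-class" could mean in practice, the sketch below prepares a request for a custom REST bridge as a pure function, so it can be inspected without a network call. Everything here is an assumption made for illustration: the CustomRestConfig fields, the x-api-key header name, and the { task, ...payload } body layout are not taken from the current corsEndpoint implementation.

```typescript
type AuthScheme = 'bearer' | 'api-key' | 'custom'

// Hypothetical config for a custom REST bridge.
interface CustomRestConfig {
  baseUrl: string
  endpoint: string
  authScheme: AuthScheme
  apiKey: string
}

interface PreparedRequest {
  url: string
  method: 'POST'
  headers: Record<string, string>
  body: string
}

// Builds the HTTP request for a task without sending it.
function buildCustomRestRequest(
  config: CustomRestConfig,
  task: string,
  payload: Record<string, unknown>,
): PreparedRequest {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' }
  if (config.authScheme === 'bearer') {
    headers['Authorization'] = `Bearer ${config.apiKey}`
  } else if (config.authScheme === 'api-key') {
    // Assumed header name; a real bridge may expect something else.
    headers['x-api-key'] = config.apiKey
  }
  // For 'custom', auth is left to the bridge-specific code.

  // Join base URL and endpoint with exactly one slash between them.
  const url = `${config.baseUrl.replace(/\/+$/, '')}/${config.endpoint.replace(/^\/+/, '')}`
  return { url, method: 'POST', headers, body: JSON.stringify({ task, ...payload }) }
}
```

The returned object can be passed to whatever HTTP client the service already uses, which keeps transport concerns separate from request mapping.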

Open-Source Route Guidance

For the open-source version, the integration strategy should remain simple.

Recommended rule set:

keep OpenAI-compatible integration as the default path
document that compatibility is partial and provider-specific
expose provider config in a way contributors can understand
do not attempt to support every vendor protocol immediately

This keeps onboarding practical while avoiding architectural dead ends.

Commercialization Readiness

This project may later support a hosted or managed model mode.

The protocol strategy should leave room for that without complicating the current open-source flow.

Commercialization-friendly implication:

hosted routing should be represented as another adapter family, not as a special case embedded into task logic
rate limiting, queueing, caching, and usage tracking belong outside the task layer
provider-specific secrets and commercial orchestration should not shape the public task API

This keeps the open-source local-key mode and future hosted mode aligned around the same internal task model.

Practical Refactoring Guidance

The service should be refactored in small steps.

Recommended order:

Phase 1. Clarify Config Semantics

Goals:

reduce ambiguity inside ModelProviderConfig
make protocol family explicit
make capability support explicit where needed

Possible outcome:

keep the existing type name temporarily
add a protocol field instead of relying only on openAICompatible
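A transitional helper could let the new protocol field coexist with the legacy flags during Phase 1. The sketch below assumes that the existing config carries openAICompatible and corsEndpoint roughly as described in this document; LegacyModelProviderConfig and resolveProtocolKind are hypothetical names, and the real ModelProviderConfig likely has more fields.

```typescript
type ProtocolKind = 'openai-compatible' | 'custom-rest' | 'anthropic' | 'gemini'

// Reduced view of the existing config; only the fields relevant here.
interface LegacyModelProviderConfig {
  openAICompatible?: boolean
  corsEndpoint?: string
  // New explicit field; takes precedence when present.
  protocol?: ProtocolKind
}

// Resolves the protocol family, preferring the explicit field and
// falling back to the legacy flags for configs not yet migrated.
function resolveProtocolKind(config: LegacyModelProviderConfig): ProtocolKind {
  if (config.protocol) {
    return config.protocol
  }
  if (config.openAICompatible) {
    return 'openai-compatible'
  }
  if (config.corsEndpoint) {
    return 'custom-rest'
  }
  throw new Error('Provider config does not declare a protocol')
}
```

Once every provider entry sets protocol explicitly, the two legacy branches can be deleted without touching callers.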

Phase 2. Extract OpenAI-Compatible Adapter

Goals:

move invokeOpenAICompatible into a dedicated adapter module
move invokeOpenAICompatibleStream alongside it
keep current behavior unchanged

Acceptance criteria:

no business logic changes
service behavior remains equivalent
build and existing generation flow still work

Phase 3. Rename Custom CORS Pathing As Adapter Behavior

Goals:

replace the idea of generic fallback with explicit custom-rest behavior
make intent clearer for future contributors

Acceptance criteria:

custom REST is visible as a protocol choice rather than an escape hatch

Phase 4. Prepare For Second Protocol Family

Goals:

define the contract needed for a second adapter
avoid implementing it until a real provider requires it

Acceptance criteria:

the architecture can accept another adapter without rewriting task logic

What Should Stay Stable

These project-level concepts should remain stable even if protocol support evolves:

task names
output result types
retry and fallback flow
provider attempt records
business prompts
UI status stages in Studio

The UI should not need to care whether a response came from OpenAI-compatible chat, a custom REST bridge, or a future non-OpenAI adapter.

What Should Change Later

These elements should evolve when the service layer is refactored:

openAICompatible boolean should no longer be the main protocol discriminator
provider config should separate protocol, capability, and model mapping
stream support should become capability-aware
custom hosted integrations should become explicit adapter types

Recommendation Summary

The project should adopt this position:

do not lock the architecture to OpenAI-compatible assumptions
do not rush into many protocol implementations
keep OpenAI-compatible as the primary adapter for now
build around a task layer plus protocol adapter layer plus provider config layer

This is the lowest-risk path for an open-source product that still wants to remain extensible.

Next Implementation Step

When the service layer is next refactored, the first concrete steps should be:

extract the current OpenAI-compatible request and stream logic behind an adapter boundary
make provider protocol type explicit in configuration
preserve current task results and UI behavior

Only after that should the project consider adding another protocol family.
