RFC: Deprecating AutoDeploy Backend #15638

@arysef

Rationale and Improving Model Support Moving Forward

Over roughly the last year, the TensorRT LLM team has been testing AutoDeploy as a compiler-based approach to supporting models closer to their release date. Based on external usage and internal engineering alignment, the team plans to deprecate the AutoDeploy backend in TensorRT LLM. Once deprecated, the backend will receive no further feature development or new model support. After a 3-month period, the backend will be removed from TensorRT LLM.

We are aware that earlier model support is a critical priority for many users of TensorRT LLM, and we are working on agentic approaches to reduce the time to functional model support in the PyTorch backend. As an early indicator, this approach was used to release functional support for Minimax M3 on Day 0.

We plan to keep improving the modeling agent so that it can implement more models and more features, and to release it to users once we are more confident in its reliability.

Retrospective

Successes

  • Over 100 LLMs and VLMs supported out of the box
  • Performance comparable to manually tuned baselines for some models
  • Successful production deployments with a select set of customers

Why AutoDeploy didn't win as a product feature

  • AutoDeploy’s value proposition (less effort and time to support a new model) was weakened by coding agents
    • Agentic coding has reduced the cost of manual implementation
  • Automation stopped short of the last mile
    • AutoDeploy automated graph-level work, but peak performance still relies on human-written and agent-written kernel optimizations

Feedback

We welcome feedback on this change and are interested to hear about any use cases that the PyTorch backend does not currently support. Our plan is for the PyTorch backend to reach parity on every feature that users currently rely on the AutoDeploy backend for.
