
[BUG] plugin_daemon fails to parse stream response from OpenAI Compatible model (e.g., Xinference), resulting in empty answer in Dify frontend #477

@simons19920101

Description

Self Checks

To make sure we get to you in time, please check the following :)

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English, or they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Versions

  1. dify-plugin-daemon:0.2.0-local
  2. Dify 1.8.1

Describe the bug
When using Dify 1.8.1 with an OpenAI API-compatible model provider (e.g., Xinference) configured, the model returns an empty result (answer: "") in the Dify application chat interface, although the model itself is loaded correctly. After about 20 seconds the page shows the elapsed time and reports token usage (but completion_tokens is 0), indicating the request was sent to the model service but the final content was never correctly received or displayed by Dify. Detailed investigation confirms the issue lies in the communication between Dify's plugin_daemon component and Xinference.
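
To isolate where the stream breaks, the model service can be probed directly, bypassing Dify entirely. A minimal sketch follows; the host, port, and endpoint path are placeholders for this deployment, and it assumes a standard OpenAI-compatible chat-completions API:

```python
# Probe the OpenAI-compatible streaming endpoint directly, bypassing Dify.
# The URL below is a placeholder; substitute the actual Xinference address.
import requests

XINFERENCE_URL = "http://<xinference-host>:<port>/v1/chat/completions"

resp = requests.post(
    XINFERENCE_URL,
    json={
        "model": "qwen2.5-instruct",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
    stream=True,
    timeout=60,
)

# A well-formed OpenAI-compatible stream emits lines like:
#   data: {"choices":[{"delta":{"content":"Hi"}}], ...}
# and terminates with:
#   data: [DONE]
for line in resp.iter_lines(decode_unicode=True):
    if line:
        print(repr(line))
```

If this prints well-formed `data:` chunks, the fault is on the plugin_daemon parsing side rather than in Xinference.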

To Reproduce
Steps to reproduce the behavior:
1. Deploy Dify 1.8.1 offline on a CentOS 7.9 server.
2. Deploy Xinference in the same network environment and successfully load the qwen2.5-instruct model.
3. In Dify's backend, add an "OpenAI API-compatible" model provider.
4. Create a new application and select the configured model.
5. In the application frontend, initiate a conversation (e.g., input "Hello").
6. Observe the result: after several seconds, the page displays the elapsed time and token consumption, but the reply content remains blank. In the F12 developer tools Network panel, the answer field of the chat-messages response is an empty string.

Behavioral Differences
Prompt Generator: When using the "Prompt Generator" in Dify's workflow, plugin_daemon sends "stream": false requests and receives a standard JSON response, which displays correctly in the frontend.
Application chat: When chatting in an application, plugin_daemon sends "stream": true requests and receives an SSE response, but the frontend answer is empty.
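
For reference, the two paths yield differently shaped responses. The sketch below contrasts them; the field values are illustrative assumptions, not taken from the attached logs:

```python
# Illustrative response shapes for the two request modes (values are
# assumptions, not copied from the attached logs).

# Prompt Generator path ("stream": false) -- a single JSON body,
# which plugin_daemon parses correctly:
non_streaming_response = {
    "choices": [{"message": {"role": "assistant", "content": "Hello!"}}],
    "usage": {"prompt_tokens": 9, "completion_tokens": 2, "total_tokens": 11},
}

# Application chat path ("stream": true) -- a text/event-stream of chunks
# that must be parsed line by line and re-assembled:
sse_stream = (
    'data: {"choices":[{"delta":{"role":"assistant"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"Hello"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"!"}}]}\n\n'
    "data: [DONE]\n\n"
)
```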

Expected behavior
When a user initiates a conversation in the Dify frontend, the model should generate a reply normally and display it in the interface. Even when plugin_daemon sends a streaming request, it should correctly parse the SSE stream response from the OpenAI Compatible model service and assemble it into the complete text content returned to the frontend.
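
Conceptually, the expected handling is to collect the delta fragments from each SSE chunk and concatenate them into one answer. The sketch below only illustrates that logic in Python, assuming the standard OpenAI chat-completions chunk format; it is not plugin_daemon's actual code:

```python
import json

def aggregate_sse(lines):
    """Concatenate choices[0].delta.content fragments from an
    OpenAI-style SSE stream into the complete assistant reply."""
    parts = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

# aggregate_sse(['data: {"choices":[{"delta":{"content":"Hi"}}]}',
#                'data: [DONE]'])  ->  "Hi"
```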

Screenshots
(screenshot attached)

Additional context

Prompt Generator's logs: 1.9.txt

Application chat's logs: 2.1.txt
