Self Checks
To make sure we get to you in time, please check the following :)
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [FOR CHINESE USERS] Please be sure to submit issues in English, or they will be closed. Thank you! :)
- Please do not modify this template :) and fill in all the required fields.
Versions
- dify-plugin-daemon: 0.2.0-local
- Dify: 1.8.1
Describe the bug
When using Dify 1.8.1 with an OpenAI API-compatible model provider (e.g., Xinference) configured, the model returns an empty result (answer: "") in the Dify application chat interface, even though the model itself is loaded correctly. The page shows an elapsed time of about 20 seconds and reports token usage (but completion_tokens is 0), indicating the request was sent to the model service but the final content was not correctly received or displayed by Dify. Detailed investigation confirms the issue lies in the communication between Dify's plugin_daemon component and Xinference.
To Reproduce
Steps to reproduce the behavior:
1. Deploy Dify 1.8.1 offline on a CentOS 7.9 server.
2. Deploy Xinference in the same network environment and successfully load the qwen2.5-instruct model.
3. In Dify's backend, add an "OpenAI API-compatible" model provider.
4. Create a new application and select the configured model.
5. In the application frontend, initiate a conversation (e.g., input "Hello").
6. Observe the result: after several seconds, the page displays the elapsed time and token consumption, but the reply content remains blank. In the F12 developer tools Network panel, the answer field in the chat-messages request response is an empty string.
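To isolate whether the fault lies in Xinference or in plugin_daemon, the same request can be replayed directly against the model endpoint, bypassing Dify. Below is a minimal sketch; the host, port, and model name are placeholders (not taken from the attached logs), and the payload shape follows the OpenAI-compatible chat completions schema:

```python
import json

# Placeholder values -- substitute your actual Xinference host and model name.
XINFERENCE_URL = "http://<xinference-host>:9997/v1/chat/completions"
MODEL = "qwen2.5-instruct"

def build_chat_payload(prompt: str, stream: bool) -> str:
    """Body of the OpenAI-compatible request that plugin_daemon sends:
    stream=False for the Prompt Generator, stream=True for application chat."""
    return json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    })

if __name__ == "__main__":
    # Print curl commands to replay both modes and compare the raw responses.
    for stream in (False, True):
        print(f"curl -N -s {XINFERENCE_URL} "
              f"-H 'Content-Type: application/json' "
              f"-d '{build_chat_payload('Hello', stream)}'")
```

If the streaming (`"stream": true`) curl call returns SSE chunks with non-empty content, the model side is healthy and the problem is in how plugin_daemon consumes the stream.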
Behavioral Differences
Prompt Generator: When using the "Prompt Generator" in Dify's workflow, plugin_daemon sends "stream": false requests and receives standard JSON responses, which display correctly in the frontend.
Application chat: When chatting in an application, plugin_daemon sends "stream": true requests and receives SSE responses, but the frontend answer is empty.
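The two modes return differently framed bodies, which is exactly where the behavior diverges: a single JSON object for "stream": false, versus a sequence of `data:` lines (SSE) for "stream": true. A minimal sketch of what correct handling of each looks like (field paths follow the OpenAI-compatible schema; the sample payloads are illustrative, not taken from the attached logs):

```python
import json

def parse_non_stream(body: str) -> str:
    """Extract the answer from a standard (non-streaming) JSON response."""
    return json.loads(body)["choices"][0]["message"]["content"]

def parse_sse(raw: str) -> str:
    """Accumulate delta fragments from an SSE response into the full answer.
    If this accumulation is skipped or fails silently, the frontend
    receives answer: "" -- the symptom described in this report."""
    parts = []
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank lines between SSE events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        delta = json.loads(data)["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)

# Illustrative samples in the two wire formats.
non_stream_body = json.dumps(
    {"choices": [{"message": {"role": "assistant", "content": "Hi there"}}]}
)
sse_body = (
    'data: {"choices": [{"delta": {"role": "assistant", "content": "Hi"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": " there"}}]}\n\n'
    "data: [DONE]\n\n"
)
```

Both samples decode to the same text; a daemon that handles only the first format would explain why the Prompt Generator works while application chat returns an empty answer.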
Expected behavior
When a user initiates a conversation in the Dify frontend, the model should generate a reply normally and display it in the interface. Even when plugin_daemon sends a streaming request, it should correctly handle the SSE stream response from the OpenAI-compatible model service, assemble it into complete text, and return that text to the frontend.
Additional context
[Prompt Generator's logs]1.9.txt
[Application Chat's logs]2.1.txt
