Self Checks
To make sure we get to you in time, please check the following :)
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [FOR CHINESE USERS] Please be sure to submit issues in English, or they will be closed. Thank you! :)
- Please do not modify this template :) and fill in all the required fields.
Versions
- dify-plugin-daemon: 0.2.0-local
- Dify: 1.8.1
Describe the bug
When using Dify 1.8.1 with an OpenAI API-compatible model provider (e.g., Xinference) configured, the model returns an empty result (answer: "") in the Dify application chat interface, even though the model itself is loaded correctly. The page shows an elapsed time of about 20 seconds and reports token usage (but completion_tokens is 0), indicating the request was sent to the model service but the final content was not correctly received or displayed by Dify. Detailed investigation confirms the issue lies in the communication between Dify's plugin_daemon component and Xinference.
To Reproduce
Steps to reproduce the behavior:
1. Deploy Dify 1.8.1 offline on a CentOS 7.9 server.
2. Deploy Xinference in the same network environment and successfully load the qwen2.5-instruct model.
3. In Dify's backend, add an "OpenAI API-compatible" model provider.
4. Create a new application and select the configured model.
5. In the application frontend, initiate a conversation (e.g., input "Hello").
6. Observe the result: after several seconds, the page displays the elapsed time and token consumption, but the reply content remains blank. In the F12 developer tools Network panel, the answer field in the chat-messages request response is an empty string.
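To isolate whether the fault lies in Xinference or in plugin_daemon, the same request can be replayed directly against the model endpoint, bypassing Dify. Below is a minimal sketch; the host, port, and model name are placeholders (not taken from the attached logs), and the payload shape follows the OpenAI-compatible chat completions schema:

```python
import json

# Placeholder values -- substitute your actual Xinference host and model name.
XINFERENCE_URL = "http://<xinference-host>:9997/v1/chat/completions"
MODEL = "qwen2.5-instruct"

def build_chat_payload(prompt: str, stream: bool) -> str:
    """Body of the OpenAI-compatible request that plugin_daemon sends:
    stream=False for the Prompt Generator, stream=True for application chat."""
    return json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    })

if __name__ == "__main__":
    # Print curl commands to replay both modes and compare the raw responses.
    for stream in (False, True):
        print(f"curl -N -s {XINFERENCE_URL} "
              f"-H 'Content-Type: application/json' "
              f"-d '{build_chat_payload('Hello', stream)}'")
```

If the streaming (`"stream": true`) curl call returns SSE chunks with non-empty content, the model side is healthy and the problem is in how plugin_daemon consumes the stream.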
Behavioral Differences
Prompt Generator: When using the "Prompt Generator" in Dify's workflow, plugin_daemon sends "stream": false requests and receives standard JSON responses, which display correctly in the frontend.
Application chat: When chatting in an application, plugin_daemon sends "stream": true requests and receives SSE responses, but the frontend answer is empty.
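The two modes return differently framed bodies, which is exactly where the behavior diverges: a single JSON object for "stream": false, versus a sequence of `data:` lines (SSE) for "stream": true. A minimal sketch of what correct handling of each looks like (field paths follow the OpenAI-compatible schema; the sample payloads are illustrative, not taken from the attached logs):

```python
import json

def parse_non_stream(body: str) -> str:
    """Extract the answer from a standard (non-streaming) JSON response."""
    return json.loads(body)["choices"][0]["message"]["content"]

def parse_sse(raw: str) -> str:
    """Accumulate delta fragments from an SSE response into the full answer.
    If this accumulation is skipped or fails silently, the frontend
    receives answer: "" -- the symptom described in this report."""
    parts = []
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank lines between SSE events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        delta = json.loads(data)["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)

# Illustrative samples in the two wire formats.
non_stream_body = json.dumps(
    {"choices": [{"message": {"role": "assistant", "content": "Hi there"}}]}
)
sse_body = (
    'data: {"choices": [{"delta": {"role": "assistant", "content": "Hi"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": " there"}}]}\n\n'
    "data: [DONE]\n\n"
)
```

Both samples decode to the same text; a daemon that handles only the first format would explain why the Prompt Generator works while application chat returns an empty answer.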
Expected behavior
When a user initiates a conversation in the Dify frontend, the model should generate a reply normally and display it in the interface. Even when plugin_daemon sends a streaming request, it should correctly handle the SSE stream response from the OpenAI-compatible model service, assemble it into complete text, and return that text to the frontend.
Additional context
[Prompt Generator's logs]1.9.txt
[Application Chat's logs]2.1.txt
