[Bug]: Word (.docx) parsing fails in Title Chunker with "NoneType + str" error / [ERROR]expected string or bytes-like object, got 'NoneType' #13399

@Sandrine-lll

Description

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

20260305

RAGFlow image version

v0.24.0

Other environment information

Deployment: Docker
OS: v0.24.0
RAGFlow: Latest version
Browser: Chrome

Actual behavior

When using RAGFlow pipeline to parse a Word (.docx) document, the pipeline fails at the Title Chunker stage.

The Parser step finishes successfully, but the Title Chunker throws the following error:

unsupported operand type(s) for +: 'NoneType' and 'str'

Full log:

[File]:
10:30:15: File fetched.
10:30:15: Done
Start the pipeline...

[Parser]:
10:30:15: Start to work on a Word Processor Document
10:30:17: Done

[Title Chunker]:
10:30:17: Start to merge hierarchically.
10:30:17: unsupported operand type(s) for +: 'NoneType' and 'str'
10:30:17: [ERROR]unsupported operand type(s) for +: 'NoneType' and 'str'

The pipeline structure is:

File → Parser (Word) → Title Chunker → Embedding
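
For reference, both messages (the one in the log and the one in the issue title) are what Python raises when a `None` value reaches code that expects a string. The snippet below is only a minimal sketch of that failure mode, assuming the Word parser hands the Title Chunker a section whose text is `None`. The functions `merge_title_and_body` and `title_level` are hypothetical stand-ins, not RAGFlow's actual code.

```python
import re


def merge_title_and_body(title, body):
    # Hypothetical stand-in for a hierarchical merge step (not RAGFlow code).
    # Fails when title is None because None does not support "+" with str.
    return title + "\n" + body


def title_level(text):
    # Hypothetical heading detector; re.match rejects None as input.
    return 1 if re.match(r"^\d+\.\s", text) else 0


def capture_type_error(func, *args):
    # Run func and return the TypeError message, or None if it succeeds.
    try:
        func(*args)
    except TypeError as exc:
        return str(exc)
    return None


concat_error = capture_type_error(merge_title_and_body, None, "Body text")
regex_error = capture_type_error(title_level, None)

print(concat_error)
print(regex_error)
```

The first message is the same text as in the Title Chunker log. The second comes from the `re` module, and its exact wording depends on the Python version, so it may or may not include the `got 'NoneType'` suffix seen in the title.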

Expected behavior

Word (.docx) documents should be parsed successfully through the pipeline, including the Title Chunker stage, similar to PDF, TXT, or Markdown files.
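
One way the expected behavior could be reached is to normalize missing text before the merge step. This is a sketch under assumptions, not a proposed patch: the `(text, layout)` tuple shape and the names `normalize_sections` and `merge_sections` are hypothetical and were not taken from the RAGFlow source.

```python
def normalize_sections(sections):
    """Replace None text or layout with "" so string operations cannot fail."""
    cleaned = []
    for text, layout in sections:
        cleaned.append((text if text is not None else "", layout or ""))
    return cleaned


def merge_sections(sections):
    # Hypothetical merge: concatenate non-empty section texts, one per line.
    merged = ""
    for text, _layout in normalize_sections(sections):
        if text.strip():
            merged = merged + text + "\n"
    return merged


# A section with None text, as a .docx with an empty paragraph might produce
# (assumption), no longer breaks the merge.
sample = [("1. Introduction", "title"), (None, "text"), ("Body paragraph.", None)]
result = merge_sections(sample)
print(result)
```

Dropping empty sections here is a design choice for the sketch; a real fix might instead need to keep them so that positions or layout information stay aligned.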

Steps to reproduce

1. Deploy RAGFlow v0.24.0 with Docker and open it in Chrome.
2. Build a pipeline with the structure File → Parser (Word) → Title Chunker → Embedding.
3. Run the pipeline on a Word (.docx) document.
4. The Parser step finishes successfully, and the Title Chunker then fails with:

unsupported operand type(s) for +: 'NoneType' and 'str'

The full log is identical to the one shown under Actual behavior.

Additional information

Attachments: 标题模版.json (the file name means "title template") and one image.

Labels

🐞 bug: Something isn't working, pull request that fix bug.