
Dataframe doesn't properly implement ArrowStream export interface #1166

Closed
@johnnyg


Describe the bug
When passing a DataFrame to another library that expects the ArrowStream export interface, we get the following error:

TypeError: argument 'input': DataFrame.__arrow_c_stream__() missing 1 required positional argument: 'requested_schema'

This is because the requested_schema argument should be optional (defaulting to None), but DataFrame.__arrow_c_stream__() currently requires it.
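To illustrate the difference, here is a minimal pure-Python sketch. StrictExporter, CompliantExporter and callable_without_schema are hypothetical names made up for this example, not part of datafusion or arro3, and the returned object() is only a placeholder for a real Arrow stream capsule. It shows why a consumer that calls __arrow_c_stream__() with no arguments fails against the first class but not the second:

```python
import inspect


class StrictExporter:
    """Hypothetical stand-in: requested_schema is required, as in this report."""

    def __arrow_c_stream__(self, requested_schema):
        return object()  # placeholder, not a real Arrow stream capsule


class CompliantExporter:
    """Hypothetical stand-in: requested_schema is optional and defaults to None."""

    def __arrow_c_stream__(self, requested_schema=None):
        return object()  # placeholder, not a real Arrow stream capsule


def callable_without_schema(obj) -> bool:
    """Return True if obj.__arrow_c_stream__() can be called with no arguments."""
    signature = inspect.signature(obj.__arrow_c_stream__)
    try:
        signature.bind()  # simulate a call that passes no arguments
    except TypeError:
        return False
    return True


print(callable_without_schema(StrictExporter()))
print(callable_without_schema(CompliantExporter()))
```

A consumer that omits requested_schema hits a TypeError on the strict variant, which matches the shape of the error above.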

To Reproduce

from arro3.core import RecordBatchReader
import datafusion

data = [{"num": 42}]
ctx = datafusion.SessionContext()
df = ctx.from_pylist(data)
reader = RecordBatchReader.from_arrow(df)

Expected behavior
The above should run without error.

Additional context
Replacing RecordBatchReader.from_arrow(df) with RecordBatchReader.from_arrow(df.df) works around the bug.
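Another possible workaround, for anyone who would rather not reach into df.df, is a thin adapter that makes the argument optional and forwards it. This is only a sketch: OptionalSchemaStream and FakeFrame are hypothetical names invented here, and it assumes the wrapped object accepts None for requested_schema and that the consumer only needs __arrow_c_stream__, neither of which has been verified against datafusion or arro3.

```python
class OptionalSchemaStream:
    """Hypothetical adapter: exposes __arrow_c_stream__ with an optional argument."""

    def __init__(self, inner):
        self._inner = inner

    def __arrow_c_stream__(self, requested_schema=None):
        # Forward explicitly so the wrapped method always receives the argument.
        return self._inner.__arrow_c_stream__(requested_schema)


class FakeFrame:
    """Hypothetical stand-in for an exporter whose requested_schema is required."""

    def __init__(self):
        self.seen = []

    def __arrow_c_stream__(self, requested_schema):
        self.seen.append(requested_schema)
        return object()  # placeholder, not a real Arrow stream capsule


frame = FakeFrame()
wrapped = OptionalSchemaStream(frame)
wrapped.__arrow_c_stream__()  # no TypeError: None is forwarded to the wrapped object
```

With the real objects, usage would presumably look like RecordBatchReader.from_arrow(OptionalSchemaStream(df)), again assuming the consumer accepts any object that implements the stream export method.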


Labels: bug
