[FEA] Support polars.Expr.str.split in cudf-polars #19026

Open
brandon-b-miller opened this issue May 29, 2025 · 0 comments
Labels
cudf.polars (Issues specific to cudf.polars), feature request (New feature or request)

Is your feature request related to a problem? Please describe.
Today on branch-25.08, cudf-polars lacks support for polars.Expr.str.split. Running the example from the docs with raise_on_fail=True raises a NotImplementedError:

import polars as pl
engine = pl.GPUEngine(raise_on_fail=True)

df = pl.DataFrame({"s": ["foo bar", "foo_bar", "foo_bar_baz"]}).lazy()
df = df.with_columns(
    pl.col("s").str.split(by="_").alias("split"),
    pl.col("s").str.split(by="_", inclusive=True).alias("split_inclusive"),
)
res = df.collect(engine=engine)
print(res)

NotImplementedError('String function <Name.Split: 29>')

Describe the solution you'd like
I'd like the above code to execute using the polars GPU backend. libcudf has some string-splitting functionality, but in most cases it returns a table of the split elements. In addition, since this API returns a list[str] dtype, it's blocked on general support for list dtypes in cudf-polars.
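
To make the target semantics concrete, here is a minimal pure-Python sketch of what the docs example is expected to produce per row, plus the table-shaped layout mentioned above. It is only a reference model, not the polars or libcudf implementation: the function names `split_reference` and `records_to_columns` are hypothetical, and the table layout (one column per token position, padded with nulls) is my understanding of the table-returning split rather than something confirmed here.

```python
from typing import List, Optional


def split_reference(s: str, by: str, inclusive: bool = False) -> List[str]:
    """Reference model of a per-row split (hypothetical helper).

    With inclusive=True the delimiter stays attached to the end of each
    piece, and a trailing empty piece is dropped (assumed to mirror
    Rust-style split_inclusive behaviour). An empty `by` is not handled:
    Python's str.split raises ValueError for it.
    """
    parts = s.split(by)
    if not inclusive:
        return parts
    # Re-attach the delimiter to every piece except the last one.
    out = [p + by for p in parts[:-1]]
    if parts[-1]:
        out.append(parts[-1])
    return out


def records_to_columns(
    records: List[List[str]],
) -> List[List[Optional[str]]]:
    """Turn list-shaped rows into a table: column i holds the i-th token
    of every row, with None where a row has fewer tokens."""
    width = max((len(r) for r in records), default=0)
    return [
        [r[i] if i < len(r) else None for r in records]
        for i in range(width)
    ]


if __name__ == "__main__":
    data = ["foo bar", "foo_bar", "foo_bar_baz"]
    # list[str]-shaped result, one list per row, as str.split returns.
    print([split_reference(s, "_") for s in data])
    print([split_reference(s, "_", inclusive=True) for s in data])
    # Table-shaped result: one column per token position.
    print(records_to_columns([split_reference(s, "_") for s in data]))
```

The gap between the two shapes is the point: the list form keeps a variable number of tokens per row, while the table form needs a fixed number of columns and null padding, so converting between them is extra work on top of the missing list dtype support.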

Describe alternatives you've considered
N/A

Additional context
#16480

@brandon-b-miller added the feature request label on May 29, 2025
@wence- added the cudf.polars label on May 29, 2025
Projects
Status: Todo