[FEA] Support polars.Expr.str.len_bytes in cudf-polars #18999

Open
brandon-b-miller opened this issue May 28, 2025 · 2 comments
Labels
cudf.polars Issues specific to cudf.polars feature request New feature or request

Comments

@brandon-b-miller
Contributor

Is your feature request related to a problem? Please describe.
Today on branch-25.08, we lack support for polars.Expr.str.len_bytes. Running the example from the docs yields:

import polars as pl
engine = pl.GPUEngine(raise_on_fail=True)

df = pl.DataFrame({"a": ["Café", "345", "東京", None]}).lazy()
df = df.with_columns(
    pl.col("a").str.len_bytes().alias("n_bytes"),
    pl.col("a").str.len_chars().alias("n_chars"),
)

res = df.collect(engine=engine)
print(res)
NotImplementedError('String function <Name.LenChars: 20>
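
For reference, the values this example is expected to produce can be sketched in plain Python, without polars or cudf. This is only an illustration of the semantics: the helper names `len_bytes` and `len_chars` below are hypothetical stand-ins that mirror the expression names, and the sketch assumes the strings are measured as UTF-8 with nulls propagating as nulls.

```python
from typing import List, Optional


def len_bytes(s: Optional[str]) -> Optional[int]:
    """Length of s in UTF-8 bytes; a null input stays null."""
    return None if s is None else len(s.encode("utf-8"))


def len_chars(s: Optional[str]) -> Optional[int]:
    """Length of s in Unicode code points; a null input stays null."""
    return None if s is None else len(s)


# Same values as the column "a" in the reproducer above.
values: List[Optional[str]] = ["Café", "345", "東京", None]

n_bytes = [len_bytes(v) for v in values]
n_chars = [len_chars(v) for v in values]

print(n_bytes)  # [5, 3, 6, None]
print(n_chars)  # [4, 3, 2, None]
```

The two columns differ only for non-ASCII rows: "é" takes 2 bytes in UTF-8 and each of "東" and "京" takes 3.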

Describe the solution you'd like
I'd like the above code to execute using the polars GPU backend. This API simply returns the length of each string in bytes. There doesn't seem to be a public libcudf API that does this today, but cudf::string_view does expose the size of the string in bytes, so this should be fairly simple to add in theory.
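
As a rough sketch of why this should be cheap: in an Arrow-style string column, the UTF-8 bytes of all rows are concatenated and an offsets buffer marks the row boundaries, so the byte length of row i is the difference of two adjacent offsets. The pure-Python model below only illustrates that idea. The helper names `build_string_column` and `len_bytes_from_offsets` are hypothetical, and whether a libcudf or cudf-polars implementation would take this route is an assumption, not something stated in this issue.

```python
from typing import List, Optional, Tuple


def build_string_column(
    values: List[Optional[str]],
) -> Tuple[bytes, List[int], List[bool]]:
    """Model an Arrow-style string column: chars buffer, offsets, validity."""
    chars = bytearray()
    offsets = [0]
    valid = []
    for v in values:
        if v is not None:
            chars += v.encode("utf-8")
        # A null row contributes no bytes, so its offset repeats.
        offsets.append(len(chars))
        valid.append(v is not None)
    return bytes(chars), offsets, valid


def len_bytes_from_offsets(
    offsets: List[int], valid: List[bool]
) -> List[Optional[int]]:
    """Byte length per row as adjacent-offset differences; nulls stay null."""
    return [
        offsets[i + 1] - offsets[i] if valid[i] else None
        for i in range(len(valid))
    ]


chars, offsets, valid = build_string_column(["Café", "345", "東京", None])
lengths = len_bytes_from_offsets(offsets, valid)
print(offsets)  # [0, 5, 8, 14, 14]
print(lengths)  # [5, 3, 6, None]
```

Note that no character decoding is needed for the byte length, unlike a character count, which has to inspect the bytes of each row.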

Describe alternatives you've considered
N/A

Additional context
#16480

@brandon-b-miller brandon-b-miller added the feature request New feature or request label May 28, 2025
@davidwendt
Contributor

This seems to be the same as #19000
Do we need 2 issues for this?

@brandon-b-miller
Contributor Author

The two are slightly different, len_bytes vs len_chars.

@wence- wence- added the cudf.polars Issues specific to cudf.polars label May 28, 2025
Projects
Status: Todo
Development

No branches or pull requests

3 participants