[FEA] Support polars.Expr.str.normalize in cudf-polars #19001

@brandon-b-miller

Is your feature request related to a problem? Please describe.
Today on branch-25.08, cudf-polars lacks support for polars.Expr.str.normalize. Running the example from the docs with the GPU engine raises a NotImplementedError:

import polars as pl
engine = pl.GPUEngine(raise_on_fail=True)

df = pl.DataFrame({"text": ["01²", "KADOKAWA"]}).lazy()
df = df.with_columns(
    nfc=pl.col("text").str.normalize("NFC"),
    nfkc=pl.col("text").str.normalize("NFKC"),
)

res = df.collect(engine=engine)
print(res)

NotImplementedError('String function <Name.Normalize: 22>

Describe the solution you'd like
I'd like the code above to run on the polars GPU backend. I don't see a feature, or group of features, in libcudf that would obviously compute the equivalent on the GPU, so we may eventually have to build this more or less from scratch.
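For reference, here is a minimal CPU-side sketch of the semantics a GPU implementation would need to match. It uses Python's standard-library unicodedata module, not anything in cudf or libcudf. The helper name normalize_strings is hypothetical, and passing nulls through as None is an assumption about the desired null handling.

```python
import unicodedata

# Normalization forms named in the Unicode standard (UAX #15).
SUPPORTED_FORMS = {"NFC", "NFKC", "NFD", "NFKD"}


def normalize_strings(values, form="NFC"):
    """Normalize each string in `values` to the given Unicode form.

    Hypothetical reference helper: None entries are passed through
    unchanged, which is an assumption about null handling.
    """
    if form not in SUPPORTED_FORMS:
        raise ValueError(f"unsupported normalization form: {form!r}")
    return [None if v is None else unicodedata.normalize(form, v) for v in values]


# Same input column as the reproducer above.
text = ["01²", "KADOKAWA"]
nfc = normalize_strings(text, "NFC")
nfkc = normalize_strings(text, "NFKC")

print(nfc)   # canonical composition leaves the superscript two alone
print(nfkc)  # compatibility composition folds the superscript two to "2"
```

A sketch like this could serve as a test oracle when comparing GPU results against the CPU engine. It says nothing about how the lookup tables and composition steps would be laid out on the device.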

Describe alternatives you've considered
N/A

Additional context
#16480

Labels: cudf-polars, feature request
