[FEA] Add a setup API for cuIO to help benchmarking #19009

Open
GregoryKimball opened this issue May 28, 2025 · 1 comment
Labels
cuIO cuIO issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.
Milestone
Benchmarking

Comments

@GregoryKimball
Contributor

GregoryKimball commented May 28, 2025

Is your feature request related to a problem? Please describe.
Currently, benchmarking tools such as PDS include pre-loading work to trigger the cuIO setup that is otherwise done lazily on first use.

https://github.com/pola-rs/polars-benchmark/blob/4924e48204dd0809c260c0b0fa2174682ba7480b/queries/polars/utils.py#L66

cuIO setup steps include:

  • loading cuFile (if enabled)
  • loading nvCOMP (if host compression/decompression is set to AUTO or OFF)
  • allocating KvikIO pinned buffers

Describe the solution you'd like
We should add an API that completes the buffer allocation and library loading up front, so that the first cuIO read has the same latency as the second read. The API would need to be public and have Python bindings. Something like cudf::io::setup() would work, but another design would be fine.
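To make the idea concrete, here is a minimal sketch of an eager, idempotent setup call layered over lazily initialized resources. Everything in it is hypothetical: `LazyResource`, `RESOURCES`, and `setup()` are illustrative names, and the initializers are plain Python stand-ins rather than real cuFile, nvCOMP, or KvikIO calls.

```python
import threading
from typing import Callable, Dict, List


class LazyResource:
    """Hypothetical resource that is initialized once, on first use or on demand."""

    def __init__(self, name: str, init: Callable[[], object]) -> None:
        self.name = name
        self._init = init
        self._lock = threading.Lock()
        self._value: object = None
        self.init_count = 0  # how many times the initializer actually ran

    def get(self) -> object:
        # The lock keeps concurrent first calls from initializing twice.
        with self._lock:
            if self.init_count == 0:
                self._value = self._init()
                self.init_count += 1
            return self._value


# Stand-ins (assumptions) for the lazy steps listed in the issue.
RESOURCES: List[LazyResource] = [
    LazyResource("file_io_library", lambda: "loaded"),
    LazyResource("compression_library", lambda: "loaded"),
    LazyResource("staging_buffer", lambda: bytearray(1 << 20)),
]


def setup() -> Dict[str, int]:
    """Eagerly run every lazy initializer; safe to call more than once."""
    for resource in RESOURCES:
        resource.get()
    return {resource.name: resource.init_count for resource in RESOURCES}


counts_first = setup()
counts_second = setup()  # second call finds everything already initialized
```

Because `setup()` only touches the same lazy accessors that a first read would hit, calling it is optional: code that never calls it keeps today's lazy behavior.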

Describe alternatives you've considered
Continue adding dummy IO steps to benchmarks.
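For reference, the dummy-IO workaround amounts to something like the sketch below: call the reader once untimed, then measure. The helper `time_with_warmup` and the `LazyReader` class are hypothetical stand-ins that use only the Python standard library; `LazyReader` simulates a one-time setup cost with a sleep and is not cuIO.

```python
import time
from typing import Callable


def time_with_warmup(read: Callable[[], object], warmup: bool = True) -> float:
    """Return the wall-clock seconds for one call to `read`.

    When `warmup` is true, `read` is first called once untimed so that any
    lazy one-time setup is excluded from the measurement.
    """
    if warmup:
        read()
    start = time.perf_counter()
    read()
    return time.perf_counter() - start


class LazyReader:
    """Stand-in reader (not cuIO) that pays a one-time setup cost on first use."""

    def __init__(self, setup_seconds: float = 0.05) -> None:
        self.setup_seconds = setup_seconds
        self.setup_calls = 0
        self._ready = False

    def __call__(self) -> int:
        if not self._ready:
            time.sleep(self.setup_seconds)  # simulated lazy setup
            self.setup_calls += 1
            self._ready = True
        return 42


cold = time_with_warmup(LazyReader(), warmup=False)  # includes setup cost
warm = time_with_warmup(LazyReader(), warmup=True)   # setup paid before timing
```

In a real benchmark the `read` callable would wrap whatever small read the benchmark already uses for pre-loading. The drawback noted in the issue remains: every benchmark has to carry its own warm-up step.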

Additional context
We talked about doing this automatically during import cudf, but it would increase the overall import time. See #627

@GregoryKimball GregoryKimball added feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. cuIO cuIO issue labels May 28, 2025
@GregoryKimball GregoryKimball added this to the Benchmarking milestone May 28, 2025
@vyasr
Contributor

vyasr commented May 30, 2025

I'd rather document this somewhere, or keep some snippets outside of the main library that we can look into. It feels too specific to be something that actually lives in the package itself long-term. It could be short-lived inside the repo while we experiment, though.
