Is your feature request related to a problem? Please describe.
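Currently, benchmarking tools like PDS include some pre-loading work to complete the cuIO setup that is otherwise done lazily on the first read.
https://github.com/pola-rs/polars-benchmark/blob/4924e48204dd0809c260c0b0fa2174682ba7480b/queries/polars/utils.py#L66
cuIO setup steps include:
- loading nvCOMP (if host compression or decompression is set to AUTO or OFF)
- allocating KvikIO pinned buffers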
Describe the solution you'd like
We should add an API that completes the buffer allocation and library loading up front, so that the first cuIO read has the same latency as the second read. The API would need to be public and have Python bindings. cudf::io::setup() is one possible name, but some other design would be fine.
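For discussion only, here is a minimal Python sketch of the behavior such an entry point might have: it runs each one-time step eagerly, and repeated calls are cheap no-ops. The `EagerSetup` class and the step callables are hypothetical placeholders, not existing cudf or libcudf APIs. The real steps would be the nvCOMP loading and KvikIO pinned-buffer allocation listed above.

```python
import threading
from typing import Callable, Sequence


class EagerSetup:
    """Run a fixed list of one-time initialization steps at most once.

    Hypothetical illustration only; not part of cudf.
    """

    def __init__(self, steps: Sequence[Callable[[], None]]) -> None:
        self._steps = list(steps)
        self._lock = threading.Lock()
        self._done = False

    def __call__(self) -> bool:
        """Run the steps if they have not run yet.

        Returns True if this call performed the work, False if the
        setup had already completed.
        """
        with self._lock:
            if self._done:
                return False
            for step in self._steps:
                step()
            # Only mark as done after every step succeeded, so a failed
            # setup can be retried by calling again.
            self._done = True
            return True
```

A caller would construct this once with the real initialization steps and invoke it before the first timed read. Calling it again later costs only a lock acquisition.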
Describe alternatives you've considered
Continue adding dummy IO steps to benchmarks.
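A minimal sketch of what that dummy IO step amounts to in a benchmark harness, assuming the read is wrapped in a zero-argument callable. The `time_reads` helper is hypothetical and not taken from PDS. The optional warm-up call pays the lazy setup cost outside the timed region.

```python
import time
from typing import Callable, List


def time_reads(
    read: Callable[[], object], repeats: int = 2, warm_up: bool = False
) -> List[float]:
    """Return the wall-clock seconds taken by each of `repeats` calls to `read`."""
    if warm_up:
        # Throwaway call: triggers any lazy one-time setup before timing starts.
        read()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        read()
        timings.append(time.perf_counter() - start)
    return timings
```

With cudf installed, `read` could be something like `lambda: cudf.read_parquet(path)`, where `path` points at any small Parquet file. Without the warm-up, the first entry in the returned list would be expected to include the one-time setup cost.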
Additional context
We talked about doing this automatically during import cudf, but that would add to the overall import time. See #627.
I'd rather document this somewhere, or keep some snippets outside of the main library that we can look into. It feels too specific to be something that lives in the package itself long-term. It could live inside the repo for a short while as we experiment, though.