Multi-gpu KNN build for UMAP using all-neighbors API #6654

Merged: 41 commits into rapidsai:branch-25.06 on May 29, 2025

Conversation

@jinsolp (Contributor) commented May 8, 2025

Description

Allows multi-GPU KNN graph building in UMAP using the all-neighbors API.

PRs that need to be merged before this one

Changes in cuML UMAP usage

from pylibraft.common import DeviceResourcesSNMG
from cuml.manifold import UMAP

# To use multiple GPUs, pass an SNMG (single-node multi-GPU) handle
multigpu_handle = DeviceResourcesSNMG()
umap_nnd = UMAP(handle=multigpu_handle,
                build_algo="nn_descent",
                build_kwds={"nnd_n_nearest_clusters": 2,
                            "nnd_n_clusters": 8,
                            "nnd_graph_degree": 32,
                            "nnd_max_iterations": 20})

Closes #6729

@jinsolp jinsolp requested review from a team as code owners May 8, 2025 20:44
@jinsolp jinsolp requested review from dantegd and vyasr May 8, 2025 20:44
@github-actions github-actions bot added the Cython / Python and CUDA/C++ labels May 8, 2025
@divyegala (Member) left a comment

The hints in the Python docs are awesome. But again, there is no way for a UMAP user to choose a multi-GPU build. We must not enable this automatically.

@jinsolp (Contributor, Author) commented May 12, 2025

@divyegala

But again, there is no way for a UMAP user to choose a multi-GPU build. We must not enable this automatically.

I am working on a multi-GPU resource on the RAFT side. The current plan is to wrap it in pylibraft and advise users to explicitly pass it as the handle if they want to use multiple GPUs, like this:

from pylibraft.common import DeviceResourcesSNMG
from cuml.manifold import UMAP

# To use multiple GPUs
multigpu_handle = DeviceResourcesSNMG()
# Or indicate which GPUs to use
multigpu_handle = DeviceResourcesSNMG([2, 4, 6, 7])

# Pass the multi-GPU handle as the handle to UMAP
umap_nnd = UMAP(handle=multigpu_handle,
                build_algo="nn_descent",
                build_kwds={"n_nearest_clusters": 2,
                            "n_clusters": 8,
                            "nn_descent": {"graph_degree": 32, "max_iterations": 20}})

If no handle is passed, the default handle is used, which results in a single-GPU run.
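
For contrast, a minimal single-GPU sketch (the cuml.manifold.UMAP import path and the example data are assumptions for illustration):

import numpy as np
from cuml.manifold import UMAP

X = np.random.rand(5000, 32).astype(np.float32)  # illustrative data

# No handle passed: the default handle is used, so the build runs on a single GPU
umap_single = UMAP(build_algo="nn_descent")
embedding = umap_single.fit_transform(X)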

Do you think there is a better way to let users do a multi-gpu build? Any advice would be greatly appreciated : )

@divyegala (Member) commented

@jinsolp thanks for the explanation, that API looks good to me. Opt-in behavior is perfect.

We need to be very explicit about the fact that this is not multi-GPU UMAP but rather a multi-GPU KNN step in UMAP. That's because users of cuml-dask expect their data to be distributed across GPUs while we still produce a learned ML model, but in this use case the data will need to be on a single GPU for the other parts of UMAP.

@jinsolp jinsolp force-pushed the umap-use-all-neighbors branch from d2383e9 to 888dd7a May 12, 2025 22:16
@jinsolp jinsolp removed request for dantegd and vyasr May 12, 2025 22:34
@divyegala (Member) left a comment

Single-GPU C++ and Python LGTM, but I'd like @csadorf to assign a Python reviewer as well. For multi-GPU, I'd like to see tests added perhaps in the dask module to ensure the environment has enough GPUs.

@viclafargue (Contributor) left a comment

Thanks @jinsolp! Just two small comments.

@csadorf csadorf self-requested a review May 13, 2025 16:38
@csadorf (Contributor) commented May 13, 2025

I would like to review this PR prior to merge.

@jinsolp (Contributor, Author) commented May 21, 2025

[After discussing offline]
For now, we decided to stick with the current nnd_-prefixed build_kwds without any nested kwds, since we only expose the NN Descent build for UMAP. This PR is no longer a breaking change and will just add an additional nnd_n_nearest_clusters key to build_kwds.
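
For illustration, the resulting call shape then mirrors the PR description (a sketch; the cuml.manifold.UMAP import path is an assumption):

from pylibraft.common import DeviceResourcesSNMG
from cuml.manifold import UMAP

umap_nnd = UMAP(
    handle=DeviceResourcesSNMG(),      # opt-in multi-GPU KNN build
    build_algo="nn_descent",
    build_kwds={
        "nnd_n_nearest_clusters": 2,   # the one new key added by this PR
        "nnd_n_clusters": 8,
        "nnd_graph_degree": 32,
        "nnd_max_iterations": 20,
    },
)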

@csadorf csadorf added the non-breaking label and removed the breaking label May 22, 2025
@divyegala (Member) left a comment

C++ approval only. LGTM!

@csadorf (Contributor) commented May 27, 2025

@jcrist Can you approve this PR assuming that your concerns were addressed?

@github-actions github-actions bot removed the CMake label May 27, 2025
@jinsolp (Contributor, Author) commented May 27, 2025

Changed the docs in the .pyx file (content mostly unchanged, style adjusted) because of a docs build failure in CI.
cc @csadorf

@divyegala (Member) left a comment

This PR is blocked on some changes from rapidsai/cuvs#944

@csadorf (Contributor) left a comment

A combination of nnd_n_clusters > 1 and data_on_host=False (the default) will currently break user code, because batching is not supported on device.

The agreed mitigation approach is to auto-set data_on_host=True with a deprecation warning in 25.06, and then to require data_on_host=True in combination with nnd_n_clusters > 1 as of 25.08.
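
For example, a sketch of the safe combination (assuming UMAP.fit exposes a data_on_host flag, as in recent cuML releases):

import numpy as np
from cuml.manifold import UMAP

X = np.random.rand(10000, 64).astype(np.float32)  # host (NumPy) data

umap_batched = UMAP(build_algo="nn_descent",
                    build_kwds={"nnd_n_clusters": 8})
# Batched NN Descent partitions the data into clusters and requires host data;
# with nnd_n_clusters > 1, the device-input default (data_on_host=False) breaks.
umap_batched.fit(X, data_on_host=True)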

@csadorf csadorf dismissed divyegala’s stale review May 28, 2025 21:00

No longer blocked.

@csadorf (Contributor) left a comment

Just a tiny suggestion for language, otherwise LGTM.

@divyegala (Member) commented

/merge

@rapids-bot rapids-bot bot merged commit c1a572d into rapidsai:branch-25.06 May 29, 2025
92 of 93 checks passed
Labels
CUDA/C++ · Cython / Python · feature request · non-breaking

Development

Successfully merging this pull request may close these issues:
Multi-GPU KNN graph construction in UMAP

6 participants