[SYCL] Use shared_ptr instead of manual changing UR counters #18565

Alexandr-Konovalov · 2025-05-20T12:50:04Z

Keep the kernel and program handles in a structure whose lifetime is controlled by shared_ptr. This is faster than the current implementation, because copying or destroying a shared_ptr requires only one atomic operation, while a pair of kernel/program retain/release calls requires 2 atomic operations in the best case.

Keep the kernel and program handles in a structure with lifetime controlled by shared_ptr. This is faster than the current implementation, because copying or destroying a shared_ptr requires only one atomic operation, while a pair of kernel/program retain/release calls requires at least 2 atomic operations.

Signed-off-by: Alexandr Konovalov <[email protected]>
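
The difference between the two schemes can be illustrated with a small sketch. It is not the PR's code: `MockHandle`, `ManualUse`, and `CacheVal` are hypothetical stand-ins, and the mock `retain`/`release` methods only imitate reference-counted UR objects.

```cpp
#include <atomic>
#include <memory>

// Hypothetical stand-in for a UR object with its own atomic reference counter.
struct MockHandle {
  std::atomic<int> RefCount{1}; // starts with one "creation" reference
  void retain() { RefCount.fetch_add(1, std::memory_order_relaxed); }
  void release() { RefCount.fetch_sub(1, std::memory_order_acq_rel); }
};

// Manual scheme (sketch): every user retains and later releases both
// handles, so each use costs two atomic increments and two decrements.
struct ManualUse {
  MockHandle &Kernel;
  MockHandle &Program;
  ManualUse(MockHandle &K, MockHandle &P) : Kernel(K), Program(P) {
    Kernel.retain();
    Program.retain();
  }
  ~ManualUse() {
    Kernel.release();
    Program.release();
  }
};

// shared_ptr scheme (sketch): both handles live in one object that takes
// over the creation references. Copying the shared_ptr touches only the
// control block's counter; the handles are released once, by the last owner.
struct CacheVal {
  MockHandle &Kernel;
  MockHandle &Program;
  CacheVal(MockHandle &K, MockHandle &P) : Kernel(K), Program(P) {}
  ~CacheVal() {
    Kernel.release();
    Program.release();
  }
};
```

With this layout, `auto Copy = Val;` on a `std::shared_ptr<CacheVal>` leaves both mock counters untouched, while constructing a `ManualUse` increments each of them.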

sycl/test-e2e/KernelAndProgram/disable-caching.cpp

sycl/test-e2e/XPTI/basic_event_collection_linux.cpp

sycl/source/detail/graph_impl.cpp

vinser52 · 2025-06-02T12:58:46Z

@intel/dpcpp-nativecpu-reviewers please review.

hvdijk · 2025-06-02T13:06:00Z

I'm not sure why NativeCPU needs to review this. The diff shows sycl/test/native_cpu/multiple_definitions.cpp being added, but that isn't from this PR, that's from #18726. What happened to cause it to show up here?

Alexandr-Konovalov · 2025-06-02T13:14:30Z

Sorry, my mistake. Removed.

aelovikov-intel · 2025-06-02T14:37:20Z

sycl/source/detail/kernel_name_based_cache_t.hpp

+
+struct FastKernelCacheVal {
+  ur_kernel_handle_t MKernelHandle;    /* UR kernel handle pointer. */
+  std::mutex *MMutex;                  /* Mutex guarding this kernel. */


Shouldn't this be a reference?

Sadly, no, as we pass nullptr when caching is disabled.

I would suggest making such a change in a separate PR if we agree it is needed. This PR only changes the ref-counting scheme for the cache, not the data structure fields.

Sadly, no, as we pass nullptr when caching is disabled.

This is important, can you mention that in the in-code comment?

Done.
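
To make the pointer-versus-reference point concrete, here is a minimal sketch, assuming a simplified hypothetical `Entry` type and a hypothetical `lockIfCached` helper (neither is taken from the PR). A reference member could not express the "no mutex" case that arises when caching is disabled, whereas a pointer can be nullptr and checked before locking.

```cpp
#include <mutex>

// Hypothetical, simplified entry: the mutex pointer may be nullptr,
// which is the case described above for disabled caching.
struct Entry {
  std::mutex *MMutex;
};

// Hypothetical helper: lock the mutex only when there is one.
// A default-constructed unique_lock owns nothing and unlocks nothing.
inline std::unique_lock<std::mutex> lockIfCached(const Entry &E) {
  return E.MMutex ? std::unique_lock<std::mutex>(*E.MMutex)
                  : std::unique_lock<std::mutex>();
}
```

Callers can then write `auto Lock = lockIfCached(E);` without branching on whether caching is enabled.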

aelovikov-intel · 2025-06-02T14:37:38Z

sycl/source/detail/kernel_name_based_cache_t.hpp

+  const KernelArgMask *MKernelArgMask; /* Eliminated kernel argument mask. */
+  ur_program_handle_t MProgramHandle;  /* UR program handle corresponding to
+                                     this kernel. */
+  const Adapter *MAdapterPtr;          /* We can keep raw pointer to the


aelovikov-intel · 2025-06-02T14:38:38Z

sycl/source/detail/kernel_name_based_cache_t.hpp

+        MKernelArgMask(KernelArgMask), MProgramHandle(ProgramHandle),
+        MAdapterPtr(AdapterPtr) {}
+
+  ~FastKernelCacheVal() {


Can we set the handles to nullptr after "Release"? It should be relatively cheap but can save lots of time debugging obscure errors.
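
The suggestion can be sketched as follows, under stated assumptions: `MockObject`, `mockRelease`, and `Holder` are hypothetical names, and the mock release call only decrements a plain counter rather than calling into UR.

```cpp
// Hypothetical stand-in for a reference-counted object behind a handle.
struct MockObject {
  int RefCount = 1;
};
using mock_handle_t = MockObject *;

// Hypothetical release call; a real adapter call would go here.
inline void mockRelease(mock_handle_t H) { --H->RefCount; }

struct Holder {
  mock_handle_t MKernel;
  mock_handle_t MProgram;

  Holder(mock_handle_t K, mock_handle_t P) : MKernel(K), MProgram(P) {}

  // Release each handle at most once and null it afterwards, so a stale
  // use shows up as a null-pointer access instead of a dangling handle.
  void releaseAll() {
    if (MKernel) {
      mockRelease(MKernel);
      MKernel = nullptr;
    }
    if (MProgram) {
      mockRelease(MProgram);
      MProgram = nullptr;
    }
  }

  ~Holder() { releaseAll(); }
};
```

Because the handles are nulled, calling `releaseAll()` explicitly and then letting the destructor run does not release anything twice in this sketch.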

sycl/source/detail/program_manager/program_manager.cpp

aelovikov-intel · 2025-06-02T14:46:08Z

sycl/source/detail/scheduler/commands.cpp

+    KernelCacheVal = detail::ProgramManager::getInstance().getOrCreateKernel(
+        ContextImpl, DeviceImpl, KernelName, KernelNameBasedCachePtr, NDRDesc);
+    Kernel = KernelCacheVal->MKernelHandle;
+    KernelMutex = KernelCacheVal->MMutex;
+    Program = KernelCacheVal->MProgramHandle;


How often do we need to extract data like that? Would enabling structured bindings for this class make sense?

Even in that particular code snippet, we cannot use structured bindings because the Kernel, KernelMutex, Program, and EliminatedArgMask variables are declared somewhere above.

Additionally, reference counting is now handled by shared_ptr, so we have to keep the shared_ptr alive to avoid a race condition (using a handle after its counters have been decremented).

Wouldn't std::tie allow us to use them?

I am not sure, but I do not think so. As I understand it, KernelCacheVal would need to be convertible to std::tuple.
Anyway, let's keep it out of scope for this PR.
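
For reference, a sketch of what the `std::tie` route would involve. `std::tie` builds a tuple of references to already-declared variables, and that tuple can be assigned from another tuple, so the cache value would need some way to present its fields as one. `MockCacheVal`, `asTuple`, and `unpack` below are hypothetical illustrations, not part of the PR or of the real class.

```cpp
#include <mutex>
#include <tuple>

// Hypothetical, simplified stand-in for the cache value.
struct MockCacheVal {
  int MKernelHandle;
  std::mutex *MMutex;
  int MProgramHandle;

  // Hypothetical helper exposing the fields as a tuple.
  std::tuple<int, std::mutex *, int> asTuple() const {
    return std::make_tuple(MKernelHandle, MMutex, MProgramHandle);
  }
};

// Fills variables that were declared earlier, which structured bindings
// cannot do because they always introduce new names.
inline void unpack(const MockCacheVal &Val, int &Kernel, std::mutex *&Mutex,
                   int &Program) {
  std::tie(Kernel, Mutex, Program) = Val.asTuple();
}
```

Either way, as noted in the thread, the owning shared_ptr still has to stay alive for as long as the extracted handles are in use.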

aelovikov-intel · 2025-06-02T15:25:30Z

I haven't looked through this in detail, so I won't be LGTM'ing, but I don't have any outstanding requests either. Feel free to merge once all is green.

Alexandr-Konovalov · 2025-06-02T18:17:01Z

Colleagues @fabiomestre, @sergey-semenov, @againull, can this be merged, or are further improvements required?

Code formatting.

3a3f017

Signed-off-by: Alexandr Konovalov <[email protected]>

Code formatting.

f0be3fe

Signed-off-by: Alexandr Konovalov <[email protected]>

Merge branch 'sycl' into Alexandr-Konovalov/UR-counters-shared_ptr

dad0228

Alexandr-Konovalov marked this pull request as ready for review May 20, 2025 17:07

Alexandr-Konovalov requested review from a team as code owners May 20, 2025 17:07

Alexandr-Konovalov requested review from fabiomestre and againull May 20, 2025 17:07

Clarify comment, drop unneeded move.

a11ef81

Signed-off-by: Alexandr Konovalov <[email protected]>

againull reviewed May 21, 2025

sycl/test-e2e/KernelAndProgram/disable-caching.cpp

sycl/test-e2e/XPTI/basic_event_collection_linux.cpp

Extend scope of checked calls.

93d9cc4

againull approved these changes May 22, 2025

fabiomestre approved these changes May 23, 2025

vinser52 requested a review from sergey-semenov May 26, 2025 08:31

vinser52 reviewed May 26, 2025

sycl/source/detail/graph_impl.cpp

Code formatting.

76980f2

vinser52 approved these changes Jun 2, 2025

vinser52 added the performance Performance related issues label Jun 2, 2025

hvdijk mentioned this pull request Jun 2, 2025

[SYCL] Add size to detail::string_view #18661

Open

Alexandr-Konovalov added 2 commits June 2, 2025 15:12

Remove sycl/test/native_cpu/multiple_definitions.cpp.

8c16b6e

Merge branch 'sycl' into Alexandr-Konovalov/UR-counters-shared_ptr

441344d

vinser52 removed the request for review from a team June 2, 2025 13:51

aelovikov-intel reviewed Jun 2, 2025

sergey-semenov approved these changes Jun 2, 2025

Update sycl/source/detail/program_manager/program_manager.cpp

cbb03d2

Co-authored-by: aelovikov-intel <[email protected]>

Keep reference to Adapter, not pointer.

37bde6f

Code formatting.

8c0c60a

Extend comment.

728f698

aelovikov-intel merged commit 47f5338 into intel:sycl Jun 2, 2025
23 checks passed
