[UR][SYCL] Add urUSMContextMemcpyExp API to enable device global support. #17268

Merged Jun 5, 2025 (28 commits; the diff below shows changes from 5 commits)

Commits
db41224
[UR] Add urUSMContextMemcpyExp API and basic l0 implementation.
aarongreig Mar 3, 2025
6dd8372
Add cts tests.
aarongreig Mar 7, 2025
acf53ad
Fix typo
aarongreig Mar 7, 2025
7629c48
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 7, 2025
43a3123
Add missing newline
aarongreig Mar 7, 2025
653c5b3
Correct linkage of l0 implementation.
aarongreig Mar 7, 2025
505c7ce
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 11, 2025
282b3d2
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 13, 2025
3802c4f
Add missing entry for l0v2.
aarongreig Mar 13, 2025
f702bb1
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 14, 2025
1a4e90c
Use new API to avoid temporary queue in ext_oneapi_get_device_global_…
aarongreig Mar 14, 2025
56c192a
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 14, 2025
874df40
Add l0 v2 implementation.
aarongreig Mar 14, 2025
2fb988f
Add back deleted newline.
aarongreig Mar 14, 2025
04c9156
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 17, 2025
ed29b77
Address review feedback.
aarongreig Mar 17, 2025
7a08d66
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 17, 2025
e034023
Fix bad merge.
aarongreig Mar 17, 2025
27c9cb6
actually fix hip this time
aarongreig Mar 17, 2025
55ce78a
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 21, 2025
c88f4f9
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 26, 2025
64a2c91
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 27, 2025
d1c5f88
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Mar 31, 2025
ea0e9b9
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Apr 11, 2025
573f8f6
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Jun 3, 2025
6c2eb44
Correct for some recent updates.
aarongreig Jun 3, 2025
7cbd224
Merge branch 'sycl' into aaron/usmContextMemcpy
aarongreig Jun 4, 2025
bb64e3a
Fix test cmake.
aarongreig Jun 4, 2025
55 changes: 54 additions & 1 deletion unified-runtime/include/ur_api.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions unified-runtime/include/ur_api_funcs.def

6 changes: 6 additions & 0 deletions unified-runtime/include/ur_ddi.h

10 changes: 10 additions & 0 deletions unified-runtime/include/ur_print.h

52 changes: 52 additions & 0 deletions unified-runtime/include/ur_print.hpp

63 changes: 63 additions & 0 deletions unified-runtime/scripts/core/EXP-USM-CONTEXT-MEMCPY.rst
@@ -0,0 +1,63 @@
<%
OneApi=tags['$OneApi']
x=tags['$x']
X=x.upper()
%>

.. _experimental-usm-context-memcpy:

================================================================================
USM Context Memcpy
================================================================================

.. warning::

    Experimental features:

    *   May be replaced, updated, or removed at any time.
    *   Do not require maintaining API/ABI stability of their own additions over
        time.
    *   Do not require conformance testing of their own additions.


Motivation
--------------------------------------------------------------------------------

In order to support device globals there's a need for a blocking USM write
operation that doesn't need a queue. This is to facilitate fast initialization
of the device global memory via native APIs that enable this kind of operation.

API
--------------------------------------------------------------------------------

Enums
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ${x}_device_info_t
    * ${X}_DEVICE_INFO_USM_CONTEXT_MEMCPY_SUPPORT_EXP

Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* ${x}USMContextMemcpyExp

Changelog
--------------------------------------------------------------------------------

+-----------+---------------------------+
| Revision  | Changes                   |
+===========+===========================+
| 1.0       | Initial Draft             |
+-----------+---------------------------+


Support
--------------------------------------------------------------------------------

Adapters which support this experimental feature *must* return true for the new
``${X}_DEVICE_INFO_USM_CONTEXT_MEMCPY_SUPPORT_EXP`` device info query.


Contributors
--------------------------------------------------------------------------------

* Aaron Greig `[email protected] <[email protected]>`
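
Note: the Support section above implies a query-then-call pattern on the caller side. The following is only a minimal sketch of that pattern, not the runtime's actual code. `MockDevice`, `mockContextMemcpy`, `mockQueueMemcpy`, `CopyPath` and `initDeviceGlobal` are hypothetical names, the result enum is a stand-in for the one in `ur_api.h`, and host `memcpy` stands in for the native copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in result codes; the real definitions live in ur_api.h.
enum ur_result_t {
  UR_RESULT_SUCCESS,
  UR_RESULT_ERROR_UNSUPPORTED_FEATURE
};

// Hypothetical device record. A real caller would read
// UR_DEVICE_INFO_USM_CONTEXT_MEMCPY_SUPPORT_EXP through urDeviceGetInfo.
struct MockDevice {
  bool supportsContextMemcpy;
};

// Hypothetical stand-in for urUSMContextMemcpyExp: a blocking copy that
// takes no queue. Host memcpy plays the role of the native copy.
inline ur_result_t mockContextMemcpy(const MockDevice &dev, void *pDst,
                                     const void *pSrc, std::size_t size) {
  if (!dev.supportsContextMemcpy)
    return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
  std::memcpy(pDst, pSrc, size);
  return UR_RESULT_SUCCESS;
}

// Hypothetical fallback standing in for "enqueue a copy, then wait".
inline ur_result_t mockQueueMemcpy(void *pDst, const void *pSrc,
                                   std::size_t size) {
  std::memcpy(pDst, pSrc, size);
  return UR_RESULT_SUCCESS;
}

enum class CopyPath { Context, Queue, Failed };

// Initialise device-global storage, preferring the queue-less path when the
// device reports support and falling back to the queue-based path otherwise.
inline CopyPath initDeviceGlobal(const MockDevice &dev, void *pDst,
                                 const void *pSrc, std::size_t size) {
  assert(pDst != nullptr && pSrc != nullptr);
  if (dev.supportsContextMemcpy &&
      mockContextMemcpy(dev, pDst, pSrc, size) == UR_RESULT_SUCCESS)
    return CopyPath::Context;
  return mockQueueMemcpy(pDst, pSrc, size) == UR_RESULT_SUCCESS
             ? CopyPath::Queue
             : CopyPath::Failed;
}
```

Checking the device info first keeps the fallback decision in one place, so adapters that report false for the query never see the experimental entry point.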
48 changes: 48 additions & 0 deletions unified-runtime/scripts/core/exp-usm-context-memcpy.yml
@@ -0,0 +1,48 @@
#
# Copyright (C) 2025 Intel Corporation
#
# Part of the Unified-Runtime Project, under the Apache License v2.0 with LLVM Exceptions.
# See LICENSE.TXT
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# See YaML.md for syntax definition
#
--- #--------------------------------------------------------------------------
type: header
desc: "Intel $OneApi Unified Runtime Experimental APIs for USM Context Memcpy"
ordinal: "99"
--- #--------------------------------------------------------------------------
type: enum
extend: true
typed_etors: true
desc: "Extension enums to $x_device_info_t to support $xUSMContextMemcpyExp"
name: $x_device_info_t
etors:
    - name: USM_CONTEXT_MEMCPY_SUPPORT_EXP
      value: "0x7000"
      desc: "[$x_bool_t] returns true if the device supports $xUSMContextMemcpyExp"
--- #--------------------------------------------------------------------------
type: function
desc: "Perform a synchronous, blocking memcpy operation between USM allocations."
class: $xUSM
name: ContextMemcpyExp
ordinal: "0"
params:
    - type: $x_context_handle_t
      name: hContext
      desc: "[in] Context associated with the device(s) that own the allocations `pSrc` and `pDst`."
    - type: void*
      name: pDst
      desc: "[in] Destination pointer to copy to."
    - type: const void*
      name: pSrc
      desc: "[in] Source pointer to copy from."
    - type: size_t
      name: size
      desc: "[in] Size in bytes to be copied."
returns:
    - $X_RESULT_SUCCESS
    - $X_RESULT_ERROR_ADAPTER_SPECIFIC
    - $X_RESULT_ERROR_INVALID_SIZE:
        - "`size == 0`"
        - "If `size` is higher than the allocation size of `pSrc` or `pDst`"
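
Note: the `returns` list above describes two size checks. A minimal sketch of how an implementation might apply them is shown below. It is an assumption-laden illustration, not the Level Zero adapter code from this PR: `MockAllocation` and `checkedContextMemcpy` are hypothetical, the result enum is a stand-in for the one in `ur_api.h`, and allocation sizes are passed in explicitly rather than looked up from the context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in result codes; the real definitions live in ur_api.h.
enum ur_result_t {
  UR_RESULT_SUCCESS,
  UR_RESULT_ERROR_INVALID_SIZE
};

// Hypothetical description of a USM allocation: base pointer and extent.
struct MockAllocation {
  unsigned char *base;
  std::size_t bytes;
};

// Hypothetical validation wrapper following the documented returns:
// reject size == 0, reject a size larger than either allocation, then copy.
inline ur_result_t checkedContextMemcpy(MockAllocation dst, MockAllocation src,
                                        std::size_t size) {
  if (size == 0)
    return UR_RESULT_ERROR_INVALID_SIZE;
  if (size > dst.bytes || size > src.bytes)
    return UR_RESULT_ERROR_INVALID_SIZE;
  assert(dst.base != nullptr && src.base != nullptr);
  std::memcpy(dst.base, src.base, size); // host memcpy stands in for the native copy
  return UR_RESULT_SUCCESS;
}
```

Running the size checks before touching either pointer means a rejected call leaves the destination unchanged.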
3 changes: 3 additions & 0 deletions unified-runtime/scripts/core/registry.yml
@@ -637,6 +637,9 @@ etors:
 - name: USM_POOL_GET_INFO_EXP
   desc: Enumerator for $xUSMPoolGetInfoExp
   value: '262'
+- name: USM_CONTEXT_MEMCPY_EXP
+  desc: Enumerator for $xUSMContextMemcpyExp
+  value: '264'
 ---
 type: enum
 desc: Defines structure types
2 changes: 2 additions & 0 deletions unified-runtime/source/adapters/cuda/device.cpp
@@ -1113,6 +1113,8 @@ UR_APIEXPORT ur_result_t UR_APICALL urDeviceGetInfo(ur_device_handle_t hDevice,
   }
   case UR_DEVICE_INFO_LOW_POWER_EVENTS_EXP:
     return ReturnValue(false);
+  case UR_DEVICE_INFO_USM_CONTEXT_MEMCPY_SUPPORT_EXP:
+    return ReturnValue(false);
   case UR_DEVICE_INFO_USM_P2P_SUPPORT_EXP:
     return ReturnValue(true);
   case UR_DEVICE_INFO_LAUNCH_PROPERTIES_SUPPORT_EXP:
@@ -368,6 +368,7 @@ UR_DLLEXPORT ur_result_t UR_APICALL urGetUSMExpProcAddrTable(
   pDdiTable->pfnPoolSetDevicePoolExp = urUSMPoolSetDevicePoolExp;
   pDdiTable->pfnPoolGetDevicePoolExp = urUSMPoolGetDevicePoolExp;
   pDdiTable->pfnPoolTrimToExp = urUSMPoolTrimToExp;
+  pDdiTable->pfnContextMemcpyExp = urUSMContextMemcpyExp;
   return UR_RESULT_SUCCESS;
 }
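
Note: the hunk above registers the new entry point in the adapter's experimental USM function table. The sketch below illustrates the general idea of dispatching through such a table. It is a simplified assumption, not the loader's real code: `MockUsmExpDdiTable`, `ContextMemcpyFn`, `stubContextMemcpy`, `fillTable` and `dispatchContextMemcpy` are hypothetical, the context handle is reduced to `void *`, and the behaviour for an unset entry is chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in result codes; the real definitions live in ur_api.h.
enum ur_result_t {
  UR_RESULT_SUCCESS,
  UR_RESULT_ERROR_UNSUPPORTED_FEATURE
};

// Hypothetical function pointer type with the parameter order from the spec:
// context, destination, source, size.
using ContextMemcpyFn = ur_result_t (*)(void *hContext, void *pDst,
                                        const void *pSrc, std::size_t size);

// Hypothetical, single-entry version of an experimental USM function table.
struct MockUsmExpDdiTable {
  ContextMemcpyFn pfnContextMemcpyExp = nullptr;
};

// Adapter-side stub that reports the feature as unsupported.
inline ur_result_t stubContextMemcpy(void *, void *, const void *,
                                     std::size_t) {
  return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
}

// Adapter-side table population, analogous to assigning pfnContextMemcpyExp.
inline ur_result_t fillTable(MockUsmExpDdiTable *pDdiTable) {
  assert(pDdiTable != nullptr);
  pDdiTable->pfnContextMemcpyExp = stubContextMemcpy;
  return UR_RESULT_SUCCESS;
}

// Caller-side dispatch: forward through the table entry when one is set.
inline ur_result_t dispatchContextMemcpy(const MockUsmExpDdiTable &table,
                                         void *hContext, void *pDst,
                                         const void *pSrc, std::size_t size) {
  if (table.pfnContextMemcpyExp == nullptr)
    return UR_RESULT_ERROR_UNSUPPORTED_FEATURE; // illustrative choice only
  return table.pfnContextMemcpyExp(hContext, pDst, pSrc, size);
}
```

Even an adapter that does not implement the copy still fills the slot with a stub, so callers always reach a valid function and get a well-defined error code.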

6 changes: 6 additions & 0 deletions unified-runtime/source/adapters/cuda/usm.cpp
@@ -459,3 +459,9 @@ UR_APIEXPORT ur_result_t UR_APICALL urUSMPoolTrimToExp(ur_context_handle_t,
                                                        size_t) {
   return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
 }
+
+UR_APIEXPORT ur_result_t UR_APICALL urUSMContextMemcpyExp(ur_context_handle_t,
+                                                          void *, const void *,
+                                                          size_t) {
+  return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
+}
5 changes: 3 additions & 2 deletions unified-runtime/source/adapters/hip/device.cpp
@@ -1077,9 +1077,10 @@ UR_APIEXPORT ur_result_t UR_APICALL urDeviceGetInfo(ur_device_handle_t hDevice,
   }
   case UR_DEVICE_INFO_COMMAND_BUFFER_EVENT_SUPPORT_EXP:
     return ReturnValue(false);
-  case UR_DEVICE_INFO_LOW_POWER_EVENTS_EXP: {
+  case UR_DEVICE_INFO_LOW_POWER_EVENTS_EXP:
     return ReturnValue(false);
+  case UR_DEVICE_INFO_USM_CONTEXT_MEMCPY_SUPPORT_EXP:
+    return ReturnValue(false);
-  }
   case UR_DEVICE_INFO_USM_P2P_SUPPORT_EXP:
     return ReturnValue(true);
   case UR_DEVICE_INFO_LAUNCH_PROPERTIES_SUPPORT_EXP:
@@ -366,6 +366,7 @@ UR_DLLEXPORT ur_result_t UR_APICALL urGetUSMExpProcAddrTable(
   pDdiTable->pfnPoolSetDevicePoolExp = urUSMPoolSetDevicePoolExp;
   pDdiTable->pfnPoolGetDevicePoolExp = urUSMPoolGetDevicePoolExp;
   pDdiTable->pfnPoolTrimToExp = urUSMPoolTrimToExp;
+  pDdiTable->pfnContextMemcpyExp = urUSMContextMemcpyExp;
   return UR_RESULT_SUCCESS;
 }
