Extend OrtAllocator API to get Allocator statistics #24785

Merged
merged 31 commits into from
Jun 4, 2025
Changes from 28 commits
Commits
343f920
Add api to get allocator stats.
toothache May 16, 2025
c08b129
Address the comments.
toothache May 19, 2025
f591045
Fix UT.
toothache May 20, 2025
0e9c691
Raise not implemented exception in default IAllocator::GetStats.
toothache May 20, 2025
6911375
Polish the code.
toothache May 20, 2025
36a060e
Address comments.
toothache May 21, 2025
9a69706
Debug CI failure.
toothache May 21, 2025
e746426
Return OrtStatus in IAllocator:GetStats.
toothache May 21, 2025
11d68e5
Fix UT.
toothache May 21, 2025
eefc347
Update UT.
toothache May 21, 2025
f0267e9
Fix lint.
toothache May 21, 2025
c6e8908
Fix lint.
toothache May 21, 2025
6b53fe2
Return ORT_NOT_IMPLEMENTED in default cpu allocator.
toothache May 22, 2025
848c30a
Return E_NOT_IMPLEMENTED in NO_RTTI case.
toothache May 22, 2025
39aa2b1
Update onnxruntime_c_api.cc
toothache May 23, 2025
7dc51ce
Fix UT.
toothache May 23, 2025
73bb00c
Move NO_RTTI check in OrtAllocator entry.
toothache May 24, 2025
efa2024
Add back RTTI check in c_api
toothache May 27, 2025
5358ccb
Added comments.
toothache May 27, 2025
999c2e7
Fix build issues with c++17 prior.
toothache May 29, 2025
4236d19
Update API to return OrtKeyValuePairs.
toothache May 29, 2025
ed9eb69
Update UT.
toothache May 29, 2025
31f2e19
Update.
toothache May 29, 2025
ab4eba9
Fix build.
toothache May 29, 2025
f3cd26b
Address comments.
toothache May 30, 2025
378e05c
Address comments.
toothache May 30, 2025
ffbdd3d
Update apidoc
toothache May 30, 2025
6699a72
Fix leak.
toothache May 30, 2025
14027ae
Address comments.
toothache May 30, 2025
e8085f5
Remove version check.
toothache May 30, 2025
84f4866
Fix build.
toothache May 30, 2025
4 changes: 3 additions & 1 deletion include/onnxruntime/core/framework/allocator.h
@@ -101,7 +101,9 @@ class IAllocator {
const OrtMemoryInfo& Info() const { return memory_info_; };

// Each implementation of IAllocator can override and provide their own implementation
- virtual void GetStats(AllocatorStats* /*stats*/) { return; }
+ virtual void GetStats(AllocatorStats* stats) {
+ stats->Clear();
+ }

static bool CalcMemSizeForArray(size_t nmemb, size_t size, size_t* out) noexcept {
return CalcMemSizeForArrayWithAlignment(nmemb, size, 0, out);
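The new default matters for callers that reuse one stats object: the old no-op left whatever was already in `*stats`, while the new default resets it. A minimal self-contained sketch of that behavior follows. `Stats`, `BaseAllocator`, and `CountingAllocator` are hypothetical stand-ins that only mirror the shape of `AllocatorStats` and `IAllocator`; they are not the ONNX Runtime types.

```cpp
#include <cstdint>

// Hypothetical stand-in for AllocatorStats; only two fields are modeled.
struct Stats {
  int64_t num_allocs = 0;
  int64_t bytes_in_use = 0;
  void Clear() { *this = Stats{}; }
};

// Mirrors the new default: an allocator that tracks nothing reports zeros.
class BaseAllocator {
 public:
  virtual ~BaseAllocator() = default;
  virtual void GetStats(Stats* stats) { stats->Clear(); }
};

// An allocator that does track usage overrides the default.
class CountingAllocator : public BaseAllocator {
 public:
  void RecordAlloc(int64_t bytes) {
    ++stats_.num_allocs;
    stats_.bytes_in_use += bytes;
  }
  void GetStats(Stats* stats) override { *stats = stats_; }

 private:
  Stats stats_;
};

// Reuses one Stats object across two allocators and returns the allocation
// count observed from the second (non-tracking) one.
inline int64_t AllocsSeenAfterReuse() {
  CountingAllocator counting;
  counting.RecordAlloc(1024);
  BaseAllocator plain;

  Stats stats;
  counting.GetStats(&stats);  // stats now holds the counting allocator's values
  plain.GetStats(&stats);     // the default clears the stale values
  return stats.num_allocs;
}
```

With the previous no-op default, the second call would have left the first allocator's numbers in place.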
34 changes: 34 additions & 0 deletions include/onnxruntime/core/session/onnxruntime_c_api.h
@@ -340,6 +340,25 @@ typedef struct OrtAllocator {
* those made during session initialization. This allows for separate memory management strategies for these allocations.
*/
void*(ORT_API_CALL* Reserve)(struct OrtAllocator* this_, size_t size); ///< Returns a pointer to an allocated block of `size` bytes

/**
* @brief Function used to get the statistics of the allocator.
*
* Return a pointer to the OrtKeyValuePairs structure that contains the statistics of the allocator.
* Supported keys are:
* - Limit: Bytes limit of the allocator. -1 if no limit is set.
* - InUse: Number of bytes in use.
* - TotalAllocated: The total number of allocated bytes by the allocator.
* - MaxInUse: The maximum bytes in use.
* - NumAllocs: Number of allocations.
* - NumReserves: Number of reserves. (Number of calls to Reserve() in arena-based allocators)
* - NumArenaExtensions: Number of arena extensions (Relevant only for arena based allocators)
* - NumArenaShrinkages: Number of arena shrinkages (Relevant only for arena based allocators)
* - MaxAllocSize: The max single allocation seen.
*
* NOTE: If the allocator does not implement this function, the OrtKeyValuePairs instance will be empty.
*/
ORT_API2_STATUS(GetStats, _In_ const struct OrtAllocator* this_, _Outptr_ OrtKeyValuePairs** out);
} OrtAllocator;

typedef void(ORT_API_CALL* OrtLoggingFunction)(
@@ -5266,6 +5285,21 @@ struct OrtApi {
* \since Version 1.23
*/
ORT_API2_STATUS(GetTensorSizeInBytes, _In_ const OrtValue* ort_value, _Out_ size_t* size);

/** \brief Calls OrtAllocator::GetStats function
*
* Return a pointer to the OrtKeyValuePairs structure that contains the statistics of the allocator.
*
* NOTE: If the allocator does not implement this function, the OrtKeyValuePairs instance will be empty.
*
* \param[in] ort_allocator The allocator to get stats from
* \param[out] out A pointer to the OrtKeyValuePairs instance that contains the stats
*
* \snippet{doc} snippets.dox OrtStatus Return Value
*
* \since Version 1.23.
*/
ORT_API2_STATUS(AllocatorGetStats, _In_ const OrtAllocator* ort_allocator, _Outptr_ OrtKeyValuePairs** out);
};

/*
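Because the statistics come back as string key/value pairs, a consumer has to parse them and handle both the "-1 means no limit" convention and the empty result documented above. The sketch below shows one way to do that. It assumes the pairs have already been copied into a `std::unordered_map` (the `StatsMap` alias and both helper names are hypothetical, not part of the API).

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <system_error>
#include <unordered_map>

// Hypothetical container holding the key/value pairs copied out of the API.
using StatsMap = std::unordered_map<std::string, std::string>;

// Returns the numeric value stored under `key`, or std::nullopt when the key
// is absent or the value is not a complete base-10 integer.
inline std::optional<int64_t> ReadStat(const StatsMap& stats, const std::string& key) {
  const auto it = stats.find(key);
  if (it == stats.end()) return std::nullopt;
  const std::string& text = it->second;
  const char* last = text.data() + text.size();
  int64_t value = 0;
  const auto [ptr, ec] = std::from_chars(text.data(), last, value);
  if (ec != std::errc{} || ptr != last) return std::nullopt;
  return value;
}

// Remaining headroom in bytes, or std::nullopt when no limit is set
// ("Limit" is negative) or the needed keys are missing.
inline std::optional<int64_t> BytesRemaining(const StatsMap& stats) {
  const auto limit = ReadStat(stats, "Limit");
  const auto in_use = ReadStat(stats, "InUse");
  if (!limit || !in_use || *limit < 0) return std::nullopt;
  return *limit - *in_use;
}
```

Returning `std::optional` keeps "no stats available" distinct from a legitimate value of zero.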
7 changes: 7 additions & 0 deletions include/onnxruntime/core/session/onnxruntime_cxx_api.h
@@ -2154,6 +2154,13 @@ struct AllocatorImpl : Base<T> {
MemoryAllocation GetAllocation(size_t size);
void Free(void* p);
ConstMemoryInfo GetInfo() const;

/** \brief Function that returns the statistics of the allocator.
*
* \param[out] stats: A pointer to a KeyValuePairs object that will be filled with the allocator statistics.
* \return A Status indicating success or failure.
*/
Status GetStats(KeyValuePairs* stats) const;
};

} // namespace detail
7 changes: 7 additions & 0 deletions include/onnxruntime/core/session/onnxruntime_cxx_inline.h
@@ -243,6 +243,13 @@ inline ConstMemoryInfo AllocatorImpl<T>::GetInfo() const {
return ConstMemoryInfo{out};
}

template <typename T>
inline Status AllocatorImpl<T>::GetStats(KeyValuePairs* stats) const {
OrtKeyValuePairs* out;
ORT_CXX_RETURN_ON_API_FAIL(GetApi().AllocatorGetStats(this->p_, &out));
*stats = KeyValuePairs(out);
return Ort::Status();
}
} // namespace detail

inline AllocatorWithDefaultOptions::AllocatorWithDefaultOptions() {
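The inline wrapper receives a raw out-pointer from the C API and immediately hands it to an owning C++ object. The general pattern can be sketched with standard-library pieces only. Everything named here (`RawPairs`, `CreateRawPairs`, `ReleaseRawPairs`, `OwnedPairs`, `FetchPairs`, `g_live_pairs`) is hypothetical and stands in for the real C types and release function; it is an illustration of the ownership hand-off, not the library's implementation.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical C-style object: the callee allocates, the caller must release.
struct RawPairs {
  std::unordered_map<std::string, std::string> entries;
};

inline int g_live_pairs = 0;  // counts outstanding RawPairs for the demo

// Hypothetical create function; returns 0 on success.
inline int CreateRawPairs(RawPairs** out) {
  *out = new RawPairs{};
  ++g_live_pairs;
  return 0;
}

// Hypothetical release function paired with CreateRawPairs.
inline void ReleaseRawPairs(RawPairs* p) {
  if (p != nullptr) {
    --g_live_pairs;
    delete p;
  }
}

struct RawPairsDeleter {
  void operator()(RawPairs* p) const { ReleaseRawPairs(p); }
};
using OwnedPairs = std::unique_ptr<RawPairs, RawPairsDeleter>;

// Wraps the out-pointer right away so every later exit path releases it.
inline bool FetchPairs(OwnedPairs* result) {
  RawPairs* raw = nullptr;
  if (CreateRawPairs(&raw) != 0) return false;
  result->reset(raw);  // frees any previous contents, then takes ownership
  return true;
}
```

Calling `FetchPairs` twice on the same `OwnedPairs` frees the first object before storing the second, so repeated queries do not accumulate allocations.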
2 changes: 1 addition & 1 deletion onnxruntime/core/framework/allocator_stats.h
@@ -51,4 +51,4 @@ struct AllocatorStats {
return ss.str();
}
};
- } // namespace onnxruntime
+ } // namespace onnxruntime
34 changes: 34 additions & 0 deletions onnxruntime/core/session/allocator_adapters.cc
@@ -3,6 +3,7 @@

#include "allocator_adapters.h"
#include "core/framework/error_code_helper.h"
#include "core/session/abi_key_value_pairs.h"
#include "core/session/inference_session.h"
#include "core/session/ort_env.h"
#include "core/session/ort_apis.h"
@@ -11,6 +12,7 @@ namespace onnxruntime {

namespace {
constexpr uint32_t kOrtAllocatorReserveMinVersion = 18;
constexpr uint32_t kOrtAllocatorStatsMinVersion = 23;
} // namespace

OrtAllocatorImplWrappingIAllocator::OrtAllocatorImplWrappingIAllocator(onnxruntime::AllocatorPtr&& i_allocator)
@@ -26,6 +28,18 @@ OrtAllocatorImplWrappingIAllocator::OrtAllocatorImplWrappingIAllocator(onnxrunti
OrtAllocator::Reserve =
[](OrtAllocator* this_, size_t size) { return static_cast<OrtAllocatorImplWrappingIAllocator*>(this_)->Reserve(size); };
}
if (OrtAllocator::version >= kOrtAllocatorStatsMinVersion) {
OrtAllocator::GetStats =
[](const OrtAllocator* this_, OrtKeyValuePairs** stats) noexcept -> OrtStatusPtr {
API_IMPL_BEGIN
auto kvp = std::make_unique<OrtKeyValuePairs>();
auto stats_map = static_cast<const OrtAllocatorImplWrappingIAllocator*>(this_)->Stats();
kvp->Copy(stats_map);
*stats = reinterpret_cast<OrtKeyValuePairs*>(kvp.release());
return nullptr;
API_IMPL_END
};
}
}

void* OrtAllocatorImplWrappingIAllocator::Alloc(size_t size) {
@@ -44,6 +58,26 @@ const OrtMemoryInfo* OrtAllocatorImplWrappingIAllocator::Info() const {
return &i_allocator_->Info();
}

std::unordered_map<std::string, std::string> OrtAllocatorImplWrappingIAllocator::Stats() const {
AllocatorStats stats{};
i_allocator_->GetStats(&stats);

// Allocators that do not implement GetStats() will return empty stats
std::unordered_map<std::string, std::string> entries;
if (stats.num_allocs > 0 || stats.bytes_limit != 0) {
entries.insert_or_assign("Limit", std::to_string(stats.bytes_limit));
entries.insert_or_assign("InUse", std::to_string(stats.bytes_in_use));
entries.insert_or_assign("TotalAllocated", std::to_string(stats.total_allocated_bytes));
entries.insert_or_assign("MaxInUse", std::to_string(stats.max_bytes_in_use));
entries.insert_or_assign("NumAllocs", std::to_string(stats.num_allocs));
entries.insert_or_assign("NumReserves", std::to_string(stats.num_reserves));
entries.insert_or_assign("NumArenaExtensions", std::to_string(stats.num_arena_extensions));
entries.insert_or_assign("NumArenaShrinkages", std::to_string(stats.num_arena_shrinkages));
entries.insert_or_assign("MaxAllocSize", std::to_string(stats.max_alloc_size));
}
return entries;
}

onnxruntime::AllocatorPtr OrtAllocatorImplWrappingIAllocator::GetWrappedIAllocator() {
return i_allocator_;
}
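The adapter only assigns `GetStats` when the allocator's declared version is at least `kOrtAllocatorStatsMinVersion`. A reduced sketch of that version-gated callback pattern is shown below. `AllocatorTable`, `ReportTwoAllocs`, `MakeTable`, and `TryGetNumAllocs` are hypothetical names, and the caller-side check in `TryGetNumAllocs` is an assumption about defensive usage rather than something this diff does.

```cpp
#include <cstdint>

// Hypothetical C-style callback table. `GetNumAllocs` was appended in
// version 23 of this sketch, so older tables leave it unset.
struct AllocatorTable {
  uint32_t version;
  int (*GetNumAllocs)(const AllocatorTable* self, int64_t* out);
};

// Plays the role of kOrtAllocatorStatsMinVersion in this sketch.
constexpr uint32_t kStatsMinVersion = 23;

// Example callback: reports a fixed count and returns 0 for success.
inline int ReportTwoAllocs(const AllocatorTable* /*self*/, int64_t* out) {
  *out = 2;
  return 0;
}

// Builds a table the way the adapter constructor does: the newer callback
// is wired up only when the declared version is high enough.
inline AllocatorTable MakeTable(uint32_t version) {
  AllocatorTable table{version, nullptr};
  if (table.version >= kStatsMinVersion) {
    table.GetNumAllocs = &ReportTwoAllocs;
  }
  return table;
}

// Caller side: check both the version and the pointer before calling.
inline bool TryGetNumAllocs(const AllocatorTable& table, int64_t* out) {
  if (table.version < kStatsMinVersion || table.GetNumAllocs == nullptr) {
    return false;
  }
  return table.GetNumAllocs(&table, out) == 0;
}
```

Gating on the version keeps a struct that was compiled against an older header from being read past the fields it actually contains.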
4 changes: 4 additions & 0 deletions onnxruntime/core/session/allocator_adapters.h
@@ -6,6 +6,8 @@
#include "core/framework/allocator.h"
#include "core/session/onnxruntime_cxx_api.h"

#include <string>

Check warning on line 9 in onnxruntime/core/session/allocator_adapters.h (GitHub Actions / Optional Lint C++, cpplint reported by reviewdog): Found C++ system header after other header. Should be: allocator_adapters.h, c system, c++ system, other. [build/include_order] [4]

namespace onnxruntime {

// Since all allocators are of type 'OrtAllocator' and there is a single
@@ -29,6 +31,8 @@
const OrtMemoryInfo* Info() const;
void* Reserve(size_t size);

std::unordered_map<std::string, std::string> Stats() const;

Check warning on line 34 in onnxruntime/core/session/allocator_adapters.h (GitHub Actions / Optional Lint C++, cpplint reported by reviewdog): Add #include <unordered_map> for unordered_map<> [build/include_what_you_use] [4]

ORT_DISALLOW_COPY_AND_ASSIGNMENT(OrtAllocatorImplWrappingIAllocator);

onnxruntime::AllocatorPtr GetWrappedIAllocator();
10 changes: 10 additions & 0 deletions onnxruntime/core/session/default_cpu_allocator_c_api.cc
@@ -3,6 +3,7 @@

#include "core/framework/utils.h"
#include "core/framework/allocator.h"
#include "core/session/abi_key_value_pairs.h"
#include "core/session/allocator_adapters.h"
#include "core/session/onnxruntime_cxx_api.h"
#include "core/session/ort_apis.h"
@@ -17,6 +18,15 @@
[](OrtAllocator* this_, void* p) { static_cast<OrtDefaultCpuAllocator*>(this_)->Free(p); };
OrtAllocator::Info =
[](const OrtAllocator* this_) { return static_cast<const OrtDefaultCpuAllocator*>(this_)->Info(); };
OrtAllocator::Reserve =
[](OrtAllocator* this_, size_t size) { return static_cast<OrtDefaultCpuAllocator*>(this_)->Alloc(size); };
OrtAllocator::GetStats =
[](const OrtAllocator* /*this_*/, OrtKeyValuePairs** stats) noexcept -> OrtStatusPtr {
// Default allocator does not support stats, return an empty OrtKeyValuePairs.
auto kvp = std::make_unique<OrtKeyValuePairs>();

Check warning on line 26 in onnxruntime/core/session/default_cpu_allocator_c_api.cc (GitHub Actions / Optional Lint C++, cpplint reported by reviewdog): Add #include <memory> for make_unique<> [build/include_what_you_use] [4]
*stats = reinterpret_cast<OrtKeyValuePairs*>(kvp.release());
return nullptr;
};
Ort::ThrowOnError(OrtApis::CreateCpuMemoryInfo(OrtDeviceAllocator, OrtMemTypeDefault, &cpu_memory_info));
}

7 changes: 7 additions & 0 deletions onnxruntime/core/session/onnxruntime_c_api.cc
@@ -1567,6 +1567,12 @@ ORT_API_STATUS_IMPL(OrtApis::AllocatorGetInfo, _In_ const OrtAllocator* ptr, _Ou
API_IMPL_END
}

ORT_API_STATUS_IMPL(OrtApis::AllocatorGetStats, _In_ const OrtAllocator* ptr, _Outptr_ OrtKeyValuePairs** out) {
API_IMPL_BEGIN
return ptr->GetStats(ptr, out);
API_IMPL_END
}

template <typename T>
ORT_STATUS_PTR OrtGetNumSequenceElements(const OrtValue* p_ml_value, size_t* out) {
auto& data = p_ml_value->Get<T>();
@@ -3024,6 +3030,7 @@ static constexpr OrtApi ort_api_1_to_23 = {
&OrtApis::GetEpApi,
// End of Version 22 - DO NOT MODIFY ABOVE (see above text for more information)
&OrtApis::GetTensorSizeInBytes,
&OrtApis::AllocatorGetStats,
};

// OrtApiBase can never change as there is no way to know what version of OrtApiBase is returned by OrtGetApiBase.
1 change: 1 addition & 0 deletions onnxruntime/core/session/ort_apis.h
@@ -601,4 +601,5 @@ ORT_API(const OrtEpApi*, GetEpApi);

ORT_API_STATUS_IMPL(GetTensorSizeInBytes, _In_ const OrtValue* ort_value, _Out_ size_t* size);

ORT_API_STATUS_IMPL(AllocatorGetStats, _In_ const OrtAllocator* ptr, _Outptr_ OrtKeyValuePairs** out);
} // namespace OrtApis
6 changes: 6 additions & 0 deletions onnxruntime/test/shared_lib/test_allocator.cc
@@ -31,4 +31,10 @@ TEST(CApiTest, DefaultAllocator) {
ASSERT_EQ(allocation.size(), 100U);
ASSERT_NE(allocation.get(), nullptr);
memset(allocation.get(), 0, 100U);

// The default allocator does not implement GetStats, so we expect the stats to be empty.
Ort::KeyValuePairs stats{};
auto status = default_allocator.GetStats(&stats);
ASSERT_TRUE(status.IsOK());
ASSERT_EQ(0, stats.GetKeyValuePairs().size());
}
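The test above relies on an empty key/value set meaning "no statistics". The guard in `OrtAllocatorImplWrappingIAllocator::Stats()` that produces this can be reduced to the following sketch. `StatsSnapshot`, `Entries`, `ToEntries`, and `HasStats` are hypothetical names, and only three of the nine fields are modeled.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical subset of AllocatorStats; field names follow the diff.
struct StatsSnapshot {
  int64_t num_allocs = 0;
  int64_t bytes_limit = 0;
  int64_t bytes_in_use = 0;
};

using Entries = std::map<std::string, std::string>;

// Emits entries only when the snapshot looks populated, using the same
// condition as the diff: at least one allocation or a non-zero limit.
inline Entries ToEntries(const StatsSnapshot& s) {
  Entries entries;
  if (s.num_allocs > 0 || s.bytes_limit != 0) {
    entries["Limit"] = std::to_string(s.bytes_limit);
    entries["InUse"] = std::to_string(s.bytes_in_use);
    entries["NumAllocs"] = std::to_string(s.num_allocs);
  }
  return entries;
}

// Consumers can treat an empty result as "stats not supported".
inline bool HasStats(const Entries& entries) { return !entries.empty(); }
```

One consequence of this heuristic, at least in the sketch: a tracking allocator with zero allocations and a zero limit is indistinguishable from one that reports nothing.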
41 changes: 41 additions & 0 deletions onnxruntime/test/shared_lib/test_inference.cc
@@ -1992,6 +1992,31 @@ TEST(CApiTest, get_allocator_cpu) {
auto mem_allocation = cpu_allocator.GetAllocation(1024);
ASSERT_NE(nullptr, mem_allocation.get());
ASSERT_EQ(1024U, mem_allocation.size());

Ort::KeyValuePairs stats;
auto status = cpu_allocator.GetStats(&stats);

// CPU allocator may not support arena usage.
// See func DoesCpuAllocatorSupportArenaUsage() in allocator_utils.cc.
if (allocator_info.GetAllocatorType() == OrtAllocatorType::OrtArenaAllocator) {
ASSERT_TRUE(status.IsOK());

ASSERT_EQ("-1", std::string(stats.GetValue("Limit")));
ASSERT_EQ("1024", std::string(stats.GetValue("InUse")));
ASSERT_EQ("1024", std::string(stats.GetValue("MaxInUse")));
ASSERT_EQ("1024", std::string(stats.GetValue("MaxAllocSize")));
ASSERT_EQ("2", std::string(stats.GetValue("NumAllocs")));
ASSERT_EQ("0", std::string(stats.GetValue("NumReserves")));

// We don't check values of the following stats
ASSERT_NE(nullptr, stats.GetValue("TotalAllocated"));
ASSERT_NE(nullptr, stats.GetValue("NumArenaExtensions"));
ASSERT_NE(nullptr, stats.GetValue("NumArenaShrinkages"));
} else {
// If the allocator is not an arena allocator, we expect the stats to be empty.
ASSERT_TRUE(status.IsOK());
ASSERT_EQ(0, stats.GetKeyValuePairs().size());
}
}

#ifdef USE_CUDA
@@ -2014,6 +2039,22 @@ TEST(CApiTest, get_allocator_cuda) {
auto mem_allocation = cuda_allocator.GetAllocation(1024);
ASSERT_NE(nullptr, mem_allocation.get());
ASSERT_EQ(1024U, mem_allocation.size());

Ort::KeyValuePairs stats;
auto status = cuda_allocator.GetStats(&stats);
ASSERT_TRUE(status.IsOK());

ASSERT_EQ("-1", std::string(stats.GetValue("Limit")));
ASSERT_EQ("1024", std::string(stats.GetValue("InUse")));
ASSERT_EQ("1024", std::string(stats.GetValue("MaxInUse")));
ASSERT_EQ("1024", std::string(stats.GetValue("MaxAllocSize")));
ASSERT_EQ("2", std::string(stats.GetValue("NumAllocs")));
ASSERT_EQ("0", std::string(stats.GetValue("NumReserves")));

// We don't check values of the following stats
ASSERT_NE(nullptr, stats.GetValue("TotalAllocated"));
ASSERT_NE(nullptr, stats.GetValue("NumArenaExtensions"));
ASSERT_NE(nullptr, stats.GetValue("NumArenaShrinkages"));
}
#endif
