Skip to content

[libc++][atomic_ref] Use __atomic_fetch_{add,sub} builtins on floating-points whenever possible #135685

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

dalg24
Copy link
Member

@dalg24 dalg24 commented Apr 14, 2025

Fix #135109

Clang is able to emit an atomicrmw instruction from the __atomic_fetch_add and __atomic_fetch_sub builtins on floating-point types.

…g-points whenever possible

Fix llvm#135109

Clang is able to emit an `atomicrmw` instruction from the
`__atomic_fetch_add` and `__atomic_fetch_sub` builtins on floating-point
types.
@dalg24 dalg24 requested a review from huixie90 April 14, 2025 21:35
Copy link
Member

@huixie90 huixie90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make atomic_ref and atomic share the similar logic (or just share the implantation) for these rmw operations. The reason why atomic has special branch for fp80 long double is that , the clang builtin that emit the llvm ir, has serious bug that affects all types whose data size is not power of 2. ( I think I have a patch but I never got a chance to finalise the tests. )

@@ -7,6 +7,7 @@

// UNSUPPORTED: c++03, c++11, c++14, c++17
// XFAIL: !has-64-bit-atomics
// XFAIL: target={{x86_64-.*}} && tsan
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from

// https://github.com/llvm/llvm-project/issues/72893
// XFAIL: target={{x86_64-.*}} && tsan

Note that the #72893 issue referenced in the comment seems to indicate that only long double was problematic with ThreadSanitizer but it is not accurate because the instantiation of the tests are commented out for long double

// TODO https://github.com/llvm/llvm-project/issues/47978
// test<long double>();

Per a conversation on Discord with @huixie90, I propose to not hold up this PR and just use the same XFAIL annotation that we have in the std::atomic<Floating>::fetch_{add,sub} tests.

@dalg24 dalg24 marked this pull request as ready for review June 27, 2025 16:38
@dalg24 dalg24 requested a review from a team as a code owner June 27, 2025 16:38
@dalg24 dalg24 requested a review from huixie90 June 27, 2025 16:39
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jun 27, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 27, 2025

@llvm/pr-subscribers-libcxx

Author: Damien L-G (dalg24)

Changes

Fix #135109

Clang is able to emit an atomicrmw instruction from the __atomic_fetch_add and __atomic_fetch_sub builtins on floating-point types.


Full diff: https://github.com/llvm/llvm-project/pull/135685.diff

8 Files Affected:

  • (modified) libcxx/include/CMakeLists.txt (+1)
  • (modified) libcxx/include/__atomic/atomic.h (+3-26)
  • (modified) libcxx/include/__atomic/atomic_ref.h (+19-10)
  • (added) libcxx/include/__atomic/floating_point_helper.h (+55)
  • (modified) libcxx/include/module.modulemap.in (+1)
  • (modified) libcxx/test/std/atomics/atomics.ref/fetch_add.pass.cpp (+1)
  • (modified) libcxx/test/std/atomics/atomics.ref/fetch_sub.pass.cpp (+1)
  • (modified) llvm/utils/gn/secondary/libcxx/include/BUILD.gn (+1)
diff --git a/libcxx/include/CMakeLists.txt b/libcxx/include/CMakeLists.txt
index e386f31386b60..623641ee43233 100644
--- a/libcxx/include/CMakeLists.txt
+++ b/libcxx/include/CMakeLists.txt
@@ -216,6 +216,7 @@ set(files
   __atomic/check_memory_order.h
   __atomic/contention_t.h
   __atomic/fence.h
+  __atomic/floating_point_helper.h
   __atomic/is_always_lock_free.h
   __atomic/kill_dependency.h
   __atomic/memory_order.h
diff --git a/libcxx/include/__atomic/atomic.h b/libcxx/include/__atomic/atomic.h
index eead49dde6192..31e21514f25de 100644
--- a/libcxx/include/__atomic/atomic.h
+++ b/libcxx/include/__atomic/atomic.h
@@ -11,6 +11,7 @@
 
 #include <__atomic/atomic_sync.h>
 #include <__atomic/check_memory_order.h>
+#include <__atomic/floating_point_helper.h>
 #include <__atomic/is_always_lock_free.h>
 #include <__atomic/memory_order.h>
 #include <__atomic/support.h>
@@ -332,41 +333,17 @@ template <class _Tp>
   requires is_floating_point_v<_Tp>
 struct atomic<_Tp> : __atomic_base<_Tp> {
 private:
-  _LIBCPP_HIDE_FROM_ABI static constexpr bool __is_fp80_long_double() {
-    // Only x87-fp80 long double has 64-bit mantissa
-    return __LDBL_MANT_DIG__ == 64 && std::is_same_v<_Tp, long double>;
-  }
-
-  _LIBCPP_HIDE_FROM_ABI static constexpr bool __has_rmw_builtin() {
-#  ifndef _LIBCPP_COMPILER_CLANG_BASED
-    return false;
-#  else
-    // The builtin __cxx_atomic_fetch_add errors during compilation for
-    // long double on platforms with fp80 format.
-    // For more details, see
-    // lib/Sema/SemaChecking.cpp function IsAllowedValueType
-    // LLVM Parser does not allow atomicrmw with x86_fp80 type.
-    // if (ValType->isSpecificBuiltinType(BuiltinType::LongDouble) &&
-    //    &Context.getTargetInfo().getLongDoubleFormat() ==
-    //        &llvm::APFloat::x87DoubleExtended())
-    // For more info
-    // https://github.com/llvm/llvm-project/issues/68602
-    // https://reviews.llvm.org/D53965
-    return !__is_fp80_long_double();
-#  endif
-  }
-
   template <class _This, class _Operation, class _BuiltinOp>
   _LIBCPP_HIDE_FROM_ABI static _Tp
   __rmw_op(_This&& __self, _Tp __operand, memory_order __m, _Operation __operation, _BuiltinOp __builtin_op) {
-    if constexpr (__has_rmw_builtin()) {
+    if constexpr (std::__has_rmw_builtin<_Tp>()) {
       return __builtin_op(std::addressof(std::forward<_This>(__self).__a_), __operand, __m);
     } else {
       _Tp __old = __self.load(memory_order_relaxed);
       _Tp __new = __operation(__old, __operand);
       while (!__self.compare_exchange_weak(__old, __new, __m, memory_order_relaxed)) {
 #  ifdef _LIBCPP_COMPILER_CLANG_BASED
-        if constexpr (__is_fp80_long_double()) {
+        if constexpr (std::__is_fp80_long_double<_Tp>()) {
           // https://github.com/llvm/llvm-project/issues/47978
           // clang bug: __old is not updated on failure for atomic<long double>::compare_exchange_weak
           // Note __old = __self.load(memory_order_relaxed) will not work
diff --git a/libcxx/include/__atomic/atomic_ref.h b/libcxx/include/__atomic/atomic_ref.h
index b5493662c518e..9bdc6b1160d2c 100644
--- a/libcxx/include/__atomic/atomic_ref.h
+++ b/libcxx/include/__atomic/atomic_ref.h
@@ -20,6 +20,7 @@
 #include <__assert>
 #include <__atomic/atomic_sync.h>
 #include <__atomic/check_memory_order.h>
+#include <__atomic/floating_point_helper.h>
 #include <__atomic/memory_order.h>
 #include <__atomic/to_gcc_order.h>
 #include <__concepts/arithmetic.h>
@@ -322,20 +323,28 @@ struct atomic_ref<_Tp> : public __atomic_ref_base<_Tp> {
   atomic_ref& operator=(const atomic_ref&) = delete;
 
   _LIBCPP_HIDE_FROM_ABI _Tp fetch_add(_Tp __arg, memory_order __order = memory_order_seq_cst) const noexcept {
-    _Tp __old = this->load(memory_order_relaxed);
-    _Tp __new = __old + __arg;
-    while (!this->compare_exchange_weak(__old, __new, __order, memory_order_relaxed)) {
-      __new = __old + __arg;
+    if constexpr (std::__has_rmw_builtin<_Tp>()) {
+      return __atomic_fetch_add(this->__ptr_, __arg, std::__to_gcc_order(__order));
+    } else {
+      _Tp __old = this->load(memory_order_relaxed);
+      _Tp __new = __old + __arg;
+      while (!this->compare_exchange_weak(__old, __new, __order, memory_order_relaxed)) {
+        __new = __old + __arg;
+      }
+      return __old;
     }
-    return __old;
   }
   _LIBCPP_HIDE_FROM_ABI _Tp fetch_sub(_Tp __arg, memory_order __order = memory_order_seq_cst) const noexcept {
-    _Tp __old = this->load(memory_order_relaxed);
-    _Tp __new = __old - __arg;
-    while (!this->compare_exchange_weak(__old, __new, __order, memory_order_relaxed)) {
-      __new = __old - __arg;
+    if constexpr (std::__has_rmw_builtin<_Tp>()) {
+      return __atomic_fetch_sub(this->__ptr_, __arg, std::__to_gcc_order(__order));
+    } else {
+      _Tp __old = this->load(memory_order_relaxed);
+      _Tp __new = __old - __arg;
+      while (!this->compare_exchange_weak(__old, __new, __order, memory_order_relaxed)) {
+        __new = __old - __arg;
+      }
+      return __old;
     }
-    return __old;
   }
 
   _LIBCPP_HIDE_FROM_ABI _Tp operator+=(_Tp __arg) const noexcept { return fetch_add(__arg) + __arg; }
diff --git a/libcxx/include/__atomic/floating_point_helper.h b/libcxx/include/__atomic/floating_point_helper.h
new file mode 100644
index 0000000000000..c4347a10892c8
--- /dev/null
+++ b/libcxx/include/__atomic/floating_point_helper.h
@@ -0,0 +1,55 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___ATOMIC_FLOATING_POINT_HELPER_H
+#define _LIBCPP___ATOMIC_FLOATING_POINT_HELPER_H
+
+#include <__config>
+#include <__type_traits/is_floating_point.h>
+#include <__type_traits/is_same.h>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+#if _LIBCPP_STD_VER >= 20
+
+template <class _Tp>
+_LIBCPP_HIDE_FROM_ABI constexpr bool __is_fp80_long_double() {
+  // Only x87-fp80 long double has 64-bit mantissa
+  return __LDBL_MANT_DIG__ == 64 && std::is_same_v<_Tp, long double>;
+}
+
+template <class _Tp>
+_LIBCPP_HIDE_FROM_ABI constexpr bool __has_rmw_builtin() {
+  static_assert(std::is_floating_point_v<_Tp>);
+#  ifndef _LIBCPP_COMPILER_CLANG_BASED
+  return false;
+#  else
+  // The builtin __cxx_atomic_fetch_add errors during compilation for
+  // long double on platforms with fp80 format.
+  // For more details, see
+  // lib/Sema/SemaChecking.cpp function IsAllowedValueType
+  // LLVM Parser does not allow atomicrmw with x86_fp80 type.
+  // if (ValType->isSpecificBuiltinType(BuiltinType::LongDouble) &&
+  //    &Context.getTargetInfo().getLongDoubleFormat() ==
+  //        &llvm::APFloat::x87DoubleExtended())
+  // For more info
+  // https://github.com/llvm/llvm-project/issues/68602
+  // https://reviews.llvm.org/D53965
+  return !std::__is_fp80_long_double<_Tp>();
+#  endif
+}
+
+#endif
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // _LIBCPP___ATOMIC_FLOATING_POINT_HELPER_H
diff --git a/libcxx/include/module.modulemap.in b/libcxx/include/module.modulemap.in
index adf80f2ac9acb..ff68e06a8ebd6 100644
--- a/libcxx/include/module.modulemap.in
+++ b/libcxx/include/module.modulemap.in
@@ -885,6 +885,7 @@ module std [system] {
     module check_memory_order     { header "__atomic/check_memory_order.h" }
     module contention_t           { header "__atomic/contention_t.h" }
     module fence                  { header "__atomic/fence.h" }
+    module floating_point_helper  { header "__atomic/floating_point_helper.h" }
     module is_always_lock_free    { header "__atomic/is_always_lock_free.h" }
     module kill_dependency        { header "__atomic/kill_dependency.h" }
     module memory_order           { header "__atomic/memory_order.h" }
diff --git a/libcxx/test/std/atomics/atomics.ref/fetch_add.pass.cpp b/libcxx/test/std/atomics/atomics.ref/fetch_add.pass.cpp
index 8e2605fef9d31..65a457a6129d5 100644
--- a/libcxx/test/std/atomics/atomics.ref/fetch_add.pass.cpp
+++ b/libcxx/test/std/atomics/atomics.ref/fetch_add.pass.cpp
@@ -7,6 +7,7 @@
 
 // UNSUPPORTED: c++03, c++11, c++14, c++17
 // XFAIL: !has-64-bit-atomics
+// XFAIL: target={{x86_64-.*}} && tsan
 
 // integral-type fetch_add(integral-type, memory_order = memory_order::seq_cst) const noexcept;
 // floating-point-type fetch_add(floating-point-type, memory_order = memory_order::seq_cst) const noexcept;
diff --git a/libcxx/test/std/atomics/atomics.ref/fetch_sub.pass.cpp b/libcxx/test/std/atomics/atomics.ref/fetch_sub.pass.cpp
index 04def076a301f..ab89ebdbde261 100644
--- a/libcxx/test/std/atomics/atomics.ref/fetch_sub.pass.cpp
+++ b/libcxx/test/std/atomics/atomics.ref/fetch_sub.pass.cpp
@@ -7,6 +7,7 @@
 
 // UNSUPPORTED: c++03, c++11, c++14, c++17
 // XFAIL: !has-64-bit-atomics
+// XFAIL: target={{x86_64-.*}} && tsan
 
 // integral-type fetch_sub(integral-type, memory_order = memory_order::seq_cst) const noexcept;
 // floating-point-type fetch_sub(floating-point-type, memory_order = memory_order::seq_cst) const noexcept;
diff --git a/llvm/utils/gn/secondary/libcxx/include/BUILD.gn b/llvm/utils/gn/secondary/libcxx/include/BUILD.gn
index 7b9966d9bfc58..29d566a32ba64 100644
--- a/llvm/utils/gn/secondary/libcxx/include/BUILD.gn
+++ b/llvm/utils/gn/secondary/libcxx/include/BUILD.gn
@@ -291,6 +291,7 @@ if (current_toolchain == default_toolchain) {
       "__atomic/check_memory_order.h",
       "__atomic/contention_t.h",
       "__atomic/fence.h",
+      "__atomic/floating_point_helper.h",
       "__atomic/is_always_lock_free.h",
       "__atomic/kill_dependency.h",
       "__atomic/memory_order.h",

@dalg24
Copy link
Member Author

dalg24 commented Jun 28, 2025

The CI failures are unrelated as far as I can tell.

The generic-{msan,tsan} builds fail when configuring Google benchmark

if(NOT HAVE_STD_REGEX AND NOT HAVE_GNU_POSIX_REGEX AND NOT HAVE_POSIX_REGEX)
message(FATAL_ERROR "Failed to determine the source files for the regular expression backend")
endif()

The generic-no-localization build fails some "libcxx mangled names" test which does not seem to use anything that changed in this patch
https://github.com/llvm/llvm-project/actions/runs/15928686318/job/44937568088?pr=135685

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[libc++] atomic_ref<float>::fetch_add has CAS loop rather than atomicrmw
4 participants