
gh-135832: implement Py_DECREF specializations for Py_GIL_DISABLED build #135833

Open · wants to merge 1 commit into main
82 changes: 79 additions & 3 deletions Include/internal/pycore_object.h
@@ -272,17 +272,93 @@ _Py_DECREF_NO_DEALLOC(PyObject *op)
 }

 #else
-// TODO: implement Py_DECREF specializations for Py_GIL_DISABLED build
+
 static inline void
 _Py_DECREF_SPECIALIZED(PyObject *op, const destructor destruct)
Contributor:

This function is barely used now, but there's a similar function PyStackRef_CLOSE_SPECIALIZED.

There are two possible reasons to implement this as something that's not just Py_DECREF:

  1. Performance, but it's not really clear that it's going to be any faster.

  2. To avoid _Py_Dealloc so that you can enable this assertion (cpython/Objects/object.c, lines 3166 to 3170 in 6227662):

         #if !defined(Py_GIL_DISABLED) && !defined(Py_STACKREF_DEBUG)
             /* This assertion doesn't hold for the free-threading build, as
              * PyStackRef_CLOSE_SPECIALIZED is not implemented */
             assert(tstate->current_frame == NULL || tstate->current_frame->stackpointer != NULL);
         #endif

     But that means you can't call _Py_MergeZeroLocalRefcount, because that internally may call _Py_Dealloc.

I think (2) would be difficult to achieve because of _Py_brc_queue_object.
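As background for the ob_ref_shared arithmetic in the diff below: in the free-threaded build the shared field packs the count together with two low flag bits, which is why the code adds and subtracts (1 << _Py_REF_SHARED_SHIFT). The definitions are roughly these (values as in CPython's pycore_object.h; the trailing comments are a summary, not the header's own):

    #define _Py_REF_SHARED_SHIFT        2
    #define _Py_REF_SHARED_FLAG_MASK    0x3

    #define _Py_REF_SHARED_INIT         0x0
    #define _Py_REF_MAYBE_WEAKREF       0x1  // count 0; object may be weakly referenced
    #define _Py_REF_QUEUED              0x2  // queued for merge by the owning thread
    #define _Py_REF_MERGED              0x3  // local and shared fields have been merged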

 {
-    Py_DECREF(op);
+    uint32_t local = _Py_atomic_load_uint32_relaxed(&op->ob_ref_local);
+    if (local == _Py_IMMORTAL_REFCNT_LOCAL) {
+        _Py_DECREF_IMMORTAL_STAT_INC();
+        return;
+    }
+    _Py_DECREF_STAT_INC();
+#ifdef Py_REF_DEBUG
+    _Py_DECREF_DecRefTotal();
+#endif
+    if (_Py_IsOwnedByCurrentThread(op)) {
+        local--;
+        _Py_atomic_store_uint32_relaxed(&op->ob_ref_local, local);
+        if (local == 0) {
+            _Py_MergeZeroLocalRefcount(op);
+        }
+    }
+    else {
+        Py_ssize_t shared = _Py_atomic_load_ssize_relaxed(&op->ob_ref_shared);
+        for (;;) {
+            Py_ssize_t new_shared = shared - (1 << _Py_REF_SHARED_SHIFT);
+            if (_Py_atomic_compare_exchange_ssize(
+                    &op->ob_ref_shared,
+                    &shared,
+                    new_shared)) {

Comment on lines +298 to +299 (Contributor):

This is missing important parts of _Py_DecRefShared: if the shared reference count becomes negative, the object must be queued so that the owning thread merges the two reference count fields.
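For reference, the queueing path the comment refers to looks roughly like the sketch below. It is paraphrased from _Py_DecRefShared in Objects/object.c (Py_REF_DEBUG bookkeeping omitted; decref_shared_sketch is an illustrative name, not a CPython function):

    static void
    decref_shared_sketch(PyObject *o)
    {
        int should_queue;
        Py_ssize_t new_shared;
        Py_ssize_t shared = _Py_atomic_load_ssize_relaxed(&o->ob_ref_shared);
        do {
            // A shared value of 0 (or 0 with only the weakref flag set) means
            // the decrement would make the count negative, so the object is
            // queued instead; the queue itself holds the reference, which is
            // why nothing is subtracted in that case.
            should_queue = (shared == 0 || shared == _Py_REF_MAYBE_WEAKREF);
            if (should_queue) {
                new_shared = _Py_REF_QUEUED;
            }
            else {
                new_shared = shared - (1 << _Py_REF_SHARED_SHIFT);
            }
        } while (!_Py_atomic_compare_exchange_ssize(&o->ob_ref_shared,
                                                    &shared, new_shared));

        if (should_queue) {
            // Ask the owning thread to merge the local and shared fields.
            _Py_brc_queue_object(o);
        }
        else if (new_shared == _Py_REF_MERGED) {
            // Zero count and merged: safe to deallocate.
            _Py_Dealloc(o);
        }
    }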
+                if (new_shared == 0) {

Comment (Contributor):

You can only safely deallocate if the reference count is zero and it's merged (i.e., new_shared is _Py_REF_MERGED).
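Spelled out, the suggested check would presumably look like this (a sketch of the reviewer's suggestion, not code from the PR):

    // Deallocate only when the count is zero AND the two refcount fields
    // have been merged; _Py_REF_MERGED encodes exactly that state.
    if (new_shared == _Py_REF_MERGED) {
    #ifdef Py_TRACE_REFS
        _Py_ForgetReference(op);
    #endif
        _PyReftracerTrack(op, PyRefTracer_DESTROY);
        destruct(op);
    }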

+#ifdef Py_TRACE_REFS
+                    _Py_ForgetReference(op);
+#endif
+                    _PyReftracerTrack(op, PyRefTracer_DESTROY);
+                    destruct(op);
+                }
+                break;
+            }
+        }
+    }
 }

 static inline void
 _Py_DECREF_NO_DEALLOC(PyObject *op)

Comment on lines 316 to 317 (Contributor):

I'm not sure this function is used anymore.

 {
-    Py_DECREF(op);
+    uint32_t local = _Py_atomic_load_uint32_relaxed(&op->ob_ref_local);
+    if (local == _Py_IMMORTAL_REFCNT_LOCAL) {
+        _Py_DECREF_IMMORTAL_STAT_INC();
+        return;
+    }
+    _Py_DECREF_STAT_INC();
+#ifdef Py_REF_DEBUG
+    _Py_DECREF_DecRefTotal();
+#endif
+    if (_Py_IsOwnedByCurrentThread(op)) {
+        local--;
+        _Py_atomic_store_uint32_relaxed(&op->ob_ref_local, local);
+        if (local == 0) {
+            _Py_MergeZeroLocalRefcount(op);
+        }
+    }
+    else {
+        Py_ssize_t shared = _Py_atomic_load_ssize_relaxed(&op->ob_ref_shared);
+        for (;;) {
+            Py_ssize_t new_shared = shared - (1 << _Py_REF_SHARED_SHIFT);
+            if (_Py_atomic_compare_exchange_ssize(
+                    &op->ob_ref_shared,
+                    &shared,
+                    new_shared)) {
+                break;
+            }
+        }
+    }
+#ifdef Py_DEBUG
+    // Check that the refcount is still positive after decrement
+    if (_Py_IsOwnedByCurrentThread(op)) {
+        uint32_t final_local = _Py_atomic_load_uint32_relaxed(&op->ob_ref_local);
+        if (final_local == 0) {
+            _Py_FatalRefcountError("Expected a positive remaining refcount");
+        }
+    }
+    else {
+        Py_ssize_t final_shared = _Py_atomic_load_ssize_relaxed(&op->ob_ref_shared);
+        if ((final_shared >> _Py_REF_SHARED_SHIFT) <= 0) {
+            _Py_FatalRefcountError("Expected a positive remaining refcount");
+        }
+    }
+#endif
 }

 static inline int