Fix pivot_table duplicate indices with Python 3.14 + NumPy 1.26 #63323

AKHIL-149 · 2025-12-11T02:59:56Z

Summary

Fixes #63314: pivot_table creates duplicate index entries on Python 3.14 with NumPy 1.26.

Tracked down the actual bug. It isn't in compress_group_index as I first thought; NumPy's searchsorted misbehaves with this version combination.

What was happening

unstack uses searchsorted to build the compressor array.
With Python 3.14 + NumPy 1.26, searchsorted returns duplicate values instead of unique positions.
As a result, several different index values map to the same output row.
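To make the role of the compressor concrete, here is a minimal sketch of what it is expected to contain on a working setup. The array `comp_index` and the value of `ngroups` are made-up illustration data, not taken from the issue: given sorted group ids, searchsorted should return the first row position of each group, and those positions must all be distinct.

```python
import numpy as np

# Hypothetical sorted group ids: one id per input row, 4 distinct groups.
comp_index = np.array([0, 0, 1, 1, 1, 2, 3, 3], dtype=np.intp)
ngroups = 4

# For each group id 0..ngroups-1, find the first row where it appears.
compressor = comp_index.searchsorted(np.arange(ngroups))

# The same positions obtained from np.unique's first-occurrence indices.
first_seen = np.sort(np.unique(comp_index, return_index=True)[1])

print(compressor)   # one position per group
print(first_seen)
```

If two entries of `compressor` were equal, two different index values would be read from the same row, which matches the duplicate-index symptom described above.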

The fix

Fall back to the np.unique approach on Python 3.14 with NumPy < 2.0. The non-sorted path already uses this method, so it is already covered by tests.
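A rough sketch of how such a version-gated fallback could look follows. The helper names `needs_unique_fallback` and `build_compressor`, and the `use_fallback` override, are hypothetical and exist only for illustration; this is not the actual diff from the PR.

```python
import sys

import numpy as np


def needs_unique_fallback() -> bool:
    """Hypothetical gate: True on Python 3.14 combined with NumPy < 2.0."""
    numpy_major = int(np.__version__.split(".")[0])
    return sys.version_info[:2] == (3, 14) and numpy_major < 2


def build_compressor(comp_index, ngroups, use_fallback=None):
    """Return the first row position of each group id in a sorted comp_index."""
    if use_fallback is None:
        use_fallback = needs_unique_fallback()
    if use_fallback:
        # np.unique reports the index of the first occurrence of each value.
        return np.sort(np.unique(comp_index, return_index=True)[1])
    return comp_index.searchsorted(np.arange(ngroups))


# Illustration data (assumed): sorted group ids with 4 groups.
comp_index = np.array([0, 0, 1, 2, 2, 2, 3], dtype=np.intp)
fast = build_compressor(comp_index, 4, use_fallback=False)
safe = build_compressor(comp_index, 4, use_fallback=True)
```

On a setup where searchsorted behaves correctly, both paths should agree, so the fallback costs some speed without changing the result.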

Testing

Tested with the reproduction case from the issue (100k rows, 3 metrics); the result is now correct.
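The exact reproduction script is not quoted here, so the following is only an assumed stand-in with the same shape (100k rows, 3 metrics). The column names and the random data are invented for illustration; the check is simply that the pivoted index has no duplicates and has one row per distinct key.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): 100k rows, 3 metric names, 5,000 possible keys.
rng = np.random.default_rng(0)
n_rows = 100_000
df = pd.DataFrame(
    {
        "key": rng.integers(0, 5_000, size=n_rows),
        "metric": rng.choice(["m1", "m2", "m3"], size=n_rows),
        "value": rng.random(n_rows),
    }
)

table = df.pivot_table(index="key", columns="metric", values="value", aggfunc="mean")

# Each distinct key should appear exactly once in the result.
n_expected = df["key"].nunique()
print(table.index.is_unique, len(table), n_expected)
```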

Found the real issue: searchsorted is broken with Python 3.14 + NumPy 1.26. The problem is not compress_group_index but the compressor calculation in unstack, which uses searchsorted. The fix falls back to the unique/return_index approach for this combination, the same as the non-sorted path. Works with 100k rows now.

AKHIL-149 · 2025-12-11T03:10:06Z

Closing - will reopen against the correct base branch.

WillAyd and others added 30 commits October 9, 2024 20:09

Remove .pre-commit check for pytest ref pandas-dev#56671

99e98e5

Skip niche issue

7edc8d7

Add required skip from pandas-dev#58467

24bff56

Remove tests that will fail without backport of pandas-dev#58437

75b551f

additional test fixes (for tests that changed or no longer exist on m…

5e27da4

…ain)

Enable CoW in the string test build

0a2981a

Skip test if pyarrow not installed in test_numeric_only

baefc5c

pick out stringarray keepdims changes from pandas-dev#59234

3dc222d

Fix: avoid object dtype inference warning in to_datetime

4f628e8

xfail tests that trigger dtype inference warnings

39260a0

avoid dtype inference warnings by removing explicit dtype=object

91e65b6

un-xfail tests for replace/fillna downcasting

380372f

xfail tests triggering empty concat warning

13bf07a

Update xfails for 2.3.x

33072d0

Fix string dtype comparison in value_counts dtype inference deprecation

e14e99a

string[pyarrow_numpy] -> str

0537c90

Fix cow ref tracking in replace with list and regex

e825b0e

suppress pylint errors

46fbd7f

Backport PR pandas-dev#59985 on branch 2.3.x (Programming Language ::…

1eb8f0e

… Python :: 3.13 added to pyproject.toml) (pandas-dev#60012) Backport PR pandas-dev#59985: Programming Language :: Python :: 3.13 added to pyproject.toml Co-authored-by: LOCHAN PAUDEL <[email protected]>

String dtype: fix pyarrow-based IO + update tests (pandas-dev#59478)

c9d4b1b

REF (string): avoid copy in StringArray factorize (pandas-dev#59551)

60175cc

* REF: avoid copy in StringArray factorize * mypy fixup * un-xfail

String dtype: avoid surfacing pyarrow exception in binary operations (p…

daa46c1

…andas-dev#59610)

DOC: Add whatsnew for 2.3.0 (pandas-dev#59625)

616ede5

* DOC: Add whatsnew for 2.3.0 * fix duplicate label

BUG (string): str.replace with negative n (pandas-dev#59628)

a9e7d2b

* BUG (string): str.replace with negative n * update GH ref

TST (string): fix xfailed groupby value_counts tests (pandas-dev#59632)

3d1617f

REF (string): rename result converter methods (pandas-dev#59626)

9cb66bf

TST (string) fix xfailed groupby tests (3) (pandas-dev#59642)

d64b8d8

* TST (string) fix xfailed groupby tests (3) * TST: non-pyarrow build

REF (string): de-duplicate str_endswith, startswith (pandas-dev#59568)

807d8d5

DEPR (string): non-bool na for obj.str.contains (pandas-dev#59615)

4de4268

Co-authored-by: Joris Van den Bossche <[email protected]>

meeseeksmachine and others added 25 commits September 13, 2025 08:08

Backport PR pandas-dev#62283 on branch 2.3.x (BUG: fix pyarrow string…

6690762

… regex replacement) (pandas-dev#62328) Co-authored-by: Álvaro Kothe <[email protected]>

[backport 2.3.x] String dtype: more informative repr (keeping brief _…

4bdac7c

…_str__) (pandas-dev#61148) (pandas-dev#62329)

[2.3.x] Only use new string dtype repr for the new (NaN-based) string…

dd45373

… dtype (pandas-dev#62333)

[backport 2.3.x] TST: run python-dev CI on 3.14-dev (pandas-dev#61950) (

374ddcb

pandas-dev#62326) Co-authored-by: Nathan Goldbaum <[email protected]>

Backport PR pandas-dev#62348 on branch 2.3.x (TYP: Ignore xlsxwriter …

a25f560

…_book typing) (pandas-dev#62391) Co-authored-by: Matthew Roeschke <[email protected]>

[backport 2.3.x] CoW: disable chained assignment detection for Python…

34a69a2

… 3.14 (pandas-dev#62324) (pandas-dev#62375) Co-authored-by: Nathan Goldbaum <[email protected]>

[backport 2.3.x] DEPS: add Python 3.14 and 3.14t wheel builds (pandas…

4579b88

…-dev#62318) (pandas-dev#62394) Co-authored-by: Nathan Goldbaum <[email protected]>

Backport PR pandas-dev#62217 on branch 2.3.x (BUG: fix memory leak in…

8be57bc

… JSON datetime serialization) (pandas-dev#62253) Co-authored-by: Álvaro Kothe <[email protected]>

Backport PR pandas-dev#61073 on branch 2.3.x (BUG: fix bug in str.ful…

fd40f9a

…lmatch for Arrow backend with optional groups) (pandas-dev#62401) Co-authored-by: ptth222 <[email protected]>

[backport 2.3.x] BUG: Fixed assign failure when with Copy-on-Write (p…

fd2f7eb

…andas-dev#60941) (pandas-dev#62409) Co-authored-by: ChiLin Chiu <[email protected]>

Backport PR pandas-dev#62323 on branch 2.3.x (String dtype: keep sele…

0426e59

…ct_dtypes(include=object) selecting string columns) (pandas-dev#62400) Co-authored-by: Joris Van den Bossche <[email protected]>

Backport PR pandas-dev#62403 on branch 2.3.x (DOC: add select_dtypes …

3ecc8f2

…case to 3.0 string migration guide) (pandas-dev#62413) Co-authored-by: Joris Van den Bossche <[email protected]>

Backport PR pandas-dev#62410 on branch 2.3.x (BUG: fix bug in str.mat…

d348852

…ch for Arrow backend with optional groups) (pandas-dev#62412) Co-authored-by: Joris Van den Bossche <[email protected]>

Backport PR pandas-dev#62396 on branch 2.3.x (PKG/DOC: indicate Pytho…

6113696

…n 3.14 support in pyproject.toml and release notes) (pandas-dev#62415) Co-authored-by: Joris Van den Bossche <[email protected]>

BUG: improve future warning for boolean operations with missaligned i…

23a1085

…ndexes (pandas-dev#62367)

Backport to 2.3.x: REGR: from_records not initializing subclasses pro…

e0fe9a0

…perly (pandas-dev#60726) (pandas-dev#62436) Co-authored-by: Patrick Hoefler <[email protected]>

Backport PR pandas-dev#62452 on branch 2.3.x (TST: Adjust tests for n…

e57c7d6

…umexpr 2.13) (pandas-dev#62454) Co-authored-by: Matthew Roeschke <[email protected]>

[backport 2.3.x] BUG: fix .str.isdigit to honor unicode superscript f…

92bf98f

…or older pyarrow (pandas-dev#61962) (pandas-dev#62476)

[backport 2.3.x] DEPR: remove the Period resampling deprecation (pand…

2ca088d

…as-dev#62480) (pandas-dev#62485)

[backport 2.3.x] BUG: String[pyarrow] comparison with mixed object (p…

058eb2b

…andas-dev#62424) (pandas-dev#62504) Co-authored-by: jbrockmendel <[email protected]>

[backport 2.3.x] BUG: avoid validation error for ufunc with string[py…

b64f0df

…thon] array (pandas-dev#62498) (pandas-dev#62505)

[backport 2.3.x] DOC: prepare 2.3.3 whatsnew notes for release (panda…

6aa788a

…s-dev#62499) (pandas-dev#62508)

RLS: 2.3.3

9c8bc3e

AKHIL-149 requested review from attack68, mroeschke and rhshadrach as code owners December 11, 2025 02:59

AKHIL-149 closed this Dec 11, 2025

18 participants
