Fix pivot_table duplicate indices with Python 3.14 + NumPy 1.26 #63323
Closed
AKHIL-149 wants to merge 478 commits into pandas-dev:main from AKHIL-149:fix-pivot-searchsorted-py314
…maybe_converts_object`) if requested (pandas-dev#59487) * String dtype: maybe_converts_object give precedence to nullable dtype * update datetimelike input validation * update tests and remove xfails * explicitly test pd.array() behaviour (remove xfail) * fixup allow_2d * undo changes related to datetimelike input validation * fix test for str on current main --------- Co-authored-by: Matthew Roeschke <[email protected]>
… Python :: 3.13 added to pyproject.toml) (pandas-dev#60012) Backport PR pandas-dev#59985: Programming Language :: Python :: 3.13 added to pyproject.toml Co-authored-by: LOCHAN PAUDEL <[email protected]>
* REF: avoid copy in StringArray factorize * mypy fixup * un-xfail
* DOC: Add whatsnew for 2.3.0 * fix duplicate label
* BUG (string): str.replace with negative n * update GH ref
* TST (string) fix xfailed groupby tests (3) * TST: non-pyarrow build
Co-authored-by: Joris Van den Bossche <[email protected]>
… regex replacement) (pandas-dev#62328) Co-authored-by: Álvaro Kothe <[email protected]>
pandas-dev#62326) Co-authored-by: Nathan Goldbaum <[email protected]>
…_book typing) (pandas-dev#62391) Co-authored-by: Matthew Roeschke <[email protected]>
… 3.14 (pandas-dev#62324) (pandas-dev#62375) Co-authored-by: Nathan Goldbaum <[email protected]>
…-dev#62318) (pandas-dev#62394) Co-authored-by: Nathan Goldbaum <[email protected]>
… JSON datetime serialization) (pandas-dev#62253) Co-authored-by: Álvaro Kothe <[email protected]>
…lmatch for Arrow backend with optional groups) (pandas-dev#62401) Co-authored-by: ptth222 <[email protected]>
…andas-dev#60941) (pandas-dev#62409) Co-authored-by: ChiLin Chiu <[email protected]>
…ct_dtypes(include=object) selecting string columns) (pandas-dev#62400) Co-authored-by: Joris Van den Bossche <[email protected]>
…els workflow (pandas-dev#61669) (pandas-dev#61718) (pandas-dev#62395) Co-authored-by: Evgenii Mosikhin <[email protected]> Co-authored-by: Evgenii Mosikhin <[email protected]> Co-authored-by: Laurie O <[email protected]>
…case to 3.0 string migration guide) (pandas-dev#62413) Co-authored-by: Joris Van den Bossche <[email protected]>
…ch for Arrow backend with optional groups) (pandas-dev#62412) Co-authored-by: Joris Van den Bossche <[email protected]>
…n 3.14 support in pyproject.toml and release notes) (pandas-dev#62415) Co-authored-by: Joris Van den Bossche <[email protected]>
…perly (pandas-dev#60726) (pandas-dev#62436) Co-authored-by: Patrick Hoefler <[email protected]>
…umexpr 2.13) (pandas-dev#62454) Co-authored-by: Matthew Roeschke <[email protected]>
…andas-dev#62424) (pandas-dev#62504) Co-authored-by: jbrockmendel <[email protected]>
found the real issue - searchsorted is broken with python 3.14 + numpy 1.26. it's not compress_group_index, it's the compressor calculation in unstack that uses searchsorted. just fallback to the unique/return_index approach for this combo, same as what the non-sorted path does. works with 100k rows now.
closing - will reopen with correct base branch
Summary
fixes #63314 - pivot_table creating duplicate indices on python 3.14 with numpy 1.26
tracked down the actual bug. it wasn't in compress_group_index like i thought - it's the compressor calculation in unstack, which uses numpy's searchsorted, and searchsorted is broken with this version combo.
What was happening
The fix
fall back to the np.unique (return_index) approach when on python 3.14 + numpy < 2.0. this is the same method the non-sorted path already uses, so it's already tested.
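
rough sketch of the idea, not the actual diff: for a sorted array of compressed group ids, searchsorted and np.unique(..., return_index=True) should both give the first position of each group, so one can stand in for the other. the helper names (first_positions_searchsorted, first_positions_unique, needs_fallback) and the exact version check are assumptions made up for illustration, they are not pandas APIs.

```python
import sys

import numpy as np


def first_positions_searchsorted(comp_index, ngroups):
    # Sorted path: binary-search the first occurrence of each group id.
    return comp_index.searchsorted(np.arange(ngroups))


def first_positions_unique(comp_index):
    # Fallback path: np.unique reports the index of each id's first occurrence.
    return np.sort(np.unique(comp_index, return_index=True)[1])


def needs_fallback(py_version=sys.version_info, np_version=np.__version__):
    # Hypothetical gate for the affected combo (python 3.14 + numpy < 2.0);
    # the real condition used in the PR may differ.
    np_major = int(np_version.split(".")[0])
    return tuple(py_version[:2]) >= (3, 14) and np_major < 2


# Sorted compressed group ids, 4 groups (toy data, not from the issue).
comp_index = np.array([0, 0, 1, 1, 1, 2, 3, 3], dtype=np.intp)
ngroups = 4

via_unique = first_positions_unique(comp_index)
via_searchsorted = first_positions_searchsorted(comp_index, ngroups)

# Pick the path the same way the fix describes.
compressor = via_unique if needs_fallback() else via_searchsorted
print(compressor)
```

on a healthy install both paths give the same positions for sorted input, which is why swapping one for the other shouldn't change results there.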
Testing
tested with the reproduction case from the issue (100k rows, 3 metrics). works correctly now.