Bug: groupby(..., observed=True) doesn't respect sort key #27369

Closed
@topper-123

Description

df.groupby(..., observed=True) behaves inconsistently for unordered Categoricals:

>>> import pandas as pd
>>> df = pd.DataFrame({'A': pd.Categorical(['b', 'a']), 'B': [1, 2]})
>>> df.groupby('A', observed=False).sum()  # ok
   B
A
a  2
b  1
>>> df.groupby('A', observed=True).sum()  # not ok
   B
A
b  1
a  2

Because groupby's sort parameter defaults to True, the second result should be the same as if sort_index had been called:

>>> df.groupby('A', observed=True).sum().sort_index()  # ok result, but shouldn't be needed
   B
A
a  2
b  1

My guess is that an if sort: obj.sort_index() block, or something similar, is missing somewhere.
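
Until that is fixed, the workaround can be wrapped in a small helper. This is only a sketch: it assumes pandas is importable as pd, and the name groupby_sum_sorted is hypothetical, not part of pandas. It calls sort_index() explicitly, so the row order does not depend on whether the installed version has the bug:

```python
import pandas as pd


def groupby_sum_sorted(frame, key):
    """Hypothetical helper: sum per observed group, with the index sorted explicitly."""
    result = frame.groupby(key, observed=True).sum()
    # sort_index() orders a categorical index by its categories,
    # which is the order sort=True is expected to give.
    return result.sort_index()


df = pd.DataFrame({"A": pd.Categorical(["b", "a"]), "B": [1, 2]})

result = groupby_sum_sorted(df, "A")
expected = df.groupby("A", observed=False).sum()

# Both calls should list the groups in the same order.
print(list(result.index) == list(expected.index))
```

The same comparison could serve as a regression test once the sorting is handled inside groupby itself.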
