Closed
Description
df.groupby(..., observed=True) behaves inconsistently for unordered Categoricals:
>>> df = pd.DataFrame({'A': pd.Categorical(['b', 'a']), 'B': [1, 2]})
>>> df.groupby('A', observed=False).sum()  # ok
   B
A
a  2
b  1
>>> df.groupby('A', observed=True).sum()  # not ok
   B
A
b  1
a  2
Because the sort parameter of groupby defaults to True, the second result should be the same as if sort_index had been called:
>>> df.groupby('A', observed=True).sum().sort_index()  # ok result, but shouldn't be needed
   B
A
a  2
b  1
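The comparison above can also be written as a small script, which may be handy for checking a given pandas version. This is only a sketch: the helper name index_labels is made up for illustration, and the order printed for observed=True depends on whether the installed version has the behaviour described here.

```python
import pandas as pd

# Same frame as in the report: an unordered Categorical whose values
# appear in the opposite order to its categories ('a', 'b').
df = pd.DataFrame({"A": pd.Categorical(["b", "a"]), "B": [1, 2]})


def index_labels(frame):
    """Illustrative helper: return the group labels in result order."""
    return list(frame.index)


unobserved = df.groupby("A", observed=False).sum()
observed = df.groupby("A", observed=True).sum()
# The workaround from the report: sort the result explicitly.
workaround = observed.sort_index()

print("observed=False:", index_labels(unobserved))
print("observed=True: ", index_labels(observed))
print("sort_index():  ", index_labels(workaround))

# True when observed=True already returns sorted groups,
# False when the ordering problem described here is present.
print("consistent:", index_labels(observed) == index_labels(workaround))
```

If the last line prints False, the observed=True result is not being sorted even though sort is left at its default.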
My guess is that an if sort: obj.sort_index() block or similar is missing somewhere.
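To make the guess concrete, here is a user-level sketch of what such a block would do. It is not the pandas implementation: grouped_sum is a hypothetical wrapper, and where an equivalent check would belong inside pandas is an open question.

```python
import pandas as pd


def grouped_sum(frame, key, sort=True):
    """Hypothetical wrapper (not pandas internals): group, sum, and
    sort the resulting index whenever sort is True."""
    result = frame.groupby(key, observed=True, sort=sort).sum()
    if sort:
        # The step that appears to be skipped for unordered Categoricals.
        result = result.sort_index()
    return result


df = pd.DataFrame({"A": pd.Categorical(["b", "a"]), "B": [1, 2]})
result = grouped_sum(df, "A")
print(result)
```

With the explicit sort_index call, the groups come out in category order ('a', then 'b') whether or not groupby sorted them itself.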