Pivoting on multiple columns causes data loss #375

Closed
@ukch

Seen on HEAD. This problem is new since 0.5.0.

In [3]: data
Out[3]: 
   foo   bar   baz   spam   data
0  foo1  bar1  baz1  spam2  20  
1  foo1  bar2  baz1  spam3  30  
2  foo2  bar2  baz1  spam2  40  
3  foo1  bar1  baz2  spam1  50  
4  foo3  bar1  baz2  spam1  60  

In [4]: pandas.pivot_table(data, values="data", rows=["foo", "bar"], cols=["baz", "spam"])
Out[4]: 
  baz      baz1   baz2 
  spam     spam1  spam1
foo  bar               
foo1 bar1  20     50   
     bar2  30     NaN  
foo2 bar2  40     NaN  
foo3 bar1  NaN    60   

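As you can see, the "spam2" and "spam3" labels have disappeared from the column index. The values that belong to ("baz1", "spam2") and ("baz1", "spam3") now appear under ("baz1", "spam1"), a combination that never occurs in the input, so their contents have been collapsed into the remaining two columns.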
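
For anyone who wants to check this against their own install, here is a minimal, self-contained sketch of the reproduction. It assumes a recent pandas, where the keywords are `index=` and `columns=` rather than the `rows=` and `cols=` used in the session above. The names `expected_columns`, `missing` and `unexpected` are only illustrative. The check compares the column labels of the pivot against the distinct ("baz", "spam") pairs present in the input, so any dropped or invented column shows up directly.

```python
import pandas as pd

# Same five rows as in the report above.
data = pd.DataFrame(
    {
        "foo": ["foo1", "foo1", "foo2", "foo1", "foo3"],
        "bar": ["bar1", "bar2", "bar2", "bar1", "bar1"],
        "baz": ["baz1", "baz1", "baz1", "baz2", "baz2"],
        "spam": ["spam2", "spam3", "spam2", "spam1", "spam1"],
        "data": [20, 30, 40, 50, 60],
    }
)

# Assumption: modern keyword names (index/columns) stand in for the
# rows/cols keywords used in the original session.
result = pd.pivot_table(
    data, values="data", index=["foo", "bar"], columns=["baz", "spam"]
)

# Every distinct (baz, spam) pair in the input should be a column label,
# and no other pair should appear.
expected_columns = set(
    data[["baz", "spam"]].drop_duplicates().itertuples(index=False, name=None)
)
actual_columns = set(result.columns)

missing = expected_columns - actual_columns
unexpected = actual_columns - expected_columns

print(result)
print("missing columns:", sorted(missing))
print("unexpected columns:", sorted(unexpected))
```

If both sets come back empty, the column labels are intact. On the build described in this report one would expect ("baz1", "spam2") and ("baz1", "spam3") to be listed as missing and ("baz1", "spam1") as unexpected.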
