@sursu sursu commented Aug 25, 2019

I've recently learned how to do PRs.
So, here's a tiny change I'm proposing.


  self._n_examples = df.shape[0]
- self._n_unique = df.index.unique().shape[0]
+ self._n_unique = df.index.nunique()
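
For a flat index, the two expressions in this change should give the same count. The sketch below checks that on a small made-up frame (the `df` here is hypothetical, not the one in the library). One difference worth knowing: `nunique()` drops NaN by default, while `unique()` keeps it, so the counts can differ when the index contains missing values.

```python
import pandas as pd

# Hypothetical frame with a repeated id in a flat (non-Multi) index.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]}, index=[1, 1, 2, 3])

n_unique_old = df.index.unique().shape[0]  # expression being replaced
n_unique_new = df.index.nunique()          # expression proposed in this PR

# With a NaN in the index the two can disagree, because nunique()
# uses dropna=True by default.
idx_with_nan = pd.Index([1.0, 1.0, float("nan")])
count_keep_nan = idx_with_nan.unique().shape[0]
count_drop_nan = idx_with_nan.nunique()
count_explicit = idx_with_nan.nunique(dropna=False)
```

Passing `dropna=False` makes `nunique()` match the `unique()`-based count when missing values are present.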
Owner

cool, TIL about this feature

@CamDavidsonPilon
Owner

There are unit test errors, and I think they are caused by us carrying around a multi-index that just happens to include the (possibly redundant) id col. It all comes full circle. Let me tie this into issue #809; once I decide on that topic, this PR can be addressed.

@sursu
Author

sursu commented Aug 25, 2019

The issue with MultiIndex apparently comes from pandas itself:

In JupyterLab if I run:

import pandas as pd

arrays = [[1, 1, 1, 2], ['red', 'blue', 'red', 'blue']]
mi = pd.MultiIndex.from_arrays(arrays, names=('number', 'color'))

mi.nunique()

I get the following error:

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/dtypes/missing.py in _isna_new(obj)
    131     # hack (for now) because MI registers as ndarray
    132     elif isinstance(obj, ABCMultiIndex):
--> 133         raise NotImplementedError("isna is not defined for MultiIndex")
    134     elif isinstance(
    135         obj,

NotImplementedError: isna is not defined for MultiIndex

The comment there says it all.
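
One possible way around the error above, sketched under the assumption that only the number of distinct index entries is needed: fall back to the `unique()`-based count when the index is a MultiIndex. The helper name `count_unique_index` is hypothetical and not part of pandas or of this library.

```python
import pandas as pd


def count_unique_index(index):
    """Hypothetical helper: count distinct entries in a pandas index."""
    if isinstance(index, pd.MultiIndex):
        # Avoid nunique() here, since it raised NotImplementedError
        # for MultiIndex in the pandas version shown in the traceback.
        return len(index.unique())
    return index.nunique()


arrays = [[1, 1, 1, 2], ["red", "blue", "red", "blue"]]
mi = pd.MultiIndex.from_arrays(arrays, names=("number", "color"))

n_multi = count_unique_index(mi)                   # distinct (number, color) pairs
n_flat = count_unique_index(pd.Index([1, 1, 2]))   # distinct values in a flat index
```

This keeps the shorter `nunique()` call for flat indexes while leaving the MultiIndex path on the expression that was already in use.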

I guess this PR can be closed.

@CamDavidsonPilon
Owner

I'd like to keep it open, if you don't mind. It's a reminder to address this.
