BUG: DataFrame arithmetic operators don't work with Series using fill_value #61828

Open
wants to merge 5 commits into
base: main
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -778,6 +778,7 @@ MultiIndex
 - :func:`MultiIndex.get_level_values` accessing a :class:`DatetimeIndex` does not carry the frequency attribute along (:issue:`58327`, :issue:`57949`)
 - Bug in :class:`DataFrame` arithmetic operations in case of unaligned MultiIndex columns (:issue:`60498`)
 - Bug in :class:`DataFrame` arithmetic operations with :class:`Series` in case of unaligned MultiIndex (:issue:`61009`)
+- Bug in :class:`DataFrame` arithmetic operations with :class:`Series` raising ``NotImplementedError`` when ``fill_value`` is passed (:issue:`61581`)
 - Bug in :meth:`MultiIndex.from_tuples` causing wrong output with input of type tuples having NaN values (:issue:`60695`, :issue:`60988`)

 I/O
5 changes: 0 additions & 5 deletions pandas/core/frame.py
@@ -8382,11 +8382,6 @@ def _flex_arith_method(
         if self._should_reindex_frame_op(other, op, axis, fill_value, level):
             return self._arith_method_with_reindex(other, op)

-        if isinstance(other, Series) and fill_value is not None:
-            # TODO: We could allow this in cases where we end up going
-            # through the DataFrame path
-            raise NotImplementedError(f"fill_value {fill_value} not supported.")
-
         other = ops.maybe_prepare_scalar_for_op(other, self.shape)
         self, other = self._align_for_op(other, axis, flex=True, level=level)

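The guard deleted in this hunk rejected every Series operand whenever fill_value was given. As a rough illustration of the behaviour the pandas flex methods document for fill_value (a value missing on exactly one side is replaced before the operation, while a value missing on both sides stays missing), here is a minimal pure-Python sketch. The helper name `flex_op` and the list-of-lists layout are assumptions made for illustration; this is not how pandas implements the operation internally.

```python
import math
import operator


def flex_op(rows, ser, op, fill_value=None):
    """Combine each row of a 2D list with the matching entry of ser (like axis=0).

    Hypothetical helper for illustration only, not a pandas API.
    """
    result = []
    for row, right in zip(rows, ser):
        out_row = []
        for left in row:
            a, b = left, right
            if fill_value is not None:
                a_missing, b_missing = math.isnan(a), math.isnan(b)
                if a_missing and b_missing:
                    # Missing on both sides: the result stays missing.
                    out_row.append(math.nan)
                    continue
                # Missing on exactly one side: substitute fill_value there.
                a = fill_value if a_missing else a
                b = fill_value if b_missing else b
            out_row.append(op(a, b))
        result.append(out_row)
    return result


nan = math.nan
rows = [[1.0, nan], [nan, 4.0]]
ser = [10.0, nan]

filled = flex_op(rows, ser, operator.add, fill_value=5)
print(filled)  # [[11.0, 15.0], [nan, 9.0]]
```

In the second row the Series entry is missing, so the cell that is also missing in the frame stays NaN, while the cell holding 4.0 is combined with the fill value.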
42 changes: 23 additions & 19 deletions pandas/tests/frame/test_arithmetic.py
@@ -628,12 +628,6 @@ def test_arith_flex_frame_corner(self, float_frame):
         expected = float_frame.sort_index() * np.nan
         tm.assert_frame_equal(result, expected)

-        with pytest.raises(NotImplementedError, match="fill_value"):
-            float_frame.add(float_frame.iloc[0], fill_value=3)
-
-        with pytest.raises(NotImplementedError, match="fill_value"):
-            float_frame.add(float_frame.iloc[0], axis="index", fill_value=3)
-
     @pytest.mark.parametrize("op", ["add", "sub", "mul", "mod"])
     def test_arith_flex_series_ops(self, simple_frame, op):
         # after arithmetic refactor, add truediv here
@@ -667,19 +661,6 @@ def test_arith_flex_series_broadcasting(self, any_real_numpy_dtype):
         result = df.div(df[0], axis="index")
         tm.assert_frame_equal(result, expected)

-    def test_arith_flex_zero_len_raises(self):
-        # GH 19522 passing fill_value to frame flex arith methods should
-        # raise even in the zero-length special cases
-        ser_len0 = Series([], dtype=object)
-        df_len0 = DataFrame(columns=["A", "B"])
-        df = DataFrame([[1, 2], [3, 4]], columns=["A", "B"])
-
-        with pytest.raises(NotImplementedError, match="fill_value"):
-            df.add(ser_len0, fill_value="E")
-
-        with pytest.raises(NotImplementedError, match="fill_value"):
-            df_len0.sub(df["A"], axis=None, fill_value=3)
-
     def test_flex_add_scalar_fill_value(self):
         # GH#12723
         dat = np.array([0, 1, np.nan, 3, 4, 5], dtype="float")
@@ -2199,3 +2180,26 @@ def test_mixed_col_index_dtype(using_infer_string):
         dtype = "string"
     expected.columns = expected.columns.astype(dtype)
     tm.assert_frame_equal(result, expected)
+
+
+@pytest.mark.parametrize("op", ["add", "sub", "mul", "div", "mod", "truediv", "pow"])
+def test_df_series_fill_value(op):
Member

I'll need to take a closer look at this, because I'm really skeptical that the fix is this easy.

Member

OK, I think the trouble is that in _maybe_align_series_as_frame we broadcast the 1D object to 2D for NumPy dtypes, but not for extension array (EA) dtypes. Can you add a test for non-NumPy dtypes and see how it goes?

+    # GH 61581
+    data = np.arange(50).reshape(10, 5)
+    columns = list("ABCDE")
+    df = DataFrame(data, columns=columns)
+    for i in range(5):
+        df.iat[i, i] = np.nan
+        df.iat[i + 1, i] = np.nan
+        df.iat[i + 4, i] = np.nan
+
+    df_a = df.iloc[:, :-1]
+    df_b = df.iloc[:, -1]
+    nan_mask = df_a.isna().astype(int).mul(df_b.isna().astype(int), axis=0).astype(bool)
+
+    df_result = getattr(df_a, op)(df_b, axis=0, fill_value=5)
+    df_expected = getattr(df_a.fillna(5), op)(df_b.fillna(5), axis=0).mask(
+        nan_mask, np.nan
+    )
+
+    tm.assert_frame_equal(df_result, df_expected)
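To see which cases the new test actually exercises, the NaN placement from its loop can be worked out by hand. The sketch below uses only the standard library and mirrors the index arithmetic of the test (rows i, i + 1 and i + 4 of column i); the variable names are illustrative and do not appear in the PR.

```python
COLUMNS = list("ABCDE")

# Same placement as the test loop: column i gets NaN in rows i, i + 1 and i + 4.
nan_cells = {(i + offset, COLUMNS[i]) for i in range(5) for offset in (0, 1, 4)}

# The last column plays the role of the Series operand (df_b in the test).
series_nan_rows = {row for row, col in nan_cells if col == "E"}
frame_nan_cells = {(row, col) for row, col in nan_cells if col != "E"}

# Missing on both sides: these cells are expected to stay NaN (nan_mask in the test).
both_missing = sorted(cell for cell in frame_nan_cells if cell[0] in series_nan_rows)

# Missing only in the frame: the frame cell is replaced by fill_value.
frame_only = frame_nan_cells - set(both_missing)

# Missing only in the Series: the Series entry is replaced by fill_value.
series_only = {(row, col) for row in series_nan_rows for col in "ABCD"} - set(both_missing)

print(both_missing)      # [(4, 'A'), (4, 'D'), (5, 'B')]
print(len(frame_only))   # 9
print(len(series_only))  # 9
```

So the fixture covers all three situations: three cells missing on both sides, nine missing only in the frame, and nine missing only in the Series. It covers them for a NumPy-backed frame only, which is why the reviewer asks for an additional non-NumPy dtype test.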