-
Notifications
You must be signed in to change notification settings - Fork 2.6k
fix(rust): Fix cum_min and cum_max does not preserve inf or -inf values at series start #22896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #22896 +/- ##
==========================================
- Coverage 80.61% 80.31% -0.31%
==========================================
Files 1677 1682 +5
Lines 222297 223181 +884
Branches 2801 2804 +3
==========================================
+ Hits 179205 179248 +43
- Misses 42425 43264 +839
- Partials 667 669 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi @orlp , please review and suggest if needed anything. |
|
You can save a lot of repetition by defining fn cum_scan_numeric<T, F>(ca: &ChunkedArray<T>, reverse: bool, init: T::Native, update: F) -> ChunkedArray<T>
where
T: PolarsNumericType,
ChunkedArray<T>: FromIterator<Option<T::Native>>,
F: Fn(&mut T::Native, Option<T::Native>) -> Option<Option<T::Native>>
{
let out: ChunkedArray<T> = match reverse {
false => ca.iter().scan(init, update).collect_trusted(),
true => ca.iter().rev().scan(init, update).collect_reversed(),
};
out.with_name(ca.name().clone())
}and then using let init = if <$T>::is_float() { <$T>::nan_value() } else { <$T>::min_value() };
cum_scan_numeric(ca, reverse, init, det_max)and similarly for the minimum in the dispatch. Note that your current init value for floats is incorrect, it needs to be |
orlp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
…s_float() to simplify dispatch.
…s, as fmt suggests.
Thanks! That's very helpful. I've made some new changes on it. @orlp |
|
Wait, I missed implementing cum_scan_numeric to cum_sum_numeric and cum_prod_numeric |
… with cum_scan_numeric.
|
Hi @orlp , I've done new commits on it. |
|
Thank you for helping out :) |
All good! Appreciations. |
What’s Changed
Improved cum_min / cum_max for floating-point numbers, and NaN handling.
This update refines the
Value init when doing cum_max_numeric and cum_min_numeric.
It now checks the type: if it's a float, the starting value for the calculation is NaN. Otherwise, it uses the standard min/max possible value for that type.
Correct NaN handling in det_min and det_max
Now it uses MinMax::min_ignore_nan and MinMax::max_ignore_nan to compare the values. For example, if a series starts with NaN, the result will also start with NaN, as expected.
New cum_scan_numeric function.
A new helper function to handle the common scanning logic.
Added Test
Would close the issue #22855