Open
Labels
high priority
Description
Several tests are completely skipped right now because they are "flaky".
test_reshape
test_std
test_var
test_remainder
This is a pretty high priority issue because these functions are effectively completely untested, even though they appear to be tested.
Tests should be written in such a way that they aren't flaky, for instance, by using loose numerical tolerances (or, if necessary, avoiding value testing entirely).
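As a rough illustration of the tolerance idea, here is a minimal standard-library sketch. The helpers `variance` and `assert_close` and the tolerance values are hypothetical, not part of the test suite; the point is only that two mathematically equivalent computations are compared within a tolerance instead of with exact equality.

```python
import math


def variance(xs, correction=0):
    """Two-pass reference variance (hypothetical helper)."""
    n = len(xs)
    mean = math.fsum(xs) / n
    return math.fsum((x - mean) ** 2 for x in xs) / (n - correction)


def assert_close(actual, expected, rel_tol=1e-6, abs_tol=1e-12):
    """Fail only if the values differ by more than the given tolerances."""
    assert math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol), (
        f"{actual!r} != {expected!r} within rel_tol={rel_tol}, abs_tol={abs_tol}"
    )


xs = [0.1, 0.2, 0.3, 0.4]

# One-pass formula E[x^2] - E[x]^2: same quantity, different rounding error.
one_pass = sum(x * x for x in xs) / len(xs) - (sum(xs) / len(xs)) ** 2

# An exact == comparison could fail here; a tolerance check is stable.
assert_close(one_pass, variance(xs))
```

How loose the tolerance should be depends on the dtype and the function under test; the values above are placeholders.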
Note that health checks for timeouts should just be skipped, and health checks for filtering too much should be fixed by fixing the strategy.
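A minimal sketch of both points, assuming a plain Hypothesis test rather than the suite's actual strategies (the test name and the list strategies below are hypothetical). The slow-data health check is suppressed through `settings`, while the "at least 2 elements" constraint is built into the strategy instead of being applied as a filter.

```python
from hypothesis import HealthCheck, given, settings, strategies as st

finite_floats = st.floats(allow_nan=False, allow_infinity=False)

# Prone to the filter_too_much health check: most short lists get rejected.
filtered_lists = st.lists(finite_floats).filter(lambda xs: len(xs) >= 2)

# Constructive alternative: only generate lists that already satisfy the constraint.
constructive_lists = st.lists(finite_floats, min_size=2)


# Timeout-style health checks are suppressed rather than worked around.
@settings(suppress_health_check=[HealthCheck.too_slow], deadline=None)
@given(constructive_lists)
def test_at_least_two_elements(xs):
    assert len(xs) >= 2
```

Calling `test_at_least_two_elements()` directly, or collecting it with pytest, runs the property.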
EDIT:
test_count_nonzero is not skipped but is flaky on JAX. Some discussion is at "add a test for count_nonzero" #347 (comment).
Activity
ev-br commented on Nov 17, 2024
Looking at test_std, https://github.com/data-apis/array-api-tests/blob/master/array_api_tests/test_statistical_functions.py#L262, it does not seem to attempt any value testing. Then what is flaky, assert_dtype or assert_keepdimable_shape?
asmeurer commented on Nov 18, 2024
I've no idea what's flaky with any of these. The first order of business would be to remove that decorator and figure out why the test was failing. It's also possible that some of these were only flaky with certain libraries.
asmeurer commentedon Nov 18, 2024
Also, it's possible the flakiness was fixed and the skip was never removed. It looks like the skip for std was added in #233 (with no explanation) if you want to check previous versions.
At best, if the test seems to be passing, we can just remove the skip and see if any upstream failures are found. Like I mentioned in another issue, it's really easy to just revert changes here if they break stuff since we don't even have releases, so I wouldn't be too worried about that.
ev-br commented on Nov 23, 2024
test_reshape is fixed in gh-319
ev-br commented on Nov 25, 2024
Caught a test_std failure with array_api_compat.numpy:
asmeurer commented on Nov 25, 2024
I can reproduce that with
asmeurer commented on Nov 25, 2024
I can't tell what is causing it. None of the strategies seem to be that unusual. The only thing I see that's a little different from the other tests is that the input array is filtered to have at least 2 elements, but that shouldn't be causing this error.
Unfortunately, hypothesis makes it quite hard to tell what's going on with this error. The only thing I can suggest would be to refactor the input strategies, e.g., to use shared instead of data.draw. Otherwise, we may want to report this upstream on the hypothesis repo, and see if the hypothesis devs can offer any advice. It may also just be a bug in hypothesis.
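One way to read the `shared` suggestion, sketched on a hypothetical shape/axis pair rather than the real test_std strategies: instead of drawing dependent values interactively with `data.draw` inside the test body, the dependency is expressed up front with `st.shared`, so every argument is generated by `@given`. The strategy names and the key below are illustrative assumptions.

```python
from hypothesis import given, strategies as st

# A shared strategy yields the same value everywhere it is used within one example.
shapes = st.shared(
    st.lists(st.integers(1, 4), min_size=1, max_size=3).map(tuple),
    key="shape",
)

# The axis depends on the shared shape, with no data.draw call needed in the test.
axes = shapes.flatmap(lambda shape: st.integers(-len(shape), len(shape) - 1))


@given(shape=shapes, axis=axes)
def test_axis_in_range(shape, axis):
    # Both arguments were derived from the same drawn shape.
    assert -len(shape) <= axis < len(shape)
```

Whether this restructuring would avoid the error discussed here is not known; it only shows the shape of the refactor.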