Open
Description
I tested building and running the tests, including OpenMP support, by including the flag:
cmake -B build -G Ninja -DBUILD_TESTING=on -DCMAKE_Fortran_FLAGS=-fopenmp -DCMAKE_MAXIMUM_RANK:String=4 -DCMAKE_BUILD_TYPE=Release -DCMAKE_Fortran_COMPILER=gfortran
cmake --build build
ctest --test-dir build/test
several of the tests failed:
86% tests passed, 11 tests failed out of 77
Label Time Summary:
quadruple_precision = 0.17 sec*proc (2 tests)
Total Test time (real) = 12.68 sec
The following tests FAILED:
12 - chaining_maps (SEGFAULT)
13 - open_maps (SEGFAULT)
14 - maps (SEGFAULT)
15 - intrinsics (Failed)
30 - linalg_pseudoinverse (Failed)
38 - blas_lapack (Failed)
43 - sorting (Exit code 0xc0000374)
47 - mean (Failed)
59 - string_intrinsic (Failed)
64 - string_to_number (Failed)
69 - simps (Failed)
I wonder if one of the CI jobs should include OpenMP in order to catch such behaviours early?
Expected Behaviour
Should pass
Version of stdlib
master
Platform and Architecture
Windows / gfortran 14.2.0
Additional Information
No response
Activity
jvdp1 commented on Mar 29, 2025
Strange that it failed on some procedures like blas_lapack or sorting. I agree that we should include OpenMP in at least one of the CI jobs.
jalvesz commented on Apr 2, 2025
I was looking at the intrinsics test and saw that it fails for sum and dot_product for xdp. There seems to be a tolerance issue. For instance, I printed the tolerance and relative errors here
https://github.com/fortran-lang/stdlib/blob/60d0a769216322243e28a63b92ed7668d2df80d5/test/intrinsics/test_intrinsics.fypp#L213C1-L218C87
by adding a
print *, '${t}$ dot err:', tolerance, err(1:3)
I get without openmp:
real(xdp) dot err: 1.08420217248550443401E-0017 3.25260651745651330202E-0019 0.00000000000000000000 0.00000000000000000000
With openmp:
real(xdp) dot err: 1.08420217248550443401E-0017 2.22044604925031308085E-0016 0.00000000000000000000 5.55111512312578270212E-0016
For the latter, the errors seem to be funnily close to epsilon(0.d0) = 2.220446049250313E-016 ... I'm intrigued here; I wonder if the other tests might be suffering from something similar.
perazz commented on Apr 8, 2025
Yes, unfortunately I also noted this a while ago:
https://github.com/fortran-lang/fpm/blob/7535cab6efc89dd5a294f0d9643b5eebd6b237f0/src/fpm_meta.f90#L139-L142
I have never had time to dig into the issue, though.
I don't use OpenMP much, but I believe that every time there is a static (save) variable somewhere, it must be declared THREADPRIVATE; otherwise all threads will write to it, causing unpredictable behavior.
jalvesz commented on Apr 11, 2025
On a different machine (without the hash_functions tests, #976) I got "only" the following failures when using OpenMP (here using GNU from msys2 instead of equation.com)
running: ctest --test-dir build/test --rerun-failed --output-on-failure
click to view log
perazz commentedon Apr 11, 2025
Regarding the filesystem tests, it seems it may be enough to ensure that the test file name is different for each thread.
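A minimal sketch of that idea, assuming a thread-numbered scratch file is acceptable for the test. The program name, the file-name pattern and the variable names are made up for illustration and are not taken from the stdlib test suite:

```fortran
program per_thread_filename
    ! The "!$ " sentinel lines are only compiled when OpenMP is enabled
    !$ use omp_lib, only: omp_get_thread_num
    implicit none
    character(len=64) :: filename
    integer :: tid, unit, ios

    !$omp parallel private(tid, filename, unit, ios)
    tid = 0
    !$ tid = omp_get_thread_num()

    ! Hypothetical naming scheme: one scratch file per thread
    write(filename, '(a,i0,a)') 'test_scratch_', tid, '.txt'

    open(newunit=unit, file=trim(filename), status='replace', &
         action='write', iostat=ios)
    if (ios == 0) then
        write(unit, '(a,i0)') 'written by thread ', tid
        ! Each thread deletes only the file it created
        close(unit, status='delete')
    end if
    !$omp end parallel
end program per_thread_filename
```

Built without -fopenmp, the same source runs serially with tid = 0, so the test would behave the same way in the existing CI jobs.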
PierUgit commentedon Apr 14, 2025
As you mention, the problem can arise only if the saved variable (which can be a module variable, which is saved by design) is written; there is no issue when only reading the variable. But as a general rule, given the importance of multithreading in HPC nowadays, the thread-safety status of all stdlib routines should be documented: which ones are thread-safe, and which ones are not.
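To illustrate the written-saved-variable case, here is a small sketch with a hypothetical module (counter_mod, bump and get_count are invented names, not stdlib code). The module variable ncalls is written by every call, so it is declared threadprivate to give each thread its own copy:

```fortran
module counter_mod
    implicit none
    private
    public :: bump, get_count

    ! Hypothetical module state; module variables are implicitly saved
    integer :: ncalls = 0
    ! Without this directive all threads would update the same ncalls
    !$omp threadprivate(ncalls)

contains

    subroutine bump()
        ! Writes to the saved variable: the thread-unsafe operation
        ncalls = ncalls + 1
    end subroutine bump

    integer function get_count()
        ! Reads the calling thread's copy
        get_count = ncalls
    end function get_count

end module counter_mod

program demo_threadprivate
    use counter_mod, only: bump, get_count
    implicit none
    integer :: i

    !$omp parallel private(i)
    do i = 1, 1000
        call bump()
    end do
    ! Serialize the output so lines from different threads do not interleave
    !$omp critical
    print '(a,i0)', 'calls seen by this thread: ', get_count()
    !$omp end critical
    !$omp end parallel
end program demo_threadprivate
```

With the directive, each thread should count only its own calls. Removing it turns the increment into a data race on a shared variable, which is the kind of unpredictable behaviour discussed above.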
jalvesz commentedon Apr 15, 2025
I would have said it is better to make sure the deletion is executed by a single thread, for example by adding
!$omp single
where appropriate, no?
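A rough sketch of what that could look like, assuming the file is shared by all threads and only needs to be removed once. The program and file names are hypothetical, and plain open/close is used here rather than any stdlib routine:

```fortran
program single_delete
    implicit none
    integer :: unit, ios
    ! Hypothetical name for a file shared by all threads
    character(len=*), parameter :: filename = 'shared_scratch.txt'

    ! Create the file once, before the parallel region
    open(newunit=unit, file=filename, status='replace', &
         action='write', iostat=ios)
    if (ios /= 0) error stop 'could not create scratch file'
    write(unit, '(a)') 'temporary data'
    close(unit)

    !$omp parallel private(unit, ios)
    ! ... per-thread work that only reads the file would go here ...

    ! Wait until every thread is done with the file before removing it
    !$omp barrier

    ! Only one thread performs the deletion; the others wait at the
    ! implicit barrier at the end of the single construct
    !$omp single
    open(newunit=unit, file=filename, status='old', iostat=ios)
    if (ios == 0) close(unit, status='delete')
    !$omp end single
    !$omp end parallel
end program single_delete
```

Compared with per-thread file names, this keeps a single file but relies on the threads only reading it concurrently.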