Description
🐛 Bug
`RunningMeanStd` is not overflow-safe: its running statistics overflow during large-scale training runs (e.g., on a cluster).
To Reproduce
I'm submitting a pull request with a proposal to address the problem.
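
For context, here is a minimal sketch of one way a running mean/variance update can be written to keep intermediate values small. It is not the code from the pull request. It assumes the overflow comes from large intermediate products (for example `count * batch_count`, or variance-times-count sums) in the usual parallel-variance merge, and it avoids them by working with the weights `count / total` and `batch_count / total`. The class name `SafeRunningMeanStd` and its attributes are hypothetical, chosen only for illustration.

```python
import numpy as np


class SafeRunningMeanStd:
    """Hypothetical running mean/variance tracker that avoids large intermediate products."""

    def __init__(self, epsilon: float = 1e-4, shape: tuple = ()):
        # Keep all statistics in float64; the count is a float so it cannot wrap around.
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = float(epsilon)

    def update(self, batch: np.ndarray) -> None:
        batch = np.asarray(batch, dtype=np.float64)
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = float(batch.shape[0])

        total = self.count + batch_count
        # Weights are always in [0, 1], so no term scales with the raw counts.
        w_old = self.count / total
        w_new = batch_count / total
        delta = batch_mean - self.mean

        # Algebraically the same as the standard parallel merge
        # (M2_a + M2_b + delta^2 * n_a * n_b / n) / n, divided through by n up front.
        self.mean = self.mean + delta * w_new
        self.var = w_old * self.var + w_new * batch_var + delta**2 * w_old * w_new
        self.count = total
```

Calling `update` on successive batches should match the statistics of the concatenated data up to the small bias introduced by the `epsilon` pseudo-count. Whether this matches the failure seen on the cluster depends on which quantity actually overflowed there.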
Relevant log output / Error message
No response
System Info
No response
Checklist
- My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
- I have checked that there is no similar issue in the repo
- I have read the documentation
- I have provided a minimal and working example to reproduce the bug
- I've used the markdown code blocks for both code and stack traces.