intel-extension-for-pytorch/csrc/cpu/runtime/CPUPool.cpp is written in a way that fails on systems with more than 1024 CPUs, because it relies on the fixed-size cpu_set_t. For example:
cpu_set_t main_thread_pre_set;
CPU_ZERO(&main_thread_pre_set);
if (sched_getaffinity(0, sizeof(cpu_set_t), &main_thread_pre_set) != 0) {
  throw std::runtime_error("Fail to get the thread affinity information");
}
This needs to be done using dynamically sized CPU sets (see man CPU_SET(3)).
This is the first place to choke, but I suspect there may be additional code that needs correction.
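For illustration, here is a minimal sketch of what the dynamically sized variant could look like on Linux with glibc. It is not the project's code: the helper name `get_affinity_cpus`, the doubling strategy, and the upper bound of 2^20 CPUs are assumptions made for this example. It relies only on the `CPU_ALLOC`, `CPU_ALLOC_SIZE`, `CPU_ZERO_S`, `CPU_ISSET_S`, and `CPU_FREE` macros described in CPU_SET(3), and on `sched_getaffinity` returning `EINVAL` when the supplied mask is too small for the kernel's affinity mask.

```cpp
// Sketch only (Linux + glibc). get_affinity_cpus is a hypothetical helper,
// not a function from intel-extension-for-pytorch.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the CPU ids the calling thread is allowed to run on.
std::vector<int> get_affinity_cpus() {
  // Start at the capacity of a fixed cpu_set_t (CPU_SETSIZE) and double the
  // capacity until the kernel accepts the mask size. The 2^20 cap is an
  // arbitrary safety limit chosen for this example.
  for (int ncpus = CPU_SETSIZE; ncpus <= (1 << 20); ncpus *= 2) {
    cpu_set_t* set = CPU_ALLOC(ncpus);
    if (set == nullptr) {
      throw std::runtime_error("Fail to allocate the CPU set");
    }
    const std::size_t size = CPU_ALLOC_SIZE(ncpus);
    CPU_ZERO_S(size, set);

    if (sched_getaffinity(0, size, set) == 0) {
      std::vector<int> cpus;
      for (int cpu = 0; cpu < ncpus; ++cpu) {
        if (CPU_ISSET_S(cpu, size, set)) {
          cpus.push_back(cpu);
        }
      }
      CPU_FREE(set);
      return cpus;
    }

    const int err = errno;  // save errno before freeing the set
    CPU_FREE(set);
    if (err != EINVAL) {
      break;  // EINVAL means the mask was too small; anything else is fatal
    }
  }
  throw std::runtime_error("Fail to get the thread affinity information");
}
```

A caller would use the returned vector wherever the fixed-size set is inspected today. Any matching `sched_setaffinity` or `pthread_setaffinity_np` calls would need the same `_S` treatment, passing the allocated size rather than `sizeof(cpu_set_t)`.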
Versions
v2.7.0+cpu (and others?)
Hi @hpcpony, thanks for reporting this issue. We are evaluating it. By the way, may I ask which hardware platform (CPU model, cloud, etc.) you are using that has more than 1024 CPUs?
Admittedly there probably aren't many of these out there, so I don't think this is a time-critical bug fix, but if there comes a point where it's easy to fix the code, it would probably be worth it. The way things are going with core counts, it's probably going to become a more general problem in the not too distant future.
(*) getting my sysadmins to turn off hyperthreading has not been successful ;^(