-
Notifications
You must be signed in to change notification settings - Fork 355
Ignore disabled hyperthreads #291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@Flamefire can you please fix lint? (namely run clang-format on your PR) |
Fixed the clang format issue and squashed it into the initial commit I'm not sure what to put into the description in addition to what I have. I added a paragraph linking to the logic of CPUInfo itself. I.e. the issue is that while CPUInfo "knows" that it doesn't has information on the APIC ID it still uses it as-if it was valid. |
@pytorchbot rebase |
Only consider processors with an APIC_ID as valid. For processors with Hyperthreads but disabled SMT the APIC_ID will never be set and stays zero for the "disabled" threads. Among other inconsistencies this causes the first active thread/processor to be considered "the same" as all the disabled ones as they share `APIC_ID=0` It also doesn't make sense to make and decisions based on the APIC_ID if we don't have any information on it.
Rebased manually. Not sure why the bot didn't do it. |
#238 is for nnpack? Did you mean xnnpack? I've seen similar issues on arm where cores are unavailable, but for different reasons |
NNPACK is correct, that's where I ran into a crash due to this. See the issue there: Maratyszcza/NNPACK#218 |
Only consider processors with an APIC_ID as valid.
For processors with Hyperthreads but disabled SMT the APIC_ID will never be set by CPUInfo and stays zero for the "disabled" threads. Among other inconsistencies this causes the first active thread/processor to be considered "the same" as all the disabled ones as they share
APIC_ID=0
It also doesn't make sense to make and decisions based on the APIC_ID if we don't have any information on it.
So this patch adds a check for
CPUINFO_LINUX_FLAG_APIC_ID
.The flag
CPUINFO_LINUX_FLAG_APIC_ID
will be set if and only if the APIC_ID has been set. If it has not then CPUInfo doesn't has any information about the APIC_ID, hence the default value of zero. Seecpuinfo/src/x86/linux/cpuinfo.c
Lines 68 to 69 in 39ea79a
On a dual socket AMD EPYC 9334 system (2x32 cores, 128 threads total (2/core)) the current output of cache-info is
Note the "65 processors" being one-off.
With this patch:
This matches the Zen4 architecture having 32KB Data and 32KB instruction L1 cache per core, 1MB L2 cache per core and 256MB L3 cache total.
Fixes #238