Bug 26672 - Backport upstream fix for NUMA sched domain build errors on GNR and CWF
Status: NEW
Alias: None
Product: ANCK 6.6 Dev
Classification: ANCK
Component: sched
Version: unspecified
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: AubreyLi
QA Contact: CruzZhao
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-10-28 12:11 UTC by AubreyLi
Modified: 2025-12-15 16:57 UTC
CC List: 3 users

Description AubreyLi intel_group 2025-10-28 12:11:01 UTC
While testing Granite Rapids (GNR) and Clearwater Forest (CWF) systems
in SNC-3 mode, we encountered sched domain build errors in dmesg.
The scheduler domain code did not expect asymmetric node distances
from a local node to multiple nodes in a remote package. As a result,
remote nodes were partially grouped with local nodes in asymmetric
groupings, which created too many levels in the NUMA sched domain
hierarchy.

To address this, we simplify remote node distances for the purpose of
sched domain construction on GNR and CWF. Specifically, we replace the
individual distances to nodes within the same remote package with their
average distance. This resolves the domain build errors and reduces the
number of NUMA sched domain levels.
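
The averaging step can be illustrated with a minimal, self-contained C
sketch. This is not the upstream patch: the 2-package SNC-3 distance
table is invented for illustration, and the names slit, package_of(),
sched_distance() and distinct_distances() are hypothetical helpers that
do not exist in the kernel. It assumes nodes are numbered contiguously
per package, and it counts distinct distance values as a rough proxy for
the number of NUMA sched domain levels.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES          6
#define NODES_PER_PACKAGE 3	/* SNC-3: three NUMA nodes per package */

/* Hypothetical SLIT-style table (made-up values, not from real hardware). */
static const int slit[NR_NODES][NR_NODES] = {
	{ 10, 12, 12, 21, 24, 27 },
	{ 12, 10, 12, 24, 27, 21 },
	{ 12, 12, 10, 27, 21, 24 },
	{ 21, 24, 27, 10, 12, 12 },
	{ 24, 27, 21, 12, 10, 12 },
	{ 27, 21, 24, 12, 12, 10 },
};

/* Assumption: nodes 0-2 belong to package 0, nodes 3-5 to package 1. */
static int package_of(int node)
{
	return node / NODES_PER_PACKAGE;
}

/*
 * Distance used for sched domain construction in this sketch:
 * true distance inside a package, average distance to all nodes
 * of the destination package otherwise.
 */
static int sched_distance(int from, int to)
{
	assert(from >= 0 && from < NR_NODES && to >= 0 && to < NR_NODES);

	if (package_of(from) == package_of(to))
		return slit[from][to];

	int first = package_of(to) * NODES_PER_PACKAGE;
	int sum = 0;

	for (int n = first; n < first + NODES_PER_PACKAGE; n++)
		sum += slit[from][n];

	return sum / NODES_PER_PACKAGE;
}

/* Count distinct distance values over the whole table. */
static int distinct_distances(bool simplified)
{
	int seen[NR_NODES * NR_NODES];
	int count = 0;

	for (int i = 0; i < NR_NODES; i++) {
		for (int j = 0; j < NR_NODES; j++) {
			int d = simplified ? sched_distance(i, j) : slit[i][j];
			bool found = false;

			for (int k = 0; k < count; k++)
				if (seen[k] == d)
					found = true;
			if (!found)
				seen[count++] = d;
		}
	}
	return count;
}
```

With this table, the raw distances take five distinct values (10, 12, 21,
24, 27), while the simplified distances take three (10, 12, 24), since
every remote node is reported at the averaged distance of 24. The slit
table itself is left unmodified, so the true distances remain available.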

The actual SLIT NUMA node distances are still preserved separately, in
case they are needed when building sched domains. NUMA balancing
continues to use the true distances when selecting a closer remote node
for a task’s numa_group.

This backports the following two commits, along with any dependencies
they require:

- 0001-sched-Create-architecture-specific-sched-domain-dist.patch
- 0002-sched-topology-Fix-sched-domain-build-error-for-GNR-.patch
Comment 1 小龙 admin 2025-12-15 16:57:54 UTC
PR link: https://gitee.com/anolis/cloud-kernel/pulls/6206