Bug 7784 - Backport upstream patches to support EDAC on Intel Granite Rapids (GNR) and Sierra Forest (SRF)
Status: NEW
Alias: None
Product: ANCK 5.10 Dev
Classification: ANCK
Component: X86
Version: unspecified
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: Guanjun
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-20 19:07 UTC by wjin123
Modified: 2024-01-22 11:47 UTC

See Also:


Attachments

Description wjin123 intel_group 2023-12-20 19:07:12 UTC
Description of problem:
Upstream kernels v6.3-rc1 and v6.6-rc1 provide EDAC support for Intel Granite Rapids (GNR) and Sierra Forest (SRF) servers. These patches need to be backported to ANCK 5.10 to support EDAC on the two Intel server platforms.
The related and dependent upstream patches are listed below:
0cfd8fbadd68 - x86/cpu: Fix Crestmont uarch
c545f5e41225 - EDAC/i10nm: Skip the absent memory controllers
96ae3995c693 - EDAC/i10nm: Add Intel Sierra Forest server support
ba987eaaabf9 - EDAC/i10nm: Add Intel Granite Rapids server support
dd7814b78539 - EDAC/i10nm: Make more configurations CPU model specific
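
A backported commit gets a new hash in the target tree, so one way to see which of the patches above are still missing is to match on commit subjects instead. The following is a minimal sketch, not part of the original report: the function name missing_backports is hypothetical, and it assumes the backports keep the upstream subject lines unchanged.

```shell
#!/bin/sh
# Hypothetical helper: reads commit subjects on stdin (one per line) and
# prints every subject from the patch list above that does not appear.
# Assumption: backported commits keep the upstream subject line.
missing_backports() {
    _log=$(cat)
    while IFS= read -r subject; do
        case "$_log" in
            *"$subject"*) ;;                      # subject present in the log
            *) printf '%s\n' "$subject" ;;        # subject not found
        esac
    done <<'EOF'
x86/cpu: Fix Crestmont uarch
EDAC/i10nm: Skip the absent memory controllers
EDAC/i10nm: Add Intel Sierra Forest server support
EDAC/i10nm: Add Intel Granite Rapids server support
EDAC/i10nm: Make more configurations CPU model specific
EOF
}
```

In a kernel tree this could be fed with something like "git log --format=%s | missing_backports"; an empty result would suggest that all five subjects are present.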

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
BIOS setting:
System Event Log -> WHEA Support <Enabled>
System Event Log -> Error Injection Settings -> WHEA Error Injection 5.0 Extension = Enable
kernel configuration:
CONFIG_EDAC_I10NM=m
1. After the GNR/SRF system powers on, check whether the EDAC module is loaded with the command "lsmod | grep -i edac". If there is no output, EDAC is not supported. If the EDAC module is found, e.g. "i10nm_edac       24576  0", go to the next step.
2. Inject a memory CE error with the ras-tools command "./cmcistorm 1 1".
3. Check for the EDAC message in dmesg.
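
The checks in the steps above can be sketched as small shell helpers. This is an illustrative sketch only: the function names are hypothetical, each helper reads text on stdin so it works on live output or a saved log, and the config file path in the usage comment is an assumption that varies by distribution.

```shell
#!/bin/sh
# Hypothetical helpers for the reproduction steps; not part of ras-tools.

# Succeeds if a kernel config listing enables CONFIG_EDAC_I10NM (=y or =m).
i10nm_config_enabled() {
    grep -Eq '^CONFIG_EDAC_I10NM=[ym]$'
}

# Succeeds if lsmod-style output lists the i10nm_edac module.
i10nm_module_loaded() {
    awk '$1 == "i10nm_edac" { found = 1 } END { exit !found }'
}

# Prints the EDAC CE lines found in dmesg-style output.
edac_ce_lines() {
    grep -E 'EDAC MC[0-9]+: [0-9]+ CE '
}

# Example use on a test machine (not run here; needs root and ras-tools,
# and the /boot/config-* path is an assumption):
#   i10nm_config_enabled < "/boot/config-$(uname -r)" || echo "config missing"
#   lsmod | i10nm_module_loaded || echo "i10nm_edac not loaded"
#   modprobe einj && ./cmcistorm 1 1
#   dmesg | edac_ce_lines
```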

Actual results:
Without the EDAC patches above applied, the command "lsmod | grep -i edac" produces no output, and loading the EDAC module with "modprobe i10nm_edac" reports an error.

Expected results:
1. The command "lsmod | grep -i edac" outputs something similar to: "i10nm_edac             24576  0".
2. After injecting a memory CE error, an EDAC message appears in dmesg, similar to: "[  149.909903] EDAC MC9: 1 CE memory read error on CPU_SrcID#1_MC#1_Chan#0_DIMM#0 (channel:0 slot:0 page:0x3c2cedb7 offset:0x480 grain:32 syndrome:0x0 -  err_code:0x0080:0x0090  SystemAddress:0x3c2cedb7480 ProcessorSocketId:0x1 MemoryControllerId:0x1 ChannelAddress:0x4edb7480 ChannelId:0x0 RankAddress:0x13b6dd00 PhysicalRankId:0x1 DimmSlotId:0x0 DimmRankId:0x1 Row:0x23b3 Column:0x3a0 Bank:0x1 BankGroup:0x5 ChipSelect:0x1)"
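
When comparing a captured message against the expected one, it can help to pull out individual "name:value" fields from the EDAC line. The helper below is a hedged sketch: the name edac_field is hypothetical, and it assumes the field name is a plain alphanumeric token and that fields are separated by spaces or parentheses, as in the sample message above.

```shell
#!/bin/sh
# Hypothetical helper: prints the value of one "name:value" field from an
# EDAC message read on stdin. $1 is the field name (letters, digits, "_").
edac_field() {
    sed -n "s/.*[ (]$1:\([^ )]*\).*/\1/p"
}

# Example use (not run here):
#   dmesg | grep 'EDAC MC' | edac_field SystemAddress
```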

Additional info:
Comment 1 wjin123 intel_group 2023-12-21 18:27:14 UTC
At reproduction step 2, run "modprobe einj" before running "./cmcistorm 1 1" so that the einj.ko module is loaded in advance.
Comment 2 小龙 admin 2023-12-28 22:11:54 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/2559
Comment 3 小龙 admin 2024-01-22 11:47:57 UTC
The PR Link: https://gitee.com/anolis/anck-next/pulls/36