Linux Memory Error Diagnosis (3)

2. Run the detection command; it prints the error-correction report shown below.

edac-util -v
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#0_DIMM#0: A1
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#1_DIMM#0: A2
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#2_DIMM#0: A3
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#3_DIMM#0: A4
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#0_DIMM#1: A5
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#1_DIMM#1: A6
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#2_DIMM#1: A7
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#3_DIMM#1: A8
mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#0_DIMM#2: A9
mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#1_DIMM#2: A10
mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#2_DIMM#2: A11
mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#3_DIMM#2: A12

mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#0_DIMM#0: B1
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#1_DIMM#0: B2
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#2_DIMM#0: B3
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#3_DIMM#0: B4
mc1: csrow1: CPU_SrcID#1_Ha#0_Chan#0_DIMM#1: B5
mc1: csrow1: CPU_SrcID#1_Ha#0_Chan#1_DIMM#1: B6
mc1: csrow1: CPU_SrcID#1_Ha#0_Chan#2_DIMM#1: B7
mc1: csrow1: CPU_SrcID#1_Ha#0_Chan#3_DIMM#1: B8
mc1: csrow2: CPU_SrcID#1_Ha#0_Chan#0_DIMM#1: B9
mc1: csrow2: CPU_SrcID#1_Ha#0_Chan#1_DIMM#1: B10
mc1: csrow2: CPU_SrcID#1_Ha#0_Chan#2_DIMM#1: B11
mc1: csrow2: CPU_SrcID#1_Ha#0_Chan#3_DIMM#1: B12

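Where:

mc0 is memory controller 0;
CPU_SrcID#0 is source CPU 0;
Chan#0 (Channel#0 in some listings) is channel 0;
DIMM#0 is memory slot 0;
Corrected Errors is the number of errors that have already been corrected.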
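
As a rough illustration of how these fields can be pulled out of a single report line, here is a minimal Python sketch. It assumes per-DIMM lines of the form `mcN: csrowN: CPU_SrcID#N_Ha#N_Chan#N_DIMM#N: <count> Corrected Errors`, as seen in the listings in this article; the function name `parse_edac_line` is hypothetical, and treating the `Ha#N` part as optional is an assumption.

```python
import re

# Pattern for per-DIMM lines in the label style shown in this article.
# The "Ha#<n>_" part is treated as optional (an assumption).
LINE_RE = re.compile(
    r"^(?P<mc>mc\d+): (?P<csrow>csrow\d+): "
    r"CPU_SrcID#(?P<cpu>\d+)_(?:Ha#\d+_)?Chan#(?P<chan>\d+)_DIMM#(?P<dimm>\d+): "
    r"(?P<count>\d+) Corrected Errors"
)

def parse_edac_line(line):
    """Return a dict of fields for a per-DIMM line, or None for any other line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "mc": m.group("mc"),
        "csrow": m.group("csrow"),
        "cpu": int(m.group("cpu")),
        "channel": int(m.group("chan")),
        "dimm": int(m.group("dimm")),
        "corrected": int(m.group("count")),
    }

sample = "mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#1_DIMM#2: 11 Corrected Errors"
print(parse_edac_line(sample))
```

Summary lines such as `mc0: csrow0: 0 Uncorrected Errors` do not match the pattern and come back as `None`, so they can simply be skipped.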

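Using the CPU channel and memory slot correspondence listed earlier, each line returned by edac-util can be labeled with its slot number.
This gives 6312 corrected errors on slot A1, 6459 on slot B1, and 535 on slot B3. These three DIMMs show signs of potential failure; the next step is to contact the vendor and have them replaced.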
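
The lookup described here can be sketched in a few lines of Python. This is only an illustration: the three non-zero counts are the ones quoted in this article, the zero entry is filler, and the function name `suspect_slots` and its `threshold` parameter are hypothetical.

```python
# Corrected-error counts keyed by (memory controller, channel, DIMM).
# The non-zero values are the ones quoted in the article; the zero entry is filler.
counts = {
    (0, 0, 0): 6312,
    (0, 1, 0): 0,
    (1, 0, 0): 6459,
    (1, 2, 0): 535,
}

# Slot labels for the same keys, taken from the edac-util listing in the article.
slot_labels = {
    (0, 0, 0): "A1",
    (0, 1, 0): "A2",
    (1, 0, 0): "B1",
    (1, 2, 0): "B3",
}

def suspect_slots(counts, slot_labels, threshold=0):
    """Return (slot, count) pairs whose corrected-error count exceeds threshold,
    highest count first."""
    hits = [(slot_labels[key], n) for key, n in counts.items() if n > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

for slot, n in suspect_slots(counts, slot_labels):
    print(f"{slot}: {n} corrected errors")
```

What threshold is worth acting on is a judgment call; the default of 0 simply lists every slot that has reported at least one corrected error.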

Mapping for 12 DIMMs

mc0: csrow0: CPU#0Channel#0_DIMM#0: A1
mc0: csrow0: CPU#0Channel#1_DIMM#0: A2
mc0: csrow0: CPU#0Channel#2_DIMM#0: A3
mc0: csrow1: CPU#0Channel#0_DIMM#1: A4
mc0: csrow1: CPU#0Channel#1_DIMM#1: A5
mc0: csrow1: CPU#0Channel#2_DIMM#1: A6

mc1: csrow0: CPU#1Channel#0_DIMM#0: B1
mc1: csrow0: CPU#1Channel#1_DIMM#0: B2
mc1: csrow0: CPU#1Channel#2_DIMM#0: B3
mc1: csrow1: CPU#1Channel#0_DIMM#1: B4
mc1: csrow1: CPU#1Channel#1_DIMM#1: B5
mc1: csrow1: CPU#1Channel#2_DIMM#1: B6
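
The labels in this 12-DIMM listing follow a regular pattern: the letter comes from the memory controller, and the number counts across channels first, then across DIMM positions. The Python sketch below encodes that pattern as inferred from the listing; it is an assumption about this particular board layout, not a general rule, and the function name `slot_label` and the parameter `channels_per_mc` are hypothetical.

```python
def slot_label(mc, channel, dimm, channels_per_mc=3):
    """Derive a slot label such as 'A4' from controller, channel and DIMM indices.

    Pattern inferred from the listing: letter = controller (0 -> A, 1 -> B),
    number = dimm * channels_per_mc + channel + 1.
    """
    letter = chr(ord("A") + mc)
    return f"{letter}{dimm * channels_per_mc + channel + 1}"

print(slot_label(0, 0, 1))  # mc0, Channel#0, DIMM#1
print(slot_label(1, 2, 1))  # mc1, Channel#2, DIMM#1
```

With `channels_per_mc=4` the same formula reproduces the mc0 labels of the four-channel listing earlier in the article (for example, Chan#3 with DIMM#2 gives A12). Other boards label their slots differently, so the result should always be checked against the silkscreen or the vendor manual.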

Mapping for 20 DIMMs

mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow0: 0 Uncorrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#0_DIMM#0: 0 Corrected Errors A1
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors B1
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#2_DIMM#0: 0 Corrected Errors C1
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#3_DIMM#0: 0 Corrected Errors D1
mc0: csrow1: 0 Uncorrected Errors
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#0_DIMM#1: 0 Corrected Errors A2
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#1_DIMM#1: 0 Corrected Errors B2
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#2_DIMM#1: 0 Corrected Errors C2
mc0: csrow1: CPU_SrcID#0_Ha#0_Chan#3_DIMM#1: 0 Corrected Errors D2
mc0: csrow2: 0 Uncorrected Errors
mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#0_DIMM#2: 0 Corrected Errors A3
mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#1_DIMM#2: 11 Corrected Errors B3
mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#2_DIMM#2: 0 Corrected Errors C3
mc0: csrow2: CPU_SrcID#0_Ha#0_Chan#3_DIMM#2: 0 Corrected Errors D3
mc1: 0 Uncorrected Errors with no DIMM info
mc1: 0 Corrected Errors with no DIMM info
mc1: csrow0: 0 Uncorrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#0_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#2_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#3_DIMM#0: 0 Corrected Errors
mc1: csrow1: 0 Uncorrected Errors
mc1: csrow1: CPU_SrcID#1_Ha#0_Chan#0_DIMM#1: 0 Corrected Errors
mc1: csrow1: CPU_SrcID#1_Ha#0_Chan#1_DIMM#1: 0 Corrected Errors
mc1: csrow1: CPU_SrcID#1_Ha#0_Chan#2_DIMM#1: 0 Corrected Errors
mc1: csrow1: CPU_SrcID#1_Ha#0_Chan#3_DIMM#1: 0 Corrected Errors

4x16 mapping

mc0: csrow0: CPU#0Channel#0_DIMM#0: 0 Corrected Errors 8a
mc0: csrow0: CPU#0Channel#1_DIMM#0: 0 Corrected Errors 5b
mc0: csrow0: CPU#0Channel#2_DIMM#0: 0 Corrected Errors 2c
mc0: csrow1: 0 Uncorrected Errors
mc0: csrow1: CPU#0Channel#0_DIMM#1: 1 Corrected Errors 7d
mc0: csrow1: CPU#0Channel#1_DIMM#1: 0 Corrected Errors 4e
mc0: csrow1: CPU#0Channel#2_DIMM#1: 0 Corrected Errors 1f
mc0: csrow2: 0 Uncorrected Errors
mc0: csrow2: CPU#0Channel#0_DIMM#2: 0 Corrected Errors 6G
mc0: csrow2: CPU#0Channel#1_DIMM#2: 0 Corrected Errors 3h

Reference:
https://www.cnblogs.com/luckyall/p/11225772.html
