linux
vlarkanov, 2019-04-02 12:32:02

How to find a bad memory stick?

The log contains many messages like this:
[Tue Apr 2 12:09:46 2019] EDAC MC1: 1 CE error on CPU#1Channel#1_DIMM#0 (channel:1 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
How do I determine which memory stick needs to be replaced?
In iLO I see this:

PROC 1 DIMM 1G : not installed
PROC 1 DIMM 2D : not installed
PROC 1 DIMM 3A : 4096 MB 1333 MHz
PROC 1 DIMM 4H : not installed
PROC 1 DIMM 5E : not installed
PROC 1 DIMM 6B : 4096 MB 1333 MHz
PROC 1 DIMM 7I : 8192 MB 1333 MHz
PROC 1 DIMM 8F : 8192 MB 1333 MHz
PROC 1 DIMM 9C : 4096 MB 1333 MHz
PROC 2 DIMM 1G : not installed
PROC 2 DIMM 2D : not installed
PROC 2 DIMM 5E : not installed
PROC 2 DIMM 6B : 4096 MB 1333 MHz
PROC 2 DIMM 7I : 8192 MB 1333 MHz
PROC 2 DIMM 8F : 8192 MB 1333 MHz
PROC 2 DIMM 9C : 4096 MB 1333 MHz

Some more information:
# grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
/sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch2_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow1/ch2_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow2/ch2_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow0/ch1_ce_count:2595
/sys/devices/system/edac/mc/mc1/csrow0/ch2_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow1/ch2_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow2/ch2_ce_count:0
# dmidecode -t memory | grep 'Locator: PROC'
Locator: PROC 1 DIMM 1G
Locator: PROC 1 DIMM 2D
Locator: PROC 1 DIMM 3A
Locator: PROC 1 DIMM 4H
Locator: PROC 1 DIMM 5E
Locator: PROC 1 DIMM 6B
Locator: PROC 1 DIMM 7I
Locator: PROC 1 DIMM 8F
Locator: PROC 1 DIMM 9C
Locator: PROC 2 DIMM 1G
Locator: PROC 2 DIMM 2D
Locator: PROC 2 DIMM 3A
Locator: PROC 2 DIMM 4H
Locator: PROC 2 DIMM 5E
Locator: PROC 2 DIMM 6B
Locator: PROC 2 DIMM 7I
Locator: PROC 2 DIMM 8F
Locator: PROC 2 DIMM 9C

2 answer(s)
CityCat4, 2019-04-02
@vlarkanov

I bet on PROC2 DIMM 6B :)

res2001, 2019-04-02
@res2001

Looks like one stick on channel 1.
In any case, trial and error still works:
run memtest with only one stick installed at a time.
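
Before pulling sticks, it may help to watch only the counters that are actually growing. Here is a rough sketch that prints just the non-zero corrected-error counters from the same sysfs files shown in the question. The function name nonzero_ce and its optional directory argument are my own invention for illustration, not part of any EDAC tool.

```shell
# Sketch: list EDAC corrected-error (CE) counters that are non-zero.
# nonzero_ce is a hypothetical helper name. The base directory is a
# parameter (an assumption for convenience) so the function can also be
# pointed at a copy of the sysfs tree.
nonzero_ce() {
    base="${1:-/sys/devices/system/edac/mc}"
    for f in "$base"/mc*/csrow*/ch*_ce_count; do
        [ -r "$f" ] || continue      # glob matched nothing, or file unreadable
        n=$(cat "$f")
        if [ "$n" -gt 0 ]; then
            printf '%s:%s\n' "$f" "$n"
        fi
    done
}
```

Called without arguments (nonzero_ce) it reads the live counters. On the output from the question it would leave only the mc1/csrow0/ch1_ce_count line, which agrees with the MC1, channel:1 in the log message. Which physical slot that corresponds to depends on the board, so the one-stick-at-a-time memtest is still the way to confirm.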
