Debian
synapse_people, 2019-04-15 10:52:58

How to check the memory on the server?

Quite often (about once an hour), lines like these show up in the syslog:

Message from [email protected] at Apr 15 10:51:23 ...
 kernel:[ 1873.150211] [Hardware Error]: Corrected error, no action required.

Message from [email protected] at Apr 15 10:51:23 ...
 kernel:[ 1873.150228] [Hardware Error]: CPU:6 (10:8:0) MC4_STATUS[Over|CE|MiscV|-|AddrV|CECC]: 0xdc4a400053080813

Message from [email protected] at Apr 15 10:51:23 ...
 kernel:[ 1873.150238] [Hardware Error]: Error Addr: 0x0000001729eb00e0

Message from [email protected] at Apr 15 10:51:23 ...
 kernel:[ 1873.150243] [Hardware Error]: MC4 Error (node 1): DRAM ECC error detected on the NB.

Message from [email protected] at Apr 15 10:51:23 ...
 kernel:[ 1873.150274] [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout)

So my question is: as I understand it, the problem is in the memory? And how can I check whether it is ECC?


1 answer(s)
Ronald McDonald, 2019-04-15
@synapse_people

If possible, run memtester. It is a coin toss whether that will show anything, though: ECC memory can correct the bad data on its own, so the error may never reach memtester.
In general, that is exactly what ECC is for: so that the administrator does not have to bother with memory errors. The kernel even tells you so in the log you posted: "Corrected error, no action required."
So my personal verdict: don't worry about it. If programs suddenly start crashing or the kernel panics, then it is time to pay attention.
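
If you want to keep an eye on whether the corrected errors are piling up, here is a rough sketch that reads the EDAC counters from sysfs. It assumes the kernel's EDAC driver is loaded (on AMD boards that is usually amd64_edac) and that it exposes the usual `ce_count` (corrected) and `ue_count` (uncorrected) files under `/sys/devices/system/edac/mc/`. The function name `read_edac_counts` is just something I made up for the example.

```python
from pathlib import Path

# Assumption: the EDAC driver is loaded and exposes one mcN directory
# per memory controller under this sysfs path.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: {"ce": int|None, "ue": int|None}} for each mcN dir."""
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for key, filename in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / filename).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None  # counter file missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found (is the EDAC driver loaded?)")
    for name, c in counts.items():
        print(f"{name}: corrected={c['ce']} uncorrected={c['ue']}")
```

Run it from cron and compare the numbers over time. A slowly growing corrected count is ECC doing its job. A non-zero uncorrected count, or a corrected count that climbs quickly, is a reason to look at the DIMM behind that controller. As for whether the installed modules are ECC at all, `dmidecode -t memory` reports that in its "Error Correction Type" field.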
