How to check the memory on the server?
Very often (about once an hour), lines like these show up in the syslog:
Message from [email protected] at Apr 15 10:51:23 ...
kernel:[ 1873.150211] [Hardware Error]: Corrected error, no action required.
Message from [email protected] at Apr 15 10:51:23 ...
kernel:[ 1873.150228] [Hardware Error]: CPU:6 (10:8:0) MC4_STATUS[Over|CE|MiscV|-|AddrV|CECC]: 0xdc4a400053080813
Message from [email protected] at Apr 15 10:51:23 ...
kernel:[ 1873.150238] [Hardware Error]: Error Addr: 0x0000001729eb00e0
Message from [email protected] at Apr 15 10:51:23 ...
kernel:[ 1873.150243] [Hardware Error]: MC4 Error (node 1): DRAM ECC error detected on the NB.
Message from [email protected] at Apr 15 10:51:23 ...
kernel:[ 1873.150274] [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout)
If possible, run it through memtester. That said, it could go either way: ECC memory can correct the bad data on the fly, so the error may never become visible to the test.
In general, this is exactly what ECC is for: so that the administrator does not have to worry about memory errors. The kernel even tells you so in your own log: "Corrected error, no action required."
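For what it's worth, the status word from the question can be decoded by hand to check that this really is the corrected kind of error. Below is a minimal Python sketch. Bits 63 to 57 are the standard machine-check status flags; reading bits 46:45 as the ECC field is my assumption, based on how the kernel prints CECC/UECC on AMD, and `decode_status` is just an illustrative name.

```python
# Decode the MC4_STATUS value from the log in the question.
# Bits 63..57 are the architectural machine-check status flags.
# ASSUMPTION: bits 46:45 are the ECC field (2 = corrected, 1 = uncorrected),
# matching the "CECC"/"UECC" labels the kernel prints on AMD systems.
STATUS_FLAGS = {
    63: "Val",
    62: "Over",
    61: "UC",
    60: "En",
    59: "MiscV",
    58: "AddrV",
    57: "PCC",
}


def decode_status(status: int) -> dict:
    """Return the named flags of a machine-check status word."""
    flags = {name: bool((status >> bit) & 1) for bit, name in STATUS_FLAGS.items()}
    ecc = (status >> 45) & 0x3
    flags["ecc"] = {0: "none", 1: "uncorrected", 2: "corrected"}.get(ecc, "unknown")
    return flags


if __name__ == "__main__":
    for name, value in decode_status(0xDC4A400053080813).items():
        print(f"{name:6} {value}")
```

With the value from the log, `UC` and `PCC` come out clear and the ECC field reads as corrected, which agrees with the `CE` and `CECC` labels the kernel printed. `Over` is set, meaning at least one earlier error was overwritten before it was logged.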
So my personal verdict: ignore it. If programs suddenly start crashing or the kernel panics, then it is time to pay attention.
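If you would rather keep an eye on it than ignore it completely, something like the following Python sketch can tally the messages. It assumes the log lines look like the ones in the question. Treating a `UE` or `UECC` field in the status brackets as "uncorrected" is my assumption about the kernel's format, and `summarize` and `SAMPLE` are hypothetical names.

```python
import re
from collections import Counter

# Patterns modelled on the log lines in the question.
ADDR_RE = re.compile(r"\[Hardware Error\]: Error Addr: (0x[0-9a-fA-F]+)")
STATUS_RE = re.compile(r"MC\d+_STATUS\[([^\]]*)\]")

# Two lines copied from the question, used as demo input.
SAMPLE = [
    "kernel:[ 1873.150228] [Hardware Error]: CPU:6 (10:8:0) "
    "MC4_STATUS[Over|CE|MiscV|-|AddrV|CECC]: 0xdc4a400053080813",
    "kernel:[ 1873.150238] [Hardware Error]: Error Addr: 0x0000001729eb00e0",
]


def summarize(lines):
    """Count corrected/uncorrected events and how often each address appears."""
    addresses = Counter()
    corrected = uncorrected = 0
    for line in lines:
        addr = ADDR_RE.search(line)
        if addr:
            addresses[addr.group(1).lower()] += 1
        status = STATUS_RE.search(line)
        if status:
            fields = status.group(1).split("|")
            # ASSUMPTION: "UE"/"UECC" mark an uncorrected error.
            if "UE" in fields or "UECC" in fields:
                uncorrected += 1
            else:
                corrected += 1
    return corrected, uncorrected, addresses


if __name__ == "__main__":
    corrected, uncorrected, addresses = summarize(SAMPLE)
    print(f"corrected: {corrected}, uncorrected: {uncorrected}")
    for address, count in addresses.most_common():
        print(f"{address}: {count}")
```

To run it on a real log, pass an open file object instead of `SAMPLE`. If the same address keeps coming up, or the uncorrected count ever goes above zero, that would be the moment to start pulling DIMMs.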