How to periodically monitor memory errors (EDAC, ECC) in Linux (is there a comprehensive solution for monitoring server health)?
Hello, I came across another monitoring task:
I need a script (or a service) that checks for memory errors once per hour and reports them to an alert channel (mail, messengers, etc.).
What are the solutions for this?
I found this article, but for some reason its script complains about a missing integer value. (Maybe that is expected.)
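For reference, here is a minimal sketch of the kind of check I mean. It assumes the kernel EDAC driver is loaded and exposes per-controller counters as `ce_count` and `ue_count` under `/sys/devices/system/edac/mc/mcN/`. The function names are my own, and the handling of missing or empty counter files is only a guess at what trips up the script from the article.

```python
from pathlib import Path

# Standard EDAC sysfs location; assumes an EDAC driver is loaded for this platform.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_counter(path):
    """Return the integer stored in a counter file, or None if it is missing or not a number."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def collect_edac_errors(root=EDAC_ROOT):
    """Map each memory controller directory (mc0, mc1, ...) to its CE and UE counts."""
    results = {}
    for mc in sorted(root.glob("mc[0-9]*")):
        results[mc.name] = {
            "ce_count": read_counter(mc / "ce_count"),  # corrected errors
            "ue_count": read_counter(mc / "ue_count"),  # uncorrected errors
        }
    return results


def format_alerts(results):
    """Build one alert line per controller that has a non-zero error counter."""
    alerts = []
    for name, counts in results.items():
        ce = counts["ce_count"] or 0  # treat an unreadable counter as zero
        ue = counts["ue_count"] or 0
        if ce or ue:
            alerts.append(f"{name}: corrected={ce} uncorrected={ue}")
    return alerts


if __name__ == "__main__":
    for line in format_alerts(collect_edac_errors()):
        print(line)  # pass this on to mail or a messenger of your choice
```

Run hourly (from cron, for example), this prints nothing when all counters are zero. The counters are cumulative since the driver was loaded, so a real check would probably need to store the previous values and alert only on an increase.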
More generally, I am looking for a comprehensive solution for monitoring server hardware that feeds Prometheus, with monitoring (and partial alerting) through Grafana, but so far almost everything is self-written: