linux
Z0nd0R, 2014-02-19 22:39:24

Why does the server randomly hang on CentOS 6.5?

Good day!
I have a server at Hetzner, EX4 configuration.
It recently started hanging spontaneously, about once a day, at irregular intervals.
I could not find any pattern in the logs before the crashes (maybe I lack the skills).
The only change was a software update about a month ago, which also updated the kernel.
It hangs completely, so automatic reboot requests have no effect.
The most interesting part is that the load on the server is almost zero, and nothing unusual runs on it:
PHP, Apache, Nginx, ClamAV, ISPMgr, Ruby, MySQL.
The first hang happened on the 16th.

reboot   system boot  2.6.32-431.5.1.e Wed Feb 19 13:55 - 14:01  (00:06)
reboot   system boot  2.6.32-431.5.1.e Tue Feb 18 08:23 - 14:01 (1+05:38)
reboot   system boot  2.6.32-431.5.1.e Mon Feb 17 08:23 - 14:01 (2+05:38)
reboot   system boot  2.6.32-431.el6.x Sun Feb 16 01:15 - 14:01 (3+12:46)

Any ideas what could be the problem?
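
To see how irregular the hangs really are, the boot times above can be turned into gaps between consecutive boots. This is only a rough sketch: `last` does not print the year, so 2014 is assumed from the post date, and the function names are made up for illustration. Each gap is an upper bound on how long the server ran before hanging, since it also includes the time the machine sat hung until someone rebooted it.

```python
import re
from datetime import datetime

# Boot lines copied from the `last reboot` output in the question.
LAST_OUTPUT = """\
reboot   system boot  2.6.32-431.5.1.e Wed Feb 19 13:55 - 14:01  (00:06)
reboot   system boot  2.6.32-431.5.1.e Tue Feb 18 08:23 - 14:01 (1+05:38)
reboot   system boot  2.6.32-431.5.1.e Mon Feb 17 08:23 - 14:01 (2+05:38)
reboot   system boot  2.6.32-431.el6.x Sun Feb 16 01:15 - 14:01 (3+12:46)
"""

# Explicit month table so parsing does not depend on the system locale.
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

LINE_RE = re.compile(
    r"^reboot\s+system boot\s+(?P<kernel>\S+)\s+\w{3}\s+"
    r"(?P<mon>\w{3})\s+(?P<day>\d+)\s+(?P<hh>\d{2}):(?P<mm>\d{2})"
)


def parse_boots(text, year=2014):
    """Return (boot_time, kernel) tuples, oldest first. The year is an assumption."""
    boots = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        when = datetime(year, MONTHS[match["mon"]], int(match["day"]),
                        int(match["hh"]), int(match["mm"]))
        boots.append((when, match["kernel"]))
    return sorted(boots)


def boot_gaps(boots):
    """Return (time since previous boot, kernel of the later boot) pairs."""
    return [(later[0] - earlier[0], later[1])
            for earlier, later in zip(boots, boots[1:])]


if __name__ == "__main__":
    for gap, kernel in boot_gaps(parse_boots(LAST_OUTPUT)):
        print(f"{gap} since the previous boot, then booted {kernel}")
```

The kernel column is worth a glance too: the oldest boot in the list ran 2.6.32-431.el6, while the later ones ran 2.6.32-431.5.1.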

6 answer(s)
Z0nd0R, 2014-10-13
@Z0nd0R

Eight months have passed since I asked this question. About a month ago I ran out of patience and wrote to Hetzner technical support about the problem.
I asked them to run diagnostics on the hard drives and look into why the server was crashing.
An hour after the diagnostics started, technical support replied.
They wrote that the drives were fine and that they had updated the BIOS on the motherboard. As if by magic, the hangs stopped. It has been a month now without a single crash.

brutal_lobster, 2014-02-20
@brutal_lobster

Enable kernel crash dumps.
https://access.redhat.com/site/documentation/en-US...
Then go through the possible errors.
www.dedoimedo.com/computers/crash-analyze.html
The dump will most likely show the cause of the hangs. If the problem is in the hardware, show the traces and logs to the on-site engineers, and they will replace something.
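
Before waiting for the next hang, it helps to confirm that crash dumping is actually armed. The sketch below assumes the kdump mechanism, which needs memory reserved through a `crashkernel=` kernel parameter and a loaded capture kernel. It only reads `/proc/cmdline` and `/sys/kernel/kexec_crash_loaded` and does not configure anything. It is written for Python 3, and the helper names are hypothetical.

```python
def crashkernel_setting(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


def read_text(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text("/proc/cmdline") or ""
    reserved = crashkernel_setting(cmdline)
    # The kernel reports "1" here once a capture kernel has been loaded.
    loaded = read_text("/sys/kernel/kexec_crash_loaded") == "1"

    print("crashkernel parameter:", reserved if reserved else "not set")
    print("capture kernel loaded:", "yes" if loaded else "no")
```

If either line reports a negative result, a hang will most likely leave no dump behind to analyze.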

grossws, 2014-02-20
@grossws

Post the dmesg output here, for starters.

m1rl0b, 2014-02-20
@m1rl0b

Have you tried rolling back and booting the old kernel?

Dmitry Alekseev, 2014-02-24
@dalexeyev

Check the sar output for load spikes or high iowait before the crash.
It is also worth looking at memory usage just before the crash, ideally monitored from outside, for example via SNMP.
Look carefully at the messages log: was there any system activity while the server was unavailable? Maybe the network controller dropped out?
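
As a rough illustration of the sar check, the sketch below scans the text output of `sar -u` for samples where `%iowait` crosses a threshold. The sample data, host name, and the 20% threshold are invented for the example and are not from the server in question. It also assumes sar printed numbers with a dot as the decimal separator.

```python
# Hypothetical `sar -u` output; the values are made up for illustration.
SAMPLE = """\
Linux 2.6.32-431.5.1.el6.x86_64 (example-host)  02/19/2014  _x86_64_  (8 CPU)

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      0.52      0.00      0.21      0.10      0.00     99.17
12:20:01 AM     all      0.48      0.00      0.25     37.40      0.00     61.87
12:30:01 AM     all      0.50      0.00      0.22      0.12      0.00     99.16
Average:        all      0.50      0.00      0.23     12.54      0.00     86.73
"""


def iowait_spikes(sar_text, threshold=20.0):
    """Return (timestamp, iowait) pairs for samples at or above the threshold."""
    column = None      # position of %iowait, taken from the header row
    stamp_fields = 1   # number of leading fields that form the timestamp
    spikes = []
    for line in sar_text.splitlines():
        fields = line.split()
        if "%iowait" in fields:
            column = fields.index("%iowait")
            if "CPU" in fields:
                stamp_fields = fields.index("CPU")
            continue
        if column is None or len(fields) <= column:
            continue  # banner, blank line, or a "LINUX RESTART" marker
        if fields[0].startswith("Average"):
            continue  # summary row, not a sample
        try:
            value = float(fields[column])
        except ValueError:
            continue
        if value >= threshold:
            spikes.append((" ".join(fields[:stamp_fields]), value))
    return spikes


if __name__ == "__main__":
    for timestamp, value in iowait_spikes(SAMPLE):
        print(f"{timestamp}: iowait {value:.2f}%")
```

The column position is read from the header row, so the same code copes with both 24-hour and AM/PM timestamps.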

1serfer, 2014-08-15
@1serfer

I ran into a similar problem.
Go through the logs of all the software you listed:
PHP, Apache, Nginx, Ruby, MySQL
