[LRUG] Finding out why server was unresponsive
Andrew Stewart
boss at airbladesoftware.com
Tue Apr 16 02:42:16 PDT 2013
Good morning El Rug,
Yesterday afternoon one of my servers stopped responding to both HTTP and SSH. I eventually got it back by triggering a hardware reset via the web GUI of the host (Hetzner). I have no idea what the problem was.
The server runs Ubuntu 12.04 LTS. It's been in production for a couple of months and hadn't had any downtime before yesterday.
All the following logs were silent during the outage:
- unicorn.std{out,err}.log
- production.log
- /var/log/kern.log
- /var/log/syslog
- /var/log/auth.log
- /var/log/nginx/{access,error}.log
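To pin down exactly when each log went quiet, something like the following could be run over the syslog-format files (kern.log, syslog, auth.log). It is only a minimal sketch: the helper names are made up for illustration, it assumes the default Ubuntu timestamp prefix ("Apr 15 14:02:11"), and the year has to be passed in because that format does not record one.

```python
from datetime import datetime


def parse_syslog_time(line, year):
    """Parse the leading 'Apr 15 14:02:11' stamp of a syslog-style line.

    Returns None for lines that do not start with such a stamp.
    The year is an assumption supplied by the caller.
    """
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def largest_gap(lines, year):
    """Return (gap, start, end) for the longest silence between stamped lines.

    Returns None if fewer than two lines carry a parseable timestamp.
    """
    best = None
    prev = None
    for line in lines:
        ts = parse_syslog_time(line, year)
        if ts is None:
            continue  # skip continuation lines or anything unparseable
        if prev is not None and (best is None or ts - prev > best[0]):
            best = (ts - prev, prev, ts)
        prev = ts
    return best
```

Calling largest_gap(open("/var/log/syslog"), 2013) on each file and comparing the start times would show whether everything stopped logging at the same moment or one after another.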
I was running an mtr traceroute the whole time, which showed packets making it into the host's network but failing to reach my server.
New Relic shows nothing abnormal. Memory use and CPU load were low as usual.
I would dearly like to establish what happened so I can (try to) prevent it happening again, but I'm stumped.
Any ideas?
Many thanks in advance,
Andy Stewart
-------
http://airbladesoftware.com