Jupiter Issues [04/29/2009]


Tony

Right now we have datacenter technicians looking into this, as it appears to be a hardware problem. We have reason to believe it may be an issue with either the RAID card or even the motherboard. We'll post again when we have an ETA for things being back up.


Okay, so here's the latest I have after speaking with the guys working on it. I'll give the full story, though:

We rebooted the machine and noticed that the RAID card was reporting something funky. Usually that means we lost a drive or something along those lines: a bad drive spits out errors, which causes the file system to lock itself, we reboot, and the problem is solved. We'd just need to replace the bad drive, no worries. That is not what happened this time, though; the machine just refused to boot any further. That is when we needed a more hands-on look at the machine.

The datacenter technician working on the issue rebooted it again, noticed the same thing, and said, "Well, that's odd." Their reboot did get further, though, and we're now running an fsck on the file system. While that is happening, we will most likely also be replacing a faulty drive or two.

We hope it was just bad drives, and that the RAID card was sleeping on the job and did not throw the bad drives out of the array before they could cause issues.


To summarize, what happened is that the bad drive started throwing errors, which in turn messed up the file system. We can usually just reboot the machine and handle the issue from there; unfortunately, in this case we could not. On-site technicians were able to get the machine back into the OS after running an fsck (we had not even gotten it to boot that far). In the best-case scenario, the machine just keeps running as it should and we replace the bad drive without ever rebooting. That is not always how it goes, depending on what the bad drive does. We did not lose any data, however, so the RAID system did most of its job.

The drive reporting medium errors is being replaced right now, and we'll keep this open until the array has finished rebuilding.


This topic is now closed to further replies.