Neptune Raid Issues [02/25/2009]



Posted

After several instances of file systems on the Neptune server going read-only, we are performing emergency maintenance to resolve the issue. The RAID card in the system is running abnormally hot, so we suspect either a cooling problem or a faulty card. We will take the machine offline within the next hour to get to the bottom of this. We will update this post as we have more information, such as when the machine goes down and how long it will be down.

Posted

The machine is online at the moment. Please do not generate backups: they will fail and cause us more problems as we prepare to take the machine offline to investigate and fix the root cause of these issues.

Posted

The machine will go offline at about 11:00 PM EST and should be down for no more than 30 minutes while we check that all of its fans are running. Any that are not will be replaced at that time. We will also replace the RAID card as a precaution; even if it turns out to be good, we would rather not take the risk.

Posted

So much for doing this in a decent timeframe. We are taking the machine offline right now, as it has just blown up again. We will also run an fsck on the partitions after the RAID card is replaced, as we have no idea what damage it may have done. The RAID card replacement will take 30 minutes, and the fsck will probably take an hour. Unfortunately, this is necessary to get back to a stable system.

Posted

All the fans were working fine, but we replaced the RAID card anyway, as it was not running at an optimal temperature, which could suggest a problem.

We are now running an fsck, which could take up to an hour.
