Gizmo Migration [12/17/2008]


Tony

This morning the Gizmo server suffered a drive failure and a RAID failure. After rebooting the machine we were eventually able to bring it back online, but we were alerted to failed stripes on Gizmo's RAID array. This is not fixable and requires a complete rebuild of the array. Rather than rebuilding the array and restoring from backups, we're going to migrate everyone to another node. We will start as soon as possible and will post updates as we move forward. Your IP addresses and logins will not change.

We hope to have everyone migrated before anything serious happens, but we have made another set of backups in case the server falls offline before we have migrated all VPSs off it.

Date: 12/17/2008

Start time (EST): 2:45pm

End time (EST): Unknown

Estimated Down Time: Unknown

Duration: Unknown

We've started migrating VPSs.

The only issues you may encounter are a reboot of your VPS and perhaps 5 minutes of downtime while we re-route your IPs to the new machine.

I cannot stress enough that this is required. The RAID array is bad, and the longer we sit on it the more we're playing with fire. We're accepting a few minutes of downtime now to make sure no one loses data or faces a much longer outage.

Just to update: this is not going as fast as we'd like, as we ran into some technical difficulties along the way. We're only about 33% done as we approach business hours today. We're going to push on today in hopes of having everything transferred before the end of the day.

You can find out whether you're on the new VPS node by running a traceroute to your VPS and checking that the second-to-last hop contains

vswitch2.wdc.arandomserver.com
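
If you'd rather script the check than read the output by eye, here is a rough Python sketch. It assumes the usual Linux traceroute text layout, where each hop line starts with its hop number. The function names are made up for illustration; only the hostname comes from the post above.

```python
import re

# Hostname quoted in the post above.
NEW_NODE_SWITCH = "vswitch2.wdc.arandomserver.com"


def hop_lines(traceroute_output: str) -> list:
    """Return the lines that begin with a hop number, in order.

    Assumes the common Linux traceroute layout; the header line and any
    continuation lines do not start with a bare number, so they are skipped.
    """
    return [
        line.strip()
        for line in traceroute_output.splitlines()
        if re.match(r"\s*\d+\s", line)
    ]


def on_new_node(traceroute_output: str, marker: str = NEW_NODE_SWITCH) -> bool:
    """True if the second-to-last hop line mentions the marker hostname."""
    hops = hop_lines(traceroute_output)
    if len(hops) < 2:
        # Not enough hops to have a "second-to-last" one.
        return False
    return marker in hops[-2]
```

The text to check could come from pasting your traceroute output into a string, or from something like `subprocess.run(["traceroute", host], capture_output=True, text=True).stdout`, where `host` is your own VPS address.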

We have found the cause of the routing issue. It is related to this bug report: http://bugzilla.openvz.org/show_bug.cgi?id=771

Basically, the VPS node is not sending out the ARP requests needed for the IPs to be routed to it. We've disabled the feature that was preventing the ARP requests from being sent. We're going to manually ARP everyone's IPs again. You may experience up to a minute of downtime on each IP.
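
For anyone curious what "ARPing an IP" looks like on the wire, here is a rough Python sketch of a gratuitous ARP request, which is one common way for a host to announce that it now owns an IP. This is an assumption for illustration: the post does not say which tool or ARP variant is being used on the node. The function names, MAC address and IP below are placeholders.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def gratuitous_arp_frame(mac: str, ip: str) -> bytes:
    """Build a broadcast Ethernet frame carrying a gratuitous ARP request.

    Illustrative only: sender IP and target IP are both set to `ip`,
    so neighbours update their ARP caches to map `ip` to `mac`.
    """
    src_mac = mac_to_bytes(mac)
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    eth_header = b"\xff" * 6 + src_mac + struct.pack("!H", 0x0806)

    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        src_mac,      # sender hardware address
        ip_bytes,     # sender protocol address
        b"\x00" * 6,  # target hardware address (unknown)
        ip_bytes,     # target protocol address (same IP: gratuitous)
    )
    return eth_header + arp_payload
```

The sketch only builds the 42-byte frame. Actually sending it needs a raw socket and root privileges on the node, which is left out here.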
