Friday, May 18, 2007

FamilySearch.org problems attributed to hard drive failures

Users of FamilySearch.org continue to experience the problems I mentioned yesterday. This morning I was greeted with this message:

[Image: FamilySearch.org error message]

DearMYRTLE published this explanation, an email that FamilySearch Support sent to one of her readers:

From: Support@FamilySearch.org
Sent: Thursday, May 17, 2007 8:29 AM
To: jcbrooks@...
Subject: Server overload

Dear ...

FSI (FamilySearch.org) is continuing to limp along. It is believed that the slowness is related to the database storage device (SAN) operating at a reduced rate to protect the data. The SAN had a couple of drives fail two weekends ago, which have been replaced, during the past week to 13 days. It is actively re-populating the new drives and redistributing the data across the new and old drives. The combination of data redistribution and the SAN operating at a reduced protective rate have slowed it to the point to where it cannot keep up with requests during peak hours. The requests have been throttled by taking the site offline when the queue becomes excessive to allow it to recover. It is estimated that this may continue to be the case until around Saturday. It is hoped that the system will return to normal when the new drives have been fully reintegrated. We are hopeful this will be the case, but will have to wait until the site is recovered for confirmation.

Even today, the network computer systems show signs of being well on the road to full recovery....

Sincerely,
FamilySearch Support
(epm)

[Image: flash drive]

What is a SAN?

SAN stands for storage area network. You might be familiar with flash drives or external hard drives. A SAN is a related concept in the world of server computers. But instead of connecting one external hard drive to one computer, a SAN allows you to connect an external hard drive to more than one server.

In the server world, external hard drives come in sets and are stored together in a storage device. Data is stored redundantly across the set, in a fashion that makes it possible to "hot swap" a drive from the set. That means you can pull a drive out--say, because it goes bad--and stick in a new one. The storage device is smart enough to rebuild the original data onto the new drive, all while FamilySearch.org continues to function! However, some of the time the storage device could spend sending data to visitors must instead be set aside for copying data to the new drive. Hence, the slowness.
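To make the "smart enough" part a little less magical, here is a minimal Python sketch of one common way a set of drives can rebuild a lost member: XOR parity, the idea behind RAID 5. This is only an illustration. The email doesn't say which scheme the FamilySearch SAN actually uses, and the drive contents and names below (data_drives, parity_drive, xor_blocks) are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data drives, each holding a four-byte block.
data_drives = [b"GENE", b"ALOG", b"Y!!!"]

# The parity drive stores the XOR of all the data drives.
parity_drive = xor_blocks(data_drives)

# Drive 1 goes bad and is pulled out.
failed = 1
survivors = [d for i, d in enumerate(data_drives) if i != failed]

# The replacement drive is refilled from the surviving drives plus parity.
rebuilt = xor_blocks(survivors + [parity_drive])
print(rebuilt)
```

Every byte on the new drive has to be recomputed by reading the matching bytes from all the other drives, which is why a rebuild competes with visitors for the storage device's attention.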

[Image: storage area network (SAN)]

The downside to all of this is that most of the time that could be used for copying data to the new drive must instead be set aside for sending data to visitors! Do you take FamilySearch.org completely offline and fix the problem in days, or do you leave it up and struggling and fix the problem in weeks?
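The days-versus-weeks trade-off is simple arithmetic, sketched below in Python. All of the numbers are hypothetical, chosen only to show the shape of the problem; I don't know the real size or speed of the FamilySearch SAN, and rebuild_days is just a helper name invented for this example.

```python
def rebuild_days(data_to_copy, throughput_per_day, rebuild_share):
    """Days to finish a rebuild when only rebuild_share (0 to 1) of the
    storage device's throughput is spent on copying; the rest serves visitors."""
    if not 0 < rebuild_share <= 1:
        raise ValueError("rebuild_share must be greater than 0 and at most 1")
    return data_to_copy / (throughput_per_day * rebuild_share)

# Hypothetical figures: 4 units of data to copy, 1 unit per day of throughput.
offline = rebuild_days(4.0, 1.0, 1.0)   # site offline: everything goes to the rebuild
online = rebuild_days(4.0, 1.0, 0.25)   # site up: a quarter goes to the rebuild

print(offline, online)  # 4.0 16.0
```

With these made-up numbers, keeping the site up stretches a four-day repair to sixteen days, but visitors get some access the whole time.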

Prevailing wisdom says.... some access is better than none.
