The Ancestry.com data center can store three petabytes of information. That’s the equivalent of a stack of DVDs piled three Empire State Buildings high, according to Ron Hair. (There were other bullet points that flew over my head: 3PAR for primary storage, Isilon & Nexsan, Hitachi, and finally NetApp for tier 3 storage.) I previously mentioned that a collection of 3,000 backup tapes containing a copy of all this data is stored in an underground vault, should catastrophe strike.
(You’ll recall that security was tight and we weren’t allowed to take pictures. As a result, please do not assume that I took these pictures with the camera hidden inside my umbrella. That one was out of film.)
With that much storage, and 6,000+ servers, one can expect that Ancestry.com eats through as much electricity in a month as many small Utah towns: 807,000 kilowatt-hours per month. Should the power fail, the facility is equipped with enough battery backup to keep things running for 15 minutes. However, after 7 to 10 seconds a diesel-powered electrical power plant comes online, powering the small town we know as Ancestry.com. One generator (like the one pictured above) can fail and the remaining two generators are strong enough to carry the load. A 1,600-gallon tank contains enough diesel to power the center for three days and can be refilled as necessary.
With that much energy being burned, you can imagine that quite a lot of heat is generated. The data center uses 16 huge air conditioning units, each of which could cool 160 homes. The aisles between the rack cabinets alternate between cold and hot. The cooling units blow cooled air into the cold aisles. Fans draw the cool air through the servers, where it absorbs heat from the electronics. The heated air is then expelled into the hot aisles, where it is sent back to the coolers. Each cold aisle is capped with a plastic covering that helps direct the cool air through the servers. You can see the plastic roof of the cold aisle in the photograph below. (Actually, this is a painting that I did with my photographic memory upon return home. Yep. That’s my story and I’m sticking with it.)
Monitoring 6,000+ servers is no easy task. This year, Hair merely mentioned it, but last year we got to tour the monitoring room. With a dozen monitors and multiple operators, the room looks not unlike the control room of a nuclear power plant or NASA’s mission control. (“Uh, Houston. We have a problem here.”) See the photograph… I mean… watercolor, below.
Ancestry.com’s Ra Database Server Software
Hair (or was it Mike Wolfgramm?) did take the time to explain a homegrown application that monitors and controls the 4,000+ genealogy data servers employed by the Ancestry.com website. The system is codenamed “Ra,” although its icon is the Eye of Horus.
Loads of generic servers are divvied up to handle requests to particular groups of Ancestry.com genealogy databases. One group might be birth, marriage, death databases. (Oops. The diagram was supposed to be labeled BMD!) Another group might handle military databases, and so forth. Database search requests are routed to the appropriate “stack” of servers which perform the search and return the results.
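For the programmers in the audience, here is a minimal sketch of what routing a search to the right stack might look like. To be clear, this is my own illustration and not Ancestry.com's code: the StackRouter class, the server names, and the round-robin choice of server within a stack are all assumptions. The briefing said only that requests go to the appropriate stack.

```python
from itertools import cycle


class StackRouter:
    """Hypothetical router: maps a database group to its stack of servers."""

    def __init__(self, stacks):
        # stacks: dict of group name -> list of server names (illustrative only)
        for group, servers in stacks.items():
            if not servers:
                raise ValueError(f"stack {group!r} has no servers")
        # Assumption: spread requests within a stack by simple round-robin.
        self._rotations = {group: cycle(servers) for group, servers in stacks.items()}

    def route(self, group):
        """Return the server that should handle the next search for this group."""
        if group not in self._rotations:
            raise KeyError(f"no stack handles group {group!r}")
        return next(self._rotations[group])


router = StackRouter({
    "bmd": ["bmd-01", "bmd-02"],   # birth, marriage, death databases
    "military": ["mil-01"],
})
print(router.route("bmd"))       # first server in the BMD stack
print(router.route("military"))
```

A real system would presumably weigh server load and health rather than just taking turns, but the lookup from database group to stack is the part described in the briefing.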
The servers are divided into three sets. The Live Servers set has all the servers currently in operation. If automatic monitoring software detects a problem with a server, that server is automatically moved to the Need Repairs set and is replaced with a server from the Available set. Technicians diagnose and fix the servers needing repair and return them to the Available set.
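The three-set shuffle described above is easy to picture in code. The following is a rough sketch under my own assumptions, not a peek at Ra itself: the ServerPools class, its method names, and the first-in, first-out choice of replacement are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ServerPools:
    """Hypothetical model of the Live, Available, and Need Repairs sets."""
    live: set = field(default_factory=set)
    available: deque = field(default_factory=deque)  # spares, oldest first
    need_repairs: set = field(default_factory=set)

    def report_failure(self, server):
        """Pull a failing live server and promote a spare, if one exists."""
        if server not in self.live:
            raise ValueError(f"{server!r} is not a live server")
        self.live.remove(server)
        self.need_repairs.add(server)
        if not self.available:
            return None  # no spare on hand; the stack runs one server short
        replacement = self.available.popleft()
        self.live.add(replacement)
        return replacement

    def mark_repaired(self, server):
        """A technician has fixed the server; return it to the spares."""
        if server not in self.need_repairs:
            raise ValueError(f"{server!r} is not awaiting repair")
        self.need_repairs.remove(server)
        self.available.append(server)


pools = ServerPools(live={"bmd-01", "bmd-02"}, available=deque(["spare-01"]))
replacement = pools.report_failure("bmd-02")
print(replacement, sorted(pools.live), sorted(pools.need_repairs))
```

Note that a repaired server goes back to the Available set rather than straight into service, which matches the cycle as it was explained to us.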
Last year we were told that a commercial vendor of a similar software program approached Ancestry.com to ply their wares. When the vendor saw the Ra system, they disappointedly announced that Ra was more sophisticated than their own offering.
I wish FamilySearch would give bloggers a similar briefing so I could contrast the two search systems. My suspicion is that Ancestry.com utilizes lots and lots of cheap servers with small databases while FamilySearch uses fewer, higher-powered systems with monolithic databases. But I don’t really know. Someday I’ll have to give you my theory of why a for-profit company is naturally driven to the former and a non-profit organization is naturally driven to the latter. But I digress…
Next week we’ll continue with a tour of Data Preservation Services, by Laryn Brown.