Tuesday, April 3, 2012

Details Behind the Failed 1940 Census Launch

1940census.archives.gov was down most of the official launch day

During the live launch event of the 1940 Census, Census Director Robert Groves was set to search for a member of his family. But the image never loaded, according to a story in the Chicago Tribune. Upon launch, the website 1940census.archives.gov was immediately overwhelmed. In the first three hours, the website had 22.5 million hits.

The National Archives contracted with Inflection LLC, parent of Archives.com, to host the website.

“We want to apologize to the millions of people who came to the 1940 census website this morning in search of information about their family history,” the company said in a statement. “We take full responsibility for the technical issues that have occurred and are very sorry for the inconvenience you may have experienced.”

The contract between the government and Inflection, the parent of Archives.com, specified that the company had to:

4.4.1 Support up to 10 million hits per day, while providing response times of less than three seconds for keyword searches of the descriptive metadata. A hit is defined as a request for a file from the web server.

4.4.2 Support up to 25,000 concurrent users.

4.4.3 Scale on demand in the event that 10 million hits and/or 25,000 concurrent users are exceeded.
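To put the launch numbers next to the contract numbers, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this post (22.5 million hits in the first three hours, and the 10 million hits per day in section 4.4.1). It assumes, purely for comparison, that the contract's daily figure is spread evenly over 24 hours, which the contract does not say. The function and variable names are my own.

```python
# Figures quoted in the post.
CONTRACT_HITS_PER_DAY = 10_000_000   # contract section 4.4.1
LAUNCH_HITS = 22_500_000             # hits in the first three hours
LAUNCH_WINDOW_HOURS = 3


def hits_per_second(hits, hours):
    """Average request rate over a window of the given length."""
    return hits / (hours * 3600)


# Assumption: the daily contract figure is spread evenly across 24 hours.
contract_rate = hits_per_second(CONTRACT_HITS_PER_DAY, 24)
launch_rate = hits_per_second(LAUNCH_HITS, LAUNCH_WINDOW_HOURS)
ratio = launch_rate / contract_rate

print(f"Contract average: {contract_rate:,.0f} hits/sec")
print(f"Launch average:   {launch_rate:,.0f} hits/sec")
print(f"Launch rate was about {ratio:.0f}x the contract's daily average")
```

Under that assumption, the site took more than twice its contracted daily volume in three hours, an average rate roughly 18 times the contract's daily average. Real traffic is never spread evenly, so treat this as an illustration rather than a measure of what the servers were built to handle.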

For “scale on demand,” Archives.com uses services from the online bookstore vendor Amazon.com. In addition to its online store, Amazon also provides “in the cloud” services to some websites. Theoretically, Amazon cloud services can be easily scaled to meet unexpected needs. Apparently, scaling up is not as seamless as advertised.

“We'd like to thank Amazon.com, who has been helping us with some of the scalability challenges we're tackling and lending important technical expertise,” said Archives.com.

Archives.com engineers worked through the night to fix issues, according to the company. Overnight, engineers disabled some functionality, hoping to relieve the load on the overburdened servers. As I write this midday Tuesday, the website is still not functional. The site displays text without graphics or formatting. Click the image at the top of the page to see what the page should look like.

The company continues to work today to solve issues.

“Genealogists often claim that theirs is the biggest hobby in America,” said NPR host Robert Siegel. “It's very hard to find hard data to support that, but this would come pretty close if there are that many millions of people who are trying to get in.”

3 comments:

  1. At 1000 EDT this morning I was able to download a 221MB .zip file of Enumeration District 82-25 in Wayne County, Michigan. It started out at a bit over 3.0 MB/sec and finished up just over 2 MB/sec, so I would say a very impressive performance. The .zip archive contained 52 images and I found my father-in-law and his siblings and parents on about the 20th image I looked at.

    Scrolling through them on my Macintosh once they'd all downloaded was certainly easier than trying to view them in succession on any website.

    So it's not all bad :-)

    Roger

  2. The NARA census website is working well now (Tuesday night, Pacific time), and my Twitter and Facebook friends have been reporting greatly improved site response starting mid-day, so it seems like they got things under control.

    They were hit with probably 4x their estimated traffic yesterday (it was 3.5x the estimate before 6 PM, so it was probably 4x or more by midnight), and then those massive numbers *increased* today, Tuesday, because of all the press attention. There's only so fast anyone can reasonably scale up with those kinds of numbers of visitors, and with such "heavy" (big high-res images) data.

    (Kind of reminds me of the Ellis Island database launch many years ago...remember how slow that was for *months*?)

    So yeah, maybe you could cool it a bit with the "ZOMG EPIC FAIL!" stuff. We all know that Ancestry.com really wanted to win the 1940 Census contract over Archives.com, and yet they didn't, so you certainly wouldn't want your readers accusing you guys of "sour grapes" now, right?

  3. Dear Asparagirl,

    I am not part of Ancestry.com nor do I represent them. My opinions are my own.

    When you say that "we all know that Ancestry.com really wanted..." I must confess I did not know. Actually, I doubted they had bid on the contract, so your information is news to me.

    Might I inquire what your source is for that news?

    --The Insider
