Friday, June 18, 2010

NARA Responds to Ancestry.com Issues

Microfilm Documents Missing

The National Archives and Records Administration (NARA) recently responded to accusations that Ancestry.com posts NARA record collections with numerous quality problems, including missing documents.

A spokesperson for NARA wanted to make it clear that Ancestry.com unilaterally digitized and published more than 300 NARA microfilm collections, with over 70 million images, before entering into a contractual relationship with NARA.

Ancestry.com digitized, indexed, and placed these images online using NARA microfilm publications that are available to anyone by purchase from NARA.  This was strictly the work of Ancestry[.com], with no involvement, oversight, or quality assurance work by NARA.

NARA has posted a list of all its collections that are online at partner websites, whether the collection was produced before or after the agreements. The list is at http://www.archives.gov/digitization/digitized-by-partners.html and will be updated on a regular basis. But I digress…

At a digitization facility in Silver Spring, the two have started digitizing records under the contractual arrangement. NARA preps records for scanning, Ancestry.com does the scanning, and NARA conducts quality control checks. According to NARA,

Both staffs ensure that every page has been imaged. NARA does a page-by-page quality control check on 5% of the boxes scanned. If a problem arises, mistakes are rectified immediately and the percentage of review on that camera operator’s work is increased. An operator must image two consecutive boxes perfectly before the audit returns to the 5% level.

For a recent project, 9 boxes out of 130 were checked. The highest error rate for any one box was 4 pages missing for every 1,000 pages. Missing pages were immediately digitized before processing continued. The overall error rate for all the boxes reviewed was 7 missing pages out of every 10,000. Most of the missing pages hadn’t actually been skipped; rather, the scan failed to pick up a light stamp or mark on an otherwise blank page.
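To put those figures on a common footing, here is a small Python sketch that converts them to percentages. The variable names are mine, and the numbers are simply the ones quoted above.

```python
# Figures quoted above; variable names are illustrative only.
boxes_checked, boxes_total = 9, 130
worst_box_missing, worst_box_pages = 4, 1_000
overall_missing, overall_pages = 7, 10_000


def as_percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


sample_share = as_percent(boxes_checked, boxes_total)
worst_box_rate = as_percent(worst_box_missing, worst_box_pages)
overall_rate = as_percent(overall_missing, overall_pages)

print(f"Boxes audited: {sample_share:.1f}%")
print(f"Worst box:     {worst_box_rate:.2f}% of pages missing")
print(f"Overall:       {overall_rate:.2f}% of pages missing")
```

The audited share comes out somewhat above the 5% baseline. That would be consistent with the review percentage having been raised for some operators, though NARA did not say whether that happened on this project.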

“NARA considers that digitizing thousands of documents and having them available online with unprecedented indexing is worth the small percentage of error.” As one attempts to drive the error rate to 0, the cost rises steeply. In other words, dropping the error rate from 0.07% to 0.007% might cost 100 times as much, and dropping it to 0.0007% might cost 10,000 times as much.
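The two cost examples in the paragraph above happen to fit a simple rule of thumb in which cost grows with the square of the improvement factor. That rule is my own assumption for illustration, not anything NARA or Ancestry.com has published; a rough sketch:

```python
# Hypothetical cost model: relative cost = (improvement factor) ** exponent.
# The exponent of 2 is an assumption chosen only to match the two examples
# in the text; real digitization costs may behave quite differently.
BASELINE_RATE = 0.07  # percent of pages missing, from the audit figures


def relative_cost(target_rate, baseline_rate=BASELINE_RATE, exponent=2):
    """Cost multiplier for reaching target_rate, relative to the baseline."""
    improvement = baseline_rate / target_rate
    return improvement ** exponent


for target in (0.07, 0.007, 0.0007):
    print(f"{target}% error rate -> about {relative_cost(target):,.0f}x the cost")
```

Changing the `exponent` argument shows how sensitive the conclusion is to the assumed shape of the cost curve.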

While NARA seems to feel that the current quality/cost ratio is acceptable, a spokesperson made it clear, “NARA does not want errors.” Next week I’ll tell you what NARA officials recommend you do when you come across problems.

3 comments:

  1. Good to see NARA stepping up, because Ancestry seldom does!

    Thanks for letting us know...can't wait to see what the public can do and if it will really make a difference! Their indexing is poor due to several reasons, I'm sure...and the volunteers (or whoever transcribes) lack the ability to read old script. Missing census pages are frustrating!

  2. Reader Paul Brown alerted me that Silver Spring is singular. I have corrected the article online. Thanks, Paul!

    -- The Insider

  3. Insider,

    NARA's position that digitizing records on that level is "worth" a small amount of error seems reasonable on the surface. But the question is whether that error rate is in line with the industry-wide rate, or higher.

    Several other questions also come to mind:

    1) What is Family Search's error rate in digitization?

    2) Does NARA make Ancestry commit to fixing an error when it is found *after* the main digitization has taken place?

    3) Does NARA recognize the difference between intentional errors and unintentional? Ancestry is intentionally not following best practices in digitizing blank and spoiled pages, so is NARA even counting that in the error rate?

    Ancestry has shown a high tolerance for error, a refusal to come back and fix such errors within a reasonable timeframe, and a refusal to adopt best practices. So NARA should be looking at them with a more critical eye than others.

    MikeF
