A spokesperson for NARA wanted to make it clear that Ancestry.com had unilaterally digitized and published more than 300 NARA microfilm collections, with over 70 million images, before entering into a contractual relationship with NARA.
Ancestry.com digitized, indexed, and placed these images online using NARA microfilm publications that are available to anyone by purchase from NARA. This was strictly the work of Ancestry[.com], with no involvement, oversight, or quality assurance work by NARA.
NARA has posted a list of all of its collections that are online at partner websites, whether a collection was produced before or after the agreements. The list is at http://www.archives.gov/digitization/digitized-by-partners.html and will be updated on a regular basis. But I digress…
At a digitization facility in Silver Spring, the two have started digitizing records under the contractual arrangement. NARA preps records for scanning, Ancestry.com does the scanning, and NARA conducts quality control checks. According to NARA,
Both staffs ensure that every page has been imaged. NARA does a page-by-page quality control check on 5% of the boxes scanned. If a problem arises, mistakes are rectified immediately and the percentage of review on that camera operator’s work is increased. An operator must image two consecutive boxes perfectly before the audit returns to the 5% level.
For a recent project, 9 boxes out of 130 were checked. The highest error rate for any one box was 4 pages missing for every 1,000 pages. Missing pages were immediately digitized before processing continued. The overall error rate for all the boxes reviewed was 7 missing pages out of every 10,000. Most missing documents hadn’t actually been skipped, but scanning failed to pick up a light stamp or mark on an otherwise blank page.
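The audit rule NARA describes can be sketched in a few lines of Python. This is only an illustration, not NARA's actual procedure: the class name `OperatorAudit` and its method are hypothetical, and the escalated review rate of 100% is an assumption, since NARA says only that the percentage "is increased."

```python
from dataclasses import dataclass

BASE_RATE = 0.05          # from the post: 5% of boxes get a page-by-page check
ESCALATED_RATE = 1.0      # assumption: the post does not say how far the rate is raised
CLEAN_BOXES_REQUIRED = 2  # from the post: two consecutive perfect boxes


@dataclass
class OperatorAudit:
    """Tracks the review rate applied to one camera operator (hypothetical model)."""
    rate: float = BASE_RATE
    clean_streak: int = 0

    def record_checked_box(self, missing_pages: int) -> None:
        """Update the review rate after a box has been checked page by page."""
        if missing_pages > 0:
            # A problem was found: raise the review rate and restart the streak.
            self.rate = ESCALATED_RATE
            self.clean_streak = 0
        elif self.rate > BASE_RATE:
            # A perfect box while under increased review counts toward the streak.
            self.clean_streak += 1
            if self.clean_streak >= CLEAN_BOXES_REQUIRED:
                self.rate = BASE_RATE
                self.clean_streak = 0


def error_rate_percent(missing_pages: int, total_pages: int) -> float:
    """Convert a 'missing per total' figure into a percentage."""
    return 100.0 * missing_pages / total_pages
```

With this helper, the two figures quoted above, 4 missing pages per 1,000 and 7 per 10,000, work out to about 0.4% and 0.07% respectively.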
“NARA considers that digitizing thousands of documents and having them available online with unprecedented indexing is worth the small percentage of error.” As one attempts to drive the error rate toward zero, the cost explodes. In other words, dropping the error rate from 0.07% to 0.007% might cost 100 times as much, and dropping it to 0.0007% might cost 10,000 times as much.
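To make that arithmetic concrete, here is a rough sketch. The two illustrative figures above happen to fit a rule in which each tenfold cut in the error rate multiplies cost by 100, that is, cost grows with the square of the improvement factor. That rule, the function name `relative_cost`, and the `exponent` parameter are assumptions fitted to those two numbers, not a measured cost model.

```python
BASE_ERROR_RATE = 7 / 10_000  # 0.07%, the overall rate NARA reported


def relative_cost(target_error_rate: float,
                  base_error_rate: float = BASE_ERROR_RATE,
                  exponent: float = 2.0) -> float:
    """Cost multiplier for reaching target_error_rate, relative to today's cost.

    Assumed model: cost scales as (improvement factor) ** exponent.
    exponent=2.0 reproduces the illustrative 100x and 10,000x figures.
    """
    if target_error_rate <= 0:
        raise ValueError("a zero error rate has no finite cost in this model")
    improvement_factor = base_error_rate / target_error_rate
    return improvement_factor ** exponent


if __name__ == "__main__":
    for target in (0.0007, 0.00007, 0.000007):
        print(f"{target:.4%} error rate -> about {relative_cost(target):,.0f}x the cost")
```

Under this assumed model, a target of exactly zero errors has no finite cost at all, which is the point of the paragraph above.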
While NARA seems to feel that the current quality/cost ratio is acceptable, a spokesperson made it clear: “NARA does not want errors.” Next week I’ll tell you what NARA officials recommend you do when you come across problems.