Tuesday, February 10, 2009

Standards of an archive-quality digital record repository

Recent online discussions among members of the Association of Professional Genealogists (APG) have raised questions about whether FamilySearch International, Ancestry.com, Footnote.com and others can provide archive-quality digital record repository services to patrons of archives such as the National Archives and Records Administration (NARA).

A user has raised charges of

  • missing images,
  • unindexed names,
  • incorrectly indexed names,
  • lack of quality control, and
  • lack of accountability.

(Randy Seaver has posted a summary of most of these charges.)

Speaking as an industry expert (using the word loosely!) without regard to what internal knowledge I may have about FamilySearch's strengths and weaknesses, these are some characteristics that I believe an archival-quality digital record repository ought to have:

  • A process that captures a count of the number of items in a physical collection, compares it to the item counts in the digital collection and posts all the results for patrons to examine. Ideally, where an independent item count exists, that number should also be supplied. The idea is to post publicly two sets of counts of names, images, or any other important items: one set with the expected numbers, the other with the actual numbers. If the numbers differ, somebody had better be able to explain why.
  • A process that captures the metadata for each item to allow the ability to browse through every item of a digital collection in the same order as the items in the physical collection. Each item should be displayed along with metadata that would indicate missing items. For example, images digitized from microfilm could have the linear position on the film along with the linear width of the image. If these numbers revealed unaccounted gaps, the metadata should include an explanation. A note would also be required for missing page numbers, missing certificate numbers, out-of-order items, alphabetical ordering anomalies, etc.
  • A process that allows users to correct (or at least mark) an index entry with incorrect information or an incorrect image. The corrections should be displayed with the corrected item. Uncorrected items should be so noted, again alongside the item. The total number of errors should be posted along with the inventory numbers.
  • A process that allows users to submit corrections to bad browse structures or incorrect/incomplete information in collection descriptions.
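
The first characteristic, posting expected counts next to actual counts, could be sketched roughly as follows. This is only an illustration of the idea. The names `CountRow`, `unexplained` and `format_report` are hypothetical, and the numbers are made up for the example; they do not describe any vendor's actual system or collection.

```python
from dataclasses import dataclass


@dataclass
class CountRow:
    """One line of a publicly posted inventory (hypothetical structure)."""
    item_type: str         # e.g. "images" or "names"
    expected: int          # count taken from the physical collection
    actual: int            # count found in the digital collection
    explanation: str = ""  # should be filled in whenever the counts differ

    @property
    def difference(self) -> int:
        return self.actual - self.expected


def unexplained(rows):
    """Return the rows whose counts differ and that carry no explanation."""
    return [r for r in rows if r.difference != 0 and not r.explanation.strip()]


def format_report(rows):
    """Build the text a patron would see: both counts, side by side."""
    lines = []
    for r in rows:
        line = (f"{r.item_type}: expected {r.expected}, "
                f"actual {r.actual}, difference {r.difference:+d}")
        if r.difference != 0:
            line += " (" + (r.explanation or "NO EXPLANATION GIVEN") + ")"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    rows = [
        CountRow("images", expected=1200, actual=1187),
        CountRow("names", expected=5400, actual=5400),
    ]
    print(format_report(rows))
    print("Rows needing an explanation:", [r.item_type for r in unexplained(rows)])
```

The point of keeping `explanation` on the same row as the counts is that a discrepancy and its reason are always published together, so a patron never sees one without the other.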

These MUST be automated processes. There are simply too many records for manual processes to keep up with the corrections.
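
As one example of the kind of automated check meant here, the microfilm case could be handled along these lines: given each image's linear position and width on the film, flag any gap that has no explanatory note. This is a minimal sketch under stated assumptions. The `Frame` record, the `find_unexplained_gaps` function, the millimeter units and the 2 mm tolerance are all hypothetical choices made for illustration.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    """Metadata for one image digitized from microfilm (hypothetical fields)."""
    image_id: str
    position_mm: float  # where the image starts along the film
    width_mm: float     # linear width of the image
    note: str = ""      # explanation of any gap just before this image


def find_unexplained_gaps(frames, tolerance_mm=2.0):
    """Return (previous_id, next_id, gap_mm) for each gap that is wider
    than the tolerance and has no note attached to the following image.

    The tolerance is an assumed allowance for normal spacing between frames.
    """
    ordered = sorted(frames, key=lambda f: f.position_mm)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.position_mm - (prev.position_mm + prev.width_mm)
        if gap > tolerance_mm and not cur.note.strip():
            gaps.append((prev.image_id, cur.image_id, gap))
    return gaps


if __name__ == "__main__":
    # Made-up positions, purely for illustration.
    film = [
        Frame("img-001", 0.0, 30.0),
        Frame("img-002", 31.0, 30.0),
        Frame("img-003", 95.0, 30.0),  # roughly one frame's worth of film is unaccounted for
    ]
    for before, after, gap in find_unexplained_gaps(film):
        print(f"Unexplained gap of {gap} mm between {before} and {after}")
```

The same pattern would apply to page numbers or certificate numbers: walk the items in physical order and report any jump that nobody has explained.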

Vendors with online collections having these characteristics would earn the confidence of record custodians, record owners and patrons.

There is another item that I know patrons want. As a veteran of the software industry, I also know the expense of providing it is currently beyond most organizations. But one day when this is a common feature in Customer Relationship Management (CRM) software, it should also be a requirement that an archival-quality record repository places on vendors that wish to digitize records:

  • A process that informs the patron when a correction is available to an item that the patron reported to be bad.

So, sorry, Ellen. The processes don't exist to allow you to know how soon you will be able to access the missing image for your Texas ancestor. And if I were you, I wouldn't hold my breath.
