Friday, September 26, 2008

What is a record?

For Ancestry.com databases, we sometimes speak of database records. The concept of a record differs a little between database programmers and genealogists, so it is worth explaining what an Ancestry.com database record is.

A record is the smallest displayable unit of a database. That seems easy enough, but let's show some particular examples.

| Database Type | A record is | Names per record | Example |
|---|---|---|---|
| Vital records | One event, such as a birth, marriage or death | One vital event typically names several people | England & Wales, FreeBMD Birth Index: 1837-1983 |
| Newspaper (images) | One page | Estimated | The New-York Times, 15 April 1865 |
| Book (images) | One page | Estimated | ANZAC Memorial, 1914-1918 |
| Folio | One folio (paragraph) | Estimated | Virginia Colonial Soldiers |
| Photograph | One photograph | Estimated | Public Member Photos |
| Map | One map | Estimated | Lewis & Clark's Journey |
| Table-style | One row | One or more | WWI Civilian Draft Registrations |
| Table and image | One row | One or more | 1880 US Federal Census |
| Image with special name handling | I'm not clear if it is one name or one image | Not clear | Who’s Who in Australia, 1921-1950; Yearbooks |
| Tree | One individual | One | Public Member Trees |
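
For readers who think in code, the table can be sketched as a small lookup in Python. This is an illustration only: the names `RecordDefinition`, `RECORD_DEFINITIONS` and `record_unit_for` are made up for this example and do not describe how Ancestry.com actually stores or models its databases. The values are copied from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecordDefinition:
    """One row of the table: what counts as a 'record' for a database type."""
    database_type: str
    record_unit: str
    names_per_record: str


# Hypothetical lookup built from the table; not an actual Ancestry.com schema.
RECORD_DEFINITIONS = [
    RecordDefinition("Vital records", "One event, such as a birth, marriage or death",
                     "One vital event typically names several people"),
    RecordDefinition("Newspaper (images)", "One page", "Estimated"),
    RecordDefinition("Book (images)", "One page", "Estimated"),
    RecordDefinition("Folio", "One folio (paragraph)", "Estimated"),
    RecordDefinition("Photograph", "One photograph", "Estimated"),
    RecordDefinition("Map", "One map", "Estimated"),
    RecordDefinition("Table-style", "One row", "One or more"),
    RecordDefinition("Table and image", "One row", "One or more"),
    RecordDefinition("Image with special name handling",
                     "Not clear (one name or one image)", "Not clear"),
    RecordDefinition("Tree", "One individual", "One"),
]

# Index by lower-cased database type so lookups ignore capitalization.
_BY_TYPE = {d.database_type.lower(): d for d in RECORD_DEFINITIONS}


def record_unit_for(database_type: str) -> Optional[str]:
    """Return what a single record is for the given database type, or None if unknown."""
    definition = _BY_TYPE.get(database_type.strip().lower())
    return definition.record_unit if definition else None


if __name__ == "__main__":
    for name in ("Tree", "Map", "Table and image"):
        print(f"{name}: {record_unit_for(name)}")
```

The point of the sketch is simply that "record" is defined per database type, so a count of records means something different for a census index (rows) than for a newspaper collection (pages).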

 

The table shows a database type of "folio." You may be aware that Ancestry.com was an outgrowth of Paul Allen's earlier company, Folio, which produced electronic books in which the smallest searchable unit was a "folio," typically a paragraph. Many of the old databases from the early days of Ancestry.com give me the distinct impression that they were either converted from Folio format or originally published on Ancestry.com using Folio's technology in some fashion.

5 comments:

  1. Thanks for clarifying this. The programmers have contributed to changing scholarly language to a programmer/marketing approach. IMHO the scholarly language should be used, not the programmers', when results of searches are shown.

    For example, often the link to 'view record' gives you someone's extract from an actual record, yet one is invited to 'save record' to the "shoebox" (one can never name the item saved in Shoebox, so can end up with 100 items with identical names). In the case of the really awful extracted databases, such as the so-called Delaware Marriages, which gives you no way to know what the original record actually was, this is annoying and even insulting to the genealogical researcher who cares about evidence and evidentiary citations.

    At least you have explained why search results that list ridiculous Tree entries call these 'records'. They are only records to the extent they record someone's opinion concerning an event or relationship, that was at one time entered in a gedcom file.

    All of this distortion of meaning only contributes to ignorance among newer researchers as to what actually constitutes a record.

  2. I forgot to mention that the word 'folio' means sheet of paper, *not* paragraph. Using it to mean paragraph is silly and downright erroneous.

  3. I agree with Geolover to the extent that programmers and marketing people may lead us astray by making poor application choices. However, I'll bet it is often unintentional; instead sometimes innocently caused by a lack of subject matter knowledge together with tight budget and/or time deadlines. To say nothing of the fact that some people are more creative and imaginative than others and thus able to craft a more elegant application.

    And maybe sometimes people just have a bad day. I love the Insider's work and have learned much from him, so certainly don't want to rile him or look a gift horse in the mouth.

    I enjoyed this post but also questioned the folio definition. After rereading, I'll bet he was probably referring to the "folio record" as designed and implemented by the Folio Corporation. But the sentence that really puzzled me was "A record is the smallest displayable unit of a database."

    Does the Insider really mean displayable or does he mean addressable? Either way, is a record the smallest unit? Isn't the smallest addressable unit dependent upon how much granularity, with regard to columns/fields within the row/record, the DBA chooses to implement?

    And the smallest displayable unit? Certainly the programmer could display part of a year column/field with a value of 2008 as simply 08 to save room on the screen or on paper (though this example is not really a good one for genealogy applications, of course). And a further extreme might be cited when one realizes that if they ever receive a bill or statement from an IBM mainframe EBCDIC computer system that -- should they be lucky enough to have a credit balance -- the minus sign being displayed may be based upon only a couple of the bits (or a nibble at most) of a packed-decimal byte.

    Again Insider, enjoy your work immensely and terminology is often interpreted in different ways in different cultures -- just throwing against the wall for discussion and hopefully clarification for all of us . . .

  4. Permit me to follow Geolover's lead in pulling a "Columbo" to come back with one more idea after reconsidering my post.

    I think perhaps the table within the original post could be improved by changing the column title "A record is" to "A data-entity is". [Such would call for certain changes in the "Names per record" column also, however.]

  5. Dear geolover and tyler,

    Thanks for your comments!

    -- The Insider
