Wednesday, August 10, 2016

Two Techniques for Healing Broken Links

Last Friday I presented two citations composed solely of broken URLs and I challenged you to write full citations for them. (See “Darned Image Citations.”) The two can be used to illustrate two different techniques for recovering from broken URLs.

Internet Archive

When you have a URL for a page that no longer exists, sometimes you can find an archived copy of the page. Use the Wayback Machine of the Internet Archive. The Internet Archive has made copies of many freely accessible pages that are not from the “dark web.” Copy a broken URL, go to https://archive.org/, and paste it into the Wayback Machine. Select the desired year and then select the date on the calendar. Sometimes the page is intelligible and sometimes it is not.
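
If you would rather script this lookup than click through the calendar, here is a minimal sketch. It assumes the Internet Archive's public availability endpoint at https://archive.org/wayback/available and the JSON response shape as I understand it (an "archived_snapshots" object holding a "closest" entry). The function names are my own, purely for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's Wayback availability API.
AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_query(broken_url, timestamp=None):
    """Build the query URL that asks whether a page has been archived."""
    params = {"url": broken_url}
    if timestamp:
        # YYYYMMDD; the service reports the snapshot closest to this date.
        params["timestamp"] = timestamp
    return AVAILABILITY_API + "?" + urlencode(params)


def closest_snapshot(response_json):
    """Return the archived copy's URL from a parsed response, or None."""
    closest = response_json.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def calendar_url(broken_url):
    """URL of the Wayback Machine calendar view listing every capture."""
    return "https://web.archive.org/web/*/" + broken_url


def lookup(broken_url, timestamp=None):
    """Ask the service over the network; returns a snapshot URL or None."""
    with urlopen(availability_query(broken_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling lookup() with a broken URL returns the address of the closest archived copy, or None when nothing was captured. Whether the archived page is intelligible is still something you have to judge by eye.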

The first citation in last week’s challenge consisted of this naked URL: http://www.mocavo.com/1910-United-States-Census/126211/004973415/607. The Mocavo website was deleted when Findmypast bought Mocavo. Mocavo was a free site, so the Wayback Machine might have archived the page. But because the page was part of a database, it seemed almost certain that the Wayback Machine had not.

In fact, we find the page was archived twice. Neither copy displays the image, but both display enough index data that we can search for and locate an image on another website. It is obvious from the URL that the page was from the 1910 U.S. Census. (As an aside, I’ve noticed that early FamilySearch DGS numbers all had nine digits beginning with 004, so the ID in the URL looks suspiciously like a FamilySearch DGS number.) The first three names on the page are:

Name             | Relation to Head of Household | Gender | Race  | Birth Year | Birthplace   | Father's Birthplace | Mother's Birthplace
Blanche Black    | Daughter                      | Female | White | 1884       | Pennsylvania | Pennsylvania        | Pennsylvania
Robert Black     | Son                           | Male   | White | 1892       | Pennsylvania | Pennsylvania        | Pennsylvania
Mildred Cornelus | Niece                         | Female | White | 1890       | Pennsylvania | Pennsylvania        | Pennsylvania

With this information, I can do an exact search of the 1910 census on FamilySearch.org for Robert Black, son, born 1892 in Pennsylvania with Blanche Black in the household. This matches only one person (who, by the way, appears on an image in digital folder number 004973415):

     1. 1910 U.S. census, Huntingdon County, Pennsylvania, population schedule, Huntingdon Borough, 2nd ward, enumeration district 67, sheet 6-A, family [124], line 2, Robert [Black]; digital image, FamilySearch (https://familysearch.org/ark:/61903/3:1:33S7-9RVF-13K : 11 November 2015), Pennsylvania > Huntingdon > Huntingdon Ward 2 > ED 67 > image 11 of 34.

Some notes:

  • I don’t often cite line number, but this is one of those odd ducks where the family is divided across two pages. This child appears near the top of the page and is not identified by family name. Citing the line number removes ambiguity.
  • I didn’t include the NARA publication number and roll number. I’m feeling that no one will use my citation to go look at microfilm. If they want to, there are multiple ways to determine roll number. As microfilm census access evaporates, microfilm information becomes unnecessary and census citations will evolve accordingly.
  • I didn’t include the FamilySearch collection name. I figure most people can find the 1910 U.S. census on FamilySearch.org without knowing the exact name. Besides, the URL will take them directly to the image within the collection.
  • I cited the URL of the image because it has “ark:” in it. That means FamilySearch intends to keep that URL from breaking.
  • I cited the publication date rather than the access date. As a general rule, cite publication date when available and access date when not.
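
Two of the URL observations in this section can be checked mechanically: the aside that the Mocavo URL contains something shaped like an early DGS number, and the note that the FamilySearch URL contains “ark:”. The sketch below is only an illustration. The nine-digits-beginning-with-004 pattern is my own observation from the aside above, not a published rule, and the function names are made up for this example.

```python
import re
from urllib.parse import urlsplit

MOCAVO_URL = "http://www.mocavo.com/1910-United-States-Census/126211/004973415/607"
FAMILYSEARCH_URL = "https://familysearch.org/ark:/61903/3:1:33S7-9RVF-13K"

# Assumption from the aside above: nine digits, beginning with 004.
DGS_LIKE = re.compile(r"^004\d{6}$")


def path_segments(url):
    """Return the non-empty pieces of the URL path, split on '/'."""
    return [seg for seg in urlsplit(url).path.split("/") if seg]


def dgs_candidates(url):
    """Return path segments that look like an early DGS number."""
    return [seg for seg in path_segments(url) if DGS_LIKE.match(seg)]


def has_ark(url):
    """True if the URL path carries an ARK identifier ("ark:")."""
    return "ark:" in urlsplit(url).path


print(path_segments(MOCAVO_URL))
print(dgs_candidates(MOCAVO_URL))
print(has_ark(FAMILYSEARCH_URL), has_ark(MOCAVO_URL))
```

A match on the pattern is only a hint. It took the search on FamilySearch.org to confirm that 004973415 really is the digital folder number.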

 

URL Poking

When a URL breaks, sometimes looking at the URL gives useful information. One URL convention specifies the main address of a page before a question mark, followed by options separated by the ampersand (&) character.

The second URL in last week’s challenge was a broken URL from Ancestry.com: http://www.ancestry.com/search/io/browse.asp?c=8 & state=Vermont & county=Addison & township=Bristol & ed= & roll=M33_126 & STAbrv=VT & startimg=30 & endimg=42 & rp=42 & hash=1670352374 & width=2877 & height=5089 & levels=5 & colorspace=Grayscale

Among the options, we see these values:

  • state=Vermont
  • county=Addison
  • township=Bristol
  • ed=
  • roll=M33_126
  • STAbrv=VT
  • startimg=30
  • endimg=42
  • rp=42
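
The same poking can be done with Python’s standard library. This is a minimal sketch, not a general-purpose tool: it strips the spaces printed around the ampersands in the URL above, splits the address at the question mark, and collects the options into a dictionary. The function name poke is my own invention.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# The broken Ancestry URL exactly as printed above, spaces included.
BROKEN_URL = (
    "http://www.ancestry.com/search/io/browse.asp?c=8 & state=Vermont"
    " & county=Addison & township=Bristol & ed= & roll=M33_126 & STAbrv=VT"
    " & startimg=30 & endimg=42 & rp=42 & hash=1670352374 & width=2877"
    " & height=5089 & levels=5 & colorspace=Grayscale"
)


def poke(url):
    """Split a URL into its page path and a dict of its options."""
    # Remove any whitespace around the ampersands before parsing.
    cleaned = re.sub(r"\s*&\s*", "&", url.strip())
    parts = urlsplit(cleaned)
    # keep_blank_values=True retains empty options such as "ed=".
    options = dict(parse_qsl(parts.query, keep_blank_values=True))
    return parts.path, options


path, options = poke(BROKEN_URL)
print(path)
for name, value in options.items():
    print(f"{name}={value}")

# The text before the underscore is the part that looks like a
# NARA microfilm publication number.
publication = options["roll"].split("_")[0]
print(publication)
```

Keeping the blank values matters here: an empty ed= is itself a clue, since it suggests the record has no enumeration district.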

Census microfilm junkies will recognize M33 as a NARA microfilm publication number. A Google search for [nara microfilm publication m33] indicates that M33 is the 1820 U.S. federal census. The challenge was to cite Pearis Raymond. I can do an exact search of the 1820 census on Ancestry.com for Pearis Raymond living in Bristol, Addison, Vermont. This matches only one person:

     2. 1820 U.S. census, Addison County, Vermont, population schedule, Bristol, page 69-B, 3rd name from bottom, Pearis Raymond; digital image, Ancestry (http://search.ancestry.com/search/db.aspx?dbid=7734 : updated 31 May 2013), Vermont > Addison > Bristol > image 9 of 9.

Notes:

  • For the same reasons as the FamilySearch.org citation, I left off the microfilm information and exact database title.
  • I went out on a limb and cited the URL of the database rather than the home page. Ancestry has never made any public commitment to make any of their URLs persistent. If it works, it’s an added convenience. If it doesn’t, most people can still get to Ancestry.com.
  • I questioned whether to specify an access date or a publication date since most people don’t know how to find publication dates of Ancestry’s databases. (Look up the database in the catalog and hover over the title.) In the end I figured the publication (update) date was still more useful than an access date.

The final challenge was to adapt what was essentially a microfilm citation to reference Lewis Rapp on FamilySearch.org. There was a wrinkle. FamilySearch uses a bad index of that page that they obtained from Fold3. Fold3 had an illegible image, so Lewis Rapp was indexed as “[illegible Rapp].” FamilySearch thus has a good image but a bad index. Until just a few weeks ago, Ancestry had the opposite: bad image, good index. Now Ancestry has the superior offering.

     3.  1860 U.S. census, Jackson County, Ohio, population schedule, Scioto Township, p. 62, family 426, Lewis Rapp; NARA microfilm publication M634, roll 992; digital image, FamilySearch (https://familysearch.org/ark:/61903/3:1:33SQ-GBSD-9PGS : 8 April 2016), Ohio > Jackson > Scioto Township > image 27 of 38.

Of course, these are not the only acceptable citations. I’ve noted some of my judgement calls; you may have made different ones. Just keep in mind: Citations communicate concisely, with clarity and consistency.
