The Ancestry Insider: Two Techniques for Healing Broken Links

Wednesday, August 10, 2016

Last Friday I presented two citations composed solely of broken URLs and I challenged you to write full citations for them. (See “Darned Image Citations.”) The two can be used to illustrate two different techniques for recovering from broken URLs.

Internet Archive

When you have a URL for a page that no longer exists, you can sometimes find an archived copy in the Wayback Machine of the Internet Archive. The Internet Archive has made copies of many freely accessible pages that are not part of the “deep web” (pages, such as database search results, that a crawler cannot reach by following links). Copy the broken URL, go to https://archive.org/, and paste it into the Wayback Machine. Select the desired year and then select a date on the calendar. Sometimes the archived page is intelligible and sometimes it is not.
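The same lookup can be sketched in a few lines of Python. This is only an illustration, not part of the original walkthrough: the function names are made up for this example, and it assumes the Wayback Machine's usual address pattern (https://web.archive.org/web/*/ followed by the original URL) and its public availability endpoint (https://archive.org/wayback/available), which answers in JSON. The sketch only builds the addresses and reads a response you have already fetched; it does not contact the Internet Archive itself.

```python
import json
from urllib.parse import urlencode

# Assumed address patterns for the Wayback Machine (not taken from the post).
WAYBACK_WEB = "https://web.archive.org/web"
AVAILABILITY_API = "https://archive.org/wayback/available"


def calendar_url(broken_url: str) -> str:
    """Address of the calendar page that lists every capture of broken_url."""
    return f"{WAYBACK_WEB}/*/{broken_url}"


def availability_query(broken_url: str, timestamp: str = "") -> str:
    """Address that asks for the capture closest to timestamp (YYYY or YYYYMMDD)."""
    params = {"url": broken_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_API}?{urlencode(params)}"


def closest_snapshot(response_text: str):
    """Return the archived URL named in an availability response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


if __name__ == "__main__":
    mocavo = "http://www.mocavo.com/1910-United-States-Census/126211/004973415/607"
    print(calendar_url(mocavo))
    print(availability_query(mocavo, "2015"))
```

Pasting the first printed address into a browser should land on the same calendar view described above. The second address is meant for scripts, and closest_snapshot() would pick the archived URL out of whatever it returns.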

The first citation in last week’s challenge consisted of this naked URL: http://www.mocavo.com/1910-United-States-Census/126211/004973415/607. The Mocavo website was taken down after Findmypast bought Mocavo. Mocavo was a free site, so the Wayback Machine might have archived the page. But because the page was generated from a database, the Wayback Machine might just as easily have missed it.

In fact, we find the page was archived twice. Neither copy displays the image, but both display enough index data that we can search for and locate the image on another website. It is obvious from the URL that the page was from the 1910 U.S. census. (As an aside, I’ve noticed that early FamilySearch DGS numbers all had nine digits beginning with 004, so the ID in the URL looks suspiciously like a FamilySearch DGS number.) The first three names on the page are:

Name	Relation to Head of Household	Gender	Race	Birth Year	Birthplace	Father's Birthplace	Mother's Birthplace
Blanche Black	Daughter	Female	White	1884	Pennsylvania	Pennsylvania	Pennsylvania
Robert Black	Son	Male	White	1892	Pennsylvania	Pennsylvania	Pennsylvania
Mildred Cornelus	Niece	Female	White	1890	Pennsylvania	Pennsylvania	Pennsylvania

With this information, I can do an exact search of the 1910 census on FamilySearch.org for Robert Black, son, born 1892 in Pennsylvania with Blanche Black in the household. This matches only one person (who, by the way, appears on an image in digital folder number 004973415):

1. 1910 U.S. census, Huntingdon County, Pennsylvania, population schedule, Huntingdon Borough, 2nd ward, enumeration district 67, sheet 6-A, family [124], line 2, Robert [Black]; digital image, FamilySearch (https://familysearch.org/ark:/61903/3:1:33S7-9RVF-13K : 11 November 2015), Pennsylvania > Huntingdon > Huntingdon Ward 2 > ED 67 > image 11 of 34.

Some notes:

I don’t often cite line number, but this is one of those odd ducks where the family is divided across two pages. This child appears near the top of the page and is not identified by family name. Citing the line number removes ambiguity.
I didn’t include the NARA publication number and roll number. I’m feeling that no one will use my citation to go look at microfilm. If they want to, there are multiple ways to determine roll number. As microfilm census access evaporates, microfilm information becomes unnecessary and census citations will evolve accordingly.
I didn’t include the FamilySearch collection name. I figure most people can find the 1910 U.S. census on FamilySearch.org without knowing the exact name. Besides, the URL will take them directly to the image within the collection.
I cited the URL of the image because it has “ark:” in it. That means FamilySearch intends to keep that URL from breaking.
I cited the publication date rather than the access date. As a general rule, cite publication date when available and access date when not.

URL Poking

When a URL breaks, sometimes the URL itself gives useful information. One common URL convention puts the main address of a page before a question mark, followed by options (name=value pairs) separated by the ampersand (&) character.

The second URL in last week’s challenge was a broken URL from Ancestry.com: http://www.ancestry.com/search/io/browse.asp?c=8 & state=Vermont & county=Addison & township=Bristol & ed= & roll=M33_126 & STAbrv=VT & startimg=30 & endimg=42 & rp=42 & hash=1670352374 & width=2877 & height=5089 & levels=5 & colorspace=Grayscale

Among the options, we see these values:

state=Vermont
county=Addison
township=Bristol
ed=
roll=M33_126
STAbrv=VT
startimg=30
endimg=42
rp=42
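If you would rather not pick the options out by eye, Python's standard urllib.parse module can split them for you. The following is a minimal sketch; the url_options() helper is a name invented for this example, and it strips the spaces that appear around each ampersand in the URL as printed above.

```python
from urllib.parse import parse_qsl, urlsplit

# The broken Ancestry URL exactly as printed in the post, spaces included.
BROKEN_URL = (
    "http://www.ancestry.com/search/io/browse.asp?c=8 & state=Vermont"
    " & county=Addison & township=Bristol & ed= & roll=M33_126 & STAbrv=VT"
    " & startimg=30 & endimg=42 & rp=42 & hash=1670352374 & width=2877"
    " & height=5089 & levels=5 & colorspace=Grayscale"
)


def url_options(url: str) -> dict:
    """Return the name=value options that follow the question mark."""
    query = urlsplit(url).query
    # keep_blank_values keeps empty options such as "ed=";
    # strip() removes the stray spaces around each name and value.
    pairs = parse_qsl(query, keep_blank_values=True)
    return {name.strip(): value.strip() for name, value in pairs}


if __name__ == "__main__":
    options = url_options(BROKEN_URL)
    for name in ("state", "county", "township", "ed", "roll", "startimg", "endimg"):
        print(f"{name} = {options[name]!r}")
```

Keeping the empty ed option is a deliberate choice: a blank enumeration district is itself a small clue, since it suggests a census too early to have enumeration districts.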

Census microfilm junkies will recognize M33 as a NARA microfilm publication number. A Google search for [nara microfilm publication m33] indicates that M33 is the 1820 U.S. federal census. The challenge was to cite Pearis Raymond. I can do an exact search of the 1820 census on Ancestry.com for Pearis Raymond living in Bristol, Addison, Vermont. This matches only one person:

2. 1820 U.S. census, Addison County, Vermont, population schedule, Bristol, page 69-B, 3rd name from bottom, Pearis Raymond; digital image, Ancestry (http://search.ancestry.com/search/db.aspx?dbid=7734 : updated 31 May 2013), Vermont > Addison > Bristol > image 9 of 9.

Notes:

For the same reasons as the FamilySearch.org citation, I left off the microfilm information and exact database title.
I went out on a limb and cited the URL of the database rather than the home page. Ancestry has never made any public commitment to make any of their URLs persistent. If it works, it’s an added convenience. If it doesn’t, most people can still get to Ancestry.com.
I questioned whether to specify an access date or a publication date since most people don’t know how to find publication dates of Ancestry’s databases. (Look up the database in the catalog and hover over the title.) In the end I figured the publication (update) date was still more useful than an access date.

The final challenge was to adapt what was essentially a microfilm citation to reference Lewis Rapp on FamilySearch.org. There was a wrinkle. FamilySearch uses a bad index of that page that they obtained from Fold3. Fold3 had an illegible image, so Lewis Rapp was indexed as “[illegible Rapp].” FamilySearch: good image, bad index. Until just a few weeks ago, Ancestry had the opposite: bad image, good index. Now Ancestry has the superior offering.

3. 1860 U.S. census, Jackson County, Ohio, population schedule, Scioto Township, p. 62, family 426, Lewis Rapp; NARA microfilm publication M653, roll 992; digital image, FamilySearch (https://familysearch.org/ark:/61903/3:1:33SQ-GBSD-9PGS : 8 April 2016), Ohio > Jackson > Scioto Township > image 27 of 38.

Of course, these are not the only acceptable citations. I’ve noted some of my judgement calls; you may have made different ones. Just keep in mind: Citations communicate concisely, with clarity and consistency.

Biography

The Ancestry Insider was a readers’ choice for the top four genealogy news and resources blogs, part of Family Tree Magazine’s “40 Best Genealogy Blogs” for 2010. He reports on the two big genealogy organizations, Ancestry.com and FamilySearch. He was named a “Most Popular Genealogy Blogs” by ProGenealogists, and has received Family Tree Magazine’s “101 Best Web Sites” award every year since 2008. A genealogical technologist, the Insider has a post-graduate technology degree and holds a dozen technology patents in the United States and abroad. He has done genealogy since 1972 and has worked in the computer industry since 1978. He was Time Magazine Man of the Year in both 1966 and 2006. And he really is descended from an Indian princess.
