I’ve warned before that no matter what index you use, you’re going to find your relatives misindexed. You have better context than cold indexers. (See the Indexing Illustration in “Indexing Errors: Test, Check the Boxes.”)
To demonstrate the point, I thought I would compare the 1940 U.S. Census Indexes of Ancestry.com and FamilySearch for the state of Utah. I figured FamilySearch’s large Utah indexing workforce would have a big advantage over Ancestry’s offshore workforce. I searched for all people named Alonzo. The name is unusual (because I didn’t want a name with too many matches), and offshore indexers were likely to be unfamiliar with it. I didn’t think about it at the time, but it can be challenging to recognize the z and to differentiate o from a.
Ancestry returned seven results that FamilySearch did not, giving a sample size of 170. Four of the people in Ancestry’s results did not live in Utah as requested (their 1935 addresses were in Utah). However, all four lived in states that FamilySearch had published, so I was able to include them in the sample set.
Here are the results:
| |Given Name(s) Correct|Surname Correct|Both Names Correct|
|---|---|---|---|
|Ancestry.com|125 (74%)|150 (88%)|114 (67%)|
|FamilySearch.org|159 (94%)|167 (98%)|157 (92%)|
|Both websites wrong|3 (2%)|2 (1%)|4 (2%)|
As I mentioned, the results for the given name Alonzo were stacked against Ancestry. Ancestry’s keyers made some egregious errors: Alanna, Alenae, Alomo, Aloms, Alorysw, Alorze, Donzo, and Hanzo. Ancestry also had several errors caused by combining Alonzo with a middle initial (Alonzob, Alonzoe, Alonzor, and Alonzos). It made me wonder if one or more of their keyers were not following instructions.
However, the sample set contained a random sampling of surnames, so the surname results are a fairer measure of Ancestry’s keyers. Here, Ancestry suffered a 12% error rate.
It has been said many times that there is value in having more than one index. This test shows that to be true. The FamilySearch index got the full name correct 92% of the time. But if one checks both the FamilySearch and the Ancestry indexes, the success rate goes up to 98%.
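For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the percentages from the counts in the table. The variable and function names are my own, made up for illustration. It assumes that the “Both websites wrong” row counts the records that neither index got right, so that checking both indexes recovers every other record.

```python
# Counts from the results table; the sample size is 170 records.
SAMPLE_SIZE = 170

# Number of records each index got right, by field.
correct = {
    "Ancestry.com": {"given": 125, "surname": 150, "both": 114},
    "FamilySearch.org": {"given": 159, "surname": 167, "both": 157},
}

# Assumption: records that neither index got right, by field.
both_sites_wrong = {"given": 3, "surname": 2, "both": 4}


def pct(count, total=SAMPLE_SIZE):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


# Per-site accuracy for each field.
for site, counts in correct.items():
    print(site, {field: f"{pct(n)}%" for field, n in counts.items()})

# Ancestry's surname error rate: records keyed wrong out of the sample.
ancestry_surname_error = pct(SAMPLE_SIZE - correct["Ancestry.com"]["surname"])

# Full-name success rate when a searcher checks both indexes:
# every record except those that both sites got wrong.
combined_full_name = pct(SAMPLE_SIZE - both_sites_wrong["both"])

print(f"Ancestry surname error rate: {ancestry_surname_error}%")
print(f"Full name found in at least one index: {combined_full_name}%")
```

The last two figures come out to 12% and 98%, matching the numbers quoted above.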
- Judging the difference between a and o in Alonzo was difficult, so the results for given names should be taken with great caution.
- Even though I cross checked some values against the 1930 census or other collections, I considered the “correct value” to be what the image indicated, whether that was truly right or wrong. Otherwise it becomes too painful to try to differentiate enumerator error and enumerator handwriting.
- Where letters were illegible, I ignored them when scoring.