The Ancestry Insider: August 2015

Monday, August 31, 2015

Monday Mailbox: How Fast Was the 1860 Census Indexed

Howland Davis sent a question in response to my article, “FamilySearch Indexing Not Keeping Up.”

Dear Ancestry Insider,

Interesting article, thank you. I have a question about the comparison of the indexing of the 1860 and the 1940 censuses. I am fairly sure that the 1940 index was completed 150 days after its release in 2012. Was the 1860 census indexed 17 years after its release in 1932(?) or did the work start some years after that?
Just curious, not important.
Howland Davis

Dear Howland,

Ooooh. Something shiny.

It took Ancestry.com four months and one day to finish its 1940 index. (See my article of 6 August 2012, “Census Indexing Update: And It’s Over.”) FamilySearch published the 50 states a while later, but I think it took them a considerable amount of time to finish the territories.

I believe the first large-scale effort to index the U.S. censuses was made by Ronald Vern Jackson and Accelerated Indexing Systems (AIS) in the late 1970s through the early 1990s. I believe he indexed heads-of-households only, and just the names, so the amount of work was more manageable. These were true indexes, not the census databases we use today. Where did he get his keyers? Does anyone know? He published the indexes as bound books of computer printouts.

A page from the 1976 AIS index to the Louisiana 1820 census
Ronald Vern Jackson, et al., eds., Louisiana 1820 Census Index (Bountiful, Utah: Accelerated Indexing Systems, 1976), 1.

According to Thomas Jay Kemp’s The American Census Handbook (Wilmington, Delaware: Scholarly Resources, 2001), here are the publication years for a sampling of states:

Census	Publication year
1790	New York: 1990; Ohio: 1984
1800	Ohio: 1986; Vermont: 1976
1810	Virginia: 1978
1820	Iowa: 1977; Indiana: 1976
1830	Indiana: 1976
1840	Iowa: 1979
1850	Iowa: 1976
1860	Iowa: 1987; North Dakota: 1980; Virginia: 1988; Washington: 1979
1870	Iowa: 1990

Notice all were done after the widespread availability of computers.

In 1984 AIS published on microfiche what it had completed. Ancestry.com published AIS indexes online in 1999.

Some limited-scope indexes were published earlier. For example, in 1964 the Ohio Library Foundation published an index of the 1830 Ohio census. This, too, was a computer printout. Volunteer family historians extracted the names of heads of households onto index cards. The cards were keyed onto punch cards, which were then sorted by an IBM mainframe computer.

A page from the Ohio Library Foundation's 1964 index of the 1830 Ohio census
Ohio Library Foundation, ed., 1830 Federal Population Census Index, vol. 1 (Columbus, Ohio: Ohio Library Foundation, 1964), 1.

So the answer to your question is that indexing the 1860 census took about a decade and was finished around 1990.

Signed,
---tai

Thursday, August 27, 2015

The Future Will Bring Automated Indexing Tools – #BYUFHGC

Jake Gehring presenting at the 2015 BYU Conference on Family History and Genealogy

“It’s not that we don’t like our [indexing] volunteers,” said Jake Gehring. “We would just rather have them work on things that only [humans] can do.” Jake is director of content development for FamilySearch and presented at the BYU Conference on Family History and Genealogy last month. This article is the third and last article about his presentation. In the first article I reported on Jake’s premise that FamilySearch Indexing is not keeping up with the number of records FamilySearch is acquiring and additional means are needed. In the second article I reported about two of those means: increasing the efficiency of human indexers and working with commercial partners. In today’s article I will report on the third means: increased automation via computers.

In the third part of his presentation, Jake spoke about “the really far-out stuff, HAL9000 kind of stuff.”

Jake showed a screen shot that we saw in Robert Kehrer’s keynote. (See “Kehrer Talks FamilySearch Transformations” on my blog.) The screen showed a color-coded obituary.

Obituary with parts of speech color coded by FamilySearch automated obituary indexing system

FamilySearch trained a computer to identify the different parts of speech. They trained the computer how to discern meaning out of a bunch of words. Notice in the example above that names of people are identified in dark green, places in brown, dates in dark blue, relationships in salmon, events in pale green, clock times in a steel blue (or would you call that a dark sky blue?), organizations in red, and buildings in goldenrod (or would you call that a mustard?).

They basically teach the computer to read. The computer is willing to extract a lot more detail from an obituary than a volunteer can easily do. And it can work really, really fast. For obituaries, computers can do in about a week and a half what it takes all of FamilySearch’s volunteers three and a half years to do. This is why in a few weeks FamilySearch is going to stop having volunteers index the current obituary project. In fact, FamilySearch has already published about 37 million obituaries this way. You may already have found and used an obituary that was indexed by a smart computer.

This applies to obituaries published since about 1977. Since that time, most obituaries have been published and stored digitally. Pre-1977 it looks a lot different. Because the obituaries are not already digital, it is a pretty nasty OCR problem. [OCR converts the printed page to text so that the computer can subsequently try to make sense of it.] The problem is so severe that computers can recognize only about half of the words in pre-1900 newspapers.

If you were at RootsTech you may have seen the last thing Jake showed. A company named Planet entered its ArgusSearch into the Innovator Challenge. ArgusSearch is a system that reads the handwriting of documents that have not been indexed. You type in something like “Steinberg” and the program shows some records that might match that name. It won’t find all the matches. And it may return some results that aren’t matches. But this is still useful. This technology is still young, but an application like this is likely to hit real life in the next ten years.

Jake summarized by saying that while indexing is going really well, never better, it is unfortunately just not good enough to give you all the records you need. [FamilySearch does not index all the records they acquire.] “We need to do much better. It’s not that we are not quite there; we are way behind and getting further behind every year,” he said. There are three areas that FamilySearch needs to utilize. FamilySearch needs to increase the efficiency of its indexing volunteers. FamilySearch needs more help from for-profit publishers who can bring more resources to the table. And FamilySearch needs to use computer technology to make images searchable with little or no human intervention.

“It’s an exciting time to be alive. Can you imagine the explosion of document availability once we make a bit more headway in a few of these areas?”

Jake took a couple of questions:

Q. How easy is it to use tools like Google Translate to translate Spanish records?

A. Google Translate is better at modern, generic words. If you type in the text of a letter, you would be able to get the gist of it, but it may not handle archaic words or words specific to a vital record. As long as you know a small set of terms, you can usually get by without a computerized translator. There is no magic tool currently available.

Q. Why do we sometimes key so very little from a record? While we have someone looking at the document, shouldn’t they be extracting more?

A. Because we publish both indexes and images, we index the minimal amount necessary to find the image. Why index something that no one will ever use in a search? Cook County, Illinois death certificates are an example where we indexed something that didn’t need to be. We indexed the deceased’s address, but who will ever search using the address? Sometimes we don’t get it quite right, but that’s the general principle.

Q. When will we be able to correct published indexes?

A. After ten years of this being in the top three requested features, we’re now starting to implement the feature that will allow you to contribute corrections. We are rapidly approaching the point when this will be available. I’m not really authorized to say “soon,” but we have our eyes on that feature.

Wednesday, August 26, 2015

FamilySearch Should Increase Indexing Efficiency and Utilize Partnerships

Jake Gehring presenting at the 2015 BYU Conference on Family History and Genealogy

FamilySearch is not keeping up with indexing the records it digitizes, and improvements in three ways could help fix this, according to FamilySearch director of content development Jake Gehring. Yesterday I presented the first part of my remarks about his presentation at the 2015 BYU Conference on Family History and Genealogy (#BYUFHGC). Today I’ll present the second part, covering the first two of the three ways: increasing efficiency and partnering. Tomorrow I’ll present the third way, increased use of computerization.

Today’s FamilySearch Indexing (FSI) system is somewhat inefficient. FSI primarily utilizes a double-blind indexing methodology, sometimes described as A+B+arbitrate. Two indexers independently index a batch of records. If there are any differences, even one letter in one record, the entire batch is sent to a third person to arbitrate between the two values, or supply a value of their own. It turns out that 97% of all batches have at least one difference, even though what is keyed is the same for 70% of the fields. As a result, almost all records are looked at by three people. There’s a good argument that that is wasteful. For certain kinds of records and certain kinds of people [and certain kinds of fields, I might add], one keyer is sufficient. The accuracy doesn’t get any better when involving two more people. Within the last year FamilySearch switched to single keying for newspapers, since reading typeset material can usually be done without error. You wouldn’t want to do this for certain types of records or for beginning indexers.
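
The A+B+arbitrate rule described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not FamilySearch's code: the record layout, the sample names, and the function names `differing_fields` and `needs_arbitration` are all hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]  # hypothetical layout: field name -> keyed value


def differing_fields(key_a: List[Record], key_b: List[Record]) -> int:
    """Count fields where the two independent keyings disagree.

    Assumes both keyings cover the same records in the same order.
    """
    return sum(
        1
        for rec_a, rec_b in zip(key_a, key_b)
        for field in rec_a
        if rec_a[field] != rec_b.get(field)
    )


def needs_arbitration(key_a: List[Record], key_b: List[Record]) -> bool:
    """Any difference at all sends the whole batch to an arbitrator."""
    return differing_fields(key_a, key_b) > 0


# Made-up batch of two records, keyed twice.
batch_a = [{"name": "Henry Davis", "year": "1860"},
           {"name": "Mary Davis", "year": "1862"}]
batch_b = [{"name": "Henry Davis", "year": "1860"},
           {"name": "Mary Davies", "year": "1862"}]

print(differing_fields(batch_a, batch_b))   # 1
print(needs_arbitration(batch_a, batch_b))  # True
```

Three of the four fields agree in this made-up batch, yet the single differing letter is enough to route the entire batch to a third person.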

A more efficient methodology is referred to as A+review. One person keys the information and a second person reviews what is keyed. All the reviewer does is indicate whether the information is correct or not. This could easily be done, even on a cell phone. This method is about 40% more efficient than the double-blind methodology because FamilySearch knows when a record needs to be keyed a second time. FamilySearch is actively working on this kind of methodology to increase the efficiency of indexing.
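
A rough sketch of the A+review idea follows. It assumes a reviewer who only answers yes or no for each keyed record; the function name `a_plus_review`, the record layout, and the sample data are hypothetical, and this is not a description of any actual FamilySearch tool.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]  # hypothetical layout: field name -> keyed value


def a_plus_review(keyed: List[Record],
                  looks_correct: Callable[[Record], bool]) -> List[int]:
    """Return the positions of records the reviewer rejected.

    The reviewer never types anything; only rejected records
    would be sent back to be keyed a second time.
    """
    return [i for i, record in enumerate(keyed) if not looks_correct(record)]


# Made-up example: this reviewer rejects any record with an empty name.
keyed = [{"name": "Henry Davis"}, {"name": ""}, {"name": "Mary Davis"}]
rekey = a_plus_review(keyed, lambda record: bool(record["name"]))
print(rekey)  # [1]
```

In this toy run only one of the three records needs a second keying, which is the source of the savings the method is after.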

Jake showed three entirely new, experimental types of indexing: keyboardless indexing, free-form indexing, and casual “micro-indexing.” Some do not even have working prototypes.

Jake showed an indexing system that allows productive use of devices without keyboards, such as smart phones. If you’ve used photo recognition in Photoshop, you have seen the paradigm before. He showed a slide showing 12 snippets of a name, such as “Henry.” (See my version, below.) These had been read from documents by a computerized handwriting recognition system. But since computers aren’t too good at reading handwriting, the system presents its results to a person for verification. The person marks any that the computer got wrong. Where the computer had a good second guess, it could present that as well, allowing the person to select an alternate name, such as “Kerry.” For pre-printed forms, this works great and allows easy indexing on devices without keyboards, such as cell phones.

Jake showed the FamilySearch Pilot Tool, another indexing system for free-form indexing. It is currently live, as a pilot. A large portion of the screen is a browser showing a record on FamilySearch.org. Along the right side is a pane where an indexer can enter names, dates, and places extracted from the document. (See the screen shot, below.) A person would use the tool to index any record that they care about and a short time later the record would be searchable. You wouldn’t have to ask for anyone’s permission. You wouldn’t have to index all the names. Anyone could take any collection desired and do some indexing. This tool is in pilot right now. FamilySearch is very interested in tools that let you index as you go. To join the pilot, send Jake an email. (I see someone has also posted the link online. See “FamilySearch Pilots Web-Based Indexing Extension” on the Tennessee GenWeb website.) There is no arbitration. If you care enough to index the image, you probably care enough to be accurate. But that supposition is something yet to be validated.

“Micro-indexing” could be used to make images more usable. It would be nice to be able to browse unindexed images more easily. FamilySearch is very interested in an upgrade to the current browse experience. Jake showed an animated artist’s rendition of a tool, reminding us that this is just a research and development idea.

In micro-indexing the system might ask you really simple questions, like, “What kind of record is this?” and have you click the record type. By asking volunteers to do tiny tasks, FamilySearch might be able to gather information that makes images easier to browse by record type, locality, and time. Just because FamilySearch doesn’t have the time to index the images doesn’t mean they can’t be made easy to browse.

This is a mock-up of what a micro-indexing tool might look like.

In addition to talking about increasing the efficiency of indexing, Jake talked about partnering. FamilySearch is fine with the concept of trading data with other companies. FamilySearch provides images and the partner creates indexes. They may even get exclusive use of the indexes for a while. For example, a lot of Mexico church and civil records are being indexed right now by Ancestry.com. We all get the value of it eventually. FamilySearch has similar projects going on with Findmypast (I didn’t catch the project names) and MyHeritage (Danish census and church records, and Swedish household names). This increases the rate of indexing by bringing more indexers to the table.

Tuesday, August 25, 2015

FamilySearch Indexing Not Keeping Up – #BYUFHGC

Jake Gehring presenting at the 2015 BYU Conference on Family History and Genealogy

“FamilySearch just isn’t indexing records fast enough,” said Jake Gehring. “If that is the case,…then what do we do about it?” Jake is director of content development for FamilySearch and presented at the BYU Conference on Family History and Genealogy last month. Jake’s presentation was titled “FamilySearch Indexing, Robo-keying, and Partnering, Oh My!”

In the last little while Jake has been involved in some research and development, which is really rewarding. It’s fun to work on some things that may become real someday. He emphasized to me that these things may never become real, so keep that in mind as you read.

Back in the old days an index was that thing in the back of the book, not some multi-billion name index you can search from your home. “We index records so that they get used more,” he said. We gather records for the same purpose and have been doing so since 1938, he said. FamilySearch has about 280 cameras, roughly 40 in the United States and the rest abroad.

There have been huge improvements in the technology for capturing records and making them available. Jake showed an example, a Weber County, Utah marriage license. It is one of the rare collections that FamilySearch has captured twice. A scan from microfilm looks like this:

Weber County, Utah marriage license scanned from microfilm

FamilySearch went back recently and captured the records digitally, in color.

A Weber County, Utah marriage license that was digitized in color

Granted, viewing a record scanned from microfilm is often less clear than viewing it on a microfilm reader, but you can see the huge improvement.

FamilySearch does things so the captured images are easier to use. One of the things they have done from early on with books and microfilm was catalog them. A catalog entry can specify locations, authors, subjects, and so forth. For family histories, they might put in a list of surnames, but that was about it. You had to know what you were looking for to find the records you needed.

When you think about what we do now, things are quite a bit different. Indexes contain full names and direct you to individual images. We index (as FamilySearch calls extracting) more than names. We also capture dates and places and relationships. By doing this, not only can you search for them, but FamilySearch can recommend records to you. FamilySearch calls these hints.

There is a range of things that FamilySearch can do to make records more accessible. Some can be done with less cost than others. Jake showed a diagram showing treatments that can be made to a collection. With added accessibility comes added cost. Here is my version of his diagram, including my own definitions:

Definitions:

catalog entry: a single entry for an entire collection
film notes: individual notes for each film
light waypointing: dividing the images of an entire collection into a few groups containing a large number of images
heavy waypointing: dividing the images into more specific groups with fewer images
light indexing: extracting a few, basic pieces of information from an image, perhaps just a name or a name and date
heavy indexing: extracting most genealogically significant information
lineage-linkage: using the record extracts to reconstruct families with links between parents, spouses, and children

Resources are limited. The more work invested in collections, the easier it is to use them, but the number of collections that can be published decreases. “The truth is that it is quite expensive to make collections very, conveniently searchable,” Jake said. “But it is still worth doing. In fact, we want to do it faster.”

The Church of Jesus Christ of Latter-day Saints, FamilySearch’s owner, has been indexing for a long time in one way or another.

1922 – Church employees started extracting information for the TIB, an early predecessor of the IGI. [I added this bullet point.]
1961 – Church employees started extracting names from historical records at Church headquarters.
1977 – Church members started extracting records at Church buildings via the stake records extraction program.
1986 – Church members began the family records extraction program (FREP) which used data entry by members using home computers.
1994 – The stake and family record extraction programs were consolidated.
2006 – FamilySearch began using the current FamilySearch Indexing tool, utilizing Church members and the general public.

Jake showed the current application, FamilySearch Indexing. He then showed the new, browser-based tool that is now being rolled out. The tool allows the data entry pane to be positioned in various places, such as the left of the screen or the top. One data entry mode, when field positions are well defined, allows data entry overtop of the image itself.

Jake showed statistics of the number of indexing volunteers since 2006.

This graph should be encouraging to everyone. Compare this to the number of people—probably 15,000—indexing the 1880 census some 20 to 25 years ago. The big explosion in indexers in 2012 was because of the 1940 census. This year FamilySearch is on track to have more volunteers than for the 1940 census project.

“It is a wonderful, exciting program to be a part of. You have the satisfaction of knowing that you and 350,000 of your closest friends are all working together to make documents more usable,” Jake said.

He compared the indexing project of the 1880 census to that of the 1940.

1880 US Census Index	1940 US Census Index
Index only	Index and images
56 CD-ROMs	Web
50 million records	132 million records
17 years to complete	150 days

But this amount of indexing is not good enough.

“Do the math,” Jake said. FamilySearch captures about 150 million images in the field each year. FamilySearch is also scanning the microfilm out of the Granite Mountain Record Vault. This year FamilySearch expects to scan about 300 million images. On average there are four to five records per image. That amounts to about 2 billion records digitized just this year. And Jake expects to have the same amount next year. But we are only indexing about 250 million per year. That’s only 12% of the records “brought in the door.”
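
Jake's arithmetic can be checked with a short script. The figures are the approximate ones quoted above; using 4.5 as the midpoint of "four to five" records per image is my own assumption, as are the variable names.

```python
# Approximate figures quoted from the presentation.
field_images = 150_000_000      # images captured in the field each year
vault_images = 300_000_000      # microfilm images expected to be scanned this year
records_per_image = 4.5         # assumed midpoint of "four to five"
indexed_per_year = 250_000_000  # records indexed per year

records_digitized = (field_images + vault_images) * records_per_image
share_indexed = indexed_per_year / records_digitized

print(f"{records_digitized:,.0f} records digitized")  # 2,025,000,000 records digitized
print(f"{share_indexed:.1%} indexed")                 # 12.3% indexed
```

With these inputs the total comes to just over 2 billion records, and the indexed share lands at roughly 12%, in line with the figures Jake gave.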

“Now do you see why I say, we’re not going fast enough?”

Additionally, FamilySearch is trying to increase the number of cameras. Then they will be even more in the hole. Since 90% of indexing is English, the situation is far worse for non-English records.

“If we want genealogy records to be more helpful more quickly to more people, we need to look at other ways of indexing,” Jake said. He spoke about three ways that might increase the rate at which records are indexed: efficiency, collaboration, and computerized assistance.

Tomorrow I’ll report on what Jake said about increasing efficiency and using collaboration. Thursday I’ll finish reporting about his presentation.

Thursday, August 20, 2015

Lisa Elzey #BYUFHGC Presentation, Part 2

Yesterday I wrote the first part of my report about Lisa Elzey’s presentation at the 2015 BYU Conference on Family History and Genealogy. She titled her presentation, “Ancestry.com: How the Records Tell the Story.” Today, I continue with part 2.

3. Analyze the Details

Timelines, dates, and historical events can be used to analyze the details.

Lisa used an Excel spreadsheet for one example timeline. She had columns for dates, places, comments, and sources. It looked like some of her source entries hyperlinked right to the sources themselves. The new Ancestry has a built-in timeline which can be helpful.

Analyze why dates are important. Compare to calendar events, major holidays, community events, and seasons. Compare dates to those of historical events. In the new Ancestry, Life Story includes historical events.

Always ask yourself why your family members did what they did. Look for changes, such as disappearance, immigration, change in economics, and first-time occurrences such as literacy and property ownership. Look for differences, such as religion, ethnicity, language, economic standing, race, and age. As an example, Lisa showed the family of John and Tersa Flynn in the 1900 census in Seattle. John was a master mariner. His oldest son, George, was born at sea. The next two children, Maud and Marguerita, were born off the coast of Peru. Next, Evelin was born at sea, Edeth in Calcutta, and Henry at sea. The last child, Grace, was born back in John and Tersa’s native England. Do you see what probably happened? Apparently, he took his entire family aboard ship. Eventually they went back to England before Grace was born. From there they retired to Seattle at a time when many from England were going there.

4. Tell the story

Lisa showed us a case study about Leland Wright who appears in the 1930 U.S. census in Miami, Dade, Florida with his family, Leath, Juanita, Cora, George, and Roy. If I recall correctly, the case study arose out of what initially appeared to be a simple question from a user. And if memory serves, the question was: What ever happened to Leland? What appeared to be a straightforward question evolved into a tale so fascinating, Lisa is writing it up. Watch for the story coming soon to the Ancestry Blog.

5. Purpose + Audience = Project

Lisa quoted from the results of an Emory University study. “Children understand who they are in the world not only through their individual experience, but through the filters of family stories that provide a sense of identity through historical time,” says the study. (See “Children Benefit if They Know About Their Relatives, Study Finds,” Emory University [http://www.emory.edu : accessed 15 August 2015], path: News & Events > News Releases > 2010 > Archives > March. The link to the paper is no longer functional. See a PDF copy at “History Relevance Campaign,” Public History Commons [http://publichistorycommons.org/history-relevance-campaign : accessed 15 August 2015], hotlink titled “‘Do You Know…’ The Power of Family History in Adolescent Identity and Well-being.”)

6. Share the Story

Lisa had a video conference with the great-grandson of James Wright and shared the documents and what she had learned about the Wright family. The results were pretty touching. Watch for Lisa’s blog article to hear the rest of the story.

To ask for a copy of the flyer from Lisa’s class, write conferences@ancestry.com. Some Who Do You Think You Are? episodes are available for free on the WDYTYA website. Seasons four, five, and six are available for purchase on YouTube or iTunes.

Wednesday, August 19, 2015

Lisa Elzey Talks WDYTYA and Story Telling - #BYUFHGC

“A lot of the questions I get about my job are about how we do the research for WDYTYA,” said Lisa Elzey at the 2015 BYU Conference on Family History and Genealogy. In her presentation, “Ancestry.com: How the Records Tell the Story,” she not only shared some of the details, she shared how we could apply the principles in our own research. Lisa explained the process which Ancestry employees lovingly call the “Who Do You Machine.”

Ancestry.com uses a "Who Do You Machine" to crank out an episode of WDYTYA.

Lisa Elzey teaches a session at the 2015 BYU Conference on Family History and Genealogy.

Casting is not done by Ancestry, but by Shed Media and The Learning Channel. Some stars come through referrals. You may have noticed that sometimes when a star is featured, you’ll see a costar or a friend in a later episode. For example, Kelly Clarkson is the daughter-in-law of Reba McEntire.

Some celebrities know a lot about their ancestors and some know very little. Once stars are selected we start building their tree, Lisa said. They use all of the basic records that can frame a story. Notice on the machine diagram, below, that some stars fall out. Sometimes it is because of scheduling. Sometimes they’ll come back later. Sometimes the research doesn’t get past a certain point.

After we get a solid foundation of a tree, we start exploring it, Lisa said. We look for compelling stories, such as Christina Applegate’s story about her father. “Beautiful episode,” Lisa said. Once we’ve found what we think is a compelling story, we start crafting the story together. To fill 42 minutes we need about 17 documents.

Once the story is done, then we film, she said. This can be tricky if the star is in a current project.

When that is done, you get the awesome show, Who Do You Think You Are?

You can use the same model as the Who-Do-You Machine to tell your own story.

1. Do the research.

Use primary source material whenever possible to authenticate your story. It’s like the difference between fresh peas and green pea soup.
Use census records. They create an arc of an individual’s life. They give you potential story clues. Plus, they are easy to find and use.
Use birth, marriage, and death records. They help establish relationships and give you even more potential story clues.
Then take a deeper dive. Use records such as pension files, newspapers, city directories, grave stones, deeds, probate (Ancestry has a huge collection coming out quite soon), histories, [and many more that I didn’t write down quickly enough].
Research complete families. It is like having trees versus poles. Everything that happened in that family affected your tree. I have found amazing stories about my ancestors by researching their entire families, Lisa said.

2. Gather and organize your information.

You will hurt your ability to find stories if you aren’t building a family tree. You also need to keep a research journal and keep a simple documents folder. Adopt a naming convention for images, such as: surname-first name-birth year-document year-document type. If you’ve inherited a messy stack of research, start over, realizing you’re not starting from scratch. Learn about and use the Genealogical Proof Standard. I recommend Tom Jones’s book, Mastering Genealogical Proof, she said.
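
The naming convention Lisa suggests is easy to apply consistently with a small helper. The sketch below is only one way to do it: the function name `document_filename`, the lowercase output, the underscore for spaces, and the default file extension are my assumptions, and the sample person is made up.

```python
import re


def document_filename(surname, first_name, birth_year, document_year,
                      document_type, extension="jpg"):
    """Build: surname-first name-birth year-document year-document type."""
    parts = [surname, first_name, str(birth_year), str(document_year),
             document_type]
    # Lowercase each part and replace runs of spaces with an underscore
    # so that hyphens only ever separate the five parts.
    cleaned = [re.sub(r"\s+", "_", part.strip().lower()) for part in parts]
    return "-".join(cleaned) + "." + extension


print(document_filename("Doe", "Jane", 1850, 1900, "census"))
# doe-jane-1850-1900-census.jpg
```

Because the surname comes first, files sorted by name group each family together, then each person, then that person's documents in date order.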

Lisa uses the Ancestry Shoebox app. When she visits a relative, if she sees a photograph she doesn’t have, she takes a photograph. It’s easy to add notes and attach it to your tree.

The new Ancestry website has a new Sources column. It makes sources easy to see. Clicking a fact shows visually what sources are attached to that fact. It also has a new Notes tool panel. It’s helpful for abstracts, journaling, and many other functions.

As an aside at the beginning of her presentation, Lisa mentioned the What’s New or Updated collection list on Ancestry.com. At the top of the list was “U.S., Social Security Applications and Claims Index, 1936-2007.” A nice thing about the page is that along the side it lists what collections are coming up soon. Keep going back, because Ancestry is constantly adding new records.

Tomorrow I’ll continue my article about Lisa’s presentation.

Tuesday, August 18, 2015

Ron Tanner Fields Questions at - #BYUFHGC

Ron Tanner at RootsTech 2015

Ron Tanner, product manager for FamilySearch Family Tree, fielded questions during his presentation, “FamilySearch Family Tree Road Map,” at the 2015 BYU Conference on Family History and Genealogy. Last week I wrote about his presentation. See “Ron Tanner Discusses Family Tree Road Map at - #BYUFHGC.” Today, I’ll present the questions and answers. These are not exact quotes.

Q. Performance of FamilySearch.org is so bad on Sunday, it makes us want to stop doing genealogy. What are you doing to fix it?

A. The trick is, don’t do all your genealogy on Sunday. Seriously, we don’t want to discourage that. We’re working very hard to get off of New FamilySearch (NFS). Family Tree was designed for 10 times the capacity. Now we are running at 18 times. We are converting every system that we have to a new database and new technologies in order to make the site more responsive, no matter what day you come.

Q. Are you going to preserve the combine page of New FamilySearch?

A. We’re getting rid of NFS.

The usual intent of users of that page arose from the belief that a person miscombined. In reality, the machine did a lot of combining. I apologize for our past sins. People assumed they’d see their past contributions to Ancestral File and Pedigree Resource File, so we preloaded them. Since that resulted in lots of duplicates, we ran computer algorithms to combine them. That’s where most of the bad combines happened.

[Another use of the combine page was to see the original information. There is a better way to do that.] What was NFS made up of? The IGI, Ancestral File, Pedigree Resource File, and Church [of Jesus Christ of Latter-day Saints] membership records. That information [except membership records] is now sitting under the Records section of FamilySearch.org. We are planning on adding sources for Ancestral File and Pedigree Resource File like we’ve done for the IGI.

There are generally two situations causing issues today: someone working from old GEDCOM files or two lines incorrectly coming together. If you find an incorrect ancestor, correct it. If two of you are using the same [PID for two] persons, create a new person with your ancestor’s information. Do this only when you find two lines combined.

We will not preserve the combine page of NFS. We will not make a copy of it. It would take 20 terabytes of data if we made it available as another tree.

Q. Are you through changing the colors of the icons?

A. I can neither confirm nor deny that we are done changing anything. Seriously, we wanted the hint icon to pop out. Blue stands out more.

Q. Should we dismiss duplicate hints?

A. When FamilySearch captured records, sometimes they microfilmed a record twice. Accept both so you don’t mess up the hinting algorithm. When you indicate a record is not a match, when actually it is but is just a duplicate, you confuse the system. Specify Not a Match only when it is truly not a match.
[Ed.: Duplicate filming is probably not the source of the duplicate records. The common scenario is a record that was filmed once, indexed once, but migrated twice. The two migration paths, EASy and ODM, preserved different information. Consequently, FamilySearch decided to publish both until such time as the two could be detected and merged.]

Q. When will there be a new handbook for teachers?

A. They are expensive and must be translated into all FamilySearch languages. We will start to work on something that will be available online. Until then, look for help online.

Q. My mother died last year. Duplicates of her record keep popping up in the Tree. What should we do when family members die?

A. You have a copy. Mark them deceased. It will become public. A bunch of duplicates can pop up if relatives do the same. Merge them together. By the way, there is an issue you should know about. If there is a member of the Church who has been deceased for some time but whose record doesn’t show up, call support. There has been an issue in Family Tree for the last ten months or so. If the clerk enters death information on their Church membership record, they aren’t marked dead in Family Tree. We are currently rewriting the membership system interaction to move it from NFS to Family Tree.

Q. A lot of work from the past doesn’t have sources. Should we add sources?

A. Absolutely.

Friday, August 14, 2015

Last Day of “Fuel the Find”

Today’s the last day to participate in the FamilySearch Worldwide Indexing Event, “Fuel the Find.” The goal is to have 100,000 people index a record during the week. The current count of participants as I write this is 74,476. If FamilySearch is going to reach the goal, we all need to step up.

Visit https://familysearch.org/indexingevent2015 for more information and to participate.

Thursday, August 13, 2015

Ron Tanner Discusses Family Tree Road Map - #BYUFHGC

Ron Tanner at RootsTech 2015

Ron Tanner, product manager for FamilySearch Family Tree, spoke to the topic “FamilySearch Family Tree Road Map” at the 2015 BYU Conference on Family History and Genealogy. Perhaps because of his no-nonsense presentation style, attendees also peppered him with a lot of tough questions. Today I’ll present his prepared material. Next week I’ll share the questions and answers.

FamilySearch Family Tree is different from any other tree on the Internet, Ron said. The Tree is open. Anyone can fix errors. Someone new reaps the benefits of all the work that has come before. Some studies say as much as 80% of research is duplicate work. We are running about 500 thousand new persons added to the Tree every week. There are now about 1.1 billion people in the tree. The duplication rate is monitored very closely.

In 2015 FamilySearch has added many new features.

Tip tray. Down in the bottom right hand corner is a light bulb icon. Click the icon and a tray slides in from the right with tips for using the page. Not every page has one.

Landscape tree view. FamilySearch put in pictures and marriage information. They get complaints that because of these additions users cannot see as many people on screen. Click Show in the upper right corner to turn these on and off. If you turn everything off, you can see more than before these changes.

Dismiss suggestions. Suggested record hints can be dismissed. Click “Not a Match.” This dismisses it for everyone.

Ron shared features planned for Family Tree.

User messaging. Collaboration in Family Tree is extremely important. One change can affect hundreds of people working on that line. But some people can’t be contacted because they are not comfortable sharing their email address. FamilySearch is very close to releasing a messaging system that allows users to exchange messages without revealing email addresses. The messaging system is currently available on beta.familysearch.org. Ron invited attendees to get a friend and send some messages back and forth. To send a message, go to a conclusion, click on the name of the contributor. At the bottom of the person’s information is a link to send a message. The system adds a link to the person in question. You add a message. When the recipient logs onto FamilySearch, at the top they will see the number of messages they have received, but not read. Click messages to go to your Inbox. Both sent and received messages are shown in the Inbox, but you can delete them.

Stop synchronizing with NFS. Family Tree synchronizes with New FamilySearch (NFS) because NFS contains some code not yet implemented in Family Tree. Once synchronization between the two has ceased, there will be no issues preventing merging of duplicate records, there will be no automated contributions attributed to FamilySearch, and performance will improve.

Impedance features. FamilySearch is working on ways to discourage or impede improper changes, without preventing proper ones. Here are several under consideration:

Allow you to delete a person only if you are the creator and only contributor. Otherwise, you must submit a support request.
Show a list of all those watching a person. List the contact names. The idea is that a user, seeing all the people watching a person, will think twice before making changes.
Provide faster change notifications, perhaps daily or immediately.

Sharing of living persons. Today, persons in the tree exist in either a public space or a private space. Each user has their own private space containing all the living persons they created or FamilySearch created for them. When you change a living person’s record, it changes only the copy in your private space. No one else sees the changes. FamilySearch is planning to create a third type of view: a shared view. You create another space and invite others to see it. Participants can be given moderator, read/write, or read-only access. Participants can put stories and photos on the living family members in the shared view. Everyone sees everything in the shared view.

Hinting on mobile app. Users of the Family Tree mobile app will be able to see and accept hints.

Wednesday, August 12, 2015

Book Your Conference Hotels Early

#NGS2016GEN - I see that the NGS 2016 conference website is now live. Registration doesn’t open until 1 December 2015, but as the website states, “it is not too soon to think about hotel reservations.” The hotel adjacent to the conference center and the most inexpensive hotel both sell out well before the conference.

The conference will be held 4-7 May 2016 in Ft. Lauderdale, Florida. For more information about accommodations, see http://conference.ngsgenealogy.org/accommodations/.

#RootsTech - RootsTech 2016 is even sooner and is also coming fast. Between skiers (Salt Lake ski resorts are really close to the city), FHL patrons, attendees of other conferences at the Salt Palace, and the huge crowd drawn by RootsTech, adjacent hotels fill quickly. RootsTech will be held 3-6 February 2016 at the Salt Palace convention center in Salt Lake City. Registration opens 15 September 2015. While maintaining the number of classes, RootsTech is shifting its schedule to begin with two classes Wednesday afternoon and end with two classes on Saturday. This allows you to fly in Wednesday morning and fly out Saturday evening, thus saving one hotel night.

Speaking of hotels, for more information about RootsTech lodging, see http://rootstech.org/attend/hotels.

Tuesday, August 11, 2015

Ancestry.com Hiring Shows Future Plans

I happened across Ancestry.com’s job listing site. According to one job listing, Ancestry’s employee count is 1,400. Many of the job openings suggest the company is expanding. And they reveal some of Ancestry’s future plans.

They are hiring scanning technicians in various places: Honolulu, Hawaii; Richmond, Virginia; Toronto, Ontario; and Dallas/Ft. Worth, Texas. One can only guess what records they are acquiring at those locations. Another set of scanning technician listings is for scanning technicians for two different shifts in Provo, Utah; a third shift in Provo; and a fourth shift in Provo. It’s apparent that Ancestry is scanning records in Provo from 6 am to 10 pm! These listings also indicate Ancestry is utilizing part time labor in the Provo area, which is abundant due in part to the presence of 53,000 students at two large universities.

The job listings also reveal that “while most of Ancestry's subscribers are in the US, the company has a strong presence in the UK, Canada, and Australia, and is in the process of a large international expansion into Eastern Europe and Mexico.” They are hiring a marketing manager for Mexico. They are hiring a senior manager for global marketing campaigns. Other international hires are for one employee in Munich, Germany and two in Dublin, Ireland.

As a former software engineer at Ancestry’s Provo location, I find it interesting that they are expanding their software development at their San Francisco office with ten open positions, including two Android developers.

Their ProGenealogists division seems to be doing well. They have open positions for a genealogist account manager, an associate genealogist, a genealogist research manager, and an assistant genealogist.

A variety of positions show Ancestry’s interest in expanding their direct to consumer DNA and health offerings: an epidemiologist to manage large genetics studies, a director of genomics, a clinical genomics scientist to do computational algorithms, a vice president of business development to lead licensing and partnerships, a senior data scientist, and various software development positions explicitly for AncestryHealth.

With hackers accomplishing major incursions in companies around the world, Ancestry is hiring a Chief Information Security Officer as well as a senior engineer for information security. Given Ancestry’s possession of customers’ intimate DNA data, this seems prudent. Thank you, Ancestry!

Thursday, August 6, 2015

Meldrum Explains FamilySearch Online Book Collection - #BYUFHGC

Dennis Meldrum speaks about FamilySearch's online family history books.

FamilySearch Family History Books was created over 10 years ago to facilitate usage, sharing, and preservation of genealogy and family history books, said Dennis Meldrum, partnership manager for FamilySearch Family History Books. Dennis spoke to the topic “Finding and Sharing Your Family’s Story in Family History Books” at the 2015 BYU Conference on Family History and Genealogy.

The Genealogical Society of Utah (GSU), forerunner to FamilySearch, was organized in 1894. By 1907 the GSU library contained over 800 books. By 1920 the library contained over 5,000 books. In 1928, Boston bookseller, F.J. Wilder, wrote, “Within ten years your Society is destined to become the largest and strongest in the world…You will see in years to come people from all parts of the West and the East flocking to your city to spend days and weeks studying.”

FamilySearch Family History Books (FHB) at books.familysearch.org has 55,000 visitors a month, with 30% coming from outside North America. It has 220,000 books.

FHB contains many compiled family histories. In 2012 managers questioned the value of family history books compared to vital records. FHB did a statistical analysis to determine the average number of names on a page and how many already existed in FamilySearch Family Tree. They found that the average book contained 11.5 names per page and only one-third were in the New FamilySearch tree (now Family Tree).

Back when the library microfilmed books, it would not keep them. As FHB has scanned books, most have come off the shelf, some for copyright reasons. FamilySearch lawyers have said that a library’s fair use rights allow scanning copyrighted books so long as only one person is allowed to view the book or its copy at one time, and access is restricted to the library or a family history center or an affiliated library. Consequently, once scanned, copyrighted books are removed from the library, but are preserved in case FamilySearch ever has to show that they own a copy. Only one person can use the digitized book at a time unless the author has given permission otherwise, or the book is not protected by copyright. Even though FamilySearch could legally do so, it does not digitize copyrighted books whose author or publisher has requested their books not be digitized.

FHB has 14 digitizing operations throughout the United States. They have 36 production scanners, with over 150 volunteers who work more than 90,000 hours each year. FHB has just 5 professional staff.

Opportunities to volunteer are available at many locations. Volunteers don’t have to work full time. You need basic computer and internet experience. Good eyesight is critical.

The largest scanning center is in West Valley, Utah. There FamilySearch has two full time missionary coordinators and over 50 church service missionaries working part time.
There is a need for a couple of volunteers in Allen County Public Library, Fort Wayne, Indiana.
They need a couple at the Midwest Genealogy Center at the Mid-Continent public library in Independence, Missouri.
They need a couple at the Historical Society of Pennsylvania.
They need a couple now in Syracuse, New York at the Onondaga County Public Library and will need another in January.
They need volunteers in Pocatello, Ogden, and (maybe) San Diego.

In 2014, FHB had 16.9 million images, 185 million names, and 67,700 books. That was a 69% increase over 2013. The main goal for 2015 is to improve the user experience. They want to get a new book viewer. On FHB today, you have to download an entire book. By contrast, on Internet Archive you download one page at a time, which gives a much quicker user experience.

Dennis demonstrated how to use the FHB website. I’ve already written about that. See “FamilySearch’s Electronic Books - #BYUFHGC” on my blog.

FamilySearch’s Record Search does not currently search FHB. Record Search depends upon fielded information. FHB book indexes are every-word indexes that are not fielded. FamilySearch is performing a pilot where several stakes are creating fielded indexes so they can be searched in Record Search. Record Search will then be able to reliably find people mentioned in FHB’s digitized books.

Wednesday, August 5, 2015

Aaron Orr Talks Ancestry DNA at BYU Conference – #BYUFHGC

Aaron Orr talks about AncestryDNA at the 2015 BYU Conference on Family History and Genealogy.

Aaron Orr, product manager at Ancestry.com, spoke to the topic “Ancestry.com: Using AncestryDNA to Further Your Research” at the 2015 BYU Conference on Family History and Genealogy.

Aaron asked how many attendees had taken an Ancestry DNA test. Many had. He asked how many felt like they had learned new information because of it. Only a few hands went up. He said the class would help us get more value out of our DNA test results.

First, Aaron reviewed some inheritance concepts that help people understand common questions with DNA test results.

Siblings from the same parent inherit different DNA from their parent. If you go down several generations, you have less and less DNA from a particular ancestor. After enough generations, it is possible for cousins to share no DNA. If you have your children tested, it is possible that they won’t all have the same ethnicity, depending on what they inherit from you and what they inherit from your spouse.
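To put rough numbers on "less and less DNA," here is a minimal sketch. It assumes a simple halving model, in which each generation passes along on average half of a parent's autosomal DNA. That model is my illustration, not something Aaron presented, and the function name is hypothetical. Real inheritance varies around these averages because of recombination, which is why distant cousins can end up sharing no DNA at all.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of your autosomal DNA that comes from one ancestor.

    generations_back = 1 is a parent, 2 a grandparent, and so on.
    Assumes simple halving per generation (an averaging assumption).
    """
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 0.5 ** generations_back


# Print the average share for parents through 6th great-grandparents.
for g in range(1, 9):
    print(f"{g} generations back: about {expected_share(g):.2%} on average")
```

Under this model the average share from any one ancestor eight generations back is already well under one percent.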

Next, Aaron presented several features that can be used to get more value from your Ancestry DNA test results.

An AncestryDNA DNA Circle for James Davenport

DNA Circle – A DNA circle is a group of likely descendants of an ancestor. The circle is built using a combination of the data from your Ancestry Member Tree and the data from your DNA. The beta label indicates that AncestryDNA is actively improving the algorithms. Aaron said the results are conclusive enough that AncestryDNA can share them with you. Click the ancestor portrait to see a page about that DNA circle.

Scroll down a little to see a diagram (below) showing the relationships of those in the circle.

Diagram of the relationships in the AncestryDNA DNA Circle for James Davenport

In the DNA Circle relationship diagram, the thick orange lines show the group members with which you share DNA. What about the others in the circle? Even though you don’t match some members of the group, they match others in the group with which you do match.

Rather than show every single member of the circle, very closely related persons are grouped together. In the example above, the group highlighted at the top, the Thompson Family Group, consists of five closely related persons.
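One way to picture how someone can be in your circle without matching you directly is to treat DNA matches as links in a graph. The sketch below is only an illustration of that idea. It is not AncestryDNA's algorithm, which also draws on member trees, and the function name and the people in the match list are made up.

```python
from collections import deque


def linked_members(matches, start):
    """Return everyone reachable from `start` through a chain of DNA matches.

    `matches` is a list of (person, person) pairs that share DNA.
    This is a plain breadth-first search over hypothetical data.
    """
    graph = {}
    for a, b in matches:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for other in graph.get(person, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


# Hypothetical match list: you match A and B; C matches only A.
matches = [
    ("You", "Cousin A"),
    ("You", "Cousin B"),
    ("Cousin A", "Cousin C"),
    ("Stranger X", "Stranger Y"),
]
circle = linked_members(matches, "You")
direct = {b if a == "You" else a for a, b in matches if "You" in (a, b)}
```

Here Cousin C ends up in `circle` but not in `direct`, which mirrors the members of the diagram who are connected to you only through other members.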

An AncestryDNA New Ancestor Discovery for William Lauder Payne

New Ancestor Discovery – The New Ancestor Discovery feature hints to a possible ancestor. To differentiate it from a full DNA circle, it is surrounded by a dotted line and has a leaf like a record hint. Your DNA matches a DNA circle well enough that you are possibly a descendant of that ancestor.

Click on the portrait and AncestryDNA creates a popup (below) that shows a little information about the possible ancestor.

Popup of the New Ancestor Discovery for William Lauder Payne

Click LEARN ABOUT and AncestryDNA gives a Life Story about the possible ancestor. This can give you clues about how this person might be related to you. Click on See Your Connection to see the DNA Circle. When you view it, you are shown outside the circle, indicating you are not a proven member of the circle. The ancestor doesn’t exist in your Ancestry Member Tree.

Diagram of the relationships in the DNA Circle of William Lauder Payne, a possible ancestor

Just as with DNA Circles, you may not share DNA with every member of the possible ancestor’s DNA Circle. Orange lines show shared DNA, as shown in the above example between myself and the Harris and Van Orden family groups. While I share no DNA with the E. B. family group, its members share DNA with both the Harris and Van Orden families.

Shared Ancestor Hints – A shared ancestor hint shows how you are related to a DNA match. The relationship is determined using your Ancestry Member Tree and your match’s. In your match list, a leaf icon indicates a shared ancestor hint is present. Hopefully, you have public trees. That is how you get Shared Ancestor Hints.

A Shared Ancestor Hint

DNA Matches – A lot of work goes on behind the scenes to find a match. Every single sample is compared with the million samples. Don’t be timid about sending a message to a DNA match who may be able to help you with your research. Try and start a dialog. Be specific with what you know and what you suspect, particularly if you don’t have a Shared Ancestor Hint.

After presenting these features, Aaron took a few questions. Here are a few that I thought would be of interest.

Q. What is being done with the Sorenson database samples? A. The samples are currently being used only as ethnicity reference points.

Q. Why did you get rid of the Y DNA tests? A. Aaron indicated he was not the best person to answer that question. Each test has its uses and strengths. His understanding is that AncestryDNA has found that the autosomal test gives the best ability to establish cousins.

Q. What is the Snavely tool? A. Snavely’s [AncestryDNA Helper] tool is a Chrome plugin. It works with AncestryDNA to provide additional ways to work with your AncestryDNA results.

Aaron closed by telling us about other ways to get answers to our questions about AncestryDNA. Look for the question mark icons on the AncestryDNA web pages. Also, check out the free Ancestry Academy course, “DNA 101: An Insider’s Scoop on AncestryDNA Testing.”

Tuesday, August 4, 2015

“Angels Round About” - The Cokeville Miracle – #BYUFHGCON

T.C. Christensen addresses the 2015 BYU Conference on Family History and Genealogy.

Several of the children who survived the Cokeville school bombing related having seen angels protect them.

“Most of the children that had a spiritual experience with this event identified an angel that helped them as being their ancestor,” said T.C. Christensen. T.C. gave the closing keynote, “Angels Round About,” at the 2015 BYU Conference on Family History and Genealogy. T.C. Christensen is a filmmaker whose most recent movie is The Cokeville Miracle.

The movie retells the true story of the school bombing at Cokeville, Wyoming on 16 May 1986. A madman, David Young, and his wife took 136 children and 18 adults hostage at the Cokeville Elementary School. Armed with firearms and a bomb triggered to his wrist, Young demanded a ransom of two million dollars per hostage. Several hours into the standoff, Young transferred the trigger onto his wife’s wrist. She accidentally triggered the bomb, causing a massive explosion. It should have blown the entire building down. Many consider it a miracle that no one was killed, although dozens were injured. In the days that followed, the children began to tell amazing stories.

T.C. played several clips from the movie. In one clip, a young boy told his parents there were others in the room, all dressed in white. A woman told him to stand by the window and he would be okay when the bomb went off. He thought she was his grandmother who lived in a different town. Later, he identified her with a picture of his deceased Grandmother Elliot. A young girl saw a lady dressed in white who told her to go stand by the window with her brother and everything would be okay. Another girl, Katie Walker, saw angels stand between the children and the “bad guy.” Afterwards, Katie’s mother showed her a locket with a picture of her own mother who had died when she was just fifteen. Katie identified the woman in the locket as her angel. Another girl, Jennie Sorensen, said a woman helped her from the room after the bomb went off. She later identified the woman as her aunt who had died 10 years earlier. At the end of the clip, the first boy also said people, bright like light bulbs, were holding hands in a circle around the bomb. When the bomb went off, they went up through the ceiling, which investigators had determined was the main direction of the bomb blast.

“You are interested in family history?” he asked. “I can’t think of a better justification for what you are doing than the Cokeville event.”

T.C. briefly reminded attendees of another incident. In 1999 a gunman entered the Family History Library in Salt Lake City and began shooting people, two fatally. Survivors told of seeing men in white preventing the shooter from leaving the orientation room. (See Emiley Morgan, “Desire to serve: Even a gunshot couldn't keep Nellie Leighton from being a missionary,” Deseret News, 2 July 2015, web edition (http://www.deseretnews.com : accessed 31 July 2015).)

After his main presentation, T.C. turned the time over to Katie Walker, now Katie Payne. She told us that she had never before seen the woman or the photo in the locket. Her mother was very traumatized by her own mother’s death and kept the locket tucked away. The only photo she had of her was the photo in the locket.

“I’m grateful to be able to be a witness to those miracles,” she said. “There is a loving Heavenly Father that allows his angels to come back and help us in our time of need.”

She shared with us that she has had conversations with survivors of the Columbine shooting. One told her that an ancestor ran with them from the school.

Jennie Sorensen, now Jennie Johnson, was also present at the keynote. An attendee asked if she immediately recognized her angel. She said, no, she just followed her thinking she was a teacher. At one point she stopped to get her shoe. She had lost it and she was afraid her mother would be mad. The lady told her not to go back. One day while transferring photographs from a photo album, she saw the woman and asked, “When will I have that teacher?” It was then that she learned the woman was her mother’s favorite aunt who had passed away several years before.

An audience member asked if they had post traumatic stress syndrome. Katie said they did. She personally suffers from a fear of men, loud noises, and the smell of gasoline. While camping, the campfire smell can trigger nightmares. Episodes have become less frequent, but they are still there. Making and watching the film has been hard, but she thought it has assisted healing.

T.C. was asked why people are not always saved in these incidents.

“Even in Christ’s day not every leper was healed, not every blind person was made to see. But when we see God’s hand, we should recognize it, praise and be thankful for it. And that’s what we do today in speaking of Cokeville.”

Monday, August 3, 2015

Monday Mailbox: AncestryDNA Study Projects

The Ancestry Insider's Monday Mailbox

Dear Ancestry Insider,

I have a concern about this [DNA] database usage by Ancestry in the future and I don't recall seeing you comment on it.
The only reason that I have not submitted my DNA is because I don't know what Ancestry has the right to do with the information or whom they can give it to in future or what their usage will be for generations to come.
I've read Ancestry.com's agreement but it does not specifically address my concern.
Do you know?
Thank you,
ToBe

Dear ToBe,

I put off giving my DNA for a long time, for the same reason. I wish I had the time today to carefully consider your question. But at least I can relate something that happened recently. More on that in just a minute.

Signed,
The Ancestry Insider

Dear Ancestry Insider,

Aren't you going to discuss Ancestry's newly announced sale of all of their DNA to a pharmaceutical company and their "partnership" to develop based on their users' DNA and family histories? I believe this is a major development that needs coverage and discussion.

Thank you,
A Gate Pass

Dear readers,

Paragraph 2.vi of the AncestryDNA privacy policy states

vi) To perform research: AncestryDNA will internally analyze Users’ results to make discoveries in the study of genealogy, anthropology, evolution, languages, cultures, medicine, and other topics.

Paragraph 6.i. states

i) Non-Personal Information. Your use of the Service may result in the storage or collection of information that does not personally identify you (“Non-Personal Information”)… Non-Personal Information also includes personal information that has been aggregated in a manner such that the end-product does not personally identify you… Because Non-Personal Information does not personally identify you, we may use Non-Personal Information for any purpose, including sharing that information with the Ancestry Group Companies and with other third parties.

Ancestry recently announced a partnership with Calico, “a Google-funded research and development company whose mission is to harness advanced technologies to increase our understanding of the biology that controls lifespan.” I assume this is the event A Gate Pass is referring to. The announcement says that “together, they will evaluate anonymized data from millions of public family trees and a growing database of over one million genetic samples.” If Ancestry anonymizes the data through aggregation, then Ancestry is complying with its stated privacy policy.

There is another agreement under which AncestryDNA can share your DNA information. AncestryDNA offers users the opportunity to participate in “the research project.” Participation is optional. The research project consent agreement describes rather far reaching uses of your DNA. The project

collects, preserves and analyzes genealogical pedigrees, historical records, surveys, family health data, medical and health records, genetic information, and other information (collectively, "Information") from people all around the world in order to conduct research studies to better understand, among other things, human evolution and migration, population genetics, population health issues, ethnographic diversity and boundaries, genealogy, and the history of our species ("The Project").

…

Discoveries made as a result of this Project could be used in the study of genealogy, anthropology, population genetics, population health issues, cultures, medicine (for example, to identify drug response, health risks, etc.), and other topics.

The agreement seems to envision multiple projects within The Project.

Each time an individual study is undertaken with a third party researcher or anytime the results of the Project or an individual study are to be published, Information that traditionally permits identification of specific individuals, such as names and birth dates, is removed from the Information.

The studies might result in scientific results which AncestryDNA scientists can publish. The studies may result in commercial products. Results will not be shared with you or your doctor.

The data that Ancestry is providing to Calico may be under this agreement. The way I read it, anonymizing your DNA may be as simple as removing your name and other conventional identifying information.

I’m under some time constraints and did not carefully read these agreements. You may wish to do so before accepting my evaluations.

Read the entire Ancestry/Calico partnership press release on the Ancestry corporate website. Read the AncestryDNA privacy policy and the Consent Agreement on the Ancestry website.

Biography

The Ancestry Insider was a readers’ choice for the top four genealogy news and resources blogs, part of Family Tree Magazine’s “40 Best Genealogy Blogs” for 2010. He reports on the two big genealogy organizations, Ancestry.com and FamilySearch. He was named a “Most Popular Genealogy Blogs” by ProGenealogists, and has received Family Tree Magazine’s “101 Best Web Sites” award every year since 2008. A genealogical technologist, the Insider has a post-graduate technology degree and holds a dozen technology patents in the United States and abroad. He has done genealogy since 1972 and has worked in the computer industry since 1978. He was Time Magazine Man of the Year in both 1966 and 2006. And he really is descended from an Indian princess.
