Thursday, May 10, 2012

Ancestry.com VIP Briefing

Fruit-ka-bob trees at Ancestry.com VIP reception

I was lucky enough to get an invitation to Ancestry.com’s Wednesday evening VIP briefing at the 2012 annual conference of the National Genealogical Society. Here’s some of the stuff they covered:

First, the presentation of the refreshments was fantastic. Fruit-ka-bobs stuck into pineapple-trunks of tropical trees. Eye-popping good.

Ancestry favored us with three presenters.

Ancestry DNA

John Pereira spoke about AncestryDNA. You’ve heard most of the hoopla and I talked a little bit about it yesterday. (See “Ancestry.com Q & A at NGS Conference.”)

To give you an idea of the scope of the new product, while the old Y-test compared 46 markers, the new one uses 700,000.

Ancestry DNA ethnicity pie chart and map

As shown in the image, the test shows your ethnicity divided up on a pie chart and marked on an adjoining map.

Possible cousins are identified. First to fourth cousins are indicated with a percentage confidence level. More distant cousins are shown with a confidence level of 50% or less.

If you have an Ancestry tree, the Map and Location feature indicates the number of ancestors from each region of the world. If your cousin also has a tree, the Pedigree and Surname feature shows your common ancestor and the lines of descent for your cousin and yourself.

Content

Dan Jones talked about Ancestry’s content. I thought it was a great sign that Ancestry values content enough to have a person dedicated to acquiring and managing it.

Statistics (most are current as of the end of March):

  • Years spent acquiring, digitizing, indexing, and publishing content: 15
  • Dollars spent so doing: $115 million
  • Records online: 10 billion
  • Collections online: 30,000
  • Trees created: 33 million
  • People in trees: 4 billion
  • Photos and stories uploaded: 115 million
  • User additions and corrections: 44 million
  • New collections in 2011: 485

Recent 2012 releases:

  • Massachusetts Vital Records 1620-1920 (the Holbrook Collection)
  • They finished the 1911 UK Census on Thursday
  • Pennsylvania Church and Town Records 1708-1985
  • Titanic Collection
  • London Land Tax
  • London Electoral Registers

Ancestry has republished their city directories using a fielded OCR technology that makes the city directories much easier to search and use. (See my recent article, “Data Extraction Technology at Ancestry.com.”) At the same time, they’ve doubled the size of the collection.

As shown in the graphic below, the comparison of before and after is impressive. Searching the directories before was about the same as looking through a “bag of words.” Today, fielded information makes it possible to reliably search for names and places. The change has produced a major uptick in Ancestry’s record count. If I understand their counting methodology correctly, the old collection contained 6.6 million records (bags of words), whereas now it contains 1 billion records (the people named in the directories). These new records can be attached to trees and can be corrected. Already, users have discovered 6.2 million people (110,000 a day) and submitted 92,000 corrections.

Ancestry.com U.S. City Directories - Then & Now
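
To make the difference between a “bag of words” and fielded records concrete, here is a minimal Python sketch. It is not Ancestry’s technology. The two directory lines are invented samples, and the layout they follow (surname, given name, occupation, home address) is an assumption made only for illustration.

```python
# Invented sample lines in an assumed "Surname Given, occupation, h address"
# layout. They are not taken from any real directory or Ancestry record.
PAGE_TEXT = [
    "Baker John, carpenter, h 12 Mill St",
    "Miller Anna, teacher, h 40 Baker St",
]


def bag_of_words_search(lines, term):
    """Unfielded search: match the term against any word on the line."""
    term = term.lower()
    return [
        line for line in lines
        if term in line.lower().replace(",", " ").split()
    ]


def parse_fields(line):
    """Split one sample line into named fields (assumed layout only)."""
    name, occupation, residence = [part.strip() for part in line.split(",")]
    surname, given = name.split(" ", 1)
    # Drop the leading "h " (home) marker if it is present.
    address = residence[2:] if residence.startswith("h ") else residence
    return {
        "surname": surname,
        "given": given,
        "occupation": occupation,
        "address": address,
    }


def fielded_search(records, surname):
    """Fielded search: match the term against the surname field only."""
    return [r for r in records if r["surname"].lower() == surname.lower()]


if __name__ == "__main__":
    records = [parse_fields(line) for line in PAGE_TEXT]
    print(bag_of_words_search(PAGE_TEXT, "Baker"))  # person and street both match
    print(fielded_search(records, "Baker"))         # only the person named Baker
```

In this sketch the unfielded search for “Baker” returns both lines, because it cannot tell a surname from a street name. The fielded search returns only the person whose surname is Baker, and each person becomes one record, which is consistent with the jump in record count described above.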

Ancestry is looking at additional printed content for this technology, such as printed family histories. I think if they can get that working, that would be phenomenal.

When it comes to the 1940 census, Jones said that Ancestry considered joining the 1940 U.S. Census Community Project, but ultimately decided that controlling their own index put them in a better position. They are indexing more fields and have partnered with IPUMS, a project of the Minnesota Population Center at the University of Minnesota.

Jones presented the timeline for Ancestry’s first release of the census. He warned us that he had some of the time zones wrong. I think I fixed them, but you’ve been warned.

  • 2 April 12:01am – Sabrina & Josh (Ancestry employees) pick up images from NARA
  • 2 April 12:20am – Images arrive at Ancestry DC office
  • 2 April 12:37am – First 4 rolls imported and converting
  • 2 April 1:22am – First images live on Ancestry.com
  • 2 April 2:00am – Drives containing images fly back to HQ
  • 3 April 3:00pm – First indexed data arrives at Ancestry.com HQ
  • 5 April 4:00pm – Complete DE and NV live on Ancestry.com
  • 6 April 4:15am – All images live on Ancestry.com

The collection has been popular. On April 6th alone, the 1940 census images were viewed more times than all eight open UK censuses are viewed in a typical month!

Product Improvements

Eric Shoup talked about Ancestry product improvements. Ancestry has improved several things about its hinting feature. Notifications occur in the website header in addition to the old e-mail system. Hinting has been extended to your entire tree. (I didn’t know it wasn’t doing the entire tree.) Ancestry is generating more photo and story hints as well as hints on new collections. Hints can be turned off for individual trees. An All Hints page allows quick review and disposition of new hints across an entire tree. Soon, possible extensions to family trees will be indicated on the pedigree itself.

The Ancestry mobile app continues to be popular; they have reached 3 million downloads. They are ready to release a new family view. The application is nowhere close to where they want it to be. As Shoup mentioned at RootsTech, they are increasingly thinking of mobile applications before desktop, so they are forced into the discipline imposed by a mobile application.

Synchronizing Family Tree Maker (FTM) with Ancestry Member Trees has been popular. Since September over 140 thousand people have set up synchronizing between their trees. Trees can be quite complex. They’re seeing an average of 2,047 source citations per tree and 130 media items. I’ve told you my experience. I have so many media items that it took hours to synchronize. Fortunately, FTM did the operation in the background.

Shoup showed off their new census viewer, currently available for U.S. 1930 and U.K. 1911. As you scroll about the census, the viewer displays the people’s names even when scrolled off the page. They will soon show column headers. Hover over a field and a popup shows the contents for those who have problems reading the handwriting. The person of interest is highlighted in yellow and the household is highlighted in green.

Ancestry.com new image viewer has headers, highlights, and field popups

Eric Shoup answers questions at VIP reception

Shoup also took questions from attendees.

He couldn’t give answers to several questions about life after the Archives.com acquisition. “We can’t plan our lives together until we’re together,” he said. Ancestry will do what makes sense.

One attendee asked if Ancestry will open up its APIs to allow third-party vendors to synchronize with Ancestry Member Trees. Shoup said that they have no strategic objection, but there are tactical concerns. Getting FTM to sync was a major undertaking. Ancestry would hate to establish all the support necessary for an outside vendor and then not have sufficient interest.

To index the 1940 census, Ancestry is using a select number of offshore vendors, vendors with which they have an established relationship. Shoup said they are “dialing up” everything about the 1940 census: size, scope, quality, number of fields, and so forth.

Stay tuned for more National Genealogical Society Conference coverage…

1 comment:

  1. "Hinting has been extended to your entire tree."

    I have not seen this. I get 'hints' without a specific search when looking at an individual's overview, or by looking at a pedigree- or family- view box chart. Then the 'hints' are low-hanging-fruit: tree hints, or hints for items saved by others to their trees (for the individual in my tree, if the program has made a "same person" ID). Or post-1880 US Census items, still with a lot of server confusion about women's birth- and married-surnames.
