Thursday, February 23, 2017

Town Hall Meeting at #RootsTech – Tree Edition

(L to r) Shon Watkins, Stephen Valentine, Rod DeGiulio, and Craig Miller prepare for FamilySearch town hall

This is the second in a series of articles about FamilySearch executives’ town hall meeting during RootsTech 2017. Yesterday I published questions and answers pertaining to records. Today the topic is Family Tree.

As I warned yesterday, I didn’t always capture correctly everything that was said. What you’ll read below may or may not bear any resemblance to what was actually said.

Q: What happened with FamilySearch in 2012? A lot of changes in Family Tree are attributed to FamilySearch in that year.

A: Family Tree indicates 2012 because that year we migrated systems. It indicates FamilySearch contributed the information when we don’t know who contributed it. Long ago, back in the 70s and 80s and so forth, we received contributions for which we don’t have a valid user ID in our system. But since we require everything to have an owner, we assigned FamilySearch as the owner.

Q: Will FamilySearch ever have a place to post DNA results?

A: DNA is a valid record type. But we don’t want our Church members to feel like they are expected to spend $100 to fulfil their responsibilities. Having a place for DNA in the system is under advisement.

Q: There were a lot of questions on the ability of sharing living records. Why can’t we see living persons in FamilySearch.org? When are we going to have shared spaces for living persons?

A: We are currently inventing that new feature. It is going to be a while because it is a big job to preserve privacy while allowing sharing, to keep private all those who wish to remain private, and make public all those who wish to be public. There’s a lot of legal work to do. There’s a lot of coding to do. It is going to be a while. Stay tuned.

This is driven by the principle that individuals and families are the gatherers of their families. Since this is a family effort, let’s make sure families can see both deceased and living information so they can do this as a family.

Q: I have added a lot of photos for the living. Why can’t other people see them?

A: The model with sharing will be that you can create a private space and invite people to go into that space. Everyone puts living persons in that space. Those persons are visible to everyone else in the space. A person can be a member of multiple private spaces. When someone adds a photo, everyone in the space can have access to the photo.

Q: [I didn’t catch the follow up question or most of the answer.]

A: If you tag a living person in a photograph, then it will be private. [I was a little confused at this point. I think that all photos and documents on FamilySearch are visible to anyone and everyone who has a URL to the photo or document. I think if a photo is tagged to both a living and a deceased person, then anyone can find the photo through the deceased person. What tagging a living person does, is hide the photo from Google’s search engine. At least that’s my understanding.]

Q: Why did you incorporate the name LDS Membership as a source in FamilySearch.org?

A: [Let me take a stab at a more detailed answer to this question than provided by Craig Miller.

You may recall that until June 2016, Family Tree was linked to the backend of the archaic NFS. NFS treated the LDS Church Membership database as if it were an actual user. When that pseudo-user made changes, those changes were attributed to “LDS Church Membership.” That is why FamilySearch incorporated that name as a source of changes in FamilySearch Family Tree.

Incidentally, once the link with NFS was broken, that pseudo-user ceased to exist. It no longer contributes or owns any data in Family Tree.]

Wednesday, February 22, 2017

Town Hall Meeting at #RootsTech – Records Edition

(L to r) Shon Watkins, Stephen Valentine, Rod DeGiulio, and Craig Miller prepare for FamilySearch town hall

FamilySearch executives held a town hall meeting during RootsTech 2017 and answered questions for an hour. Today I will write about the questions and answers pertaining to records, online or on microfilm, and partners.

Present were:

  • Steve Rockwood – President and Chief Executive Officer.
  • David Rencher – Chief Genealogical Officer.
  • Stephen Valentine – Vice president, partners and records. He handles record ingest and publication.
  • Rod DeGiulio – Vice president, priesthood and area support. He handles record acquisition and management of all FamilySearch employees outside the Salt Lake area.
  • Craig Miller – Vice president, product and engineering, including the website FamilySearch.org.
  • Shon Watkins – representing Diane Loosle, vice president, patron services. Diane’s division includes customer service, the Family History Library, and family history centers.

No one was representing the member and public outreach (marketing) division. They were busy doing some incidental project called RootsTech. Perhaps you’ve heard of it.

I’m sure some of these people are senior vice presidents—maybe all of them—but the conference app didn’t list them or their titles.

I need to warn you that I was typing like a madman trying to take notes of what was asked and said. I didn’t do a very good job. Consequently, what you’ll read below may or may not bear any resemblance to what was actually said.

Q: Is there any way to suggest acquisitions, such as a small community in Germany?

A: There is a dedicated team that develops that strategy. Today, we don’t have the capacity to do lots of small archives, but we are experimenting with ways to do so. We have a Record Capture Kit, for example, that could be loaned to a local society.

Q: Paul Nauta of FamilySearch in early 2017 wrote that over 30% of the 2.4 million rolls of microfilms in the vault have been digitized and published. That seems low. Is he right?

A: About 50% of the vault has been digitized. We don’t ever say when we will be done, but we are making really good progress. We go in priority order, with vital records and civil registration and census being digitized first, pretty much from around the world. Those are primarily done.

Q: Why don’t I see all of them?

A: The reason for that is rights. In some cases, in some countries, we do not have the full rights to put all that content online. We can preserve it, sometimes we can even loan it on microfilm. We are working on the digital rights. That is something that Rod’s team is constantly working on, to improve our rights, and we are having lots of successes. We see laws change. We just had a huge opening in France; those laws are changing and we are starting to digitize. That’s an ongoing process. Over the next couple of years we are trying to wrap up digitization.

It isn’t going to take decades to finish digitizing the vault. We’re down to just a few years left until it will be done.

One of the drivers is just the incredible cost of microfilm. It just keeps going up and up and up. So even if we were in love with microfilm—and we’re not—price is pushing us to get this done much faster.

[This is TAI speaking, here: My understanding is that the manufacture of microfilm could cease almost without warning at any time. That could be really bad.]

Q: When will the microfilm of the membership records of The Church of Jesus Christ of Latter-day Saints be completed?

A: We actually already digitized and indexed it. That was done as a closed indexing project several years ago. But it has not been released. We are working with the Church History Department, which actually owns those records, to see how we can make them available to you.

Q: [I couldn’t hear the question, something about removing films from the Family History Library?]

A: Only the films with your ancestors will be removed. We have a very complex algorithm that knows which films you need. Those are the films we remove. (Laughter)

As films become digital, we decide whether to leave them in the library or not. The films that need to be in the library because of rights, they will remain. It will keep changing.

Q: Some record sets that were once visible online are no longer visible once they were indexed. How come?

A: The only reason we would ever do that is rights. It’s that simple. It’s rare, but it happens. A law could change. We’re constantly working with our legal staff around the world. Data privacy laws change. That’s become more restrictive around the world, data privacy. And we do everything we possibly can to not turn those records off. But if we do, that would be the only reason.

Q: Why is it that Ancestry has a lot of those images that FamilySearch doesn’t? Why would other websites have records that we don’t?

A: We are more conservative than other companies. We are going to err on the side of protecting data privacy, more so than other companies are.

I’ll tell you, privacy laws are one of the most difficult things we have to work with. Every country is different; even in the United States every state is different. And it changes constantly. And so there will be collections that come and go. Fortunately, there are a lot more that are becoming available than those that are being taken away.

Be aware of some of the legislation on the issues that come up. It does work when we [discuss] and work with our congressmen and senators. But be aware that that is something you can help us with. Coalitions of genealogists have really helped. France is a great example of that, opening up access that was once much more restrictive.

Q: What are your plans for future partners?

A: We are always on the lookout for more partners. We are also looking for partners outside the traditional genealogy space to create a richer experience for you. And we’re eager to find partners in other countries. Geneanet is a new partner in France. We have the largest collection of Chinese genealogies outside China, so we are looking into a partner in China. If you are aware of partners, contact me.

Q: [One guy tried to ask a question and people kept interrupting him:] There is an extension from the Google app store, but it puts all the partners in there, it auto populates… Record Search? Record Search. Record Seek. Record Seek? Record Search. Search. It’s Record Search. There’s two different ones.

[I feel like I’m watching the Tonight Show.]

A: I’m not familiar with that one. So what was it called again?
Q: It’s called Record Search.
A: Record Search.
Q: It is a Google extension. Record Search.
A: Oh, it’s a Google extension. A Google extension, Record Search.

[Yup. They are definitely channeling Higgins and Fallon.]

Q: Any chance of a newspaper acquisition or collaboration?

A: We are working with some great newspaper companies and the obituaries you are now using on FamilySearch.org are due to a partnership with NewsBank. We want to bring more and more of that newspaper content. It is in the works.

Q: Have you contacted national libraries that call out those historical newspapers, many of which are public domain?

A: Yes. We are working to expand our newspaper holdings. Until two years ago we were doing nothing in the newspaper space, so this is a new area for us. We just did 26 million obituaries where the computer did the entire thing. OCR has been around a long time, but now we are using technology to understand the text. And we really want libraries outside of this country. Absolutely.

Q: As new partners are added to FamilySearch.org, how do you suggest we learn to use the partner’s technology?

A: Traditionally we have not done a lot to help you use our partners’ products. We generally send you to the partner to learn. We are working on playing a bigger role, but the first thing will always be to go learn at the partner website. But where our two products interact, we are working on how to do that.

Tune in next time for more questions and answers.

Tuesday, February 21, 2017

Passionate Genealogist is Core - Tim Sullivan at #RootsTech

Tim Sullivan at RootsTech

I had the opportunity during RootsTech to sit down with Tim Sullivan, president and CEO of Ancestry. One of the things we discussed was the dichotomy between giving experienced users powerful tools while giving new users an engaging experience. I’ve always felt they compromised the power of core tools by watering them down to suit the new user.

Tim said the passionate genealogist is still the core of their business. DNA is proving to be a phenomenal way to interest new people. While they will give some thought to making core genealogy tools accessible to new users, they are allowing themselves to shift back, improving tools critical to the experienced user. DNA is expanding massively and it is giving them the opportunity and personnel to do some improvements on functionality for the serious genealogist. He mentioned adding intelligent hint prioritization, and “using our big tree to really improve the quality and relevance of hints.”

Tim went on to explain more about their Big Tree. Big Tree is an internal term they use for an effort they’ve been engaged in for many years to stitch together the millions of member trees on Ancestry.com. They are applying machine learning technologies and authority systems and are getting more accurate every day.

“That has always been a little bit of a holy grail, to find that one tree,” Tim said. They are taking a different approach than FamilySearch, but then again, their purposes are different. The Big Tree is not intended to be a product, but something that allows them to develop “some pretty cool capabilities.” One application is their new We’re Related app.

The We’re Related app is a free, entertaining tool for engaging more people in genealogy. “It’s just another way to get a whole new group of people inspired,” Tim said. “What we hope that does is lead them to want to become serious researchers.” The challenge is getting people connected into the Big Tree. Something like 2/3 of people downloading the app are able to connect to the Big Tree, even though they may not have an Ancestry Tree. This is why a Facebook account is instrumental. It helps Ancestry build out living persons and their relationships. The other necessary step is for Ancestry to add famous living persons to the Big Tree and make certain their branches are correct.

We Remember is another product Ancestry is developing to attract a new audience into genealogy. Just announced at RootsTech, We Remember will allow people to create memorial pages for loved ones. Tim said that We Remember is not a replacement for obituaries. But about the time a loved one passes away there is a lot of energy and motivation to memorialize and capture their life. About a year ago they realized there could be a better experience for doing this, so they built one. It is social and it is free. “Our goal is that this be very, very broadly adopted,” Tim said. I was, unfortunately, not able to attend the class where it was introduced, so I don’t have any details. While he didn’t give a release date, he said it would be soon.

Saturday, February 18, 2017

Ancestry UK Collection Free Through Monday, 20 February 2017

To celebrate Presidents Day in the United States, Ancestry has opened up more than one billion records from its United Kingdom collection for free access. I jest a bit. Presidents Day and free UK access may just be a coincidence. (But it does make me think of the meme that goes around on the 4th of July: “Happy Treason Day, Ungrateful colonials.”) To quote Ancestry:

Ancestry is opening up their site for you to explore more than 1 billion UK records—so you can find out if you're one of the 60 million plus Americans with British ancestry. It's for three days only, so now's the time to find those crown jewels hidden in your family tree.

To search the collection, click here. To see a list of the databases, click here.

Access to the records in the featured collections will be free from February 17, 2017 at 2:00 p.m. ET to February 20, 2017 at 11:59 p.m. ET. After the free access period ends, you will only be able to view the records in the featured collections using an Ancestry World Explorer or All Access paid membership.

Thursday, February 16, 2017

Robert Kehrer’s Industry Trends and Outlook – #RootsTech

Robert Kehrer at RootsTech 2017

Robert Kehrer, product manager at FamilySearch, took part in a panel discussion titled “Industry Trends and Outlook” at the Innovators Summit portion of RootsTech 2017. Robert wrestles with big data technology problems at FamilySearch.

One of the hardest things Robert faced in preparing his presentation was narrowing down the areas that he wanted to talk about. He narrowed things down to three categories of innovation: technology, process, and data.

The first technology innovation he sees coming is automated transcription: the ability of a computer to transcribe a document. There have been some recent advances, particularly in the area of handwriting recognition. Today automated transcription works well on typescript documents and pretty well on print handwriting. The ability to recognize cursive writing is showing promise. However, there are really messy documents for which automated transcription is not likely to work.

Robert Kehrer says automated transcription of some documents is harder based on handwriting style

Another area where technology innovation is happening is named entity recognition. A computer takes transcribed text and, using a process called natural language processing, picks out the names, dates, locations, relationships, and so forth. Progress is being made in this area.

Innovation is happening in neural networks and machine learning and is important in combination with automated transcription and named entity recognition. Machine learning is not difficult to understand when demonstrated with a simple example. Machine learning could make it possible to show the machine many images of the name William. Subsequently, when names are shown to the machine, it can pick out those that are William.

Robert Kehrer demystifies machine learning

Don’t think that these technologies are going to replace human indexers. These technologies must be trained using data indexed by people. And these technologies free up people to do what only people can do.

Innovation is happening in fuzzy search advancements. Fuzzy is a funny word that he used to refer to non-exact search results. This is familiar stuff like wildcards and name variants. Robert feels like there could be some innovation here less complicated than an artificial intelligence hint matching system but more sophisticated than the search engines of today.
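To make "non-exact search" a little more concrete, here is a minimal Python sketch of the familiar pieces Robert mentioned, wildcards and name variants, plus a simple spelling-similarity fallback. Everything in it is my own illustration: the `NAME_VARIANTS` table, the function names, and the 0.85 threshold are assumptions, not anything Robert showed or anything FamilySearch actually uses.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase

# Hypothetical variant table for illustration only; a real search engine
# would rely on a much larger, curated name authority.
NAME_VARIANTS = {
    "william": {"william", "wm", "bill", "willem", "wilhelm"},
    "catherine": {"catherine", "katherine", "kathryn", "catharine"},
}


def normalize(name):
    """Lowercase a name and drop surrounding spaces and a trailing period."""
    return name.strip().lower().rstrip(".")


def matches(query, candidate, threshold=0.85):
    """Return True if candidate is a non-exact match for query."""
    q, c = normalize(query), normalize(candidate)

    # 1. Wildcards: '*' matches any run of characters, '?' a single one.
    if "*" in q or "?" in q:
        return fnmatchcase(c, q)

    # 2. Exact match after normalization.
    if q == c:
        return True

    # 3. Name variants: both names appear in the same variant group.
    for variants in NAME_VARIANTS.values():
        if q in variants and c in variants:
            return True

    # 4. Spelling similarity (assumed threshold, tune to taste).
    return SequenceMatcher(None, q, c).ratio() >= threshold


def search(query, names, threshold=0.85):
    """Return the names that match the query, in their original order."""
    return [n for n in names if matches(query, n, threshold)]
```

With this sketch, `search("Will*", names)` behaves like a wildcard search, while `search("William", names)` also picks up abbreviations such as "Wm." through the variant table. Something between this and a full hinting system is roughly the middle ground Robert seemed to be describing.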

DNA is having, and will continue to have, a massive impact on genealogy.

Process innovations are going to be important as well. Today, organizations have a centralized process for determining what records to acquire. Robert thinks we will see more distributed decision making on what collections to digitize. He envisions a world where local archives, libraries, church congregations (like LDS stakes and wards), and individuals take the responsibility to identify, digitize, and index collections. We see this a little already with apps like FamilySearch Memories or BillionGraves.

Data innovation was Robert’s final category. There is a lot of data out there that is highly valuable, but there is a risk that it will be lost. Records can be at risk because of poor archival conditions, political instability, natural disaster, or scheduled destruction. India destroys its censuses before the decade is over. Lastly, there are hundreds of millions of “records” stored in memorized genealogies in certain cultures, many throughout Africa. FamilySearch has an active and growing program to capture these “oral genealogies.”

Robert Kehrer says some records are at risk because of poor archival condition.

Robert Kehrer says some records are at risk because of political instability.

Robert Kehrer says some records are at risk because of natural disaster.

Robert Kehrer says some records are at risk because of scheduled destruction.

The last data innovation is one of Robert’s hopes. There is so much good genealogy data locked up in the record managers on genealogists’ computers. It is not shared freely. Robert envisions a world where tree data is more readily available and shared more freely among all the different sites. Websites could compete on best features, user experience, and records rather than on availability of member submitted trees.