The Ancestry Insider: October 2008

Friday, October 31, 2008

Visiting NARA: NARA Publications

NARA has many publications that can help make your research at the archives more productive. Some are available online and some are printed. Of the printed publications, some cost money and some are free.

The best publication to consult before making your trip is the 410-page Guide to Genealogical Research in the National Archives of the United States, Third Edition.

The softcover edition is $25, although I wouldn't buy it unless you plan on making extensive use of NARA records. Then again, since many NARA records are now available online via NARA's commercial partnerships, you no longer need a trip to D.C. to be a regular user of NARA's records. Still, I recommend checking local libraries and family history centers for this book before purchasing it. If your local library and family history center don't have it, recommend that they acquire the hardback edition ($39).

If I forget to include it in this Visiting NARA series, remind me to review the contents of this book and give examples of using it.

I picked up hard copies of these free publications when I got to Archives I:

Military Service Records at the National Archives, Reference Information Paper (RIP) 109.
Using Civilian Records for Genealogical Research in the National Archives, Washington, DC, Area, (RIP 110). Replaces Using Records in the National Archives for Genealogical Research (GIL 5).
Select List of Publications. (GIL 3)
Citing Records in the National Archives. (GIL 17)
National Archives and Records Administration Regional Archives: Rocky Mountain Region, (GIL 44) [2003] 16 pp. - You'll probably be interested in getting the leaflet for your regional archive.
National Archives and Records Administration Regional Archives: Northeast Region--Boston, (GIL 46) [2003] 16 pp.
Research in the Land Entry Files of the General Land Office, (GIL 67) [rev. 1998.] Has been replaced by Research in the Land Entry Files of the General Land Office (RIP 114).

These two publications are especially important in understanding what to expect at the National Archives. I'm covering some of this information in my articles.

National Archives of the United States, General Information Leaflet (GIL) 1, [rev. 2005], 9 pp.
The National Archives in the Nation's Capital — Information for Researchers, (GIL 71) [revised 2004], 30 pp.

Here are three lists that contain publications, not all listed above, that you might find helpful:

To order free publications, contact:

National Archives and Records Administration
Archives I Research Support Branch (NWCC1)
700 Pennsylvania Avenue, NW
Washington, DC 20408-0001

Telephone: (toll-free) 866-325-7208 or 202-357-5332

If you don't get the free publications in advance, look for them when you get to the archives. As you pass between the elevators, turn to your right towards the cashier and look at the display stand against the wall on your right.

To order publications by credit card, visit http://estore.archives.gov for VISA, MasterCard, Discover or AMEX orders. To pay by check, please call 1-800-234-8861.

Thursday, October 30, 2008

Record Search 27 October 2008 Update

The collection news dated 27 October 2008 on the FamilySearch Record Search pilot website (http://pilot.familysearch.org) indicates several updates have occurred or are currently in the pipeline.

Texas Deaths, 1890-1976 has been posted with "improved searchability." I assume that means problems have been fixed.
1850 U.S. Census, all schedules - 3 states added for a total of 33 states/territories, 92% of the population.
1860 U.S. Census - Coming soon with more than 17 states.
1870 U.S. Census - 4 states added, bringing the total to 35 states with 74% of the population complete.
1865 Massachusetts Census - Coming soon; initially includes browse images only.

Because this is a pilot, the links above to the collections will stop working some time in the coming months.

JewishGen data available on Ancestry.com

First, sorry about the double weirdness yesterday, the two identical posts titled "JewishGen Belarus Databases..." That was an attempt to scoop Ancestry.com by announcing the JewishGen databases before Ancestry.com's press release. But the post slipped through my fingers before I put any finish on it. Oh well. I guess I partially scooped them.

The press release is now out. You can read it here: "Ancestry.com Partners with JewishGen and the American Jewish Joint Distribution Committee (JDC) to Provide Access to Millions of Jewish Family History Records for People around the World."

When Ancestry.com and JewishGen first announced their partnership back in August at the IAJGS conference in Chicago, the best reporting of the event came from Schelly Talalay Dardashti at Tracing the Tribe. If you're interested in details, read these posts. Don't pass up the comments.

Chicago 2008: JewishGen-Ancestry press release - Text of the joint press release.
Chicago 2008: A new chapter, Part 1 - Chicago 2008 conference announcement. Speakers were The Generations Network (parent of Ancestry.com) CFO, David Rinn; David G. Marwell of the Museum of Jewish Heritage and Ancestry.com Indexing Manager, Crista Cowan.
Chicago 2008: A new chapter, Part 2 - Cowan comments. Marwell and JewishGen managing director Warren Blatt answer questions.
Chicago 2008: Logo controversy - New JewishGen logo proposed. Some think Ancestry.com is to blame. (JewishGen never adopted the new logo.)

Wednesday, October 29, 2008

Opinion piece: Ancestry.com / USGenWeb squabble

The well publicized squabble between Ancestry.com and U.S. GenWeb Project (USGenWeb), in my opinion, has hurt both. But perhaps the greatest damage has been suffered by USGenWeb and has been of its own doing.

USGenWeb is an unincorporated non-profit association of volunteers that maintain a set of geographically organized web sites. Separate, but linked, web sites exist for every county and state in the country. The binding philosophy among all these non-commercial web sites is, "Keeping Internet Genealogy Free." Many had made use of RootsWeb's free genealogy web site hosting service. When Ancestry.com acquired RootsWeb, they continued the program, despite dire predictions by some that Ancestry.com would discontinue it.

The squabble arose when Ancestry.com announced that the RootsWeb.com address was being automatically replaced with RootsWeb.Ancestry.com and that mandatory headers would be automatically added to the free genealogical web sites hosted by RootsWeb. For some sites, the headers were merely a change from the mandatory top and bottom advertisements that Ancestry.com added to the sites. For USGenWeb sites, the headers were new.

While the organization's bylaws allowed "a website [to] acknowledge any entities who may host their website (i.e., provide server space at no cost)" (Article IX, Section 2.), some web site coordinators feared the worst. (See this post or this for a couple of examples.) USGenWeb sites contain genealogical data gathered through thousands of hours of volunteer work. The mere specter of Ancestry.com assimilating these contributions led some web site coordinators to move their sites off RootsWeb. Even the national site made a quick decision to move off RootsWeb, temporarily using a private server donated by a member before moving the site to IX web hosting.

"After many years at RootsWeb, we made a quick move to another option for web hosting," Mike St. Clair, USGenWeb Advisory Board member, later reported. He advised the board that "a more organized evaluation of the options available would be useful before we decide to confirm that quick decision for the longer term."

Those sites that have moved have spent focus and time on the task, and many are still not finished. (See, for example, ILGenWeb, Town of Essex and the Kidz Project.) Changing URLs have produced broken links, upsetting easy navigation among sites and cutting off some outside traffic.

I just experienced a case in point

Visiting the Peabody Essex Museum's web site, I found that the Phillips Library page on featured collections highlighted Essex County (Massachusetts) genealogy. The web site referred interested persons to "RootsWeb" for more information. Don't bother clicking the link; it points to www.rootsweb.com/~maessex, a dead URL. I know because I clicked the link.

When I found the link was dead, I assumed the link was to the RootsWeb resource page for Essex County, so I searched RootsWeb and noticed a link to www.rootsweb.ancestry.com/~macessex. That URL, differing by just the letter "c," surely was related, so I followed the link.

The address was for the USGenWeb Project's Essex City, Essex County site, so the Peabody's bad link must have been to a USGenWeb site. According to the Internet Archive, it was. The site was active from as far back as 18 August 2000, when it was part of the RootsWeb Genealogical Data Cooperation or GenConnect, until as recently as 24 December 2007, when it was part of USGenWeb.

Well, I was sitting on the Essex City web site. It should have been a simple matter to get to the county. I just clicked on the link to the county and...

...I was back to the dead URL www.rootsweb.com/~maessex. I used a search engine to locate the Essex County site at http://essexcountymagenweb.com, although http://essexcountyma.net will work as well. There, I found the address of the Massachusetts state web site had changed from www.rootsweb.com/~magenweb to http://magenweb.bettysgenealogy.org.

What a mess. And so I suppose it goes across the width and breadth of the U.S. GenWeb Project.

Pages spurned, Lessons Learned

From what I think I've learned from this experience, I would offer the following advice to the U.S. GenWeb Project:

Domain names should be uniform (http://cc.ss.usgenweb.org) and centrally controlled. State and county coordinators would still arrange for their own web hosting and the national organization would set the DNS address to resolve to the current host. A site could change web hosting services and one DNS change by the national organization would heal all links to the site.
Keeping data free is easier than preventing commercial exploitation. Richard Stallman, founder of the free software movement, learned this the hard way when firms commercialized free software he developed. This led to the development of such copyleft copyright licenses as GPL and Creative Commons. Scientists in the Creative Commons project have abandoned attempts to prevent commercial exploitation in order to achieve their primary goal of keeping scientific data free. USGenWeb should likewise reexamine the relative importance of making data available for free versus preventing commercial exploitation of that data.
Copyright provides very little protection to USGenWeb data. While the documents as a whole on USGenWeb web sites and in the archives are copyrighted, it is by no means clear if the data in those documents are protected. There are plenty of legal justifications for anyone that wanted to "harvest" that data. The U.S. Copyright Office says, "What is not protected? ... Information that is common property [such as] lists or tables taken from public documents or other common sources." (Circular #1, p. 3.) See also, "Can You Copyright Your [Genealogy] Data," and "7th Circuit Rules that Extraction of Public Domain Data from Copyright-Protected Database Is Not Copyright Infringement." Ultimately, the decision would require judicial interpretation. An unfunded volunteer cooperative would be no legal match for a determined, cash-rich corporation. If USGenWeb is intent on preventing commercial exploitation of its data, it should seek the advice of a nationally recognized Intellectual Property (IP) lawyer. Law schools may be the place to find individuals sympathetic to their cause.
The transition away from RootsWeb would have been a great time to convert the USGenWeb Project to wiki format. Site coordinators who were moving their sites anyway could have moved the content into wiki pages. Other coordinators, who had to update links to the sites that moved, could have moved their sites or simply changed the links to point to the appropriate wiki pages. A consistent page naming scheme would allow all coordinators to know what the wiki page URL would be. For sites that didn't move, wiki pages could be created with links out to the appropriate web site. Site copyrights would become page copyrights. Or members could entertain placing the copyrights in the national organization. Editing rights could be restricted to current coordinators, or opened up to any registered member. Templates could be used to encourage uniform layouts by desired groups of coordinators.
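To make the first suggestion above concrete, here is a minimal sketch in Python of a centrally controlled naming scheme. It is not how USGenWeb or DNS software actually works; the Registry class, the function names and the "old-host.example" host are my own illustrative assumptions, and a plain dictionary stands in for the DNS records the national organization would maintain.

```python
def uniform_hostname(county, state, base="usgenweb.org"):
    """Build a name in the cc.ss.usgenweb.org pattern (county, then state)."""
    return f"{county}.{state}.{base}".lower()


class Registry:
    """Hypothetical central table: uniform name -> wherever the site lives now."""

    def __init__(self):
        self._hosts = {}

    def set_host(self, county, state, target):
        # One central change; every published link to the uniform name keeps working.
        self._hosts[uniform_hostname(county, state)] = target

    def resolve(self, hostname):
        # Returns None if the name has never been registered.
        return self._hosts.get(hostname.lower())


registry = Registry()
registry.set_host("essex", "ma", "old-host.example")         # assumed original host
registry.set_host("essex", "ma", "essexcountymagenweb.com")  # the site moves
print(registry.resolve("essex.ma.usgenweb.org"))
```

The point of the sketch is that outside sites only ever link to the uniform name, so a move requires one update in one place.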

In fairness, I should write about Ancestry.com mistakes in their relationship with the USGenWeb project. I envision a piece outlining how they should have engaged the entire free genealogy community from the moment they bought RootsWeb. That's going to take hours to write. And they still don't have it right. And it's late. And I'm off to bed, so if you have an opinion, leave a comment.

Tuesday, October 28, 2008

JewishGen Belarus Databases Name Index - Ancestry.com

Search JewishGen Belarus Databases Name Index - Ancestry.com: "JewishGen Belarus Databases Name Index [database on-line]. Provo, UT, USA: The Generations Network, Inc., 2008. Original data: Index compiled from JewishGen Belarus SIG databases. This data is provided in partnership with JewishGen.org."

Monday, October 27, 2008

Visiting NARA: Maps and restaurant reviews

I recently made my first visit to the National Archives and Records Administration (NARA). This is one in a series of articles inspired by that visit to help you make your first visit to the National Archives. Earlier, I talked about lodging and transportation to the National Archives. This time, I'll review some resources available to help you prepare for your visit.

Maps and restaurant reviews

One thing I forgot to suggest last time was to pick up some maps along the way. Our rental car agency didn't offer much, but you'll want to get one anyway because it may contain driving directions for returning the car. You can also pick up a Metro System-MD/DC Route Map when you get to the Metro station. This is a large, good map and includes several bus systems. It includes a nice map of downtown D.C.

You'll probably stop at your lodging before you get to the Metro station, so avail yourself of the free area map offered at your hotel. Like ours, it probably shows central D.C., a metro map and a road map of the area surrounding your hotel. (If it doesn't show how to get to the Metro station, ask.) It will likely suggest area restaurants that paid to be listed. We tried and liked several of those suggested by ours in the Greenbelt/College Park area.

I love milkshakes and breakfast food served all day, so the Silver Diner located on Greenbelt Road got 4 of 4 stars from me. Your table jukebox doesn't require money, although a quarter slot beckons for it.
I like McDonald's for breakfast on the go and they did not disappoint; also 4 stars. Also on Greenbelt Road; plan on circling the building twice to navigate the drive-through.
We love Outback Steakhouse, so we drove down to Hyattsville, where it is located opposite the Prince George's Plaza Metro Station (also on the Green Line). And while it took a bit of navigating to get there, we were not disappointed. We gave it 4 stars. There's also an Olive Garden in that same area that we didn't get a chance to try.
Chipotle Mexican Grill, back on Greenbelt Road, is my sister-in-law's favorite fast food so we tried it out. I give it 3 stars for the loud, modern music. I'm not hip enough I suppose.
Google reviewers highly suggested another place that paid for an ad on our area map: the 94th Aero Squadron restaurant in College Park. This one also took a bit of navigating, and we almost didn't find it because it was dusk and their sign on the parkway was not lit. We took a chance driving down the little lane and parking in the dark. We shuffled around and found what we thought was the entrance. When we left we saw strings of hundreds of little, white lights hanging about the trees. Had they been on to greet us, we might have had a better experience.

This is the kind of mood restaurant where you pay a little more for the atmosphere, the kind of place where your company holds their holiday party. The airfield part of the atmosphere was lost to us, entering in the dark as we did and sitting next to an expansive window looking out onto... not onto the College Park airport runway, but... well,... blackness. We were disappointed. Eventually they turned two outside spots on, illuminating some old farm equipment, in keeping with their World War I era French farm home motif.

The decor was pleasant, although they made no effort to dress the servers according to their theme. The food was marginally above average, but not enough to justify the price in the absence of the proper mood. We don't drink, so I have no idea if the wines offered were worthy of their French farmhouse theme. Still, when we left and saw two large aircraft sitting inches away on the lawn, now illuminated in the soft light of the trees, the sight was awesome. 2 and 1/2 stars.
Also on Greenbelt Road, KFC was old and the service slow. The employees constantly bickered, mostly in another language. And the drinking water was brown; 1 star.

Friday, October 24, 2008

Ancestry.com starting Alabama projects

Ancestry.com's owner, The Generations Network, is starting a preservation project in Montgomery, Alabama, as evidenced by this advertisement on Yahoo! HotJobs. Contacted for comment, spokesperson Mike Ward confirmed the project.

"We’re working with the Alabama Department of Archives & History to digitize some Civil War records on-site," disclosed Ward.

Ward said that Ancestry.com is also working on the Alabama State Census, which is being indexed through Ancestry.com's volunteer indexing program, the World Archives Project. A check of the project page did not show any Alabama State Census images currently available for download, but active indexers of any database will be able to see the resulting Alabama State Census database once it is published. For more information, visit www.ancestry.com/worldarchivesproject.

Thursday, October 23, 2008

Ancestry's Ranked Search

-- Updated 9-November-2008 with my apologies --

GNW writes,

I don't like any of the searches at Ancestry.com. It takes too much time to weed through all the results that have nothing that connects to your search. If you put in a name, dates, family members and they lived in that same county and state all of their lives, married there, and then died there, why should they start out with people who lived 1,000 miles from that location and was born 30 years after that person died? That is unforgiveable [sic] and simply put, STUPID.

Let me put together the reasons why this happens and tell you if something is being done about it.

Everyone needs a good search strategy

Ancestry's Relevance Ranked search works pretty much under the same assumptions as television's Dr. House:

Everybody lies
Everybody screws up

(Before I get into my discussion of ranked searching, let me say that if checking the Exact search box in the new search interface doesn't work as expected, you need to inform Ancestry.com. Find a current discussion on New Search on the official Ancestry.com Blog and leave a comment.)

The faulty world

Put in the context of genealogical research, Dr. House's philosophy translates to, "take nothing for granted." Take for example, a census record. On any given page of the census somewhere you can find with 95% certainty at least one of the following faults:

The census forms, questions or process gathered imprecise or ambiguous information.
The respondent gave the enumerator incorrect information or avoided him altogether. Concepts of exactness in spelling and dating have not always been as strict as today, so the spelling of names could vary wildly. Neighbors were sometimes called upon to give information for those not at home. Respondents sometimes gave information for far away relatives they feared might not be counted.
The enumerator wrote down incorrect information or didn't record everything and everyone that he was supposed to do. Sometimes fraudulent names and data were added.
Often, a second copy of each census schedule was hand copied, introducing inadvertent errors. Sometimes, these copies are all that have survived for use today.
While using the census records for their original purposes, names and information were overwritten, making some information illegible, some inconsistent with other information on the page and some incorrect.
The census records were not always properly conserved and might no longer be legible or even extant. As ink fades, the lighter strokes of cursive handwriting can change the apparent spelling of names and places. Some were microfilmed out of focus and then the originals destroyed.
The information on the census was incorrectly abstracted (i.e., extracted or indexed). Or one or more names or pages were skipped. Sometimes information vital to the interpretation of a census entry was written outside the normal fields or the abstraction software was not capable of capturing it.
The electronic search index includes errors making some records impossible to find. It might exclude some names or groups of names. Sometimes information is incorrectly indexed because of faulty standardization or handling of abbreviations, names, dates and places.
Sometimes you, the user, make typographical errors when typing information into search forms. And sometimes the targets of our searches show up in unexpected times and places.

A similar list can be produced for other types of records. Simply put, people screw up. A good searcher takes each of these errors into account and devises a search strategy accordingly. Have you ever used a successive term-dropping round-robin search to find a misindexed name? (Drop the first name, then the middle name, then the last name.) Have you ever used the successive term-dropping technique to find a person when you only had a vague guess about their location? But strip away the romance of performing dozens or hundreds of searches for one target record and the search strategy is pretty consistent. And pretty repeatable. And pretty mundane.
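Just how mundane? Here is a rough Python sketch of the successive term-dropping idea. It only generates the list of searches to try; the field names and the sample values are illustrative assumptions of mine, and nothing here talks to Ancestry.com or any other real search engine.

```python
from itertools import combinations


def term_dropping_queries(terms, max_dropped=1):
    """Yield the full query first, then copies with terms dropped in turn."""
    names = list(terms)
    yield dict(terms)
    for count in range(1, max_dropped + 1):
        for dropped in combinations(names, count):
            # Keep every term except the ones dropped on this round.
            yield {name: value for name, value in terms.items()
                   if name not in dropped}


# Hypothetical search form contents.
query = {"first": "Benjamin", "last": "Wiser", "state": "Massachusetts"}
for attempt in term_dropping_queries(query):
    print(attempt)
```

With three terms and one term dropped at a time, that is four searches. Allow two dropped terms (max_dropped=2) and it becomes seven, which is exactly the sort of bookkeeping nobody enjoys doing by hand.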

The ideal world

Wow! That's exactly what computers do better than humans. Lots and lots and lots of redundant tasks. So let's program the computer to do the ideal search strategy for us. I'm talking about the ideal world here, for a moment. Neither Ancestry.com nor anyone else has it right... yet.

Don't make me try all the nicknames, or even trust me to know or remember them all. Don't make me study out all the common name spellings. Don't make me study historical linguistics to find out how German pronunciation would affect phonetic name spellings. Let some expert somewhere do it once and let us all benefit from it. Don't make me explicitly search the census for family members to try and find my guy. The computer has my tree; do that search for me. Don't make me do successive term-dropping to account for the faults from the list above. Do it for me. Don't make me figure out every different name that a location was ever known by. Look them up and try them all for me. Hey, and while you're at it, can you account for common transliterations and other typos?

The real world

I'm happy to announce that Ancestry.com has been working on just such a feature for several years now. Some of the kinks are worked out. Some are not. It is called Relevance Ranked searching.

The reason you get results 30 years after the death date is because the death date you entered might be wrong or the death date on results listed might be wrong.
The reason you get results 1,000 miles away is because a location might be wrong.
The reason you get results with different names is... well you get the picture.

So it is entirely normal to get results that don't match all of your criteria. That is by design. It is entirely normal to get way too many results. They are sorted from best to worst. Look through the results until your superior brain says, "I've reached the point where the quality of the results is less than what I am willing to wade through." Then let your superior brain zero in on a particular record collection or database. Or change the search criteria. Click the exact box on selected items. Then try another search. Gradually release the autopilot and take greater control of the search. But do it after you've let the ranked search take its best crack at it.

Ancestry.com has stated that they think their current algorithm has a big problem: it ranks results by how many search terms match but doesn't penalize non-matches. Kendall Hulet discussed that here and Anne Mitchell brought it up again in this comment. Will they be able to fix this problem?

What does your brain do differently when it says, "poppycock, that's not a match!" versus "There he is! In Kansas?" If they can figure that out, then they can fix this problem.
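To show the difference, here is a toy Python sketch of the two approaches. It is not Ancestry.com's algorithm, which I have not seen; the function names, the fields, the birth years and the size of the penalty are all assumptions made up for illustration.

```python
def score_matches_only(query, record):
    """Count the query fields the record matches; disagreements cost nothing."""
    return sum(1 for field, wanted in query.items()
               if record.get(field) == wanted)


def score_with_penalty(query, record, penalty=1.0):
    """Reward matches and subtract a penalty when the record disagrees."""
    score = 0.0
    for field, wanted in query.items():
        found = record.get(field)
        if found is None:
            continue  # missing data is neither a match nor a conflict
        score += 1.0 if found == wanted else -penalty
    return score


# Hypothetical search and two hypothetical index entries.
query = {"first": "Benjamin", "last": "Wiser",
         "state": "Massachusetts", "birth_year": 1800}
sparse = {"first": "Benjamin", "last": "Wiser"}
conflicting = {"first": "Benjamin", "last": "Wiser",
               "state": "Kansas", "birth_year": 1890}

for record in (sparse, conflicting):
    print(score_matches_only(query, record), score_with_penalty(query, record))
```

Counting matches alone, both entries tie with two matching names. With the penalty, the entry that says nothing about place or date stays at 2.0 while the Kansas entry born 90 years later drops to 0.0. The hard part, of course, is choosing a penalty that doesn't also bury the times he really is in Kansas.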

Dear GNW

I hope that explains why you get ranked results that don't match the input criteria. As you can see, that is sometimes good and sometimes bad and as I mentioned, Ancestry.com has plans to improve this.

Give me the "name, dates, family members" that you typed into the search form. You said they "lived in that same county and state all of their lives, married there, and then died there." If I understand you correctly, you say that the very first results "start out with people who lived 1,000 miles from that location and [were] born 30 years after that person died." Send me the example and I'll make certain it gets to the right people.

Oh, and please don't read through all 24,521 results of a ranked search. When you get that many results in Google you say, "Wow! Google's awesome." But you don't try every single result.

Lastly, I'd like to remind everyone that providing Ancestry.com with detailed, actionable examples is essential to communicating your complaints. Above all, avoid unfounded emotionalism as it distracts from the real problems in New Search.

Thanks,
-- The Ancestry Insider

Wednesday, October 22, 2008

Visiting NARA: The exhibits

If this is your first visit to the National Archives, plan on spending some time to see the public exhibits. The focal exhibit is, of course, the "Charters of Freedom" housed in the central rotunda. (Click here to see an online version of the exhibit.) When we visited mid-week in October at 10:15am, the line was about a dozen people long. If you visit on a weekend during the summer, after waiting in line outside for who-knows-how-long, it will take you an hour once you enter the building.

As we entered the ante-room to the rotunda, there on our right was a 1297 A.D. copy of the Magna Carta (sold for $21 million last year)! Tourists were waltzing past it with nary a twitch of recognition of what they were passing. We stood and gawked for several minutes (at this historic document, not the clueless tourists). We've seen one of the surviving four originals in the chapter house at Salisbury Cathedral near Stonehenge, where English tourists crowded around.

If there are no crowds, take a look at whatever exhibit is showing in the O'Brien Gallery. From now until 29-January-2009 you can see the signature page from the treaty ending the Revolutionary War and related items. The Public Vaults is a permanent exhibit with interactive displays and some treasures from NARA's private vault. (Some, clearly marked, are reproductions of documents or items too fragile for public display.) Louisa May Alcott is highlighted in an early (1850?) census register. It may surprise you to see just how large the original pages are and how they look bound in book form. (Supplementing, but not duplicating, the Public Vaults experience, the National Archives has produced an online Digital Vaults experience that might interest you.)

More information for planning this part of your visit is available on the National Archives web site. You can consult a PDF map showing the exhibit entrance on Constitution Avenue versus the research entrance on Pennsylvania Avenue. Admission to the exhibits is free. The exhibit hours shown on the map are correct.

March 15 to Labor Day: visitors admitted 10:00am-6:30pm, closes at 7:00pm, every day.
Off season: visitors admitted 10:00am-5:00pm, closes at 5:30pm, every day but Thanksgiving and Christmas.

The research hours shown on the map are (currently) incorrect. Research hours are

9 am - 5 pm: Mondays, Tuesdays and Saturdays
9 am - 9 pm: Wednesdays, Thursdays and Fridays
Closed Sundays and Federal holidays

The Cafeteria

If you're going to last all day, you'll want to take a midday break, eat something and recharge your batteries. A cafeteria is conveniently available in the basement. Bring some cash; they don't take plastic. And while you're there...

How would you like a photograph of yourself posing with the signers of the Declaration or the Constitution? Copies of the 1936 Faulkner murals painted around the top of the rotunda can be found on the walls of the small dining area of the cafeteria. The figures are near life-size, maybe bigger. If you've brought your camera, you can stand in front of the wall and have a fellow diner take your picture. Move just enough of the tables and chairs out of the shot and no one will know you're not hanging in midair in the rotunda. (OK, maybe the light level will give you away. Or maybe the fact that it is impossible to hover in midair. But otherwise...)

Be warned. A mirrored copy of the cafeteria (in fact, sharing the same cashier) exists on the exhibits side of the building, but the dining room with the Faulkner murals can only be accessed from the research side of the building.

Tuesday, October 21, 2008

Shootout at the OK Corral

I got upset in a recent discussion on Ancestry.com's new search interface. I usually wait a day or two before responding. This time, I didn't and I thought you might be amused by part of my reply.

I'm going to generalize and combine several recent encounters into one. This no longer accurately represents the position of any single individual. The criticisms go something like this:

The new search is a step backwards and they shouldn't be pushing it on us. They obviously don't understand how real genealogists do their work. They don't listen to their customers.

I always use exact searches because the other kind [ranked searches] doesn't work. It returns so many results as to be neither manageable nor credible. Even a simple perusal shows that after a few correct results at the top, the remaining search results are preposterous, falling outside the person's lifetime or physical location.

Tree-based searching? I will never enter a tree on Ancestry.com because they will use it to make money.

My reply wasn't completely rational, but I had fun writing it. What you see below is also a combination of a couple of replies.

My reply

Some people didn't want to switch from DOS and WordPerfect to Word and Windows. That doesn't mean that Windows and Word should not have been developed. When it comes to new tools, the customer is not always right. It is painful to stop and learn how to use a drill, let alone stop to sharpen the saw, when there is so much research to do and so little money or time to do it. That doesn't mean there isn't a place for drills.

I recently attended an hour-long class on search techniques for genealogy databases like Ancestry. The instructor spent 2 minutes announcing that one should never use relevance ranked searches on Ancestry and 58 minutes teaching us how to work around the problems caused by exact searches: soundex, wild cards, nicknames, multiple searches with all known alternate spellings, searching for family members, mis-filed dates, etc.

Not to be too immodest, but I'm pretty much the fastest Ancestry.com searcher on the planet and I can tell you that sometimes I prefer exact searching and sometimes I prefer ranked. Sometimes I prefer the old search and sometimes I prefer the new. I'll face anybody at the OK Corral for a shootout. You use your single-shot, one gun and I'll use all four of mine. I've got to warn you that I'm also going to attach results into a tree and hit you with a tree-based search assault. I'll be firing off rounds, moving generation to generation faster than you can perform all 38 searches on the common misspellings of just one of your ancestors.

Put up or shut up

Further, I can communicate clearly and accurately enough to convince the new search team what the problems are so they can be fixed. Let me say, and I mean this in the least rude, most kind way possible, put up or shut up. Give exact use cases comparing old and new search that show how new search is inferior.

Here's an example showing what I mean by an exact case.

Steps with old search:

Click on the Search tab.
If Historical Records is not selected, select it.
Check the Exact matches only box.
Enter the name Benjamin Wiser.
Search.
Click on Massachusetts Town Birth Records.
Expected result: see the 5 children of Benjamin Wiser.

Steps with new search:

Click on Search tab (or link).
Click on Show Advanced.
Result: list of results from all sorts of databases.

With exact instructions, Ancestry can see what they have messed up. In my example, once they realized they had dropped the ability to view exact results summarized by category, they added that capability back in.

So give actionable examples. Or buckle on your holster and grab your single-shot, old, exact search. You know where to find me. And you know what heat I'll be packing.

Monday, October 20, 2008

Yearbooks free thru October 30

Ancestry.com has announced that access to their yearbooks is free through October 30 as part of a promotion highlighting the doubling of the size of their yearbook collection to 6 million names. (The card catalog shows a size of 2,909,046. Is that the number of pages?) Click here to start searching. You will need a free account, so don't be surprised if you are asked for your name and e-mail address. However, you won't need a credit card number. If you've already got an account, you can access the yearbooks with or without a current subscription.

Ancestry.com has decided to build their yearbook collection as large as possible. Its value thus far has been limited by the sparse coverage it provides in any generation or locale. So they've started a program to digitize any yearbooks not under copyright that you're willing to donate. If you have a collection of 25 or more, they'll accept a loan rather than a donation.

The following yearbooks are eligible:

Yearbooks printed before 1963. These are now in the public domain unless copyrighted in the name of an individual author.
Yearbooks printed between 1963 and 1977 without a copyright notice. During this period, a notice was required to create copyright protection; without the notice, the yearbook is in the public domain.
Other yearbooks have copyright protection and are not eligible without signed permission from the copyright holder. These are yearbooks printed after 1977, or with a copyright notice after 1963, or before 1963 with a copyright notice in the name of an individual.
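The three rules above boil down to a small decision function. Here is a minimal sketch in Python. The function name and its parameters are my own inventions for illustration, not anything Ancestry.com publishes, and it only restates the rules as listed here; it is not legal advice.

```python
def yearbook_is_eligible(year_printed, has_copyright_notice,
                         notice_names_individual=False,
                         signed_permission=False):
    """Restate the eligibility rules listed above (hypothetical helper)."""
    # Signed permission from the copyright holder makes any yearbook eligible.
    if signed_permission:
        return True
    # Before 1963: public domain unless copyrighted in an individual's name.
    if year_printed < 1963:
        return not (has_copyright_notice and notice_names_individual)
    # 1963 through 1977: public domain only if there is no copyright notice.
    if year_printed <= 1977:
        return not has_copyright_notice
    # After 1977: protected, so not eligible without permission.
    return False


print(yearbook_is_eligible(1950, has_copyright_notice=True))
print(yearbook_is_eligible(1970, has_copyright_notice=True))
print(yearbook_is_eligible(1985, has_copyright_notice=False))
```

I treated "between 1963 and 1977" as including both end years, which is an assumption on my part.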

Click here for more information.

Friday, October 17, 2008

Visiting NARA: Staging a visit to Archives I

I recently made my first visit to the National Archives and Records Administration (NARA). This is one in a series of articles inspired by that visit to help you make your first visit to the National Archives. Last time I talked about Archives I vs. Archives II. This time I'll talk about where I stayed and how I got to Archives I, which is the main National Archives location in downtown District of Columbia.

Staging a visit to Archives I

Archive I - The National Archives building in downtown Washington DC

I’m a westerner. It doesn’t matter where I go, I rent a car. Out west one can’t so much as stop at a gas station without a car. I doubt I could brush my teeth without a car sitting outside. So after flying into Baltimore Washington International (BWI) airport, we rented a car. I'm glad we did.

We stayed at the Greenbelt Courtyard Marriott. Did it seem a little below par because our room had been smoked in recently? Or was it just a little lower quality than I prefer? There are other choices nearby. While I didn't care much for the room, I did like the price: $109 a night. And the location. It was close to highway 295, which provides quick and ready access to BWI airport. There were familiar fast food and chain restaurants along Greenbelt Road. Since we had a rental car, it was easy to zip up and down the road without needing a GPS unit.

And, it was just a short drive to the Greenbelt station on the Green Line of the D.C. Metro.

The D.C. Metro subway system is a fast, clean, safe way to get to the National Archives. The Archives station on the Green Line is—surprise, surprise—directly across Pennsylvania Ave. from Archives I. We picked our lodging so that we were close to a Green Line station far enough out of the city to get lodging rates that we liked.

We purchased a SmarTrip® card when we entered the metro station because it is required to pay the parking fee (currently $4.25 for the Greenbelt station) when you exit the metro station parking lot. If you arrive at the parking lot prior to 10:00am on a weekday, be careful not to park in a Reserved space. The card costs $5 plus whatever amount you put on the card for paying parking and metro fares. A very courteous metro employee stepped us through the process of buying one using a credit card.

A postcard showing a metro train, a message board and the ceiling of an underground station

Fare amounts are posted on the vending machine so that you can figure out beforehand how much money to put on the SmarTrip card. If special arrangements are made, seniors and the disabled can ride at half the regular fare. Reduced fare is charged on weekends. Currently, Greenbelt to Archives is regularly $3.70, $1.85 senior/disabled and $2.35 reduced fare. The distance is 11.51 miles and the expected travel time is 33 minutes.
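If you want to figure the total beforehand, the amounts quoted above combine easily. This is only a rough sketch in Python: the function name is made up for illustration, it assumes one card, one parked car and one round trip per day, and the prices are the 2008 amounts from this post, so check current rates before relying on it.

```python
from decimal import Decimal

# Amounts quoted in this post (2008); treat them as examples only.
CARD_COST = Decimal("5.00")        # price of the SmarTrip card itself
PARKING_PER_DAY = Decimal("4.25")  # Greenbelt station parking fee
FARE_EACH_WAY = Decimal("3.70")    # regular fare, Greenbelt to Archives


def smartrip_budget(days, fare_each_way=FARE_EACH_WAY):
    """Total outlay for one rider: card, daily parking and daily round trips."""
    round_trip = 2 * fare_each_way
    return CARD_COST + days * (PARKING_PER_DAY + round_trip)


print(smartrip_budget(1))                   # one day at the regular fare
print(smartrip_budget(3, Decimal("1.85")))  # three days at the senior/disabled fare
```

Subtract the $5 card price from the result to get the amount to load onto the card.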

It's all pretty straightforward and there are nice people to help you out if you need it. So! Am I tempting any of you to attempt a trip some day? The economic circumstances might not allow it at the moment. I'll talk about some of the preparation you'll want to do before waltzing off to Washington. There's no reason you can't start preparing now!

What arrangements have you used to visit the National Archives? Do you have a favorite hotel? Mass transit line? What tips can you share about arrangements?

Thursday, October 16, 2008

NFS Rollout update for 14-Oct-2008

New FamilySearch Rollout Map for 14-Oct-2008

Salt Lake City ran into winter this week like a brick wall. There are a lot of people walking around with red noses. Speaking of red, the only red--or yellow for that matter--left on the map is in two groups: the "extended" Wasatch Front (what I call the red zone) and the Orient (where red is considered the color of good fortune).

Since the last map update six temples have gone live with New FamilySearch (NFS):

7-Oct-2008: Anchorage, Montreal, Oklahoma and Portland
14-Oct-2008: Nigeria and New Zealand

The major milestones reached since the last map update are:

With New Zealand live, the isles of the Pacific and the southern hemisphere are both done.
Nigeria went live, so Africa is complete.
Montreal was the last temple needed to complete Canada.
Other than the red zone, the conversion of Anchorage, Oklahoma and Portland completes the United States.

Other map changes:

I added the new temples announced in conference (albeit without temple names in hover help).
I moved Edmonton a bit, as I may have had it in the wrong spot.
I fixed a couple of temples that should have been green in the last map.

I misspoke last time about the timing of the next release. If there is a release at the midpoint of each quarter, then the next two releases of NFS will be mid-November 2008 and mid-February 2009. I'm not certain when the next release of FamilySearch Family Tree (now on labs.familysearch.org) will be. It's just as well; since the release schedule has never been shared publicly, even if I knew I wouldn't be able to tell you.

Here's something I do know. A meeting was held this morning to discuss the release of NFS to the red zone!!! Yahoo!

That's it...

That's all...

I mean that's all I know.

What! You don't think they would tell the Ancestry Insider what they decided, do you? Duh!

If you hear anything that you're at liberty to share, give me a shout at AncestryInsider@gmail.com .

Wednesday, October 15, 2008

Hot Keys in the Ancestry.com New Search User Interface

Ancestry.com has added hot keys in the New Search User Interface. Hot keys are a common means of adding power tools for power users. In this case, Ancestry.com is using hot keys to make it possible to be as productive in the new search interface as one could be in the old.

A common criticism of the new search interface is the additional mouse clicks required in iterative searching. Some difficult-to-locate records require iteratively refining the search parameters. In the old search interface, a single click on Refine your search takes you to a full search form, pre-populated with the information from your previous search. You can then switch surnames or locales or any other information that might bring up that elusive record. With new search, all the parameters were buried in the interface and had to be individually clicked before changes could be made.

With this new change, users of the new search can now match the productivity of old search for iterative searching. Further, by assigning the action to a keystroke, Ancestry allows users to iterate without the need to lift one hand off the keyboard and move it to the mouse--an operation that is surprisingly time-expensive, as indexers have learned.

The hot keys are

Hot Key	Action
r	Refine the current search
n	start a New search
p	show first Preview
< key (comma)	show Previous preview
> key (period)	show Next preview

A preview is the popup you get when you hover your mouse cursor over the title of a record in the list of matching results. Yet again, this gives you a keyboard substitute for an operation normally requiring you to move your hand from the keyboard to the mouse.

A preview is the popup you get when you hover your mouse over a result

Press P to bring up the first preview. Thereafter use the period key to move to the next record and the comma key to move to the previous record. These were chosen so the < and > symbols on those keys can serve as mnemonics for moving back and forth, respectively. The J and K keys also work for forward and back.

A little known extra provided by the Preview is that it presents a bit more information to non-subscribers than is shown in the result list. (It used to show all the information that only subscribers were supposed to see. Sorry! Ancestry.com fixed that bug.)

I hope Ancestry.com will add the < and > keys also, in case someone forgets and uses the shift key along with the mnemonic key.
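For the programmers in the audience, here is one way the key table above could be expressed, including the shifted-key fallback I'm hoping for. It is a sketch in Python under my own assumptions: the names are hypothetical, it says nothing about how Ancestry.com actually implements hot keys, and I'm assuming J means forward and K means back.

```python
# Key-to-action table taken from the list above.
HOT_KEYS = {
    "r": "refine the current search",
    "n": "start a new search",
    "p": "show first preview",
    ",": "show previous preview",
    ".": "show next preview",
    "j": "show next preview",      # assumption: J moves forward
    "k": "show previous preview",  # assumption: K moves back
}

# Wished-for behavior, not a current feature: treat the shifted
# symbols the same as the unshifted comma and period keys.
SHIFTED = {"<": ",", ">": "."}


def action_for(key):
    """Return the action for a key press, or None if the key is not a hot key."""
    key = key.lower()
    key = SHIFTED.get(key, key)
    return HOT_KEYS.get(key)


print(action_for("P"))
print(action_for(">"))
print(action_for("x"))
```

Folding the shifted symbols into the unshifted keys means a forgotten shift key costs nothing.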

Monday, October 13, 2008

Visiting NARA: The Insider goes inside the National Archives

I recently made my first visit to the National Archives and Records Administration (NARA). This is the first in a series of articles inspired by that visit to help you make your first visit to the National Archives.

Have you ever thought about visiting the National Archives and Records Administration (a.k.a. the National Archives, or simply, NARA) in Washington DC? I have for years. When a job assignment took me to Washington DC, I took the opportunity to make my first visit. I thought if I shared what I learned, maybe some of you might be emboldened enough to try it also. I learned a lot. There is so much to share; where should I begin?

I and II

In my riddle last week I pointed out that NARA has two locations in the Washington DC area. Archives I is the original NARA building downtown where Nicolas Cage stole the Declaration of Independence. Archives I contains virtually all collections of use to genealogists. These collections include census, military (pre-WWI and WWI Navy), immigration, naturalization and other records.

NARA’s collection became too large for their downtown facility. Some libraries switch from open stacks to closed stacks when running out of space. NARA may have always had closed stacks to protect and preserve the precious records of our nation’s past. (If you know, leave us a comment.) To deal with their space issues and to handle new technologies, NARA built a second facility in the greater DC area on land provided by the University of Maryland (UM) in College Park. An official at the conference last week thought that NARA might be paying a dollar a year, or some arrangement like that, for use of the land. (I gathered from the conference that UM has excellent library and/or archive programs and that the two have a lot of synergy.)

The National Archives Building in College Park, Maryland

Archives II, as the College Park location is known, houses collections in photographic, audio, video and electronic formats as well as old technologies such as architectural drawings and maps. This doesn’t mean there isn’t anything useful for genealogists at Archives II. But a typical patron is not typically going to find information specific to a typical ancestor.

During normal business hours (M-F, 8am-5pm), a staff shuttle runs every hour, on the hour, from each location to the other. On a space-available basis, researchers can also use this free service. I never rode the bus, so I can't give you exact information. Catch the shuttle at Archives I at the Metrobus shelter located on 7th Street between Pennsylvania and Constitution Avenues. At Archives II, use the shuttle bus stop adjacent to the archives building. The two locations are 10 miles apart. Google estimates the travel time to be about 24 minutes, but I imagine it actually takes 30 minutes or more.

Friday, October 10, 2008

Answer: Where's the Insider?

Pennsylvanians are in me.
But I am not in Pennsylvania.

I am not the one ("Archive I") on Pennsylvania;
I am north of Pennsylvania, too ("Archive II").

The answer? I've been at the National Archives at College Park, Maryland for Partnerships in Innovation (2008) II: From Vision to Reality and Beyond.

The Pennsylvania of the first couplet is the state. The Pennsylvania of the second couplet is Pennsylvania Avenue.

So, how'd you do? See ya' next week.

Thursday, October 9, 2008

Where's the Insider? Clue 4

Clue 4:

Pennsylvanians are in me.
But I am not in Pennsylvania.

I am not the one on Pennsylvania;
I am north of Pennsylvania, too.

This is your last clue. Part of the riddle is that I am not the "I" speaking the riddle. Post your official guess to the first message on Monday. (Click here to post your guess.) If you change your mind, post the new guess. Your latest dated answer will be your official entry. The earliest dated correct answer wins. I'll announce where I was on Friday. Stay tuned...

Wednesday, October 8, 2008

Where's the Insider? Clue 3

Clue 3:

Pennsylvanians are in me.
But I am not in Pennsylvania.

I am north of Pennsylvania.

I'll give you one clue per day, posted at noon Utah time. Post your one official guess to the first message on Monday. If you change your mind, delete the old guess and post the new guess. The earliest dated correct answer wins. I'll announce where I was on Friday. Stay tuned...

Tuesday, October 7, 2008

Where's the Insider? Clue 2

Clue 2:

Pennsylvanians are in me.
But I am not in Pennsylvania.

Monday, October 6, 2008

Where's Waldo... er... the Ancestry Insider? Clue 1

The Ancestry Insider's appearance in Waldo style

A year ago the Family Tree Magazine staff Simpsonized themselves and then challenged me to reveal my Simpsonized appearance. My Simpsonized self does not meet the FamilySearch dress code, so when I came across http://www.findwaldo.com/avatar, I took the opportunity to Waldo-ize myself with white shirt and tie. Say, I look like I'm on my way somewhere.

Which I am. I'm on my way to... Well... Maybe I'll let you guess. I'll give you one clue per day, posted at noon Utah time. Respond to this message with your one official guess. If you change your mind, delete the old guess and post the new guess. The earliest dated correct answer wins. I'll announce where I was on Friday.

Clue 1: Pennsylvanians are in me.

Stay tuned...

Name counts in table-style databases

In my article about Ancestry.com database records, we saw that Ancestry.com databases usually contain more than one name in each database record. The same will be true of other vendors as well. For this reason, Ancestry.com and other vendors like to communicate the size of a database by talking about name counts rather than record counts. Unfortunately, name counts are more open to interpretation than database record counts.

Table-style databases

In my article on records I referred to table-style databases. I've also heard them called fielded databases. These are databases stored like tables or spreadsheets. Census databases are table-style and are stored internally in tables not unlike the ones the enumerators filled out.

Table-style databases are stored internally in tables not unlike census forms.
The example, above, shows President Bush's ancestors in 1900.
Image courtesy FamilySearch. © 2008 by Intellectual Reserve, Inc. All rights reserved.

When Ancestry.com first started reporting database sizes by name counts, it had to rely on estimates of the number of names in each record because they had no mechanism to count the number of names actually present. For example, the "U.S. Phone and Address Directories, 1993-2002" database is a table-style database that has 313,282,124 records. Each record can contain two names since a telephone listing can include a spouse name. In the table there is one column for the primary name and another for a second name.

A simple estimate of the name count would be to multiply the number of records by two, which would give 626,564,248 names, the number reported by the new card catalog. But many, maybe most, telephone listings don't include a second name. If one were to assume that 1/3 of the listings contain a second name, then the actual name count would be smaller than what is reported by some 200 million names!

Name counts are open to abuse because the term is open to interpretation. Look up the U.S. Phone Directories in the old card catalog and you'll find the number of names estimated to be 862,075,337! That's more names than there are places in the database to store names! And it's probably about 400 million more names than are present in the database. That's 100 million names short of an over-count of a half-billion names for a single database!
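The arithmetic behind these comparisons is easy to check. Here is a short sketch in Python using the figures quoted above. The variable names are mine, and the one-third share of listings with a second name is only the guess made in this article, not a measured value.

```python
RECORDS = 313_282_124            # records in the phone directories database
NEW_CATALOG_COUNT = 626_564_248  # name count shown in the new card catalog
OLD_CATALOG_COUNT = 862_075_337  # name count shown in the old card catalog

# Each record has room for two names, so this is the most names it could hold.
capacity = RECORDS * 2

# Assumption from the article: about 1/3 of listings include a second name.
estimated_names = RECORDS + RECORDS // 3

print(f"capacity:              {capacity:,}")
print(f"estimated names:       {estimated_names:,}")
print(f"new catalog overcount: {NEW_CATALOG_COUNT - estimated_names:,}")
print(f"old catalog overcount: {OLD_CATALOG_COUNT - estimated_names:,}")
print(f"old count over capacity by: {OLD_CATALOG_COUNT - capacity:,}")
```

Under that one-third guess the estimate lands near 418 million names, which is where the "some 200 million" and "about 400 million" over-counts come from.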

I recommend that any vendors that publish name counts for table-style databases report the actual number of names present. And when an estimated count is published, the vendor should plainly designate it as such. I wouldn't mind seeing the math symbol (≈), meaning "about," before estimated numbers.

When I return, I'll talk about database types that require estimated name counts. (You may recall a previous article on that topic, Unbelievable Name Count Claims.)

Friday, October 3, 2008

FamilySearch Record Search Update: 29 Million New Records Added

FamilySearch released this message earlier this week.

29 Million More Records Added to Record Search Pilot

02 October 2008

Over 29 million names or record images were added this week to FamilySearch’s Record Search pilot. Significant data was published from 2 indexing projects (1860 and 1870 U.S. Censuses), 3 digital image collections (Vermont Probate Files, Quebec Parish Registers, and Cheshire Church Records), and 3 enhanced vital record index collections (Mexico and Germany Baptisms).

The entire collection can be searched for free directly online at http://pilot.familysearch.org or through the Search Records feature at FamilySearch.org.

Following is a chart of the new data added the week of September 29, 2008.

Collection Name	Indexed Records	Digital Images	Comments
1860 US Census	7,015,614	7,015,614	Updated – Illinois reworked; added 22 *New* states with image links to Footnote.com
1870 US Census	3,308,819		Updated - 3 new states (MA,NJ,MS)
Vermont Probate Files		205,527	*New*
Quebec Parish Registers		1,361,289	*New*
Cheshire Church Records		698,970	*New*
Mexico Baptisms	17,038,268 *New*, 19,682,189 reloaded	N/A	Updated - reworked to improve search experience with surnames, dates, etc.
Germany Baptisms	20,626,866 reloaded	N/A	Updated - reworked to improve search experience with surnames, dates, etc.
Germany Marriages	4,439,954 reloaded	N/A	Updated - reworked to improve search experience with surnames, dates, etc.

Biography

The Ancestry Insider was a readers’ choice for the top four genealogy news and resources blogs, part of Family Tree Magazine’s “40 Best Genealogy Blogs” for 2010. He reports on the two big genealogy organizations, Ancestry.com and FamilySearch. He was named a “Most Popular Genealogy Blogs” by ProGenealogists, and has received Family Tree Magazine’s “101 Best Web Sites” award every year since 2008. A genealogical technologist, the Insider has a post-graduate technology degree and holds a dozen technology patents in the United States and abroad. He has done genealogy since 1972 and has worked in the computer industry since 1978. He was Time Magazine Man of the Year in both 1966 and 2006. And he really is descended from an Indian princess.
